N3 has Cac reduction vs N4.
Doesn't seem all that large tbh

But also, how do improvements in active power help them reduce idle/SoC power usage that would make up the much lower reported core V/F curve?
I'm referring to Huang's performance profiling for the BPU comments, not the mispredict penalty graph.
Pipeline flushes mean L2 fetches with tiny L1s.
I don't even think this graph would show that for 3 reasons:
1. I don't think the array length or branch count is high enough to necessitate a fetch from the L2.
2. Even if it were, both the predictable and the random cases would need that fetch, so it should cancel out of the difference.
3. And if it didn't cancel, missing the L1i and having to hit the L2 would add a latency penalty, in cycles, almost as large as the mispredict penalty already is. It doesn't make any sense; the graph can't be showing that.
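To put point 2 in numbers, here's a rough sketch of how a penalty figure falls out of a predictable-vs-random comparison. This assumes the graph is derived from the difference between the two runs; every number below (branch count, base cost, 50% mispredict rate, 15-cycle penalty, shared fetch cost) is made up for illustration, and the function names are mine, not from any real benchmark.

```python
def mispredict_penalty_cycles(cycles_random, cycles_predictable, extra_mispredicts):
    """Penalty per mispredict, taken from the difference between the two runs."""
    return (cycles_random - cycles_predictable) / extra_mispredicts


def measured_penalty(shared_fetch_cycles):
    """Simulate both runs with a fetch cost that is common to both of them."""
    branches = 10_000            # hypothetical branch count
    base_cycles_per_branch = 2   # hypothetical cost of a correctly predicted branch
    true_penalty = 15            # hypothetical mispredict penalty in cycles
    mispredict_rate = 0.5        # random data: assume half the branches mispredict

    extra_mispredicts = branches * mispredict_rate

    # Both runs pay the same base cost and the same shared fetch cost.
    cycles_predictable = branches * base_cycles_per_branch + shared_fetch_cycles
    cycles_random = cycles_predictable + extra_mispredicts * true_penalty

    return mispredict_penalty_cycles(cycles_random, cycles_predictable, extra_mispredicts)


# The shared cost drops out: same answer with or without it.
print(measured_penalty(0))       # 15.0
print(measured_penalty(5_000))   # 15.0
```

Whatever both runs pay equally (including any L2 fetches they have in common) subtracts out, so only costs that differ between the two cases can show up in the result.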
Yes?
AMD front-end latency is down to having a miserably small 32K L1i (with a much nicer core otherwise).
That would be reflected in a graph like this:

Or this:

But not on the graph I posted.
Doesn't seem likely when NVL-S has 75% higher TDP than OR
Way more cores though.