The outlook is that the performance gap with the Apple M-series CPUs keeps widening, right? Those are also due another node jump, and the compounded annualized performance growth rate over say 2023-2027 doesn't seem to be even close to Apple's - or am I underestimating the clock × IPC improvements?
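For what it's worth, the compounded annual rate implied by a cumulative uplift is easy to compute. This is a minimal sketch; the 25% / 2-year inputs are illustrative, not numbers from this post:

```python
# CAGR implied by a cumulative performance uplift over n years.
def cagr(total_uplift: float, years: int) -> float:
    """E.g. a 25% total uplift spread over 2 years -> per-year rate."""
    return (1 + total_uplift) ** (1 / years) - 1

# Illustrative: the kind of 25%-over-two-years gen-on-gen figure
# discussed later in the thread.
print(f"{cagr(0.25, 2):.1%} per year")
```

With those inputs it works out to about 11.8% per year, since annual gains compound rather than add.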
I don't think it is widening. AMD will probably get around a 25% 1T uplift with Zen 6 thanks to the double node jump, though in laptops it's a single node jump I think. I expect Apple to match that 25% uplift between M4 and M6.
In mobile I doubt it, and Apple/QCOM will have gains as well, so I think they will lead by a single-digit % in ST.
Mainstream yes, anything with margins is N2P though.
I am not commenting about leading; Apple already leads in ST. I am talking about the margin widening; that's what I don't think will happen. Apple is not exactly showing huge gains in general. I expect the margin between the M4 and Zen 5 to be more or less maintained with M6 vs Zen 6.
> Apple is currently 27% ahead in CB 2024 ST vs Zen 5.

Zen 6 will not have a node advantage, because M6 on N2 exists, and the gap won't be that large, maybe around ~10-15%.
Cinebench 2024:
M4 Max 177 1t
285K 145 1t
9950X 139 1t
Apple is currently 27% ahead in CB 2024 ST vs Zen 5. Gap will probably close quite a bit with Zen 6 vs M5, as Zen 6 will have full node advantage, but revert back to the ~25% or so advantage for M6 if that uses N2 or A16.
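A quick check of that 27% figure against the scores listed above:

```python
# Cinebench 2024 1T scores quoted in the thread.
scores = {"M4 Max": 177, "285K": 145, "9950X": 139}

# Margin of the M4 Max over each x86 chip, in percent.
for name, score in scores.items():
    if name != "M4 Max":
        margin = (scores["M4 Max"] / score - 1) * 100
        print(f"M4 Max vs {name}: +{margin:.1f}%")
```

So the "27% ahead" figure is the 9950X comparison; against the 285K it is about 22%.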

> Which parts would be N2P? I would think something like Medusa Point is a part with margins, like Strix Point is; would that use N2P?

Server, desktop, and desktop replacement are N2-class. Bulk mobile is N3-class.
> In the Mac Studio it scores 186-190 ST in CB 2024 due to better cooling and it reaching the 4.5 GHz clock and sustaining it.

But muh 7 GHz!! lol, yeah that is a hell of a gap. It would take a 7 GHz boost with +10% IPC in 2024 to make 187. That's not happening, so it looks like the gap will be widening. I didn't know M6 was coming on N2 next year as well.
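That "7 GHz with +10% IPC to make 187" estimate checks out arithmetically, if you assume the 9950X's 139 score comes at its 5.7 GHz boost clock (the clock figure is my assumption, not stated in the thread):

```python
zen5_score = 139   # 9950X CB 2024 1T, from the thread
zen5_clock = 5.7   # GHz boost clock (assumed, not from the thread)

# Hypothetical part: 7 GHz boost with +10% IPC over Zen 5,
# scaling the score linearly with clock and IPC.
projected = zen5_score * (7.0 / zen5_clock) * 1.10
print(f"projected 1T score: {projected:.0f}")
```

139 × (7/5.7) × 1.10 ≈ 188, essentially the 187 the post mentions and right in the 186-190 Mac Studio range.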
M5 Max would be >190. Going by Geekbench 6 and 5 results, it looks like Apple beefed up FP even more than from M3 to M4.
The common speculation and rumor (especially in this forum) is that everything Zen 6 will be on N2 (and some say N2P, which I haven't figured out how that is possible for a 2026 launch) except for a nebulous "bargain laptop" market.
Apple achieves higher ST performance through:
1) Largely ignoring MT performance
2) Single die vs multi-chip-module
3) A higher-performance, more expensive lithography node.
Yeah, they still have skinnier cores with a server-focused caching strategy. Next.
Doesn't Qualcomm manage similar area to Zen? But yeah, it comes with a trade-off: skinny cores are not made for the high-IPC, low-frequency approach.
How wide a core is has a relation to area, but it's not the sole driver. So it's not that AMD isn't spending area; they just aren't spending it on core width or OOE resources the way most of their competitors do.
Apple does manage to get high IPC at lower frequency than the competition, so at least the fat is used well.
I hope this is what Intel is going for with Unified Core: either skinny cores, or a fat core that's actually fat for a reason and not because of bad design.
How does Apple "ignore" MT performance? Yes, the Max could have more CPU cores, but that would come at the cost of a weaker GPU or a larger, more expensive chip. They're responding to what their customers demand.

Anyway, not having a Threadripper / EPYC / Xeon type chip that's all CPU, without "wasting" space on GPU, NPU, display controllers and so forth, is not increasing their ST. It isn't as if their P cores are too large to build a chip like that if they wanted to, and their peak 1T power is less than AMD's or Intel's, so they certainly have the power budget.

Not sure what you even mean with #2. Both AMD and Intel use chiplets, and both have designs that closely couple LPDDR5X; how is that different from what Apple is doing? And yes, Apple is using the best node they have access to. Intel was right alongside them on N3B, and I didn't see them competing on ST. AMD has deliberately chosen to hang back on nodes; it isn't Apple's fault they've been cheap in the past. They're on board with N2 though, so you won't have that excuse for much longer, though I'm sure you'll find another one!

Apple cores are not made to scale like Zen 5 cores are (192c). As you point out, it isn't a design flaw in any way; it is simply a design decision for a different market.
What I find interesting with all the Zen cores is that AMD in effect doesn't chase high IPC at the expense of overall server socket performance. What I mean by that is they have a quite small OOE engine in all the Zen CPUs relative to the CPUs they are compared to. AMD spends its core transistor budget on efficiently getting data/ops in and out of the core.

Having a large OOE engine means more bandwidth pressure on memory: if you aren't as efficient in terms of bytes of data into the core per retired op, then at large core counts you would see regressions unless you spent even more transistor budget/power on the memory subsystem.

I think it is strategic and intentional. If you could take over one market in computing, DC is the one you would most want to win, as it has the highest margins.
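The bandwidth-pressure trade-off can be sketched as a toy model; every number below is made up purely for illustration, not a measurement of any real part:

```python
# Toy model: aggregate DRAM bandwidth demand for a many-core socket.
# demand ~= cores x sustained IPC x clock x average DRAM bytes per retired op.
def socket_bw_gbs(cores: int, ipc: float, ghz: float, bytes_per_op: float) -> float:
    # ops/s = cores * ipc * ghz * 1e9, so with bytes/op this is GB/s.
    return cores * ipc * ghz * bytes_per_op

# Hypothetical 128-core parts: a "lean" core vs a "fat" OOE core that
# retires more ops but also pulls more DRAM bytes per retired op.
lean = socket_bw_gbs(128, ipc=2.0, ghz=3.0, bytes_per_op=0.5)
fat  = socket_bw_gbs(128, ipc=3.0, ghz=3.0, bytes_per_op=0.8)
print(f"lean: {lean:.0f} GB/s, fat: {fat:.0f} GB/s")
```

The point is only directional: more sustained ops per second at worse bytes-per-op efficiency means disproportionately more DRAM traffic, which at high core counts has to be paid for in the memory subsystem.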
Also, with the exponentially rising cost of node shrinks and the corresponding increases in wafer costs, the use of chiplets is a necessity for profit (and yields). Monolithic designs are unrivaled in their efficiency at lower core counts, but given the minimal transistor-density increases we are looking at from here to eternity, chiplet designs are a necessity.

I don't think tablet chips need chiplets; the Mx Pro/Max chips might use chiplets.
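The yield half of that argument can be illustrated with the classic Poisson die-yield model; the defect density and die sizes below are assumptions for illustration, not foundry data:

```python
import math

def die_yield(area_cm2: float, d0: float) -> float:
    """Poisson yield model: probability a die has zero defects."""
    return math.exp(-area_cm2 * d0)

d0 = 0.1                      # defects per cm^2 (assumed)
mono = die_yield(4.0, d0)     # one 400 mm^2 monolithic die
small = die_yield(0.8, d0)    # one 80 mm^2 chiplet

print(f"monolithic die yield: {mono:.1%}")   # a bad die scraps 400 mm^2
print(f"chiplet die yield:    {small:.1%}")  # a bad die scraps only 80 mm^2

# Expected silicon scrapped per 400 mm^2 of good logic shipped:
print(f"mono waste:    {400 * (1 / mono - 1):.0f} mm^2")
print(f"chiplet waste: {5 * 80 * (1 / small - 1):.0f} mm^2")
```

Under these assumed numbers the monolithic design scraps roughly six times as much silicon per good chip, which is the economic case for splitting big dies into chiplets.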
Not sure a large OOE engine will really degrade performance; that sounds illogical, at least for servers, where peak clock rate is not a factor (because it's much lower than what the core can actually do). It might not yield many performance gains if you choke it elsewhere, but a large OOE engine can hide memory bottlenecks through its extended reorder capabilities. That won't work for all applications, but it will for some. In the end, performance is determined by IPC × frequency. If a wide OOE design cannot clock as high, it might still deliver the same performance and therefore the same memory pressure, but you have to spend more, because the chip area gets bigger. It might be more efficient, though, because it clocks lower. So there you have it: the intricate balance of PPA.
Tablet chips are no different when it comes to maximizing profit by increasing yields with small chiplets vs a large monolithic design.
Additionally, chiplets allow the non-core logic to be created on a less expensive process node.
What chiplets don't do in a laptop market is provide higher performance than a monolithic design does.
Interesting to hear that the PRO/Max chips might use them though!