Question Zen 6 Speculation Thread


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,235
16,106
136
The problem I see is no one has the full spectrum of benchmarks, mostly AVX-512. Also, benchmarks don't cover all cases. Right now in DC we are running tasks that don't want SMT and don't want to cross CCD lines, so we use task pinning. Failure to do so makes the tasks go from one hour to 10 hours, or sometimes days.

Temps under a fully loaded, serious AVX-512 workload can quickly get out of control.
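
For anyone who has not done task pinning before, here is a minimal sketch of the idea on Linux, in Python. It assumes a layout where logical CPUs 0..N-1 are the first thread of each physical core, numbered CCD by CCD, and the SMT siblings come after them; that numbering and the 8-cores-per-CCD default are assumptions, so check `lscpu -e` on the actual machine. The helper names are made up for illustration.

```python
import os


def physical_cpus_for_ccd(ccd_index: int, cores_per_ccd: int = 8) -> set[int]:
    """Logical CPU IDs of the first SMT thread of every core in one CCD.

    Assumed layout (verify with `lscpu -e`): CPUs 0..N-1 are the first
    thread of each core, grouped CCD by CCD; SMT siblings are N..2N-1.
    """
    start = ccd_index * cores_per_ccd
    return set(range(start, start + cores_per_ccd))


def pin_to_ccd(ccd_index: int, cores_per_ccd: int = 8, pid: int = 0) -> set[int]:
    """Restrict a process (pid 0 = the caller) to one CCD, one thread per core."""
    cpus = physical_cpus_for_ccd(ccd_index, cores_per_ccd)
    os.sched_setaffinity(pid, cpus)  # Linux-only call
    return os.sched_getaffinity(pid)


# CPU set for the second CCD under the assumed layout (no pinning done here).
print(sorted(physical_cpus_for_ccd(1)))
```

Calling `pin_to_ccd(1)` at the start of a worker would keep it on one CCD and off the SMT siblings, provided the numbering assumption holds on that box.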
 

511

Diamond Member
Jul 12, 2024
4,508
4,124
106
The problem I see is no one has the full spectrum of benchmarks, mostly AVX-512. Also, benchmarks don't cover all cases. Right now in DC we are running tasks that don't want SMT and don't want to cross CCD lines, so we use task pinning. Failure to do so makes the tasks go from one hour to 10 hours, or sometimes days.
What are these tasks? Wouldn't a large monolithic CPU like GNR be better for 44C, if you want to minimize cross-CCD talk?
 

basix

Senior member
Oct 4, 2024
241
495
96
This is an all-core benchmark, not peak, and ~10-11% IPC over 2 years is not a tall order
If you can reach a higher frequency with more cores, it is very likely that you can reach a higher frequency at lower core counts. This is just how physics works (unless you hit a hard by-design frequency wall). The core seems to scale well regarding frequency. I might get proven wrong, but for now that seems to be the case. Same as Zen 5 has delivered over Zen 4. And it makes a lot of sense for most use cases. It is hard to scale to that many cores.

As I see it, a 4.5 GHz Fmax for Zen 6c CPUs can happen. Not that it must happen, but it is not unreasonable to assume that. 4.0 GHz is the lower end of the expected range. 5.0 GHz is probably too far-fetched. 4.5 GHz is a nice middle ground ;)
 
  • Like
Reactions: Tlh97 and 511

OneEng2

Senior member
Sep 19, 2022
835
1,104
106
It is simple: 128C is not required with "P" cores
Frequency scaling is a tremendously bad idea to overcome core scaling..... especially in DC.

I too am saying that AMD will not be limited to 96c for the full Zen 6 variants. It simply doesn't make any sense. I don't care what leak you quote.
 

BorisTheBlade82

Senior member
May 1, 2020
707
1,130
136
It is simple: 128C is not required with "P" cores
  • 96C will clock higher, which is better for many applications. 128C+ SKUs / Applications / Use-Cases will not require super high ST performance in the vast majority of cases
  • Same applies for any V-Cache SKU and use case
  • Zen 6c in N2 probably hits quite decent clock rates, let's say 4.5+ GHz instead of 3.7 GHz of Zen 5 (N2 FinFlex / NanoFlex for the win; 2nd generation "c" cores and respective learnings)
  • 128C SKUs will probably not hit >4.5 GHz (EPYC 9755 peaks at 4.1 GHz)
  • 4.5 GHz Zen 6 already beats all Zen 5 EPYC SKUs as well as any Intel counterpart in ST Benchmarks (5.0 GHz max. on F-SKUs compared to Zen 6 with >10% IPC increase) --> Good enough
  • Zen 6c features the full amount of L3-Cache (128 MByte per 32C chiplet)

So there you have it:
No reason for the "P" Cores. You can deliver 128C Zen 6 with 32C Chiplets. This will be cheaper and probably more energy efficient. And if you like (and memory bandwidth / PCIe Lanes are not important for you) also with one IOD (8ch memory saves space in your server rack).

Zen 7 might increase then from 12C to 16C and you have additional +33% cores and 128C with "P" cores again ;)
Not arguing your points, merely adding:
IIRC it is speculated that the new IOD supports up to 4 CCDs but can also be used twice on an SKU via a big bad interconnect. So you get up to 2*4*12c Zen6 = 96c or 2*4*32c = 256c Zen6c.
Having fewer CCDs per SKU makes sense to me, as the new IFoX might have more restrictions regarding placement or spacing.
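
The core-count arithmetic above, written out as a tiny sketch. The function name is made up for illustration, and the inputs are the speculated figures from this post (2 IODs, up to 4 CCDs each, 12 or 32 cores per CCD), not confirmed specs.

```python
def max_cores(iods: int, ccds_per_iod: int, cores_per_ccd: int) -> int:
    """Total cores for a package built from identical IODs and CCDs."""
    return iods * ccds_per_iod * cores_per_ccd


# Speculated configurations from the post, not confirmed specs.
print("Zen 6 :", max_cores(2, 4, 12))   # 96
print("Zen 6c:", max_cores(2, 4, 32))   # 256
```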
 

basix

Senior member
Oct 4, 2024
241
495
96
Frequency scaling is a tremendously bad idea to overcome core scaling..... especially in DC.

If you say so.

Have you ever tried to scale applications beyond 64C? Or even 128C? It is quite hard. You can look at Phoronix benchmark suites, where high core count SKU performance scaling is deep in the sub-linear range.
Frequency will always scale, no matter what amount of cores you throw at your problem. It is just the reality of many applications.

The other reason for 96C is simply the limitation to max. 8 CCDs, as BorisTheBlade82 pointed out. Another point could be that you hit a power wall before you exceed Zen 6c clock rates at 128C. Then it would not make sense to use "P" cores unless you expect low usage rates of your CPU (which you do not want as a server host / provider for economic reasons).
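
To illustrate the frequency-versus-cores point, here is a rough sketch using Amdahl's law as a stand-in model for sub-linear scaling. Amdahl's law is my choice of model here, not something measured in this thread, and the clocks (4.5 GHz at 96C, 4.0 GHz at 128C) and parallel fractions are hypothetical inputs. It also assumes performance scales linearly with clock, which ignores memory-bound behavior.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Amdahl's law: speedup over one core for a given parallel fraction."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def throughput(cores: int, clock_ghz: float, parallel_fraction: float) -> float:
    """Relative throughput, assuming performance scales linearly with clock."""
    return clock_ghz * amdahl_speedup(cores, parallel_fraction)


# Hypothetical configurations: fewer, faster cores vs. more, slower cores.
for p in (0.95, 0.99, 1.0):
    t96 = throughput(96, 4.5, p)
    t128 = throughput(128, 4.0, p)
    winner = "96C" if t96 > t128 else "128C"
    print(f"parallel fraction {p:.2f}: 96C={t96:.1f} 128C={t128:.1f} -> {winner}")
```

Under these assumptions the 128C part only wins when the workload is almost perfectly parallel; with a 5% serial portion the higher-clocked 96C part comes out ahead.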
 
  • Like
Reactions: Tlh97 and marees

Josh128

Golden Member
Oct 14, 2022
1,318
1,983
106
I too am saying that AMD will not be limited to 96c for the full Zen 6 variants. It simply doesn't make any sense. I don't care what leak you quote.
For the record, let me just say that I never said anything about Zen 6 classic being limited to any specific core count until I was told by someone else that I did. After which I only questioned why it would drop from current Zen 5 counts.

Apparently I missed this roadmap leak everyone is talking about. The most recent one I can find is the one for mobile mentioning "Gator Range".



 
  • Haha
Reactions: Tlh97 and Kaluan

DZero

Golden Member
Jun 20, 2024
1,623
629
96
Atom Consistently did more than that.
should specify the P cores.😂😂
If we talk small cores I see AMD doing the job with the Zen c cores. So, x86 is covered.

Meanwhile Apple pulled off big things with the E cores, which are near P core performance. Makes me wonder which tier of x86 cores they compare to... Tiger Lake tier maybe?

And ARM stock small cores? Well... can't reach Haswell tier.
 
  • Like
Reactions: Tlh97

511

Diamond Member
Jul 12, 2024
4,508
4,124
106
If we talk small cores I see AMD doing the job with the Zen c cores. So, x86 is covered.

Meanwhile Apple pulled off big things with the E cores, which are near P core performance. Makes me wonder which tier of x86 cores they compare to... Tiger Lake tier maybe?

And ARM stock small cores? Well... can't reach Haswell tier.
You are forgetting the fact that Apple is on the latest node while Intel/AMD are on 2-3 year old nodes.
 
  • Like
Reactions: Tlh97

Joe NYC

Diamond Member
Jun 26, 2021
3,633
5,174
136
It is simple: 128C is not required with "P" cores
  • 96C will clock higher, which is better for many applications. 128C+ SKUs / Applications / Use-Cases will not require super high ST performance in the vast majority of cases
  • Same applies for any V-Cache SKU and use case
  • Zen 6c in N2 probably hits quite decent clock rates, let's say 4.5+ GHz instead of 3.7 GHz of Zen 5 (N2 FinFlex / NanoFlex for the win; 2nd generation "c" cores and respective learnings)
  • 128C SKUs will probably not hit >4.5 GHz (EPYC 9755 peaks at 4.1 GHz)
  • 4.5 GHz Zen 6 already beats all Zen 5 EPYC SKUs as well as any Intel counterpart in ST Benchmarks (5.0 GHz max. on F-SKUs compared to Zen 6 with >10% IPC increase) --> Good enough
  • Zen 6c features the full amount of L3-Cache (128 MByte per 32C chiplet)

So there you have it:
No reason for the "P" Cores. You can deliver 128C Zen 6 with 32C Chiplets. This will be cheaper and probably more energy efficient. And if you like (and memory bandwidth / PCIe Lanes are not important for you) also with one IOD (8ch memory saves space in your server rack).

Zen 7 might increase then from 12C to 16C and you have additional +33% cores and 128C with "P" cores again ;)

It probably just was not worthwhile for AMD to come up with a 3rd CCD (16-core) in addition to the 12-core and 32-core ones.

Additionally, there may be only 1 V-Cache die needed to address this 12-core CCD, in server, desktop and mobile.

If AMD is going to target the highest per-core performance, we will probably see the return of V-Cache to Venice servers. That would also further distinguish it from Zen 6c cores, which will match the L3 of the full cores.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,633
5,174
136
Where is this magical gain coming from? We are comparing N3E to N2/N2P, not N4P to N2. It's likely going to be 4-4.2 GHz.

Good point. Probably less in gains from lithography on Zen 6c vs. Zen 5c, and more from L3, memory bandwidth, memory latency. And the rest from IPC that's independent of the other variables I mentioned.
 
  • Like
Reactions: 511

511

Diamond Member
Jul 12, 2024
4,508
4,124
106
If we can assume that the full 256 Zen 6c and full 96 Zen 6 CPUs both use the big SP7 socket, with 8 CCDs, does it then mean that the SP8 socket will be 1/2 of that, 4 CCDs?
96 Zen 6 cores would likely be limited to SP8 and 8-channel memory, and 256C would require SP7. You can easily pack 8 Zen 6 CCX and it doesn't require 16-channel memory.
 

Kaluan

Senior member
Jan 4, 2022
515
1,092
106
Good point. Probably less in gains from lithography on Zen 6c vs. Zen 5c, and more from L3, memory bandwidth, memory latency. And the rest from IPC that's independent of the other variables I mentioned.
Well, for what it's worth, L3/SRAM scaling on all of TSMC's non-specialized libraries from N5 all the way to N3E has been a big fat zero.
N2 is the first node in a long time that improves on it. A relatively generous ~17% shrink.

I'm betting this was one of the big factors in AMD wanting to splurge all-in for N2 across the board with Zen6 designs.
You can have your extra cores shrink all you want for your CCDs, they're tiny anyway, but if that 24/48MB/64MB/whatever L3 needs 50% more die space, that ain't good.
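
A rough way to put numbers on the SRAM argument above. This sketch assumes L3 area scales linearly with both capacity and bitcell size, which is a simplification since array periphery scales differently. The 50% capacity growth is a hypothetical input, 17% is the shrink figure quoted in this post, and the function name is made up for illustration.

```python
def l3_area_factor(capacity_growth: float, sram_shrink: float) -> float:
    """Relative L3 area after growing capacity and shrinking the bitcell.

    capacity_growth: 0.5 means 50% more L3 capacity (hypothetical input).
    sram_shrink:     0.17 means bitcells take 17% less area.
    Assumes area is proportional to capacity times bitcell size.
    """
    return (1.0 + capacity_growth) * (1.0 - sram_shrink)


# 50% more L3 on a node with no SRAM shrink vs. one with a ~17% shrink.
print(f"no shrink : {l3_area_factor(0.5, 0.0):.3f}x area")
print(f"17% shrink: {l3_area_factor(0.5, 0.17):.3f}x area")
```

Under these assumptions the same capacity bump costs roughly a quarter more area with the shrink instead of half again as much without it.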