Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Jul 27, 2020
21,410
14,881
146
Great, now I understand the Z5 chipsets better. What about that NVMe RAID 5? Does AMD not including it on the 7950X product page mean that Zen 5 has special hardware instructions to speed up XOR operations?
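On the XOR point: RAID 5 parity is just a byte-wise XOR across the data stripes, so whether Zen 5 has anything dedicated for it is pure speculation on my part. A minimal sketch of the math (plain Python, nothing AMD-specific):

```python
# Minimal illustration of RAID 5 parity math (not AMD's implementation):
# the parity stripe is the byte-wise XOR of the data stripes, and a lost
# stripe is rebuilt by XOR-ing the survivors back together.
def xor_blocks(blocks):
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

data = [b"AAAA", b"BBBB", b"CCCC"]                 # three data stripes
parity = xor_blocks(data)                          # parity stripe
rebuilt = xor_blocks([data[0], data[2], parity])   # drop stripe 1, recover it
assert rebuilt == data[1]
```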
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,980
3,663
136
Intel did 18% from SNC to GLC by blasting the core area (with L2) from 4.36mm^2 for SNC to 7.53mm^2 for GLC; that's a seventy-fricking-two percent increase in area. And don't get me started on power.

We don't have a precise measurement for Zen 5 yet, but without taking the L2 into account, Zen 5 STX is only 3.46mm^2 compared to Zen 4's 2.73mm^2 (a 27% increase in area). The GNR core is likely a bit bigger, but accounting for L2 I'd be surprised if it were any more than 35% bigger.

And as David Huang discovered, Zen 5 isn't even a straight-up increase in some areas compared to Zen 4; the uop cache looks outright smaller, for example.

I'm all for being disappointed by a mediocre increase, but bringing up Golden Cove of all things as an example is just lol.
All things being equal, I don't care about area cost. Increasing resources in a core isn't linear in cost, so once you have spent that powder it's gone. Zen 5 spent a lot of powder.


Your partial quoting of me is also semi-annoying, because I did say I will change my mind if we find out that AMD made major trade-offs.
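For what it's worth, the area deltas in the quote do check out; a quick sanity check (plain Python, using the figures quoted above):

```python
# Percentage increases implied by the quoted core-area figures.
def pct_increase(old_mm2, new_mm2):
    return (new_mm2 / old_mm2 - 1) * 100

print(f"SNC -> GLC (core + L2):     {pct_increase(4.36, 7.53):.1f}%")  # ~72.7%
print(f"Zen 4 -> Zen 5 STX (no L2): {pct_increase(2.73, 3.46):.1f}%")  # ~26.7%
```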
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
I'm looking forward to the SP6 Zen5 products. I wonder if, for this generation, AMD will bless us with a 4 CCD full fat Zen5 product with 32 Zen5 cores running at decent speeds? It would be a nice replacement for older Threadrippers out there. Even the 64 core Zen5c parts would provide a lot of grunt for workstations that use AVX-512 heavily. There are some nice SP6 workstation ATX boards out there...
The 8004 (Siena) processors seem a bit weird. I was trying to determine the chiplet and IO die organization. I had a similar issue with the 9124 and 9224. The EPYC Wikipedia article initially listed the 9124 as a 2 CCD device and the 9224 as a 3 CCD device. That seemed very unlikely to me. The wiki also had the caches listed as 64 and 96 MB, which is wrong; both are 64 MB L3. I ended up changing the wiki page myself, so hopefully I am correct in assuming that both the 9124 and 9224 are 4 CCD devices with 16 MB L3 enabled rather than 32 MB per CCD. It makes sense that if you have a 4-quadrant IO die, then using 4, 8, or 12 chiplets would be optimal.

For Siena, this does not seem to be the case. It seems that they must have asymmetries somewhere. I am assuming that it is the same IO die as 9004? They have 32, 64, or 128 MB L3 cache versions, so they are 2, 4, or 8 CCX devices, but since it is 2 CCX per CCD, the 4 CCX device could be 2 CCD or 4 CCD with 1 CCX active per CCD. The 2 CCX device could be 1 or 2 CCD also. How are the 6-channel memory controllers arranged? I initially thought that they just disabled one quadrant of the IO die and then only used 2 channels out of 3 for the remaining quadrants. The 96 PCIe lanes would indicate possibly a fully disabled quadrant of the IO die. This leads to a lot of asymmetric organizations, with either quadrants with no CPUs attached or CPUs with no local memory controller. I guess they could also have a case where a full quadrant is disabled and the 4 CCD device has 1 CCD per quadrant for 2 quadrants and a third quadrant with 2 CCD. I haven't found anything indicating how it is actually organized.
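Just to make that ambiguity concrete, here is a rough way to tabulate the candidate layouts (Python; the CCD counts and per-CCX cache sizes are my working assumptions from the reasoning above, not confirmed die configurations):

```python
# Rough tabulation of the candidate layouts discussed above. The CCD counts
# and per-CCX L3 sizes are assumptions for illustration, not confirmed dies.
def total_l3_mb(ccds, ccx_per_ccd, l3_per_ccx_mb):
    return ccds * ccx_per_ccd * l3_per_ccx_mb

candidates = {
    "9124/9224: 4 CCD, 1 CCX/CCD, 16 of 32 MB enabled":    total_l3_mb(4, 1, 16),
    "9124 alternative: 2 CCD, 1 CCX/CCD, full 32 MB":      total_l3_mb(2, 1, 32),
    "Siena 64 MB SKU: 2 CCD, 2 CCX/CCD, 16 MB per CCX":    total_l3_mb(2, 2, 16),
    "Siena 64 MB SKU: 4 CCD, 1 CCX active, 16 MB per CCX": total_l3_mb(4, 1, 16),
}
for layout, l3 in candidates.items():
    print(f"{layout}: {l3} MB L3")
# Several different layouts land on the same 64 MB total, which is exactly
# why the spec sheet alone can't tell them apart.
```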

Asymmetric architectures are probably fine for low-performance edge servers, but this might not be good for workstation performance. Also, current 8004 processors top out at 200 W; SP6 is meant to be a low-power socket.
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
7k TR's are SP6 for both platforms and they're 350W just fine.
Oops. I don't follow Threadripper much. The company I am working for is spec'ing new systems now, so I am mostly interested in EPYC. It looks like Turin will not come out in time though, so I will be working with Genoa for the next few years. The chassis are power limited on the CPU, so Turin might have been very good.

It would have been nice to be able to get Siena processors in a more compact chassis for some of the accessory nodes, but we are using OCP chassis and our supplier doesn't seem to have anything other than the compute chassis we are using. We end up wasting some rack space for accessory nodes. Full Genoa processors are very wasteful here. We are using 9124 and 9224 processors for the accessory nodes, but the compute chassis is dual socket, so the minimum configuration is dual 16-core (32 cores / 64 threads) with 64 MB L3, which seems like overkill for non-compute nodes. The 9124 and 9224 are cut down: 16 MB L3 per CCX instead of 32 MB, unless I am wrong about the CCD configuration. Hopefully that is irrelevant for accessory nodes, but I was thinking it may be better to standardize on the 9254 (24-core, 128 MB cache) and consolidate some of the accessory nodes. That may be too much of a single point of failure though.
 
  • Like
Reactions: lightmanek

PJVol

Senior member
May 25, 2020
728
684
136
Just wondering if his CO setting actually improved the core clocks?
Anyway, a couple of thoughts:
- max DRAM read/write bandwidth: before the 2nd scene starts, they should peak during the assets upload
- max Vddcr_Vdd looks suspiciously rounded. Could this (1.200V) really be the Vmax for the GNR core complex?
 
Last edited:

SteinFG

Senior member
Dec 29, 2021
692
809
106
The MT score is almost the same as the 9900X. Does GB scale past 12 cores? If it does, there's some other limitation it seems.
Edit: 7950X and 7900X also have similar issues, so it seems that GB doesn't scale past 12 cores
 
  • Like
Reactions: igor_kavinski

Abwx

Lifer
Apr 2, 2011
11,613
4,472
136
The MT score is almost the same as the 9900X. Does GB scale past 12 cores? If it does, there's some other limitation it seems.

GB6 scales by 18% from 6 to 8 cores, which is 33% more cores, and only by 8% when adding 50% more cores going from 8 to 12. Even subtests that are known to be vastly parallel in the real world show limited scalability in the MT subscores, so it's an irrelevant bench when it comes to MT.
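Put another way, here is the crude scaling efficiency those percentages imply (speedup gained per unit of extra cores; 1.0 would be perfect scaling):

```python
# Crude scaling efficiency implied by the percentages above:
# (score gain) / (core-count gain); 1.0 would be perfect scaling.
def efficiency(score_gain, core_gain):
    return score_gain / core_gain

print(f"6 -> 8 cores : {efficiency(0.18, 8 / 6 - 1):.2f}")   # ~0.54
print(f"8 -> 12 cores: {efficiency(0.08, 12 / 8 - 1):.2f}")  # ~0.16
```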
 

Gideon

Golden Member
Nov 27, 2007
1,875
4,519
136

Det0x

Golden Member
Sep 11, 2014
1,352
4,573
136
Enabling PBO alone means nothing for the Zen 4 / Zen 5 ST score in GB6.
It's only curve optimizer settings, +offset Fmax, and temps that would affect the ST score.

So in my opinion, I kinda doubt those are tuned/optimized/overclocked (PBO) settings for that 9950X.
I think the difference in scores is down to different system configurations (Windows install + silicon quality + cooling + BIOS settings, etc.) -> all are stock runs 🤔

Just my 2 cents :)
 
Last edited:

Nothingness

Diamond Member
Jul 3, 2013
3,154
2,184
136
The MT score is almost the same as the 9900X. Does GB scale past 12 cores? If it does, there's some other limitation it seems.
Edit: 7950X and 7900X also have similar issues, so it seems that GB doesn't scale past 12 cores
The only test that scales well in GB6 is the ray tracing one. If you're only interested in that property, you should only look at that test. The thing is that in real life most applications, alas, don't scale as well as rendering ones.
 

Nothingness

Diamond Member
Jul 3, 2013
3,154
2,184
136
GB6 scales by 18% from 6 to 8 cores, which is 33% more cores, and only by 8% when adding 50% more cores going from 8 to 12. Even subtests that are known to be vastly parallel in the real world show limited scalability in the MT subscores, so it's an irrelevant bench when it comes to MT.
As I wrote above, if you look at the ray tracing score, you get more or less what you expected.

As I wrote dozens of times: don't concentrate on aggregated scores, study the subtest results. The GB6 global MT score doesn't reflect perfect scaling at all; that was a huge departure from GB5, and IIRC this was explained by Primate Labs.
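To illustrate the "study the subtests" point with made-up numbers (this is a toy aggregation, not GB6's actual scoring method):

```python
# Toy example (not GB6's actual aggregation): one subtest scales almost
# linearly going from 8 to 16 cores, the other barely moves, and the
# aggregated number sits unhelpfully in between.
from math import prod

def geomean(values):
    return prod(values) ** (1 / len(values))

speedups_8_to_16_cores = {"ray tracing": 1.90, "office-style subtest": 1.15}

print("aggregate speedup:", round(geomean(list(speedups_8_to_16_cores.values())), 2))
for name, speedup in speedups_8_to_16_cores.items():
    print(f"{name}: {speedup}x")
```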