Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

[Attached image: Untitled2.png]


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

LightningZ71

Golden Member
Doesn't the cIOD already support x4x4x4x4 bifurcation of the "GPU" PCIe lanes?

And for NVME fanout: ASMedia Promontory 21, also known as AMD B650, can already be cascaded 1+1 (also known as AMD X670), which gives up to 20 fanout lanes (minus up to 8 SATA lanes) plus a bunch of USB ports. If four B650s could be combined into a 1+3 tree, you'd get 36 fanout lanes (minus up to 12 SATA lanes) plus more USB ports than any server needs. The performance loss due to the PCIe cascade might be a tolerable compromise for entry level servers.
It supports the fanout, but it appears that some mobos don't have that feature supported/implemented/exposed. Some boards properly support the 4-slot NVMe pass-through boards, some don't.
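
The lane arithmetic in the quoted post can be reproduced with a small model. This is only a sketch: the per-chip figures below (12 downstream lanes, 4 of them SATA-capable, a 4-lane link to each cascaded chip) are assumptions picked because they reproduce the 20/8 and 36/12 totals quoted above, not datasheet values, and `fanout` is a hypothetical helper name.

```python
# Assumed per-chip figures (NOT from a datasheet); chosen so the model
# reproduces the totals quoted in the post.
DOWNSTREAM_LANES_PER_CHIP = 12    # assumed downstream lanes per chipset chip
SATA_CAPABLE_LANES_PER_CHIP = 4   # assumed lanes per chip that can become SATA
UPLINK_LANES = 4                  # assumed width of the link to each cascaded chip


def fanout(cascaded_chips: int) -> tuple[int, int]:
    """Return (fanout_lanes, max_sata_lanes) for one root chip plus N cascaded chips."""
    root_used = cascaded_chips * UPLINK_LANES
    if root_used > DOWNSTREAM_LANES_PER_CHIP:
        raise ValueError("root chip has too few lanes for that many cascaded chips")
    root_free = DOWNSTREAM_LANES_PER_CHIP - root_used
    lanes = root_free + cascaded_chips * DOWNSTREAM_LANES_PER_CHIP
    # Assumption: cascade links are taken from the non-SATA lanes first.
    root_sata = min(SATA_CAPABLE_LANES_PER_CHIP, root_free)
    sata = root_sata + cascaded_chips * SATA_CAPABLE_LANES_PER_CHIP
    return lanes, sata


if __name__ == "__main__":
    for n in (1, 3):
        lanes, sata = fanout(n)
        print(f"1+{n}: up to {lanes} fanout lanes, up to {sata} of them SATA")
```

Under these assumptions the 1+1 and 1+3 trees come out at the totals given in the post; whether a 1+3 tree is actually supported is a separate question the model does not answer.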
 

LightningZ71

Golden Member
It's literally cheaper to get a threadripper 24-core + a 700 dollar motherboard, than a gen5 PLX
Eh, what's $748 between friends? (Mouser quotes ~$746 for the PM50084B1 in small lots, no idea on the PEX89088.)

But in bulk, that price should be sub-$500. With higher-end AM5 boards already in that range, that could work for the expected board price for a "server" class product for AM5.
 

Thunder 57

Platinum Member
One thing I noticed is that they seem to have priced them to push sales on the 12-core. Possibly they have excess 6-core dies, with both servers and most client users favoring 8-core dies?



The problem with this is that many important databases are sold on per-core licenses, where you pay for all the cores in the machine running it (and all the cores in your hot spare...), not just the ones that are actually enabled. Lack of an 8-core V-Cache option seems to be an actual oversight to me.

I think that's already been becoming the case. With 6 cores per CCD instead of 8, you are losing out on some of the advantages of the 8-core CCD. We already see that the 7900X3D does worse than the 7800X3D and no longer holds a premium over the 7900X. Between those, I would probably go with the non-3D variant.

Right now it seems to really only affect games, but that could change. With a six-core CPU it doesn't really matter, so those will stay around. I could see a world where 12-core variants fall out of favor completely though.
 

Joe NYC

Platinum Member
Intel does not have the FCF for anything like that.
It's a very poor company due to bad business and high capex.

Seems like a perfect storm for Intel.

The companies that have been force-feeding the market Intel products (Dell, Microsoft Surface) are the same ones about to switch to force-feeding it QCOM Arm chips, at the expense of Intel.

In China, Huawei, which was 100% Intel and could only sell Intel, has been cut off, benefiting other OEMs (Lenovo, HP) that sell more AMD.

The 100% Intel entry-level server market all of a sudden has competition beating Intel hands down.

There are just very few or no Intel monopoly segments left for Intel to take profits from and use the proceeds to finance anticompetitive practices in other segments.

And this, just before AMD starts the transition from Zen 4 to Zen 5 across the board in all segments at once.
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Do we know yet if existing AM5 boards will support these EPYC 4004 chips? I watched Ian's video, and it wasn't clear on this. Also, to my understanding, we don't know yet if the 3D cache variants have one or both CCDs with 3D cache. If they both have the 3D cache, that makes the 16-core very appealing. It would be great for gamers who want 3D cache and more than 8 cores.
 

SteinFG

Senior member
I suspect these are old CPUs that AMD saved for later. Why give free perf when you can sell it later as a higher clocked SKU? Lisa is wiser than most people (including AMD users) want her to be :D
 

MadRat

Lifer
Why do the progressive Zen4 releases on the AM4 platform only tease 5 GHz, but seem to avoid it? Zen4 is already approaching 6 GHz on AM5 with a significant memory edge. Just seems like an artificial limitation of the Zen4 on the AM4 platform.
 

biostud

Lifer
He probably meant AM5 but hey, they just gotta pair the Zen 4 core with the appropriate IOD to make it work on AM4, no?
If I understand it correctly, Zen 3 isn't competing with Zen 4 for production capacity, as they are not on the same node family (6/7nm vs. 4/5nm), while continued production of Zen 4 products will be in direct competition with Zen 5, and they will probably be made as long as there is demand for them in EPYC/Threadripper.
 

Abwx

Lifer
Looking at the numbers for Zen 4 at Computerbase, I found a discrepancy in their same-frequency IPC test calculations, which they say use a geometric mean.

Computing the numbers, the 7700X has 5.26% better MT IPC than the 12900K and 10.5% better than the 5800X, while Computerbase states that it's 2% and 12% respectively.

They also state that the 12900K has 10% better IPC than the 5800X, while the geo mean is only 5.2%. Indeed, with the 7700X 5.26% ahead of the 12900K, both can't have the same 10% difference with the 5800X.

Could someone confirm the numbers? Such an error is not without consequences. The bar charts for MT IPC at 3.6GHz can be found here; beware that in some charts higher is better while in others lower is better:


If I take the 9 percentages relative to the 5800X, I get for the 7700X:

(112 x 105 x 111 x 112 x 113 x 106 x 109 x 115 x 112)^(1/9) = 110.5

Now the 12900K relative to the 5800X:

(93 x 108 x 111 x 102 x 118 x 106 x 96 x 105 x 110)^(1/9) = 105.2

And the 7700X relative to the 12900K:

( 121 x 98 x 100 x 109 x 96 x 100 x 114 x 110 x 102 )^(1/9) = 105.266
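
For anyone who wants to re-run the check, here is a minimal sketch of the same geometric-mean calculation in Python. The percentages are copied from the three lines above; the variable names are just illustrative, and `statistics.geometric_mean` needs Python 3.8 or newer.

```python
from statistics import geometric_mean

# Per-benchmark results in percent, copied from the post above.
zen4_vs_zen3 = [112, 105, 111, 112, 113, 106, 109, 115, 112]  # 7700X relative to 5800X
adl_vs_zen3 = [93, 108, 111, 102, 118, 106, 96, 105, 110]     # 12900K relative to 5800X
zen4_vs_adl = [121, 98, 100, 109, 96, 100, 114, 110, 102]     # 7700X relative to 12900K

g_zen4_vs_zen3 = geometric_mean(zen4_vs_zen3)
g_adl_vs_zen3 = geometric_mean(adl_vs_zen3)
g_zen4_vs_adl = geometric_mean(zen4_vs_adl)

print(f"7700X vs 5800X:  {g_zen4_vs_zen3:.2f}")
print(f"12900K vs 5800X: {g_adl_vs_zen3:.2f}")
print(f"7700X vs 12900K: {g_zen4_vs_adl:.2f}")

# Consistency check: dividing the first two means should land close to the
# third; a small gap remains because the per-benchmark percentages are rounded.
implied = 100 * g_zen4_vs_zen3 / g_adl_vs_zen3
print(f"7700X vs 12900K implied by the first two: {implied:.2f}")
```

The last line is the sanity check behind the argument: whatever the individual charts say, the three geometric means have to be roughly consistent with each other.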



Edit: In a way this mistake is very good news for Zen 5...
 

Saylick

Diamond Member
C&C tested Bergamo and wrote an article about it. I haven't had a chance to dig in, but here it is for you to peruse:
 

StefanR5R

Elite Member
From the Zen 5 speculation thread:
The 8004 (Siena) processors seem a bit weird. I was trying to determine the chiplet and IO die organization. I had a similar issue with the 9124 and 9224. The EPYC Wikipedia article initially listed the 9124 as a 2 CCD device and the 9224 as a 3 CCD device. That seemed very unlikely to me. The wiki also had the caches listed as 64 and 96 MB, which is wrong. Both are 64 MB L3. I ended up changing the wiki page myself, so hopefully I am correct in assuming that both the 9124 and 9224 are 4 CCD devices with 16 MB L3 enabled rather than 32 MB per CCD. It makes sense that if you have a 4 quadrant IO die, then using 4, 8, or 12 chiplets would be optimal.
heise.de says 9124 = 4x 16 MB, 9224 = 4x 16 MB.
WikiChip says 9124 = 2x 32 MB, 9224 = 4x 16 MB.

I don't know where they got these from, or whether either of them is true.

For Siena, this does not seem to be the case. It seems that they must have asymmetries somewhere. I am assuming that it is the same IO die as 9004?
AMD said that Siena uses the same IOD as Genoa. (Source: STH's Siena launch report)

They have 32, 64, or 128 MB L3 cache versions, so they have 2, 4, or 8 CCX devices, but since it is 2 CCX per CCD, the 4 CCX device could be 2 CCD, or 4 CCD with 1 CCX active per CCD. The 2 CCX device could be 1 or 2 CCD also. How are the 6 memory channels arranged? I initially thought that they just disabled one quadrant of the IO die and then only used 2 channels out of 3 for the remaining quadrants. The 96 PCIe lanes would indicate possibly a fully disabled quadrant of the IO die. This leads to a lot of asymmetric organizations, with either quadrants with no CPUs attached or CPUs with no local memory controller. I guess they could also have a case where a full quadrant is disabled and the 4 CCD device has 1 CCD per quadrant for 2 quadrants and a third quadrant with 2 CCD. I haven't found anything indicating how it is actually organized.
I don't know either. There may even be only 2 quadrants active for xGMI and IMCs. (But 3 or 4 active quadrants for PCIe?) However, routing on the package might be easier if all four quadrants are active.
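
The CCX counting in the quoted post can be written down as a small enumeration. This is only a sketch of the arithmetic: 16 MB of L3 per CCX is what the post's "32 MB = 2 CCX" implies, 2 CCX per CCD is as stated there, and `uniform_ccd_layouts` is a hypothetical helper that only lists layouts where every CCD has the same number of active CCXs. It knows nothing about package limits or about how Siena is actually built.

```python
L3_PER_CCX_MB = 16  # implied by the quoted post (32 MB L3 <-> 2 CCX)
CCX_PER_CCD = 2     # as stated in the quoted post


def uniform_ccd_layouts(l3_total_mb: int) -> list[tuple[int, int]]:
    """Return (ccd_count, active_ccx_per_ccd) pairs that add up to the given L3 size."""
    ccx_total, remainder = divmod(l3_total_mb, L3_PER_CCX_MB)
    if remainder:
        raise ValueError("L3 size is not a multiple of the assumed per-CCX L3")
    layouts = []
    for active in range(1, CCX_PER_CCD + 1):
        # Only keep layouts where the CCXs divide evenly across the CCDs.
        if ccx_total % active == 0:
            layouts.append((ccx_total // active, active))
    return layouts


if __name__ == "__main__":
    for l3 in (32, 64, 128):
        print(l3, "MB:", uniform_ccd_layouts(l3))
```

For 64 MB this lists the two options from the post (4 CCD with 1 CCX each, or 2 CCD with 2 CCX each). The 8-CCD entry for 128 MB shows up only because the arithmetic allows it, and mixed layouts such as 2+1+1 are deliberately left out.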

PS,
AMD EPYC™ 9004 Series Memory Population Recommendations (PDF)
For Genoa, AMD of course recommends populations which utilize all 4 quadrants. This says little though about how it's done with Siena.
 