Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3). The system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5) with new memory support (likely DDR5).

[Attached image: Untitled2.png]


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 
  • Like
Reactions: richardllewis_01

inf64

Diamond Member
Mar 11, 2011
3,697
4,015
136
Still doesn't change what I said. Zen 4 L3 cache isn't 50% faster. It's definitely faster though, just not 50%. 50% faster would be around the 2 TB/s mark.
Yes I agree, it's not 50% faster, but 12C having the same BW as 16C Zen 3 is pretty impressive. Some of it is due to higher base/boost clocks.
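As a rough sketch of the arithmetic behind this exchange: if a 12-core part matches the aggregate L3 bandwidth of a 16-core part, the per-core ratio is simply 16/12, and "50% faster would be around 2 TB/s" implies a baseline of roughly 1.33 TB/s. The function name is made up for illustration, and no measured figures beyond those quoted above are assumed.

```python
from fractions import Fraction

def per_core_uplift(cores_old: int, cores_new: int) -> Fraction:
    """Per-core bandwidth ratio when both parts have equal aggregate bandwidth."""
    return Fraction(cores_old, cores_new)

# 12C Zen 4 matching 16C Zen 3 in aggregate L3 bandwidth (claim from the post)
ratio = per_core_uplift(16, 12)
print(f"per-core uplift: {float(ratio - 1):.1%}")

# Baseline implied by "50% faster would be around the 2 TB/s mark"
implied_baseline_tb_s = 2.0 / 1.5
print(f"implied baseline: {implied_baseline_tb_s:.2f} TB/s")
```

So the two posts are compatible: about a third more bandwidth per core, some of it from clocks, is well short of a 50% aggregate gain.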

On an unrelated note, UB "tweaked" the formula for fastest CPU again; the 6C "Advanced Marketing Devices ES" part is number 1 again :D
 

coercitiv

Diamond Member
Jan 24, 2014
6,184
11,845
136
Alder Lake has only 30 MB of L3 cache to share among 16 cores (and the L3 cache isn't very fast), which is why it benefits so heavily from fast DDR5 memory, likely due to all the cache misses in games.
Some reminders when looking at gaming:
  • 32+32MB L3 is not the same as 64MB L3.
  • Zen 3 saw tangible gains from dual-rank DDR4, which in essence was a faster memory configuration than single-rank (SR) DDR4
  • The 5800X3D has shown that the Zen 3 core can scale with a lot more cache, which means it can also scale with faster RAM instead
According to your argument, in gaming the 7700X w/ 32 MB L3 will be comparable to the 12900K w/ 30 MB L3 with regard to potential for relative DDR5 scaling.
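To illustrate the first reminder, here is a deliberately simplified sketch. It assumes a thread can only allocate into the L3 of its own CCD (as on Zen 3), and it ignores associativity, sharing between threads, and everything else a real cache does. The unified 64 MB case is hypothetical, and the function name is invented for the example.

```python
L3_PER_CCD_MB = 32  # per-CCD L3 size discussed in the thread

def fits_in_l3(working_set_mb: float, ccds: int, unified: bool) -> bool:
    """Can one thread's working set sit entirely in L3 it can allocate into?

    unified=False models split per-CCD caches (e.g. 32+32 MB);
    unified=True models a hypothetical single shared cache of the same total size.
    """
    reachable_mb = L3_PER_CCD_MB * ccds if unified else L3_PER_CCD_MB
    return working_set_mb <= reachable_mb

for ws in (24, 48):  # illustrative working-set sizes in MB
    print(ws, "MB:",
          "split 32+32 ->", fits_in_l3(ws, ccds=2, unified=False),
          "| unified 64 ->", fits_in_l3(ws, ccds=2, unified=True))
```

In this toy model a 48 MB working set fits the hypothetical unified cache but not either half of a 32+32 MB split, which is the gist of the reminder.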
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
According to your argument, in gaming the 7700X w/ 32 MB L3 will be comparable to the 12900K w/ 30 MB L3 with regard to potential for relative DDR5 scaling.

I think you're misrepresenting my argument. I never claimed at any time that Zen 4 wouldn't benefit from faster DDR5, just not as much as Alder Lake.

Alder Lake has demonstrated absurdly high gains from high speed DDR5 in games, the likes of which I've never seen before from any CPU architecture. Even at 4K resolutions in some games.

I think the reason for this is, as I said earlier, the relatively smaller and less performant L3 cache, which has to be shared among 16 cores. Now that I think of it, Alder Lake also has a significantly wider core with more OoO resources than Zen 3, and likely the upcoming Zen 4 as well.

Alder Lake is bandwidth hungry to say the least.

Here is a good comparison between Alder Lake with DDR4 and high-speed DDR5, as well as the 5800X3D with DDR4-3200 and DDR4-3800. I'm not claiming this is going to be representative of Raptor Lake and Zen 4, of course; I just found it interesting:

12900KS vs 5800X3D
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,483
2,352
136
It might be dead for gaming, not for work though.
And they have 4 boards, not one or two. Surely they could have made one of them like that. It did not even have to be PCIe 5.0.
Who knows, maybe Gigabyte will make a Vision board like they did in the past that will be more work-oriented.
 

Timmah!

Golden Member
Jul 24, 2010
1,417
630
136
For Multi GPU work you need to look for a Workstation class MB.

No, I don't. I can buy an Asus X670E board, for example, which has two x8 slots. That said, my last four boards were Gigabyte and I was generally satisfied. I did not have many issues, except maybe with this last one, which has a wonky boot. Bottom line, Gigabyte would be my default go-to option.
 

RnR_au

Golden Member
Jun 6, 2021
1,705
4,152
106

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
So many boards... and not a single m-itx?!?! what are they smoking at <insert mb manufacturer>?

Bless Biostar with their retro graphics and font choices... :hearteyes:
Just a guess, but with higher power consumption and a lot more features, the smaller motherboards are probably going away, except maybe for OEMs and the bottom-end SKUs.
 
  • Like
Reactions: Tlh97 and Drazick

RnR_au

Golden Member
Jun 6, 2021
1,705
4,152
106
Just a guess, but with higher power consumption and a lot more features, the smaller motherboards are probably going away, except maybe for OEMs and the bottom-end SKUs.
Just came across this tweet - apparently it's hard to squeeze stuff into the small space.


Which could mean I won't be a player on launch day :(
 
  • Wow
  • Like
Reactions: lightmanek and ZGR

tomatosummit

Member
Mar 21, 2019
184
177
116
Just came across this tweet - apparently it's hard to squeeze stuff into the small space.


Which could mean I won't be a player on launch day :(
How exactly is AM5 harder than anything before it when it comes to ITX?
I could understand if the statement was for X670E only and trying to actually expose all the I/O.
Or is there some stupid requirement from AMD saying that top AM5 motherboards need more power components than sense? You know, because they look cool in marketing images.
 
  • Like
Reactions: Kaluan

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
How exactly is AM5 harder than anything before it when it comes to ITX?
I could understand if the statement was for X670E only and trying to actually expose all the I/O.
Or is there some stupid requirement from AMD saying that top AM5 motherboards need more power components than sense? You know, because they look cool in marketing images.
PCIe 5 signal integrity & trace spacing?
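To put a number on the signal-integrity point, a small sketch: PCIe 4.0 signals at 16 GT/s per lane and PCIe 5.0 at 32 GT/s, both NRZ with 128b/130b encoding, so the fundamental (Nyquist) frequency the board traces must carry doubles from 8 GHz to 16 GHz. The helper names are made up for illustration, and nothing here models actual trace loss or any board vendor's layout rules.

```python
# Raw per-lane signalling rate in GT/s for each PCIe generation
RATE_GT_S = {"3.0": 8, "4.0": 16, "5.0": 32}

def nyquist_ghz(gen: str) -> float:
    """Fundamental frequency of an NRZ signal: half the bit rate."""
    return RATE_GT_S[gen] / 2

def lane_gb_s(gen: str) -> float:
    """Usable per-lane, per-direction bandwidth in GB/s after 128b/130b encoding."""
    return RATE_GT_S[gen] * (128 / 130) / 8

for gen in RATE_GT_S:
    print(f"PCIe {gen}: Nyquist {nyquist_ghz(gen):.0f} GHz, "
          f"x16 = {lane_gb_s(gen) * 16:.1f} GB/s per direction")
```

Higher frequencies mean more loss per unit of trace length, which is harder to budget for when everything is packed onto a small board.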
 
  • Like
Reactions: Tlh97 and scannall

Joe NYC

Golden Member
Jun 26, 2021
1,930
2,269
106
The initial Zen 4 seems to be very conservative, looking very similar to Zen 3; just MCMs, maybe with some advanced packaging tech (RDL, etc.), but no silicon bridges. I am thinking that the Zen 4 refresh and Bergamo (Zen 4c) may start to include more stacking tech; silicon bridges or perhaps an Infinity Cache base die. It is still unclear whether they have some modular base die that can be used across multiple products or if they are just going to put essentially a V-Cache die under other chips. I thought they were talking about the HPC APU coming with Zen 4, so that seems to indicate that stacking will be used with Zen 4, but we already know that the initial releases are MCM, unless there is some stuff hiding in the package. HBM seems to be coming to EPYC, possibly with Zen 4. It doesn’t seem like it can really consume that level of bandwidth. Perhaps it is only with products containing GPU chiplets? It is unclear how they would make use of HBM otherwise; would the CPU chiplet have an HBM interface? HBM is significantly more power efficient than going out to system memory, so even if they can’t take full advantage of the bandwidth, it may still be worthwhile for power consumption.

Bergamo may end up as a conservative increment, using the same I/O die as Genoa or Siena together with 8 x 16-core CCDs.

But from the MLID MI300 leak, it looked like the base 6nm die will be able to accept interchangeable compute modules, with a Zen 4 CPU core module being one of them.

I find the timing of this interesting:
- For MI300, AMD needs a new form factor Zen 4 CCD that sits on top of the N6 base, mid 2023
- For Bergamo, AMD will need a new Zen 4 CCD, mid 2023

There could, in theory, be 3 Zen 4 CCDs:
1. 8 core Genoa / Raphael, SerDes links to IO
2. 16 core Bergamo, SerDes links to IO
3. (likely) 16 core Bergamo to stack on top of base I/O die

Knowing AMD's commitment to efficiency and reusability, I think it is possible that 2 and 3 will be the same die, which would imply that Bergamo may be part of the MI300 architecture.

(Just speculation)

HBM seems to be coming to EPYC, possibly with Zen 4. It doesn’t seem like it can really consume that level of bandwidth. Perhaps it is only with products containing GPU chiplets? It is unclear how they would make use of HBM otherwise; would the CPU chiplet have an HBM interface? HBM is significantly more power efficient than going out to system memory, so even if they can’t take full advantage of the bandwidth, it may still be worthwhile for power consumption.

I could see cost and space efficiency too, from removing the local, motherboard-mounted DIMMs, in addition to power efficiency.
 

Joe NYC

Golden Member
Jun 26, 2021
1,930
2,269
106
I don’t think cache chips are that unlikely. In fact, if Global Foundries ends up making HBM-type memory for AMD, I would wonder if they will include some special sauce rather than standard HBM.

It would most likely be the DRAM makers who are already making HBM memory, which is already stacked.

Putting an SRAM cache die at the bottom of the stack or some kind of processor-in-memory thing would be interesting. It is possible that the Zen 4 refresh of Bergamo will do away with SerDes-based, on-package IFOP connections and use silicon bridges instead, or some combination.

If AMD were to go crazy with the concept of the base I/O+cache die and make it even bigger, another possibility would be to put the HBM memory directly on top of the base die, instead of putting the HBM on an interposer and connecting it through a 2.5D link.

HBM has a logic die at the bottom of the stack, and the base die could fulfill that role. The 2.5D connection and its various overheads would be eliminated, further lowering latency and power consumption, and possibly even increasing the bandwidth.

But I think this would push the base die to an uneconomical size. I think the HBM die sizes are ~100 mm².
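
For a rough sense of scale, a sketch using the ~100 mm² per-stack guess above. The stack counts are hypothetical and not from any roadmap, and the 26 mm x 33 mm (858 mm²) figure is the standard single-exposure reticle field, used here only as a yardstick for "uneconomical".

```python
HBM_DIE_MM2 = 100             # rough per-stack footprint guessed in the post
RETICLE_LIMIT_MM2 = 26 * 33   # standard single-exposure reticle field, 858 mm^2

def hbm_footprint_mm2(stacks: int) -> int:
    """Base-die area needed just to sit underneath the HBM stacks."""
    return stacks * HBM_DIE_MM2

for stacks in (2, 4, 8):  # hypothetical stack counts
    area = hbm_footprint_mm2(stacks)
    share = area / RETICLE_LIMIT_MM2
    print(f"{stacks} stacks: {area} mm^2 ({share:.0%} of a reticle field)")
```

Under these assumptions, eight stacks alone would nearly fill a reticle field before any compute or I/O area is counted.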