Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

leoneazzurro

Senior member
Jul 26, 2016
905
1,430
136
There are hints of "more" configurations for Zen5 server in that discussion. As far as I understand, there will be a high level of customization possible with Zen5 packaging, whatever it will be.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
Should we have a Zen5 thread?
I'd say yes. Or do we want to wait until the answers to this thread's poll are finally known?

There are hints of "more" configurations for Zen5 server in that discussion. As far as I understand, there will be a high level of customization possible with Zen5 packaging, whatever it will be.
It will be interesting to see what the "platform" will be then. Many different ones with classic sockets? A modular one where CPU modules are just another OAM connected through IF/CCIX/CXL?
 

leoneazzurro

Senior member
Jul 26, 2016
905
1,430
136
What is interesting is that there is again this rumor of desktop Zen5 having a sort of hybrid architecture, with Zen5 cores and "Zen4 little" cores. For reusability in the server world, these are quite probably on their own die.
 

Hitman928

Diamond Member
Apr 15, 2012
5,182
7,632
136
Make of the rumor what you will: https://videocardz.com/newz/amd-patents-a-task-transition-method-between-big-and-little-processors

This is all based on patents AMD submitted.

What is interesting is that there is again this rumor of desktop Zen5 having a sort of hybrid architecture, with Zen5 cores and "Zen4 little" cores. For reusability in the server world, these are quite probably on their own die.

Just to be clear, nothing in the AMD patent suggests the hybrid approach uses Zen5 + Zen4, it just mentions a more powerful and less powerful processor. The Zen5 + Zen4 combination is pure rumor at this point.
 

leoneazzurro

Senior member
Jul 26, 2016
905
1,430
136
Just to be clear, nothing in the AMD patent suggests the hybrid approach uses Zen5 + Zen4, it just mentions a more powerful and less powerful processor. The Zen5 + Zen4 combination is pure rumor at this point.

I am not saying this will be the reality, what I am saying is that the leakers are saying this in the discussion linked above.
 

Hitman928

Diamond Member
Apr 15, 2012
5,182
7,632
136
I am not saying this will be the reality, what I am saying is that the leakers are saying this in the discussion linked above.

Yes, I understood. I just didn't want there to be any confusion as others might not have actually looked at the patent and assumed the Zen4 inclusion was coming from the patent itself.
 

andermans

Member
Sep 11, 2020
151
153
76
What is interesting is that there is again this rumor of desktop Zen5 having a sort of hybrid architecture, with Zen5 cores and "Zen4 little" cores. For reusability in the server world, these are quite probably on their own die.

I think we've seen these leaks for mobile Zen5 right? Nothing has explicitly pointed to desktop also having this?
 

leoneazzurro

Senior member
Jul 26, 2016
905
1,430
136
I think we've seen these leaks for mobile Zen5 right? Nothing has explicitly pointed to desktop also having this?

I think we have seen this rumor for both, as I've seen leakers pointing to this for "Zen5" rather than "mobile Zen5". BTW, we also had some hints that the APU after Phoenix would be MCM. But these are rumors for now, and these designs are probably yet to be finalized.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
I'd say yes. Or do we want to wait until the answers to this thread's poll are finally known?
No point in a Zen5 thread at the moment. We don't even know what will be in Zen4, so we would just end up duplicating info in both threads.

What is interesting is that there is again this rumor of desktop Zen5 having a sort of hybrid architecture, with Zen5 cores and "Zen4 little" cores. For reusability in the server world, these are quite probably on their own die.
This rumor is a bit strange, because using Zen4 as a "little" core is stretching it quite a bit. Zen4 has full-blown AVX512 support, for example.

On top of that, Robert Hallock said explicitly that AMD will not do a big.LITTLE design. I believe he would have been in the know about what Zen5 is going to look like when he made that statement.
Also, Mike Clark said something about designing the same core for all workloads.
Way too little info atm.

I think we have seen this rumor for both, as I've seen leakers pointing to this for "Zen5" rather than "mobile Zen5". BTW, we also had some hints that the APU after Phoenix would be MCM. But these are rumors for now, and these designs are probably yet to be finalized.
Zen5 high level design/architecture should have been finalized if they plan to launch by 2024.
Implementation, bring-up, validation, etc. will take at least two years, if not more, especially considering the complex packaging concepts that AMD envisions.

Zen5 does sound like something exotic, though, from whatever rumors are going around.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
I could imagine a CCD with a 4x4 array of Zen5/L2 blocks as the base die and, on top of that, an equal-sized stack of 128MB L3 dies.
All serial interconnects would now use PCIe 6.0-style PAM4, doubling the bandwidths. The Genoa/Bergamo IO dies could have the internal buses already.

 

leoneazzurro

Senior member
Jul 26, 2016
905
1,430
136
This rumor is a bit strange, because using Zen4 as a "little" core is stretching it quite a bit. Zen4 has full-blown AVX512 support, for example.

On top of that, Robert Hallock said explicitly that AMD will not do a big.LITTLE design. I believe he would have been in the know about what Zen5 is going to look like when he made that statement.
Also, Mike Clark said something about designing the same core for all workloads.
Way too little info atm.

Well, that could be for the time being. Also, there are the patents. The rumors say that it will not be the same Zen4 die as in Raphael, but something called "Zen4D" or similar. Then there is the rumor of Zen5 being "exotic". I don't know what to think about it; there could also be a lot of FUD going around to conceal what the real solution will be.

Zen5 high level design/architecture should have been finalized if they plan to launch by 2024.
Implementation, bring-up, validation, etc. will take at least two years, if not more, especially considering the complex packaging concepts that AMD envisions.

Zen5 does sound like something exotic, though, from whatever rumors are going around.

I did not mean the architecture, I meant the shape the final product will take in terms of packaging. Zen being a highly modular architecture, and with AMD pushing hard on modular packaging to bring together heterogeneous "blocks" (that could be validated individually and later connected and validated as a whole), it would be possible to add changes easily, e.g. adding more CCDs, or maybe having some "accelerator" based on Xilinx IP. But this is all speculation, of course.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
Make of the rumor what you will: https://videocardz.com/newz/amd-patents-a-task-transition-method-between-big-and-little-processors

This is all based on patents AMD submitted.
That article mentions only one patent, and it's the boring patent covering the possibility of hybrid cores posing as one. I think the author just took the Strix Point containing big.LITTLE rumor going around on Twitter at the time, looked up one patent and ran with that.

Personally I'm confident that AMD will do "hybrid" "cores".
- The OS and software won't see the additional "cores", that's what the patent in the article is about, transparent moving of a thread from one core to another hybrid one without software interaction.
- But the "core" won't even be complete on its own! That's what the second, more promising patent covering this topic was about: catching illegal opcode exceptions and moving code to another core that can handle said opcode. This approach would make it possible to create very fast and efficient dumb cores that speed up common simple code, with the usual fat core only having to handle the more complex cases.
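
To make the second idea a bit more concrete, here is a minimal toy sketch of "trap on an unsupported opcode, then move the thread to a core that can handle it". Everything in it is hypothetical (the opcode names, the `Core` class, the `run_thread` helper) and none of it is taken from the patents themselves; it only illustrates the control flow described above.

```python
# Toy model only: opcode names, classes and the migration policy are
# hypothetical illustrations, not details from any AMD patent.

class IllegalOpcode(Exception):
    """Raised when a core meets an instruction it does not implement."""


class Core:
    def __init__(self, name, supported):
        self.name = name
        self.supported = frozenset(supported)

    def execute(self, opcode):
        # A "dumb" core simply traps on anything outside its subset.
        if opcode not in self.supported:
            raise IllegalOpcode(opcode)
        return f"{opcode}@{self.name}"


SIMPLE_OPS = {"add", "load", "store", "branch"}
LITTLE = Core("little", SIMPLE_OPS)
BIG = Core("big", SIMPLE_OPS | {"avx512_fma"})


def run_thread(program, little=LITTLE, big=BIG):
    """Start on the small core; migrate to the big core on the first trap."""
    core = little
    trace = []
    for opcode in program:
        try:
            trace.append(core.execute(opcode))
        except IllegalOpcode:
            core = big  # migration happens without the software noticing
            trace.append(core.execute(opcode))
    return trace


trace = run_thread(["load", "add", "avx512_fma", "store"])
print(trace)
```

In this sketch the thread simply stays on the big core after the first trap; when (or whether) it would move back is a policy question the sketch does not try to answer.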

@DisEnchantment mentioned both patents back in June:
https://forums.anandtech.com/threads/speculation-zen-4-epyc-4-genoa-ryzen-6000.2571425/post-40522899

I expanded on the latter patent back then which I still consider a big deal:
https://forums.anandtech.com/threads/speculation-zen-4-epyc-4-genoa-ryzen-6000.2571425/post-40523434
I can imagine that one to be part of the excitement at AMD around Zen 5.
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
I could imagine a CCD with as the base die a 4x4 array of Zen5/L2 blocks, and on top of that an equal sized stack of 128MB L3 dies.
All serial interconnect now using PCI-e 6.0 style PAM4 doubling the bandwidths. The Genoa/Bergamo IO dies could have the internal buses already.

Well, that's a very interesting solution! I would lean towards stacked logic (2x8) + stacked L3$. I keep thinking AMD should use a much larger L2$, but they seem to be moving much more incrementally here. If we have stacked logic, we have more direct cooling (direct to die, or inter-stack cooling). Whichever the case, TSMC's and/or Intel's advanced packaging work will finally see a big payoff. It's going to be a fun show to watch!
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Zen5 high level design/architecture should have been finalized if they plan to launch by 2024.
Implementation, bring-up, validation, etc. will take at least two years, if not more, especially considering the complex packaging concepts that AMD envisions.
I would think implementation must be pretty close to being done, because AMD will also want to verify functionality in FPGAs first. Honestly, I don't know what % of the implementation is tested this way.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Hmmm ... I don't think that is the route AMD will take with Genoa.

View attachment 51995
Zen4 CCD from the Gigabyte leak likely has two SDP/IF links.
On top of that, supporting 96 or even 128 cores would mean they need to support up to 512 SerDes links.
That is way too much power wasted, and the routing for Rome above is already very complicated.
On Rome they had to route the links underneath the CCD.

And at ISSCC 2021, Sam Naffziger already alluded to interposers/higher density interconnects (highlighting by me). This was before Lisa announced 3D V-Cache.
View attachment 51998
In fact, from this slide we knew the second item was already coming to Zen3. (Cache, while not exactly memory, is backed by SRAM, which is memory.)

From TSMC's official data, CoWoS-L with LSI/Si bridges is proven and it reaches 3x reticle size, which can cover all chiplets for a hypothetical 16 CCD EPYC.
View attachment 51997

Anyway, I think AMD will most likely go with some sort of interposer: probably CoWoS-R rather than CoWoS-L if there is really no need for super high density interconnects, i.e. if a 4um contact pitch (CoWoS-R) is enough instead of the high density CoWoS-L (<1um pitch).
If not, they will burn power linking those 96/128 cores; it is not sustainable.
You can read the paper by Naffziger yourself.
The rumors say SP5 is significantly larger than SP3. They could, perhaps do 2-1-1-2 cpu die down each side which would allow the IFIS routing to go under the two single cpu chips on each side in the middle. The rumors I have seen so far indicate that Genoa will look almost the same as Milan, except with up to 4 extra cpu chips. Likely still serdes. I thought the gigabyte leak actually listed IO die, cpu chiplet, and package sizes. This seems to indicate that it is all still serdes. That is going to be the quickest time to market, so that is what I expected. Intel has talked about stacked devices, but we will see when they are actually widely available.

I suspect that they are working on a stacked version in parallel that might come out later. I suspect that is the rumored 128 core device. I don’t think there will be more than 96 cores with serdes, but we could get surprised again. With the larger SP5 package, they have some more area for routing. It could also be more layers and such.
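
A quick sanity check of the numbers above, as a minimal sketch. It assumes 8 cores per CPU die (as on Milan; that figure is my assumption here, not something stated in the leak) and 8 CPU dies on Milan; the `package_cores` helper is just an illustrative name.

```python
# Assumption: 8 cores per CPU die, carried over from Milan.
CORES_PER_DIE = 8
MILAN_CPU_DIES = 8  # the 9-die Milan package minus the IO die


def package_cores(layout_per_side, sides=2, cores_per_die=CORES_PER_DIE):
    """Return (cpu_dies, cores) for a layout such as 2-1-1-2 on each side."""
    dies = sum(layout_per_side) * sides
    return dies, dies * cores_per_die


genoa_dies, genoa_cores = package_cores([2, 1, 1, 2])
extra_dies = genoa_dies - MILAN_CPU_DIES
print(genoa_dies, genoa_cores, extra_dies)
```

Under those assumptions a 2-1-1-2 arrangement on both sides gives 12 CPU dies, i.e. 4 more than Milan and 96 cores in total, which lines up with the "up to 4 extra cpu chips" and 96-core figures.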
 

turtile

Senior member
Aug 19, 2014
614
294
136
That article mentions only one patent, and it's the boring patent covering the possibility of hybrid cores posing as one. I think the author just took the Strix Point containing big.LITTLE rumor going around on Twitter at the time, looked up one patent and ran with that.

There was a research paper that I can not find now which explained the use of a mini-processor that would organize/monitor the powerful cores to decide which core(s) should run certain tasks. The result was over a 20% performance increase in theory. I think this is what AMD is planning. They can also use the core to shut off the entire CCX or certain cores.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
There was a research paper that I can not find now which explained the use of a mini-processor that would organize/monitor the powerful cores to decide which core(s) should run certain tasks. The result was over a 20% performance increase in theory. I think this is what AMD is planning. They can also use the core to shut off the entire CCX or certain cores.
I actually think Zen is already doing something in that direction through the scalable control fabric, which is for monitoring and controlling all parts of the package. Except that it can't actually move tasks, as that's the job of the OS's scheduler (which under Windows happens to be braindead). There's definitely room for optimizations, though both options have obvious disadvantages: either offering more data to the OS for the scheduler to consider (making software support the weak link), or going the opposite way and making cores completely opaque so that the OS doesn't even try to move tasks around (so that the hardware scheduler can handle it all uninterrupted).
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136

Turin has a max cTDP of 600W

Should we have a Zen5 thread?


Zen5 EPYC should also have two configurations: 192C/384T and 256C/512T.

Interesting. This particular "group" also has a max VID of 1.8V. Zen 5 has some very interesting traits it seems
We still seem to be in the dark about Zen 4, so it seems a bit premature to make a Zen 5 thread. With pervasive use of stacking, AMD could go in many different directions and have many different variants, so speculation is a bit useless. In general, I am thinking that they might have a Zen 4 based stacked device later on, leading to an initial stacked Zen 5 solution possibly using some of the same die as the Zen 4 variant. Also, might Zen 5 variants contain integrated GPUs on package for HPC? Is all 600 Watts for cpu cores and IO die / interposers / whatever? It would be pretty easy to get to 600 Watts if an HBM gpu or two are included. If the IO die / cpu die are stacked, that frees up a lot of space on package for other devices. They could get that high a power consumption with CPUs if they include massive floating point resources, like if they basically attach a few gpu-like units directly to the cpu. Zen 5 seems to be the radical design departure, so we don’t really have much information to go on.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
The rumors say SP5 is significantly larger than SP3. They could, perhaps do 2-1-1-2 cpu die down each side which would allow the IFIS routing to go under the two single cpu chips on each side in the middle. The rumors I have seen so far indicate that Genoa will look almost the same as Milan, except with up to 4 extra cpu chips. Likely still serdes. I thought the gigabyte leak actually listed IO die, cpu chiplet, and package sizes. This seems to indicate that it is all still serdes. That is going to be the quickest time to market, so that is what I expected. Intel has talked about stacked devices, but we will see when they are actually widely available.

I suspect that they are working on a stacked version in parallel that might come out later. I suspect that is the rumored 128 core device. I don’t think there will be more than 96 cores with serdes, but we could get surprised again. With the larger SP5 package, they have some more area for routing. It could also be more layers and such.

The rumors say? 🙂 This has all been out in the open in extreme detail for months.



SP5_dim.JPG


SP5_removal.JPG
 

andermans

Member
Sep 11, 2020
151
153
76
We still seem to be in the dark about Zen 4, so it seems to be a bit premature to make a Zen 5 thread. With pervasive use of stacking, AMD could go in many different directions and have many different variants; so speculation is a bit useless. In general, I am thinking that they might have Zen 4 based stacked device later on leading to initial stacked Zen 5 solution possibly using some of the same die as the Zen 4 variant. Also, might Zen 5 variants contain integrated GPUs on package for HPC? Is all 600 Watts for cpu cores and IO die / interposers / whatever? It would be pretty easy to get to 600 Watts if an HBM gpu or two are included. If the IO die / cpu die are stacked, that frees up a lot of space on package for other devices. They could get that high of power consumption with CPUs if they include massive floating point resources, like if they basically attach a few gpu-like units directly to the cpu. Zen 5 seems to be the radical design departure, so we don’t really have much information to go on.


600W for 256 cores wouldn't be that much. That is roughly 2.3W per core, while a 64-core Milan @ 225W is about 3.5W per core. (IO is included in these numbers, so don't read them too closely. I'm just saying this isn't a crazy amount for that many cores.)
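
For anyone who wants to redo the per-core numbers, a minimal sketch follows. The 64-core count for the 225W Milan part is an assumption on my side, and, as noted above, package power includes the IO die, so this is only a rough comparison.

```python
def watts_per_core(package_watts, cores):
    """Naive package power divided by core count (IO die power included)."""
    return package_watts / cores


zen5_rumor = watts_per_core(600, 256)  # rumored 600W cTDP, 256 cores
milan = watts_per_core(225, 64)        # assumption: 64-core Milan at 225W

print(f"Zen 5 rumor: {zen5_rumor:.2f} W/core")
print(f"Milan:       {milan:.2f} W/core")
```

Plugging in a different Milan core count or TDP changes the ratio, but under these assumptions the rumored part still ends up with less package power per core than Milan.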
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
Zen4 and Zen5 should use the same packages. (Just like Zen...Zen3)
Most likely with the same CCD and IO die arrangements in the packages as well.

I would expect:

Zen4 ---> Zen5

(1) General use of PAM4 for the SERDES so:
- PCIe-5 --> PCIe-6 doubles the bandwidth using the same frequency but 2 bits instead of 1 bit per symbol.
- XGMI3 --> XGMI4 doubles the bandwidth between the IO die and the CCDs using the same number of pins

(2) Doubling the number of cores for each CCD is enabled by doubling the SERDES bandwidth.
- 16 cores per CCD
- The same number of serdes IO lines
- L3 VCache in increments of 128 MB per die
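
The PAM4 point in (1) can be sketched with a few lines of arithmetic: a signal with N levels carries log2(N) bits per symbol, so going from 2-level NRZ to 4-level PAM4 at the same symbol rate doubles the raw rate. The 32 GBaud figure below is the PCIe 5.0 per-lane symbol rate; the helper names are just for illustration, and encoding/FEC overhead is ignored.

```python
import math


def bits_per_symbol(levels):
    """Bits carried by one symbol of a signal with `levels` amplitude levels."""
    return int(math.log2(levels))


def raw_lane_rate_gbps(symbol_rate_gbaud, levels):
    """Raw per-lane bit rate, ignoring encoding and FEC overhead."""
    return symbol_rate_gbaud * bits_per_symbol(levels)


nrz = raw_lane_rate_gbps(32, levels=2)   # PCIe 5.0 style NRZ
pam4 = raw_lane_rate_gbps(32, levels=4)  # PCIe 6.0 style PAM4

print(nrz, pam4, pam4 / nrz)
```

Whether XGMI would actually follow the same step is the speculation in the post; the sketch only shows why the same pins at the same symbol rate would carry twice the raw bandwidth.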


Possible_Zen5_CCD_smaller.jpg
 