Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
820
1,456
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 
  • Like
Reactions: richardllewis_01

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,866
136
It will depend on which type of IP AMD will leverage. If they are so willing to integrate these in their product line, they for sure have some customers asking for them.
 

moinmoin

Diamond Member
Jun 1, 2017
5,236
8,443
136
Xilinx offered a lot of different IPs, so I'd honestly be surprised if there weren't some small parts in AMD products already. Though the teaser for 2023 is bound to be for something bigger. Maybe the FPGA capability embedded in cores already, as hinted by several patents? Then AMD has its own "software defined silicon". 😬
 

maddie

Diamond Member
Jul 18, 2010
5,151
5,537
136
Doesn't Xilinx have a large amount of leading edge packaging IP also, besides FPGA?
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
Doesn't Xilinx have a large amount of leading edge packaging IP also, besides FPGA?
Xilinx does have a ton of packaging IP, but for anything chiplet-related AMD has been racking up patents and has a lot more of them.
Xilinx's patent pool is smaller, close to 5K. AMD's patent pool is close to 3x that of Xilinx, not counting around 1.5-2K in various stages of approval.
Good thing about the patent pool of these two is that they are more focused on the area they do business in unlike many others who patent many things not even remotely related to what their core business is.

Then AMD has its own "software defined silicon".
You can argue software defined silicon is actually more suitable here since you can download the actual HDL code to the FPGA and change the HW behaviour on the fly:)
FPGAs are not cheap though, so not sure where they will deploy them.
But Video Codecs, tailored AI algos etc are so fast to deploy on FPGAs using Xilinx store components.
 
  • Like
Reactions: Tlh97 and moinmoin

Hans Gruber

Platinum Member
Dec 23, 2006
2,507
1,345
136
Another reason why AMD would release Zen 4 earlier, and not just as a response to Alder Lake: if Intel is planning to follow up quickly with another Alder Lake revision, a later launch would make Zen 4 underwhelming and return AMD to being a value-play CPU company. By releasing Zen 4 earlier, it would be on the market before Intel has a follow-up to the current Alder Lake processors, keeping Zen 4 relevant (market-leading performance) for several months and protecting AMD's margins.

People forget that Intel is on the 10nm process that was delayed for years. They are moving to 7nm silicon by next year if not late 2022. Intel was on 14nm and still is for the CPUs prior to Alder Lake. That has been 6 years on 14nm.

We all know Zen 4 will be significantly better than Alder Lake. Zen 3 should have been on 5nm TSMC silicon. AMD has a silicon cushion because they have not been on the cutting edge of nodes offered by TSMC.

All is not lost for AMD. They simply have to adjust their product release dates on Zen 4.
 

dnavas

Senior member
Feb 25, 2017
355
190
116
But Video Codecs, tailored AI algos etc are so fast to deploy on FPGAs using Xilinx store components.

It'll be interesting to watch how this evolves. I get visions of LLVM-generated "code" that gets installed in processors as part of giant distributed big-data processing. Maybe that's the bit-banger in me.
 

Saylick

Diamond Member
Sep 10, 2012
3,952
9,229
136
I have to at least expect that Xilinx's SmartNIC products will integrate something from AMD... Currently they use ARM cores with some nondescript vector engine and their own FPGA block. I am not sure how big of a deal it would be to run x86 on the NIC instead of ARM, especially since Nvidia's competing Mellanox SmartNICs are also ARM-based, but that seems like a relatively small product to integrate first.

As for longer term collaboration, I think we're only a few years out before we see an AMD heterogeneous HPC package that has CPU, GPU, and FPGA chiplets all under the hood. The Xilinx side of the company will focus on developing a software platform for the ENTIRE processor to run. At that point, it will just be a matter of having the software decide where best to run your code.
 
  • Like
Reactions: Tlh97 and ftt

eek2121

Diamond Member
Aug 2, 2005
3,397
5,027
136
Another reason why AMD would release Zen 4 earlier, and not just as a response to Alder Lake: if Intel is planning to follow up quickly with another Alder Lake revision, a later launch would make Zen 4 underwhelming and return AMD to being a value-play CPU company. By releasing Zen 4 earlier, it would be on the market before Intel has a follow-up to the current Alder Lake processors, keeping Zen 4 relevant (market-leading performance) for several months and protecting AMD's margins.

People forget that Intel is on the 10nm process that was delayed for years. They are moving to 7nm silicon by next year if not late 2022. Intel was on 14nm and still is for the CPUs prior to Alder Lake. That has been 6 years on 14nm.

We all know Zen 4 will be significantly better than Alder Lake. Zen 3 should have been on 5nm TSMC silicon. AMD has a silicon cushion because they have not been on the cutting edge of nodes offered by TSMC.

All is not lost for AMD. They simply have to adjust their product release dates on Zen 4.

AMD stated 2H 2022 for Zen 4. I believe AMD over a rando on the Chiphell forums.

EDIT: the original claim is here: https://www.chiphell.com/forum.php?mod=viewthread&tid=2392910&page=1#pid49276473
 

jpesk2

Junior Member
Jan 10, 2019
7
6
81
Some back of the napkin math, taking the rumor of 18% IPC, a 7% all-core boost increase and an 8.7% single-core boost increase at face value.
Cinebench R23
Single Core would be around 2140 (single core boost of about 5.35 GHz + 18% IPC)
Multi Core would be around 36300 (multi core boost of about 4.5 GHz + 18% IPC)
This assumes pretty much perfect scaling with IPC and frequency (I know that is almost never a thing)
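For anyone who wants to redo the napkin math with different rumours, here is a minimal Python sketch of the same calculation. The uplift percentages are the rumoured figures from this post; the two baseline scores are not stated anywhere in the thread and are simply back-solved from the projections above, so treat them as hypothetical placeholders.

```python
# Napkin math: projected score = baseline * (1 + IPC gain) * (1 + clock gain).
# Assumes perfect scaling with IPC and frequency, which rarely holds in practice.

def projected_score(baseline: float, ipc_gain: float, clock_gain: float) -> float:
    """Scale a baseline benchmark score by multiplicative IPC and clock gains."""
    return baseline * (1.0 + ipc_gain) * (1.0 + clock_gain)

IPC_GAIN = 0.18         # rumoured IPC uplift
ST_CLOCK_GAIN = 0.087   # rumoured single-core boost uplift
MT_CLOCK_GAIN = 0.07    # rumoured all-core boost uplift

# ASSUMPTION: hypothetical Zen 3 baselines, back-solved from the projections
# in the post; they are not measured results.
BASELINE_ST = 1668
BASELINE_MT = 28750

st_uplift = (1.0 + IPC_GAIN) * (1.0 + ST_CLOCK_GAIN)   # about 1.28x
mt_uplift = (1.0 + IPC_GAIN) * (1.0 + MT_CLOCK_GAIN)   # about 1.26x

st_score = projected_score(BASELINE_ST, IPC_GAIN, ST_CLOCK_GAIN)
mt_score = projected_score(BASELINE_MT, IPC_GAIN, MT_CLOCK_GAIN)

print(f"Single core: {st_score:.0f} ({(st_uplift - 1) * 100:.1f}% uplift)")
print(f"Multi core:  {mt_score:.0f} ({(mt_uplift - 1) * 100:.1f}% uplift)")
```

The gains multiply rather than add, which is why 18% IPC plus 8.7% clocks lands at roughly 28% rather than 26.7%.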

I have no idea if the rumor from chiphell is even close, but if it is Zen 4 is looking pretty darned good.
I think the CPU market over the next few years for consumers is going to be great -- the performance available for desktop users is going to be ridiculous.
 

Mopetar

Diamond Member
Jan 31, 2011
8,450
7,657
136
Graymon says TSMC N6 for Zen4 IOD, BTW.

If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
Some back of the napkin math, taking the rumor of 18% IPC, a 7% all-core boost increase and an 8.7% single-core boost increase at face value.
Cinebench R23
Single Core would be around 2140 (single core boost of about 5.35 GHz + 18% IPC)
Multi Core would be around 36300 (multi core boost of about 4.5 GHz + 18% IPC)
This assumes pretty much perfect scaling with IPC and frequency (I know that is almost never a thing)

I have no idea if the rumor from chiphell is even close, but if it is Zen 4 is looking pretty darned good.
I think the CPU market over the next few years for consumers is going to be great -- the performance available for desktop users is going to be ridiculous.
1644877330741.png

Just in case folks forget the famous drum AMD was beating about a higher-than-industry trend for IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm^2 N5 CCD.
It's likely not, but you never know; it could even be less :p
 

jpesk2

Junior Member
Jan 10, 2019
7
6
81
View attachment 57442

Just in case folks forget the famous drum AMD was beating about a higher-than-industry trend for IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm^2 N5 CCD.
It's likely not, but you never know; it could even be less :p
Were they talking strictly IPC or was it pure performance? If it is pure performance, then the speculation from Chiphell alone puts it at almost a 30% improvement over Zen 3. RedGamingTech, MLID, and some others are saying that it should be around a 40% improvement over Zen 3 -- I think 30% is probably more likely.
 

Saylick

Diamond Member
Sep 10, 2012
3,952
9,229
136
View attachment 57442

Just in case folks forget the famous drum AMD was beating about a higher-than-industry trend for IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm^2 N5 CCD.
It's likely not, but you never know; it could even be less :p
That plot was for ST performance, so it includes IPC and clock gains.
This is a graph of raw performance, combining both the efficiency of processing instructions (IPC, instructions per clock) as well as benefits from an improved manufacturing process. AMD cites the industry trend as growing 7-8% every year, and that they fully expect the Zen package, due to design improvements in the microarchitecture and better foundry nodes, will go above and beyond that 7-8% figure.
With an 18% IPC gain and a 7% increase in peak clocks over Zen 3, which came out in Nov 2020, I think they're on track to beat the previous 7-8% CAGR in ST performance. Knowing now that Intel just rested on their laurels for the second half of the last decade, 7-8% is a low bar to beat. 1.08*1.08 = 1.1664, so the 18% IPC gain alone satisfies their goal.
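To make the CAGR comparison concrete, here is a small Python sketch that annualizes the rumoured gains. The two-year gap is an assumption (Nov 2020 to a launch around late 2022), and the 18% and 7% figures are the same unconfirmed rumours discussed above.

```python
# Compare the rumoured Zen 3 -> Zen 4 single-thread gain against an 8%/year trend.

def annualized_growth(total_factor: float, years: float) -> float:
    """Convert a total performance factor over `years` into a yearly growth rate."""
    return total_factor ** (1.0 / years) - 1.0

YEARS = 2.0            # ASSUMPTION: roughly two years between Zen 3 and Zen 4
INDUSTRY_TREND = 0.08  # upper end of the 7-8% yearly trend cited above

# What an 8%/year trend compounds to over the same period.
trend_factor = (1.0 + INDUSTRY_TREND) ** YEARS   # 1.1664

# Rumoured gains: IPC alone, and IPC combined with peak clocks.
ipc_only_factor = 1.18
ipc_and_clock_factor = 1.18 * 1.07               # 1.2626

print(f"Trend over {YEARS:.0f} years:  {trend_factor:.4f}x")
print(f"IPC only:            {ipc_only_factor:.4f}x "
      f"({annualized_growth(ipc_only_factor, YEARS) * 100:.1f}%/yr)")
print(f"IPC + clocks:        {ipc_and_clock_factor:.4f}x "
      f"({annualized_growth(ipc_and_clock_factor, YEARS) * 100:.1f}%/yr)")
```

Under these assumptions the IPC-only figure works out to a bit under 9% per year and IPC plus clocks to a bit over 12% per year, both above the 8% bar.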
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.
The opposite seems more likely. The Epyc IO die may be very pad limited without some advanced packaging technology. It is not that large and it somehow has 12-channel DDR5, 128 PCI-Express 5 lanes, and 12 CPU die connections. That is a ridiculous number of pads required.

The consumer IO die will likely only have 2 channel memory and 32 high speed serdes for pci-express, SATA, etc. The gpu that is included has been rumored to be very small. It is just there to provide roughly the same features across the whole line. This makes it look like they may make the consumer IO die at GF and the Epyc IO die at TSMC.

That seems like it makes more sense to me. The power consumption of the consumer IO die isn’t really that important. The power consumption for the Epyc IO die is a lot more important, so it seems more likely to be on 6 nm and use some advanced packaging tech to handle the ridiculous amount of IO. The consumer IO die wouldn’t really need any advanced packaging tech. It would likely be larger than the current IO die due to the GPU, so probably not pad limited. Why waste TSMC wafers when it isn’t required?

Low end parts might be actual GF made APUs. Higher end parts, but <= 8 core, might be TSMC made APUs, with the MCM desktop parts only being 8 to 16 cores with a GF IO die. Chipsets would still be GF. There have been some rumors of a lower end socket between full Epyc and desktop parts. That is a huge gap without something. Perhaps they will have an IO die on GF that is roughly half size: 6 channel memory, 64 lanes, only 4 or 8 GMI for CPU dies. I don’t know if 6 makes sense if the IO die is still divided into 4 quadrants. That would allow Threadripper, workstation, and low end server parts without using the full, massive, SP5 Epyc socket.
 
  • Like
Reactions: Tlh97 and Joe NYC

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

I think Raphael and Genoa will most likely use IOD on the same process technology. If they used different, Graymon would have said that.

There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.

Agreed. That (reuse) was actually my first response on the subject of IOD potentially using TSMC N6:

Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000) | AnandTech Forums: Technology, Hardware, Software, and Deals
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
How do you think Xilinx IP will help Bergamo ?

For server vendors, there is always encrypt / decrypt, encode / decode. A block of FPGA could be included in the CPU and could then be programmed for specific needs of the client, for encryption methods, codecs, which can then be changed or new codecs added.

Also, the whole area of Confidential Computing is striving to keep data encrypted at every stage of transit, and here again, the FPGA could offload some of that.
 

Mopetar

Diamond Member
Jan 31, 2011
8,450
7,657
136
The opposite seems more likely. The Epyc IO die may be very pad limited without some advanced packaging technology. It is not that large and it somehow has 12-channel DDR5, 128 PCI-Express 5 lanes, and 12 CPU die connections. That is a ridiculous number of pads required.

I'm not sure what the argument here is. All of that IO is precisely what doesn't benefit from a node shrink to the same degree as logic or even SRAM does. Why use an advanced node when you can't shrink below a certain size for the very reason you outline?

The consumer IO die will likely only have 2 channel memory and 32 high speed serdes for pci-express, SATA, etc. The gpu that is included has been rumored to be very small. It is just there to provide roughly the same features across the whole line. This makes it look like they may make the consumer IO die at GF and the Epyc IO die at TSMC.

It doesn't matter if the GPU itself is small in the sense that it only has 2 compute units. Look at a die annotation for any of AMD's APUs and see how much additional logic is included so that the GPU can actually connect to a display.

YtxGU48DGLgI2sCX.jpg

Source: https://www.techpowerup.com/264801/amd-renoir-die-shot-pictured

Even though it's not a powerful GPU it still needs to include a lot of other circuitry that isn't a shader just to function as a GPU. I wouldn't be surprised if most of the transistors dedicated to the GPU aren't going to the shaders. Some of that (e.g. ROPs) can scale downward with the reduced number of CUs, but other parts can't.

That seems like it makes more sense to me. The power consumption of the consumer IO die isn’t really that important. The power consumption for the Epyc IO die is a lot more important, so it seems more likely to be on 6 nm and use some advanced packaging tech to handle the ridiculous amount of IO. The consumer IO die wouldn’t really need any advanced packaging tech. It would likely be larger than the current IO die due to the GPU, so probably not pad limited. Why waste TSMC wafers when it isn’t required?

Although it seems that way on the face of things, you actually have it backwards. AMD faces fiercer competition in the desktop space than they face in server right now. The advantages that they have in terms of the number of cores and the efficiency of those cores will mean that even if the IO die isn't as efficient as it could be they'll still come out on top. Meanwhile on desktop where there will be more fierce competition, any advantage that AMD can get will benefit them.

I mentioned that the desktop IO die for Zen 2/3 is about 125 mm^2. Meanwhile the IO die used on Epyc CPUs is over 400 mm^2, and that's for the current dies, which only have 8 memory channels and connect to at most 8 chiplets. AMD can get roughly 4x as many desktop IO dies from a wafer as they can server IO dies. The only real justification for using TSMC for the server IO die is that the server chips can better absorb the extra cost, since they're a higher margin part.
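As a rough sanity check on the "roughly 4x" figure, here is a Python sketch using a common dies-per-wafer approximation plus a simple Poisson yield model. The 300 mm wafer size and the defect density are assumptions picked for illustration, not known figures for either foundry, and 400 mm^2 is used as a round stand-in for "over 400 mm^2".

```python
import math

WAFER_DIAMETER_MM = 300.0   # ASSUMPTION: standard 300 mm wafer
DEFECTS_PER_CM2 = 0.1       # ASSUMPTION: illustrative defect density, not a foundry figure

def gross_dies_per_wafer(die_area_mm2: float,
                         wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> float:
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return wafer_area / die_area_mm2 - edge_loss

def poisson_yield(die_area_mm2: float,
                  defects_per_cm2: float = DEFECTS_PER_CM2) -> float:
    """Fraction of defect-free dies under a simple Poisson defect model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)

def good_dies_per_wafer(die_area_mm2: float) -> float:
    """Gross dies scaled by the estimated yield."""
    return gross_dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2)

DESKTOP_IOD_MM2 = 125.0   # desktop IO die size quoted in the post
SERVER_IOD_MM2 = 400.0    # round stand-in for "over 400 mm^2"

gross_ratio = gross_dies_per_wafer(DESKTOP_IOD_MM2) / gross_dies_per_wafer(SERVER_IOD_MM2)
good_ratio = good_dies_per_wafer(DESKTOP_IOD_MM2) / good_dies_per_wafer(SERVER_IOD_MM2)

print(f"Gross dies ratio (desktop/server): {gross_ratio:.2f}x")
print(f"Good dies ratio (desktop/server):  {good_ratio:.2f}x")
```

On area alone the ratio is about 3.2x; edge losses push it to around 3.5x, and any realistic defect density widens the gap further in favour of the small die, so "roughly 4x" is in the right neighbourhood under these assumptions.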
 
  • Like
Reactions: scineram and Tlh97