Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Vattila · Oct 6, 2019

Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!

DrMrLordX · Feb 14, 2022

leoneazzurro said:
Probably on the server side.

I would expect integrated FPGAs to be pretty niche. At least at first.

leoneazzurro · Feb 14, 2022

It will depend on which type of IP AMD leverages. If they are so willing to integrate these into their product line, they surely have some customers asking for them.

moinmoin · Feb 14, 2022

Xilinx offered a lot of different IPs, so I'd honestly be surprised if there weren't some small parts in AMD products already. Though the teaser for 2023 is bound to be for something bigger. Maybe FPGA capability embedded in the cores, as hinted at by several patents? Then AMD has its own "software defined silicon". 😬

maddie · Feb 14, 2022

Doesn't Xilinx have a large amount of leading edge packaging IP also, besides FPGA?

ryanjagtap · Feb 14, 2022

maddie said:
Doesn't Xilinx have a large amount of leading edge packaging IP also, besides FPGA?

Not only that, can they use only Xilinx IPs or their partner IPs as well (like Arastu, etc)?

DisEnchantment · Feb 14, 2022

maddie said:
Doesn't Xilinx have a large amount of leading edge packaging IP also, besides FPGA?

Xilinx does have a ton of packaging IP, but AMD has been racking up chiplet-related IP and has a lot more of it.
Xilinx's IP pool is smaller, close to 5K. AMD has a bigger patent pool, close to 3x that of Xilinx, not counting around 1.5-2K in various stages of approval.
The good thing about the patent pools of these two is that they are focused on the areas they do business in, unlike many others who patent things not even remotely related to their core business.

moinmoin said:
Then AMD has its own "software defined silicon".

You can argue "software defined silicon" is actually more suitable here, since you can download the actual HDL code to the FPGA and change the HW behaviour on the fly.

FPGAs are not cheap though, so not sure where they will deploy them.
But Video Codecs, tailored AI algos etc are so fast to deploy on FPGAs using Xilinx store components.

Hans Gruber · Feb 14, 2022

Another reason why AMD would release Zen 4 earlier, and not just as a response to Alder Lake: if Intel is planning to quickly follow up with another Alder Lake revision, it would make Zen 4 underwhelming and return AMD to being a value-play CPU company. By releasing Zen 4 earlier, AMD would have it on the market before Intel has a follow-up to the current Alder Lake processors, keeping Zen 4 relevant (market-leading performance) for several months and protecting their margins.

People forget that Intel is on the 10nm process that was delayed for years. They are moving to 7nm silicon by next year, if not late 2022. Intel was on 14nm, and still is for the CPUs prior to Alder Lake. That has been 6 years on 14nm.

We all know Zen 4 will be significantly better than Alder Lake. Zen 3 should have been on 5nm TSMC silicon. AMD has a silicon cushion because they have not been on the cutting edge of nodes offered by TSMC.

All is not lost for AMD. They simply have to adjust their product release dates on Zen 4.

dnavas · Feb 14, 2022

DisEnchantment said:
But Video Codecs, tailored AI algos etc are so fast to deploy on FPGAs using Xilinx store components.

It'll be interesting to watch how this evolves. I get visions of LLVM-generated "code" that gets installed in processors as part of giant distributed big-data processing. Maybe that's the bit-banger in me.

Saylick · Feb 14, 2022

I have to at least expect that Xilinx's SmartNIC products will integrate something from AMD... Currently they use ARM cores with some nondescript vector engine and their own FPGA block. I am not sure how big of a deal it would be to run x86 on the NIC instead of ARM, especially since Nvidia's competing Mellanox SmartNICs are also ARM-based, but that seems like a relatively small product to integrate first.

As for longer term collaboration, I think we're only a few years out before we see an AMD heterogeneous HPC package that has CPU, GPU, and FPGA chiplets all under the hood. The Xilinx-side of the company will focus on developing a software platform for the ENTIRE processor to run. At that point, it will just be a matter having the software decide where best to run your code.

eek2121 · Feb 14, 2022

Hans Gruber said:
Another reason why AMD would release Zen 4 earlier, and not just as a response to Alder Lake: if Intel is planning to quickly follow up with another Alder Lake revision, it would make Zen 4 underwhelming and return AMD to being a value-play CPU company. By releasing Zen 4 earlier, AMD would have it on the market before Intel has a follow-up to the current Alder Lake processors, keeping Zen 4 relevant (market-leading performance) for several months and protecting their margins.

People forget that Intel is on the 10nm process that was delayed for years. They are moving to 7nm silicon by next year, if not late 2022. Intel was on 14nm, and still is for the CPUs prior to Alder Lake. That has been 6 years on 14nm.

We all know Zen 4 will be significantly better than Alder Lake. Zen 3 should have been on 5nm TSMC silicon. AMD has a silicon cushion because they have not been on the cutting edge of nodes offered by TSMC.

All is not lost for AMD. They simply have to adjust their product release dates on Zen 4.

AMD stated 2H 2022 for Zen 4. I believe AMD over a rando on the chiphell forums.

EDIT: the original claim is here: https://www.chiphell.com/forum.php?mod=viewthread&tid=2392910&page=1#pid49276473

jpesk2 · Feb 14, 2022

Some back-of-the-napkin math, assuming the rumor of an 18% IPC gain, a 7% all-core boost increase, and an 8.7% single-core boost is true.
Cinebench R23:
Single Core would be around 2140 (single-core boost of about 5.35 GHz + 18% IPC)
Multi Core would be around 36300 (multi-core boost of about 4.5 GHz + 18% IPC)
This assumes pretty much perfect scaling with IPC and frequency (I know that is almost never a thing).
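
For anyone who wants to check the arithmetic, here is a minimal sketch of the same napkin math. It only uses the rumored percentages and the two projected scores above; the baseline scores are back-derived from those numbers rather than measured, and the perfect-scaling model (performance = IPC x clock) is the same simplifying assumption already noted.

```python
# Rumored uplifts quoted in the post (unverified rumor, not measurements).
IPC_GAIN = 1.18                 # +18% IPC
SINGLE_CORE_CLOCK_GAIN = 1.087  # +8.7% single-core boost
ALL_CORE_CLOCK_GAIN = 1.07      # +7% all-core boost

# Projected Cinebench R23 scores from the post.
PROJECTED_SINGLE = 2140
PROJECTED_MULTI = 36300


def combined_uplift(ipc_gain: float, clock_gain: float) -> float:
    """Perfect-scaling assumption: performance scales as IPC times clock."""
    return ipc_gain * clock_gain


def implied_baseline(projected_score: float, uplift: float) -> float:
    """Back out the baseline score that the projection implies."""
    return projected_score / uplift


single_uplift = combined_uplift(IPC_GAIN, SINGLE_CORE_CLOCK_GAIN)
multi_uplift = combined_uplift(IPC_GAIN, ALL_CORE_CLOCK_GAIN)

print(f"single-core uplift: {single_uplift:.3f}x")
print(f"multi-core uplift:  {multi_uplift:.3f}x")
print(f"implied single-core baseline: {implied_baseline(PROJECTED_SINGLE, single_uplift):.0f}")
print(f"implied multi-core baseline:  {implied_baseline(PROJECTED_MULTI, multi_uplift):.0f}")
```

Under these assumptions the combined uplift works out to roughly 26-28%, depending on which clock figure applies.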

I have no idea if the rumor from chiphell is even close, but if it is Zen 4 is looking pretty darned good.
I think the CPU market over the next few years for consumers is going to be great -- the performance available for desktop users is going to be ridiculous.

Mopetar · Feb 14, 2022

Joe NYC said:
Graymon says TSMC N6 for Zen4 IOD, BTW.

If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.

DisEnchantment · Feb 14, 2022

jpesk2 said:
Some back-of-the-napkin math, assuming the rumor of an 18% IPC gain, a 7% all-core boost increase, and an 8.7% single-core boost is true.
Cinebench R23:
Single Core would be around 2140 (single-core boost of about 5.35 GHz + 18% IPC)
Multi Core would be around 36300 (multi-core boost of about 4.5 GHz + 18% IPC)
This assumes pretty much perfect scaling with IPC and frequency (I know that is almost never a thing).

I have no idea if the rumor from chiphell is even close, but if it is Zen 4 is looking pretty darned good.
I think the CPU market over the next few years for consumers is going to be great -- the performance available for desktop users is going to be ridiculous.

Just in case folks forget the famous drum AMD was beating about higher-than-industry-trend IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they had better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm2 N5 CCD.
It's likely not, but you never know; it could even be less.

jpesk2 · Feb 14, 2022

DisEnchantment said:
View attachment 57442

Just in case folks forget the famous drum AMD was beating about higher-than-industry-trend IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they had better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm2 N5 CCD.
It's likely not, but you never know; it could even be less.

Were they talking strictly IPC or was it pure performance? If it is pure performance, then just the speculation that came from chiphell puts it at almost a 30% improvement over Zen 3. RedGamingTech, MLID, and some others are saying that it should be around a 40% improvement over Zen 3. I think 30% is probably more likely.

RnR_au · Feb 14, 2022

Kepler tweets in the thread that the all-core frequency boost is much higher than the 8.7% stated in the rumour. Later, he says it's >20% for all-core workloads like Cinebench, Blender, etc.

That gets it closer to the 40% overall that has been mentioned by others.

https://twitter.com/x/status/1493103049098809353

Saylick · Feb 14, 2022

DisEnchantment said:
View attachment 57442

Just in case folks forget the famous drum AMD was beating about higher-than-industry-trend IPC: 18% IPC over a period of two years is poor.
If it is really this, and it may well be, they will have to eat their own words, and they had better be ready to take a lot of flak. Don't forget the >1.6x Xtor gain they get on the 72.225 mm2 N5 CCD.
It's likely not, but you never know; it could even be less.

That plot was for ST performance, so it includes IPC and clock gains.

This is a graph of raw performance, combining both the efficiency of processing instructions (IPC, instructions per clock) as well as benefits from an improved manufacturing process. AMD cites the industry trend as growing 7-8% every year, and that they fully expect the Zen package, due to design improvements in the microarchitecture and better foundry nodes, will go above and beyond that 7-8% figure.

With an 18% IPC gain and 7% increase in peak clocks over Zen 3, which came out in Nov 2020, I think they're on track to beat the previous 7-8% CAGR in ST performance. Knowing now that Intel just rested on their laurels for the second half of the last decade, 7-8% is a low bar to beat. 1.08*1.08 = 1.1664, so the 18% IPC gain alone satisfies their goal.
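
To make the comparison explicit, here is a rough sketch of the compounding. The 7-8% trend and the 18% / 7% figures are the ones quoted above; the two-year gap between Zen 3 and Zen 4 is my own rounding, and the rumored gains are unverified.

```python
# Industry trend cited above: 7-8% ST performance growth per year.
TREND_LOW = 0.07
TREND_HIGH = 0.08
YEARS = 2  # assumption: roughly two years between Zen 3 (Nov 2020) and Zen 4

# Rumored gains over Zen 3 (unverified).
RUMORED_IPC_GAIN = 1.18
RUMORED_CLOCK_GAIN = 1.07


def compound(rate: float, years: float) -> float:
    """Total multiplier after growing at `rate` per year for `years` years."""
    return (1.0 + rate) ** years


def annualized(total_gain: float, years: float) -> float:
    """Per-year growth rate equivalent to `total_gain` spread over `years`."""
    return total_gain ** (1.0 / years) - 1.0


bar_low = compound(TREND_LOW, YEARS)
bar_high = compound(TREND_HIGH, YEARS)
rumored_st = RUMORED_IPC_GAIN * RUMORED_CLOCK_GAIN

print(f"trend over {YEARS} years: {bar_low:.4f}x to {bar_high:.4f}x")
print(f"rumored IPC alone: {RUMORED_IPC_GAIN:.4f}x "
      f"({annualized(RUMORED_IPC_GAIN, YEARS):.1%} per year)")
print(f"rumored IPC x clock: {rumored_st:.4f}x "
      f"({annualized(rumored_st, YEARS):.1%} per year)")
```

If the gap ends up closer to two and a half years, the per-year figure drops accordingly, so changing `YEARS` is the first thing to try.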

jamescox · Feb 14, 2022

Mopetar said:
If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.

The opposite seems more likely. The Epyc IO die may be very pad limited without some advanced packaging technology. It is not that large and it somehow has 12 channel DDR5, 128 pci-express 5, and 12 cpu die connections. That is a ridiculous number of pads required.

The consumer IO die will likely only have 2 channel memory and 32 high speed serdes for pci-express, SATA, etc. The gpu that is included has been rumored to be very small. It is just there to provide roughly the same features across the whole line. This makes it look like they may make the consumer IO die at GF and the Epyc IO die at TSMC.

That seems like it makes more sense to me. The power consumption of the consumer IO die isn’t really that important. The power consumption for the Epyc IO die is a lot more important, so it seems more likely to be on 6 nm and use some advanced packaging tech to handle the ridiculous amount of IO. The consumer IO die wouldn’t really need any advanced packaging tech. It would likely be larger than the current IO die due to the gpu, so probably not pad limited. Why waste TSMC wafers when it isn’t required?

Low end parts might be actual GF made APUs. Higher end parts with <= 8 cores might be TSMC made APUs, with the MCM desktop parts only being 8 to 16 cores with a GF IO die. Chipsets would still be GF. There have been some rumors of a lower end socket between full Epyc and desktop parts. That is a huge gap without something. Perhaps they will have an IO die on GF that is roughly half size: 6 channel memory, 64 lanes, only 4 or 8 GMI for cpu die. I don’t know if 6 makes sense if the IO die is still divided into 4 quadrants. That would allow Threadripper, workstation, and low end server parts without using the full, massive, SP5 Epyc socket.

jpiniero · Feb 14, 2022

jamescox said:
Why waste TSMC wafers when it isn’t required.

I doubt they would have put the IGP in there if the IO die was using a GloFo node. It has to be N6.

Joe NYC · Feb 14, 2022

DisEnchantment said:
Zen 5 ??

It’s Day One For The Combined AMD And Xilinx And CEO Lisa Su Is Energized

#1-ranked tech industry analyst Patrick Moorhead had the chance to catch up with AMD CEO Lisa Su this weekend to talk about Day One of the combined AMD and Xilinx.

www.forbes.com

https://twitter.com/x/status/1493224799837069312

Hopefully Bergamo

Joe NYC · Feb 14, 2022

Mopetar said:
If they did use N6 for any IO dies it would be consumer. The Zen 2/3 IO die on GF's 12nm node was only about 125 mm^2 so they can still get a lot of them per wafer. If the rumors about integrated graphics on the IO die are true then they would be adding a lot of logic that does get more benefits from being on TSMC.

I think Raphael and Genoa will most likely use IODs on the same process technology. If they used different processes, Graymon would have said that.

Mopetar said:
There also isn't as much extra cost to the design since almost everything on the IO die would have been designed for Rembrandt which uses the same N6 process. While not all of it is a simple copy/paste job, a lot of the blocks could be reused or modified as necessary.

Agreed. That (reuse) was actually my first response on the subject of IOD potentially using TSMC N6:

Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000) | AnandTech Forums: Technology, Hardware, Software, and Deals

Saylick · Feb 14, 2022

Joe NYC said:
Hopefully Bergamo

How do you think Xilinx IP will help Bergamo ?

turtile · Feb 14, 2022

Joe NYC said:
Hopefully Bergamo

My guess is that we will first see Xilinx inside Zen 4 APUs to accelerate consumer workloads.

Joe NYC · Feb 14, 2022

Saylick said:
How do you think Xilinx IP will help Bergamo ?

For server vendors, there is always encrypt / decrypt, encode / decode. A block of FPGA could be included in the CPU and could then be programmed for specific needs of the client, for encryption methods, codecs, which can then be changed or new codecs added.

Also, the whole area of Confidential Computing aims to keep data encrypted at every stage of transit, and here again, the FPGA could offload some of that.

Mopetar · Feb 14, 2022

jamescox said:
The opposite seems more likely. The Epyc IO die may be very pad limited without some advanced packaging technology. It is not that large and it somehow has 12 channel DDR5, 128 pci-express 5, and 12 cpu die connections. That is a ridiculous number of pads required.

I'm not sure what the argument here is. All of that IO is precisely what doesn't benefit from a node shrink to the same degree as logic or even SRAM does. Why use an advanced node when you can't shrink below a certain size for the very reason you outline?

jamescox said:
The consumer IO die will likely only have 2 channel memory and 32 high speed serdes for pci-express, SATA, etc. The gpu that is included has been rumored to be very small. It is just there to provide roughly the same features across the whole line. This makes it look like they may make the consumer IO die at GF and the Epyc IO die at TSMC.

It doesn't matter if the GPU itself is small in the sense that it only has 2 compute units. Look at a die annotation for any of AMD's APUs and see how much additional logic is included so that the GPU can actually connect to a display.

Source: https://www.techpowerup.com/264801/amd-renoir-die-shot-pictured

Even though it's not a powerful GPU it still needs to include a lot of other circuitry that isn't a shader just to function as a GPU. I wouldn't be surprised if most of the transistors dedicated to the GPU aren't going to the shaders. Some of that (e.g. ROPs) can scale downward with the reduced number of CUs, but other parts can't.

jamescox said:
That seems like it makes more sense to me. The power consumption of the consumer IO die isn’t really that important. The power consumption for the Epyc IO die is a lot more important, so it seems more likely to be on 6 nm and use some advanced packaging tech to handle the ridiculous amount of IO. The consumer IO die wouldn’t really need any advanced packaging tech. It would likely be larger than the current IO die due to the gpu, so probably not pad limited. Why waste TSMC wafers when it isn’t required?

Although it seems that way on the face of things, you actually have it backwards. AMD faces fiercer competition in the desktop space than they face in server right now. The advantages that they have in terms of the number of cores and the efficiency of those cores will mean that even if the IO die isn't as efficient as it could be they'll still come out on top. Meanwhile on desktop where there will be more fierce competition, any advantage that AMD can get will benefit them.

I mentioned that the desktop IO die for Zen 2/3 is about 125 mm^2. Meanwhile the IO die used on Epyc CPUs is over 400 mm^2 and that's for the current dies which only have 8 memory channels and connect to at most 8 chiplets. AMD can get roughly 4x as many desktop IO dies from a wafer as they can server IO dies. The only real justification for using it for servers is that the server chips can better absorb the extra cost of using TSMC for an IO die since they're a higher margin part.
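
A quick way to sanity-check the "roughly 4x" figure is the usual gross dies-per-wafer approximation. This is only a sketch: the 300 mm wafer diameter is an assumption, the formula ignores yield, scribe lines, and edge exclusion, and the 400 mm^2 input is a floor, since the Epyc IO die is stated to be over that.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumption: standard 300 mm wafer


def gross_dies_per_wafer(die_area_mm2: float,
                         wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Common approximation: wafer area / die area, minus an edge-loss term.

    Ignores defect yield, scribe lines, and edge exclusion, so treat the
    result as an upper-bound estimate rather than a real production figure.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


desktop_dies = gross_dies_per_wafer(125.0)  # desktop IO die, about 125 mm^2
server_dies = gross_dies_per_wafer(400.0)   # server IO die, at least 400 mm^2

print(f"desktop IO dies per wafer: {desktop_dies}")
print(f"server IO dies per wafer:  {server_dies}")
print(f"ratio: {desktop_dies / server_dies:.2f}x")
```

With exactly 400 mm^2 the ratio comes out around 3.5x; a larger server die, or any yield model that penalizes big dies, pushes it toward 4x or beyond.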

jrdls · Feb 14, 2022

deasd said:
Rumor time:

https://twitter.com/x/status/1493095824976277508

If that 18% IPC increase is relative to zen3, that would be underwhelming, but if this is relative to zen3D, then that's huge.
