Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

[Attachment: Untitled2.png]


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

Ajay

Lifer
Jan 8, 2001
15,430
7,849
136
Why do you think that neither are probably N5?

For mass-market availability. Like Milan now, specific hyperscalers will have availability.

Bergman has already noted that 7nm is supply constrained (thestreet.com)
Bergman: “TSMC has been a great partner for AMD. We've certainly grown over the last couple of years a great deal, and as now our primary foundry, they've been a big part of helping us through that growth. And as we look forward, they're key for us to continue that growth in the coming quarters and years.”

“In terms of specific product lines, I can't really give specifics on where we're seeing tightness...But yes, we would like to have more wafers going forward, to help the growth that we have in front of us. From a worldwide perspective, there continues to be tightness. But we're working through that with the TSMC, and we’ll see what 2021 brings us.”

But part of the Zen3 shortage was that popularity for Vermeer was higher than AMD's optimistic projections (same article).
 

scineram

Senior member
Nov 1, 2020
361
283
106
Why do you think that neither are probably N5?

For mass-market availability. Like Milan now, specific hyperscalers will have availability.
All the previous discussion about wafer availability on N5. It is not an accident that AMD only committed Zen 4 to 5 nm. [Attachment: 1607012734186.png]

Now it does leave open the possibility, but they probably did this for a reason. Even if RDNA3 and CDNA 2 are chiplets, that really helps with yields, but not wafer supply. Now the decision had to have been made already, and if they did choose 5 nm, I think they would have said so. N6 for GPU is more likely.

Zen 4 was in design when Zen 3 released, not even taped out. With around 15 months cadence it should be ready in 2022, as the roadmaps indicate. Also consumer cpu always launches first. Server validation takes much longer, why not sell high margin chips for desktops in the meantime?
 

uzzi38

Platinum Member
Oct 16, 2019
2,622
5,880
146
All the previous discussion about wafer availability on N5. It is not an accident that AMD only committed Zen 4 to 5 nm. [Attachment 35014]

Now it does leave open the possibility, but they probably did this for a reason. Even if RDNA3 and CDNA 2 are chiplets, that really helps with yields, but not wafer supply. Now the decision had to have been made already, and if they did choose 5 nm, I think they would have said so. N6 for GPU is more likely.

Zen 4 was in design when Zen 3 released, not even taped out. With around 15 months cadence it should be ready in 2022, as the roadmaps indicate. Also consumer cpu always launches first. Server validation takes much longer, why not sell high margin chips for desktops in the meantime?
And what would you think of the possibility that RDNA3 could be designed with multiple nodes in mind, hence the lack of specification as to which node?

Also, Genoa is different. It is launching before Zen 4 desktop and mobile. It's also very expensive, so it won't actually phase out Milan, at least not to begin with. It's a similar relationship to the Sapphire Rapids that ships in 2021 and Ice Lake SP.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
All the previous discussion about wafer availability on N5. It is not an accident that AMD only committed Zen 4 to 5 nm. [Attachment 35014]

Now it does leave open the possibility, but they probably did this for a reason. Even if RDNA3 and CDNA 2 are chiplets, that really helps with yields, but not wafer supply. Now the decision had to have been made already, and if they did choose 5 nm, I think they would have said so. N6 for GPU is more likely.

Zen 4 was in design when Zen 3 released, not even taped out. With around 15 months cadence it should be ready in 2022, as the roadmaps indicate. Also consumer cpu always launches first. Server validation takes much longer, why not sell high margin chips for desktops in the meantime?
Higher yields effectively mean more wafers.
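To put a rough number on that, here is a minimal sketch using the simple Poisson yield model, Y = exp(-A * D0). The defect density `D0` is an assumed illustrative value, not a TSMC figure, and the 600 square mm monolithic die is a hypothetical comparison point (8 x 75 square mm in one piece). Edge loss and harvesting of partially defective dies are ignored.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

def equivalent_wafers(wafers, yield_low, yield_high):
    """Wafers needed at yield_low to match the good silicon of `wafers` at yield_high."""
    return wafers * yield_high / yield_low

D0 = 0.1  # defects per cm^2: an assumption for illustration only

y_chiplet = poisson_yield(75, D0)      # small chiplet, roughly a Zen CCD
y_monolithic = poisson_yield(600, D0)  # hypothetical monolithic die of the same total area

print(f"chiplet yield:    {y_chiplet:.1%}")
print(f"monolithic yield: {y_monolithic:.1%}")
print(f"1000 chiplet wafers ~ {equivalent_wafers(1000, y_monolithic, y_chiplet):.0f} monolithic wafers")
```

Under these assumptions the same wafer allocation yields noticeably more usable silicon when it is cut into small dies, which is the sense in which yield substitutes for supply. The real ratio depends on the actual defect density and on how many partially defective dies can be salvaged.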
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
All the previous discussion about wafer availability on N5. It is not an accident that AMD only committed Zen 4 to 5 nm. [Attachment 35014]

Now it does leave open the possibility, but they probably did this for a reason. Even if RDNA3 and CDNA 2 are chiplets, that really helps with yields, but not wafer supply. Now the decision had to have been made already, and if they did choose 5 nm, I think they would have said so. N6 for GPU is more likely.

Zen 4 was in design when Zen 3 released, not even taped out. With around 15 months cadence it should be ready in 2022, as the roadmaps indicate. Also consumer cpu always launches first. Server validation takes much longer, why not sell high margin chips for desktops in the meantime?
If it uses chiplets and/or die stacking, then there is some possibility for different components to be made on different processes. AMD’s GPU with infinity cache is looking surprisingly similar to an Epyc IO die. They have full cpu-like virtual memory and possibly cache coherency. The Epyc IO die is technically 512-bit memory. The unified memory controller design may be very similar, just with different physical interfaces. This may allow them to reuse some portions of the design. If they do something with chip stacking and interposers, I guess it may even be possible to reuse the same silicon for infinity fabric and infinity cache. The physical memory interfaces would just need to be stacked somehow. I am not sure how plausible shared silicon is, but the infinity architecture could allow them to make some components (infinity fabric links, infinity cache, memory controllers, etc.) on a different process and just make the compute units on N5, even for GPUs. I don’t know what rdna has as far as infinity fabric links, but cdna devices for HPC could have many links for connecting up to 8 other cdna devices. That was already in one of their infinity architecture slides. That starts looking a lot like an Epyc IO die.

The current Epyc Rome package is up to 8*75 square mm (600 square mm) for cpu die and ~435 for the IO die for over 1000 square mm of silicon total. That is only for the high core count parts though. Some are only 2 or 4 cpu die. Only the >32 core parts and some of the 7Fxx parts have the full 8 cpu chips. Most EPYC processors sold are probably only 4 cpu die (300 square mm) and the 435 square mm IO die from GF. That is a big savings as far as 7 nm wafer supply. A 28 core Intel Xeon is close to 800 square mm on 14 nm, if I am remembering correctly. Desktop parts are similarly less than half (might be near half if on the same process tech) 7 nm for the common parts (single cpu die plus GF IO die). AMD, on average, probably only needs about half of the wafer supply compared to Intel's monolithic die.
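The area arithmetic above can be tabulated with a short sketch. All three constants are the rough figures quoted in this post (the Xeon number is from memory and unverified); the function name is just for illustration.

```python
CCD_MM2 = 75      # approximate area of one 7 nm CPU die, as quoted above
IO_DIE_MM2 = 435  # approximate area of the GF-made IO die, as quoted above
XEON_MM2 = 800    # recalled size of a 28-core monolithic Xeon, unverified

def package_area(num_ccd):
    """Return (7 nm silicon, total silicon) in square mm for a given CCD count."""
    seven_nm = num_ccd * CCD_MM2
    return seven_nm, seven_nm + IO_DIE_MM2

for n in (2, 4, 8):
    seven_nm, total = package_area(n)
    share = seven_nm / XEON_MM2
    print(f"{n} CCD: {seven_nm} mm^2 on 7 nm, {total} mm^2 total, "
          f"{share:.0%} of the Xeon die area in leading-edge silicon")
```

Only the CCD area counts against the 7 nm allocation, so the common 4-CCD part needs well under half the leading-edge silicon of the monolithic die it competes with, at least by these rough figures.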

You would expect GPUs to be more logic heavy than cpus, but with the gigantic infinity cache on big Navi GPUs, that may have changed. I would expect that AMD cpus are still more cache heavy, but they have more cache than competing parts; Intel had to up the cache sizes significantly also.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
If you were going to go for SMT4, you would have to deepen the L1 caches and associated instruction buffers, and it would make sense to widen the CPU to enable dispatching more micro-ops per cycle to keep all the threads happy. Depending on the situation, and how well the processor can allocate its resources, it could actually help single thread performance in some situations. However, it would start to impact max frequency negatively.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
Whether AMD could begin leveraging TSMC’s 7nm EUV process technology for CPU and GPU production.

Bergman: “Certainly EUV is an option with that process, and there are no technical limitations holding us back, nor are there huge technical benefits either. So that's more of a manufacturing question...Now there are additional nodes on 7-nanometer that we’ll take advantage of over time. But again, nothing to disclose.”
So they did consider N7+ but could not find huge benefits.
Seems like N6 is officially on the table. The other "additional node on 7-nanometer" is N6.
 

randomhero

Member
Apr 28, 2020
180
247
86
So they did consider N7+ but could not find huge benefits.
Seems like N6 is officially on the table. The other "additional node on 7-nanometer" is N6.
That explains Warhol a bit.
So enhanced Zen3 with a new IO die. Maybe without on-package SerDes, but with a Si bridge.
Zen4 only for enterprise on 5 nm.
 

yuri69

Senior member
Jul 16, 2013
387
617
136
The all-in Zen 4 line using 5nm would likely face capacity issues.

Even in Q2 2021 the TSMC 5nm capacity is planned to be lower than the current 7nm one (110k vs 140k wpm). So it seems likely that 5nm might initially be used for high-margin markets only, aka server/HPC.

That would have fragmented the lineup - Zen 3(+) being at 7nm, 7nm EUV, or 6nm and Zen 4 at 5nm. Interesting times ahead.
 

Ajay

Lifer
Jan 8, 2001
15,430
7,849
136
So they did consider N7+ but could not find huge benefits.
Seems like N6 is officially on the table. The other "additional node on 7-nanometer" is N6.
Seems likely. Sadly, Bergman declined to provide any specific product or process details, as per usual.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
That interview still seems to lead us to the conclusion above: Zen3"+" products on either N7+ or N6, most likely for desktop where power and core count/package space aren't at a premium, and Zen4 on N5 (maybe N5P?) for Epyc/TR packages and for whatever follows Cezanne/Lucienne as Zen4/RDNA2.

I also think that since Cezanne/Lucienne are their last Vega products, it would make sense to unleash their FP32 performance capabilities. As we saw with Bristol Ridge, it is possible to allow iGPUs to extract their full capabilities. Vega is still a compute monster. Since AMD is barely present in the tiny mobile workstation GPU market, this would allow them more capabilities in that market for minimal investment and likely no negative impact on their other products. It could even be restricted to just their "pro" SKUs. It would give them a competitive edge on Xe in at least one area.
 

Ajay

Lifer
Jan 8, 2001
15,430
7,849
136
That interview still seems to lead us to the conclusion above: Zen3"+" products on either N7+ or N6, most likely for desktop where power and core count/package space aren't at a premium
I don't think anything in that article points to a Zen3+ product coming down the pike.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
I guess going forward it is going to be pure fantasy here on this thread, going by how the Zen3 launch turned out. Zero information.
Also, AMD has more or less established its credibility in execution and is going to be less likely to be open.
At best, speculation from patents only.
Linux commits are also gone. They did not even update the GCC/Clang/AOCC compilers for Zen3 until last week.
 

soresu

Platinum Member
Dec 19, 2014
2,657
1,858
136
I guess going forward it is going to be pure fantasy here on this thread, going by how the Zen3 launch turned out. Zero information.
Just goes to show how effective their information control has become.

Even if a rumour/leak is correct you would never know as it gets swarmed by a ton of other stuff.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
What we know are their roadmaps and interviews. We know Zen4 on N5 is planned. We know that they are going to use other N7 family nodes for future products. We know that they are working towards stacking from patents. We know that they still want to progress core counts. What we don't know are specifics.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
The all-in Zen 4 line using 5nm would likely face capacity issues.

Even in Q2 2021 the TSMC 5nm capacity is planned to be lower than the current 7nm one (110k vs 140k wpm). So it seems likely that 5nm might initially be used for high-margin markets only, aka server/HPC.

That would have fragmented the lineup - Zen 3(+) being at 7nm, 7nm EUV, or 6nm and Zen 4 at 5nm. Interesting times ahead.
Given that no one seems to be able to match TSMC currently, TSMC’s production capacity is going to stay in short supply just due to demand. If AMD sticks with 32 MB L3 for Zen 4, which I think makes sense, the CCD die will be very small even without much of a shrink. They may move to larger L2 caches in some manner. Apple does very well with large, shared L2, although that sharing generally isn’t going to scale to a large number of cores. Intel went with large shared L3 with Nehalem at 4 cores. The previous Core 2 Duo processors with large L2 shared between 2 cores still do very well for some applications. I still use a Core 2 Duo based 17 inch MacBook Pro (the last 17 inch MacBook Pro). It is unclear if they will just increase L2 size, or re-architect it a bit. I am wondering if it would actually make sense to go back to 2 cores sharing a single larger L2; then, from the L3 perspective, it would look somewhat like a 4 core CCX again. That would make very efficient use of resources, but it may affect software scheduling optimizations again.

The current CCD is only around 75 square mm, which is probably smaller than most mobile chips which include a gpu and a bunch of other fixed function hardware. They may reduce the die size by moving to chip stacking of some form so they will not need 32-bit (single direction) serdes links between CCD and IO die. Increasing the L2 cache sizes may increase the die area slightly, but it may not be significant due to the node shrink. If they share the L2 among 2 cores or something, then that could allow them to be much more efficient on die area. At 5 nm they may be able to add a lot of functionality with minimal increase in die size.

They would still want to sell the CCD in the higher mid-range and high end desktop parts. Basically, they make massive numbers of CCD, which isn’t hard to do given the number of die per wafer at such a small size. The higher leakage, higher clocking die go to the desktop market where a 65 watt 6-core is okay. The lower leakage parts get stockpiled for Epyc. There is a good chance that the Ryzen CCDs just aren’t good enough to be Epyc processors except for some of the weird 2 CCD Epyc parts. The 5950x may be one high leakage, high clock die paired with a lower leakage, more Epyc-like die (higher price). Given the pricing structure, they want you to buy the 6-core or the 12-core, which are probably both not in a good enough power consumption bin for Epyc.
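For a sense of scale on "number of die per wafer", here is a minimal sketch using a common first-order estimate: wafer area divided by die area, minus an edge-loss term. It assumes a 300 mm wafer and ignores scribe lines, die aspect ratio, and defects, so it gives gross candidates only; the function name is made up for this example.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """First-order gross die count: area ratio minus an edge-loss correction."""
    radius = wafer_diameter_mm / 2.0
    by_area = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(by_area - edge_loss)

ccd = gross_dies_per_wafer(75)    # roughly a current CCD
big = gross_dies_per_wafer(600)   # hypothetical large monolithic die for comparison

print(f"~{ccd} gross dies per wafer at 75 mm^2")
print(f"~{big} gross dies per wafer at 600 mm^2")
```

The estimate lands in the high hundreds of candidate dies per wafer at CCD size, which is what leaves room to sort them into desktop and Epyc bins.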

I don’t think we are going to see increased L3 with Zen 4. It just doesn’t really seem to make much sense. The latency penalty would be high, but I guess larger L2 could make up for it, so I guess we can’t rule it out. They could make multiple types of CCD eventually. They could use a 64 MB CCD for high end database applications and HPC, but I don’t know if it would be necessary if Epyc gets L4 (infinity cache). I don’t think they would add L4 cache to the desktop IO die; 32MB is already huge. We may see some more APUs in the desktop market and those could be candidates for staying on 7 nm or 6 nm while leaving the 5 nm allocation for Genoa CCD with essentially salvage CCD for high-end desktop. Zen 3 is pretty much only higher end mid-range and high end desktop plus presumably stockpiles for Milan currently. We don’t have <300$ Ryzen 5000 yet; anything less than the 5600x might still be an APU.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
What we know are their roadmaps and interviews. We know Zen4 on N5 is planned. We know that they are going to use other N7 family nodes for future products. We know that they are working towards stacking from patents. We know that they still want to progress core counts. What we don't know are specifics.
With chip stacking possibilities, it is almost impossible to speculate. I had thought they would want to double the cores with Genoa, but that is actually dependent on how much bandwidth DDR5 can provide. Or at least, at first thought, it is. If you throw in the possibility of infinity cache in an Epyc IO die, interposer, whatever, then the bandwidth argument kind of goes out the window also. I guess we may be in for a surprise.
 

moinmoin

Diamond Member
Jun 1, 2017
4,944
7,656
136
Here is an analyst who argues it's possible that AMD (including console business) surpasses Apple as biggest TSMC customer in revenue in 2022:


The reasoning is alright and doesn't even mention the server business that has a growing backlog and the biggest "G"PU die of them all.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
Here is an analyst who argues it's possible that AMD (including console business) surpasses Apple as biggest TSMC customer in revenue in 2022
By revenue probably quite behind.
But by volume? Right now AMD is estimated to be using 45k wpm, and even that is constrained by supply.
Counting wafer starts only, that's like 70% of the entire N5 output.
In Q4 21 Xilinx's 7nm/5nm share will add to AMD's tally as well.
By volume of wafer starts AMD should already be close to Apple, if not surpassing it, in the next quarters by virtue of the increased supply they disclosed.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
If AMD intends to grow their volume and market share, they absolutely have to get with Samsung for a non-trivial portion of their wafer demands. GloFo is essentially out of the game for them unless they do something on the low end for AMD on 12LP+. Eventually, Raven2 needs replacing. Maybe a "half Renoir"?