Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
800
1,363
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3). The system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

[Attached image: Untitled2.png]


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 
  • Like
Reactions: richardllewis_01

Mopetar

Diamond Member
Jan 31, 2011
7,918
6,194
136
I guess for AMD the writing on the wall was how Intel handles HEDT (or rather how it let that market implode after it wasn't competitive anymore against TR). AMD seems most interested in gaining profitable share in existing sizable markets, and HEDT appears not to be one anymore.

Threadripper still exists, but AMD doesn't have much reason to invest effort in it. Intel has basically ceded that space to AMD, and until AMD faces serious competition there, don't expect it to rush products out, particularly when it can sell an even higher-margin Epyc CPU using the same parts.
 

Mopetar

Diamond Member
Jan 31, 2011
7,918
6,194
136
They'd either need 12-core chiplets or an IO die that can connect up to three chiplets. Not sure if they'd make such an IO die if the majority of the Ryzen lineup is 1 and 2 chiplet processors.

Making a new custom IO die just for a 3-chiplet product doesn't make a lot of financial sense either.
 

eek2121

Platinum Member
Aug 2, 2005
2,931
4,027
136
They'd either need 12-core chiplets or an IO die that can connect up to three chiplets. Not sure if they'd make such an IO die if the majority of the Ryzen lineup is 1 and 2 chiplet processors.

Making a new custom IO die just for a 3-chiplet product doesn't make a lot of financial sense either.

Who says the IO die limits them? AMD likes to future-proof. They will make an IO die that supports up to 3, or even 4, CCDs.

In other news, Charlie from SemiAccurate claims that he has seen performance numbers from Bergamo, and that it is a monster.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,618
14,609
136
Who says the IO die limits them? AMD likes to future-proof. They will make an IO die that supports up to 3, or even 4, CCDs.

In other news, Charlie from SemiAccurate claims that he has seen performance numbers from Bergamo, and that it is a monster.
Well, it would be nice to read something other than that statement. It's behind a paywall, right? Anybody know any details?
 
  • Like
Reactions: Drazick

Joe NYC

Platinum Member
Jun 26, 2021
2,050
2,550
106
They'd either need 12-core chiplets or an IO die that can connect up to three chiplets. Not sure if they'd make such an IO die if the majority of the Ryzen lineup is 1 and 2 chiplet processors.

Making a new custom IO die just for a 3-chiplet product doesn't make a lot of financial sense either.

We know that Genoa will have 4 groups of 3 chiplets connected to the IOD. So in theory it should already be feasible, if the desktop IOD is roughly 1/4 of the server IOD, with 1 group of 3 chiplets.

But just because it is feasible does not mean that AMD will offer such an SKU.
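
As a rough check on that arithmetic, here is a minimal Python sketch. The 4 groups of 3 chiplets come from the post; the 8 cores per chiplet is an assumption carried over from the 8-core complex mentioned at the top of the thread (it matches the 96-core Genoa figure cited later), and the function name is purely illustrative.

```python
def total_cores(groups: int, chiplets_per_group: int, cores_per_chiplet: int = 8) -> int:
    """Core count for an IOD that exposes `groups` groups of chiplet links."""
    return groups * chiplets_per_group * cores_per_chiplet

# Server IOD as described in the post: 4 groups of 3 chiplets.
server = total_cores(groups=4, chiplets_per_group=3)

# Hypothetical desktop IOD equal to a single group of 3 chiplets.
desktop = total_cores(groups=1, chiplets_per_group=3)

print(server, desktop)  # 96 24
```

So a desktop IOD built as one quarter of the server layout would top out at 24 cores under these assumptions.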
 

Joe NYC

Platinum Member
Jun 26, 2021
2,050
2,550
106
It's interesting that Bergamo is the monster, sounds like the bigger package hierarchy changes are not yet happening with Genoa but actually with Bergamo.

That's what I thought originally, but it seems that the approach to Bergamo will instead be quite simple (if Charlie's info is correct): within the same context as the Genoa MCM, re-using the design, not re-inventing it.
 

moinmoin

Diamond Member
Jun 1, 2017
4,967
7,715
136
That's what I thought originally, but it seems that the approach to Bergamo will instead be quite simple (if Charlie's info is correct): within the same context as the Genoa MCM, re-using the design, not re-inventing it.
So the "monster" part is just in reference to near-equal performance per core, but with 128 of them in Bergamo instead of 96 in Genoa?
 

Joe NYC

Platinum Member
Jun 26, 2021
2,050
2,550
106
So the "monster" part is just in reference to near-equal performance per core, but with 128 of them in Bergamo instead of 96 in Genoa?

Obviously, a different core, with different performance and power characteristics.

Genoa and Raphael share the CCD (as far as we know), so that CCD will be optimized for maximum performance, perhaps with less concern for power efficiency than the Zen 4c cores. But yeah, 128 vs. 96 cores.
 

moinmoin

Diamond Member
Jun 1, 2017
4,967
7,715
136
Obviously, a different core, with different performance and power characteristics.

Genoa and Raphael share the CCD (as far as we know), so that CCD will be optimized for maximum performance, perhaps with less concern for power efficiency than the Zen 4c cores. But yeah, 128 vs. 96 cores.
Previously I suggested the move with Bergamo is, in simple terms, a TAM expansion of the previously mobile-only optimized Zen core: higher density, higher efficiency at lower frequencies and lower wattage, and a couple of cuts compared to the full core, while retaining mostly the same performance aside from workloads that depend on the cut areas. Seems that's still in the cards.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Knowing this, I expect we will eventually see a 24 core SKU just so AMD can hold Intel off until the second half of 2023.

Wouldn't be too surprised to see a 24 core chip, but you'll pay in terms of $ and power. The power use will probably increase by 50% or so and you'd have a mini-HEDT chip in terms of price (not that we're not there already, because $800 for the top of the line chip kinda is HEDT).
 

Joe NYC

Platinum Member
Jun 26, 2021
2,050
2,550
106
Previously I suggested the move with Bergamo is, in simple terms, a TAM expansion of the previously mobile-only optimized Zen core: higher density, higher efficiency at lower frequencies and lower wattage, and a couple of cuts compared to the full core, while retaining mostly the same performance aside from workloads that depend on the cut areas. Seems that's still in the cards.

Yeah, exactly, so AMD will have 2 server parts with different performance characteristics, to more efficiently address their target markets.

And looks like Intel is doing something similar, splitting the server roadmap, but arriving to market fashionably late, ~2 years behind Bergamo (adding usual delays):
 
  • Like
Reactions: scineram

Abwx

Lifer
Apr 2, 2011
11,039
3,686
136
Wouldn't be too surprised to see a 24 core chip, but you'll pay in terms of $ and power. The power use will probably increase by 50% or so and you'd have a mini-HEDT chip in terms of price (not that we're not there already, because $800 for the top of the line chip kinda is HEDT).

With the 5nm node they could release a 32C SKU within the same TDP as a 5950X, and a 24C would consume the same as a 5900X, if it weren't for the higher throughput/Hz.

If IPC is increased by, say, 20%, then they would have to increase the thermals by the same percentage to run at the same frequencies with 2x the core count.
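
To put numbers on that reasoning, a back-of-the-envelope sketch in Python. It takes the post's premise at face value: that 5nm lets twice the cores fit in the same power at the same clocks, and that power rises linearly with the IPC gain. Neither is a measured value. The 105 W figure is the rated TDP of the 5950X and 5900X, and the function name is made up for illustration.

```python
def scaled_power(base_power_w: float, base_cores: int, new_cores: int,
                 node_power_factor: float = 0.5, ipc_gain: float = 0.0) -> float:
    """Estimate package power at unchanged clocks.

    node_power_factor: per-core power on the new node relative to the old
                       one (0.5 is the post's premise, not a measurement).
    ipc_gain: fractional IPC increase, assumed to cost power linearly.
    """
    per_core = base_power_w / base_cores
    return per_core * node_power_factor * new_cores * (1.0 + ipc_gain)

TDP_W = 105.0  # rated TDP of both the 16-core 5950X and the 12-core 5900X

# 32 cores vs. the 16-core 5950X, and 24 cores vs. the 12-core 5900X.
print(f"32C, same IPC: {scaled_power(TDP_W, 16, 32):.1f} W")
print(f"24C, same IPC: {scaled_power(TDP_W, 12, 24):.1f} W")

# Same comparison with a 20% IPC gain priced in.
print(f"32C, +20% IPC: {scaled_power(TDP_W, 16, 32, ipc_gain=0.20):.1f} W")
```

Under these assumptions the doubled core count lands back at 105 W, and the 20% IPC gain pushes it to roughly 126 W.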
 

eek2121

Platinum Member
Aug 2, 2005
2,931
4,027
136
Wouldn't be too surprised to see a 24 core chip, but you'll pay in terms of $ and power. The power use will probably increase by 50% or so and you'd have a mini-HEDT chip in terms of price (not that we're not there already, because $800 for the top of the line chip kinda is HEDT).

I couldn't care less about the price. Power might be a bit higher, but still much lower than Alder Lake or Raptor Lake. If they release one I will buy it.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Previously I suggested the move with Bergamo is, in simple terms, a TAM expansion of the previously mobile-only optimized Zen core: higher density, higher efficiency at lower frequencies and lower wattage, and a couple of cuts compared to the full core, while retaining mostly the same performance aside from workloads that depend on the cut areas. Seems that's still in the cards.
It has been my guess for a while now that Bergamo is a stacked device using infinity cache bridge chips. They would likely be the same infinity cache chips used as the MCD for RDNA GPUs. If that is the case, then they could likely fit 16 full Zen 4 cores on a die similar in size to a standard Zen 4 die with 32 MB L3. If the L2 cache is denser on the process they are using then they may even be able to increase the L2 size. If they are stacked and packed densely (could possibly fit into a single reticle size), then they obviously need to be very low power. High core count Genoa already will need to clock low due to power consumption. The 64 core versions of Milan only clock at 2 GHz and 2.45 GHz base clock. Perhaps it turned out that Zen 4c can still clock quite high, even on the power optimized process. If it can clock even a little higher with 512 MB to 1 GB of stacked cache and 128 cores, it will just destroy everything else on the market.
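
Some quick arithmetic on those numbers, as a Python sketch. The 128 cores, 16 cores per die, and 512 MB to 1 GB of stacked cache are the post's own speculative figures. The helper name is illustrative, and dividing the cache evenly per core is only a way to gauge scale, not a claim about how the cache would be organized.

```python
def cache_per_core_mb(total_cache_mb: int, cores: int) -> float:
    """Stacked cache divided evenly across cores, in MB per core."""
    return total_cache_mb / cores

CORES = 128          # speculated Bergamo core count
CORES_PER_DIE = 16   # speculated cores per compute die

dies = CORES // CORES_PER_DIE
print(f"{dies} compute dies")

for total_mb in (512, 1024):
    share = cache_per_core_mb(total_mb, CORES)
    print(f"{total_mb} MB stacked cache: {share:.0f} MB per core")
```

That works out to 8 compute dies and 4 to 8 MB of stacked cache per core, if the speculation holds.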
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
We know that Genoa will have 4 groups of 3 chiplets connected to the IOD. So in theory it should already be feasible, if the desktop IOD is roughly 1/4 of the server IOD, with 1 group of 3 chiplets.

But it does not mean that just because it is feasible, AMD will offer such an SKU.
They need something in between 2 and 12 channel memory. That is a ridiculous gap in capabilities. A lot of servers don’t need 12 channel DDR5 or a lot of the other stuff that Genoa offers. They really could use a 4 or 6 channel, 64 PCI Express lane part for lower end servers, workstations, and HEDT. A smaller socket has been rumored to exist, but we really don’t know what is on it if it does actually exist. It seems likely to exist given the massive gap between AM5 and SP5; SP5 would be ridiculously expensive for a lot of systems.
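
For a sense of how wide that gap is, a small Python sketch of peak theoretical bandwidth by channel count. The DDR5-4800 speed is an assumption (the thread does not give one), each channel is taken as 64 bits wide, and the helper name is illustrative.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int = 4800,
                       bytes_per_transfer: int = 8) -> float:
    """Peak theoretical bandwidth in GB/s: channels x MT/s x bytes per transfer."""
    return channels * mega_transfers * bytes_per_transfer / 1000

# 2 channels (desktop), the hoped-for 4 and 6, and 12 (Genoa on SP5).
for channels in (2, 4, 6, 12):
    print(f"{channels:>2} channels: {peak_bandwidth_gbs(channels):.1f} GB/s")
```

At the assumed speed that is about 77 GB/s for 2 channels against about 461 GB/s for 12, a 6x step with nothing in between.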

I have wondered if they will pull something outside of the box though, like connecting two desktop IO die together to make the 4 channel memory and 64 lanes rather than use expensive (possibly) TSMC-made Genoa IO die. It seems to make a lot more sense to reuse multiple lower end parts than to waste a high end part by disabling half of the functionality unless they have a lot of salvage full Epyc IO die. It seems unlikely that they would have enough defective Epyc IO die to fill the markets in question.

It doesn’t seem like it would take much to connect two IO die together; a silicon bridge would be preferred over a SerDes connection though. It would be great if there is still a version with 3 CPU links and 3 memory channels. If that exists, then they could plausibly offer up to 3 CPU die products and then use one of the links to connect 2 IO die together to make a 4 die product with 2x memory and 2x PCI Express. That would still be limited to 4 die, but I suspect a lot of AMD Epyc sales are 32 core or less, so it could be a very cheap product to make. Also, if AM5 is to be used for Zen 5, which is supposed to be a completely new architecture rather than more of an update like Zen 4, then it seems like they need the headroom of more than 2 channel memory. I don’t know if there are enough pins on AM5 for 3 channel memory though.
 

eek2121

Platinum Member
Aug 2, 2005
2,931
4,027
136
They need something in between 2 and 12 channel memory. That is a ridiculous gap in capabilities. A lot of servers don’t need 12 channel DDR5 or a lot of the other stuff that Genoa offers. They really could use a 4 or 6 channel, 64 PCI Express lane part for lower end servers, workstations, and HEDT. A smaller socket has been rumored to exist, but we really don’t know what is on it if it does actually exist. It seems likely to exist given the massive gap between AM5 and SP5; SP5 would be ridiculously expensive for a lot of systems.

I have wondered if they will pull something outside of the box though, like connecting two desktop IO die together to make the 4 channel memory and 64 lanes rather than use expensive (possibly) TSMC-made Genoa IO die. It seems to make a lot more sense to reuse multiple lower end parts than to waste a high end part by disabling half of the functionality unless they have a lot of salvage full Epyc IO die. It seems unlikely that they would have enough defective Epyc IO die to fill the markets in question.

It doesn’t seem like it would take much to connect two IO die together; a silicon bridge would be preferred over a SerDes connection though. It would be great if there is still a version with 3 CPU links and 3 memory channels. If that exists, then they could plausibly offer up to 3 CPU die products and then use one of the links to connect 2 IO die together to make a 4 die product with 2x memory and 2x PCI Express. That would still be limited to 4 die, but I suspect a lot of AMD Epyc sales are 32 core or less, so it could be a very cheap product to make. Also, if AM5 is to be used for Zen 5, which is supposed to be a completely new architecture rather than more of an update like Zen 4, then it seems like they need the headroom of more than 2 channel memory. I don’t know if there are enough pins on AM5 for 3 channel memory though.

You don't have to install 12 channels worth of memory. If you only install 6, guess what, EPYC will use those 6. I tested that on my Threadripper and it worked fine.
 
  • Like
Reactions: lightmanek

Joe NYC

Platinum Member
Jun 26, 2021
2,050
2,550
106
They need something in between 2 and 12 channel memory. That is a ridiculous gap in capabilities. A lot of servers don’t need 12 channel DDR5 or a lot of the other stuff that Genoa offers. They really could use a 4 or 6 channel, 64 PCI Express lane part for lower end servers, workstations, and HEDT. A smaller socket has been rumored to exist, but we really don’t know what is on it if it does actually exist. It seems likely to exist given the massive gap between AM5 and SP5; SP5 would be ridiculously expensive for a lot of systems.

I have wondered if they will pull something outside of the box though, like connecting two desktop IO die together to make the 4 channel memory and 64 lanes rather than use expensive (possibly) TSMC-made Genoa IO die. It seems to make a lot more sense to reuse multiple lower end parts than to waste a high end part by disabling half of the functionality unless they have a lot of salvage full Epyc IO die. It seems unlikely that they would have enough defective Epyc IO die to fill the markets in question.

I don't remember where I saw it, but there is supposed to be a small Genoa socket, S6, with 4 memory channels, which would be a good step between 2 and 12 channels.

And it would allow AMD to compete in low end servers, where Xeon D currently has no competition.

Also, it would probably be a good socket for Threadripper, and maybe they could seriously try to turn Threadripper into an HEDT device that would cover low core counts (16 cores) all the way up, with a generous power budget...

It doesn’t seem like it would take much to connect two IO die together; a silicon bridge would be preferred over a SerDes connection though. It would be great if there is still a version with 3 CPU links and 3 memory channels. If that exists, then they could plausibly offer up to 3 CPU die products and then use one of the links to connect 2 IO die together to make a 4 die product with 2x memory and 2x PCI Express. That would still be limited to 4 die, but I suspect a lot of AMD Epyc sales are 32 core or less, so it could be a very cheap product to make. Also, if AM5 is to be used for Zen 5, which is supposed to be a completely new architecture rather than more of an update like Zen 4, then it seems like they need the headroom of more than 2 channel memory. I don’t know if there are enough pins on AM5 for 3 channel memory though.

I don't know what the economics would be of making a new IOD or using full Genoa IOD and disabling part of it, or connecting 2 desktop IODs. It would depend on volume. If AMD is serious about competing in low end server market, separate die would be justified. But an inept effort like current Threadripper would not really be worth a separate die...

As far as AM5, I am going to guess that there are not enough spare pins for extra memory channels...

But I would be more interested in an IO Die for desktop that internally is 3-4 memory channels enabled, and there would be an SKU with ~16 GB of high speed LPDDR5 in the MCM or mobile SOC. If the next gen APU is chiplet based, the built in, fast extra memory channels could significantly enhance the GPU performance.
 
  • Like
Reactions: HurleyBird

LightningZ71

Golden Member
Mar 10, 2017
1,628
1,898
136
Current DT one is around 15W, so if there's any 6nm-based I/O it will be at half this number; with GF's 12nm+ it should be within 10W.
If it remained EXACTLY the same logically...

It's not.

It's now supporting DDR5 and higher speed links between the CCDs and the IOD, it now has an iGPU with display drivers and media decoder, and it supports more and higher speed USB ports and, on top of that, PCIe 5 (initially or eventually). NONE of that is free.

And, in MY opinion, those power limits support the possibility of three CCDs, which is 50% more high speed CCD links.

You won't get all that at 66% of current power.
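
The arithmetic behind those two figures, as a quick Python sketch. The 15 W and 10 W numbers are the ones from the quoted post, and the 2 and 3 CCD link counts are the ones discussed in this thread; the variable names are illustrative, and nothing here models how power actually scales with the added features.

```python
current_iod_w = 15.0   # current desktop IO die, per the quoted post
claimed_iod_w = 10.0   # quoted estimate for a 12nm+ IO die

# The claimed figure as a fraction of today's IO die power.
fraction = claimed_iod_w / current_iod_w

# Going from 2 CCD links to 3 is a 50% increase in link count.
current_links, proposed_links = 2, 3
link_increase = (proposed_links - current_links) / current_links

print(f"{fraction:.1%} of current power, {link_increase:.0%} more CCD links")
```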
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
You don't have to install 12 channels worth of memory. If you only install 6, guess what, EPYC will use those 6. I tested that on my Threadripper and it worked fine.
Yeah, so? SP5 and the Genoa IO die is still going to be an expensive platform that is unnecessary in a large number of servers. Threadripper was a somewhat unofficial product from the start. Some AMD employees literally worked on it in their spare time until it actually became official. I always thought that they should have a cut down socket since Threadripper seemed like a bit of a kludge. As their market share grows, making a specific product for different market niches makes more sense. AMD has a lot more R&D budget now so they can afford to design and tape out more products. This isn’t like Zen 1 where it was literally one die for everything.

Early on, when I first started hearing rumors of IO die being made on a TSMC process, I had wondered if they would make a modular IO die. That is, make a single quadrant and connect 4 of them together for Epyc, 2 for Threadripper, and 1 for Ryzen. Using silicon bridges would make it like a single chip. That would be difficult to get working properly though, and the Gigabyte leak seems to show a monolithic die.

If the smaller socket exists, it might just be a full Epyc IO die, but that seems very wasteful, especially if it is 6 nm TSMC silicon. It seems like they could do a lot better with either a half size IO die, perhaps made at GF, or by combining multiple smaller IO die.