Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
820
1,456
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
Which makes me wonder. If there are supply chain issues for DDR5 DIMMs, does that also mean that LPDDR5 has the same issues?
Reports were saying something like an 8 to 9 month back order on the power management chips. I don’t know if the LPDDR5 variant would use the same power management chip as regular DDR5. If it uses a different one, then that is likely to be in short supply also, so I wouldn’t get my hopes up. I wanted to get a new car and that definitely isn’t happening. There are a lot of empty car lots, mostly due to chip shortages. The lead times are so long these days that disruptions can go on for months, if not years. It also isn’t just the fabs. I heard some rumors of supply issues due to COVID outbreaks at packaging facilities in Malaysia and such. The whole supply chain seems to be a mess.
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
I am not worried about AMD right now. I think they need to make the AM5 platform compatible with DDR4 and DDR5 memory. I think they need to make sure the next memory controller has no MHz limitations like the previous Ryzen CPUs' limitations on RAM speed.
I don’t know why they would do that. If they can mix and match Zen 3 and Zen 4 IO dies, then I would just expect a weird Zen 4 variant paired with a Zen 3 IO die on AM4. That seems unlikely, but IFOP connections are similar to PCI Express electrically and PCI Express has maintained backward compatibility for a long time. I think AM5 will be DDR5 only.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
I think it has all been rumors rather than based on any actual statements by AMD. If anyone knows differently, then post a link. I thought the rumors were saying Threadripper 5000 in November. I also heard some rumors of better Milan availability in November, but it seems that may be pushed into next year. I don’t know how long it has been since Zen 3 was released vs. Zen 2 release to Zen 2 based Threadripper release. It seems like it is delayed, which wouldn’t be surprising really.

For server CPUs, Milan-X in particular, the release date is nearly meaningless. If hyperscalers get 100% of the first ~6 months of production of Milan-X, you can just throw a dart at a calendar, pick a random date, and call it a release date.

If regular Milan already has a nearly 6-month waiting time, Milan-X could be even longer... It does not mean AMD is not shipping them. AMD is shipping them to the hyperscalers who placed their orders months and months back...
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
This is wild speculation, but I don’t doubt that AMD will take advantage of the synergies here. At a minimum, I think that the infinity cache chips will be used in CPU and GPU products. It likely isn’t coming until Zen 4c / Bergamo for CPUs though. I guess we get a preview of it with RDNA3.

BTW, in the latest Investor Presentation PDF from AMD, desktop Raphael is still not on the roadmap.

I heard a rumor that these days, AMD only puts those products on the roadmap that have already taped out. So this could mean that AMD is still fiddling with the design, maybe with the packaging / interconnect...
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,866
136
Tapeout refers more to the silicon, btw, which, in the case of the CCDs, should have already happened. But it remains to be seen whether the I/O die is ready or not: this time, it seems it will be a more complex die because of the integrated GPU (and I would really like to know this GPU's specs, whether it's something like Rembrandt's or cut down like in the desktop Intel parts).
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
Tapeout refers more to the silicon, btw, which, in the case of the CCDs, should have already happened. But it remains to be seen whether the I/O die is ready or not: this time, it seems it will be a more complex die because of the integrated GPU (and I would really like to know this GPU's specs, whether it's something like Rembrandt's or cut down like in the desktop Intel parts).

The CCD for Genoa, which includes the IFoP connection to the IOD, is already sampling, so that design is done.

We were just speculating earlier that AMD may move past IFoP in the very near future. I thought maybe Bergamo, but who knows, maybe even Raphael may make some changes.

AMD already has the EFB in production in Mi200 cards, which would be a nice improvement over IFoP. And there may be more exotic technologies later on...

Or, the absence of Raphael from the roadmap may very well mean absolutely nothing...

1637763995697.png
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
1637766393398.png

Just a slide, but Bergamo seems well within 2022 with H1 2023 availability, i.e. the same as Milan-X: manufactured a few quarters earlier but gobbled up by favorite partner Azure. Genoa even earlier.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
View attachment 53318

Just a slide, but Bergamo seems well within 2022 with H1 2023 availability, i.e. the same as Milan-X: manufactured a few quarters earlier but gobbled up by favorite partner Azure. Genoa even earlier.

You can check out the latest investor presentation either from the PDF on AMD's web site or just the images here:

This same server roadmap slide is included alongside the client slide I posted above, which does not show Raphael.
 

Doug S

Diamond Member
Feb 8, 2020
3,396
6,021
136
Reports were saying something like an 8 to 9 month back order on the power management chips. I don’t know if the LPDDR5 variant would use the same power management chip as regular DDR5. If it uses a different one, then that is likely to be in short supply also, so I wouldn’t get my hopes up. I wanted to get a new car and that definitely isn’t happening. There are a lot of empty car lots, mostly due to chip shortages. The lead times are so long these days that disruptions can go on for months, if not years. It also isn’t just the fabs. I heard some rumors of supply issues due to COVID outbreaks at packaging facilities in Malaysia and such. The whole supply chain seems to be a mess.

I am not 100% certain but I would assume DDR5 is the only one that needs PMICs, because on-module power regulation was added for that standard. LPDDR5 doesn't have "modules", let alone need to worry about more than one per channel, so there's no need for a separate PMIC.

That's why DDR5 is having supply problems and not DDR4, because DDR4 doesn't include the PMIC on the DIMM (at least the standard doesn't require it, there may be a small segment of high end overclocker DIMMs that use an on chip PMIC)
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
You can check out the latest investor presentation either from the PDF on AMD's web site or just the images here:

This same server roadmap slide is included alongside the client slide I posted above, which does not show Raphael.
Of course they will not show. There can be many reasons why.
  1. They need to keep enough ammunition in order to prepare hype generating announcements to bolster share prices that may be impacted by the outcome of Xilinx Merger.
  2. Stock is not impacted by ADL at all, in fact AMD is soaring. Investors are more or less happy.
  3. They don't feel threatened enough.
  4. AMD's leadership are not fickle minded, they don't feel the need to react to every rustle but remain steadfast on long term goals and execution
  5. Partners are already kept informed
  6. Prevent Osborne effect that may be caused by premature announcements
  7. It will take them a while to build up stock
  8. Keep regular drip feeding instead of complete info dump so that they remain in the news constantly.
  9. Keep competition guessing
  10. ...
AMD's leadership and marketing are doing their job very well
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,866
136
Of course they will not show. There can be many reasons why.
  1. They need to keep enough ammunition in order to prepare hype generating announcements to bolster share prices that may be impacted by the outcome of Xilinx Merger.
  2. Stock is not impacted by ADL at all, in fact AMD is soaring. Investors are more or less happy.
  3. They don't feel threatened enough.
  4. AMD's leadership are not fickle minded, they don't feel the need to react to every rustle but remain steadfast on long term goals and execution
  5. Partners are already kept informed
  6. Prevent Osborne effect that may be caused by premature announcements
  7. It will take them a while to build up stock
  8. Keep regular drip feeding instead of complete info dump so that they remain in the news constantly.
  9. Keep competition guessing
  10. ...
AMD's leadership and marketing are doing their job very well

Well, it's not as if they did not state quite clearly that Raphael is coming in 2022; after all, in their video they told us just that (desktop platform with DDR5 and PCIe 5 in 2022, which cannot be Rembrandt as it has no PCIe 5).
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
Well, it's not as if they did not state quite clearly that Raphael is coming in 2022; after all, in their video they told us just that (desktop platform with DDR5 and PCIe 5 in 2022, which cannot be Rembrandt as it has no PCIe 5).
Being purposefully vague is the name of the game. You can bet they would be saber rattling when empty on the inside. It's predictable behavior coming from a position of perceived weakness. But not AMD; we know they are cooking up lots of things.
Nothing drives share prices higher than when Lisa comes up on stage and unleashes the announcements and her famous "oh, one more thing" parting shot.
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
The CCD for Genoa, which includes the IFoP connection to the IOD, is already sampling, so that design is done.

We were just speculating earlier that AMD may move past IFoP in the very near future. I thought maybe Bergamo, but who knows, maybe even Raphael may make some changes.

AMD already has the EFB in production in Mi200 cards, which would be a nice improvement over IFoP. And there may be more exotic technologies later on...

Or, the absence of Raphael from the roadmap may very well mean absolutely nothing...

View attachment 53314
As I have said, I believe Bergamo is almost certainly a stacked device and I suspect it will be using the same cache / bridge chips as the GPUs; possibly up to 512 MB or 1 GB total cache, likely as L4. I am wondering if they can actually fit it in a single reticle sized package. The IO die might have to shrink to something like 250 mm2 to do that, unless the Zen 4c cpu chiplets are actually smaller than standard Zen 4. I think IFOP links are double width for read vs. IFIS links, so removing 12 of those from the 6 nm Genoa IO die could reduce die size significantly. They could also possibly split some things out into other chiplets or embedded silicon. Getting 12 channel DDR5 and 128 pci-express lanes out of such a small IO die may be difficult. Anyway, TSMC stacking tech can go larger than reticle size, but that likely has some negatives. It will be interesting if they can do 128-cores in a single reticle sized package.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,866
136
Being purposefully vague is the name of the game. You can bet they would be saber rattling when empty on the inside. It's predictable behavior coming from a position of perceived weakness. But not AMD; we know they are cooking up lots of things.
Nothing drives share prices higher than when Lisa comes up on stage and unleashes the announcements and her famous "oh, one more thing" parting shot.

After all, AMD did not show Rembrandt in the investor presentation, and it is literally a product that all rumors point to being released in a couple of months...
 

Ajay

Lifer
Jan 8, 2001
16,094
8,112
136
After all, AMD did not show Rembrandt in the investor presentation, and it is literally a product that all rumors point to being released in a couple of months...
Has AMD officially mentioned anything more specific than 2022 for Rembrandt?
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,787
136
Updated PPRs on AMD TechDocs


Nothing interesting so far, CPUID section not fully updated. Some new tidbits here though

AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions [Version 3.33 Nov 21]
This is an intermediate version; it looks related to the ongoing perf PR that maintainers were blocking for lack of documentation.

CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug
15:10 NumDfPmc Number of available Northbridge Performance Monitor Counters
3:0 NumCorePmc Number of Core Performance Counters

• CPUID Fn8000_0022_EBX[NumDfPmc] > 4 indicates support for additional Northbridge performance counters.
• CPUID Fn8000_0001_ECX[PerfCtrExtLLC] = 1 indicates support for the six L3 performance counters.

CPUID Fn8000_001A_EAX Performance Optimization Identifiers
2 FP256 The internal FP/SIMD execution datapath is 256 bits wide.
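
For anyone who wants to poke at these fields, here is a minimal sketch of how the bit ranges quoted above could be pulled out of raw register values. It assumes you already have the EBX and EAX values from some CPUID dump; the helper names (`bits`, `decode_perfmon_ebx`, `has_fp256`) and the example register value are made up for illustration and do not come from the manual.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return the inclusive bit field value[hi:lo] as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode_perfmon_ebx(ebx: int) -> dict:
    """Decode CPUID Fn8000_0022_EBX using the bit ranges quoted in the post."""
    num_df_pmc = bits(ebx, 15, 10)   # NumDfPmc, bits 15:10
    num_core_pmc = bits(ebx, 3, 0)   # NumCorePmc, bits 3:0
    return {
        "NumCorePmc": num_core_pmc,
        "NumDfPmc": num_df_pmc,
        # The manual text says NumDfPmc > 4 signals additional counters.
        "extra_df_counters": num_df_pmc > 4,
    }


def has_fp256(eax: int) -> bool:
    """Check bit 2 (FP256) of CPUID Fn8000_001A_EAX."""
    return bits(eax, 2, 2) == 1


# Hypothetical register value, NOT read from real hardware:
# NumDfPmc = 16 in bits 15:10, NumCorePmc = 6 in bits 3:0.
example_ebx = (16 << 10) | 6
decoded = decode_perfmon_ebx(example_ebx)
print(decoded)
```

The functions only do the masking and shifting, so they can be fed values from whatever CPUID tool you trust.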

What the changes may be hinting at:
- Additional DF ports?
- A few hardening features and SEV additions
- Looks like fused 256-bit ops for AVX-512 at the moment (unless the manual changes later)

Do folks with the original leaked manual know what the value of this CPUID register is?
CPUID Fn8000_001A_EAX Performance Optimization Identifiers
@Hans de Vries any idea?
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
As I have said, I believe Bergamo is almost certainly a stacked device and I suspect it will be using the same cache / bridge chips as the GPUs; possibly up to 512 MB or 1 GB total cache, likely as L4. I am wondering if they can actually fit it in a single reticle sized package.

Why would the package be limited by the max reticle size? I think interposer is likely dead, which is what would make reticle size be a concern.

But, regarding the size of the Genoa SP5 package, some of the comments / rumors I encountered say that it will be quite large.

The IO die might have to shrink to something like 250 mm2 to do that

Jim from AdoredTV says that the IO die of Genoa will be 263 mm2, on N6, so you are right in the same ballpark. So that is quite a reduction from the 424 mm2 of the current die, about which some people speculated that it is limited by number of pins.
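
To put a number on "quite a reduction", here is the arithmetic as a quick sketch. The two areas are the ones quoted above (the 263 mm2 figure is a rumor, not a confirmed spec), and the function name `area_reduction` is just an illustrative helper.

```python
def area_reduction(old_mm2: float, new_mm2: float) -> float:
    """Fraction of die area removed when going from old_mm2 to new_mm2."""
    return (old_mm2 - new_mm2) / old_mm2


current_iod_mm2 = 424.0  # current IO die size quoted in the post
rumored_iod_mm2 = 263.0  # rumored Genoa IO die size (unconfirmed)

reduction = area_reduction(current_iod_mm2, rumored_iod_mm2)
scale = rumored_iod_mm2 / current_iod_mm2

print(f"area reduction: {reduction:.1%}")
print(f"new die is {scale:.2f}x the old area")
```

So if the rumor holds, the new die would be a bit under two thirds of the old area, which is why the pin and pad count question comes up.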

unless the Zen 4c cpu chiplets are actually smaller than standard Zen 4. I think IFOP links are double width for read vs. IFIS links, so removing 12 of those from the 6 nm Genoa IO die could reduce die size significantly. They could also possibly split some things out into other chiplets or embedded silicon. Getting 12 channel DDR5 and 128 pci-express lanes out of such a small IO die may be difficult. Anyway, TSMC stacking tech can go larger than reticle size, but that likely has some negatives. It will be interesting if they can do 128-cores in a single reticle sized package.

It seems like a puzzle.

If there are more pins for more DDR channels, same number of pins for PCIe, more pins for Infinity Fabric, while the IO Die is smaller, something had to give, it seems. Like some way to propagate the signal into the substrate, other than direct bumps from the IO die, or other ways to connect from IO Die to chiplets.

And Genoa already has to have that solution built in, whatever that solution is...
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
Why would the package be limited by the max reticle size? I think interposer is likely dead, which is what would make reticle size be a concern.

But, regarding the size of the Genoa SP5 package, some of the comments / rumors I encountered say that it will be quite large.



Jim from AdoredTV says that the IO die of Genoa will be 263 mm2, on N6, so you are right in the same ballpark. So that is quite a reduction from the 424 mm2 of the current die, about which some people speculated that it is limited by number of pins.



It seems like a puzzle.

If there are more pins for more DDR channels, same number of pins for PCIe, more pins for Infinity Fabric, while the IO Die is smaller, something had to give, it seems. Like some way to propagate the signal into the substrate, other than direct bumps from the IO die, or other ways to connect from IO Die to chiplets.

And Genoa already has to have that solution built in, whatever that solution is...

If you look at the stacking technology, most of it is still dependent on the reticle size:


They have things like 1.5x, 2x, and even 3x for some types, but there is still almost certainly some penalty for exceeding reticle size. It may not be much of an issue for some of the technologies, but I would guess that there is a cost penalty somehow, even if it is just yields. For a full interposer, the cost was significantly more just due to the massive amounts of silicon. The dual chiplet GPUs have to exceed reticle size.

The supposed SP5 leak was saying around 72 mm2 for cpu chiplets and still close to 400 mm2 for the Genoa IO die. It may be pad limited due to the massive amount of IO though. That size could be doable in maybe a 1.5x reticle sized device for Bergamo. If they can remove all of the IFOP physical interface hardware, then it could be smaller. That is a massive amount of pad area reduction since IFOP are wider than IFIS (2x width at half serdes rate, I believe). The Genoa die has 12 IFOP that could be removed. I have wondered if the Genoa IO die might be smaller with some embedded silicon with physical level interfaces to reduce routing congestion and IO die area. It sounds like Bergamo may actually be doable as a 1x reticle device, which would likely reduce cost and complexity. Moving large caches into bridge chips that do not add to the overall area should be a big savings. You have 128-cores in a single reticle sized device though. This can’t have very high power consumption. I am thinking that the serdes IFOP may stick around even into Zen 5 for the lower thermal density that it allows. The regular Genoa package has the die spread out quite far. You need a big heatsink, but it is much easier to cool than a monolithic die.
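
A rough back-of-the-envelope version of the area argument above, as a sketch only. The 72 mm2 chiplet and ~400 mm2 IO die figures are the leaked numbers from this post; the count of 12 chiplets is an assumption taken from the 12 IFOP links mentioned; and the 26 mm x 33 mm reticle field is a commonly quoted full-field limit, not something stated in the leak. It only sums raw die area and ignores die spacing, bridge chips, and stacking, so treat the result as a lower bound.

```python
# Assumed full-field reticle limit (26 mm x 33 mm); not from the leak.
RETICLE_MM2 = 26.0 * 33.0  # 858 mm2


def total_silicon_mm2(n_chiplets: int, chiplet_mm2: float, iod_mm2: float) -> float:
    """Sum of raw die areas: n CPU chiplets plus one IO die."""
    return n_chiplets * chiplet_mm2 + iod_mm2


def reticle_multiple(area_mm2: float) -> float:
    """Express an area as a multiple of one reticle field."""
    return area_mm2 / RETICLE_MM2


# Leaked/rumored figures from the post; 12 chiplets is an assumption.
total = total_silicon_mm2(n_chiplets=12, chiplet_mm2=72.0, iod_mm2=400.0)
multiple = reticle_multiple(total)

print(f"total die area: {total:.0f} mm2")
print(f"reticle multiple: {multiple:.2f}x")
```

With these inputs the raw silicon lands a little under 1.5x of one reticle field, which lines up with the 1.5x guess. A smaller IO die, or cache moved into bridge chips that overlap other dies, would pull that number down.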
 

Joe NYC

Diamond Member
Jun 26, 2021
3,387
4,989
136
If you look at the stacking technology, most of it is still dependent on the reticle size:


They have things like 1.5x, 2x, and even 3x for some types, but there is still almost certainly some penalty for exceeding reticle size. It may not be much of an issue for some of the technologies, but I would guess that there is a cost penalty somehow, even if it is just yields. For a full interposer, the cost was significantly more just due to the massive amounts of silicon. The dual chiplet GPUs have to exceed reticle size.

The monolithic chips and interposers are dependent on the reticle size. So, say, MI100 had an interposer, so it was limited to the reticle size. MI200 drops the interposer, replaces it with EFB, and is no longer constrained by the reticle size (which the MI250 exceeded by a good margin).

The supposed SP5 leak was saying around 72 mm2 for cpu chiplets and still close to 400 mm2 for the Genoa IO die. It may be pad limited due to the massive amount of IO though.

If it is 400 mm2 for the IO die rather than the 260 mm2 that AdoredTV mentioned, then Genoa probably does not have to make any revolutionary changes; 400 mm2 can probably accommodate incremental IO increases.

That size could be doable in maybe a 1.5x reticle sized device for Bergamo. If they can remove all of the IFOP physical interface hardware, then it could be smaller. That is a massive amount of pad area reduction since IFOP are wider than IFIS (2x width at half serdes rate, I believe). The Genoa die has 12 IFOP that could be removed. I have wondered if the Genoa IO die might be smaller with some embedded silicon with physical level interfaces to reduce routing congestion and IO die area. It sounds like Bergamo may actually be doable as a 1x reticle device, which would likely reduce cost and complexity. Moving large caches into bridge chips that do not add to the overall area should be a big savings. You have 128-cores in a single reticle sized device though. This can’t have very high power consumption. I am thinking that the serdes IFOP may stick around even into Zen 5 for the lower thermal density that it allows. The regular Genoa package has the die spread out quite far. You need a big heatsink, but it is much easier to cool than a monolithic die.

The reticle size is about 800+ mm2, and it is unlikely that AMD will get anywhere close to it with any of the chips, including IO die.

As far as future packaging technologies, I don't know if either EMIB or EFB can reduce the die size if the die is pad limited. Maybe, I just don't know if there are any density increases.

EFB could make routing easier, since the bridge is elevated, above the substrate, and other traces could be going through the substrate underneath. But I don't know how big a difference that could make.

If AMD is able to use a stacked connection with TSVs and hybrid bonding, those have extremely high density, and there could still be metal layers under the TSVs, unrelated to the TSVs, that can still be connecting to IO. It could result in a 25% to 33% die size reduction and tiny savings on the CCD.

As far as keeping the IFoP just to keep the chips spread out, AMD can just use a longer bridge, leave space between the chips and fill it with mold similar to one in the EFB. In fact, if they use active bridge with L3 die, it could use some length to accommodate the SRAM (if it is, say 80 mm2) and overlap the 2 dies being connected just on the edges, not covering a big area of the chips below. It could solve concern about thermals while stacking...

Anyway, there have been just about no leaks about Raphael. We just have a vague idea that it will come about a year from now. Roughly the same time frame as RDNA3. So, if AMD is going to use some advanced packaging technologies for RDNA3, why not Raphael?

It's not that AMD has very high concern about desktop and time to market in desktop space (judging by Zen3D being MIA in desktop). If it takes an extra quarter and AMD could move past IFoP on Rembrandt, maybe use the same chiplet interconnect on Rembrandt as Bergamo, that could turn out valuable in the long run...
 

eek2121

Diamond Member
Aug 2, 2005
3,397
5,029
136
After all, AMD did not show Rembrandt in the investor presentation, and it is literally a product that all rumors point to being released in a couple of months...

Rembrandt is launching in January 2022. Just because AMD doesn't confirm it, doesn't mean it isn't happening.

Raphael is also a guaranteed launch in 2022. I'd bet money on Q3 or earlier rather than Q4 like some leaks have suggested. My suspicion is that Zen 4 won't replace Zen3D, it will complement it.
 

Timorous

Golden Member
Oct 27, 2008
1,970
3,851
136
Rembrandt is launching in January 2022. Just because AMD doesn't confirm it, doesn't mean it isn't happening.

Raphael is also a guaranteed launch in 2022. I'd bet money on Q3 or earlier rather than Q4 like some leaks have suggested. My suspicion is that Zen 4 won't replace Zen3D, it will complement it.

Seems like the obvious play. Different nodes with different price targets means you can ship more product.