Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Vattila · Oct 6, 2019

Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!

Timorous · Jul 14, 2021

jpiniero said:
Allegedly it's a special model. Like a 7950XXX.

Ahh the cpu pron edition.

soresu · Jul 14, 2021

Ajay said:
That's refrigerant based cooling.

So is the TSMC effort.

Solid state heat conduction for 3D stacks sounds cool and simple in theory.

But in practice it simply isn't viable for logic stacks without some radically new tech, perhaps thermal transistors which are still largely at the lab research phase IIRC.

The main problem is that the heat won't just move in the right direction without some kind of external force acting upon it, as with liquid current, or the process through which heat pipes work (wicking?) - though I'm not even certain a heat pipe could work adequately at that scale.

Micro fluidic silicon via channels are a lot more advanced than anything currently in the cooling market that does not cost an arm and a leg to operate.

It wouldn't surprise me if all in one MF pump/reservoir/radiator coolers could be created which just directly attach to the socket as current air coolers do - and likely somewhat more compact than current AIO models, at least for SKUs closer to current TDPs rather than 4-8 hi stacks of logic, or something with similarly insane thermal density.

I have to say that I've never been a fan of liquid cooling before, but this does make me hope for a future with it fully integrated at chip level.

At least until some near perfect solid state solution arrives in the future, assuming that better materials/devices for logic, memory, IO and power IC's don't make cooling largely redundant by then.

Which advanced spintronic devices/materials could certainly do for logic and memory - likewise for photonics and IO. Power remains problematic though, unless graphene or some similar 2D material can handle that without significant resistance thermals in stacks.

Asterox · Jul 14, 2021

Ok, Raphael APU is up to 170W TDP.

https://twitter.com/x/status/1415065218087100422

https://twitter.com/x/status/1415254514677043205

Well, I can imagine a Zen 4 8C/16T APU with 64-128 MB of 3D V-Cache fitting under a 170W TDP.

JujuFish · Jul 14, 2021

Asterox said:
Ok, Raphael APU is up to 170W TDP.

https://twitter.com/x/status/1415065218087100422

https://twitter.com/x/status/1415254514677043205

Well, I can imagine a Zen 4 8C/16T APU with 64-128 MB of 3D V-Cache fitting under a 170W TDP.

Why do you insist on adding colors to your posts? It only makes things harder to read.

Abwx · Jul 14, 2021

Timorous said:
What do they need 170W for? 16c + igp?

Dunno, but if the pic above is accurate then the I/O die seems quite big, since TSMC's 6nm has 3x the density of GF's 14LP, so possibly it's for a basic GPU.

eek2121 · Jul 14, 2021

Abwx said:
Dunno, but if the pic above is accurate then the I/O die seems quite big, since TSMC's 6nm has 3x the density of GF's 14LP, so possibly it's for a basic GPU.

Probably a pre overclocked model. Anyone remember the Athlon FX-51 and similar models from back in the day?

Coincidentally, a 280mm AIO can cool a 170W CPU without much issue. Maybe we will see 5 GHz all-core out of it.

xilli_fiberbit · Jul 14, 2021

The previous APU was monolithic, so is the Raphael APU built on chiplets too?

jpiniero · Jul 14, 2021

xilli_fiberbit said:
The previous APU was monolithic, so is the Raphael APU built on chiplets too?

Most likely the IGP is a part of the IO die.

Mopetar · Jul 14, 2021

jpiniero said:
Most likely the IGP is a part of the IO die.

Seems like a bit of an odd combination given that the IO doesn't benefit as much from smaller nodes because the physical interface can't shrink, whereas the GPU benefits a lot from being on a newer node.

Since you don't need/want every IO die to have a GPU you wind up making a special IO die with a GPU. Why not just make a separate GPU chiplet at that point since you're still designing a separate unique piece of silicon, but at least the different chips can be manufactured on separate nodes which are best suited for each chip.

If AMD is eventually going MCM with their GPUs it wouldn't be a bad idea to get some practical experience with an APU first to work out some of the quirks of using such an approach.

jpiniero · Jul 14, 2021

Mopetar said:
Since you don't need/want every IO die to have a GPU you wind up making a special IO die with a GPU. Why not just make a separate GPU chiplet at that point since you're still designing a separate unique piece of silicon, but at least the different chips can be manufactured on separate nodes which are best suited for each chip.

It would depend on what node the IO die is on. If it's N6, the IGP die might not be big enough to be worthwhile to separate at this point given how good the yields are. We're talking about the lowest viable CU possible, if that's 3 or 6 that's what it will be.

Remember there's going to be mobile versions of the entire Raphael lineup to combat Alder/Raptor Lake-S BGA.

jamescox · Jul 14, 2021

Ajay said:
Hmm, wonder if it's just the max socket TDP for AM5. Could be AMD is giving itself some extra headroom.

If they added a lot of extra floating point units (AVX or whatever), then that could pull a lot of power when utilized. Also might be extra power for a high end stacked cache chip version.

jamescox · Jul 14, 2021

Timorous said:
Stacked on the IO die would mean the IO Die and chiplets are on the same node right? TSMC don't do cross node stacking or is it that they don't do cross node stacking yet?

Depends on the type of stacking. The SoIC stacking without micro-solder bumps probably requires that the chips be the same process. Using the lower density micro-solder ball based stacking doesn’t have those restrictions though, so it should be possible to mix chips made on different process tech at different locations.

jamescox · Jul 14, 2021

jpiniero said:
I'd say that Genoa looks like Rome, except it has 12 dies instead of 8. 96 was probably chosen because of power consumption and perhaps space. Have to read that article to see what Charlie thinks but Bergamo could be stacked dies or stacked on top of the IO die. Either way the power consumption is going to be crazy.

I was thinking the same, but that requires 3 links per quadrant. They might have gone that route for the first iteration of Zen 4 with only stacked L3. Going up to 128-core doesn’t really fit with that though.

I have also been thinking that 4 die (or stacks) is 32 cores and they could be connected with LSI since they can be directly adjacent to the IO die. Even with higher core counts available, I would expect most sales are still 32-core or less, so the common, single layer part is cheap. Going up to 2 layers is 64 cores, 3 layers is 96, and 4 layers would be 128. The thermal constraint would get worse with each layer, although the SoIC stacking without micro-solder balls has good thermal conductivity compared to the micro-solder ball solution. The die are also polished down very thin. Perhaps 3 layers is doable, but 4 pushes the clocks too low without using some extra cooling tech, which might come later, in a refreshed version. I don’t know how they would handle cache die stacked on top unless they use the cache die only on 32-core, single layer, single-core optimized parts, basically F-series.
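The layer arithmetic above can be sketched as follows. This is only an illustration of the speculation: the 4 stacks and 8 cores per die are the figures from the post, while the function and constant names are made up for the example.

```python
# Hypothetical sketch of the stacked-CCD core-count speculation.
# Assumptions taken from the post: 4 stacks next to the IO die, 8 cores per die.
STACKS_PER_PACKAGE = 4
CORES_PER_DIE = 8


def total_cores(layers: int) -> int:
    """Return the core count for a package with `layers` dies in each stack."""
    if layers < 1:
        raise ValueError("a stack needs at least one layer")
    return STACKS_PER_PACKAGE * CORES_PER_DIE * layers


if __name__ == "__main__":
    for layers in range(1, 5):
        print(f"{layers} layer(s): {total_cores(layers)} cores")
```

Nothing here models the thermal side, which is the part that would actually decide whether 3 or 4 layers is practical.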

It should be interesting once we actually get some 3D stacking rather than just 2.5D. Speculation is going to be all over the place due to the number of possibilities with die stacking tech.

DisEnchantment · Jul 15, 2021

Mopetar said:
If AMD is eventually going MCM with their GPUs it wouldn't be a bad idea to get some practical experience with an APU first to work out some of the quirks of using such an approach.

They already have this with Aldebaran, and it seems it has also been available outside AMD for some time to partners like HPE and supporting ISVs.

soresu said:
Solid state heat conduction for 3D stacks sounds cool and simple in theory.

But in practice it simply isn't viable for logic stacks without some radically new tech, perhaps thermal transistors which are still largely at the lab research phase IIRC.

The main problem is that the heat won't just move in the right direction without some kind of external force acting upon it, as with liquid current, or the process through which heat pipes work (wicking?) - though I'm not even certain a heat pipe could work adequately at that scale.

Yeah this article I posted describes some of it.

What Goes Wrong In Advanced Packages

More heterogeneous designs and packaging options add challenges across the supply chain, from design to manufacturing and into the field.

semiengineering.com

The temperature gradient induces mechanical stress leading to device failure.
There are patents for thermoelectric devices embedded in the device to address these topics, but temperature gradient could be a problem.
Immersion cooling which is very common in HPC could actually accelerate device failure due to the temperature gradient (in case the heat from bottom die cannot be dissipated evenly). SoIC packaging is very desired because of this.
Another knob to tune in addition to the already very long list of knobs.

DrMrLordX · Jul 15, 2021

xilli_fiberbit said:
The previous APU was monolithic, so is the Raphael APU built on chiplets too?

Raphael is meant to be the successor to Vermeer. It's going to be primarily a high-performance desktop part. It isn't meant to be a replacement for Cezanne.

JoeRambo · Jul 15, 2021

Interesting news about 170W. These power limits pretty much put a ceiling on stock CPU all core clocks. So it's not like Zen4 uses 170W, it's more like AMD knows that it needs 170W to reach performance they want to have with top SKUs.
AMD could release 170W Zen 3 parts today as long as there is a socket with a power delivery spec that allows it. The CPU is ready to use that power budget with ease; there is just no competition from Intel, so they can enjoy the efficiency advantages of running lower all-core clocks and voltages.
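To put a rough shape on why a higher power limit buys all-core clocks: the textbook approximation is that dynamic power scales with frequency times voltage squared. Below is a minimal sketch under that assumption only, ignoring leakage and everything else. The function name and the example ratios are hypothetical, not measured Zen figures.

```python
def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """Relative dynamic power under the P ~ f * V^2 approximation.

    Both arguments are ratios against a baseline (1.0 means unchanged).
    """
    if freq_ratio <= 0 or volt_ratio <= 0:
        raise ValueError("ratios must be positive")
    return freq_ratio * volt_ratio ** 2


if __name__ == "__main__":
    # Hypothetical example: 10% higher all-core clock that needs 5% more voltage.
    scale = relative_dynamic_power(1.10, 1.05)
    print(f"dynamic power scales by roughly {scale:.2f}x")
```

The point is just that the last few hundred MHz cost disproportionately more power, which fits the idea that the limit is set by the performance target rather than by what the cores inherently draw.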

soresu · Jul 15, 2021

xilli_fiberbit said:
The previous APU was monolithic, so is the Raphael APU built on chiplets too?

DrMrLordX said:
Raphael is meant to be the successor to Vermeer. It's going to be primarily a high-performance desktop part. It isn't meant to be a replacement for Cezanne.

As DrMrLordX says Raphael is a different market segment and a successor to Vermeer or Vermeer-x (Warhol?).

The replacement for Cezanne will be Rembrandt, likely a monolithic 6nm/N6 die from TSMC.

Rembrandt is rumoured to have 8C Zen3 CPU, 12CU RDNA2 GPU, USB4, PCIe4, DDR5 and probably a host of other goodies less obvious.

It should also support HW AV1 decode as a RDNA2 based chip.

soresu · Jul 15, 2021

JoeRambo said:
Interesting news about 170W. These power limits pretty much put a ceiling on stock CPU all core clocks. So it's not like Zen4 uses 170W, it's more like AMD knows that it needs 170W to reach performance they want to have with top SKUs.
AMD could release 170W Zen 3 parts today as long as there is a socket with a power delivery spec that allows it. The CPU is ready to use that power budget with ease; there is just no competition from Intel, so they can enjoy the efficiency advantages of running lower all-core clocks and voltages.

Probably mostly a 65W reservation for GPU powaaaa.

Mopetar · Jul 15, 2021

jpiniero said:
It would depend on what node the IO die is on. If it's N6, the IGP die might not be big enough to be worthwhile to separate at this point given how good the yields are. We're talking about the lowest viable CU possible, if that's 3 or 6 that's what it will be.

Remember there's going to be mobile versions of the entire Raphael lineup to combat Alder/Raptor Lake-S BGA.

Why build an IO die on N6, though, when you don't see much size reduction (remember the physical interfaces always take up the same amount of space)? Given that AMD already can't get enough wafers to satisfy all of the demand they're seeing, it seems bizarre to go down that route.

Also pairing an IO die with such low-end GPU capabilities seems pointless outside of ensuring that everything now has some minimal onboard video. If you want something more powerful then you need yet another piece of silicon. Are they going to make another IO die with 8 - 12 CU?

If you wanted to make a lowest viable CU product, just make it part of a monolithic die. There were some other rumors about AMD doing an Athlon refresh on a newer node at Global Foundries, so it would seem odd to duplicate that using a far more expensive TSMC node.

Abwx · Jul 15, 2021

Mopetar said:
Why build an IO die on N6, though, when you don't see much size reduction (remember the physical interfaces always take up the same amount of space)? Given that AMD already can't get enough wafers to satisfy all of the demand they're seeing, it seems bizarre to go down that route.

Also pairing an IO die with such low-end GPU capabilities seems pointless outside of ensuring that everything now has some minimal onboard video. If you want something more powerful then you need yet another piece of silicon. Are they going to make another IO die with 8 - 12 CU?

If you wanted to make a lowest viable CU product, just make it part of a monolithic die. There were some other rumors about AMD doing an Athlon refresh on a newer node at Global Foundries, so it would seem odd to duplicate that using a far more expensive TSMC node.

Athlons and 8C APUs are not in the same market, but anyway there's no Zen 4 monolithic APU coming before 2023; rumour is that the 5000s replacements will include a GPU.

Rumor: AMD Ryzen 7000 (Raphael) to Introduce Integrated GPU in Full Processor Lineup

The rumor mill keeps crushing away; in this case, regarding AMD's plans for their next-generation Zen designs. Various users have shared pieces of the same AMD roadmap, which apparently places AMD in an APU-focused landscape come their Ryzen 7000 series. we are currently on AMD's Ryzen...

www.techpowerup.com

Of course a truckload of salt is to be considered...

Also, the pic posted by Computerbase seems to show three dies of different sizes; dunno if it's related to the article:

AMD CPU rumours: Raphael stays at 16 cores but gets 170 watts

AMD's upcoming Zen 4 CPU, codenamed Raphael, is expected once again not to increase the core count, but to raise the TDP instead.

www.computerbase.de

Hitman928 · Jul 15, 2021

Abwx said:
Athlons and 8C APUs are not in the same market, but anyway there's no Zen 4 monolithic APU coming before 2023; rumour is that the 5000s replacements will include a GPU.

Rumor: AMD Ryzen 7000 (Raphael) to Introduce Integrated GPU in Full Processor Lineup

The rumor mill keeps crushing away; in this case, regarding AMD's plans for their next-generation Zen designs. Various users have shared pieces of the same AMD roadmap, which apparently places AMD in an APU-focused landscape come their Ryzen 7000 series. we are currently on AMD's Ryzen...

www.techpowerup.com

Of course a truckload of salt is to be considered...

Also, the pic posted by Computerbase seems to show three dies of different sizes; dunno if it's related to the article:

AMD CPU rumours: Raphael stays at 16 cores but gets 170 watts

AMD's upcoming Zen 4 CPU, codenamed Raphael, is expected once again not to increase the core count, but to raise the TDP instead.

www.computerbase.de

That image looks like the sample Lisa Su showed of the 3d cache CPU where there's the IO die and then 2 chiplets, one with 3d cache and one without to show the difference.

Abwx · Jul 15, 2021

Hitman928 said:
That image looks like the sample Lisa Su showed of the 3d cache CPU where there's the IO die and then 2 chiplets, one with 3d cache and one without to show the difference.

Well, dunno the purpose of a single stacked chip if that's two CPU clusters on the top.

Other than this, I wonder if it's not the backported Zen 2 below:

https://twitter.com/x/status/1415717310111698945

Hitman928 · Jul 15, 2021

Abwx said:
Well, dunno the purpose of a single stacked chip if that's two CPU clusters on the top.

Other than this, I wonder if it's not the backported Zen 2 below:

https://twitter.com/x/status/1415717310111698945

It's not an actual product, it was just for demonstration purposes only to show what the stacked chip looks like versus non-stacked.

LightningZ71 · Jul 15, 2021

Mopetar said:
Why build an IO die on N6, though, when you don't see much size reduction (remember the physical interfaces always take up the same amount of space)? Given that AMD already can't get enough wafers to satisfy all of the demand they're seeing, it seems bizarre to go down that route.

Also pairing an IO die with such low-end GPU capabilities seems pointless outside of ensuring that everything now has some minimal onboard video. If you want something more powerful then you need yet another piece of silicon. Are they going to make another IO die with 8 - 12 CU?

If you wanted to make a lowest viable CU product, just make it part of a monolithic die. There were some other rumors about AMD doing an Athlon refresh on a newer node at Global Foundries, so it would seem odd to duplicate that using a far more expensive TSMC node.

Why not use N6? There are two significant issues that AMD needs to address for competitive reasons. The first is excessive power draw from their IOD chips and the IF links between the various dies. N6 can help there: it is a modest improvement over N7, which is itself a big improvement over GF 14/12LPP, the node currently in use. The N6 based IOD should have notably lower draw from PCIe 4 and from the SerDes links connecting it to the CCDs. It will also allow the memory controller to run more efficiently. This leaves a greater fraction of the package power for the CCDs.

The second big advantage is density. While N6 is only a minor density improvement over N7, it is a big improvement over GF 12/14LPP. Making the IOD more dense allows AMD the space to add a small iGPU, fixing a competitive disadvantage that they have vs. Intel. Furthermore, since Intel has moved to a Xe based iGPU, it has nontrivial performance available to the user. AMD will need a nontrivial amount of die resources to match or beat it, which means something more than the 3CU Vega solution in Raven2/Dali.
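A back-of-the-envelope sketch of the density argument, folding in the earlier objection that the physical interfaces don't shrink. The 3x logic density figure is the one quoted earlier in the thread; the die area and PHY share below are placeholder numbers picked purely for illustration, not real IOD measurements, and the function name is hypothetical.

```python
def shrunk_area(total_mm2: float, phy_fraction: float, logic_density_gain: float) -> float:
    """Estimate die area after a node shrink where only the logic part scales.

    total_mm2          -- die area on the old node
    phy_fraction       -- share of that area taken by physical interfaces (0..1)
    logic_density_gain -- density improvement applied to the remaining logic
    """
    if not 0.0 <= phy_fraction <= 1.0:
        raise ValueError("phy_fraction must be between 0 and 1")
    if logic_density_gain <= 0:
        raise ValueError("logic_density_gain must be positive")
    phy_area = total_mm2 * phy_fraction      # assumed not to shrink at all
    logic_area = total_mm2 - phy_area        # assumed to shrink by the full gain
    return phy_area + logic_area / logic_density_gain


if __name__ == "__main__":
    # Placeholder inputs: a 100 mm^2 die that is half PHY, with 3x denser logic.
    old_area = 100.0
    new_area = shrunk_area(old_area, phy_fraction=0.5, logic_density_gain=3.0)
    print(f"old: {old_area:.1f} mm^2, new: {new_area:.1f} mm^2, "
          f"freed for an iGPU at equal die size: {old_area - new_area:.1f} mm^2")
```

Both views hold at once in this toy model: the die as a whole shrinks far less than 3x, yet the logic portion frees up a meaningful chunk of area that could go to a small iGPU.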

As for choosing N6 over N7, the bits of information that we are getting indicate that it will be a long life node with minimal extra cost over N7, compatible design rules to make migrating IP easier, and a slight wafer yield improvement. While both are more expensive per wafer than GF 12/14LPP, there is a cost savings in not having to ship the IODs from their foundry to the package assembly site that offsets some of that. AMD already has much of what's needed for an N6 IOD designed for N7/N6 from working on Cezanne, Renoir, and their revisions, so that's less complicated to migrate as well.

eek2121 · Jul 15, 2021

Abwx said:
Well, dunno the purpose of a single stacked chip if that's two CPU clusters on the top.

Other than this, I wonder if it's not the backported Zen 2 below:

https://twitter.com/x/status/1415717310111698945

Van Gogh lives.
