Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
809
1,412
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

[Attached image: Untitled2.png]


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
My problem is not that AMD is selling 5950x with 4.9 GHz boost clock.

My problem is that AMD is NOT selling a version of 5800x, say 5810x with 4.9 GHz or higher boost clock.

In addition to current 5800x CPU with 4.7 GHz boost clock.
I hope you are aware you're talking about a frequency difference of 4.26%.

That would be more like old Intel style of segmentation and rationing of performance to force people to buy something they don't want in order to get something they want....
AMD apparently prefers to have as few DIY desktop SKUs as possible. The segmentation between the SKUs then has to work on several parameters to make the more costly SKUs seem worth the premium. AMD, however, doesn't segment functionality (aside from the PRO features), so that leaves the number of cores and the frequencies as available parameters, which is exactly what AMD makes use of. And the boost clock difference between 5600X, 5800X, 5900X and 5950X is exactly 100 MHz each. Maybe you'd prefer the difference to be 25 MHz? To me the difference is honestly minuscule enough that all this discussion is a whole lot of whatever...
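
For anyone who wants to check the percentages, here is a minimal Python sketch. It assumes the boost clocks implied above (4.9 GHz for the 5950X, 4.7 GHz for the 5800X, and 100 MHz steps across the stack); the names `BOOST_GHZ` and `percent_faster` are made up for illustration.

```python
# Advertised boost clocks in GHz, as implied by the posts above
# (4.9 GHz for the 5950X, 100 MHz steps down the stack).
BOOST_GHZ = {"5600X": 4.6, "5800X": 4.7, "5900X": 4.8, "5950X": 4.9}


def percent_faster(sku_a: str, sku_b: str) -> float:
    """Return how much higher sku_a's boost clock is than sku_b's, in percent."""
    return (BOOST_GHZ[sku_a] / BOOST_GHZ[sku_b] - 1.0) * 100.0


if __name__ == "__main__":
    print(f"5950X vs 5800X: {percent_faster('5950X', '5800X'):.2f}%")
    print(f"5950X vs 5600X: {percent_faster('5950X', '5600X'):.2f}%")
```

The first figure is the roughly 4.26% gap under discussion; even top to bottom of the stack the advertised boost gap stays in the single digits.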

For the client computing,
Zen2 --> Zen3 has 9% increase in effective die area for average 19% increase in IPC across all loads.
Zen3 --> Zen3D with 4 layers of V-Cache would result in 2.8x increase in effective die area for a questionable gain in general purpose compute outside of gaming.
Not to mention 2.8x die area + packaging cost would result in almost 3x more production cost per CCD. N6/7 might have gotten cheaper but I doubt 3x cheaper.
Just plain lackluster engineering if its entire purpose is to defeat Alder Lake. Along the same lines as NetBurst going for the MHz with no improvements, if not regression, elsewhere.
And what about that projected 46% GM in the Earnings call?

Zen3D was not developed to address gaming or as a response to Alder Lake. Ryzen with Zen3D is, most likely, simply some rejected dies from Milan-X, because the HPC/DC/server market can sustain those high costs and is really finding excellent use for those huge caches.
Don't expect good availability.
While this may be true, I think the per-area cost of the SRAM wafers is significantly lower: far smaller dies, higher yield (also due to far more repetitive patterns), and I think those pure SRAM V-Cache dies need fewer layers than a full CPU design (not sure about that)?

So the advantage of this approach at this point very likely is rejuvenating an already paid off CPU die design without the whole cost for a new mask, validation etc. for the CPU die itself. The SRAM die may even be stock TSMC (so no added cost)?
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
For the client computing,
Zen2 --> Zen3 has 9% increase in effective die area for average 19% increase in IPC across all loads.
Zen3 --> Zen3D with 4 layers of V-Cache would result in 2.8x increase in effective die area for a questionable gain in general purpose compute outside of gaming.
Not to mention 2.8x die area + packaging cost would result in almost 3x more production cost per CCD. N6/7 might have gotten cheaper but I doubt 3x cheaper.

As a reality check, the current lowest end GPU AMD is making is 237 mm2 of a logic die + board + memory + power management. All this effort for gaming, all for $379

Why does it seem insurmountable to have less die area of a much cheaper SRAM die, with higher yields, in a CPU that will sell more, with a lower bill of materials?

If 224mm2 is your ceiling, should both AMD and NVidia immediately stop producing graphics cards?
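
The 2.8x and 224 mm2 figures can be reproduced with a quick sketch. The inputs are assumptions, not numbers stated in this thread: roughly 80 mm2 for a Zen 3 CCD and roughly 36 mm2 per V-Cache die. The function names are hypothetical.

```python
# Assumed die sizes in mm^2 (not stated in the thread): roughly 80 for a
# Zen 3 CCD and roughly 36 for one V-Cache SRAM die.
CCD_MM2 = 80.0
VCACHE_MM2 = 36.0


def stacked_silicon_mm2(layers: int) -> float:
    """Total silicon area of one CCD plus `layers` stacked SRAM dies."""
    return CCD_MM2 + layers * VCACHE_MM2


def area_ratio(layers: int) -> float:
    """Stacked silicon area relative to a bare CCD."""
    return stacked_silicon_mm2(layers) / CCD_MM2


if __name__ == "__main__":
    for layers in (1, 4):
        total = stacked_silicon_mm2(layers)
        print(f"{layers} layer(s): {total:.0f} mm2, {area_ratio(layers):.2f}x")
```

Under these assumptions four layers come to 224 mm2 (2.8x), while a single layer is 116 mm2 (1.45x). Area is only a rough proxy for cost here, since SRAM yield and packaging cost pull in opposite directions.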

Just plain lackluster engineering if its entire purpose is to defeat Alder Lake. Along the same lines as NetBurst going for the MHz with no improvements, if not regression, elsewhere.
And what about that projected 46% GM in the Earnings call?

NetBurst is known (infamous) for low IPC.
V-Cache will be known for improving IPC

Zen3D was not developed to address gaming or as a response to Alder Lake. Ryzen with Zen3D is, most likely, simply some rejected dies from Milan-X, because the HPC/DC/server market can sustain those high costs and is really finding excellent use for those huge caches.
Don't expect good availability.

How different is it from Zen 3, which was impossible to buy at MSRP for 6 months?
Should AMD have not released Zen 3 for client / desktop?

All of the Ryzen 5000x product line has dies that could be sold in Milan server chips.
Should AMD have not released Zen 3 for client / desktop?

I am wondering why people are inventing these random bars to clear that apply only to Zen 3D and no other product.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
I hope you are aware you're talking about a frequency difference of 4.26%.

It is also symbolic, as in making it a priority, taking a segment which most benefits from such a CPU - gaming - seriously.

And BTW, if 4.9 GHz is the limit in 16 core CPU, it is possible that 5 GHz may be achievable in 8 core CPU

AMD apparently prefers to have as few DIY desktop SKUs as possible. The segmentation between the SKUs then has to work on several parameters to make the more costly SKUs seem worth the premium.

I noticed the few SKUs.

AMD, however, doesn't segment functionality (aside from the PRO features), so that leaves the number of cores and the frequencies as available parameters, which is exactly what AMD makes use of. And the boost clock difference between 5600X, 5800X, 5900X and 5950X is exactly 100 MHz each. Maybe you'd prefer the difference to be 25 MHz? To me the difference is honestly minuscule enough that all this discussion is a whole lot of whatever...

No, I would go from 1 SKU for 8 core to 2 (high clock and lower clock)
Also 2 SKUs for 16 core (high clock and lower clock).

That's all. All together, from 4 5000x SKUs to 6.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
While this may be true, I think the per-area cost of the SRAM wafers is significantly lower: far smaller dies, higher yield (also due to far more repetitive patterns), and I think those pure SRAM V-Cache dies need fewer layers than a full CPU design (not sure about that)?

So the advantage of this approach at this point very likely is rejuvenating an already paid off CPU die design without the whole cost for a new mask, validation etc. for the CPU die itself. The SRAM die may even be stock TSMC (so no added cost)?
I don't think anybody on the forums is trying to argue about the advantages, because that is the whole purpose of the architectural design for chiplets and stacking in the first place. AMD has been beating this drum for a while.

But to me the noise about Zen 3D being a primary response to Alder Lake does not make sense; it was designed for something else and trickles down to desktop.
The Daytona platform has had BIOS support for V-Cache for a while, because it was designed for that market all along.
Regarding the cost of each SRAM die, I won't argue since I don't have data.
But stacking SRAM is not going to be cheap. A similar scenario exists with DRAM: there is a reason why DRAM of a given capacity is much cheaper than HBM even though they use similar underlying DRAM dies, simply because of the number of TSVs needed and the binning needed to get the final KGSDs.
 

eek2121

Diamond Member
Aug 2, 2005
3,100
4,398
136
It is also symbolic, as in making it a priority, taking a segment which most benefits from such a CPU - gaming - seriously.

And BTW, if 4.9 GHz is the limit in 16 core CPU, it is possible that 5 GHz may be achievable in 8 core CPU



I noticed the few SKUs.



No, I would go from 1 SKU for 8 core to 2 (high clock and lower clock)
Also 2 SKUs for 16 core (high clock and lower clock).

That's all. All together, from 4 5000x SKUs to 6.

The 5950X already hits 5050 MHz (5.05 GHz) out of the box.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
Yes, it is stock. I said “out of the box”.

AMD does not advertise it because some cores don’t quite reach it. On my machine, 4 cores only reach 4.95 GHz, the rest will reach 5.05 GHz.

That's VERY nice....

I have to check my CPU, what clock speeds I am getting....

BTW, this may explain some of the oddities of testing, if a CPU can optionally exceed its official boost frequencies out of the box.

Or is this only the feature of 5950x?
 

LightningZ71

Golden Member
Mar 10, 2017
1,798
2,156
136
It is also symbolic, as in making it a priority, taking a segment which most benefits from such a CPU - gaming - seriously.

And BTW, if 4.9 GHz is the limit in 16 core CPU, it is possible that 5 GHz may be achievable in 8 core CPU



I noticed the few SKUs.



No, I would go from 1 SKU for 8 core to 2 (high clock and lower clock)
Also 2 SKUs for 16 core (high clock and lower clock).

That's all. All together, from 4 5000x SKUs to 6.

They ALREADY have more than 4 SKUs. There are lower wattage non-x SKUs that are offered to OEMs. The 5900 and 5800 have been out for a while now.
 

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
It is also symbolic, as in making it a priority, taking a segment which most benefits from such a CPU - gaming - seriously.
🤦
And BTW, if 4.9 GHz is the limit in 16 core CPU, it is possible that 5 GHz may be achievable in 8 core CPU
Seems you don't know how AMD's boost frequencies work with Zen 3 chips: Unlike for previous gens these are guaranteed frequencies, and most chips pass them at stock. And 5 GHz is already achievable depending on the chip.
But to me the noise about Zen 3D being a primary response to Alder Lake does not make sense; it was designed for something else and trickles down to desktop.
The Daytona platform has had BIOS support for V-Cache for a while, because it was designed for that market all along.
Of course Zen 3D being a gaming performance response to Alder Lake is nonsense, I sure hope everybody sane sees it's a PR move first and foremost. One targeted at the consumer audience, so obviously no talk about servers. But of course it was first designed for use in server like everything Zen was. My previous guess was that Zen 3 was originally planned together with 3D stacking, and either market circumstances (weak competition) didn't require AMD to go all out with the most costly version, or the launch schedule for 3D stacking slipped, or Zen 3 sans stacking was actually moved ahead. Maybe a mix of all that.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
Of course Zen 3D being a gaming performance response to Alder Lake is nonsense, I sure hope everybody sane sees it's a PR move first and foremost. One targeted at the consumer audience, so obviously no talk about servers.

The talk is about Ryzen class SKUs of Zen 3D. That the SKUs are going to be designated as a gaming response to Alder Lake gaming SKU launch.

But of course it was first designed for use in server like everything Zen was.

No one is arguing otherwise.

But the chiplet approach allows AMD to share the server chiplet with high end desktop, so the chiplet based Ryzen processors get to benefit from it.

My previous guess was that Zen 3 was originally planned together with 3D stacking, and either market circumstances (weak competition) didn't require AMD to go all out with the most costly version, or the launch schedule for 3D stacking slipped, or Zen 3 sans stacking was actually moved ahead. Maybe a mix of all that.

The launch dates did not align for Zen 3 to be launched with V-Cache, but since Zen 3 is such a flexible design, it is doing exceedingly well in the market place even without V-Cache.

The automated assembly facility (of TSMC) that will specialize in this sort of stacking was due to come online in May 2021. I am not sure what the actual completion date is (or was). But the stars are now all aligned for the Zen 3D.
 

jpiniero

Lifer
Oct 1, 2010
15,223
5,768
136
Of course Zen 3D being a gaming performance response to Alder Lake is nonsense,

The 3D stacking in general, yes, that was Milan-X. But AMD must have decided that doing a 1 layer version of the chiplet (for desktop gaming perf) was better than the alternatives given the time frame.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
Zen4 seems to belong to the same Family 19h as Zen3. Core uArch and ISA (Zen 3 added a whole bunch of extensions to the ISA over Zen2 though) would largely remain similar. :confused:

I wonder where the major part of the perf will come from.
AVX512 is confirmed. Pretty clear, given that AMD never objected to x86-64 feature level 4 (x86-64-v4) in gcc/clang making AVX512 mandatory.
Interposer will have to wait for a bit.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
Zen4 seems to belong to the same Family 19h as Zen3. Core uArch and ISA (Zen 3 added a whole bunch of extensions to the ISA over Zen2 though) would largely remain similar. :confused:

I wonder where the major part of the perf will come from.
AVX512 is confirmed. Pretty clear, given that AMD never objected to x86-64 feature level 4 (x86-64-v4) in gcc/clang making AVX512 mandatory.
Interposer will have to wait for a bit.

Almost no die size savings going from GloFo 12nm to 6nm TSMC.
424mm2 vs 397mm2

But a lot of room to stack something on top of this die ;)
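
To put a number on "almost no savings", a small sketch using only the two die sizes quoted above, read as 424 mm2 on GloFo 12nm and 397 mm2 on TSMC 6nm; the helper name `area_saving_percent` is made up for illustration.

```python
def area_saving_percent(old_mm2: float, new_mm2: float) -> float:
    """Percentage reduction in die area going from old_mm2 to new_mm2."""
    return (old_mm2 - new_mm2) / old_mm2 * 100.0


if __name__ == "__main__":
    # 424 mm^2 (GloFo 12nm) vs 397 mm^2 (TSMC 6nm), per the post above.
    print(f"{area_saving_percent(424.0, 397.0):.1f}% smaller")
```

That works out to a reduction of a bit over 6%, despite the jump of several process generations.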
 

leoneazzurro

Golden Member
Jul 26, 2016
1,052
1,716
136
Almost no die size savings going from GloFo 12nm to 6nm TSMC.
424mm2 vs 397mm2

But a lot of room to stack something on top of this die ;)

Well, it seems the I/O die will use 12 channels of DDR5 RAM while the older one was using 8 channels of DDR4. Moreover, it should have more IF links. I think the space needed to accommodate all these connections alone will be a big part of the die size.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,471
106
Well, it seems the I/O die will use 12 channels of DDR5 RAM while the older one was using 8 channels of DDR4. Moreover, it should have more IF links. I think the space needed to accommodate all these connections alone will be a big part of the die size.

Perhaps the bumps alone needed to accommodate these dictate the minimum die size.