Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

Abwx

Lifer
Apr 2, 2011
10,970
3,523
136
40-45% gain over previous gen 16-core part is simply unachievable, even with increase in clock speeds. Go look at each 6/8/12/16 Core SKU from Zen 2 -> Zen 3.
Zen 2 to Zen 3 was the same process, enough said...

If the single thread is up 15% and the multi-thread is up 33%, then SMT (efficiency/performance enhancement) has increased by 16% for a given clock.

Are we thinking that single-thread clocks are lower than multi-thread clocks? That is the only scenario where the SMT uplift is higher than the single-thread uplift.

The 7800X does 21000 pts, that's 2625 pts/core, and SMT efficiency is 32% in Cinebench, which would put the ST score at 1988 pts at the all-core frequency.

If all-core is 5GHz for this SKU, then ST perf at the same clock would be 20% higher than Zen 3, since a 5950X does 1657 pts; or possibly the all-core frequency for the 7800X is 5.25GHz.
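
For anyone who wants to check the arithmetic above, here is a rough sketch. All inputs are the leaked or assumed figures quoted in this post (the 21000 pts MT score, the 32% SMT uplift, the 1657 pts 5950X ST score); the function and variable names are made up for illustration.

```python
# Back-of-envelope estimate; every input is a leaked or assumed figure from the post.
MT_SCORE_7800X = 21000   # leaked R23 MT score (unconfirmed)
CORES = 8
SMT_UPLIFT = 0.32        # assumed SMT gain in Cinebench
ST_SCORE_5950X = 1657    # 5950X ST score quoted in the post


def st_estimate(mt_score, cores, smt_uplift):
    """Estimate the one-thread-per-core score at all-core clocks."""
    per_core = mt_score / cores
    return per_core / (1 + smt_uplift)


per_core = MT_SCORE_7800X / CORES
st_all_core = st_estimate(MT_SCORE_7800X, CORES, SMT_UPLIFT)
gain_vs_5950x = st_all_core / ST_SCORE_5950X - 1

print(f"per core: {per_core:.0f} pts")
print(f"ST at all-core clock: {st_all_core:.1f} pts")
print(f"gain vs 5950X ST: {gain_vs_5950x:.1%}")
```

The result only holds if the 5950X ST score was obtained at roughly the same clock as the assumed all-core frequency.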
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
My conclusion based on the limited data we have is that AMD didn't put the best performance in the ST estimate (hence the vague >15% ST; at what ST clocks?). The workload showed diminishing IPC returns on past Zen gens (Zen1->Zen2->Zen3), and the leaked MT performance in R23 implies a very high all-core boost (maybe greater than 5.2GHz) with a moderate IPC jump of around 10% in this test. So they most likely ran a 5.2GHz ST boost on the sample they tested, which resulted in a ~15% increase over the 5950X.

Frankly, I don't follow. We don't have any MT CBr23 scores from AMD yet.

Even if we assume 33% higher MT score, why would you assume that over PBO? I assume PBO will scale worse with Zen 4 because it comes with a higher power target out of the box.
The leak is from a RedGamingTech video; he showed an Excel table with scores for each SKU (16C->8C). If we don't assume PBO is working for the 5950X and compare the leaked R23 MT results to a 5950X running at around 3.8-3.9GHz all-core boost in the R23 MT test, that means there is even a slight IPC regression if the score is 33% higher while the all-core boost is 36% higher (makes no sense).
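
The regression claim above is just a ratio of two gains. A minimal sketch, using the 33% and 36% figures from the post as inputs (both are estimates from a leak, and `implied_ipc_change` is a name made up here):

```python
def implied_ipc_change(score_gain, clock_gain):
    """Per-clock change implied by a score gain and a clock gain (both fractions)."""
    return (1 + score_gain) / (1 + clock_gain) - 1


# 33% higher MT score with a 36% higher all-core clock
ipc = implied_ipc_change(0.33, 0.36)
print(f"implied per-clock change: {ipc:+.1%}")
```

A negative result means the assumed clock gain is larger than the score gain can support, so either the clock estimate or the score is off.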
 

tamz_msc

Diamond Member
Jan 5, 2017
3,821
3,643
136
Zen 2 to Zen 3 was the same process, enough said...
As per CPU-monkey. 3950X = 24000 pts. 2950X = 21000 pts.

Process advantage means nothing in this context.

But of course, feel free to keep digging that hole you've dug for yourself with those outrageous numbers.
 

gdansk

Platinum Member
Feb 8, 2011
2,123
2,630
136
The leak is from a RedGamingTech video; he showed an Excel table with scores for each SKU (16C->8C). If we don't assume PBO is working for the 5950X and compare the leaked R23 MT results to a 5950X running at around 3.8-3.9GHz all-core boost in the R23 MT test, that means there is even a slight IPC regression if the score is 33% higher while the all-core boost is 36% higher (makes no sense).
Or you estimate the all core clock is higher than it is.
5-8% IPC in CBr23 and about 900MHz higher sustained fits the numbers. But it is not what we hoped.
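
A quick sanity check of those numbers, as a sketch only. The 3.85GHz baseline is an assumption (the midpoint of the 3.8-3.9GHz 5950X all-core range quoted above), and `mt_gain` is a made-up helper name:

```python
BASE_CLOCK_GHZ = 3.85    # assumed 5950X all-core clock in R23 (midpoint of 3.8-3.9)
CLOCK_DELTA_GHZ = 0.9    # "about 900MHz higher sustained"


def mt_gain(ipc_gain, base_ghz, delta_ghz):
    """MT score gain if per-clock gain and clock gain simply multiply."""
    clock_ratio = (base_ghz + delta_ghz) / base_ghz
    return (1 + ipc_gain) * clock_ratio - 1


for ipc_gain in (0.05, 0.08):
    gain = mt_gain(ipc_gain, BASE_CLOCK_GHZ, CLOCK_DELTA_GHZ)
    print(f"IPC +{ipc_gain:.0%} -> MT {gain:+.1%}")
```

Under these assumptions the range lands between roughly 30% and 33%, which is consistent with the leaked 33% figure.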
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
As per CPU-monkey. 3950X = 24000 pts. 2950X = 21000 pts.

Process advantage means nothing in this context.

But of course, feel free to keep digging that hole you've dug for yourself with those outrageous numbers.

ZEN 2 = Ryzen 3xxx
ZEN 3 = Ryzen 5xxx

2950X is not ZEN 2 but ZEN+ @ 12nm
 

Abwx

Lifer
Apr 2, 2011
10,970
3,523
136
As per CPU-monkey. 3950X = 24000 pts. 2950X = 21000 pts.

Process advantage means nothing in this context.

But of course, feel free to keep digging that hole you've dug for yourself with those outrageous numbers.

But there's a process refinement between the 1950X and the 2950X, like between the 1800X and the 2700X, from 14nm to 12nm.

What about comparing the 1800X to a 3800X..?..


Difference is 39%.

Besides, it's 22% between the 3800X and the 5800X despite the same process.

Clearly you are just here to make things look bad for AMD; you jumped on an erroneous 31% perf improvement in Blender without even noticing that it was 45%.

Worse, you added that 8000 more pts would put ADL 30% ahead of a 5950X, enough to overtake Zen 4, without noticing that the latter is 40% faster than a 5950X...

So after so many blunders you still have the guts to patronize the gallery..?..
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
Or you estimate the all core clock is higher than it is.
5-8% IPC in CBr23 and about 900MHz higher sustained fits the numbers. But it is not what we hoped.
There are many unknowns regarding the boost clocks on Zen4, and that's been throwing off the IPC calculations by a large margin. We can only guess, but I think R23 is on the lower end of the scale and the geo-mean average will be 15%, while the boost clocks will reach 5.6GHz for ST and 5GHz for MT (top SKU).
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
Or you estimate the all core clock is higher than it is.
5-8% IPC in CBr23 and about 900MHz higher sustained fits the numbers. But it is not what we hoped.

At the end what we want is performance.

So if we get 15% higher ST performance coming from 5% IPC + 10% higher clocks or
If we get 15% higher ST performance coming from 15% IPC and zero higher clocks

The end result is the same ;)
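
As a rough sketch of the two scenarios, assuming IPC and clock gains simply multiply (a simplification, since memory-bound work does not scale perfectly with clock; `st_gain` is a made-up name):

```python
def st_gain(ipc_gain, clock_gain):
    """ST performance gain when IPC and clock gains compound multiplicatively."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1


mixed = st_gain(0.05, 0.10)      # 5% IPC + 10% higher clocks
ipc_only = st_gain(0.15, 0.00)   # 15% IPC, same clocks

print(f"5% IPC + 10% clocks: {mixed:.1%}")
print(f"15% IPC, same clocks: {ipc_only:.1%}")
```

Strictly speaking, 5% and 10% compound to slightly more than 15%, but the two cases land within about half a point of each other.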
 

deasd

Senior member
Dec 31, 2013
520
763
136
Paul Alcorn on Twitter: "AMD has confirmed that the 170W figure for AM5 is PPT, not TDP." / Twitter

And that's the power draw in that Blender render now confirmed. 170W, not the 230W we first assumed.

Unless AMD is sandbagging, or AMD marketing simply messed up:

An official MSI document shows there is indeed a 170W TDP option, which implies a so-called ~230W PPT.

(What if the Zen 4 ES shown today (125W TDP) is not the top model (170W TDP)?)
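
For reference, a sketch of where the ~230W figure comes from. It assumes the 1.35x PPT-to-TDP ratio seen on AM4 (105W TDP, 142W PPT) carries over to AM5, which is not confirmed anywhere in this thread; `ppt_from_tdp` is a made-up helper.

```python
PPT_TO_TDP = 1.35  # assumption: ratio observed on AM4, not confirmed for AM5


def ppt_from_tdp(tdp_w):
    """Estimate socket power limit (PPT) in watts from a TDP rating."""
    return tdp_w * PPT_TO_TDP


for tdp in (105, 125, 170):
    print(f"{tdp}W TDP -> ~{ppt_from_tdp(tdp):.0f}W PPT")
```

Under that assumption a 125W TDP part would land at about 170W PPT, which is one way both figures could be right at the same time.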
 

ondma

Platinum Member
Mar 18, 2018
2,721
1,281
136
At the end what we want is performance.

So if we get 15% higher ST performance coming from 5% IPC + 10% higher clocks or
If we get 15% higher ST performance coming from 15% IPC and zero higher clocks

The end result is the same ;)
Except usually the performance from higher clocks uses more power than an IPC increase, at least if one is pushing the clocks to the max.
 

gdansk

Platinum Member
Feb 8, 2011
2,123
2,630
136
At the end what we want is performance.

So if we get 15% higher ST performance coming from 5% IPC + 10% higher clocks or
If we get 15% higher ST performance coming from 15% IPC and zero higher clocks

The end result is the same ;)
Ideally but history has shown the speed demons are flamethrowers while the IPC monsters are relatively more efficient.

But with 5nm Zen 4 will probably tame down fine in servers and laptops.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,821
3,643
136
But there's a process refinement between the 1950X and the 2950X, like between the 1800X and the 2700X, from 14nm to 12nm.

What about comparing the 1800X to a 3800X..?..


Difference is 39%.

Besides, it's 22% between the 3800X and the 5800X despite the same process.

Clearly you are just here to make things look bad for AMD; you jumped on an erroneous 31% perf improvement in Blender without even noticing that it was 45%.

Worse, you added that 8000 more pts would put ADL 30% ahead of a 5950X, enough to overtake Zen 4, without noticing that the latter is 40% faster than a 5950X...

So after so many blunders you still have the guts to patronize the gallery..?..
That wasn't the point at all. The point was how the tweet of Hans De Vries makes no sense whatsoever. The biggest MT gains always come for the lower core parts, and it is a decreasing sequence of gains as you go on increasing cores. That is because for a given PPT the power available for each core, and hence performance per core, decreases.
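
To illustrate the per-core power point, a crude sketch. It assumes the same 142W PPT for every core count (the stock limit of 105W-TDP Zen 3 parts) and ignores IOD and other uncore power, so the absolute values are too high; only the trend matters.

```python
PPT_W = 142  # assumption: same socket power limit for every SKU, uncore ignored


def power_per_core(ppt_w, cores):
    """Naive per-core power budget in watts: the whole PPT split evenly."""
    return ppt_w / cores


budgets = {cores: power_per_core(PPT_W, cores) for cores in (6, 8, 12, 16)}
for cores, watts in budgets.items():
    print(f"{cores:2d} cores: {watts:.1f} W per core")
```

With less power per core, the high-core-count parts sit lower on the voltage/frequency curve under all-core load, which is why their MT gains per generation would normally be expected to shrink rather than grow.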
 

LightningZ71

Golden Member
Mar 10, 2017
1,628
1,898
136
There are likely other factors associated with the higher than expected performance scaling of the 12 and 16 core parts from the red leak.

1) There is likely a bit of memory bus saturation going on in the 5900X and 5950X that is somewhat improved by the much faster performance of the DDR5 setup on the Zen 4 product.

2) The 12 and 16 core parts still require some level of communication between the CCDs via the IF link through the IOD. The IF links must run faster in Zen 4 to make use of the increased bandwidth of the DDR5 memory.

3) The IOD itself likely operates more quickly overall due to the improved process that it was made with. This likely removes some latency involved in various memory requests and CCD crosstalk.

Combining removing bottlenecks with CCD performance improvements can certainly result in higher MT scaling for 12+ core parts.

If that bears out, even with modest IPC improvements, it leads me to believe that AMD can make significant MT improvements against the 12900k sufficient to retain overall (but certainly not complete as edge cases will always exist) performance leadership over the 13900k when it is released.

I do expect that the move to N5 will enable significant MT improvements across the entire product stack as all-core boost numbers should be sustainable at notably higher frequencies and larger L2s and faster DRAM will keep the cores better fed.
 

Hulk

Diamond Member
Oct 9, 1999
4,230
2,017
136
The architecture on these machines (Zen3/ADL) is already so wide that it has been my opinion that significant ST performance increases due to architectural enhancements are going to be tough to come by. The low hanging fruit, and even the fruit higher up the tree has been picked. The only fruit left is really high up and hard to harvest.

Now a few admittedly more knowledgeable members here have informed me that there is plenty of IPC to be gained from architecture.

That being said (written), given what AMD has reported thus far it seems ST IPC on these current architectures is topping out. Perhaps the battle will be moving to cache and the rest of the memory subsystem in an effort to most efficiently "feed the beast?"
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
Except usually the performance from higher clocks uses more power than an IPC increase, at least if one is pushing the clocks to the max.

Ideally but history has shown the speed demons are flamethrowers while the IPC monsters are relatively more efficient.

But with 5nm Zen 4 will probably tame down fine in servers and laptops.

RDNA2 is a speed demon but power consumption is fine.
Don't forget we are also transitioning to 5nm, which brings lower power vs 7nm.
 

gdansk

Platinum Member
Feb 8, 2011
2,123
2,630
136
RDNA2 is a speed demon but power consumption is fine.
Don't forget we are also transitioning to 5nm, which brings lower power vs 7nm.
GPUs are something of a different story than CPUs. Still when at the edge of their efficiency curve it's a mess. See the relative inefficiency of the 6700XT compared to the 6900XT, for example.
 

Paul98

Diamond Member
Jan 31, 2010
3,732
199
106
Seeing that 5.52GHz in gaming, and not just as a simple single-thread max, along with any IPC gains and DDR5, we should see a good boost. It should be interesting to compare against the 5800X3D, though. I wonder if we will see V-Cache CPUs later on, or whether that will wait for Zen5.

But as others have said, a lot of it is the platform changes.

I am looking forward to Zen5 more than Zen4, as Zen4 is a new platform with new memory and the start of PCIe 5.0. I expect there to be some issues that should be worked out over that first generation, along with prices and availability of DDR5 and PCIe 5.0 NVMe drives.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
About Blender: without data from a 5950X in the same benchmark, talking about the MT performance of ZEN 4 vs ZEN 3 is pointless.



GPUs are something of a different story than CPUs. Still when at the edge of their efficiency curve it's a mess. See the relative inefficiency of the 6700XT compared to the 6900XT, for example.

I'm not talking about taking a CPU and increasing its clocks, but rather designing a CPU with a 5% increase in IPC and 10% higher clocks to get 15% higher ST.
Those are two different things.
 

LightningZ71

Golden Member
Mar 10, 2017
1,628
1,898
136
Cinebench doesn't care about system memory or inter-core latency and bottlenecks.
It has shown to be moderately sensitive to IF clocks on the two CCD products. Uncoupled ram setups and lower IF clock setups do somewhat underperform properly configured systems. It's certainly not a big effect, but it's there.