Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 
  • Like
Reactions: richardllewis_01

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
It seems to be a worst-case scenario for Zen4's ST performance uplift vs. Zen3. CBR23 anyway. AMD has indicated "15%+", where 15% is CBR23 ST.

Even that is questionably low. They stated 8-10% IPC jump in specific apps (amongst which is CBR23). If the top chips clock north of 5.5Ghz, then the uplift is at minimum ~21%.
It only makes sense if the claim covers the whole Zen4 range where 7600X has the lowest ST boost of approx. 5.2Ghz => 5.2/4.9 x 1.08 ~= 1.15

For top SKU, if it can reach the 5.8Ghz for ST as some rumors indicated, then we have : 5.8/4.9 x 1.08 ~= 1.28 or 28% higher ST CBR23 score versus 5950X.
If the IPC in Cinebench is closer to the upper cited band (~10%), the uplift is ~30%.
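
As a rough sanity check of the arithmetic above, here is a minimal Python sketch. It assumes the ST score scales linearly with clock and with IPC, which is the same simplification the post uses. The function name `st_uplift` and the scenario labels are made up for illustration, and the clocks are the rumored or assumed values from the post, not confirmed specs.

```python
def st_uplift(new_clock_ghz, old_clock_ghz, ipc_gain):
    """Relative ST score, assuming score scales linearly with clock x IPC."""
    return (new_clock_ghz / old_clock_ghz) * (1.0 + ipc_gain)

BASELINE_GHZ = 4.9  # 5950X ST boost figure used in the post

# (label, assumed ST clock in GHz, assumed IPC gain)
scenarios = [
    ("7600X-like, 5.2 GHz, +8% IPC", 5.2, 0.08),
    ("5.5 GHz, +8% IPC", 5.5, 0.08),
    ("5.8 GHz, +8% IPC", 5.8, 0.08),
    ("5.8 GHz, +10% IPC", 5.8, 0.10),
]

for label, clock, ipc in scenarios:
    gain = st_uplift(clock, BASELINE_GHZ, ipc) - 1.0
    print(f"{label}: {gain:+.1%}")
```

Real chips rarely scale perfectly with clock, so these numbers are best read as upper-ish estimates.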
 

DisEnchantment

Golden Member
Mar 3, 2017
1,608
5,816
136
Even that is questionably low. They stated 8-10% IPC jump in specific apps (amongst which is CBR23).
Zen 4 will not have 8% IPC in CB R23. AMD added SPEC and GB besides CB in the footnotes for the Zen 4 IPC estimation because using only CB would make the IPC gains look bad.
Zen 4 does not introduce additional FP execution ports (or even int ports, going by the leaked manual). Similarly, the compiler's execution cost configuration remains unchanged, so we can surmise there is no change in throughput either. For FP-execute-bound loads like CB, where the cache subsystem and FE are only mildly engaged, expect low single-digit IPC gains at best.
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
Zen 4 will not have 8% IPC in CB R23. AMD added SPEC and GB besides CB in the footnotes for the Zen 4 IPC estimation because using only CB would make the IPC gains look bad.
Zen 4 does not introduce additional FP execution ports (or even int ports, going by the leaked manual). Similarly, the compiler's execution cost configuration remains unchanged, so we can surmise there is no change in throughput either. For FP-execute-bound loads like CB, where the cache subsystem and FE are only mildly engaged, expect low single-digit IPC gains at best.
If that's true, then SPEC and GB would need to provide much larger ST uplifts for the average to be close to 10%. I don't see that happening, as GB gains are usually very close to Cinebench gains, so that would leave only SPEC to pull up the average by a lot (think >18%).
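
To make that averaging argument concrete, a small sketch follows. It assumes AMD's headline figure is a simple arithmetic mean over three benchmarks, which the footnotes do not confirm (it could just as well be a geomean or a wider app list). The inputs (9% average, ~4% each for Cinebench and Geekbench) are hypothetical placeholders, and `required_third_gain` is an illustrative name.

```python
def required_third_gain(target_avg, gain_a, gain_b):
    """Gain the third benchmark needs so the simple mean of three equals target_avg."""
    return 3 * target_avg - gain_a - gain_b

# Hypothetical inputs: ~9% stated average, Cinebench and Geekbench both at ~4%.
spec_needed = required_third_gain(0.09, 0.04, 0.04)
print(f"SPEC would need about {spec_needed:.0%}")
```

With lower Cinebench/Geekbench gains or a 10% target, the required SPEC gain only goes up.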
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
I just checked the latest leak on twitter (Genoa ES), and Zen4 ES @ 3.5Ghz scored 1227 pts in R23 ST test. On the other hand Zen3 (both non-Vcache and Vcache) @ 3.52Ghz scored 1151/1143 pts. This shows ~7% IPC increase based on the ES results for 1T.

EDIT: Zen3 server ST scores in R23 are in line with 5950X's results (almost perfectly) : 1151 x 4.9/3.52 ~= 1600 pts
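
For anyone who wants to redo the iso-clock comparison, here is a minimal sketch using the leaked numbers quoted above. It assumes both parts really ran at the stated clocks for the whole run (which the leak itself cannot guarantee) and that score scales linearly with frequency; `points_per_ghz` is just an illustrative helper.

```python
def points_per_ghz(score, clock_ghz):
    """Score normalized by clock, a crude stand-in for IPC."""
    return score / clock_ghz

zen4_es = points_per_ghz(1227, 3.50)  # Genoa ES, R23 ST
zen3 = points_per_ghz(1151, 3.52)     # Zen 3 server part, R23 ST

ipc_gain = zen4_es / zen3 - 1.0
print(f"Iso-clock gain: {ipc_gain:.1%}")

# Scale the Zen 3 server score to the 5950X's 4.9 GHz ST boost.
scaled = 1151 * 4.9 / 3.52
print(f"Zen 3 server score scaled to 4.9 GHz: {scaled:.0f}")
```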
 

DisEnchantment

Golden Member
Mar 3, 2017
1,608
5,816
136
GB gains are usually very close to Cinebench gains,
GB scales with many things besides FP... cryptographic instructions, bandwidth, int instructions, cache latency, etc. CB is mostly FP execution bound.
I just checked the latest leak on twitter (Genoa ES), and Zen4 ES @ 3.5Ghz scored 1227 pts in R23 ST test. On the other hand Zen3 (both non-Vcache and Vcache) @ 3.52Ghz scored 1151/1143 pts. This shows ~7% IPC increase based on the ES results for 1T.
Based on what data are you sure that it is running fixed at that clock?
 

Abwx

Lifer
Apr 2, 2011
10,967
3,489
136
I just checked the latest leak on twitter (Genoa ES), and Zen4 ES @ 3.5Ghz scored 1227 pts in R23 ST test. On the other hand Zen3 (both non-Vcache and Vcache) @ 3.52Ghz scored 1151/1143 pts. This shows ~7% IPC increase based on the ES results for 1T.

EDIT: Zen3 server ST scores in R23 are in line with 5950X's results (almost perfectly) : 1151 x 4.9/3.52 ~= 1600 pts
I pointed out that those results are dubious: not only is there no IPC gain in R23, but there are also large regressions in R20 and R15.

FTR the 5950X scores 1657 in R23, so that's actually a regression as well.

 
  • Like
Reactions: Kaluan and Tlh97

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
GB scales with many things besides FP... cryptographic instructions, bandwidth, int instructions, cache latency, etc. CB is mostly FP execution bound.

Based on what data are you sure that it is running fixed at that clock?
The guy who did the tests said so? You can see the specs of the Genoa ES in the chart, the max clock is ~3.5Ghz. There is also a screenshot while it's running ST tests, the clock is ~3.46Ghz...

1659009420391.png

I pointed out that those results are dubious: not only is there no IPC gain in R23, but there are also large regressions in R20 and R15.

FTR the 5950X scores 1657 in R23, so that's actually a regression as well.

The R23 ST score for Zen3 server parts is ~3% off from the desktop part, which is normal due to slower DDR4 memory. The test is not perfect; the guy said it's all done on ES HW and pre-production BIOS/firmware. It still gives us ballpark values for R23, at least.
 

Abwx

Lifer
Apr 2, 2011
10,967
3,489
136
The R23 ST score for Zen3 server parts is ~3% off from the desktop part, which is normal due to slower DDR4 memory. The test is not perfect; the guy said it's all done on ES HW and pre-production BIOS/firmware. It still gives us ballpark values for R23, at least.

You mean Zen 4?
Genoa uses DDR5, and even then RAM has no influence on ST scores.

If the score is 1151@3.46GHz then it would score 1663@5GHz. That's the same score as the 5950X@5GHz, since the Computerbase sample runs at this frequency for CB ST and is measured at 1657.

Edit: The only way the leaked score would be relevant is if the frequency hovers somewhere between base frequency and 3.46GHz.
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
You mean Zen 4?
Genoa uses DDR5, and even then RAM has no influence on ST scores.

If the score is 1151@3.46GHz then it would score 1663@5GHz. That's the same score as the 5950X@5GHz, since the Computerbase sample runs at this frequency for CB ST and is measured at 1657.
No, I meant Zen 3. Zen 3 parts scored 1151pts in R23, check the chart again. That puts it in line with 5950X, adjusted for clocks (~3% slower as it's server vs desktop).


Genoa ES part scored 1227pts @ 3.5Ghz(~3.46Ghz), which is 7% faster than Zen 3 parts at around the same clock.
1659010833007.png

This means that 7950X @ 5.8Ghz would score 5.8/4.9 x 1.07 x 1650 = 2090 pts. Knowing that for Zen3 the server parts scored ~3% lower due to memory being slower (my guess), the best case scenario for 7950X is 2090 x 1.03~= 2152 pts
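
The projection above can be sketched as follows. All inputs are the post's own assumptions (1650 pts for a 5950X at 4.9 GHz, a rumored 5.8 GHz ST clock, +7% iso-clock gain from the ES leak, and a guessed ~3% desktop-vs-server memory bonus); `project_score` and `platform_factor` are illustrative names, not anything from a real tool.

```python
def project_score(base_score, base_clock_ghz, target_clock_ghz,
                  ipc_gain, platform_factor=1.0):
    """Project an ST score assuming linear scaling with clock and IPC."""
    clock_ratio = target_clock_ghz / base_clock_ghz
    return base_score * clock_ratio * (1.0 + ipc_gain) * platform_factor

# 5950X baseline scaled to a hypothetical 5.8 GHz part with +7% IPC.
server_like = project_score(1650, 4.9, 5.8, 0.07)

# Same, plus the guessed ~3% bonus for faster desktop memory.
desktop_best = project_score(1650, 4.9, 5.8, 0.07, platform_factor=1.03)

print(round(server_like), round(desktop_best))
```

Swapping 5.8 for 5.5 in the call shows how sensitive the estimate is to the final boost clock.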
 
  • Like
Reactions: Kaluan

Abwx

Lifer
Apr 2, 2011
10,967
3,489
136
No, I meant Zen 3. Zen 3 parts scored 1151pts in R23, check the chart again. That puts it in line with 5950X, adjusted for clocks (~3% slower as it's server vs desktop).


Genoa ES part scored 1227pts @ 3.5Ghz(~3.46Ghz), which is 7% faster.

In that case there's your 7% IPC improvement, and possibly a tiny bit more, since the 5950X doesn't really hold the 5GHz mark in ST due to short frequency dips; that's why AMD officially stated 4.9.

If Zen 4 can solidly sustain 5.5GHz in ST, that's good for an 18% ST perf improvement in R23.
 

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
In that case there's your 7% IPC improvement, and possibly a tiny bit more, since the 5950X doesn't really hold the 5GHz mark in ST due to short frequency dips; that's why AMD officially stated 4.9.


I'm not sure if you are now agreeing or disagreeing with me :D. I thought I showed clearly how R23 ST results for Zen 4 are 7% better at the same clock Vs Zen 3, at least in the leaked Genoa ES benchmarks. There are issues in this leak such as low R15/R20 ST scores, but that could be due to low ST clock for Genoa parts (for whatever reason).
 
  • Like
Reactions: Kaluan

Abwx

Lifer
Apr 2, 2011
10,967
3,489
136
I'm not sure if you are now agreeing or disagreeing with me :D. I thought I showed clearly how R23 ST results for Zen 4 are 7% better at the same clock Vs Zen 3, at least in the leaked Genoa ES benchmarks. There are issues in this leak such as low R15/R20 ST scores, but that could be due to low ST clock for Genoa parts (for whatever reason).

I agree, but I wouldn't count on 5.8GHz. So far leaks point to 5.52GHz in ST, which will be enough to roughly match the competition in R23 ST, agree with the stated 8-10% IPC improvement, and largely exceed the 15% ST perf improvement.
 
  • Like
Reactions: Tlh97 and inf64

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
I agree, but I wouldn't count on 5.8GHz. So far leaks point to 5.52GHz in ST, which will be enough to roughly match the competition in R23 ST, agree with the stated 8-10% IPC improvement, and largely exceed the 15% ST perf improvement.
Regarding the maximum ST clocks for Raphael, I was referring to this : https://www.angstronomics.com/p/computex-beyond-the-coverage?s=w

"Regarding frequency targets, the game demo showing 5.55GHz maximum frequencies was also not with the final version. While Angstronomics is aware of an Ordering Part Number (OPN) that is fused for a 5.85 GHz Fmax, we will have to wait and see what the retail stepping fuses will be set at. "

~5.5Ghz was shown to be an easy clock for Raphael ES even in a gaming workload on 1+ threads, so higher than that is a reasonable assumption.
 
  • Like
Reactions: Kaluan

deasd

Senior member
Dec 31, 2013
520
763
136
Can anyone tell me how reliable Angstronomics is? I remember he said AMD aimed for a 7% IPC uplift for Zen4 a few days after Computex, and a few days later AMD officially said 8-10%. I wonder if '7% IPC' and '5.85 GHz Fmax' are conservative numbers that AMD intentionally told journalists.

And Cinebench is a strange application, to say the least. IIRC, in the review-site data I collected, results vary from version to version for some newish CPUs like Zen3 and Alder Lake.

Just for IPC comparison, IIRC:

R10: Alder Lake beat Zen3 by a double-digit %
R11.5: both tied, ADL won slightly but close to 0%
R15: both tied, no difference
R20: ADL dominating, just like R10
R23: Intel enlarged the lead (~2-3%) compared to R20, despite the fact that the rendering scene is the same

OTOH, when it comes to the Genoa leak, R23 has a 7% uplift while R20 and R15 show regressions. Despite the ES immaturity, I guess the varying CB results might be due to Cinebench itself and not the ES Genoa's fault.
 
  • Like
Reactions: Tlh97

Abwx

Lifer
Apr 2, 2011
10,967
3,489
136
Can anyone tell me how reliable Angstronomics is? I remember he said AMD aimed for a 7% IPC uplift for Zen4 a few days after Computex, and a few days later AMD officially said 8-10%. I wonder if '7% IPC' and '5.85 GHz Fmax' are conservative numbers that AMD intentionally told journalists.

And Cinebench is a strange application, to say the least. IIRC, in the review-site data I collected, results vary from version to version for some newish CPUs like Zen3 and Alder Lake.

Just for IPC comparison, IIRC:

R10: Alder Lake beat Zen3 by a double-digit %
R11.5: both tied, ADL won slightly but close to 0%
R15: both tied, no difference
R20: ADL dominating, just like R10
R23: Intel enlarged the lead (~2-3%) compared to R20, despite the fact that the rendering scene is the same

OTOH, when it comes to the Genoa leak, R23 has a 7% uplift while R20 and R15 show regressions. Despite the ES immaturity, I guess the varying CB results might be due to Cinebench itself and not the ES Genoa's fault.

Cinebench is a best-case scenario for Intel, for obvious reasons, starting with R11.5...

R10 likely uses x87 FP instructions.

R11.5 uses SSE2 and was designed using Intel's software suite, like Thread Profiler, and compiled with ICC.

R15 extended instructions up to SSE4.2 and uses Embree, which is an Intel-developed renderer.

R20 extends instructions up to AVX.

Dunno what has been done to R23, but likely it was recompiled to take advantage of the soon-to-be-released ADL uarch, as well as being directly ported for Apple's uarch.
 
  • Like
Reactions: Kaluan

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
Cinebench became popular after Zen 1 launched. I remember Intel telling the press that it's not a real-world test, but now that they have the advantage they are all over it :)
According to all these leaks, it's obvious that AMD will reach near parity with Alder/Raptor Lake in the R23 ST test, maybe trailing by a single-digit %. It will be interesting to see how far AMD can push all threads on the top 32T part; that will determine whether they get the overall performance crown vs Raptor Lake.

I think the top parts (no Vcache on AMD side) will be neck and neck, in almost everything - AMD having the perf./watt advantage of course. Once they launch the Vcache parts, AMD should win in gaming (and by a considerable margin, my guess is ~15+%).
 
  • Like
Reactions: Kaluan and Tlh97

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Regarding the maximum ST clocks for Raphael, I was referring to this : https://www.angstronomics.com/p/computex-beyond-the-coverage?s=w

"Regarding frequency targets, the game demo showing 5.55GHz maximum frequencies was also not with the final version. While Angstronomics is aware of an Ordering Part Number (OPN) that is fused for a 5.85 GHz Fmax, we will have to wait and see what the retail stepping fuses will be set at. "

~5.5Ghz was shown to be an easy clock for Raphael ES even in a gaming workload on 1+ threads, so higher than that is a reasonable assumption.

5.85 GHz Fmax means a 5.7-5.75 GHz max boost clock, FYI; the chip would never hit 5.85 GHz. Likely the 7950X has a 5.7 GHz boost and a base clock in the mid-4 GHz range.
 
  • Like
Reactions: Tlh97

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Cinebench became popular after Zen 1 launched. I remember Intel telling the press that it's not a real-world test, but now that they have the advantage they are all over it :)
According to all these leaks, it's obvious that AMD will reach near parity with Alder/Raptor Lake in the R23 ST test, maybe trailing by a single-digit %. It will be interesting to see how far AMD can push all threads on the top 32T part; that will determine whether they get the overall performance crown vs Raptor Lake.

I think the top parts (no Vcache on AMD side) will be neck and neck, in almost everything - AMD having the perf./watt advantage of course. Once they launch the Vcache parts, AMD should win in gaming (and by a considerable margin, my guess is ~15+%).

If rumors are true, we should have the v-cache parts soon. Hopefully they won’t have to reduce clocks this time around.
 
  • Like
Reactions: Tlh97 and inf64

inf64

Diamond Member
Mar 11, 2011
3,703
4,034
136
5.85 GHz Fmax means a 5.7-5.75 GHz max boost clock, FYI; the chip would never hit 5.85 GHz. Likely the 7950X has a 5.7 GHz boost and a base clock in the mid-4 GHz range.
Yeah, I'm aware of that - it's the same thing as fmax for 5950X. What I was trying to say is that 5.5Ghz for ST is a low number IMO, especially if they demonstrated multiple threads running at 5.5Ghz in a gaming workload. Whether they reach 5.7 or 5.8Ghz is anyone's guess, but the difference is really small (~1.5%).
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Yeah, I'm aware of that - it's the same thing as fmax for 5950X. What I was trying to say is that 5.5Ghz for ST is a low number IMO, especially if they demonstrated multiple threads running at 5.5Ghz in a gaming workload. Whether they reach 5.7 or 5.8Ghz is anyone's guess, but the difference is really small (~1.5%).

It is possible that the number may have even gone up by now. None of the chips AMD had floating around were final.