Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
821
1,457
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 
  • Like
Reactions: richardllewis_01

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
There are many unknowns in that implementation of Golden Cove.

Here is the list.

Larger L2:
Mesh Of Rings:
Quad compute tiles per CPU
Buggy Bios.

So far the QS samples are just not performing as they should. We will need full release products, and even then a mature BIOS.
The mesh (where did you get "of rings") has been a known thing for years, nor is the die to die impact likely to be high with EMIB. And more L2 would increase performance. On top of that, the most striking numbers were the low single core scores, which isolates most of the architectural changes anyway. And that's just for SPR. Why is Genoa (even artificially core count limited) losing to Milan?

No specs we know of justify those results for either CPU, so the only rational conclusion is that they're too fundamentally flawed to hold any value, and should thus be discarded.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
Cinebench R15 and R20 ST
Let me be very clear. It's Cinebench R23 or BUST....

Cinebench R20 and R15 show Golden Cove matching Zen 3 IPC (clock for clock), so you know they are just too old to take advantage of recently released CPUs.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
A number of leaks have all pointed to the same thing.
These are literally the first numbers we've gotten, and no leaker or any other source has claimed anything to match these scores.

The fact that you think they are wrong does not make my belief that they are correct wrong.
No, common sense is sufficient for that.

Think about this: if the 12900F's 8 P-cores use about 230 watts (maximum), then the 56 cores of SR, if clocked the same, would use about 1,610 watts. You know there is no way that's going to happen. So you need to ignore what you know about Golden Cove.
You can find dozens of reviews of Golden Cove at <5GHz frequencies and surprise, it consumes far less power. By your logic, the 12400 should be impossible.

But it's very illuminating how you claim to care about "facts" while telling us to ignore everything we know about the product in question.
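
As a rough illustration of the disagreement above: the roughly 1,600 W figure comes from scaling package power linearly with core count at unchanged clocks, while the rebuttal rests on power falling much faster than linearly as clocks drop. The Python sketch below contrasts the two. The cubic rule (dynamic power ~ f x V^2, with voltage assumed to track frequency) is a textbook approximation, not measured data, and the 5.0 GHz and 3.4 GHz clocks are assumed values for illustration only.

```python
def linear_core_scaling(package_watts, measured_cores, target_cores):
    """Scale power by core count only, ignoring clocks, voltage and uncore."""
    return package_watts / measured_cores * target_cores


def cubic_clock_scaling(watts, measured_ghz, target_ghz):
    """Rule of thumb: dynamic power ~ f * V^2, with V assumed proportional to f.

    This is a simplification; real chips have voltage floors and static power.
    """
    return watts * (target_ghz / measured_ghz) ** 3


# Figures from the quoted post: ~230 W for 8 P-cores, extrapolated to 56 cores.
# (The 230 W package figure also covers E-cores and uncore, so it overstates
# per-P-core power to begin with.)
naive_watts = linear_core_scaling(230, 8, 56)

# ASSUMED clocks, for illustration only: ~5.0 GHz desktop vs ~3.4 GHz server.
scaled_watts = cubic_clock_scaling(naive_watts, measured_ghz=5.0, target_ghz=3.4)

print(f"Linear extrapolation at desktop clocks: {naive_watts:.0f} W")
print(f"Same core count at assumed server clocks: {scaled_watts:.0f} W")
```

Under these assumptions the second figure comes out at roughly a third of the first, which is the gist of the "by your logic, the 12400 should be impossible" objection. It is not a prediction of actual SPR power draw.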
 
  • Like
Reactions: yuri69

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
Let me be very clear. It's Cinebench R23 or BUST....

Cinebench R20 and R15 show Golden Cove matching Zen 3 IPC (clock for clock), so you know they are just too old to take advantage of recently released CPUs.

CB R20 was released one year before R23 and the rendered scene is the same; basically there's no difference other than a recompile that helped Intel get 2% more perf vs AMD compared with R20...

You can check the ST scores in R15/20/23 here :


Edit: R23 also added Apple M1 support relative to R20.
 
  • Like
Reactions: lightmanek

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
The mesh (where did you get "of rings") has been a known thing for years, nor is the die to die impact likely to be high with EMIB. And more L2 would increase performance. On top of that, the most striking numbers were the low single core scores, which isolates most of the architectural changes anyway. And that's just for SPR. Why is Genoa (even artificially core count limited) losing to Milan?

No specs we know of justify those results for either CPU, so the only rational conclusion is that they're too fundamentally flawed to hold any value, and should thus be discarded.

The Mesh of Rings (yes, that is the exact term; source: AnandTech) has been known to have an impact on performance (sometimes even a profound effect, which is why HCC Xeons lost to desktop CPUs in that department). L2 size has no effect on Cinebench R23 performance. And yes, the low ST score in Cinebench R23 is very worrisome. I mean, low 1000s for a CPU that boosts to 3.4 GHz?

But Genoa (early sample) with super-beta Windows Server 2025 manages to get an 8% boost in single thread over Zen 3?

Cinebench R23 only supports 256 threads, so I have asked YuuKi_AnS to re-run the test with SMT off. He is doing that, and also testing a pair of Xeon SPR 8490 processors, which have 60 cores each, for a total of 120 cores.
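
To see why SMT matters for this re-run, here is a small Python sketch of the thread-count arithmetic. The 256-thread limit is taken from the post as stated and not independently verified, and the 96 cores per Genoa socket is an assumption based on figures quoted elsewhere in this thread; the helper name is made up for illustration.

```python
# Thread limit as claimed in the post; treat it as an assumption.
CINEBENCH_R23_THREAD_LIMIT = 256


def logical_threads(sockets, cores_per_socket, threads_per_core):
    """Total logical processors the OS would expose to the benchmark."""
    return sockets * cores_per_socket * threads_per_core


# (sockets, cores per socket, threads per core)
# Genoa at 96 cores/socket is ASSUMED; SPR at 60 cores/socket is from the post.
configs = {
    "2S Genoa, SMT on": (2, 96, 2),
    "2S Genoa, SMT off": (2, 96, 1),
    "2S SPR 8490, HT on": (2, 60, 2),
    "2S SPR 8490, HT off": (2, 60, 1),
}

for name, cfg in configs.items():
    threads = logical_threads(*cfg)
    verdict = "fits" if threads <= CINEBENCH_R23_THREAD_LIMIT else "exceeds limit"
    print(f"{name}: {threads} threads ({verdict})")
```

Under these assumptions only the dual-socket Genoa system with SMT enabled goes over the limit, which would explain asking for an SMT-off run on that platform in particular.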
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
These are literally the first numbers we've gotten, and no leaker or any other source has claimed anything to match these scores.
First numbers for QS samples. But YuuKi_AnS has been providing CBR23 numbers for early-sample SPR Xeons for quite a while now, and it's the same performance trend: lower than expected for known Golden Cove cores. But again, there were too many variables to take into account: early beta BIOS, incomplete Windows Server support, low clocks, Mesh of Rings, 4 tiles per CPU...

Now we have at least one thing less to worry about (it's a QS E3 sample, the closest we will get to production release samples). Not sure about full Windows Server support, but other variables are still at play.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
CB R20 was released one year before R23 and the rendered scene is the same; basically there's no difference other than a recompile that helped Intel get 2% more perf vs AMD compared with R20...

CBR20 shows Zen3 and Golden Cove to have the same IPC(same performance at ISO Power).

Edit. it was CBR15, Can we just let go of CBR15 and CBR20? I mean we have CBR23 right now. Which takes advantage of newer CPUs much better

1658248282000.png
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
The Mesh of Rings (yes, that is the exact term; source: AnandTech) has been known to have an impact on performance (sometimes even a profound effect, which is why HCC Xeons lost to desktop CPUs in that department).
Not anything even remotely close to the results seen here, and especially when you consider that it's being compared to other server chips.

L2 size has no effect on Cinebench R23 performance. And yes, the low ST score in Cinebench R23 is very worrisome. I mean, low 1000s for a CPU that boosts to 3.4 GHz?
So the logical conclusion is that it's not actually boosting to that speed, or is otherwise incapacitated in a way that doesn't reflect realistic performance expectations. Why reach so hard for an alternative explanation?

But Genoa (early sample) with super-beta Windows Server 2025 manages to get an 8% boost in single thread over Zen 3?
...if you ignore all the tests it performs worse or on par in. You don't seriously believe that iso-core count, Genoa and Milan are the same, right?
 

pakotlar

Senior member
Aug 22, 2003
731
187
116
These QS SPR benchmarks are clearly underestimates of production performance, unless we seriously believe that SPR will perform +/- Ice Lake. It won’t. On the other hand, the Genoa benchies have similar issues.

I fully expect Sapphire Rapids to perform extremely well, and to clobber Milan. Maybe it will even perform better per core than Genoa, though I’m not optimistic that it will do so on average. I bet that 60-core SPR will be more or less as good as 64-core Genoa. That is already a big accomplishment for Intel, given their many ongoing challenges.

However no one should expect SPR 60 core to beat or match Genoa 96 core, except in programs that can’t make use of 96 cores/192 threads. That is just not going to happen.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
Not anything even remotely close to the results seen here, and especially when you consider that it's being compared to other server chips.
I am in contact with YuuKi_AnS. I will request a CBR23 run on a single compute tile (a test with no more than 14 cores); I want to check if there is an issue with those compute tiles. Let's hope we get to the bottom of this before SPR is released.

The thing is that those are QS E3 samples.... If they are this buggy this late in the game.. I just can't see Intel releasing them this year



...if you ignore all the tests it performs worse or on par in. You don't seriously believe that iso-core count, Genoa and Milan are the same, right?
I am focusing on CBR23 at this stage(I don't care for CBR15-20) At ISO speed(All core boost) and ISO Cores, Genoa has about 13% more performance than Milan in MT.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
We are very close to launch for both products, it's just 2-3 months. These ES tests are super interesting but don't tell us with a high degree of confidence how the end products will perform.
 
  • Like
Reactions: ryan20fun

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
We are very close to launch for both products, it's just 2-3 months. These ES tests are super interesting but don't tell us with a high degree of confidence how the end products will perform.
More tests are on their way, expect more results(hopefully with SMT OFF) by Sunday or Saturday.
 
  • Like
Reactions: lightmanek

deasd

Senior member
Dec 31, 2013
603
1,033
136
I am in contact with YuuKi_AnS. I will request a CBR23 run on a single compute tile (a test with no more than 14 cores); I want to check if there is an issue with those compute tiles. Let's hope we get to the bottom of this before SPR is released.

The thing is that those are QS E3 samples.... If they are this buggy this late in the game.. I just can't see Intel releasing them this year

As I said before, this leaker's reputation is doubtful. He has leaked so many things before, and even sent ES chips to others..... This time I don't think he has full access to the newest, fully functional toys because of his unlawful activity. I believe he only has some crappy ES in hand. Just don't read too much into it.

Stranger still, he couldn't even confirm the core count of the EPYC 9664 he has in hand.....



edit: oh yeah.... he clarified that the CPUs he used for the AIDA bandwidth test are 2S Genoa 9664.... core count/frequency unknown
 

FangBLade

Senior member
Apr 13, 2022
203
399
106
Stop it fanboy amd aint giving you stock. Lol

You were already told not to use the term 'fanboy' and yet here you are using it again.
Maybe you can't comprehend what you're told?

Iron Woode

Super Moderator
If you can't stand criticism of Intel, delete your account. This is a forum; we can love or hate Intel, it is our right. I've seen you in other forums, and you are no different from any hardcore fanboy of any company. Intel didn't live up to expectations, AMD will most likely crush it, deal with it.

Did you also fail to understand the OP's error?
Don't use the term 'fanboy' here again.

Iron Woode

Super Moderator
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,516
1,357
136
If you can't stand criticism of Intel, delete your account. This is a forum; we can love or hate Intel, it is our right. I've seen you in other forums, and you are no different from any hardcore fanboy of any company. Intel didn't live up to expectations, AMD will most likely crush it, deal with it.
You've been here since April 13th 2022. I think you need to be here a few years to start throwing daggers.
 
  • Like
Reactions: Henry swagger

DrMrLordX

Lifer
Apr 27, 2000
22,902
12,971
136
Edit. it was CBR15, Can we just let go of CBR15 and CBR20?

R15 shows non-AVX FP performance for the core. It's still interesting.

These QS SPR benchmarks are clearly underestimates of production performance, unless we seriously believe that SPR will perform +/- Ice Lake. It won’t.

It shouldn't be that slow, but it does lead us to two questions: Why is it so slow at E3 stepping, and how many more steppings/how much longer will it need to be a launch-worthy product? Genoa seems to be doing quite well.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
As I said before, this leaker's reputation is doubtful. He has leaked so many things before, and even sent ES chips to others..... This time I don't think he has full access to the newest, fully functional toys because of his unlawful activity.

Don't kill the messenger. He has been providing info on Alder Lake ES, Sapphire Rapids ES and now Genoa, with buggy BIOS and unsupported OS. I am thankful that he is able to do it and will be very happy if he completes a CB R23 MT run with SMT off on Genoa.
 

eek2121

Diamond Member
Aug 2, 2005
3,410
5,049
136
These are literally the first numbers we've gotten, and no leaker or any other source has claimed anything to match these scores.


No, common sense is sufficient for that.


You can find dozens of reviews of Golden Cove at <5GHz frequencies and surprise, it consumes far less power. By your logic, the 12400 should be impossible.

But it's very illuminating how you claim to care about "facts" while telling us to ignore everything we know about the product in question.

Golden Cove at lower frequencies is indeed efficient. However, let us not forget that the only reason Intel was able to win any benchmarks in the first place (on the desktop) was that they pushed the chip well past that point. Based on pretty much every benchmark I've seen, a Golden Cove core is around 3-5% faster than a Zen 3 core. Zen 4 has a huge process advantage over Golden Cove, and it was already an efficient chip to begin with.

Anyone thinking that SPR is going to be faster than Genoa needs to go back and look at the raw numbers. Genoa is basically refined Milan. SPR would have barely beaten Milan at similar core counts.

...if you ignore all the tests it performs worse or on par in. You don't seriously believe that iso-core count, Genoa and Milan are the same, right?

No? Genoa has larger L2, AVX-512 instructions, and is on a completely different node from Milan.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
Based on pretty much every benchmark I've seen, a Golden Cove core is around 3-5% faster than a Zen 3 core.
It's certainly a larger gap than that on average. But regardless, the point is the same. The leak posits that Golden Cove is far weaker, but we know how it performs.
Anyone thinking that SPR is going to be faster than Genoa needs to go back and look at the raw numbers.
Who here has claimed otherwise? Certainly not me.
No? Genoa has larger L2, AVX-512 instructions, and is on a completely different node from Milan.
Well the leak shows Genoa behind in most tests and barely winning in the ones it does. So the rational conclusion is that this data is garbage and should be ignored. Not that only the Intel numbers are correct.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,072
3,897
136
It's certainly a larger gap than that on average. But regardless, the point is the same. The leak posits that Golden Cove is far weaker, but we know how it performs.
You have factored in the slow ass L3 in SPR compared to desktop ring when making that assessment right ?

actually has anyone done GC testing with the ring/ uncore clocked low to see its impacts ?
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,106
136
You have factored in the slow ass L3 in SPR compared to desktop ring when making that assessment right ?
There is no reasonable set of conditions where Golden Cove is massively slower than even Skylake in single core. Trying to contort reality to fit a clearly flawed datapoint is just not going to produce sane results. Likewise with the Genoa scores.
 
  • Like
Reactions: Zucker2k

HurleyBird

Platinum Member
Apr 22, 2003
2,811
1,544
136
Well the leak shows Genoa behind in most tests and barely winning in the ones it does. So the rational conclusion is that this data is garbage and should be ignored. Not that only the Intel numbers are correct.

That the Genoa numbers are garbage does not make the SPR numbers any more or less likely to be correct. Totally orthogonal. Rationally we can say the Genoa numbers are garbage and should be ignored, while the SPR numbers, even if they don't look as egregious, should still obviously be taken with a big pinch of salt.
 
  • Like
Reactions: Zucker2k

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,247
16,107
136
There is no reasonable set of conditions where Golden Cove is massively slower than even Skylake in single core. Trying to contort reality to fit a clearly flawed datapoint is just not going to produce sane results. Likewise with the Genoa scores.
First, this is a Zen4/Genoa thread. Second, you continue to equate Golden Cove on desktop with SR, which is a totally wrong assessment.
 
  • Like
Reactions: Grazick and Drazick

jamescox

Senior member
Nov 11, 2009
644
1,105
136
AVX is sufficiently distinct from GPU SIMD implementations that I really doubt there's any meaningful leveraging between the two. And if anything, AMD's CPU team has helped the GPU team.
At a high level, yes they are different, but the underlying floating point execution units are plausibly the same. I don’t know why they wouldn’t share some implementation. They are likely to be built in a similar process, so even a highly optimized layout might be shared. They have to support a lot of different sizes and formats, so even the underlying FP unit would probably be quite large.
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
Yeah that memory latency is shockingly good. I honestly expected DRAM latency to increase because of DDR5.

Also, you can infer a range for the clocks from that image. L1 latency is 4 cycles and reported at .1ns granularity, so 1.1 means clocks between 1/(1.15ns/4) and 1/(1.05ns/4) = 3.48GHz to 3.81GHz.

L2 would then be 15 or 16 cycles, which is a reasonable increase from the 12 of Zen 3 for being twice the size.

(edit) actually if you take the lowest clock and assume maximum rounding, L2 could just barely be 14 cycles. I don't think that's likely, though. If the cpu is running at 3.7GHz, the L2 latency is almost certainly 16 cycles.
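
The clock inference above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and it assumes the tool rounds to the nearest 0.1 ns, as the post does. The second part inverts the question and shows which latencies 15 or 16 L2 cycles would produce across that clock range; no L2 reading from the screenshot is assumed.

```python
def clock_range_ghz(reported_ns, cycles, granularity_ns=0.1):
    """Clock bounds implied by a latency that was rounded to `granularity_ns`.

    A reading of 1.1 ns at 0.1 ns granularity means the true latency lies
    somewhere between 1.05 ns and 1.15 ns.
    """
    half_step = granularity_ns / 2
    lowest = cycles / (reported_ns + half_step)   # slowest clock: longest latency
    highest = cycles / (reported_ns - half_step)  # fastest clock: shortest latency
    return lowest, highest


def latency_range_ns(cycles, clock_low_ghz, clock_high_ghz):
    """True latency span for a given cycle count across a clock range."""
    return cycles / clock_high_ghz, cycles / clock_low_ghz


# L1: 4 cycles, reported as 1.1 ns (figures from the post).
low_ghz, high_ghz = clock_range_ghz(1.1, 4)
print(f"Clock range: {low_ghz:.2f} GHz to {high_ghz:.2f} GHz")  # 3.48 to 3.81

# Candidate L2 cycle counts from the post, mapped back to nanoseconds.
for l2_cycles in (14, 15, 16):
    shortest, longest = latency_range_ns(l2_cycles, low_ghz, high_ghz)
    print(f"{l2_cycles} cycles: {shortest:.2f} ns to {longest:.2f} ns")
```

Comparing the printed spans against the L2 latency shown in the screenshot tells you which cycle counts are still possible once rounding is taken into account.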
I always thought that the first implementation of the IO die was likely not very optimal. I don’t quite remember how it was laid out internally without tracking down the diagram. If they switched from 14/12 nm Global Foundries to a 6 nm TSMC die, then I wouldn’t be surprised that they did a lot of optimizations for essentially the second generation implementation. They would have needed much higher clocks for DDR5 support anyway. TSMC 6 nm should allow for much higher clocks and higher clocks likely means lower latencies, even without design optimizations that are likely present.