Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,536
14,488
136
Oh, and for similar core counts: 64 Genoa cores get 615 and 60 SR cores get 495. So at the closest we can bench, Genoa is 25% faster with 7% more cores.
View attachment 86497
In addition, SR's power consumption was far greater than Genoa's. Since some of these don't show it, I will use the Phoronix test: 386 for Genoa vs 586 for SR. Not even close. So Genoa wins overall by 25% while using about a third less power (SR draws roughly 50% more).
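The ratios behind those figures can be checked with a few lines of Python. This is only a rough sketch: the variable names are made up for illustration, the power readings are assumed to be in watts, and the power numbers come from a different test than the scores, so the perf-per-watt ratio is indicative at best.

```python
# Figures quoted in the post; power units are assumed to be watts.
genoa = {"cores": 64, "score": 615, "power": 386}
sr = {"cores": 60, "score": 495, "power": 586}

# Relative advantages of Genoa over SR.
perf_gain = genoa["score"] / sr["score"] - 1       # score advantage
core_gain = genoa["cores"] / sr["cores"] - 1       # core-count advantage
power_saving = 1 - genoa["power"] / sr["power"]    # how much less Genoa draws
sr_extra_power = sr["power"] / genoa["power"] - 1  # how much more SR draws

# Score per watt, mixing numbers from two different tests (indicative only).
ppw_ratio = (genoa["score"] / genoa["power"]) / (sr["score"] / sr["power"])

print(f"perf +{perf_gain:.1%}, cores +{core_gain:.1%}")
print(f"Genoa draws {power_saving:.1%} less, SR draws {sr_extra_power:.1%} more")
print(f"perf/W ratio: {ppw_ratio:.2f}x")
```

Note that "X% less" and "X% more" are not symmetric: the same pair of readings gives about a third less for Genoa and about half more for SR.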
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,348
2,843
106
It's a freq/voltage curve, not perf/W. The dense core has lower static and dynamic power, so it's more efficient at a given voltage. The Zen4 & Zen4c efficiency switch point is somewhere around 3GHz, and a quick calculation suggests that its combined dynamic/static capacitance at 3GHz is about 2/3 of the regular core's.
I got ~29% lower capacitance, if 3GHz is the turning point.
My question is: where did you get that it is at 3GHz?

3GHz is rather low, if you ask me.
The 8C16T 7840HS can hold 4GHz at 54W in Prime95. Notebookcheck.net
The 8C16T Z1 Extreme can hold 3.5GHz at 31-33W in CB R15; sadly there was no Prime95 result. Notebookcheck.net

What I want to know is whether a 4+8 Strix Point limited to 45W will clock below or above 3GHz in CB R23 or Prime95.
If it can clock above, then those dense cores will be less efficient than the normal ones, so Strix Point will not be ideal at a >45W TDP.
On the other hand, at low TDP it could be more efficient in highly parallel tasks.
 

randomhero

Member
Apr 28, 2020
180
247
86
@adroc_thurston When you said that Venice will be silly expensive, did you mean because of fabbing cost or packaging cost? Packaging should not be that expensive in 2-3 years, unless production capacity is still strained.

Sorry guys for off topic.
 

randomhero

Member
Apr 28, 2020
180
247
86
I got ~29% lower capacitance, if 3GHz is the turning point.
My question is: where did you get that it is at 3GHz?

3GHz is rather low, if you ask me.
The 8C16T 7840HS can hold 4GHz at 54W in Prime95. Notebookcheck.net
The 8C16T Z1 Extreme can hold 3.5GHz at 31-33W in CB R15; sadly there was no Prime95 result. Notebookcheck.net

What I want to know is whether a 4+8 Strix Point limited to 45W will clock below or above 3GHz in CB R23 or Prime95.
If it can clock above, then those dense cores will be less efficient than the normal ones, so Strix Point will not be ideal at a >45W TDP.
On the other hand, at low TDP it could be more efficient in highly parallel tasks.
Cores can be clocked independently, so they need to make a good driver so that the OS knows where to put priority tasks. AMD already has good on-chip PWM.
I don't think they will push the dense cores above the efficiency point.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,348
2,843
106
Cores can be clocked independently, so they need to make a good driver so that the OS knows where to put priority tasks. AMD already has good on-chip PWM.
I don't think they will push the dense cores above the efficiency point.
I am talking about fully loaded cores, for example during CB R23.
Of course, during lightly threaded loads 1-4 threads, standard cores should have priority.

If OEMs set TDP at 80W for example, which they do with some Phoenix gaming laptops, then there is more than enough juice to clock past 3.6GHz.
For example: 12C24T 7845HX manages 4533MHz with 118W package power during Prime95 and that one uses 5nm process and is chiplet based.

But maybe Zen5C is better optimized and the turning point is higher than 3GHz.
 

randomhero

Member
Apr 28, 2020
180
247
86
I am talking about fully loaded cores, for example during CB R23.
Of course, during lightly threaded loads 1-4 threads, standard cores should have priority.

If OEMs set TDP at 80W for example, which they do with some Phoenix gaming laptops, then there is more than enough juice to clock past 3.6GHz.
For example: 12C24T 7845HX manages 4533MHz with 118W package power during Prime95 and that one uses 5nm process and is chiplet based.

But maybe Zen5C is better optimized and the turning point is higher than 3GHz.
I would still not be worried. 4 cores can suck lots of juice at high clocks. They can use more than half of allowed PPW @88 W.
 

Abwx

Lifer
Apr 2, 2011
10,926
3,414
136
I got ~29% lower capacitance, if 3GHz is the turning point.
My question is: where did you get that it is at 3GHz?
With voltage 20% higher and 2/3 the capacitance, this leads to 0.66 x (1.2)^2 = 0.95x the power, an insignificant power saving of 5%. If capacitance is 29% lower, then there's no perf/watt advantage at all: the dense core will use 2% more power.

So it's obvious that those numbers are either truncated or measured before the integrated per-core voltage regulation.
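For anyone who wants to play with the numbers, the arithmetic above follows the usual dynamic power approximation P ~ C * V^2 * f. Below is a minimal sketch, assuming equal frequency for both cores and ignoring static power; the function name is just illustrative.

```python
def relative_dynamic_power(cap_ratio, voltage_ratio, freq_ratio=1.0):
    """Dense-core power relative to the regular core, using P ~ C * V^2 * f."""
    return cap_ratio * voltage_ratio ** 2 * freq_ratio

# Dense core needing 20% more voltage at the same frequency.
two_thirds_case = relative_dynamic_power(0.66, 1.2)  # capacitance at ~2/3
lower_29_case = relative_dynamic_power(0.71, 1.2)    # capacitance 29% lower

# Capacitance ratio at which the dense core merely breaks even at +20% voltage.
break_even_cap = 1 / 1.2 ** 2

print(f"2/3 capacitance: {two_thirds_case:.2f}x power")
print(f"29% lower capacitance: {lower_29_case:.2f}x power")
print(f"break-even capacitance ratio: {break_even_cap:.3f}")
```

Under these assumptions the break-even point sits at roughly 0.69x capacitance, which is why 0.66x gives only a small saving and 0.71x gives none.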
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,348
2,843
106
I would still not be worried. 4 cores can suck lots of juice at high clocks. They can use more than half of allowed PPW @88 W.
Phoenix can keep 4350MHz at 82W during Prime95. 4 cores should manage similar clock at 45W, but is this approach better?
Let's say that with an 80W power budget you can have 4 Zen5 cores at 4.3GHz and 8 Zen5C cores at 3GHz, versus all 12 cores at 3.6GHz. The latter would provide slightly higher performance.
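A quick way to compare the two configurations is to sum cores times clock. This is a crude sketch: it assumes Zen5 and Zen5C have identical IPC and that throughput scales linearly with frequency and core count, neither of which is confirmed here, and the clocks are the hypothetical ones from the example.

```python
def aggregate_ghz(groups):
    """Sum of cores * clock over (core_count, ghz) groups, as a throughput proxy."""
    return sum(count * ghz for count, ghz in groups)

# Hypothetical configurations from the example above, same 80W budget.
hybrid = aggregate_ghz([(4, 4.3), (8, 3.0)])  # 4 Zen5 + 8 Zen5C
uniform = aggregate_ghz([(12, 3.6)])          # all 12 cores at one clock

advantage = uniform / hybrid - 1

print(f"hybrid: {hybrid:.1f} core-GHz, uniform: {uniform:.1f} core-GHz")
print(f"uniform advantage: {advantage:.1%}")
```

With these inputs the uniform 3.6GHz case comes out a little under 5% ahead, so the gap is small enough that IPC or scheduling differences could easily change the picture.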
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,742
233
106
I just noticed that in the slide below, for Nirvana it says "Low power core option". Notice that this is different naming compared to for Persephone where it says "Dense option".

I'm wondering whether they mean that there will be two different core types, i.e.:

"Dense option" == C variant, e.g. Zen 4C.
"Low power core option" == E variant (new core, similar to Intel E-cores for Efficiency), so we'll get Zen 5E?

Any thoughts on this?

ucDDBPGhw9XAazV5JJweCS-970-80.png
 

Timorous

Golden Member
Oct 27, 2008
1,595
2,714
136
Again, that's the exact scenario that will disproportionately benefit the architecture that gains a massive non-SMT uplift at the expense of SMT yield, except that's not what the leak is showing.

At this point, I think you might be trolling me because based on your posting history (this is a compliment) I don't see how this could possibly be going over your head for so long now (this isn't). You aren't engaging. You're just repeating yourself despite the fact that the thing you're repeating goes against your claim.

One more time: If the non-SMT threads (which are assigned first) have a massive uplift while SMT threads (which are assigned later) actually regress, and you're talking about an application with a semi-hard physical limit to thread scaling due to additional threads not being able to perform any meaningful work after a point, when comparing two CPUs that go past that point we're talking about a task that disproportionately benefits the CPU that relies less on SMT.

Therefore if the uplift of 256T Z5 vs 256 Z4 in Cinebench is only 15%, then you should expect a much lower uplift in consumer parts. And that just doesn't sound realistic.

Put some numbers to it.

CPU A is 16c 32t and has a 20% SMT uplift.
CPU B is 16c 32t and has a 10% SMT uplift.

If we first index each CPU to itself in a 24t workload, then CPU A and B would each get 100 with SMT off. With SMT on, CPU A gets 110 and CPU B gets 105, since only 8 of the 16 cores run a second thread, so only half of the SMT uplift applies.

Relative to A, CPU B has a 30% IPC increase, so if CPU A gets 100 with SMT off then CPU B would be at 130.

With SMT on, that would change to 110 (100*1.1) for CPU A and 136.5 (130*1.05) for CPU B.

130/100 is obviously a 30% uplift.

136.5/110 is a 24.1% uplift.

So in a scenario where the relative SMT drop-off is greater than the IPC gain, and you load all the real cores plus some SMT threads, the nT perf uplift can drop relative to the 1T perf uplift.
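The same numbers can be reproduced with a small model. This is a simplified sketch, not a claim about how Cinebench actually scales: it assumes every thread beyond the physical core count lands on an SMT sibling and that each loaded sibling adds the full SMT uplift for that one core. CPU A and CPU B are the hypothetical parts from the example.

```python
def nt_score(base, smt_uplift, cores, threads):
    """Score with `threads` software threads on `cores` physical cores.

    `base` is the score with every core loaded and SMT unused.
    Assumes threads >= cores; extra threads land on SMT siblings.
    """
    smt_loaded = max(0, min(threads, 2 * cores) - cores)
    return base * (1 + smt_uplift * smt_loaded / cores)

# Hypothetical CPUs: B has 30% higher IPC but half the SMT yield.
a_24t = nt_score(base=100, smt_uplift=0.20, cores=16, threads=24)
b_24t = nt_score(base=130, smt_uplift=0.10, cores=16, threads=24)

uplift_smt_off = 130 / 100 - 1
uplift_24t = b_24t / a_24t - 1

print(f"A: {a_24t:.1f}, B: {b_24t:.1f}")
print(f"uplift SMT off: {uplift_smt_off:.1%}, uplift at 24t: {uplift_24t:.1%}")
```

Raising `threads` to 32 in both calls widens the gap further, since the full SMT uplift then applies to every core.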
 

naukkis

Senior member
Jun 5, 2002
705
576
136
We have straight up power curves too, finally decided to stop being lazy and just search it up lol
View attachment 86491
The efficiency switch point here seems to be about 2.5GHz. That is a bit lower than with the server C-only CCD, probably because the dense and regular cores share the ring and power delivery, which handicaps the C part a little. It still achieves a 20% power saving at ~5W SoC TDP levels, which is no small achievement.

The power curve is similar to other hybrid designs: 1T workloads prefer prime cores over efficiency ones, since CPU core power is such a small part of total SoC power. The efficiency switch point shifts to a higher frequency in MT workloads, when the priority core's voltage won't drive the whole CPU ring voltage up.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,599
5,765
136
  • "2 basic block fetch" --> Does this mean 2x fetch and decode blocks akin to Tremont?
Maybe it is not; at least the diagram does not look like that.
Does anybody have a clue what this is?

"Dense option" == C variant, e.g. Zen 4C.
"Low power core option" == E variant (new core, similar to Intel E-cores for Efficiency), so we'll get Zen 5E?

Any thoughts on this?
I would say the next evolution of the C cores.
The current C cores seem like a reaction to a competitor or opportunistic development. But it is an adaptation of the standard core only, save shaving of L3.
With such a fat Zen 5 core, it makes sense now to use a more efficient and less wasteful design and purpose built architecture for the C variants.

For cloud it would make sense to use a 256b AVX pipe instead of a full-blown 512b pipe, for instance. The physical design should then also target the density and frequency sweet spots, not just port a high-frequency physical implementation to a dense library. Then they could tailor the OOO structures to a certain throughput instead of going for the last drop of ST perf they can squeeze from the architecture.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,759
3,115
136
Maybe it is not; at least the diagram does not look like that.
Does anybody have a clue what this is?


I would say the next evolution of the C cores.
The current C cores seem like a reaction to a competitor or opportunistic development. But it is an adaptation of the standard core only, save shaving of L3.
With such a fat Zen 5 core, it makes sense now to use a more efficient and less wasteful design and purpose built architecture for the C variants.

For cloud it would make sense to use a 256b AVX pipe instead of a full-blown 512b pipe, for instance. The physical design should then also target the density and frequency sweet spots, not just port a high-frequency physical implementation to a dense library. Then they could tailor the OOO structures to a certain throughput instead of going for the last drop of ST perf they can squeeze from the architecture.
Cloud might very well still want native 512 bit if they have the need from a crypto perspective.

I honestly don't know....
 
Mar 11, 2004
23,056
5,519
146
Maybe it is not; at least the diagram does not look like that.
Does anybody have a clue what this is?


I would say the next evolution of the C cores.
The current C cores seem like a reaction to a competitor or opportunistic development. But it is an adaptation of the standard core only, save shaving of L3.
With such a fat Zen 5 core, it makes sense now to use a more efficient and less wasteful design and purpose built architecture for the C variants.

For cloud it would make sense to use a 256b AVX pipe instead of a full-blown 512b pipe, for instance. The physical design should then also target the density and frequency sweet spots, not just port a high-frequency physical implementation to a dense library. Then they could tailor the OOO structures to a certain throughput instead of going for the last drop of ST perf they can squeeze from the architecture.

I would guess the C cores are a development for enterprise, and a hedge against possibly having to abandon multi-threading due to the security issues (they didn't give up multi-threading because it wasn't needed yet, but they did want to prove they can increase core density if required). I feel like Intel's E-cores are basically the same, but Intel likely already had something like them in development thanks to Atom.

I could see it being opportunistic for AMD in that they were looking at separate cache chips giving them the ability to reduce cache per core on the compute chiplet but being able to add it back via cache chiplet.

I feel like the E cores are in response to Intel (who is getting aggressive about core counts, and that seems to be fairly effective; how much of that came from Microsoft I wonder though, as they put in work to get Intel's scheduler to work with Windows 11 really quickly) but also likely customers asking for cheaper smaller more efficient cores for potential mobile use (Microsoft both for consoles but also to put pressure on ARM designs, being able to stick with x86 would save Microsoft a lot of dev work and ARM keeps disappointing as far as Windows platform potential goes).

I think I've posted this before, but I could see AMD changing cache to where it makes sense. They go for modest cache in many products, but some like enterprise that can afford big caches can pay for them. And other markets like gaming, they could change things up to basically combine cache for both CPU and GPU, as part of the overall memory system.

Perhaps the C cores are also meant for integration elsewhere (like in a GPU/APU), where they would serve in a sort of translation/director role to make AMD GPUs more programmable and able to compete with CUDA.
 

nagus

Junior Member
Aug 15, 2020
3
1
46
Funny, nobody is talking about timing. The Zen 5 box ends right before 2024, the same way the Zen 3 box ends right before 2021. That could mean Zen 5 launches around the same time of year as Zen 3 did: November.
 

adroc_thurston

Platinum Member
Jul 2, 2023
2,040
2,613
96
Jul 27, 2020
16,079
10,122
106
They don't need to, ARL-S is a nonfactor.
We'll see. I just have a hard time believing that Intel can get totally kneecapped by Zen 5. They countered Zen 3 with Alder Lake and Zen 4 with Raptor Lake. While not great responses, they at least kept Intel from being thoroughly embarrassed. The 13600K is still a nice gaming CPU at its price point. The 15600K may not beat the Zen 5 equivalent, but it should still sell in good numbers.
 

adroc_thurston

Platinum Member
Jul 2, 2023
2,040
2,613
96

Ajay

Lifer
Jan 8, 2001
15,388
7,824
136
In the US, maybe not that well. But the rest of the world doesn't even know that AMD exists (or rather, the shopkeepers prefer to stick with Intel). I see that the 7800X3D is selling remarkably well and outselling it is...Zen 3!
Desktop AMD sells well in most of Europe. Just maybe not where you live.