Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Page 988 · AnandTech community forums

Josh128

Golden Member
Oct 14, 2022
They will probably do both, consumer and Epyc 4005 models.
Hmm, now that you mention it, nothing about this leak says Ryzen... seeing as AMD likes to disappoint the enthusiasts and all, I bet the dual-X3D chip is merely that: an AM5-compatible EPYC SKU. Probably not going to be for consumers at all.
 

gdansk

Diamond Member
Feb 8, 2011
Hmm, now that you mention it... seeing as AMD likes to disappoint the enthusiasts and all, I bet the dual-X3D chip is merely that: an AM5-compatible EPYC SKU. Probably not going to be for consumers at all.
Doesn't matter. It's AM5. It would be nice to launch it as an Epyc to avoid gamers thinking they should buy them.
 
Jul 27, 2020
Could the additional 30W be to feed the extra V-cache die?

I'm hoping it's a newer, more refined V-cache die with slightly more bandwidth.
 

StefanR5R

Elite Member
Dec 10, 2016
The most important chart from there, in my opinion, and the corresponding 285K chart: [...]
For those who haven't read the article yet: as far as I recall, it contains some good discussion of how Zen 5 and Lion Cove have similar yet different bottlenecks in game workloads, puts game-workload traces in contrast with SPEC traces, and has some IMO enlightening bits about games versus inter-CCX traffic. Finally somebody who, rather than merely speculating about the latter effects, did some actual performance counter based monitoring.
 

ToTTenTranz

Senior member
Feb 4, 2021
That 4% advantage example is a super dubious claim for the benefits of X3D in transcoding. That's margin-of-error stuff. I'd bet adding a second V-cache die would not add another 4%, if anything at all. Certainly nothing worth the extra $100-$200 (15-30%) that they will surely be asking.


What about putting 2x V-Cache dies underneath one of the CCDs? Can't they stack?
Give one CCD access to a whopping 160MB of cache.
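The 160MB figure follows from the known Zen 5 cache sizes (32MB of on-die L3 per CCD, 64MB per stacked V-Cache die); a quick sanity check of the hypothetical dual-stack, which is speculation and not a confirmed product:

```python
# L3 capacity visible to one CCD with N stacked V-Cache dies.
# Known figures: 32 MB on-die L3 per Zen 5 CCD, 64 MB per V-Cache die.
ON_DIE_L3_MB = 32
VCACHE_DIE_MB = 64

def ccd_l3_capacity(stacked_dies: int) -> int:
    """Total L3 (MB) one CCD would see with the given number of stacked dies."""
    return ON_DIE_L3_MB + stacked_dies * VCACHE_DIE_MB

print(ccd_l3_capacity(1))  # 96  -- today's X3D cache CCD
print(ccd_l3_capacity(2))  # 160 -- the hypothetical dual-stack CCD
```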
 

MS_AT

Senior member
Jul 15, 2024
IMO enlightening bits about games versus inter-CCX traffic. Finally somebody who, rather than merely speculating about the latter effects, did some actual performance counter based monitoring.
Advocates of dual-X3D chips might not like the conclusions. On the other hand, I wonder how the author handled the internal game settings for SMT, as the game by default, in theory, should use one worker per core on 2-CCD chips. It should now be possible to modify that in the options, but I never really tried to see what those settings are actually doing. I still wonder if anything could interfere with manually setting the affinity, plus whatever the special chipset driver is doing. Especially since the very last plot suggests that, by default, almost all the load is contained on one CCD, within SMT threads.
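Manually setting affinity, as discussed above, can be sketched on Linux with `os.sched_setaffinity`. The core-to-CCD mapping below (cores 0-7 = CCD0) is an assumption; the real mapping varies by SKU and OS, so check `lscpu -e` or HWiNFO on your own machine first:

```python
import os

# Hypothetical mapping for a two-CCD part: cores 0-7 = CCD0.
# Verify the real layout with `lscpu -e` before relying on this.
CCD0_CORES = set(range(8))

def pin_to_ccd0(pid: int = 0) -> set:
    """Restrict a process (default: the current one) to CCD0's cores (Linux only)."""
    allowed = os.sched_getaffinity(pid)
    # Intersect with the currently allowed set so this also works on
    # machines with fewer cores or an already-restricted cpuset.
    target = (CCD0_CORES & allowed) or allowed
    os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)

if hasattr(os, "sched_setaffinity"):  # present on Linux; absent on macOS/Windows
    print(pin_to_ccd0())
```

The same effect can be had from a shell with `taskset -c 0-7 <command>`; whether the chipset driver then fights that pinning is exactly the open question above.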
 

fastandfurious6

Senior member
Jun 1, 2024
What is easily observable during any bench + HWiNFO is that the boost core changes every second:

t1: core 13 @ 5 GHz
t2: core 6 @ 5 GHz
t3: cores 7 + 10 @ 5 GHz
...etc. Not efficient.

The assumption is that the scheduler throws too much work at random cores, which is wasteful.
 

Josh128

Golden Member
Oct 14, 2022
What is easily observable during any bench + HWiNFO is that the boost core changes every second:

t1: core 13 @ 5 GHz
t2: core 6 @ 5 GHz
t3: cores 7 + 10 @ 5 GHz
...etc. Not efficient.

The assumption is that the scheduler throws too much work at random cores, which is wasteful.
That's a function of the CPU sustaining high 1T boost frequencies. The CPU "hands off" the load between preferred cores to let some cores rest and cool while others take up the load; max frequency is sustained much longer by doing this. The algorithm is efficient enough that there is basically no perf loss from doing it. This has been the case since at least Zen 2 on the AMD side.
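The hand-off pattern described above is easy to spot in logged data. A toy sketch, with synthetic per-second samples standing in for real HWiNFO logs, that reports which core(s) hold the boost in each sample:

```python
# Identify the top-boosting core(s) in each one-second sample, mimicking the
# HWiNFO observation quoted above. The frequency samples are synthetic stand-ins.
samples = [
    {13: 5000, 6: 4200, 7: 4100, 10: 4000},  # t1: core 13 boosts
    {13: 4100, 6: 5000, 7: 4200, 10: 4000},  # t2: core 6 boosts
    {13: 4000, 6: 4100, 7: 5000, 10: 5000},  # t3: cores 7 and 10 boost
]

def boost_cores(sample: dict) -> list:
    """Return all cores sitting at the sample's peak frequency (MHz)."""
    peak = max(sample.values())
    return sorted(core for core, mhz in sample.items() if mhz == peak)

for t, s in enumerate(samples, 1):
    print(f"t{t}: cores {boost_cores(s)} at {max(s.values())} MHz")
```

Whether the rotation costs performance is then a matter of comparing throughput with and without pinning, not just watching the boost core move.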
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
A few very interesting benchmarks here


Besides the obvious efficiency numbers, Halo has better absolute CPU performance in quite a few ML/AI workloads, Gromacs, Palabos, GPAW and other stuff.
I'd like to have seen the Granite Ridge parts tested in eco mode.

In the testing I've been doing against a Strix Halo mini PC I have, once I enable Eco Mode on the 9950X3D it performs similarly (these are workloads that don't benefit from Strix Halo's design) and draws similar power from the wall, even with the big desktop idle overhead.

I imagine that outside of the workloads that benefit significantly from Strix Halo's unique configuration, it actually isn't significantly more efficient than Granite Ridge at lower power.
 

LightningZ71

Platinum Member
Mar 10, 2017
From what we know, it's the same cores and the same node. There shouldn't be much of a difference in non-bandwidth-limited scenarios.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
From what we know, it's the same cores and the same node. There shouldn't be much of a difference in non-bandwidth-limited scenarios.
All the proclamations of a big efficiency advantage really seem like, in a lot of cases, they may just boil down to comparing something pushed way out of the pocket of the V/F curve against the same thing right in the pocket.
 

MS_AT

Senior member
Jul 15, 2024
draws similar power from the wall, even with the big desktop idle overhead.
Do you mean that, at idle, the Strix Halo draws a similar amount of power from the wall as a desktop part despite the SoC reporting lower power consumption? Or are you saying that if you enable Eco Mode on the 9950X3D, it will draw a similar amount of power under load as the Strix Halo does under load?
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
Do you mean that, at idle, the Strix Halo draws a similar amount of power from the wall as a desktop part despite the SoC reporting lower power consumption? Or are you saying that if you enable Eco Mode on the 9950X3D, it will draw a similar amount of power under load as the Strix Halo does under load?
I have seen workloads where my 9950X3D with an 88W PPT draws 190 watts at the wall and completes work slightly faster than Strix Halo in 120W mode, also drawing 190 watts at the wall.

The power per core in this specific case was only 3.1 watts for the 9950X3D, but 6.5 watts on Strix Halo. I assume it's at a disadvantage from the memory latency, unless something else is going on with it.

9950X3D was sustaining ~3.8 GHz while the Strix Halo sustained ~4.3 GHz. It's memory intensive, but not bandwidth intensive, hence the thinking that maybe the incredibly high memory latency from the LPDDR5X is at fault.

Edit: the workload was compiling Unreal Engine. It doesn't benefit very much from V$ on the 9950X3D, maybe 3.25%. Nothing significant. Other related workloads varied back and forth a bit, but in my line of work there was nowhere I found a compelling performance or efficiency advantage for Strix Halo.
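Equal wall power with different completion times still means different energy per finished job. A quick sketch of that comparison; the 190 W figure is from the post, but the runtimes below are illustrative assumptions, since the post only says "slightly faster":

```python
# Energy to finish one compile job: E = wall power (W) x runtime (s).
# Both machines drew ~190 W at the wall (from the post); the runtimes are
# illustrative assumptions -- the post only says "slightly faster".
WALL_W = 190.0
runtime_9950x3d_s = 570.0  # assumed
runtime_halo_s = 600.0     # assumed ~5% slower

def energy_kj(power_w: float, runtime_s: float) -> float:
    """Energy consumed over the run, in kilojoules."""
    return power_w * runtime_s / 1000.0

print(f"9950X3D (88W PPT): {energy_kj(WALL_W, runtime_9950x3d_s):.1f} kJ")
print(f"Strix Halo (120W): {energy_kj(WALL_W, runtime_halo_s):.1f} kJ")
# Equal wall power + shorter runtime => less energy per finished job.
```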
 

Geddagod

Golden Member
Dec 28, 2021
Has anyone checked Zen 5 L3 latency vs Zen 4, btw? I remember AMD saying in some slide that it stayed the same or improved; however, I remember some AIDA64 screenshot showing that it got worse.
Looking at the impact of switching to mesh vs ring is bound to be somewhat interesting for Zen 5.
There should also be a way to compare power between ring and mesh too, perhaps by subtracting core-only power from CCD power? IIRC AMD does have software power reporting for both of those metrics.
 

Geddagod

Golden Member
Dec 28, 2021
47 cycles (i.e. it's the same).
Does the mesh run at the same frequency as the ring did in previous gens? The desktop ring boost algorithm is, IIRC, ring frequency = frequency of the highest-boosting core, while on mobile it's some fixed ratio of the highest-boosting core's frequency?
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005

Good review. The Framework-included cooler isn't that good, other than that it looks good.
Kinda weird: my mini PC with the same cooler that most of the other designs have never throttles. It looked like Framework's heatsink was going to be better than the cooler that's making the rounds in the other designs.
 
Jul 27, 2020
maybe the incredibly high memory latency from the LPDDR5X is at fault.
This is a fatal flaw with LPDDR5X. Sure, when on battery it should run as-is, but when on AC power it should switch to a low-latency, power-burning mode. Always conserving energy was just a bad engineering decision. It badly needs a turbo mode.