Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Rheingold

Member
Aug 17, 2022
72
204
76
Cheaper (50% price drop!) Elite X laptops for us to mess with. Win-Win!
Something like this. The argument for the Snapdragon was lower prices but so far that hasn't really materialized. At least in the EU the cheapest X Plus models are still around 1100€, with X80-100 Elite models mostly around 1500€, right where the cheapest HX 370 models are starting on launch day. Also, you can get nice Hawk Point models for 700-800 euros. (insert Ape "Where cheaper?" meme)
 

Philste

Senior member
Oct 13, 2023
300
474
96
Could this also be why Strix laptops with a dGPU max out at a 4070 so far? Even though Strix is underwhelming, it is still the best APU/laptop CPU right now, and will be for at least 5-6 more months until Arrow Lake launches. So one would expect devices with a 4080 or 4090. But looking at the inter-CCX latency, I wonder if Strix is a massive step backward for dGPU gaming performance. I mean, it's technically a quad core. So maybe it stops at a 4070 to make CPU-limited situations less likely?
 

naukkis

Golden Member
Jun 5, 2002
1,020
853
136
How are they useless?

Performance cores are in a separate cache domain from the efficiency cores. Pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.
 

Abwx

Lifer
Apr 2, 2011
11,884
4,873
136
Their Geekbench numbers (that they claimed they ran) were copied from an ASUS Geekbench result: https://browser.geekbench.com/v6/cpu/6915874


We'll know more as the reviews get more accurate.

Looking at NBC's CB R23 score for the ProArt, which is 23236 pts, this should yield about 1350 pts in CB 2024. For some reason they have a much lower score, which is not much better than the one ComputerBase got at 30-33W.


David Huang leaked 3.3GHz clocks for the 365 and 3.7GHz for the 370 weeks ago. Yesterday @StinkyPinky confirmed the 3.3GHz for the 365 with his retail unit, so there is no reason to believe the 3.7GHz for the 370 is wrong. I wish it were tho.

Those are the frequencies for a 30-33W TDP. But if we compare his 15400 pts CB R23 score, which is at 3.3GHz or so, what is the frequency in NBC's ProArt that scores 23236 pts, that is, 50% more, even accounting for 20% more cores?
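
To put a rough number on that question, here is a back-of-the-envelope sketch. It assumes the CB R23 score scales linearly with core count times all-core clock, and it treats every core as equal, which ignores the Zen 5 / Zen 5c split, memory, and power limits. The core counts (10 for the 365, 12 for the 370) match the "20% more cores" above; the function name is just made up for illustration.

```python
def implied_clock_ghz(ref_score, ref_clock_ghz, ref_cores, score, cores):
    """Estimate the all-core clock behind `score`, assuming
    score ~ cores * clock (a simplification, not a measured law)."""
    # Divide the score gain by the core-count gain to get the per-core gain.
    per_core_ratio = (score / ref_score) / (cores / ref_cores)
    # Under the linear assumption, the per-core gain is the clock gain.
    return ref_clock_ghz * per_core_ratio


# Reference: 15400 pts at ~3.3 GHz on 10 cores (the 365 result quoted above).
# Target: 23236 pts on 12 cores (the NBC ProArt result quoted above).
estimate = implied_clock_ghz(15400, 3.3, 10, 23236, 12)
print(f"Implied all-core clock: {estimate:.2f} GHz")
```

Under those assumptions the ProArt would need to hold a little over 4.1GHz on all cores, which is the gap the question is pointing at.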
 

GTracing

Senior member
Aug 6, 2021
478
1,114
106
Performance cores are in a separate cache domain from the efficiency cores. Pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.
That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.
 

ryanjagtap

Member
Sep 25, 2021
146
214
126
Performance cores are in a separate cache domain from the efficiency cores. Pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.
I think the inverse is true. The Zen 5c CCX is the most active and the 4C Zen 5 CCX turns on for bursty workloads. There is a video of Wendell from Level1Techs interviewing an AMD fellow who said this I think.


The video has no timestamps. You can transcribe the video if you don't want to watch it.
 

naukkis

Golden Member
Jun 5, 2002
1,020
853
136
That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.

Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could build a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably wouldn't even consider releasing such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. Have to wonder if they are losing their mojo.
 

GTracing

Senior member
Aug 6, 2021
478
1,114
106
Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could build a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably wouldn't even consider releasing such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. Have to wonder if they are losing their mojo.

The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.
 

HurleyBird

Platinum Member
Apr 22, 2003
2,811
1,544
136
The fact that a single CCX can only support up to 8 cores seems to be a problem for AMD. More cores would require another CCX and a separate L3 block. Perhaps they should work on larger CCXs, or even rework their core cluster hierarchy?

Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.
 

ryanjagtap

Member
Sep 25, 2021
146
214
126
Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.
Maybe they had to make two CCXs instead of a unified CCX to power gate the Zen 5 cores? I don't think they could turn off the Zen 5 cores when idling without putting them on a different CCX. (Just speculation)
 

naukkis

Golden Member
Jun 5, 2002
1,020
853
136
The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.

Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I really, really have to wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least an 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.
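
For anyone who wants to check the cache domains on their own unit, here is a minimal sketch for Linux. It assumes the standard sysfs cache layout (`cache/index*/level` and `shared_cpu_list` under each `cpuN` directory); the helper names are made up for illustration, and the CPU numbering in the sample data is hypothetical, not taken from a real Strix Point machine.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Optional


def l3_shared_cpu_list(cpu_dir: Path) -> Optional[str]:
    """Return the shared_cpu_list of the level-3 cache for one CPU, if exposed."""
    for index_dir in sorted(cpu_dir.glob("cache/index*")):
        try:
            if (index_dir / "level").read_text().strip() == "3":
                return (index_dir / "shared_cpu_list").read_text().strip()
        except OSError:
            continue  # entry missing or unreadable, try the next cache index
    return None


def group_by_l3(shared_lists: Dict[int, str]) -> Dict[str, List[int]]:
    """Group logical CPU ids by the L3 sharing list they report."""
    groups = defaultdict(list)
    for cpu, shared in sorted(shared_lists.items()):
        groups[shared].append(cpu)
    return dict(groups)


def read_l3_domains(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[str, List[int]]:
    """Read L3 cache domains from sysfs; returns {} where sysfs is unavailable."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    shared = {}
    for cpu_dir in root.glob("cpu[0-9]*"):
        value = l3_shared_cpu_list(cpu_dir)
        if value is not None:
            shared[int(cpu_dir.name[3:])] = value
    return group_by_l3(shared)


if __name__ == "__main__":
    for shared, cpus in read_l3_domains().items():
        print(f"L3 domain {shared}: {len(cpus)} logical CPUs")
```

Each distinct sharing list is one cache domain, so a chip with two CCXs should show two groups. Which logical CPUs land in which group depends on the platform's enumeration, so check the output rather than assuming an order.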
 

GTracing

Senior member
Aug 6, 2021
478
1,114
106
Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I really, really have to wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least an 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.

Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
So Zen5 = Bulldozer 2.0?

Fascinating.

Not in the sense of being a completely uncompetitive product. It looks like AMD made some odd choices, tried some new tricks, and performance is kind of underwhelming, but it's not meaningfully worse than Zen4 on any axis and is better on some others.

Folks making the Dozer comparison need to remember that Intel was doing 50%+ more iso-clock ST int against it.
 

naukkis

Golden Member
Jun 5, 2002
1,020
853
136
Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?

So with their CCX arrangement, an MT job has to choose between 8 cores with 8MB L3 or 4 cores with 16MB L3. First, that is a big problem for the scheduler: where should a job be scheduled for best performance? Why did they choose that CCX arrangement instead of putting 8 cores with a big L3 in the big-core cluster and leaving 4 small cores in a CCX with a small amount of cache, if their intention is to have a low-power CCX for power saving (to mimic what Intel is doing)? The CCX arrangement they chose doesn't make any sense.
 

Rheingold

Member
Aug 17, 2022
72
204
76
I think the inverse is true.
Thank you. I also still had those statements in mind but couldn't remember which interview it was. The interesting part about Zen 5c starts at 8m20s. Mahesh even envisions the separation of CCXs, and thus cache contexts, as beneficial for applications with different performance requirements. It just remains to be seen whether these potential advantages will be leveraged in real usage.