Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Rheingold · Jul 28, 2024

igor_kavinski said:
Cheaper (50% price drop!) Elite X laptops for us to mess with. Win-Win!

Something like this. The argument for the Snapdragon was lower prices but so far that hasn't really materialized. At least in the EU the cheapest X Plus models are still around 1100€, with X80-100 Elite models mostly around 1500€, right where the cheapest HX 370 models are starting on launch day. Also, you can get nice Hawk Point models for 700-800 euros. (insert Ape "Where cheaper?" meme)

Philste · Jul 28, 2024

Could this also be the reason Strix laptops with a dGPU max out at a 4070 so far? Even though Strix is underwhelming, it is still the best APU/laptop CPU right now, and will be for at least 5-6 more months until Arrow Lake launches, so one would expect devices with a 4080 or 4090. But looking at the inter-CCX latency, I wonder if Strix is a massive step backward for dGPU gaming performance. I mean, it's technically a quad core. So it probably stops at a 4070 to make CPU-limited situations less likely?

naukkis · Jul 28, 2024

GTracing said:
How are they useless?

The performance cores are in a separate cache domain from the efficiency cores. It's pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.

Abwx · Jul 28, 2024

Malachijtjfjf said:
Their Geekbench numbers (which they claimed they ran) were copied from an ASUS Geekbench result: https://browser.geekbench.com/v6/cpu/6915874

https://twitter.com/x/status/1815314343942324464

We'll know more as the reviews get more accurate.

Looking at NBC's CB R23 score for the ProArt, which is 23236 pts, this should yield 1350 pts in R 2024. For some reason they have a much lower score, which is not much better than the one Computerbase got at 30-33W.

Philste said:
David Huang leaked 3.3GHz clocks for the 365 and 3.7GHz for the 370 weeks ago. Yesterday @StinkyPinky confirmed the 3.3GHz for the 365 with his retail unit, so there is no reason to believe the 3.7GHz for the 370 is wrong. I wish it were tho.

Those are the frequencies for a 30-33W TDP. But if we compare his 15400 pts CB R23 score, which is at 3.3GHz or so, what is the frequency in NBC's ProArt that scores 23236 pts, that is, 50% more, even accounting for 20% more cores?
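To put a rough number on that question, here is a back-of-the-envelope sketch. It assumes the CB R23 score scales linearly with both core count and clock, treats Zen 5 and Zen 5c cores as equal, and takes 10 and 12 cores for the two chips (consistent with the "20% more cores" above). None of that holds exactly, so read the result as a ballpark only. The function name is made up for illustration.

```python
def implied_clock_ghz(base_score, base_clock_ghz, base_cores, new_score, new_cores):
    """Estimate the average all-core clock behind new_score.

    Assumption: score is proportional to cores * clock, which ignores
    Zen 5 vs Zen 5c differences, memory limits and power-dependent boost.
    """
    score_ratio = new_score / base_score   # how much higher the new score is
    core_ratio = new_cores / base_cores    # how many more cores produced it
    per_core_gain = score_ratio / core_ratio
    return base_clock_ghz * per_core_gain


if __name__ == "__main__":
    # 15400 pts at ~3.3GHz on 10 cores vs 23236 pts on 12 cores
    estimate = implied_clock_ghz(15400, 3.3, 10, 23236, 12)
    print(f"implied average clock: {estimate:.2f} GHz")
```

Under those assumptions the ProArt result would need an average clock somewhere a bit above 4GHz, which points to a much higher power limit rather than the 30-33W configuration.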

GTracing · Jul 28, 2024

naukkis said:
The performance cores are in a separate cache domain from the efficiency cores. It's pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.

That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.

ryanjagtap · Jul 28, 2024

naukkis said:
The performance cores are in a separate cache domain from the efficiency cores. It's pretty much the same approach as the Meteor Lake LPE cores: they are only useful to save power when the performance cores are totally shut off.

I think the inverse is true. The Zen 5c CCX is the most active and the 4C Zen 5 CCX turns on for bursty workloads. There is a video of Wendell from Level1Techs interviewing an AMD fellow who said this I think.

The video has no timestamps. You can transcribe the video if you don't want to watch it.

naukkis · Jul 28, 2024

GTracing said:
That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.

Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could do a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably won't even consider releasing such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. Have to wonder if they are losing their mojo.

yottabit · Jul 28, 2024

So how long before we find out Strix Halo was restructured to have a 300 TOPS NPU and 2 Bobcat CPU cores

ryanjagtap · Jul 28, 2024

yottabit said:
So how long before we find out Strix Halo was restructured to have a 300 TOPS NPU and 2 Bobcat CPU cores

😱
What! No iGPU! How would they run the display! /s

GTracing · Jul 28, 2024

naukkis said:
Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could do a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably won't even consider releasing such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. Have to wonder if they are losing their mojo.

The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.

HurleyBird · Jul 28, 2024

FlameTail said:
The fact that a single CCX can only support up to 8 cores seems to be a problem for AMD. More cores would require another CCX and a separate L3 block. Perhaps they should work on larger CCXs, or even rework their core cluster hierarchy?

Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.

yuri69 · Jul 28, 2024

inf64 said:
Some very odd results, especially SPECint in the AT review. Zen 5 does look underwhelming considering AMD's claims.

Regressing in the GCC subtest after all the accolades about that zero-bubble, 2-branches BPU with 16k L1 BTB... Jesus, this is Bulldozer vibes.

ryanjagtap · Jul 28, 2024

HurleyBird said:
Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.

Maybe they had to make two CCXs instead of a unified CCX in order to power gate the Zen 5 cores? I don't think they could turn off the Zen 5 cores when idling without having put them on a different CCX. (Just speculation)

Nothingness · Jul 28, 2024

yuri69 said:
Regressing in the GCC subtest after all the accolades about that zero-bubble, 2-branches BPU with 16k L1 BTB... Jesus, this is Bulldozer vibes.

Exactly my thought. I will wait for desktop results but I'm starting to be quite concerned about the choices AMD made.

Joe NYC · Jul 28, 2024

Philste said:
Strix embargo fell right now... let's read the articles

Seriously? 6 pages since 9am?

yuri69 · Jul 28, 2024

Nothingness said:
Exactly my thought. I will wait for desktop results but I'm starting to be quite concerned about the choices AMD made.

Yea, but somehow "the try new things core" got fkced up in good old AMD way.

igor_kavinski · Jul 28, 2024

Joe NYC said:
Seriously? 6 pages since 9am?

Can't blame people for being underwhelmed. I think they were expecting an M4 killer 😀

naukkis · Jul 28, 2024

GTracing said:
The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.

Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I have to really, really wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least one 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.

FlameTail · Jul 28, 2024

So Zen5 = Bulldozer 2.0?

Fascinating.

GTracing · Jul 28, 2024

naukkis said:
Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I have to really, really wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least one 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.

Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?

SarahKerrigan · Jul 28, 2024

FlameTail said:
So Zen5 = Bulldozer 2.0?

Fascinating.

Not in the sense of being a completely uncompetitive product. It looks like AMD made some odd choices, tried some new tricks, and performance is kind of underwhelming, but it's not meaningfully worse than Zen4 on any axis and is better on some others.

Folks making the Dozer comparison need to remember that Intel was doing 50%+ more iso-clock ST int against it.

Hitman928 · Jul 28, 2024

FlameTail said:
So Zen5 = Bulldozer 2.0?

Fascinating.

Uhhh, no. Nowhere near the same.

naukkis · Jul 28, 2024

GTracing said:
Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?

So with their CCX layout, an MT job has to choose between 8 cores with 8MB L3 or 4 cores with 16MB L3. First, that is a big problem for the scheduler: where should a job be scheduled for best performance? Why did they choose that CCX arrangement instead of putting 8 cores in the big-core cluster with the big L3 and leaving 4 small cores in a CCX with a small amount of cache, if their intention is to have a low-power CCX for power saving (to mimic what Intel is doing)? The CCX arrangement they chose doesn't make any sense.
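For anyone who wants to see how the logical CPUs are actually split across L3 domains on their own machine, here is a minimal sketch for Linux. It assumes the usual sysfs cache layout (`cpuN/cache/indexM/level` and `shared_cpu_list`); the function name `l3_domains` is made up for this example, and it only reports which CPUs share an L3, not how the scheduler places threads.

```python
from collections import defaultdict
from pathlib import Path


def l3_domains(cpu_root="/sys/devices/system/cpu"):
    """Group logical CPU ids by the L3 cache they share.

    Assumes the Linux sysfs layout cpuN/cache/indexM/{level,shared_cpu_list}.
    Returns a dict mapping each shared_cpu_list string to a sorted list of ids.
    """
    domains = defaultdict(list)
    for cpu_dir in Path(cpu_root).glob("cpu[0-9]*"):
        cache_dir = cpu_dir / "cache"
        if not cache_dir.is_dir():
            continue
        cpu_id = int(cpu_dir.name[3:])  # "cpu12" -> 12
        for index_dir in cache_dir.glob("index*"):
            level_file = index_dir / "level"
            shared_file = index_dir / "shared_cpu_list"
            if not (level_file.exists() and shared_file.exists()):
                continue
            # Only level-3 caches define the domains we care about here.
            if level_file.read_text().strip() == "3":
                domains[shared_file.read_text().strip()].append(cpu_id)
    return {key: sorted(ids) for key, ids in domains.items()}


if __name__ == "__main__":
    for shared, cpus in sorted(l3_domains().items()):
        print(f"L3 shared by CPUs {shared}: {len(cpus)} logical CPUs")
```

On a chip with two CCXs this should print two groups. Pinning a benchmark to one group at a time (for example with `taskset`) is one way to test whether the split really hurts a given MT workload.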

Rheingold · Jul 28, 2024

ryanjagtap said:
I think the inverse is true.

Thank you. I also had those statements still in mind but couldn't remember which interview it was. The interesting part about Zen 5c starts at 8m20s. Mehesh even envisions the separation of CCXs and thus cache contexts as beneficial for applications with different performance requirements. It just remains to be seen if these potential advantages will be leveraged in real usage.

yuri69 · Jul 28, 2024

naukkis said:
Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I have to really, really wonder why AMD selected that CCX arrangement, as it's the worst possible.

AMD went for the cheapest way possible as always...
