Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

poke01

Diamond Member
Mar 8, 2022
4,027
5,354
106
How good is the memory controller on Strix Halo? As far as I know it’s not very good and AMD had trouble with it.

There is a reason why it’s stuck at 8000 MT/s as well. The latency issue is probably exacerbated by this.
 

Doug S

Diamond Member
Feb 8, 2020
3,448
6,117
136
This is a fatal flaw with LPDDR5X. Sure, when on battery, it should run as-is but when on AC power, it should switch to low latency power burning mode. Always conserving energy was just a bad engineering decision. It badly needs a turbo mode.

Yes what a terrible engineering decision to make LOW POWER DDR conserve power! They should have anticipated people would inappropriately use it in situations for which it isn't intended, and designed it so it is able to discard its main reason for existence. :rolleyes:
 

MS_AT

Senior member
Jul 15, 2024
822
1,664
96
Kinda weird, my mini PC with the same cooler that most of the other ones have never throttles. It looked like Framework's heatsink was going to be better than the cooler that's making the rounds in the other designs.
It's the only review that mentions throttling; the others usually say positive things about the thermal solution. At least the ones I read.

Edit: The workload was compiling Unreal Engine. It doesn't benefit very much, maybe 3.25% from V$ on the 9950X3D. Nothing significant. Other related workloads varied back and forth a bit, but in my line of work there was nowhere I found a compelling performance or efficiency advantage for Strix Halo.
Have you looked at CPU load vs. time plots? It is possible that the X3D is able to pull ahead on the critical path thanks to higher boost, where the power limit is not a problem. I have never compiled Unreal Engine, so I have no intuition about its CPU load vs. time profile and I might be completely off the mark.

47 cycles (i.e. it's the same).
AMD slides cite some improvement within the CCX, if I read the slide right: https://chipsandcheese.com/?attachment_id=31946
 

fastandfurious6

Senior member
Jun 1, 2024
689
871
96

it's very interesting bc it's basically the exact same CCD... Halo only has better IOD, low power interconnect and lpddr5x 8000

how is halo really achieving even up to x2 less power for same perf?

also both 10% faster and 20% slower in some scenarios? the slower is bc memory not designed for high perf?


basically it shows that zen 5 full power is still bottlenecked/untapped and minimal improvements dramatically increase performance
 
  • Like
Reactions: Joe NYC

Thunder 57

Diamond Member
Aug 19, 2007
3,889
6,557
136
it's very interesting bc it's basically the exact same CCD... Halo only has better IOD, low power interconnect and lpddr5x 8000

how is halo really achieving even up to x2 less power for same perf?

also both 10% faster and 20% slower in some scenarios? the slower is bc memory not designed for high perf?


basically it shows that zen 5 full power is still bottlenecked/untapped and minimal improvements dramatically increase performance

Doesn't Halo use LPDDR? The latency on that sucks and could explain some of it.
 
Jul 27, 2020
27,155
18,662
146
Yes what a terrible engineering decision to make LOW POWER DDR conserve power! They should have anticipated people would inappropriately use it in situations for which it isn't intended, and designed it so it is able to discard its main reason for existence. :rolleyes:
It is delusional to think that laptops aren't used plugged in.

You don't need an engineering degree to realize that. Unfortunately, the problem is that meetings are often just nothing more than a bunch of scared people nodding heads. If someone disagrees, people either laugh at the person or use other psychological tricks to undermine the person's idea. If a meeting began with the reassurance that every idea, no matter how stupid sounding, would be welcome and there is a guarantee that no one will be ridiculed for speaking their mind, maybe we wouldn't need to keep revising standards with incremental improvements.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,456
5,048
136
it's very interesting bc it's basically the exact same CCD... Halo only has better IOD, low power interconnect and lpddr5x 8000

how is halo really achieving even up to x2 less power for same perf?

also both 10% faster and 20% slower in some scenarios? the slower is bc memory not designed for high perf?


basically it shows that zen 5 full power is still bottlenecked/untapped and minimal improvements dramatically increase performance

It seems that with the latency of LPDDR, V-Cache would have an even more dramatic effect.
 

Josh128

Golden Member
Oct 14, 2022
1,164
1,758
106
it's very interesting bc it's basically the exact same CCD... Halo only has better IOD, low power interconnect and lpddr5x 8000

how is halo really achieving even up to x2 less power for same perf?

also both 10% faster and 20% slower in some scenarios? the slower is bc memory not designed for high perf?


basically it shows that zen 5 full power is still bottlenecked/untapped and minimal improvements dramatically increase performance
It's not the same CCD. AMD confirmed that somewhere (I don't have a link at the moment): it was produced on a low-power track and uses low-power transistors, which is why it maxes out at 5.1 GHz.

EDIT: This is the quote about the CCD being changed due to the interconnect.

It is a high bandwidth interface. But that had low-power states that could only take it so far. And you had retraining and latency implications every time the chip went down and came back up and so on. So for an always-on kind of a desktop kind of machine, that seemed like the best interconnect to connect that as we try to build this into an APU. The first thing we had to do was to change the interconnect between the two dies. And so the CCD that you see here, the core die that you see here, has a different item. That's the first change.
 

fastandfurious6

Senior member
Jun 1, 2024
689
871
96
It's not the same CCD. AMD confirmed that somewhere (I don't have a link at the moment): it was produced on a low-power track and uses low-power transistors, which is why it maxes out at 5.1 GHz.

very interesting!!! I wonder what low power transistors are. up to x2 less power than 9950x for the same or better perf is crazy (though lower ghz upper limit)
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
Have you looked at CPU load vs. time plots? It is possible that the X3D is able to pull ahead on the critical path thanks to higher boost, where the power limit is not a problem. I have never compiled Unreal Engine, so I have no intuition about its CPU load vs. time profile and I might be completely off the mark.
It's 100% load the entire duration for the specific example case I've been using. The frequencies I stated are effective frequencies, as well.

Granite Ridge outperforms at less than half the power per core and lower clocks. Remember, I'm saying that in Eco Mode, where it's limited to 88W PPT, it still performs better, while the desktop as a whole draws the same wall power as Strix Halo.

I expected if it was busy waiting on memory, the effective clock and power draw would reduce to reflect that. But I might be wrong on that. Every reading I can take indicates the CPU is completely maxed out and working hard.

Edit:
CPU/Config         Time       Effective Clock   System Wall Draw   Watts per core
Strix Halo         2414.50s   4.6GHz            190W               6.5W
9950X3D            1742.13s   5.2GHz            ???W               ~10W
9950X3D Eco Mode   2366.24s   3.8GHz            190W               3.1W
 

poke01

Diamond Member
Mar 8, 2022
4,027
5,354
106
.
It's 100% load the entire duration for the specific example case I've been using. The frequencies I stated are effective frequencies, as well.

Granite Ridge outperforms at less than half the power per core and lower clocks. Remember, I'm saying that in Eco Mode, where it's limited to 88W PPT, it still performs better, while the desktop as a whole draws the same wall power as Strix Halo.

I expected if it was busy waiting on memory, the effective clock and power draw would reduce to reflect that. But I might be wrong on that. Every reading I can take indicates the CPU is completely maxed out and working hard.

Edit:
CPU/Config         Time       Effective Clock   System Wall Draw   Watts per core
Strix Halo         2414.50s   4.6GHz            190W               6.5W
9950X3D            1742.13s   5.2GHz            ???W               ~10W
9950X3D Eco Mode   2366.24s   3.8GHz            190W               3.1W
How are you determining those per-core readings? HWiNFO?
 

MS_AT

Senior member
Jul 15, 2024
822
1,664
96
CPU/Config         Time       Effective Clock   System Wall Draw   Watts per core
Strix Halo         2414.50s   4.6GHz            190W               6.5W
9950X3D            1742.13s   5.2GHz            ???W               ~10W
9950X3D Eco Mode   2366.24s   3.8GHz            190W               3.1W

Since it's 100% load, and the values between the 9950X3D runs scale nicely with the clock (~35% faster build with ~35% higher clock), which further supports your claim about the 100% load nature, the only things that come to mind are that either something is wrong with the sensors on that Halo machine or your workload is really sensitive to latency, though in the various benchmarks I have gone over there was generally little benefit from lower-latency memory during code compilation. Personally, I have never tested for latency influence, as I usually have access to setups with the same memory config. Nevertheless, an interesting observation.
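To put numbers on the scaling argument, here is a quick Python sketch that uses only the build times and effective clocks from the table above. The function and variable names are made up for illustration, and "work per clock" is just a crude proxy that assumes the build does the same total work on every machine.

```python
# Build times (seconds) and effective clocks (GHz) copied from the table in this thread.
RUNS = {
    "Strix Halo": (2414.50, 4.6),
    "9950X3D": (1742.13, 5.2),
    "9950X3D Eco Mode": (2366.24, 3.8),
}


def speedup(baseline: str, other: str) -> float:
    """How many times faster `other` finishes the build than `baseline`."""
    return RUNS[baseline][0] / RUNS[other][0]


def clock_ratio(baseline: str, other: str) -> float:
    """Effective clock of `other` divided by effective clock of `baseline`."""
    return RUNS[other][1] / RUNS[baseline][1]


def work_per_clock(baseline: str, other: str) -> float:
    """Relative work done per clock cycle (assumes identical total work)."""
    return speedup(baseline, other) / clock_ratio(baseline, other)


if __name__ == "__main__":
    base = "9950X3D Eco Mode"
    for name in ("9950X3D", "Strix Halo"):
        print(
            f"{name} vs {base}: "
            f"speedup {speedup(base, name):.3f}x, "
            f"clock {clock_ratio(base, name):.3f}x, "
            f"work per clock {work_per_clock(base, name):.3f}x"
        )
```

On these figures the two 9950X3D runs track clock almost one to one (about 1.36x faster at about 1.37x the clock), while Strix Halo comes out at roughly 0.81x the work per clock of the Eco Mode run despite the higher effective clock.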
 

poke01

Diamond Member
Mar 8, 2022
4,027
5,354
106
It’s interesting to note that Ars also had similar results. In fact, I would say the 9700X was doing much better relative to its power consumption than the Max+ 395 in Handbrake.
Note: All AMD readings include the package power.

[Attached images: IMG_2389.png, IMG_2388.png, IMG_2387.png]

However the 9700X did poorly against the Max+ 395 in Cinebench/Blender.
 
  • Like
Reactions: igor_kavinski

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
How are you determining those per-core readings? HWiNFO?
Yeah. They might not be completely accurate, but at 120W PPT Strix Halo has significantly more power to play with per core than Granite Ridge at 88W PPT, given Granite Ridge's big IOD overhead and lower power limit.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
However the 9700X did poorly against the Max+ 395 in Cinebench/Blender.
All the canned benchmarks I ran showed exactly what people are saying: nearly desktop performance for a fraction of the power.

It's just that every real workload I have subjected it to tells a completely different story.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
Since it's 100% load, and the values between the 9950X3D runs scale nicely with the clock (~35% faster build with ~35% higher clock), which further supports your claim about the 100% load nature, the only things that come to mind are that either something is wrong with the sensors on that Halo machine or your workload is really sensitive to latency, though in the various benchmarks I have gone over there was generally little benefit from lower-latency memory during code compilation. Personally, I have never tested for latency influence, as I usually have access to setups with the same memory config. Nevertheless, an interesting observation.
Sensor readings being wrong doesn't change the outcome: it uses the same power to do the work more slowly. I am measuring wall draw with a Kill A Watt, so it's not like I'm relying on sensor readings to make this claim.
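Since the wall draw is the same and only the time differs, energy per build is the simplest way to compare the two. A minimal sketch, assuming the 190W reading is a steady average over the whole build (only one wall number per machine was posted); the names are hypothetical.

```python
def wall_energy_wh(avg_wall_watts: float, seconds: float) -> float:
    """Energy drawn at the wall in watt-hours, assuming constant average power."""
    return avg_wall_watts * seconds / 3600.0


# Wall draw and build times as posted in this thread.
halo_wh = wall_energy_wh(190.0, 2414.50)  # Strix Halo mini PC
eco_wh = wall_energy_wh(190.0, 2366.24)   # 9950X3D desktop in Eco Mode

# Extra wall energy Strix Halo needs for the same build, in percent.
extra_pct = (halo_wh / eco_wh - 1.0) * 100.0

print(f"Strix Halo:       {halo_wh:.1f} Wh per build")
print(f"9950X3D Eco Mode: {eco_wh:.1f} Wh per build")
print(f"Strix Halo uses {extra_pct:.1f}% more wall energy")
```

That works out to roughly 127 Wh versus 125 Wh per build, so about 2% more energy at the wall for Strix Halo on this workload.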
 

MS_AT

Senior member
Jul 15, 2024
822
1,664
96
Sensor readings being wrong doesn't change the outcome: it uses the same power to do the work more slowly. I am measuring wall draw with a Kill A Watt, so it's not like I'm relying on sensor readings to make this claim.
Well, you measure power supply efficiency this way too ;)

Anyway, I am not arguing with your results. I just wonder why the performance is worse than expected given the apparently higher clocks. If the actual clocks were similar to the power-limited 9950X3D it would make more sense, which is why I suggested the readings might be off. And which of the mini PCs do you have?
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
Well, you measure power supply efficiency this way too ;)

Anyway, I am not arguing with your results. I just wonder why the performance is worse than expected given the apparently higher clocks. If the actual clocks were similar to the power-limited 9950X3D it would make more sense, which is why I suggested the readings might be off. And which of the mini PCs do you have?
It's the Nimo Mini PC Pro. It has a GameMax 12VO Flex ATX unit that appears to be a modified version of their Gold-rated 350W Flex ATX unit. Back-of-the-napkin math comparing it to mini PCs that use a power brick with a datasheet-provided efficiency figure suggests it is 90%+ efficient, as GameMax claims.

It doesn't seem to me like the PSU efficiency is a major factor here, especially considering my desktop
1) uses a 1000W unit, which at 190 watts is not meaningfully more efficient than this PSU appears to be,
2) has the GPU idling at 35W because of high refresh rate displays, and
3) has 8 fans, LEDs, and a 4x25GbE SFP28 NIC adding power overhead,
and it still draws the same power at the wall.

My desktop idles at 100W at the wall, so the relatively smaller idle-to-load power draw delta is pretty definitive IMO.
 
  • Like
Reactions: igor_kavinski

Doug S

Diamond Member
Feb 8, 2020
3,448
6,117
136
It is delusional to think that laptops aren't used plugged in.

You don't need an engineering degree to realize that. Unfortunately, the problem is that meetings are often just nothing more than a bunch of scared people nodding heads. If someone disagrees, people either laugh at the person or use other psychological tricks to undermine the person's idea. If a meeting began with the reassurance that every idea, no matter how stupid sounding, would be welcome and there is a guarantee that no one will be ridiculed for speaking their mind, maybe we wouldn't need to keep revising standards with incremental improvements.

No, the problem is people trying to make something designed for a particular purpose into something that's general purpose, because they are bloody-mindedly stubborn and refuse to accept that real engineering involves tradeoffs.

If you want general purpose RAM in your laptop so it can run faster on AC, then don't use LPDDR. If you don't want to accept the limitations of SO-DIMMs, there's nothing stopping you from buying laptops that have soldered standard DDR5 instead of LPDDR5. What's that, now it takes more power when it's on battery? Welcome to the real world, where there are tradeoffs!
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,840
3,220
146
Welcome to the real world, where there are tradeoffs!
Couldn't be more true, and all of the hype surrounding Strix Halo seems to be completely ignoring the fact there are substantial tradeoffs compared to a normal desktop.

I was definitely buying into the hype until reality slapped me in the face with the first real workload I tested on it.
 
  • Like
Reactions: Gideon and Doug S

jpiniero

Lifer
Oct 1, 2010
16,658
7,136
136
Couldn't be more true, and all of the hype surrounding Strix Halo seems to be completely ignoring the fact there are substantial tradeoffs compared to a normal desktop.

I was definitely buying into the hype until reality slapped me in the face with the first real workload I tested on it.

The point of Strix Halo is AI, not something like Handbrake.