Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
Can anybody comment on the degree of performance determinism which is to be expected from AWS VMs? Also, while the article details the processor topology of m8a.8xlarge (1 NUMA node, 1 socket, 4× 32 MB L3$), the same details are not provided for m7a.8xlarge, and I haven't found respective documentation in an admittedly quick search.

Phoronix's bare metal tests have shown ≈1.4× geomean performance of 9005 over 9004 at the same core counts and wattage. I suppose his result of ≈1.6× of M8a over M7a includes either tweaks to the VM config by AWS on top of the CPU and memory upgrade, higher socket power, VM performance variation (noisy neighbors), or a combination of those.
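
As a back-of-the-envelope check on those two figures, here is a minimal sketch. It assumes the two geomean ratios are directly comparable and that the contributing factors simply multiply, neither of which is guaranteed since the test selections differ. The function name is made up for illustration.

```python
# Rough decomposition of the M8a-over-M7a uplift using the approximate
# ratios quoted in the post. Assumption: the factors multiply.

def residual_factor(observed_ratio: float, baseline_ratio: float) -> float:
    """Return the part of observed_ratio not explained by baseline_ratio."""
    if baseline_ratio <= 0:
        raise ValueError("baseline_ratio must be positive")
    return observed_ratio / baseline_ratio

bare_metal_gain = 1.4  # approx. 9005 over 9004 on bare metal (from the post)
cloud_gain = 1.6       # approx. M8a over M7a (from the post)

extra = residual_factor(cloud_gain, bare_metal_gain)
print(f"unexplained factor: {extra:.3f}x ({(extra - 1) * 100:.1f}% on top)")
```

Under these assumptions, roughly 14% of the cloud uplift would have to come from something other than the CPU generation itself.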
 

Bigos

Senior member
Jun 2, 2019
234
592
136
From my experience working with VMs in AWS and GCP, there can be quite a significant performance discrepancy of 5-10%. This is usually due to the "noisy neighbor" effect, but also to different topology, such as a different number of NUMA nodes [1] or maybe different IOD quadrants. This can be mitigated by using an AWS dedicated host (and carving out a portion of the host for a VM with the rest unused) or a similar solution for GCP (sole-tenant node).

The "5-10%" value is taken from memory, older hardware and a specific (proprietary) workload I was testing. It might be different on EPYC 4/5 machines and the PTS.

This should not invalidate the test results completely as the performance boost is a lot higher than the possible discrepancy.
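
To put a number on that, a quick sketch. It assumes each of the two measurements can independently be off by up to ±d, which is a simplification and not something I measured. The helper name is hypothetical.

```python
# Worst-case band for a measured performance ratio when both the numerator
# and the denominator may each be off by up to +/- max_error (assumption).

def ratio_bounds(measured_ratio: float, max_error: float) -> tuple[float, float]:
    """Return (lowest, highest) true ratio consistent with the measurement."""
    if not 0 <= max_error < 1:
        raise ValueError("max_error must be in [0, 1)")
    low = measured_ratio * (1 - max_error) / (1 + max_error)
    high = measured_ratio * (1 + max_error) / (1 - max_error)
    return low, high

# 1.6x is the approximate M8a-over-M7a figure, 10% the upper end of "5-10%".
low, high = ratio_bounds(1.6, 0.10)
print(f"true uplift somewhere between {low:.2f}x and {high:.2f}x")
```

Even in the pessimistic corner (new instance measured 10% too fast, old one 10% too slow), the uplift stays around 1.3×, so the generational gain does not disappear.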

[1] From my experience, the number of advertised NUMA nodes is the upper bound. So when a VM of a specific configuration says there are 2 NUMA nodes, it could be scheduled on actual 2 NUMA nodes or just one - which would enhance performance. In this case it seems like m8a advertises 1 NUMA node so it should never cross both, though. Not sure about m7a, but the PTS table shows "1x128GB" RAM for both so maybe it also uses 1 NUMA node.
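
For anyone who wants to check what their own instance advertises, a minimal sketch for a Linux guest follows. It reads the kernel's sysfs node list; the function names are made up here. Note that this only shows what the hypervisor exposes to the guest, not how the VM is actually placed on the host.

```python
from pathlib import Path

def parse_id_list(text: str) -> list[int]:
    """Expand a Linux sysfs id list such as '0-1,4' into [0, 1, 4]."""
    ids: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            ids.extend(range(int(start), int(end) + 1))
        else:
            ids.append(int(part))
    return ids

def online_numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> list[int]:
    """Return the NUMA node ids the guest kernel reports, or [] if unavailable."""
    try:
        return parse_id_list((Path(sysfs_root) / "online").read_text())
    except OSError:
        # Not Linux, or a kernel without NUMA sysfs entries.
        return []

nodes = online_numa_nodes()
print(f"guest sees {len(nodes)} NUMA node(s): {nodes}")
```

Running the same check on an m7a and an m8a instance of equal size would at least show whether the advertised node counts match.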
 

fastandfurious6

Senior member
Jun 1, 2024
984
1,092
96
https://substack-post-media.s3.amazonaws.com/public/images/c1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png

incredible............

nothing screams more maximum power than those huge red lines

in one year... Medusa Halo Max Ultra X3D red lines will pierce through the graph and the monitor itself


get ready
 

StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
Now add the MI300A APU into the graphs — and suddenly, STXH's bars don't look so impressive anymore. :-P

(As will Medusa Halo's, once it arrives.)
 

StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
Which leads us to the question how Medusa Halo is supposed to have monitor-piercing benchmark bars, while still fitting into laptops.
Will it have HBM? Certainly not.
(Wrong thread, but in my defense, I am asking merely rhetorically.)

Edit,
apropos — does the bar graph shown above correspond to laptop power levels? Sustained power levels even?
[Going by who posted it, it likely does not.]

Edit 2,
the source of #24,780 is https://chipsandcheese.com/p/amds-chiplet-apu-an-overview-of-strix. Folks, when you post stuff that's not your own, state your source.

The source does not say what exact power level the graph corresponds with, except giving a 75 W upper bound for the RTX 5070M, nor whether it's peak or sustained.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,697
3,396
136
There are enough mini-PCs out there with Strix Point to satisfy much of that market anyway. There's no world where a vanilla 9700X and a low-end GPU don't pants Strix Point for similar costs. Also, it's not performant enough to make use of its one possible advantage: having up to 128GB unified RAM.
 

StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
Has there been any info ever whether Krackan Point is made of one or of two core complexes?

(Edit: The early Geekbench database entries of Krackan Point showed two CCXs, but folks claimed back then that this was merely Geekbench reporting the topology incorrectly. I don't recall that anybody ever really confirmed it to be one or the other.)
 

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
Last edited:

Thunder 57

Diamond Member
Aug 19, 2007
4,294
7,100
136
I hope they release it so people can see just how pointless it is. As if there is no inter-CCD latency. What's more interesting is that if this leak is true AMD seems to have figured out how to get (near) full speed (GHz) with X3D chips.
 

MS_AT

Senior member
Jul 15, 2024
940
1,860
96
Oh no, here we go again... ;) Still if by any chance this turns out to be true, this will be the fastest x64 CPU on the market till the next gen ships. It will even be able to compete with M4 in SPECint.
 

ToTTenTranz

Senior member
Feb 4, 2021
955
1,575
136
Here's the original post:



I'm honestly anticipating this. I'm holding off on it for a long needed system upgrade from my current 5900X setup.



I hope they release it so people can see just how pointless it is. As if there is no inter-CCD latency.
Its performance might surprise some people because of the core parking issues in the 9950X3D.

It could also become a champion of Mixture-of-Experts models running on GPU+CPU.
 

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
What's more interesting is that if this leak is true AMD seems to have figured out how to get (near) full speed (GHz) with X3D chips.
100%. I think it was possible all along, but the 9800X3D was a pretty early release in the Zen 5 lifecycle and used a brand new design and location for the X3D die, so they had to be conservative for safety and reliability reasons.

Now that they've had almost a year more time to R&D, they most likely:

1. Were able to get enough / divert enough good bins to allow it (remember even vanilla 9700X was limited to 5.5GHz)
2. Have more confidence in the reliability of the design at the required voltage and current for these frequencies.

X3D2 is very interesting as it has both high clock and dual X3D. Likely will be pointless over 9850X3D for gaming, but will command a new high price for the elitist technophiles.
 

Det0x

Golden Member
Sep 11, 2014
1,488
5,114
136
Anybody seen this yet?? Dual v-cache 9950X3D2 and 5.6GHz 9850X3D?! To combat Arrow Lake refresh??


View attachment 132352
Wow new toys incoming !? :innocent:
Could be fun to try out together with the new A-die 24GB sticks
 

eek2121

Diamond Member
Aug 2, 2005
3,489
5,179
136
Here's the original post:



I'm honestly anticipating this. I'm holding off on it for a long needed system upgrade from my current 5900X setup.




Its performance might surprise some people because of the core parking issues in the 9950X3D.

It could also become a champion of Mixture-of-Experts models running on GPU+CPU.

The rumor that just won’t die! 🤣

If it comes out, I’ll buy it.
 

Joe NYC

Diamond Member
Jun 26, 2021
4,228
5,829
136
Anybody seen this yet?? Dual v-cache 9950X3D2 and 5.6GHz 9850X3D?! To combat Arrow Lake refresh??


View attachment 132352

If these CPUs are released, the doubters who doubt the benefit of V-Cache across the board (such as @adroc_thurston) will finally be convinced - hopefully.

V-Cache CPUs that run at a clock speed deficit of 200-500 MHz first have to make up the clock speed difference and then surpass the performance of CPUs clocked much higher.

So, with these CPUs instead of first subtracting and then adding, you would be only adding. The apps (that were said to not benefit from V-Cache) were just more sensitive to subtracting than to adding. With no more subtracting, these apps will also benefit.
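
To illustrate the "subtract first, then add" point, a toy calculation. It assumes performance scales linearly with clock speed, which real apps rarely do exactly, and the 5.7 GHz base clock is only an illustrative number. The function name is invented for this sketch.

```python
# Toy break-even model: how much the extra cache must add per clock just to
# tie a higher-clocked non-X3D part. Assumption: performance ~ clock speed.

def break_even_cache_gain(base_ghz: float, deficit_mhz: float) -> float:
    """Fractional gain the V-Cache part needs to match the non-X3D part."""
    x3d_ghz = base_ghz - deficit_mhz / 1000.0
    if x3d_ghz <= 0:
        raise ValueError("deficit exceeds base clock")
    return base_ghz / x3d_ghz - 1.0

BASE_GHZ = 5.7  # illustrative boost clock, not a spec for any leaked part
for deficit in (0, 200, 500):
    gain = break_even_cache_gain(BASE_GHZ, deficit)
    print(f"{deficit:>3} MHz deficit -> cache must add {gain * 100:.1f}% to break even")
```

Under this model a 200 MHz deficit costs about 3.6% and a 500 MHz deficit about 9.6% before the cache shows any net benefit; with no deficit, every bit of cache gain counts.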
 

Joe NYC

Diamond Member
Jun 26, 2021
4,228
5,829
136
I hope they release it so people can see just how pointless it is. As if there is no inter-CCD latency. What's more interesting is that if this leak is true AMD seems to have figured out how to get (near) full speed (GHz) with X3D chips.

There are plenty of workstation / server apps that are not noticeably penalized by multiple CCDs.

Gains in gaming of 9950X3D2 will likely be minimal (vs 9850X3D), but in other apps, like those that Phoronix typically tests, we will see benefits.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,294
7,100
136
There are plenty of workstation / server apps that are not noticeably penalized by multiple CCDs.

Gains in gaming of 9950X3D2 will likely be minimal (vs 9850X3D), but in other apps, like those that Phoronix typically tests, we will see benefits.

The benefits seem to be rare enough to not be worth it, as it looks like AMD stopped putting 3D cache on EPYC with Zen 5.