Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


eek2121

Diamond Member
Aug 2, 2005
3,491
5,183
136
@Timmah! Probably as designed, but it really doesn't matter as much as folks claim. Zen 5 is faster than Zen 4; how much depends on your workload. Gamers aren't happy (yet), AI folks are super happy, and everyone else falls somewhere between those two. The chip is generally an upgrade in many scenarios, albeit a minor one.

Honestly, EPYC and Threadripper are going to be exciting for non-gamers and non-casual users. For everyone who doesn't need an upgrade now? Just wait.

Also, (sorry, offtopic) side note: I guess SarahKerrigan is gone? :(
 

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
So the higher power consumption of Zen 5 carries over to EPYC. The 9654 (Genoa) was 2.4 GHz base, 360 W TDP. The 9655 (Turin) is 2.6 GHz base, but 400 W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.

What's going to be really interesting is what a 128-core 3nm Zen 5C part (if they make such an SKU) can do vs. a 128-core 4nm Zen 4C part. I expect it to be the most interesting comparison of this whole generation.
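
For reference, the arithmetic behind the 11% figure can be sketched in a few lines of Python. This is only an illustration using the base clock and TDP numbers quoted in the post; the function names are made up for the example, the performance uplifts in the loop are hypothetical, and TDP is a rating rather than measured power draw.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def perf_per_watt_change(perf_uplift_pct, power_uplift_pct):
    """Change in perf/W (percent) for a given performance and power uplift."""
    ratio = (1 + perf_uplift_pct / 100.0) / (1 + power_uplift_pct / 100.0)
    return (ratio - 1) * 100.0


# Figures quoted in the post: 9654 (Genoa) vs 9655 (Turin)
tdp_uplift = pct_change(360, 400)         # TDP in watts
base_clock_uplift = pct_change(2.4, 2.6)  # base clock in GHz

print(f"TDP: +{tdp_uplift:.1f}%")
print(f"Base clock: +{base_clock_uplift:.1f}%")

# Hypothetical performance uplifts, not measurements
for perf_uplift in (11.0, 20.0, 30.0):
    change = perf_per_watt_change(perf_uplift, tdp_uplift)
    print(f"perf +{perf_uplift:.0f}% -> perf/W {change:+.1f}%")
```

Under these assumptions, perf/W only improves if the average uplift exceeds the roughly 11% TDP increase, which is the break-even point the post is getting at.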
 

marees

Platinum Member
Apr 28, 2024
2,292
2,902
96
"While I only dug into additional performance monitoring events on a few cores, I hope AMD will increase the L1 instruction cache’s size later on. Going to 64 KB probably isn’t enough. Apple and Qualcomm have gone for very large 192 KB L1 instruction caches, and with good reason. But even that won’t be enough to contain the instruction footprint on certain workloads."

 

eek2121

Diamond Member
Aug 2, 2005
3,491
5,183
136
So the higher power consumption of Zen 5 carries over to EPYC. The 9654 (Genoa) was 2.4 GHz base, 360 W TDP. The 9655 (Turin) is 2.6 GHz base, but 400 W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.

What's going to be really interesting is what a 128-core 3nm Zen 5C part (if they make such an SKU) can do vs. a 128-core 4nm Zen 4C part. I expect it to be the most interesting comparison of this whole generation.
See earlier posts. Much greater than 11%.
 
  • Like
Reactions: lightmanek

itsmydamnation

Diamond Member
Feb 6, 2011
3,139
4,010
136
So the higher power consumption of Zen 5 carries over to EPYC. The 9654 (Genoa) was 2.4 GHz base, 360 W TDP. The 9655 (Turin) is 2.6 GHz base, but 400 W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.

What's going to be really interesting is what a 128-core 3nm Zen 5C part (if they make such an SKU) can do vs. a 128-core 4nm Zen 4C part. I expect it to be the most interesting comparison of this whole generation.
TDP and base clock have to account for the AVX-512 implementation, which is substantially bigger than Zen 4's.
 
  • Like
Reactions: lightmanek

Thunder 57

Diamond Member
Aug 19, 2007
4,294
7,100
136
"While I only dug into additional performance monitoring events on a few cores, I hope AMD will increase the L1 instruction cache’s size later on. Going to 64 KB probably isn’t enough. Apple and Qualcomm have gone for very large 192 KB L1 instruction caches, and with good reason. But even that won’t be enough to contain the instruction footprint on certain workloads."


I found that curious. Those chips lack uop caches, right? Hence the need for a larger L1i. Zen originally had a 64KB L1i but shrunk it to 32KB in Zen 2 to make room to increase the uop cache size.
 

Hotrod2go

Senior member
Nov 17, 2021
349
233
86
These caches are indeed faster, and L1d has become bigger. (And it doesn't stop at these caches, as e.g. BTB is bigger, ITLB is bigger, µop cache is dual ported, ROB is larger...) *However*, bigger L1 and faster L1/2/3 are benefiting workloads more which hit L1/2/3 a lot, whereas the returns are diminishing in workloads which have a sizable amount of cache misses to begin with. This goes without saying but I am mentioning it because you brought up games. Many video games empirically benefit from increase of L3 cache size and from decrease of main memory latency. So they are more akin to the latter type of workload.
Yes, I get your point, but I don't play "many games". I'm a Bethesda fan, and with their latest release, Starfield, and its Creation Engine 2, I notice an increase in fps with my 9700X using the same memory profile as before with Zen 4. So whether the increased cache efficiency at all levels is noticeable at all is engine-specific.
 

Hotrod2go

Senior member
Nov 17, 2021
349
233
86
You can load from L1 at 128 B/c with AVX-512, 64 B/c with AVX2, and 32 B/c with pure scalar code. For Zen 4 it was 64 B/c, 64 B/c, and 24 B/c, respectively.
As for AIDA, it tests aggregate bandwidth, so comparing 6 cores to 8 cores will lead you to false conclusions: the 8-core part will most likely score higher simply because it has more cores.
Games won't see as dramatic a bandwidth increase as AIDA might lead you to believe, as they rarely use SIMD and none of them use AVX-512. For games, latency improvements are more welcome.
The latest non-beta release of AIDA64 warns the end user that it is not optimised for Zen 5 yet. So take my claims as a guide only at this point in time. I'm well aware that synthetic benchmarks are somewhat detached from real-world scenarios.
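
To put the quoted bytes-per-cycle figures in perspective, here is a rough Python sketch of theoretical peak L1 load bandwidth. The B/c numbers are the ones quoted in this post; the 5.0 GHz clock, the core counts, and the function name are assumptions picked for illustration, and real code will not sustain these peaks.

```python
# Peak L1 load width in bytes per cycle, as quoted in the post
L1_LOAD_BYTES_PER_CYCLE = {
    "Zen 5": {"AVX-512": 128, "AVX2": 64, "scalar": 32},
    "Zen 4": {"AVX-512": 64, "AVX2": 64, "scalar": 24},
}

CLOCK_GHZ = 5.0  # assumed core clock, for illustration only


def peak_l1_load_gbs(bytes_per_cycle, clock_ghz, cores=1):
    """Theoretical peak in GB/s: bytes/cycle * Gcycles/s * number of cores."""
    return bytes_per_cycle * clock_ghz * cores


# Per-core peaks for each core type and instruction width
for arch, widths in L1_LOAD_BYTES_PER_CYCLE.items():
    for kind, bpc in widths.items():
        per_core = peak_l1_load_gbs(bpc, CLOCK_GHZ)
        print(f"{arch} {kind}: {per_core:.0f} GB/s per core")

# An aggregate test scales with core count even when the per-core width is equal
six = peak_l1_load_gbs(64, CLOCK_GHZ, cores=6)
eight = peak_l1_load_gbs(64, CLOCK_GHZ, cores=8)
print(f"8 cores vs 6 cores at the same width: {eight / six:.2f}x")
```

The last comparison is why an aggregate score from an 8-core chip cannot be set directly against one from a 6-core chip: a third of the difference can come from core count alone.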
 

StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
@StefanR5R , what NUMA settings do you recommend in the server BIOS for getting best performance out of consumer workloads/benchmarks, as opposed to NUMA-aware server workloads?
Impossible for me to say, because I have no experience of my own with Windows on NUMA machines, or even on machines with split last-level caches.

So Turin will use 12 channels of "sweet spot" DDR5-6000???

SWEEEEEEEEEEEETTTTTTTTTTTTT!!!!!!!
Nice, I thought the rumors said 5600 or something like that.
But timings of reg-ECC kits will probably be a shock to gamers to see. ;-) They should wait for Turin-X anyway. :-D

On the screenshot you can see that it is a 2-socket configuration, with each socket having 128 cores and SMT disabled. So this frequency doesn't have to be anything special.
View attachment 105843
The thing with xwitter references is, one might be unable to follow such links, or the web browser at hand might not show anything.

So the higher power consumption of Zen 5 carries to EPYC.
Higher power consumption compared to what? (Rhetorical question. Always look at the work done for the power.)

9654 Genoa was 2.4Ghz base, 360W TDP. 9655 Turin is 2.6GHz base, but 400W(!) TDP.
Gasp! (Not really.)
If you put a vectorized workload onto Genoa, then even the 64 core SKU still runs very efficiently when set to its cTDP_high of 400 W. Guess what AMD's guides have always been recommending WRT the choice between cTDP_low/default TDP/cTDP_high of all the EPYC generations when you use them in HPC and similar scenarios.

Personally, I am interested to see which cTDP_high the 64-core Turin will have. IMO, 500 W could be reasonable given the considerably wider core and the expected core throughput compared to Genoa. Edit: Running the IMCs at a higher clock rate will certainly demand a fair amount of extra power too. Sometimes it pays off to configure a lower RAM clock rate than the maximum supported.
 
  • Like
Reactions: lightmanek

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
Higher power consumption compared to what? (Rhetorical question. Always look at the work done for the power.)
Zen 4. It's already established that low-clock / low-TDP Zen 5 is scarcely better than Zen 4 at all in perf/W (7700 vs 9700X @ 65 W), and can actually be worse at low enough TDP. Given that these server chips' core frequencies are way down compared to desktop, way lower than what desktop can do at 65 W, they are very possibly dropping into that "worse than" range.
 

CouncilorIrissa

Senior member
Jul 28, 2023
788
2,855
106

StefanR5R

Elite Member
Dec 10, 2016
6,851
11,029
136
Zen 4. It's already established that low-clock / low-TDP Zen 5 is scarcely better than Zen 4 at all in perf/W (7700 vs 9700X @ 65 W), and can actually be worse at low enough TDP.
Oh, you were referring to the sketchy scaling of GNR (and STX, I think) to low per-core power levels. Yes, it will be interesting to see how Turin manages. At the top end, there is a core count increase, the IOD will support faster RAM, the top end has more fabric links active... and at the same time the cores are that much wider. That is quite a few places to spend additional socket power on. It is potentially one of several debatable reasons why Turin gets a newer CCD stepping.
 

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
If you follow the discussion, the first comment was specifically about server workloads.
Actually, my comment was about general performance increases from Genoa to Turin, not performance in server workloads. Yes, indeed these are server chips, but there are plenty of scientific applications that might be run on these chips that are not server specific.
 
Jul 27, 2020
28,174
19,218
146


Wonder if AEMP is EXPO 2.0 or just some ASUS marketing term.

@Det0x, tell us more :p
 
  • Like
Reactions: lightmanek

eek2121

Diamond Member
Aug 2, 2005
3,491
5,183
136
Performance? Outside of AVX-512 acceleration, core for core, no way.
Boost clocks for EPYC alone increase by more than 11%. 3.7 -> 4.5 GHz is a huge jump. I am expecting Threadripper to have a peak of at least 5.5 GHz and will be sad if that is not the case.

Just because client is boring right now doesn’t mean all segments are.

side note: Postgres added AVX-512 support to some areas and managed more than a 6x performance increase in some cases. SQL databases can benefit significantly by utilizing AVX-512.
 

blackangus

Senior member
Aug 5, 2022
264
489
106
Boost clocks for EPYC alone increase by more than 11%. 3.7 -> 4.5 GHz is a huge jump. I am expecting Threadripper to have a peak of at least 5.5 GHz and will be sad if that is not the case.

Just because client is boring right now doesn’t mean all segments are.

side note: Postgres added AVX-512 support to some areas and managed more than a 6x performance increase in some cases. SQL databases can benefit significantly by utilizing AVX-512.
This is interesting and super good to know, Thanks!
 
  • Like
Reactions: igor_kavinski