Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Jul 27, 2020
27,980
19,118
146
They claimed an 11-22% MT uplift in Blender (11% for the 9700X, 16% for the 9600X, 17% for the 9900X and 22% for the 9950X). Blender shows a 23% IPC gain according to them. So for Cinebench R23, with its 17% IPC gain, you'd get roughly an 8-16% uplift: 8% for the 9700X, ~12% for the 9600X and 9900X, and 16% for the 9950X.
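
For anyone who wants to check that arithmetic, here is a rough Python sketch. It assumes the MT uplift scales linearly with the IPC gain, which is a simplification for illustration and not AMD's method; the names `scale_uplift` and `ESTIMATED_R23_UPLIFT` are made up for this example.

```python
# Scale the claimed Blender MT uplifts by the ratio of the two IPC gains.
# Linear scaling is an ASSUMPTION made for illustration, not AMD's methodology.
BLENDER_IPC_GAIN = 23.0  # percent, as quoted in the post
R23_IPC_GAIN = 17.0      # percent, as quoted in the post

# Claimed Blender MT uplift per SKU (percent), as quoted in the post.
BLENDER_MT_UPLIFT = {"9700X": 11.0, "9600X": 16.0, "9900X": 17.0, "9950X": 22.0}


def scale_uplift(blender_uplift_pct: float) -> float:
    """Estimate the R23 uplift from the Blender uplift (hypothetical helper)."""
    return blender_uplift_pct * R23_IPC_GAIN / BLENDER_IPC_GAIN


ESTIMATED_R23_UPLIFT = {
    cpu: round(scale_uplift(uplift), 1) for cpu, uplift in BLENDER_MT_UPLIFT.items()
}

if __name__ == "__main__":
    for cpu, estimate in ESTIMATED_R23_UPLIFT.items():
        print(f"{cpu}: ~{estimate}%")
```

The estimates land at roughly 8%, 12%, 12.5% and 16%, which matches the rounded figures above.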

[Attached images: 1722089216522.png, 1722089245358.png]

There. Matter settled. It's roughly 16% average IPC lift. I don't see why we should keep beating this IPC dead horse over and over. Not gonna make the dead horse run any faster :D

AMD footnote for Zen 5 IPC calculation:

Testing as of May 2024 by AMD Performance labs. "Zen 5" system configured with: Ryzen 9 9950X GIGABYTE X670E AORUS MASTER motherboard, Balanced, DDR5-6000, Radeon RX 7900 XTX, VBS=ON, SAM=ON, KRACKENX63 vs. "Zen 4" system configured with: Ryzen 7 7700X, ASUS ROG Crosshair X670E motherboard, Balanced, DDR5-6000, Radeon RX 7900 XTX, VBS=ON, SAM=ON, KRAKENX62 {FixedFrequency=4.0 GHz}. Applications tested include: Handbrake, League of Legends, FarCry 6, Puget Adobe Premiere Pro, 3DMark Physics, Kraken, Blender, Cinebench (n-thread), Geekbench, Octane, Speedometer, and WebXPRT. System manufacturers may vary configurations, yielding different results. GNR-03

SO

Turn off VBS, push RAM all the way up to 7200 or 8000 MT/s for Zen 5 (maybe do that for Zen 4 too if possible), change the power plan to High Performance, run all benchmarks at the same fixed frequency above 5 GHz, and there's a good chance the average IPC uplift improves.

AMD IS SANDBAGGING!!!!

But here's something new to predict for you folks:

What will be the rough average IPC uplift for Turin?

I predict ~17%

:p
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
And you'd have to recompile your whole Linux distro from scratch with the proper Zen 5 compiler flags. I have never done that, as I never felt the need during my 30 years of using Linux, because my new CPU always performed better by default than the previous one.

If a CPU requires recompilation of existing applications to perform well, I call that a failure. [...]


TLDR: A modern successful CPU should not require recompilation of applications to perform well.

Amen.

"Developers just need to optimize for it more!" is a bad excuse made by vendors of subpar microarchitectures for significantly longer than I've been alive.
 

Jul 27, 2020
27,980
19,118
146
"Developers just need to optimize for it more!" is a bad excuse made by vendors of subpar microarchitectures for significantly longer than I've been alive.
Yeah but there will be developers who WILL optimize for Zen 5.

Think Unreal Engine.

Think AAA games.

Think Windows itself (once they release an optimized Visual Studio).

And obviously Linux developers will optimize too coz that's their passion.

And many, many others.

If you are competing in a tough market, you will do whatever is necessary to maintain your edge.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
Okay. Elaborate. What kind of optimizations do you propose? Be specific.
 
Jul 27, 2020
27,980
19,118
146
Hardly a compelling argument for "recompiling to get acceptable perf is normal."
Because that's the quickest one I could find. My reasoning is that if an Intel-optimized distro is giving Zen 4 an uplift, imagine what a distro focused on Zen 4/5 performance would do.

If I had the time and resources, I would take open-source software releases from just before the Zen 4 announcement and from one year after it, build both with the latest compiler, and then maybe we would have more data on what I'm claiming.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
Except that doesn't say what you think it does, because there are generalized compiler improvements over time - that affect all microarchitectures.

What sorts of optimizations are you proposing?
 
Jul 27, 2020
27,980
19,118
146
Okay. Elaborate. What kind of optimizations do you propose? Be specific.
You are asking a code monkey who dropped out of the 42 main program coz I couldn't figure out in one month how to write my own secure malloc() function that passed strict Valgrind checks :D

I have no clue TBH. But my brain says it's possible.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
It's really not, for most stuff. The finer points of instruction scheduling are critical when you're on small or highly static cores. On massive OoO cores, the kinds of optimizations that matter tend to be the ones that compilers try to do anyway, and that affect most implementations of the instruction set rather than a specific one, i.e., good instruction selection, minimizing unnecessary fills/spills, autovectorization, initiating loads early, etc.

There's no goldmine of potential performance upside from optimization for a specific core. Most "hand optimization" wins in Free software tend to be manual vectorization, which isn't specific to any given microarchitecture.
 
Jul 27, 2020
27,980
19,118
146
What sorts of optimizations are you proposing?
Here's one idea:


[Attached image: 1722091401918.png]
Suppose a developer has done extensive profiling for Zen 4 and made changes to his application so that when Zen 4 is detected, he uses specific hot functions that matter a lot to his application's core performance.

Now suppose he uses the same tool to profile Zen 5 and sees some big differences. Some of his assumptions about Zen 4 no longer hold true with Zen 5. So he creates specific functions for Zen 5 again to make sure that his application can get the most out of the new architecture.
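
To make that concrete, here is a minimal Python sketch of per-microarchitecture dispatch. Everything in it is hypothetical: the kernel names, the `KERNELS` table and the `select_kernel()` helper are made up for illustration, the microarchitecture name is passed in by hand rather than detected via CPUID, and the "tuned" variants are placeholders that compute the same result as the generic one. Real code would do this in C/C++ or assembly.

```python
from typing import Callable, Dict, Sequence

Kernel = Callable[[Sequence[float]], float]


def sum_of_squares_generic(xs: Sequence[float]) -> float:
    """Portable fallback used when no tuned variant exists."""
    return sum(x * x for x in xs)


def sum_of_squares_zen4(xs: Sequence[float]) -> float:
    """Placeholder for a variant tuned from Zen 4 profiling data."""
    return sum(x * x for x in xs)


def sum_of_squares_zen5(xs: Sequence[float]) -> float:
    """Placeholder for a variant re-tuned after profiling on Zen 5."""
    return sum(x * x for x in xs)


# Hypothetical dispatch table keyed by a microarchitecture name.
KERNELS: Dict[str, Kernel] = {
    "zen4": sum_of_squares_zen4,
    "zen5": sum_of_squares_zen5,
}


def select_kernel(uarch: str) -> Kernel:
    """Pick the tuned kernel for `uarch`, falling back to the generic one."""
    return KERNELS.get(uarch, sum_of_squares_generic)


if __name__ == "__main__":
    kernel = select_kernel("zen5")  # in real code this would come from CPU detection
    print(kernel.__name__, kernel([1.0, 2.0, 3.0]))
```

The point of the table is that adding a Zen 5 path does not touch the Zen 4 path or the fallback; whether a separate Zen 5 variant would actually be faster is exactly the open question.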

This isn't something unheard of in the software world. Yes, most monkey programmers won't go to all this trouble. But Linux gurus, benchmark writers, game engine developers and authors of widely used software like 7-zip and WinRAR may do that. Can't say anything about the latter since it's closed source but maybe someone can look at 7-zip source and see if there are architecture specific optimizations in that.

Hackers love to hack. If someone like that thinks there is more performance to be squeezed out of a new architecture, you bet they will love to tackle that challenge. Because that's the real fun of hacking. The feeling of satisfaction when you crack a hard problem.
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
Zen 5 is expected to offer around 16% higher IPC in non-tuned code.
But with its weird front end and big FPU I do expect hand-tuned assembly could be much better. But no one outside of Oak Ridge and Lawrence Livermore will even consider that.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
Here's one idea:


[Attached image: attachment 103997]
Suppose a developer has done extensive profiling for Zen 4 and made changes to his application so that when Zen 4 is detected, he uses specific hot functions that matter a lot to his application's core performance.

Now suppose he uses the same tool to profile Zen 5 and sees some big differences. Some of his assumptions about Zen 4 no longer hold true with Zen 5. So he creates specific functions for Zen 5 again to make sure that his application can get the most out of the new architecture.

This isn't something unheard of in the software world. Yes, most monkey programmers won't go to all this trouble. But Linux gurus, benchmark writers, game engine developers and authors of widely used software like 7-zip and WinRAR may do that. Can't say anything about the latter since it's closed source but maybe someone can look at 7-zip source and see if there are architecture specific optimizations in that.

Hackers love to hack. If someone like that thinks there is more performance to be squeezed out of a new architecture, you bet they will love to tackle that challenge. Because that's the real fun of hacking. The feeling of satisfaction when you crack a hard problem.

Oh. Well, "runtime performance bottlenecks" are totally a specific answer.

7zip has x86 optimizations. It doesn't have optimizations for any specific microarchitecture AFAIK. See for yourself: https://github.com/mcmilk/7-Zip/tree/master/Asm/x86

You continue to show yourself totally incapable of naming a specific optimization Zen 5 would benefit from, and you keep leaning on magical thinking.
 
Jul 27, 2020
27,980
19,118
146
But with its weird front end and big FPU I do expect hand-tuned assembly could be much better. But no one outside of Oak Ridge and Lawrence Livermore will even consider that.
Don't forget John Carmack (his personal pet projects that he doesn't release to the world), Tim Sweeney, all the engine developers of AAA studios, Adobe and other workstation software developers.
 
Jul 27, 2020
27,980
19,118
146
With how starved the uncore/memory of Granite Ridge is, I'm gonna go out on a limb and say 19% IPC for Turin (at 5600 MT/s).
It's already starved on Raphael and I can only see the gap widening between desktop and EPYC.
Good point. Yeah, twelve channels of DDR5-8800 could really make that baby rock! :p
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
Don't forget John Carmack (his personal pet projects that he doesn't release to the world), Tim Sweeney, all the engine developers of AAA studios, Adobe and other workstation software developers.
I don't know, but it is possible. Epic/RAD did write extremely optimized Zen 2-specific code for Unreal. They even had an article about it but I can't find it now.
 

soresu

Diamond Member
Dec 19, 2014
4,101
3,560
136
Don't forget John Carmack (his personal pet projects that he doesn't release to the world), Tim Sweeney, all the engine developers of AAA studios, Adobe and other workstation software developers.
Don't forget about Frostbite engine's former lead dev Johan Andersson who spearheaded the Mantle work with AMD all those years ago.

He and some of the former DICE people left and formed a new company called Embark Studios, which is now a subsidiary of Nexon.