Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Jul 27, 2020
28,173
19,203
146
So only a 10% uplift on Windows? Excuse me but wtf
You only need to be concerned if Geekbench's actual code for the subtests is being used in any or most commercial software.

Geekbench score is only a reflection of how well the CPU runs Geekbench code. Nothing else. It's an approximation of real world code but nobody spends AS MUCH time on tweaking their code as this Geekbench developer dude. Most commercial real world code is either super optimized or super turdy, with the majority in the latter class.

In maybe 99% of cases, if a piece of software is fulfilling its purpose, the developer will NEVER touch its code again to tweak it, because a codebase is a virtual bomb that can explode with the slightest change, and the probability of an explosion is proportional to the size of the codebase and the number of developers involved.
 

Hitman928

Diamond Member
Apr 15, 2012
6,748
12,474
136
What is funny is that most of the public Strix GB6 scores (that I could discover online) land around that range irrespective of frequency :laughing:

View attachment 102246

The GB-reported frequency for Strix has been obviously unreliable from the first samples. I was hoping that with more retail samples showing up, that would get fixed, but it seems to still be an issue.
 

eek2121

Diamond Member
Aug 2, 2005
3,449
5,101
136
That would be called a dGPU - the most modern ones are able to be used for this purpose, and they can be used for gaming, too!
Not really, NPUs lack dedicated TMUs and a bunch of other stuff.
You only need to be concerned if Geekbench's actual code for the subtests is being used in any or most commercial software.

Geekbench score is only a reflection of how well the CPU runs Geekbench code. Nothing else. It's an approximation of real world code but nobody spends AS MUCH time on tweaking their code as this Geekbench developer dude. Most commercial real world code is either super optimized or super turdy, with the majority in the latter class.

In maybe 99% of cases, if a piece of software is fulfilling its purpose, the developer will NEVER touch its code again to tweak it, because a codebase is a virtual bomb that can explode with the slightest change, and the probability of an explosion is proportional to the size of the codebase and the number of developers involved.
Geekbench does use many libraries that are used in both commercial and open source applications.

Whitepaper is here: https://www.geekbench.com/doc/geekbench6-benchmark-internals.pdf
 

eek2121

Diamond Member
Aug 2, 2005
3,449
5,101
136
The GB-reported frequency for Strix has been obviously unreliable from the first samples. I was hoping that with more retail samples showing up, that would get fixed, but it seems to still be an issue.

It is likely due to the frequencies of the Zen5c cores being detected instead of the “big” cores.
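A minimal sketch of how that kind of misreport could happen, assuming a Linux system that exposes the standard cpufreq sysfs files. How Geekbench actually detects frequency is not stated in this thread, so `reported_if_probing` is a hypothetical stand-in for a tool that reads a single core and reports that value for the whole chip.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_max_freqs_khz(root: Path = SYSFS_CPU) -> dict[int, int]:
    """Return {cpu index: cpuinfo_max_freq in kHz} for cores exposing cpufreq."""
    freqs = {}
    for f in root.glob("cpu[0-9]*/cpufreq/cpuinfo_max_freq"):
        cpu = int(f.parent.parent.name[3:])  # directory name "cpu12" -> 12
        freqs[cpu] = int(f.read_text().strip())
    return freqs


def cluster_by_max_freq(freqs: dict[int, int]) -> dict[int, list[int]]:
    """Group core indices by their advertised maximum frequency."""
    clusters: dict[int, list[int]] = {}
    for cpu, khz in sorted(freqs.items()):
        clusters.setdefault(khz, []).append(cpu)
    return clusters


def reported_if_probing(freqs: dict[int, int], probe_cpu: int) -> int:
    """Hypothetical: the value a tool would report if it only read one core."""
    return freqs[probe_cpu]


if __name__ == "__main__":
    freqs = read_max_freqs_khz()
    for khz, cpus in sorted(cluster_by_max_freq(freqs).items(), reverse=True):
        print(f"{khz / 1e6:.2f} GHz max: cores {cpus}")
```

On a hybrid part, the clusters would show two different maximums. A tool that happens to probe a core from the slower cluster would then report that lower figure, even if the benchmark itself ran on the faster cores.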
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
Not really, NPUs lack dedicated TMUs and a bunch of other stuff.
What I mean is, if you need to reach 40+ TOPS for the Copilot certification and you need to use a PCIe device for it, a dGPU would be the most sensible option, as it serves multiple purposes.
 

eek2121

Diamond Member
Aug 2, 2005
3,449
5,101
136
What I mean is, if you need to reach 40+ TOPS for the Copilot certification and you need to use a PCIe device for it, a dGPU would be the most sensible option, as it serves multiple purposes.
It is going to come down to cost, efficiency, and performance scaling. In theory, an NPU should cost less and use less power.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
It is going to come down to cost, efficiency, and performance scaling. In theory, an NPU should cost less and use less power.
Well, if you plan to have a dGPU you don't need a TPU at all, except in corner cases where you need a LOT of TOPS and/or you don't want to tax the GPU at the same time.
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,389
656
126
Well, if you plan to have a dGPU you don't need a TPU at all, except in corner cases where you need a LOT of TOPS and/or you don't want to tax the GPU at the same time.
The discussion was started regarding cases where no dGPU is used. I.e. how to fulfill the 40 TOPS requirement then.

If an NPU is included in the CPU, like on Arrow Lake DT, then the problem is solved. But on Zen5 DT it needs to be solved in some other way, unless it will include an NPU too (which we’ve not heard anything about so far). Possibly there can be PCIe NPUs that can be bought as an add-on, just like you can buy TPMs as an add-on for PCs that don’t have one.

It should be cheaper to buy an NPU than a dGPU if the intention is only to reach 40 TOPS and there is no need for a dGPU and the additional functionality it provides, for the reasons eek2121 mentioned.
 

Doug S

Diamond Member
Feb 8, 2020
3,738
6,604
136
Yeah that’s on Linux. Linux typically scores 6-10% higher.

Has there ever been a decent explanation as to why? Is it running significantly different code on Windows vs Linux? Using suboptimal compiler options on Windows? Making "libc" calls that are translated through some intermediate library? Perhaps Windows is more aggressive in trying to keep the CPU in a lower power / lower frequency state?
 
Jul 27, 2020
28,173
19,203
146
Has there ever been a decent explanation as to why?
The Linux kernel isn't a bloated mess. Certain distros even run Win32 games faster than Windows.


When dumb HR folks measure programmer productivity in lines of code (LOC), there's bound to be artificial complexity invented in the code to inflate the LOC. And then future programmers with common sense won't touch that codebase, because pointing out flaws or justifying a complete rewrite of even a simple, oft-used function would be seen as disrespect toward the senior programmers.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
The discussion was started regarding cases where no dGPU is used. I.e. how to fulfill the 40 TOPS requirement then.

If an NPU is included in the CPU, like on Arrow Lake DT, then the problem is solved. But on Zen5 DT it needs to be solved in some other way, unless it will include an NPU too (which we’ve not heard anything about so far). Possibly there can be PCIe NPUs that can be bought as an add-on, just like you can buy TPMs as an add-on for PCs that don’t have one.

It should be cheaper to buy an NPU than a dGPU if the intention is only to reach 40 TOPS and there is no need for a dGPU and the additional functionality it provides, for the reasons eek2121 mentioned.
Are there 40+ TOPS PCIe TPUs costing less than a low-end modern dGPU? Because the ones I see around are quite expensive.
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,389
656
126
Are there 40+ TOPS PCIe TPUs costing less than a low-end modern dGPU? Because the ones I see around are quite expensive.
I haven’t seen any NPUs in the 40 TOPS range, only models with many more TOPS, which of course are more expensive than a low-end dGPU.

The reason is likely that there hasn’t been any need for NPUs in the 40 TOPS range until now. Regarding the potential market for such NPUs, I guess they are not only for Zen5 but for older CPUs too, such as Zen4 and earlier, and the corresponding Intel CPUs. Basically, they would be for all CPUs without an included NPU that you’d like to upgrade to fulfill the Win12 AI PC requirements.
 

whoshere

Member
Feb 28, 2020
45
99
91
itvision.altervista.org
Has there ever been a decent explanation as to why? Is it running significantly different code on Windows vs Linux? Using suboptimal compiler options on Windows? Making "libc" calls that are translated through some intermediate library? Perhaps Windows is more aggressive in trying to keep the CPU in a lower power / lower frequency state?

I'm curious myself and I even asked John Poole this question:


Speaking of compilers:

For GB6.3.0:
  • The Linux version could be using Clang 16.0.0 and GCC Ubuntu 7.5.0-3ubuntu1~18.04.
  • The Windows version could be using MSVC 10.0. But then it contains references to GCC and LLVM, so I've no clue. Looks like it contains object files made by different compilers.
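For anyone who wants to repeat this kind of check, here is a rough sketch of one way to look for compiler identification strings in a binary. It is not necessarily how the versions above were found. It relies on GCC and Clang commonly leaving strings such as `GCC: (...) x.y.z` and `clang version x.y.z` in the objects they produce; the function name and patterns are my own, and stripped or repackaged binaries may contain none of these.

```python
import re
import sys

# Identification strings that GCC and Clang commonly embed in their output.
# These patterns are illustrative, not exhaustive.
PATTERNS = [
    re.compile(rb"GCC: \([^)\x00]*\) [0-9][0-9.]*"),
    re.compile(rb"clang version [0-9][0-9.]*"),
]


def find_compiler_strings(data: bytes) -> list[str]:
    """Return the sorted, de-duplicated compiler ident strings found in data."""
    found = set()
    for pattern in PATTERNS:
        for match in pattern.finditer(data):
            found.add(match.group(0).decode("ascii", errors="replace"))
    return sorted(found)


if __name__ == "__main__":
    # Usage: python find_compilers.py /path/to/binary
    with open(sys.argv[1], "rb") as fh:
        for ident in find_compiler_strings(fh.read()):
            print(ident)
```

Finding more than one compiler string in a single executable would fit the guess that it links object files or static libraries built by different toolchains, though it does not prove which compiler built the benchmark code itself.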