> His geomean results put the 9980X at +30% over the 7980X. Very interesting, as it should have similar clocks to the 7980X. This holds true for the 9970X vs. the 7970X as well (+28%). What's accounting for the additional +14% overall perf over the claimed +16% IPC?

Sustained clocks should be higher. Just compare on Epyc: https://www.amd.com/en/products/pro...ation-9004-and-8004-series/amd-epyc-9554.html vs. https://www.amd.com/en/products/processors/server/epyc/9005-series/amd-epyc-9555.html (all-core boost speed; this is not specified for the Threadripper parts). This is also what they were talking about around the Zen 5 release: Zen 5 parts don't boost higher, but they are able to hold higher clocks under heavier loads compared to Zen 4. It's only a few hundred MHz, but if you add the IPC increase and the memory bandwidth increase, it all adds up.
> That's what the PS6 handheld is.

A real experiment would be someone going to AMD's semi-custom division and hacking together a part that would actually target handhelds. Something that would hopefully wind up being better than a Z2 Extreme.
That Strix Halo vs. 9950X3D comparison also shows some weirdly great multicore wins for the Halo chip, and I think memory bandwidth is the only explanation.
> STX Halo has a faster and more expensive inter-CCD connect.

The write direction is wider than in Granite Ridge, and serialization/deserialization is eliminated; otherwise they left it the same.
That's what the PS6 handheld is.
> big if true!!!!

What's so big about it? The LLVM results? Everything seemed normal there.
> What's so big about it? The LLVM results? Everything seemed normal there.

The cache is helping unexpectedly in some workloads like CB 2024.
Idle power drain was also lowered. @aigomorla, I wonder if TR 7000 benefits in this regard from the latest firmware too.
> And the last time I decided to be smart and started playing with PBO and other stuff, I had to spend about an hour resetting and debugging the BIOS settings. I swore I wouldn't do it again.

Well, hopefully in 3 or 4 years, when you think you've gotten more out of the CPU than you paid for it, you can go ballistic with tweaking. And if it dies, all the better an excuse to upgrade to a 9960X.
Apart from Zen 5's already known bottlenecks, such as register file capacity and the weird frontend overrides, it's interesting that the op cache bandwidth isn't being utilized properly. Is that maximum throughput also oriented at SMT or something?

https://chipsandcheese.com/p/running-gaming-workloads-through

A pity he doesn't have an X3D part to compare.
> The most important chart from there, in my opinion, and the corresponding 285K chart:

And now we will get people saying that if the 285K gets higher IPC than Zen 5 in games, it was unjustly bashed for its gaming performance.
> I’ll be using the same games as in the Lion Cove gaming article, namely Palworld, COD Cold War, and Cyberpunk 2077. However, the data is not directly comparable; I’ve built up my Palworld base since then, COD Cold War multiplayer sessions are inherently unpredictable, and Cyberpunk 2077 received an update which annoyingly forces a 60 FPS cap, regardless of VSYNC or FPS cap settings. My goal here is to look for broad trends rather than do a like-for-like performance comparison.

I think for clarity you could have mentioned these things, to avoid misunderstandings.
> it seems interesting the Op cache bandwidth not being utilized properly. Is that maximum throughput also oriented at SMT or something?

The article attributes that to a high ratio of branches to other instructions. Since games are low-IPC workloads, I wouldn't draw conclusions about the uOP cache based on them. In their other article, when testing with 8-byte NOPs, a single thread is able to achieve 8 instructions/cycle (remember, rename is capped at 8), which would suggest it's able to draw more than 6 inst/cycle from the uOP cache. The uOP cache also has various limitations on what each entry is able to hold; you would need to read AMD's software optimization manual for the details.
The most important chart from there, in my opinion, and the corresponding 285K chart:
> The interesting thing about those charts for me isn't the gaming stuff, but the SPEC benchmarks, where there is a big divergence between AMD and Intel. I don't know to what extent compilers are now able to generate AVX-512 instructions for which SPEC subtests, but I'm assuming that's likely the reason why there are certain benchmarks where AMD dominates Intel.

Somewhat ironically, the int suite (at least when compiled with gcc) runs better when compiled without AVX-512 support.
> I don't know to what extent compilers are now able to generate AVX-512 instructions for which SPEC subtests, but I'm assuming that's likely the reason why there are certain benchmarks where AMD dominates Intel.

According to this, https://chipsandcheese.com/p/runnin...s-and-cheese?utm_campaign=post&utm_medium=web, they are using the -mcpu flag, which is a deprecated version of -mtune. In theory, that tunes the code for the native architecture but does not enable instruction set extensions beyond the generic set for the given architecture (x86-64 in this case): https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-mtune-17 In other words, I guess at most it's using SSE4, so no AVX2 or AVX-512 if I got the defaults right; but for sure no AVX-512.
> Somewhat ironically, the int suite (at least when compiled with gcc) runs better compiled without AVX-512 support.

There was a bug in GCC related to that. Not sure if he means after the bug was fixed, or is still referring to the bug itself.
> Intel is 40% backend bound and 20% core bound, which seems like the issue is more inherent?

They need to increase their OoO resources and ports as well.
> The interesting thing about those charts for me isn't the gaming stuff, but the SPEC benchmarks, where there is a big divergence between AMD and Intel. I don't know to what extent compilers are now able to generate AVX-512 instructions for which SPEC subtests, but I'm assuming that's likely the reason why there are certain benchmarks where AMD dominates Intel.

It's unfortunate, because a lot of their articles that look at this don't do direct comparisons. In this one it's due to updating their testing methodology; in the Lion Cove one it's because the game selection was different from their Zen 4 article.
So what I'm really curious about are the results that go the other way, because those lack an easy explanation: why is AMD's povray result so much worse than Intel's? Is it just a matter of Intel having internal structures like the ROB or the rename register file that are a little bigger, and that's enough to make a big difference in povray? Or maybe (though I doubt it) it's particularly sensitive to branch prediction, and Intel's branch predictor is guessing better on that benchmark? Or perhaps differences in prefetch?
You know, the kind of stuff Anandtech used to dig into back in the day and write about. Chips and Cheese digs deeper into that stuff than just about anyone else these days, but I'd love to see them do a direct comparison of AMD's and Intel's cores and dig into the places where their performance differs and the reasons why. I guess the problem is most people reading reviews just want to know which one runs games better.
> AMD has a frontend bottleneck which should be fixable; looks like something they didn't have time to adjust

I would imagine fixing a frontend bottleneck is far harder than fixing a backend bottleneck.
and minimal core bound. Intel is 40% backend bound and 20% core bound, which seems like the issue is more inherent?
> The cache is helping unexpectedly in some workloads like CB 2024.

TBH I don't think V-Cache in laptops is all that impressive.
Makes you wonder where else it could be working its magic. Phoronix needs to pit these laptops against each other in hundreds of benchmarks to give us the full picture.
This is also very impressive:
View attachment 128078
> Apart from Zen 5's already known bottlenecks as register file capacity and weird frontend overrides, it seems interesting the Op cache bandwidth not being utilized properly. Is that maximum throughput also oriented at SMT or something?

The Zen 5 BPU is weird. Wonder if more people will look into it.
The ICache size being a limiting factor could potentially see a fix by Zen 7.
> TBH I don't think V-Cache in laptops is all that impressive.

I think we missed an important part: with V-Cache we can lower CPU power, leaving more power for the GPU; and if the GPU is already maxed, lower total system power means less heat.
Checking 1080p for these laptops doesn't make much sense IMO. High-end laptops come with 1440p screens essentially at minimum, aside from 1080p ultra-high-refresh options, which aren't usually standard. Alienware does that, I think, as well as some MSI models IIRC.
And a 10% lead in 1440p 1% lows is undoubtedly nice, but not all that game changing.