Speculation: Ryzen 4000 series/Zen 3


Ajay

Lifer
Jan 8, 2001
16,094
8,109
136
But if they increased load and store width and queue depths, one of the biggest SMT bottlenecks will be relieved. So I wouldn't be surprised to see SMT yield increase or stay the same. I also wouldn't be surprised to see memory bandwidth, the IO die, etc. become bottlenecks when scaling workloads.

The interesting rumors are that AMD is doing both 12nm and 7nm IODs for EPYC; if so, that would be a very interesting comparison and give good insight into what Warhol/DDR5 might bring.
Well, then we have another problem if, in fact, IO bottlenecks the load/store improvements - that, and memory bandwidth didn't change. Like I said, I look forward to Ian's deep dive. The real-world thread on the Zen 3 release (not the current thread) should be pretty epic.
 
  • Like
Reactions: lightmanek

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136

Zen 3 on GB5.

Noticeable points. Compared to the fastest 1185G7 on Windows here (unfortunately, there are no TGL-U benches with 5.1.1 like the 5900X here, so this will have to do for now):


The 5900X falls 5 points below in the averaged-out single-threaded score whilst clocking between 4.775GHz and 4.95GHz. Looking at the score breakdowns, the 5900X loses heavily in crypto (2757 vs 4095), the two effectively tie in the integer workloads (1409 vs 1405), and the 5900X takes a noticeable lead in FP workloads (1837 vs 1640).

The 5950X run is using 5.2.3, but my overall talking points remain largely the same. The 5950X loses some points, scoring 2707 in crypto, 1400 in integer and 1764 in floating point, but the comparisons vs the 1185G7 otherwise remain the same: a heavy loss in crypto, virtually the same score in integer, and a lead in floating point.
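As an aside on how those three subsections roll up into the headline single-threaded number: a minimal sketch, assuming Geekbench 5 combines the section scores as a weighted arithmetic mean with the weights Primate Labs documents (crypto 5%, integer 65%, floating point 30%). With the scores quoted above, that assumption happens to reproduce the ~5-point gap:

```python
# Sketch: reconstructing the GB5 single-core composite from the three
# subsection scores. Weighted arithmetic mean with assumed section
# weights of 5% crypto, 65% integer, 30% floating point.
WEIGHTS = {"crypto": 0.05, "integer": 0.65, "fp": 0.30}

def gb5_composite(scores):
    """Weighted arithmetic mean of the subsection scores."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

r5900x = gb5_composite({"crypto": 2757, "integer": 1409, "fp": 1837})
tgl = gb5_composite({"crypto": 4095, "integer": 1405, "fp": 1640})
print(round(r5900x), round(tgl))  # ~1605 vs ~1610: the ~5-point gap
```

With only a 5% weight, even the heavy crypto loss barely moves the composite, which is why the 5900X still lands within a few points despite the 2757-vs-4095 blowout.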
Noticeable points: Geekbench is a HORRIFICALLY TRASH benchmark, and has been for years; I don't really care how unpopular I'm going to be with this opinion of mine. They should team up with UserBenchmark, since they are both consistent in being confidently representative of everything but real-world performance.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
This tends to happen when the mispredict rate goes down (fewer pipeline flushes). There are fewer stalls, so threads compete more directly for resources. Net throughput goes up, but the gains from SMT go down. Reduced memory/cache latency would also reduce thread stalls (memory waits). There could be other reasons; once Ian gets to do a deep dive on Zen 3, we'll get a better idea.

The addition of an ALU might offset that effect. We may end up with a similar or possibly even higher SMT yield.

However, the unified L3 is going to work to the advantage of single-thread performance, thus having a negative effect on the SMT yield. For benchmarks that are not affected by a very large cache, the yield could still rise.
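A toy model of that effect (my own illustration, nothing from AMD's disclosures): if a single thread can keep fraction U of the core's issue slots busy, two independent threads together can use at most min(2U, 1), so the SMT yield shrinks as per-thread utilization improves:

```python
# Toy model: if one thread sustains fraction U of a core's peak issue
# throughput, two independent threads together sustain min(2U, 1.0).
# SMT yield = combined throughput / single-thread throughput - 1.
def smt_yield(u):
    return min(2 * u, 1.0) / u - 1

for u in (0.5, 0.6, 0.7, 0.8):
    print(f"U = {u:.1f} -> SMT yield = {smt_yield(u):+.0%}")
# 0.5 -> +100%, 0.6 -> +67%, 0.7 -> +43%, 0.8 -> +25%:
# the better a single thread already fills the core, the less SMT adds.
```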
 

LightningZ71

Golden Member
Mar 10, 2017
1,793
2,151
136
The unified L3 shouldn’t have a notable effect on the SMT gains percentage. Remember that SMT isn’t just about hiding latency due to thread stalls. It’s also about increasing core throughput by being able to dispatch micro-ops from two threads simultaneously. While a single thread stalling on a cache miss will free up execution resources for the second thread, that only really helps if the second thread actually needs those resources during that timeframe.

SMT likes wide cores. SMT throughput shouldn’t change a whole lot if the core hasn’t changed effective width. The only thing that MIGHT give somewhat of a hit to SMT performance is a deliberate retuning of the dispatch logic to favor the primary thread over the secondary thread. That’s a very general way of saying that tuning can be done to increase single-thread throughput at the expense of multithreaded performance. Given AMD’s improvements to the processors as stated, they may have felt it was worth the trade-off given the competition.
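A minimal sketch of what such a retuning could look like (purely illustrative; the dispatch width, bias ratio, and arbitration scheme are all my assumptions, not AMD's actual logic): each cycle a fixed number of dispatch slots is split between two ready threads, with the arbiter handing thread 0 the larger share:

```python
# Illustrative biased dispatch arbiter: `slots` dispatch slots per cycle
# are split between two ready threads. bias = 0.5 is an even split;
# higher bias favors thread 0 (the "primary" thread).
def simulate(cycles, slots=6, bias=0.5):
    dispatched = [0, 0]
    for _ in range(cycles):
        t0 = round(slots * bias)       # slots granted to thread 0
        dispatched[0] += t0
        dispatched[1] += slots - t0    # remainder goes to thread 1
    return dispatched

print(simulate(1000))                  # even split:  [3000, 3000]
print(simulate(1000, bias=2 / 3))      # biased:      [4000, 2000]
# Total throughput is unchanged in this toy; in a real core the bias
# trades secondary-thread progress for primary-thread latency whenever
# both threads contend for the same slots.
```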
 
  • Like
Reactions: Tlh97 and amd6502

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,620
136
Noticeable points: Geekbench is a HORRIFICALLY TRASH benchmark, and has been for years; I don't really care how unpopular I'm going to be with this opinion of mine. They should team up with UserBenchmark, since they are both consistent in being confidently representative of everything but real-world performance.

Agreed.
 

Bigos

Member
Jun 2, 2019
151
367
136
SMT likes wide cores. SMT throughput shouldn’t change a whole lot if the core hasn’t changed effective width. The only thing that MIGHT give somewhat of a hit to SMT performance is a deliberate retuning of the dispatch logic to favor the primary thread over the secondary thread. That’s a very general way of saying that tuning can be done to increase single-thread throughput at the expense of multithreaded performance. Given AMD’s improvements to the processors as stated, they may have felt it was worth the trade-off given the competition.

There is no such thing as a "primary thread" or a "secondary thread". Both threads in an SMT-enabled core are equal. When only one runs, it takes almost all of the core's resources (some might still be reserved for the other thread, depending on the implementation). When both run, their individual throughput is reduced, but if each can run at over a 50% rate you will see a net performance gain. E.g. if each runs at 60%, you will see a +20% SMT gain (this doesn't take into account the scalability of the given workload).
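That arithmetic generalizes directly; a trivial helper just to make the break-even point explicit (my framing of the numbers above, nothing more):

```python
# If each of the two hardware threads runs at `rate` times its
# single-thread speed, combined throughput is 2 * rate, so the SMT
# gain over running one thread alone is 2 * rate - 1.
def smt_gain(rate):
    return 2 * rate - 1

print(f"{smt_gain(0.60):+.0%}")  # +20%, the example above
print(f"{smt_gain(0.50):+.0%}")  # +0%: the break-even point
print(f"{smt_gain(0.45):+.0%}")  # -10%: below 50%, SMT is a net loss
```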

As you already mentioned, the unified cache doesn't benefit multi-threaded workloads when each thread accesses its own set of data, showing a smaller MT gain than ST gain with Zen 3. We might also be seeing insufficient memory bandwidth being a limiter in some situations, though that would favor the 5600X and 5800X over the 5900X and 5950X in MT gain. The reviews should tell us.
 

jeanlain

Member
Oct 26, 2020
159
136
86
Noticeable points: Geekbench is a HORRIFICALLY TRASH benchmark, and has been for years; I don't really care how unpopular I'm going to be with this opinion of mine. They should team up with UserBenchmark, since they are both consistent in being confidently representative of everything but real-world performance.
Is Geekbench trash, or are tests simply not conducted in controlled conditions? Since anyone can post their results, you will find weird numbers in the database. Some people launch Geekbench while other tasks are running, for instance.
This analysis shows that Geekbench integer results correlate extremely well with SPEC_INT (2006 and 2017) results. This is not a fully-fledged research paper, but this analysis is more thorough than most of what can be found online.
Of course, results from a particular app like Blender may not reflect the results of synthetic benchmark tools, but does this indicate a problem with these tools? Since synthetic benchmark tools average results obtained from an array of "real-world" tasks, they cannot fully represent any particular scenario, but they are certainly more representative of overall CPU performance than any particular app.
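For what it's worth, anyone can sanity-check a correlation claim like that; a minimal sketch with hypothetical score pairs (the numbers below are placeholders for illustration, not data from the analysis):

```python
# Pearson correlation between per-CPU Geekbench and SPEC scores.
# The score pairs below are hypothetical placeholders.
from statistics import correlation  # Python 3.10+

gb5_int = [1100, 1250, 1400, 1520, 1680]   # GB5 integer scores
spec_int = [38.0, 43.5, 48.0, 52.5, 58.5]  # SPECint-style scores

r = correlation(gb5_int, spec_int)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")   # high R^2 = strong linear fit
```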
 

coercitiv

Diamond Member
Jan 24, 2014
6,631
14,066
136
This analysis shows that Geekbench integer results correlate extremely well with SPEC_INT (2006 and 2017) results. This is not a fully-fledged research paper, but this analysis is more thorough than most of what can be found online.
The same analysis points out that the correlation between GB and SPEC is not an inherent property and can break under a number of scenarios. It essentially reinforces what many object to when it comes to GB performance estimates across platforms: controlled testing is mandatory to ensure correlation between GB and established industry benchmarks. And yet "controlled testing" is essentially the opposite of what GB offers: testing for ALL!

The irony of all this is that some supporters of the benchmark acknowledge this fault only to profit from it even more through selective reporting of results. We've seen this tactic in full bloom on the forums, and one may argue we can see it to a lesser degree in Nuvia's marketing. Even if we were to give Nuvia the full benefit of the doubt, using an arguably less reliable benchmark to convey their message is a bad idea when the message itself is under intense scrutiny. If you want to claim absolute performance and perf/watt supremacy with no working demo in hand, the least you can do is use estimates for industry-standard benchmarks. Why reinvent the wheel to sell a new steed?
 

jeanlain

Member
Oct 26, 2020
159
136
86
If you want to claim absolute performance and perf/watt supremacy with no working demo in hand, the least you can do is use estimates for industry-standard benchmarks. Why reinvent the wheel to sell a new steed?
Using SPEC on mobile platforms (had Nuvia wanted to include Apple's and Qualcomm's) is significantly harder.
But that's not the point. We're not discussing Nuvia's decision, we're discussing whether Geekbench is intrinsically flawed. I haven't seen clear evidence that it is. Weird scores can result from poor testing procedures, and differences between workloads are expected. Geekbench is not supposed to reflect Cinebench, and neither is SPEC.
 

coercitiv

Diamond Member
Jan 24, 2014
6,631
14,066
136
We're not discussing Nuvia's decision, we're discussing whether Geekbench is intrinsically flawed. I haven't seen clear evidence that it is.
A minute ago you were willing to submit Nuvia's analysis as proof; now you place the burden of proof on the opposing side.

Using SPEC on mobile platforms (had Nuvia wanted to include Apple's and Qualcomm's) is significantly harder.
They're trying to convince the world they're about to shake the entire computing industry. "Significantly harder" should be the norm for them.
 

jeanlain

Member
Oct 26, 2020
159
136
86
A minute ago you were willing to submit Nuvia's analysis as proof; now you place the burden of proof on the opposing side.
I questioned the claim about Geekbench being trash. It isn't my job, nor anyone else's, to provide proof that Geekbench is not trash. No one can, as it is impossible to exclude the existence of some defect, however small. IOW, "Geekbench is trash" is not a workable null hypothesis that can/must be disproven. One should instead provide evidence contradicting the null hypothesis that Geekbench is not trash. Individual Geekbench results found on the web do not constitute convincing evidence, nor do contradictions with results from particular apps. At best, these may show that some tests are poorly conducted and that Geekbench's algorithms are not representative of any single use case. Since Primate Labs never claimed that Geekbench should correct for user errors or be representative of any particular workload, I'm not yet convinced that their tool is trash.

Nuvia's analysis and future products are not my main interest. I still find your claim of an absence of correlation between SPEC and Geekbench scores outlandish. Again, can you clarify?
 

coercitiv

Diamond Member
Jan 24, 2014
6,631
14,066
136
Nuvia's analysis and future products are not my main interest. I still find your claim of an absence of correlation between SPEC and Geekbench scores outlandish. Again, can you clarify?
From the document itself:
While this observation is interesting from a benchmarking standpoint, Geekbench is generally less demanding of the micro-architecture than SPEC CPU is. For a subset of the micro-architectural features, Figure 3 shows the relative metric value for CPU2006 and CPU2017 normalized to a baseline of 1.0 for Geekbench 5. These were generated from detailed performance simulations of a modern CPU. It shows that the branch mispredicts and data cache (D-Cache), data TLB (D-TLB) misses are 1.1x — 2x higher in SPEC CPU compared to that seen in Geekbench 5. For this reason, chip architects tend to study a wide variety of benchmarks including SPEC CPU and Geekbench (among many others) to optimize the architecture for performance.

[Figure 3: branch mispredict, D-Cache miss and D-TLB miss rates for SPEC CPU2006 and CPU2017, normalized to Geekbench 5 (gb5-mis.jpg)]

It is important to note that the observed correlation is not a fundamental property and can break under several scenarios.

One example is thermal effects. Geekbench typically runs quickly (in minutes) and especially so in our testing where the default workload gaps are removed, whereas SPEC CPU typically runs for hours. The net effect of this is that Geekbench 5 may achieve a higher average frequency because it is able to exploit the system’s thermal mass due to its short runtime. However SPEC CPU will be governed by the long term power dissipation capability of the system due to its long run-time. This is something to watch out for when applying such correlation techniques to systems that see significant thermal throttling or power-capping while running these benchmarks.

Another scenario where the correlation can break is non-linear jumps in performance that one benchmark suite sees but not the other. The interplay between the active data foot-print of a test and the CPU caches is a classic source of such non-linearities. For example, a future CPU’s cache may be large enough that many sub-tests of one benchmark suite may fully fit in cache boosting performance many fold. However, the other benchmark suite may not see such a benefit if none of its tests fit in cache. In such cases, the correlation will not hold.
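That last failure mode is easy to demonstrate on any machine; a rough sketch (my own illustration, array sizes picked arbitrarily, and Python's interpreter overhead mutes the effect) timing the same random-access work as the working set outgrows the caches:

```python
# Rough demo of the cache-footprint non-linearity described above:
# time the same random-access workload at growing working-set sizes.
# Per-access time typically jumps once the array stops fitting in cache.
import random
import time

def time_per_access(n_elements, accesses=2_000_000):
    data = list(range(n_elements))
    idx = [random.randrange(n_elements) for _ in range(accesses)]
    start = time.perf_counter()
    total = 0
    for i in idx:
        total += data[i]          # random reads across the working set
    return (time.perf_counter() - start) / accesses

for n in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"{n:>11,} elements: {time_per_access(n) * 1e9:6.1f} ns/access")
```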

 
  • Like
Reactions: Tlh97 and Saylick

jeanlain

Member
Oct 26, 2020
159
136
86
@coercitiv, Nuvia's statement is directed at those who'd want to extrapolate from their results, as is commonly done in a discussion section. It's good practice, and I think they remain cautious, perhaps because they were criticised for their previous blog post.
When they say that "the correlation will not hold", they certainly mean that it could be lower than what they observed. I don't see how you'd get an R^2 of zero in any realistic scenario.
If the CPU is downclocked due to overheating in SPEC and not in Geekbench, then sure, the correlation will decrease. I don't see that as an issue, as I don't think Geekbench should take overheating into account in its score.