Discussion: [Video] Ryzen 7 3800X 5 GHz vs. Core i9 9900K 5 GHz

tamz_msc

Diamond Member
Jan 5, 2017
3,710
3,554
136

Skylake wins in everything gaming except WoW 1% and 0.1% lows. This proves that the bottleneck in Zen 2 isn't frequency, but rather the memory subsystem. Hopefully Zen 3 addresses this, otherwise Intel would have nothing to worry about.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
These tests were run with Cascade Phase Change cooling under -90 degrees Celsius o_O

I saw Hardwarenumb3rs mention on Twitter a few days back that 5 GHz improved min-fps but the average almost didn't change. And that is despite running the 3800X at 5 GHz being a 10% ST increase from stock and a 20% MT increase.

Here are his AIDA64 scores:


Overall yeah, the main bottleneck for Ryzen is the memory subsystem, particularly the latency (and that is almost entirely due to the FCLK limit).

IMO AMD should do 2 things with Ryzen 4xxx series to improve the situation:

1. Significantly improve the FCLK ceiling. Hopefully they manage to get it to at least 2 GHz, though IMO 2.2 GHz would be preferred. That would allow it to run DDR4 4400 MHz with 1:1 config. What's probably as important, that would allow mobile APUs to run LPDDR4X 4266 MHz with 1:1 config. Besides, considering DDR5 starts with 4800 MHz (and goes to at least 8400 MHz), they need to redesign their memory subsystem anyway or allow much higher FCLK speeds. Running DDR5 at 1:2 would really hamper Zen 4.
2. Bring back AMP, which would also tune subtimings (similar to Ryzen DRAM Calculator), and ask reviewers to also review with this on (compared to XMP). While Ryzen is still small in OEM sales, it's big enough now in DIY to have some weight with mobo/memory-module companies to pull this off. Ideally AMP could even be automatically loaded with compatible modules (though this is maybe too much like MCE on Intel).
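To put rough numbers on the 1:1 point, here is a minimal Python sketch. It assumes that at 1:1 the FCLK has to match the memory clock, and that the memory clock is half the DDR data rate (two transfers per clock). The function names and the example ceilings are only illustrative, not anything AMD publishes.

```python
# Rough sketch, not an official formula: at 1:1 the fabric clock (FCLK)
# is assumed to equal the memory clock, which is half the DDR data rate.

def fclk_for_one_to_one(data_rate_mts: float) -> float:
    """FCLK in MHz needed to run a given DDR data rate (MT/s) at 1:1."""
    return data_rate_mts / 2


def runs_one_to_one(data_rate_mts: float, fclk_ceiling_mhz: float) -> bool:
    """True if the data rate fits under a (hypothetical) FCLK ceiling at 1:1."""
    return fclk_for_one_to_one(data_rate_mts) <= fclk_ceiling_mhz


if __name__ == "__main__":
    # Data rates taken from the discussion; ceilings are example values.
    for rate in (3800, 4400, 4800):
        needed = fclk_for_one_to_one(rate)
        for ceiling in (1900, 2200):
            verdict = "1:1 possible" if runs_one_to_one(rate, ceiling) else "needs 1:2"
            print(f"{rate} MT/s needs FCLK {needed:.0f} MHz; ceiling {ceiling} MHz: {verdict}")
```

Under these assumptions a 2.2 GHz ceiling covers 4400 at 1:1, but 4800 would still need a 2.4 GHz FCLK.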

Before the Zen 2 release, somebody at AMD at least mentioned they were "scoping" the idea of bringing back AMP. I really hope they weren't lazy and do follow through with the Zen 3 launch. Subtimings alone could mean a win or a loss in gaming vs. the i9 (and most review sites will not be tuning them manually).
 

tamz_msc

Diamond Member
Jan 5, 2017
3,710
3,554
136
Significantly improve the FCLK ceiling
I think it was Hardware Unboxed who did the testing which showed that Renoir in the Zephyrus G14, even with its on-die IMC, does much worse in terms of latency than Skylake, at 3200 MHz vs. 2666 MHz (most probably). Doesn't it run 1:1 at 3200 MHz? If so, there is a significant drawback to Ryzen's memory controller design compared to Intel's.
 

Thunder 57

Platinum Member
Aug 19, 2007
2,647
3,706
136

Skylake wins in everything gaming except WoW 1% and 0.1% lows. This proves that the bottleneck in Zen 2 isn't frequency, but rather the memory subsystem. Hopefully Zen 3 addresses this, otherwise Intel would have nothing to worry about.

CPUs are used for things other than games. But yes, it is no secret that Intel's inclusive L3 cache and ring bus benefit games. That will come to an end as core counts increase.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,710
3,554
136
CPUs are used for things other than games. But yes, it is no secret that Intel's inclusive L3 cache and ring bus benefit games. That will come to an end as core counts increase.
Cache and interconnect are only a part of the memory subsystem. I don't think that inclusive/exclusive L3 matters when Zen 2 has twice the L3. The interconnect isn't the problem either: despite the way IF works when accessing cores on different CCXes, I don't think the upcoming Ryzen 3 3100 and 3300X, with their 2:2 and 4:0 configurations, will show a performance difference that isn't explained by clock speed differences.

As it stands now the latency to access main memory is simply too high on Ryzen, which is the primary bottleneck in games. Games being a workload that involves significant memory access, Intel has nothing to worry about even with high core counts as long as they make their own improvements to mesh topology and use their superior integrated memory controller.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
I think it was Hardware Unboxed who did the testing which showed that Renoir in the Zephyrus G14, even with its on-die IMC, does much worse in terms of latency than Skylake, at 3200 MHz vs. 2666 MHz (most probably). Doesn't it run 1:1 at 3200 MHz? If so, there is a significant drawback to Ryzen's memory controller design compared to Intel's.
There is no doubt that the memory subsystem on Ryzen is a bottleneck and no FCLK improvement alone can remove it entirely. As seen in the video, even with custom timings the difference was 60ns vs 38.6ns, so Ryzen's latency is about 55% higher (put the other way, Intel's is about 36% lower). To me the Renoir results weren't that surprising, as the chiplet design shouldn't add more than ~5ns to the latency.
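The percentage for that latency gap depends on which number is taken as the baseline. A quick Python sketch of both ways of counting, assuming 60ns is the Ryzen figure and 38.6ns the Intel one (the helper names are made up for illustration):

```python
# Sketch: the same latency gap expressed against two different baselines.
# Assumption: 60 ns is the Ryzen result and 38.6 ns the Intel result.

def percent_higher(value: float, baseline: float) -> float:
    """How much higher `value` is than `baseline`, in percent of baseline."""
    return (value - baseline) / baseline * 100


def percent_lower(value: float, baseline: float) -> float:
    """How much lower `value` is than `baseline`, in percent of baseline."""
    return (baseline - value) / baseline * 100


if __name__ == "__main__":
    ryzen_ns, intel_ns = 60.0, 38.6
    print(f"Ryzen latency is {percent_higher(ryzen_ns, intel_ns):.1f}% higher than Intel's")
    print(f"Intel latency is {percent_lower(intel_ns, ryzen_ns):.1f}% lower than Ryzen's")
```

So the mid-30s figure only holds when the slower result is the baseline; measured against the faster one the gap is in the mid-50s.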

I took it as a given that the unified cache (32MB vs 2x16MB) will be a big boon for Zen 3 as well as other improvements in the core (hopefully non-statically partitioned uop cache, larger L1D cache, 1MB L2 cache, inclusive L3 along with the huge CCX redesign).
I still can't see them getting rid of a limiting overall fabric clock, so my point was that even with all those upcoming improvements they still probably:

1. Have to improve the FCLK significantly (>= 2.1 GHz)
2. Have to make it possible for the average Joe to receive these FCLK improvements with minimal fuss (ideally just by turning on your computer with a correct AMP module).

IMO not implementing AMP (despite mentioning it as an option in an interview) would be a huge missed opportunity. Just look at the difference between 3200 XMP and the custom timings in the video. Imagine if Zen 3 had some 4200-4400 MHz low-latency AMP modules in the review kit running at 1:1 (with reviewers asked to include those results along with stock results).
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136

Skylake wins in everything gaming except WoW 1% and 0.1% lows. This proves that the bottleneck in Zen 2 isn't frequency, but rather the memory subsystem. Hopefully Zen 3 addresses this, otherwise Intel would have nothing to worry about.
Yeah well... I'm playing FPS games competitively, so I'd trade average fps for better 0.1% lows ANYTIME.
Stutters at the worst times are your biggest enemies, not 10% lower avg fps.
 

amrnuke

Golden Member
Apr 24, 2019
1,181
1,772
136
...explained by clock speed differences.

As it stands now the latency to access main memory is simply too high on Ryzen, which is the primary bottleneck in games.
This is why the Zen 3 rumor of a shared L3 between all 6-8 cores on a chiplet is important: rather than 16MB per 4 cores, it will be 32MB per 8 cores (meaning that in lightly threaded workloads the effective shared L3 cache is doubled), which will decrease cache misses and make the FCLK limit less of an issue.

However, single and lightly threaded games/workloads aren't going away, and this change seems like an expensive way to kick the can down the road as L3 cache is not cheap from a real estate, power, and heat standpoint.

My initial thought was, as long as AMD don't correct what seems to be an obvious drawback with the memory subsystem, won't this still be an issue? However, it could be that the cost (from a real estate, power, and heat standpoint) of improving IF at this point is economically higher than boosting L3 cache, and perhaps the Zen 4 team is already working on memory subsystem improvements, hence the shared L3 is a good stop-gap.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
My initial thought was, as long as AMD don't correct what seems to be an obvious drawback with the memory subsystem, won't this still be an issue? However, it could be that the cost (from a real estate, power, and heat standpoint) of improving IF at this point is economically higher than boosting L3 cache, and perhaps the Zen 4 team is already working on memory subsystem improvements, hence the shared L3 is a good stop-gap.
We shall see. I'd be very surprised if Zen 3 has no memory-latency improvements. Moving to a shared L3 for 8 cores should remove a ton of coherency issues (particularly if it's also inclusive) for all CPUs that have up to 8 cores (previously all memory-related stuff needed to be synced between 2 CCXes, through the I/O die). IMO gaming is an important market, enough to justify optimizing at least all the CPUs up to 8 cores. It's not just the X series; all of the APUs would also benefit greatly from a simple inclusive L3 mode.

Inclusive L3 would not be enough for multi-chiplet CPUs (12+ cores) as these still need to coherently access the memory, but even there it should help somewhat (you only need to peek at the other L3, not any L2 caches).

Anyway, I would be very surprised if they redid their entire cache layout and got no memory latency improvement out of that (despite greatly simplifying things by adding unified caches).
 

piokos

Senior member
Nov 2, 2018
554
206
86
CPUs are used for things other than games.
But this comparison is about gaming so...? :)
But yes, it is no secret that Intel's inclusive L3 cache and ring bus benefit games. That will come to an end as core counts increase.
Skylake-X uses a mesh interconnect and I don't think it really made it lag behind similarly clocked ring bus chips.
 

piokos

Senior member
Nov 2, 2018
554
206
86
Skylake-X performance in games is notoriously lackluster.
Maybe it is, but the source you suggested hardly confirms that.
In 1080p it lost only to -K chips with much higher clocks and the i5-8400 (which performed unexpectedly well...).
The 4K test is GPU-limited and results are within error (~2%, ~1fps). 1440p isn't much better.

You said "notoriously lackluster" as if it were 10% slower. :)
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
This proves that the bottleneck in Zen 2 isn't frequency, but rather the memory subsystem.
I saw Hardwarenumb3rs mention on Twitter a few days back that 5 GHz improved min-fps but the average almost didn't change.
Another way to put it: It seems the latency was optimized for a very specific frequency range. Zen 2's memory subsystem wasn't planned to work at frequencies higher than Zen 2 was designed for.
 

gdansk

Golden Member
Feb 8, 2011
1,980
2,355
136
Am I wrong to conclude that over-clocking Zen 2 is a waste of effort? If pushing it all the way up to 5GHz barely helps, why mess around at all?
 

piokos

Senior member
Nov 2, 2018
554
206
86
Am I wrong to conclude that over-clocking Zen 2 is a waste of effort?
With the boosting mechanisms we have today (from both Intel and AMD) overclocking anything is a waste of effort. Unless of course you have some extreme cooling that creates headroom well beyond what boosting is programmed to use (like in the video...).
If pushing it all the way up to 5GHz barely helps, why mess around at all?
It may barely help in games, but you would see a decent gain in processing tasks (encoding, rendering, simulations).
I'm not saying it makes sense or that people will suddenly OC their workstations (they won't).

It's just that gaming is interactive, latency-dependent and extremely specific in general.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
Maybe it is, but the source you suggested hardly confirms that.
A kind reminder you made the original claim with zero evidence to back it up. If you disagree with my source, bring better data.

In 1080p it lost only to -K chips with much higher clocks and the i5-8400 (which performed unexpectedly well...).
I suggest you check the numbers again. The 7900X is a 10c/20t CPU and in 720p it loses by 5% to the 8350K, which is a 4c/4t CPU @ 4 GHz, and by 10% to the 8600K, which is a 6c/6t CPU @ 4.1-4.2 GHz in MT loads.

Even overclocked to 4.5 GHz it loses to the 8400 @ 3.9 GHz. It doesn't get any more lackluster than that.

You said "notoriously lackluster" as if it were 10% slower. :)
Yes, it is more than 10% slower. ;)
 

lightmanek

Senior member
Feb 19, 2017
387
754
136
Am I wrong to conclude that over-clocking Zen 2 is a waste of effort? If pushing it all the way up to 5GHz barely helps, why mess around at all?

If you had said "barely helps in games", then you might have a point, but across a wide spread of workloads in general your conclusion is not correct. Overclocking Zen 2 brings a roughly linear performance improvement in workloads like rendering, video encoding, decompression and many more. Most older game engines are very dependent on memory latency, but there are game engines which would still scale on Zen 2 despite the IF architecture.
For next-gen game engines, designed with the Zen 2 CPU cores inside consoles in mind, I think the bottleneck will be largely mitigated by developers. Let's not forget, Intel processors are moving away from the ring bus on the desktop as well in the near future.
 

GoodRevrnd

Diamond Member
Dec 27, 2001
6,803
581
126
Another way to put it: It seems the latency was optimized for a very specific frequency range. Zen 2's memory subsystem wasn't planned to work at frequencies higher than Zen 2 was designed for.
Wasn't this established pretty quickly after the processors came out? Using memory outside the 1:1 FCLK range requires a massive overclock to offset the latency penalty and exceed the throughput of lower speeds at 1:1, so it's rarely worth it. And in games it may not be possible to recoup the latency loss. So basically you're stuck w/ 3800 memory on a 1900 FCLK if the chip can handle it (or possibly a few ticks higher).
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
I think it was Hardware Unboxed who did the testing which showed that Renoir in the Zephyrus G14, even with its on-die IMC, does much worse in terms of latency than Skylake, at 3200 MHz vs. 2666 MHz (most probably). Doesn't it run 1:1 at 3200 MHz? If so, there is a significant drawback to Ryzen's memory controller design compared to Intel's.
Mobile chips are given JEDEC-rated memory. Latencies are the same for both sticks of memory; the higher frequency means nothing. You're comparing DDR4-2666 CL16 (or was it 18, I forget) to DDR4-3200 CL22.
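A small Python sketch of why the higher data rate buys nothing here, using the usual conversion from CAS cycles to nanoseconds (CL x 2000 / data rate in MT/s). The CL values are the ones recalled above, not verified kit specs, and CAS latency is only one part of the total memory latency a benchmark reports.

```python
# Sketch: convert CAS latency from clock cycles to nanoseconds.
# The memory clock is half the data rate, so one cycle lasts
# 2000 / data_rate_mts nanoseconds.

def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """First-word CAS latency in ns for a given CL and DDR data rate (MT/s)."""
    return cl_cycles * 2000 / data_rate_mts


if __name__ == "__main__":
    # CL values as recalled in the post; treat them as assumptions.
    kits = [
        ("DDR4-2666 CL16", 16, 2666),
        ("DDR4-2666 CL18", 18, 2666),
        ("DDR4-3200 CL22", 22, 3200),
    ]
    for name, cl, rate in kits:
        print(f"{name}: {cas_latency_ns(cl, rate):.2f} ns")
```

With CL18 the 2666 kit lands within a fraction of a nanosecond of the 3200 CL22 kit, so the kits themselves contribute about the same delay.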
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
Wasn't this established pretty quickly after the processors came out? Using memory outside the 1:1 FCLK range requires a massive overclock to offset the latency penalty and exceed the throughput of lower speeds at 1:1, so it's rarely worth it. And in games it may not be possible to recoup the latency loss. So basically you're stuck w/ 3800 memory on a 1900 FCLK if the chip can handle it (or possibly a few ticks higher).
In a way, yes. It was quickly clear that Zen 2 has no real headroom for OC to speak of, as at stock it already made good use of all the available room. Auto FCLK switches to 1:2 over 3800, and manually enforcing 1:1 there didn't earn any significant boost. Now we see that increasing frequencies above Zen 2's normal range doesn't improve latency-sensitive applications (in this case games). So all the different bottlenecks in Zen 2 appear to be pretty well optimized for each other; trying to go beyond in one area gets you stuck in one or some of the others.

It may well be the case that this will be the biggest difference in Zen 3: a whole new set of bottlenecks that are balanced/optimized in completely different ways than they have been in Zen 2. Though I don't expect any more headroom flexibility; the precedent of essentially maxing out nearly everything at stock has already been set, and I doubt AMD will move away from that.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,710
3,554
136
Mobile chips are given JEDEC-rated memory. Latencies are the same for both sticks of memory; the higher frequency means nothing. You're comparing DDR4-2666 CL16 (or was it 18, I forget) to DDR4-3200 CL22.
Actually, the Intel chips have superior latency according to the Hardware Unboxed testing.


The other is the memory system. Yes, AMD offers higher memory bandwidth than Intel with the move to support DDR4-3200 speeds, Intel only offers DDR4-2666 with 9th-gen. In a benchmark like Sandra we see AMD providing around 35% more memory bandwidth. However, Ryzen 4000 appears to have inferior memory latency. This, like cache size can become a performance constraint. The Core i9-9880H has memory latency around 30ns for data sets above 32MB in size, while the Ryzen 9 4900HS has 46ns memory latency. That’s a substantial win for Intel.
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Actually, the Intel chips have superior latency according to the Hardware Unboxed testing.


You're correct, they do have superior memory latency.

But that's not my point.

My point is that the memory latency contributed by the memory kits themselves is identical between the DDR4-2666 kits on the Intel side and the DDR4-3200 kits on the AMD side, as they both run at JEDEC speeds.

Renoir systems sporting higher memory latency is purely down to the difference in architecture, and Intel does not lose anything by using DDR4-2666 memory in terms of latency compared to AMD systems. Bandwidth is another issue altogether, but most games are latency bound, not bandwidth bound.
 

GoodRevrnd

Diamond Member
Dec 27, 2001
6,803
581
126
Interesting, so this wasn't even optimal Intel memory performance either. Isn't the sweet spot usually somewhere in the 4200-4600 range depending on what timings you can hit? I'd like to see results from an aircooled 9900K mid-range OC vs an aircooled 3800X mid-range OC with optimal typical mem OCs on each.