AMD Ryzen 2000 (12nm Zen+) expectations


Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
But there is a difference between a game loves low latency. Which has been the case since I had PC100 with CAS1 back in the day. But it doesn't get any better any faster than Kaby Lake. If it was the reason, it should outpace the gains by a pretty large margin considering the starting point. It also doesn't explain why SL-X takes a Ryzen-like drop when compared to SL, KBL or BW-E.
 

gOJDO_n

Member
Nov 13, 2017
32
7
36
Because IF is inherent to their whole development process from the very beginning. It's how inter-CCX communication is handled, it's how intra-CCX communication is handled, it's how cross-die communication is handled, it's how cross-socket communication is handled. It's how GCU communication is handled, it's how GPU memory communication is handled. It's how GPU-to-GPU and GPU-to-CPU communication is handled. AMD isn't going to change that on a whim. But just look at their GPUs: an increase in width and faster RAM. That's how AMD got it to scale to ~500GB/s. Its link to memory is part of its standard now, and expecting AMD to change it because a bunch of guys on the Internet are absolutely sure that it is why games don't run as well as they think they should is a pipe dream. If anything AMD will make the pathway wider on future Zen archs. Won't be Zen+ and probably not Zen 2. But if AMD could or was willing to change the ratio or just plain set a speed, Zen and Vega wouldn't have the exact same policy. AMD made a judgement call when developing IF: they found a bandwidth that was acceptable for CPUs and hit their goals on Vega, and went with that. Otherwise you increase the complexity and decrease yields on Vega just to increase the performance on Zen, and again that assumes what we see is caused by IF and not, for example, the L3 system they are using. You know, the same one Intel adopted on SL-X that sees a similar drop in performance compared to expectations.
Sure it is. But people act like it's the end of the world if it's not absolutely the end all be all in all avenues. AMD had goals and performance markers they needed to hit. This includes things like power usage, yields, and the ability for IF to work on every layer. Like all CPU designs, that means trade-offs and compromises. Memory and core to core latency were obviously part of that, and their impact on the big picture probably isn't that large.

This isn't about AMD saying gaming is unimportant. They aren't going to ignore us (though I think it would be better if they did). But they aren't going to prioritize that if it threatens their long term goals.

I expect AMD will work hard on tweaking what little they are going to touch on Zen+. I expect some decent arch changes going into Zen 2. But I doubt any of those is going to deal with IF outside of increasing the width in the die. Not unless they are ready to change out all the impacted systems at the same time. That means a new die for Ryzen, a new die for APUs, a new arch for video.
That is silly though. We know Ryzen doesn't have the memory latency an i7 has, so that means that it's Ryzen's issue? It also doesn't have the L3 configuration. It also doesn't have a ring bus. Hell, its CPUID isn't GenuineIntel, maybe that is the reason? I am all for figuring out what is causing the performance rift and having AMD fix that as long as it doesn't hurt them in another way. But this "I know what the answer is even though I have never built a CPU, or developed a game, I just saw a difference (and I'll admit there is a large amount of people that agree) and used that as fact" attitude is bothersome because it could be blinding people from the real issue.

I haven't checked out as many Coffee Lake reviews as I probably should. I do know Kaby Lake, within about 2% or so, saw the same increases in performance as Ryzen when using faster memory. I don't know, but have no reason to believe, that Coffee Lake, as KBL with 2 extra cores, would behave any differently. As for your link, I am not going to watch the whole thing (this reviewer isn't my style), but I skimmed through the video and didn't find a single test with the same CPU using 2 speeds of memory. Can you give me the time stamp of when he did?
I was quite busy for the weekend, so excuse my late reply to your post.
First, let's clarify what DataFabric is and how it works. It is a set of multiple coherent point-to-multipoint HyperTransport links connecting all the I/O, DRAM and CCX L3 caches in the system. According to the current standard, HTX 3.1, the rated speed of the HT 3.1 links is 6400MT/s or 3200MHz. So such DF speeds are not a question of possibility, but a question of decision. Since the same HT links are used as PCI-e links, and since PCI-e 4.0 requires 2x the clock of PCI-e 3.0, the HT links of the DF have to be clocked double, which translates to doubling the IF throughput and reducing (by half?) the access latency.
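As a rough sketch of the arithmetic in the paragraph above: on a double-data-rate link the transfer rate in MT/s is twice the clock in MHz, so 6400MT/s corresponds to 3200MHz, and DDR4-3200 corresponds to a 1600MHz memory clock. The function names are hypothetical, and the linear scaling in `scaled_throughput` is only the assumption this post makes (double the clock at the same width, double the throughput), not a measurement.

```python
def ddr_clock_mhz(transfer_rate_mts: float) -> float:
    """Clock in MHz of a double-data-rate link, given its transfer rate in MT/s."""
    # Two transfers happen per clock cycle on a DDR link.
    return transfer_rate_mts / 2.0


def scaled_throughput(base_throughput: float, clock_ratio: float) -> float:
    """Assumption: throughput scales linearly with link clock at a fixed link width."""
    return base_throughput * clock_ratio


if __name__ == "__main__":
    print(ddr_clock_mhz(6400))          # rated HT 3.1 link speed quoted above
    print(ddr_clock_mhz(3200))          # DDR4-3200 memory clock
    print(scaled_throughput(1.0, 2.0))  # relative throughput with a doubled clock
```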

The logic of IF functioning will remain the same, the way the CCXs, I/O and DRAM are connected will remain the same, everything will remain the same! Just the DF clock will be doubled and its operating voltage increased. In addition, minor changes will be made for PCI-e 4 support.

As for the memory latency: one of the major differences between the KBL/CFL and Ryzen microarchitectures is that SKL/KBL/CFL have a single L3 cache, where all the cores communicate directly through the L3 at its full speed and low latencies, while on Ryzen it is separated into multiple CCXs which are connected via IF (which adds latency) and a very slow HT link (at 10% of the potential inter-L3 core bandwidth). The CCXs share that low IF bandwidth for all the L3-to-L3, I/O and DRAM communication, where the DRAM alone can fully saturate the HT link to the CCX.

So the point is that Ryzen takes two advantages from the faster DRAM (and DF) clock:
1) It gives more RAM bandwidth to cores
2) It gives more L3 to L3 bandwidth and reduces the absolute access latency

That's why Ryzen enjoys more performance improvement from faster RAM than SKL/KBL/CFL.
Back in the days, I did some performance testing and comparison on Skylake 6700K and different DDR4 modules (from 2133 to 3733): https://www.it.mk/z170-ddr4-ram/
I came to the conclusion that there is no point in buying DDR4 faster than 3000MHz, even with an OC-ed CPU, because it gives no performance improvement over 3000MHz CL16 DDR4.

I can't say the same for Ryzen TR 1950X. It gains performance almost linearly from 2400 to above 3200MHz DDR4.
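For the DRAM side of the latency question, here is a minimal sketch of how CAS latency in cycles converts to nanoseconds at a given DDR4 speed. It covers only the module's own CAS delay, not the fabric or memory controller latency that a tool like AIDA64 reports. The function name is hypothetical, and the DDR4-3200 CL14 kit is an illustrative assumption, not one of the kits from the linked test.

```python
def cas_latency_ns(cas_cycles: int, transfer_rate_mts: float) -> float:
    """First-word CAS latency in nanoseconds for a DDR module.

    The memory clock in MHz is half the transfer rate in MT/s,
    so one clock cycle lasts 2000 / transfer_rate_mts nanoseconds.
    """
    return cas_cycles * 2000.0 / transfer_rate_mts


if __name__ == "__main__":
    # The DDR4-3000 CL16 configuration mentioned above.
    print(round(cas_latency_ns(16, 3000), 2))
    # Hypothetical faster kit, for comparison only.
    print(round(cas_latency_ns(14, 3200), 2))
```

The gap between two kits in CAS terms is a couple of nanoseconds, which is small next to total measured memory latency, so this alone does not settle where the rest of the latency comes from.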
 
  • Like
Reactions: CatMerc

Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
That's why Ryzen enjoys more performance improvement from faster RAM than SKL/KBL/CFL.

But that isn't the case, and Ryzen starts capping out at roughly 3-3.2GHz as well. I get the points. I understand how IF works on Ryzen. I get the napkin math. But it doesn't hold under scrutiny, and SL-X gives us a roadmap on this issue. The core to core latency is higher on average, but its memory latency is lower, its IPC is higher, it clocks higher, it uses a similar cache configuration. It's better than Ryzen but takes a large dive comparable to the same difference SL-X sees in drop from productivity to games. It makes sense. There's no longer a pool of L3 cache for CPUs to grab info from; now it has to go core to core or call up information from memory. An IF speed increase will help mitigate the issue, but as long as Intel has a consumer die built the old way, Ryzen is going to stay behind.

The real answer is that game developers, from this point on, with HEDT becoming more attainable and Ryzen as a consumer part, start developing games to utilize the new cache structure better.
 

gOJDO_n

Member
Nov 13, 2017
32
7
36
SKL-X has power and heat problems. I expect a 16 core Ryzen+ ThreadRipper @4.2GHz on all cores and 4.4GHz boost on four cores, with (the updated) IF2 and quad-channel DDR4-3200, to at least match the performance of (the same clocked) 7960X on average while consuming less energy.

As for the optimizations, I agree that developers have to optimize the code for CPU architectures like SKL-X and Ryzen, which in general are more workstation/server oriented than desktop. Anyway, with or without code optimizations, games will benefit from faster IF on Ryzen+.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,149
136
But there is a difference between a game loves low latency. Which has been the case since I had PC100 with CAS1 back in the day. But it doesn't get any better any faster than Kaby Lake. If it was the reason, it should outpace the gains by a pretty large margin considering the starting point. It also doesn't explain why SL-X takes a Ryzen-like drop when compared to SL, KBL or BW-E.
It doesn't gain the same as Ryzen, and even if it did that means nothing. We established that games are latency sensitive, and Ryzen at equal RAM settings gets much higher latency. That's all that matters.

Oh and Skylake-X? It too has increased latency. It's between Ryzen and Kaby, which is where it is in gaming.
 
  • Like
Reactions: coercitiv

gOJDO_n

Member
Nov 13, 2017
32
7
36
As for the KBL vs Ryzen IPC, I find this comparison (Great Job @Zolkorn) very interesting, especially this slide:
[Image: aida64-membw.png]

Ryzen memory latency is huge...
 
  • Like
Reactions: DarthKyrie

SpaceBeer

Senior member
Apr 2, 2016
307
100
116
Yes, but that is not so common. Most 1080p players use a 1080 Ti, so in more than 90% of gaming PCs the CPU is the bottleneck. Only a few of those with GTX 1060/RX 580 and slower cards (GTX 900, RX 300 and older) might have a GPU bottleneck. Oh, wait...
 
Aug 11, 2008
10,451
642
126
You might be surprised, but there are games that run better on Ryzen clock per clock.
Such as? I didn't see any in that very limited test just linked, unless you are trying to say 135 is faster than 132 in BF 1, which is within the margin of error of measurement, most likely. Not to mention you have to overclock Ryzen and underclock Intel to get them at the same clockspeed.
 

gOJDO_n

Member
Nov 13, 2017
32
7
36
Such as? I didn't see any in that very limited test just linked, unless you are trying to say 135 is faster than 132 in BF 1, which is within the margin of error of measurement, most likely. Not to mention you have to overclock Ryzen and underclock Intel to get them at the same clockspeed.
Civ VI

 

Rifter

Lifer
Oct 9, 1999
11,522
751
126
I expect 0-5% IPC gains and 10-15% clockspeed gains. With bigger gains to come from Zen 2, Zen+ IMO will just be a small improvement. I do expect it to have much better RAM support out of the box though, which will help out a lot of new Ryzen users.
 

Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
It doesn't gain the same as Ryzen, and even if it did that means nothing. We established that games are latency sensitive, and Ryzen at equal RAM settings get much higher latency. That's all that matters.

Oh and Skylake-X? It too has increased latency. It's between Ryzen and Kaby, which is where it is in gaming.
It does. Because again, this has been a point brought up from the very beginning. I even linked a test earlier, and the gains that Ryzen gets per step up in memory clock are very, very similar to the gains that Kaby Lake got.

SL-X has a higher latency than Kaby Lake, but not by much. Certainly nowhere near as high as Ryzen.

Still, none of this feeds back to the IF being at fault. Maybe a better memory controller helps. I still like my theory that the cache structure is at fault, and that it's not something I think we or AMD want to "fix".
 

unseenmorbidity

Golden Member
Nov 27, 2016
1,395
967
96
I expect 0-5% IPC gains and 10-15% clockspeed gains. With bigger gains to come from Zen 2, Zen+ IMO will just be a small improvement. I do expect it to have much better RAM support out of the box though, which will help out a lot of new Ryzen users.
If there is even a 1% IPC gain, then I'll dance a lil jig.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
I don't expect that there will be any changes to IPC in Pinnacle Ridge, unless the current silicon revision still contains errata which deteriorates the IPC. Based on Raven's performance it is extremely unlikely that it does.

The Fmax (both factory and end-user side) has to improve by >= 15% compared to current die revision, however I expect ~5% improvement at best. Unless of course AMD found some ways to tune the design itself, so that they can juice even more out of the limited process.
GlobalFoundries says "10% higher performance (at ISO power) compared to 14nm LPP", however it is impossible to say if the actual Fmax is higher than on 14nm LPP based on that.
 

Magic Hate Ball

Senior member
Feb 2, 2017
290
250
96
I don't expect that there will be any changes to IPC in Pinnacle Ridge, unless the current silicon revision still contains errata which deteriorates the IPC. Based on Raven's performance it is extremely unlikely that it does.

The Fmax (both factory and end-user side) has to improve by >= 15% compared to current die revision, however I expect ~5% improvement at best. Unless of course AMD found some ways to tune the design itself, so that they can juice even more out of the limited process.
GlobalFoundries says "10% higher performance (at ISO power) compared to 14nm LPP", however it is impossible to say if the actual Fmax is higher than on 14nm LPP based on that.

Do you think there might be less of a voltage wall with a different process compared to 14nm LPP, as we see with Ryzen @ 4GHz or so? That's the biggest question for me.

With watercooling I don't mind if it takes a bit of voltage to hit 4.5GHz on an R5 1600 successor.

I'm also curious how the Raven Ridge desktop APUs will OC.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
I don't expect that there will be any changes to IPC in Pinnacle Ridge, unless the current silicon revision still contains errata which deteriorates the IPC. Based on Raven's performance it is extremely unlikely that it does.

The Fmax (both factory and end-user side) has to improve by >= 15% compared to current die revision, however I expect ~5% improvement at best. Unless of course AMD found some ways to tune the design itself, so that they can juice even more out of the limited process.
GlobalFoundries says "10% higher performance (at ISO power) compared to 14nm LPP", however it is impossible to say if the actual Fmax is higher than on 14nm LPP based on that.

Yep, I think people expecting 4.4 GHz are going to be disappointed, and that seems to be most people.
 

scannall

Golden Member
Jan 1, 2012
1,946
1,638
136
Yep, I think people expecting 4.4 GHz are going to be disappointed, and that seems to be most people.
Eh, 10% more (from the current 1800X base) is closer to 4.1 GHz. 4.4 on a golden chip? Possible, though probably not very likely.
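The percentages in this exchange depend heavily on which baseline you start from. A quick sketch of the arithmetic, assuming the 1800X's published 3.6GHz base and 4.0GHz boost clocks as the two candidate baselines (my assumption; the posts above don't spell out which figure they use). The function names are hypothetical.

```python
def uplifted_clock_ghz(baseline_ghz: float, uplift_percent: float) -> float:
    """Clock after applying a percentage frequency uplift to a baseline."""
    return baseline_ghz * (1.0 + uplift_percent / 100.0)


def required_uplift_percent(baseline_ghz: float, target_ghz: float) -> float:
    """Percentage uplift needed to move from a baseline clock to a target clock."""
    return (target_ghz / baseline_ghz - 1.0) * 100.0


if __name__ == "__main__":
    # Assumed baselines: 1800X base (3.6GHz) and boost (4.0GHz) clocks.
    for baseline in (3.6, 4.0):
        for uplift in (5, 10, 15):
            clock = uplifted_clock_ghz(baseline, uplift)
            print(f"{baseline:.1f}GHz +{uplift}% -> {clock:.2f}GHz")
    # Uplift needed to reach 4.4GHz from the assumed 4.0GHz boost clock.
    print(f"{required_uplift_percent(4.0, 4.4):.1f}%")
```

With these inputs, 10% over 3.6GHz lands just under 4.0GHz, while 10% over 4.0GHz is 4.4GHz, so the choice of baseline moves the answer more than the exact percentage does.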