Ryzen 2 and possible 3200MHz support: what is worse, low IF speed or high memory latency?

May 11, 2008
18,301
39
126
#1
With the original Zen we noticed that inter-CCX communication is limited by the IF, which in turn is coupled to RAM speed.

Assuming the IF between the CCXs is optimized a bit but not, for example, doubled in bandwidth, and it is still coupled to memory speed, one wants high memory speeds. Since the JEDEC specification for DDR4 now also officially supports 3200MHz speeds (albeit at 20 or 22 clocks for memory timings), this is doable.
We noticed in the past that inter-thread communication suffered from the limited IF bandwidth between the CCXs, sometimes affecting games and other programs that have a lot of inter-thread communication.

It would be nice if AMD has been able to double IF bandwidth between the CCX, IF is scalable so perhaps it is also scalable between the CCX. But this remains to be seen.

My question is, even with memory timings of :
CAS Latency (CL) 20
RAS-to-CAS-Delay (tRCD) 20
RAS-Precharge-Time (tRP) 20
Row-Active-Time (tRAS) 40

at 3200MHz, the IF would be much faster and inter CCX communication would benefit.

What is worse for ryzen, high memory timings at high speeds (3200MHz or higher) but with higher IF bandwidth or low speeds (2933MHz or less) at low memory timings but with lower IF bandwidth ?
Common sense says that the inter CCX communication must be as fast as possible.
High memory latency is offset by caches and smart prefetching / speculative loading of data into the cache.
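
For a rough sense of what the timings above mean in nanoseconds, here is a minimal Python sketch. It assumes the usual simplified DRAM model (a row hit costs CL, a closed row costs tRCD + CL, a row conflict costs tRP + tRCD + CL) and ignores everything in the IMC and the fabric, so it only covers the DRAM part and is not a prediction of measured latency. The function names are my own.

```python
def cycle_time_ns(data_rate_mts):
    """Clock period in ns. DDR transfers twice per clock, so clock = rate / 2."""
    return 2000.0 / data_rate_mts


def dram_access_ns(data_rate_mts, cl, trcd, trp):
    """Simplified DRAM access cost for the three classic row states."""
    t = cycle_time_ns(data_rate_mts)
    return {
        "row_hit": cl * t,                     # row already open
        "row_closed": (trcd + cl) * t,         # row must be activated first
        "row_conflict": (trp + trcd + cl) * t, # other row must be precharged first
    }


# The 20-20-20 timings from this post at DDR4-3200
jedec_3200 = dram_access_ns(3200, cl=20, trcd=20, trp=20)
for case, ns in jedec_3200.items():
    print(f"{case}: {ns:.1f} ns")
```

Plugging in the timings of a slower kit with tighter timings gives a quick side-by-side for the DRAM side of the question; the IF side still has to be measured.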
 

IRobot23

Senior member
Jul 3, 2017
601
18
76
#2
No big changes in design as far as I know.
So.... IF is hard-linked to IMC, which means same clock speed.

Yes, IF at 3GHz would change a lot, and if you would like to know how well it would perform in AIDA64 I can tell you some numbers (calculated by myself, so they could be wrong). At 3GHz, latency would be below 50ns (48-49ns) with CL16 3466MHz, stock.
 

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
18,009
1,835
136
#3
Why CAS 20? Most DDR4-3200 memory is 14-16. 14 is the good Samsung b-die. I don't know that I have even seen CAS 20.

I just checked Newegg. They are ALL 14-16.
 
May 11, 2008
18,301
39
126
#4
Why CAS 20? Most DDR4-3200 memory is 14-16. 14 is the good Samsung b-die. I don't know that I have even seen CAS 20.

I just checked Newegg. They are ALL 14-16.
That is true, but at 1.35V. I meant the JEDEC specification at 1.2V.

It is all described in JESD79-4B.pdf. Downloadable at the jedec site after registration.
Page 202.

Table 113 — DDR4-3200 Speed Bins and Operations
Speed Bin      DDR4-3200W   DDR4-3200AA   DDR4-3200AC
CL-nRCD-nRP    20-20-20     22-22-22      24-24-24
Absolute Specification
- VDDQ = VDD = 1.20V +/- 0.06 V
- VPP = 2.5V +0.25/-0.125 V
- The values defined with above-mentioned table are DLL ON case


Edit: I am really guessing, but it makes sense for the Ryzen 2xxx series to support 3200MHz JEDEC speeds.
 

CatMerc

Golden Member
Jul 16, 2016
1,113
53
106
#5
High memory latency is Ryzen's biggest weakness. You can't speculate and prefetch everything.

CCX to CCX latency can be problematic when trying to scale performance past four cores, but for most consumer workloads getting the memory latency down as much as possible would net better results.
 
May 11, 2008
18,301
39
126
#6
High memory latency is Ryzen's biggest weakness. You can't speculate and prefetch everything.

CCX to CCX latency can be problematic when trying to scale performance past four cores, but for most consumer workloads getting the memory latency down as much as possible would net better results.
Interesting. Is that primarily for single-threaded or badly written games?
 

CatMerc

Golden Member
Jul 16, 2016
1,113
53
106
#7
Interesting. Is that primarily for single-threaded or badly written games?
There isn't a single game I profiled that wasn't significantly memory latency bound. And I profiled many.
 
May 11, 2008
18,301
39
126
#8
There isn't a single game I profiled that wasn't significantly memory latency bound. And I profiled many.
I see.
Thus, it would be nice if Ryzen 2xxx got a lower latency IMC and were also capable of 3200MHz @ 1.2V.
An IF improvement would be very important. All combined, that would shave off valuable nanoseconds.
From the Ryzen APU (single CCX) we have seen that there is no need to have two CCXs to have dual channel memory.
I remember that it was once thought that on Ryzen 1xxx each CCX had its own 64-bit memory channel.
But that does not have to be the case.
The IMC serves both CCXs, and the connection between the IMC and the CCXs is the Infinity Fabric.
Hmm, few more weeks and we shall see.
 
Apr 27, 2000
12,376
1,325
126
#9
You won't see RAM products based on the new JEDEC spec for awhile anyway, if ever. If you do buy RAM capable of DDR4-3200, you will be able to run it CAS/CL 16 more likely than not, in accordance with what @Markfw said.

Maybe you will see some newer RAM eventually, binned for DDR4-3200 CAS20 @ 1.2v, but I think that may wind up on SODIMMs first?

The real question is if Pinnacle Ridge can get better results out of existing DIMMs with Hynix ICs.
 

wahdangun

Senior member
Feb 3, 2011
994
7
106
#10
Umm, actually the IMC clock when you use 3200 RAM is 1600MHz.

And there is no way the IF will be decoupled from the IMC frequency; it would complicate the design and worsen the latency. And actually the latency between cores within a CCX is better than with Intel's ring bus topology.

Maybe in zen2 they will try double or triple the IF, but still linked to IMC speed.
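
To put numbers on that: DDR transfers data on both clock edges, so the real memory clock is half the advertised rate, and per the discussion above the IF on current Ryzen runs at that same clock. A small Python sketch follows. The 1:1 coupling is taken from this thread rather than from an AMD document, the `ratio` parameter is hypothetical (it only models the "double or triple" idea), and the list of speeds is just an illustration.

```python
def memory_clock_mhz(data_rate_mts):
    """Real memory clock in MHz for a DDR data rate given in MT/s."""
    return data_rate_mts / 2


def fabric_clock_mhz(data_rate_mts, ratio=1.0):
    """IF clock under the assumption that it is tied to the memory clock.

    ratio=1.0 models the coupled case described in this thread;
    other values are purely hypothetical.
    """
    return memory_clock_mhz(data_rate_mts) * ratio


for rate in (2133, 2666, 2933, 3200, 3466):
    print(f"DDR4-{rate}: memclk {memory_clock_mhz(rate):.1f} MHz, "
          f"IF {fabric_clock_mhz(rate):.1f} MHz")

# Relative IF clock gain when going from DDR4-2933 to DDR4-3200
gain = fabric_clock_mhz(3200) / fabric_clock_mhz(2933) - 1
print(f"2933 -> 3200: {gain:.1%} higher IF clock")
```

So under this assumption the step from 2933 to 3200 is worth roughly nine percent in IF clock, regardless of the memory timings used.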
 
Apr 27, 2000
12,376
1,325
126
#11
Maybe in zen2 they will try double or triple the IF, but still linked to IMC speed.
If that is possible, then that would be the ideal solution. I have my doubts about how fast IF frequency can get without causing other problems, though.
 

wahdangun

Senior member
Feb 3, 2011
994
7
106
#12
If that is possible, then that would be the ideal solution. I have my doubts about how fast IF frequency can get without causing other problems, though.
Yeah, I think the problem is that AMD was cash strapped and so used third-party IP in their design, and IF was the solution for that. Because of that, the IF frequency can't go higher.
 
May 11, 2008
18,301
39
126
#13
You won't see RAM products based on the new JEDEC spec for awhile anyway, if ever. If you do buy RAM capable of DDR4-3200, you will be able to run it CAS/CL 16 more likely than not, in accordance with what @Markfw said.

Maybe you will see some newer RAM eventually, binned for DDR4-3200 CAS20 @ 1.2v, but I think that may wind up on SODIMMs first?

The real question is if Pinnacle Ridge can get better results out of existing DIMMs with Hynix ICs.
I think we will get official 3200MHz @ 1.2V. Besides, the new Ryzen & Vega APU already supports up to 2933MHz @ 1.2V. Just 267MHz more.
And if I am not mistaken, the Ryzen APU is still manufactured at 14nm.
The new 12nm process might also help with how the integrated memory controller (IMC) is set up.
Of course, AMD would have been informed of the upcoming JEDEC spec beforehand and would not just sit and wait for it to become official. On the contrary, they would already have started developing support for it.
 

CatMerc

Golden Member
Jul 16, 2016
1,113
53
106
#14
Yeah, I think the problem is that AMD was cash strapped and so used third-party IP in their design, and IF was the solution for that. Because of that, the IF frequency can't go higher.
IF is in house and a superset of Hypertransport. The reason the IF is coupled with memory clock is because decoupling it adds complexity that can actually reduce performance. Crossing clock domains costs you latency.
 
Apr 27, 2000
12,376
1,325
126
#15
I think we will get official 3200MHz @ 1.2V. Besides, the new Ryzen & Vega APU already supports up to 2933MHz @ 1.2V. Just 267MHz more.
It is one thing for the IMC to support the speed; it is quite another for DIMM manufacturers to cater directly to the spec with compliant products.

Glancing at pcpartpicker.com, I observed that the only products reaching DDR4-2933 and beyond @ 1.2v right now are SODIMMs. Note that pcpartpicker.com has a bad habit of listing some 1.35v parts as 1.2v . If you look at the product pages, you will see the actual voltage requirements.

So I project that we will see DDR4-3200 @ 1.2v first in SODIMMs. It remains to be seen whether anyone will bother binning ICs for that spec when putting together standard desktop DIMMs. Maybe ULP DIMMs?
 

wahdangun

Senior member
Feb 3, 2011
994
7
106
#16
IF is in house and a superset of Hypertransport. The reason the IF is coupled with memory clock is because decoupling it adds complexity that can actually reduce performance. Crossing clock domains costs you latency.
I know IF is in house. What I meant was that there are several pieces of third-party IP in their design, connected like some sort of modules, and IF was the solution for connecting them.
 
May 11, 2008
18,301
39
126
#17
It is one thing for the IMC to support the speed; it is quite another for DIMM manufacturers to cater directly to the spec with compliant products.

Glancing at pcpartpicker.com, I observed that the only products reaching DDR4-2933 and beyond @ 1.2v right now are SODIMMs. Note that pcpartpicker.com has a bad habit of listing some 1.35v parts as 1.2v . If you look at the product pages, you will see the actual voltage requirements.

So I project that we will see DDR4-3200 @ 1.2v first in SODIMMs. It remains to be seen whether anyone will bother binning ICs for that spec when putting together standard desktop DIMMs. Maybe ULP DIMMs?
That is what is confusing to me. 3600 and even 4000MHz at 1.35V with tight timings is possible, but 3200MHz with normal, relaxed timings is not possible at 1.2V.
That is where I am lost.

I checked the three major DRAM chip manufacturers for fun to see what is available.
Samsung has 4Gb and 8Gb 3200MHz DDR4 chips in mass production.
Hynix has 8Gb 3200MHz as engineering samples.
Micron has 4Gb and 8Gb 3200MHz in mass production, but these are x16 width. I am not sure these chips can be used. All on 1.2V.
The chips are there.
Good market for the DIMM manufacturers, I would say.
 
Apr 27, 2000
12,376
1,325
126
#18
That is what is confusing to me. 3600 and even 4000MHz at 1.35V with tight timings is possible, but 3200MHz with normal, relaxed timings is not possible at 1.2V.
That is where I am lost.

I checked the three major DRAM chip manufacturers for fun to see what is available.
Samsung has 4Gb and 8Gb 3200MHz DDR4 chips in mass production.
Hynix has 8Gb 3200MHz as engineering samples.
Micron has 4Gb and 8Gb 3200MHz in mass production, but these are x16 width. I am not sure these chips can be used. All on 1.2V.
The chips are there.
Good market for the DIMM manufacturers, I would say.
There may be a difference between IC rating and what they can be relied upon to do in "real world" settings (read: soldered in banks on a DIMM, using an actual memory controller instead of a factory test bed). Otherwise I would have to say that binning/market segmentation may be responsible for why we are not seeing DDR4-3200 @ 1.2v in the actual market. Also keep in mind that standard 288-pin DDR4 DIMMS are not the only places where these ICs can wind up. You may be looking at products that wind up in mobile devices and/or SODIMMs.

Otherwise I can't say for sure why such products are unavailable on the DiY PC market. Right now the best I found was DDR4-2800 @ 1.2v and I have no idea whose ICs those were in there.
 
May 11, 2008
18,301
39
126
#19
There may be a difference between IC rating and what they can be relied upon to do in "real world" settings (read: soldered in banks on a DIMM, using an actual memory controller instead of a factory test bed). Otherwise I would have to say that binning/market segmentation may be responsible for why we are not seeing DDR4-3200 @ 1.2v in the actual market. Also keep in mind that standard 288-pin DDR4 DIMMS are not the only places where these ICs can wind up. You may be looking at products that wind up in mobile devices and/or SODIMMs.

Otherwise I can't say for sure why such products are unavailable on the DiY PC market. Right now the best I found was DDR4-2800 @ 1.2v and I have no idea whose ICs those were in there.
I can imagine that when soldered on a pcb, connected to a mainboard through a dimm connector , copper traces in the motherboard , up into the cpu socket, from the socket to the pins, from the pins to the substrate and finally into the die, that there is quite the impedance matching to be performed. But is that not the task of the IMC and the ddr4 ic to do some selftest and training and optimizing the impedance matching for a proper signal ?

I am sure that because there is a shortage, a given manufacturer is less motivated to come up with something new to outperform the competitors.

Indeed, the shortage is not helping.
 
Apr 27, 2000
12,376
1,325
126
#20
But is that not the task of the IMC and the ddr4 ic to do some selftest and training and optimizing the impedance matching for a proper signal ?
I would hope so. It really comes down to uniformity of impedance sources though . . . or so I would think? Mobo manufacturers seem not to choose uniform trace lengths from CPU socket to DIMM slots. Maybe there's some wiggle room there without putting the IMC over a barrel.

Anything beyond speculation on my part would require someone in the industry to chime in on the topic.
 
May 11, 2008
18,301
39
126
#22
I would hope so. It really comes down to uniformity of impedance sources though . . . or so I would think? Mobo manufacturers seem not to choose uniform trace lengths from CPU socket to DIMM slots. Maybe there's some wiggle room there without putting the IMC over a barrel.

Anything beyond speculation on my part would require someone in the industry to chime in on the topic.
Actually, motherboard manufacturers do perform trace length matching. That is why some of the traces between the memory socket and the CPU socket have been laid out with serpentine routing.
For example :





This is done for timing issues such as skew between clock and data signals. But it is difficult to get proper impedance matching, proper skew matching and proper EMI behavior all at once.
And of course, a motherboard manufacturer has to deliver a product and as such designs a motherboard that will work with currently available technology.
The high end boards, though, can be developed and routed to support very high DRAM frequencies.
But in general all motherboards have trace length matching.
Since Ryzen 2 may support higher memory clock speeds, the upcoming motherboards might also have been routed to take better advantage of high memory clocks.
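
To get a feel for why the length matching matters, here is a rough Python sketch. The propagation delay of about 6.7 ps per mm is my assumption (a ballpark figure for an inner-layer trace in FR-4 with a dielectric constant around 4); real boards differ, so treat the result as order-of-magnitude only.

```python
PS_PER_MM = 6.7  # assumed propagation delay, not a measured value


def unit_interval_ps(data_rate_mts):
    """Duration of one data bit in picoseconds for a rate given in MT/s."""
    return 1e6 / data_rate_mts


def skew_ps(length_mismatch_mm, ps_per_mm=PS_PER_MM):
    """Arrival time difference caused by a trace length mismatch."""
    return length_mismatch_mm * ps_per_mm


def skew_fraction_of_bit(length_mismatch_mm, data_rate_mts):
    """Skew expressed as a fraction of one bit time."""
    return skew_ps(length_mismatch_mm) / unit_interval_ps(data_rate_mts)


for rate in (2133, 2933, 3200, 4000):
    frac = skew_fraction_of_bit(5, rate)  # 5 mm mismatch as an example
    print(f"DDR4-{rate}: bit time {unit_interval_ps(rate):.1f} ps, "
          f"5 mm mismatch eats {frac:.1%} of it")
```

The same few millimetres of mismatch take a larger share of the bit time as the data rate goes up, which is one reason boards routed for high memory clocks need tighter matching.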

Also nice to know :

 
Apr 27, 2000
12,376
1,325
126
#24
Actually, Motherboard manufacturers do perform trace length matching.
Then there is a spec they have to follow. That's good to know.

Maybe because the ryzen 2 supports higher clockspeeds, then the upcoming mother boards might also have been routed to take better advantage of high memory clocks.
I'm hoping someone will put x370 and x470 boards through the paces to see how much of a difference motherboard/chipset choice can make. We might not see better DRAM speeds out of x370 + Pinnacle Ridge if routing plays a major role in higher DDR4 speeds.
 

Ratman6161

Senior member
Mar 21, 2008
607
0
91
#25
Interesting perspective on latency as CAS rating vs. latency in NS

http://www.crucial.com/usa/en/memory-performance-speed-latency

This is useful when over clocking. Essentially they are saying higher speed gives you a lower latency in NS even if you have to increase CAS to get there. In the table in the linked page one example is the last two entries. If you have DDR4 2400 each clock cycle takes .83 nano seconds. With a CAS of 17 cycles you get a true latency of 14.17 nano seconds. Now, supposing you over clock the ram to DDR4 2666 but to achieve that you have to increase the CAS from 17 to 18 for stability. Even though you increase the number of cycles of latency from 17 to 18 cycles, each cycle is now shorter at .75 nano seconds instead of .83. So the overall latency drops from 14.17 to 13.5 nano seconds as a result of the over clock.

In my case I'm using Crucial DDR4 2400 over clocked to 3066 and raised CAS from 16 to 18 to get there. But I get the benefit of the increased speed, and the latency measured in nano seconds is still lower.

That's the theory anyway.
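
The arithmetic from that table is easy to check. A short Python sketch, where `true_latency_ns` is just my name for the formula on the Crucial page (CAS cycles times clock period). It covers CAS latency only, not the full round trip through the IMC.

```python
def true_latency_ns(data_rate_mts, cas):
    """CAS latency in ns: cycles multiplied by the clock period.

    The clock runs at half the DDR data rate, so one cycle
    lasts 2000 / data_rate nanoseconds.
    """
    clock_period_ns = 2000.0 / data_rate_mts
    return cas * clock_period_ns


examples = [
    ("DDR4-2400 CL17", 2400, 17),  # Crucial table entry
    ("DDR4-2666 CL18", 2666, 18),  # Crucial table entry
    ("DDR4-2400 CL16", 2400, 16),  # my kit at stock
    ("DDR4-3066 CL18", 3066, 18),  # my kit overclocked
]

for label, rate, cas in examples:
    print(f"{label}: {true_latency_ns(rate, cas):.2f} ns")
```

For the overclock described above this works out to roughly 13.3 ns at stock versus roughly 11.7 ns at 3066, so the higher CAS number still ends up as a lower latency in nanoseconds.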
 

