Question DDR5's impact on CPU design


Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
With DDR5 on the horizon, we will be seeing a very significant, if not massive, boost in available bandwidth for CPUs. Correct me if I'm wrong, but a dual channel DDR5 system should easily eclipse 100GB/s of bandwidth. Quad channel memory systems and greater will have even more: over 200GB/s.
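
Quick sanity check on that math, as a sketch (assuming DDR5-6400 and the usual 64-bit channel width; the real figure depends on the speed bin you actually run):

```c
#include <stdio.h>

/* Theoretical peak = transfer rate (MT/s) x bytes per transfer x channels.
   DDR5-6400 is an assumption here; DDR5 bins span roughly 4800-8800 MT/s. */
int main(void) {
    const double mts = 6400.0;         /* mega-transfers per second */
    const double bytes_per_xfer = 8.0; /* 64-bit channel = 8 bytes */
    for (int channels = 1; channels <= 4; channels++) {
        double gbs = mts * 1e6 * bytes_per_xfer * channels / 1e9;
        printf("%d channel(s): %.1f GB/s\n", channels, gbs);
    }
    return 0;
}
```

That prints 102.4 GB/s for dual channel and 204.8 GB/s for quad, so the round numbers above hold up.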

So the question is: how do you think this is going to impact CPU design? We already have CPUs with tons of cores that perform very well given the more limited bandwidth they have. What sort of enhancements would CPU architects make to utilize the much higher bandwidth available with DDR5?

More cores is the obvious one, but what about more SIMD units? Or more SMT threads?
 

DrMrLordX

Lifer
Apr 27, 2000
22,551
12,413
136
Doesn't increased parallelism result in a greater dependence on memory bandwidth than on latency though? Serial tasks tend to be more latency dependent as I recall, which is why GPUs care much less about latency than CPUs.

Depends on the nature of the workload. Usually, increasing parallelism in your code stresses inter-core communication more than anything else, assuming it's a common working set between threads. Highly parallelized workloads with working sets that are individual to each thread would potentially put the most stress on memory bandwidth, since it would be impossible for one core to find the data it needs in the cache hierarchy of another core.
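
To illustrate the bandwidth-vs-latency split, here's a minimal toy sketch of my own (not from any real benchmark): the streaming loop issues independent loads the prefetcher can pipeline, so it ends up bandwidth-bound; the pointer chase makes every load depend on the previous one, so it pays a full memory round-trip per hop and ends up latency-bound:

```c
#include <stdio.h>
#include <stdlib.h>

#define N (1u << 24)  /* 16M elements, far larger than any cache */

/* Streaming pass: independent sequential loads; the hardware prefetcher
   keeps the bus busy, so throughput tracks memory bandwidth. */
static long long stream_sum(const long long *a) {
    long long s = 0;
    for (size_t i = 0; i < N; i++) s += a[i];
    return s;
}

/* Pointer chase: each load's address comes from the previous load, so the
   core stalls for a memory round-trip per hop; throughput tracks latency. */
static size_t chase(const size_t *next, size_t hops) {
    size_t i = 0;
    while (hops--) i = next[i];
    return i;
}

int main(void) {
    long long *a = malloc(N * sizeof *a);
    size_t *next = malloc(N * sizeof *next);
    if (!a || !next) return 1;
    for (size_t i = 0; i < N; i++) {
        a[i] = (long long)i;
        /* crude scatter; a real test would build a random cyclic permutation
           (e.g. Sattolo's algorithm) so the chase can't settle into cache */
        next[i] = (i * 2654435761u + 1) % N;
    }
    printf("%lld %zu\n", stream_sum(a), chase(next, N));
    free(a); free(next);
    return 0;
}
```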

The MSI Afterburner overlay he had didn't list any power numbers for the CPU, but it did for the GPU. The GPU power numbers were higher with the 4x4 config, but that is easily explainable by the higher framerates. What I don't get, though, is that the 4x4 also had significantly higher RAM usage than the 2x8. I wonder if that is also a result of the higher performance?

That certainly makes me wonder.
 
  • Like
Reactions: Carfax83

Ajay

Lifer
Jan 8, 2001
16,094
8,111
136
Yes, but it is not a real issue, because DDR5 has twice the burst length and a much higher transfer speed.
The bigger advantage is that the DIMM can handle two concurrent requests at different addresses. (Concurrent requests to the same address are handled at the UMC/cache level.)
This means that a thread waiting for data has a lesser chance of having to wait for an in-flight request from another core to finish, thereby resulting in an overall increase in .... "IPC" :nomouth:
Also, we have to consider that the data transfer is just one part of the overall DRAM access time. There are also the delays for CAS and RAS, plus the delays after reads and writes, among others. Every time an address is accessed, there has to be a small delay before another address can be accessed. Therefore the two subchannels will help.
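
The burst-length arithmetic is worth spelling out (a sketch using the standard JEDEC figures): DDR5 splits each DIMM's 64 data bits into two independent 32-bit subchannels and doubles the burst length to 16, so each subchannel still delivers one full 64-byte cache line per burst while servicing its own address:

```c
#include <stdio.h>

/* DDR4: one 64-bit channel x BL8 = 64 bytes = one x86 cache line.
   DDR5: two 32-bit subchannels, each x BL16 = 64 bytes apiece,
   and the two subchannels can work on *different* addresses at once. */
int main(void) {
    int ddr4_burst = (64 / 8) * 8;   /* bus bytes x burst length */
    int ddr5_burst = (32 / 8) * 16;  /* per subchannel */
    printf("DDR4 burst:          %d bytes\n", ddr4_burst);
    printf("DDR5 per subchannel: %d bytes (x2 independent subchannels)\n",
           ddr5_burst);
    return 0;
}
```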

For AMD, they made some optimizations to the load/store subsystem to fuse memory accesses from multiple instructions into a single, bigger block transfer. This will also help keep the memory subsystem slightly less loaded.
I'm just not seeing the advantage over dual channel DIMMs. The actual real world advantage (not mem benchmarks) of dual channel over single channel isn't stupendous, IIRC. A case of diminishing returns.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,774
6,757
136
I'm just not seeing the advantage over dual channel DIMMs. The actual real world advantage (not mem benchmarks) of dual channel over single channel isn't stupendous, IIRC. A case of diminishing returns.
Are you thinking only of gaming workloads, or of professional and server workloads too?
For the next few years, the big leap in core count will need more memory channels. I cannot imagine a 64+ core machine serving several instances of a service with only a few channels of memory.
And the improved RAS, read and write ECC, and increased capacity per module are specifically targeted at this scenario.
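
To put rough numbers on the per-core squeeze (illustrative config, not any specific SKU: 64 cores fed by 8 channels, with 3200 and 6400 MT/s standing in for typical DDR4 vs DDR5 speeds):

```c
#include <stdio.h>

/* Rough per-core bandwidth for a hypothetical 64-core, 8-channel server. */
int main(void) {
    const int cores = 64, channels = 8;
    const double rates[] = { 3200.0, 6400.0 };  /* MT/s */
    const char *names[] = { "DDR4-3200", "DDR5-6400" };
    for (int i = 0; i < 2; i++) {
        double total_gbs = rates[i] * 1e6 * 8 * channels / 1e9;
        printf("%s: %.1f GB/s total, %.1f GB/s per core\n",
               names[i], total_gbs, total_gbs / cores);
    }
    return 0;
}
```

Even at DDR5 speeds that's only a handful of GB/s per core before any contention.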
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,731
155
106
I think to properly explore the benefit of more bandwidth, we need to be looking at high core/thread count CPUs.
The benefits are much more apparent in EPYC, for example. Of course a quad-core budget gaming rig isn't going to scale well.

The trend of less bandwidth per core/thread has clearly been playing out for over a decade. Even if you have bigger caches and better algorithms, the memory still has to be read/written to refill those caches.

I kinda think of memory like hard drives: the time to read/write the entire capacity has consistently gotten slower. Speed hasn't kept pace with capacity, and bandwidth per thread is a fraction of what it used to be in some cases. Of course, the degree to which any bottleneck affects your particular use case varies wildly. You can see linear performance increases in the right use cases, or you could see zero increase no matter how much you increase bandwidth.
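
The hard-drive analogy in rough numbers (capacities are just typical builds of each era, bandwidth is theoretical dual channel peak, nothing measured):

```c
#include <stdio.h>

/* Seconds to stream the whole installed capacity at peak bandwidth.
   Configs are illustrative, not measurements. */
int main(void) {
    struct { const char *sys; double gb; double gbs; } cfg[] = {
        { "2GB  DDR2-800,  dual channel", 2.0,  12.8 },
        { "16GB DDR3-1600, dual channel", 16.0, 25.6 },
        { "64GB DDR4-3200, dual channel", 64.0, 51.2 },
    };
    for (int i = 0; i < 3; i++)
        printf("%s: %.2f s to read it all\n",
               cfg[i].sys, cfg[i].gb / cfg[i].gbs);
    return 0;
}
```

That's roughly 0.16 s, then 0.63 s, then 1.25 s: the "consistently gotten slower" trend in one number.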
If memory weren't a bottleneck in many use cases, we wouldn't need such complex caches on the CPU.
A large portion of modern CPU design is dedicated to working around or hiding memory access shortcomings.
 
Last edited:

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
If you look at that YouTube video above, it appears some of the really big modern games are sensitive to bandwidth as well. Very astonishing to me! But I guess I shouldn't really be surprised, as games these days are so much bigger and more parallel than ever before, so memory bandwidth is bound to be more important than it used to be.
Don't consoles share a pool of RAM between the system and the GPU?!
I would guess it has more to do with that than with anything else.
There is probably a whole storm of GPU RAM that has to be copied to system RAM constantly, and quad channel makes this much faster.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
@DrMrLordX and @TheELF, I did some more research, and it appears it's probably a bit more complicated than the extra bandwidth provided by quad channel can explain.

It appears that quad channel memory kits increase performance even on CPUs that don't have a quad channel memory controller. This video uses a Ryzen 2600X and tests with both dual channel and quad channel kits, and the quad channel kit was definitely faster. This would defy common logic, but I read a few YouTubers claiming the performance boost was due to "memory interleaving," which is the only plausible explanation I've read.
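
For what it's worth, here's my toy model of what interleaving does (simplified; real controllers hash the address bits rather than using a plain modulo): consecutive 64-byte lines rotate across the available targets (channels, or ranks within a channel), so a streaming access keeps all of them busy at once:

```c
#include <stdio.h>
#include <stdint.h>

/* Toy interleaving model: line number modulo the target count.
   "targets" could be 2 channels x 2 ranks = 4, as with four DIMMs
   on a dual channel board. */
int main(void) {
    const int targets = 4;
    for (uint64_t addr = 0; addr < 512; addr += 64) {
        int line = (int)(addr / 64);
        printf("line %d (addr 0x%03llx) -> target %d\n",
               line, (unsigned long long)addr, line % targets);
    }
    return 0;
}
```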

 
  • Like
Reactions: Makaveli

lightmanek

Senior member
Feb 19, 2017
508
1,245
136
But can that alone explain such a huge increase? The AMD results definitely weren't as great as in the video I linked to on the first page, but they did test more games.

Yes, this is the effect of running single-rank vs dual-rank modules, even if you only run two sticks. The CPU memory controller can sustain twice as many open pages with dual rank, which helps greatly in some situations.
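
The "open pages" point is just bank arithmetic (DDR4 numbers below; DDR5 doubles the banks per rank to 32): every bank in every rank can keep its own row open, so doubling ranks doubles the rows the controller can have open at once:

```c
#include <stdio.h>

/* DDR4: 4 bank groups x 4 banks = 16 banks per rank, each of which can
   hold one open row. More ranks = more rows open simultaneously. */
int main(void) {
    const int banks_per_rank = 16;
    for (int ranks = 1; ranks <= 2; ranks++)
        printf("%d rank(s) per channel: up to %d open rows\n",
               ranks, ranks * banks_per_rank);
    return 0;
}
```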
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
The memory interleaving explains one problem I had with a memory kit from years ago, when I had a 5930K. I had a 32GB quad channel DDR4-3200 CL16 CR2 kit. It performed well but had fairly relaxed timings. When I got my 6900K, I got a newer kit with Samsung B-die memory chips that could clock higher with more aggressive timings; the stock timings on it were CL14 CR1. At any rate, when I got my new kit and compared the AIDA64 memory test scores to my previous ones with the older kit, they were slower in everything but latency, which I found very odd. The read, write and copy scores were all significantly lower.

I had no explanation for it at the time, but now that I think about it, I bet it was because the "slower" kit had dual-rank modules. I just checked my current kit using CPUID, and it says my memory is single rank.

And yeah, memory interleaving has definitely flown under the radar. I remember YEARS ago you were able to enable or disable it in the BIOS, but now you don't see that option anymore. I think the motherboard manufacturers enable it by default with no option to turn it off, because what would be the point? Unless you want a slower computer.
 
  • Like
Reactions: lightmanek