
GPU Memory Math: GDDR6 vs. HBM2 vs. GDDR5X vs. AQUABOLT

goldstone77

Senior member
Dec 12, 2017
217
93
61
Samsung Starts Mass Production of 16Gb GDDR6 Memory ICs with 18 Gbps I/O Speed
by Anton Shilov on January 18, 2018 9:00 AM EST

What Samsung is announcing this week is its first 16 Gb GDDR6 IC, which features an 18 Gbps per-pin data transfer rate and offers up to 72 GB/s of bandwidth per chip. A 256-bit memory subsystem built from such DRAMs will have a combined memory bandwidth of 576 GB/s, whereas a 384-bit memory subsystem will hit 864 GB/s, outperforming existing HBM2-based 1.7 Gbps/3072-bit memory subsystems that offer up to 652 GB/s. The added expense with GDDR6 will be in the power budget, much like current GDDR5/5X technology.
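As a sanity check, every figure in the article follows from the same formula: pin data rate times bus width, divided by 8 bits per byte. A quick sketch (the helper name is mine, not Samsung's):

```python
def bandwidth_gb_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: pin rate (Gbps) x bus width (bits) / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8

print(bandwidth_gb_s(18, 32))     # one GDDR6 chip, 32-bit interface -> 72.0 GB/s
print(bandwidth_gb_s(18, 256))    # 256-bit GDDR6 subsystem -> 576.0 GB/s
print(bandwidth_gb_s(18, 384))    # 384-bit GDDR6 subsystem -> 864.0 GB/s
print(bandwidth_gb_s(1.7, 3072))  # 1.7 Gbps HBM2 on a 3072-bit bus -> 652.8 GB/s
```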


Technology just keeps getting better! RAM is the real bottleneck in computing, so anytime I see jumps in RAM performance I'm a happy camper!
 

goldstone77

Senior member
Dec 12, 2017
217
93
61
I used MS Paint to add "Aquabolt" to the image, allowing an easy comparison of the two memory types now in full production at Samsung!
 

goldstone77

Senior member
Dec 12, 2017
217
93
61
Fixed Bus Width to 1024

Based on Samsung's press release
* Editor’s Note:

[HBM2 and GDDR5 data bandwidth calculation]

- An 8GB HBM2 package's data bandwidth: 2.4 Gbps per pin x 1024-bit bus / 8 bits per byte = 307.2 GB/s

Using four HBM2 packages in a system: 307.2 GB/s x 4 = 1228.8 GB/s, or approximately 1.2 TB/s

- An 8Gb GDDR5 die's data bandwidth: 8 Gbps per pin x 32-bit bus / 8 bits per byte = 32 GB/s
https://news.samsung.com/global/samsung-starts-producing-8-gigabyte-high-bandwidth-memory-2-with-highest-data-transmission-speed
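The press-release arithmetic can be reproduced with the same pin-rate-times-bus-width formula (the helper names are mine):

```python
def bandwidth_gb_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    # peak bandwidth in GB/s = pin rate (Gbps) * bus width (bits) / 8 bits per byte
    return pin_rate_gbps * bus_width_bits / 8

hbm2_package = bandwidth_gb_s(2.4, 1024)  # one 8GB HBM2 package: 307.2 GB/s
four_packages = 4 * hbm2_package          # four packages: 1228.8 GB/s, ~1.2 TB/s
gddr5_die = bandwidth_gb_s(8, 32)         # one 8Gb GDDR5 die: 32.0 GB/s

print(hbm2_package, four_packages, gddr5_die)
```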
 

goldstone77

Senior member
Dec 12, 2017
217
93
61
Not about GPUs, but next iteration of Samsung DDR4: http://www.samsung.com/semiconductor/insights/news-events/samsung-now-mass-producing-industrys-first-2nd-generation-10-nanometer-class-dram/

Integrated-GPU gaming aside, dual-channel DDR4 is not a bottleneck for current client CPUs. Perhaps Samsung's improvements will enable bins of even lower latency, around ~8 ns (3200 MT/s at CL12 or CL13), at 1.35 V.
Any time the CPU has to reach out to RAM beyond the core for new instructions, it adds latency. So the faster and lower-latency the RAM, the faster overall computing will be.
Samsung’s 2nd-generation 10nm-class 8Gb DDR4 features an approximate 30 percent productivity gain over the company’s 1st–generation 10nm-class 8Gb DDR4. In addition, the new 8Gb DDR4’s performance levels and energy efficiency have been improved about 10 and 15 percent respectively, thanks to the use of an advanced, proprietary circuit design technology. The new 8Gb DDR4 can operate at 3,600 megabits per second (Mbps) per pin, compared to 3,200 Mbps of the company’s 1x-nm 8Gb DDR4.
It would be nice if they were a little clearer about the advantages numerically. They have DDR4-3600 RAM now that can operate at 10-15% less power, going from 1.35 V to 1.2 V.
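For context on why dual-channel DDR4 is rarely the bottleneck, per-channel bandwidth is easy to work out: a 64-bit DDR4 channel moves 8 bytes per transfer. A back-of-the-envelope sketch:

```python
def ddr4_channel_gb_s(transfer_rate_mt_s: int) -> float:
    # a 64-bit DDR4 channel moves 8 bytes per transfer; result in GB/s
    return transfer_rate_mt_s * 8 / 1000

print(ddr4_channel_gb_s(3200))      # 25.6 GB/s per channel
print(ddr4_channel_gb_s(3600))      # 28.8 GB/s per channel
print(2 * ddr4_channel_gb_s(3600))  # dual channel: 57.6 GB/s
```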
 

IntelUser2000

Elite Member
Oct 14, 2003
7,374
2,066
136
It would be nice if they were a little clearer about the advantages numerically. They have DDR4-3600 RAM now that can operate at 10-15% less power, going from 1.35 V to 1.2 V.
10-15% lower compared to?

Generally, going to 1.2V from 1.35V would give 20% reduction in power, but that's without architectural refinements, and at the same frequency.

The process might have improved with 2nd generation 1xnm DRAMs, resulting in an improvement without voltage reductions. Frequency increase would counter that though. Marketing makes it sound like you get the frequency improvement and the power reduction from voltage/process, so we have to be careful of that. "And" in process marketing talk really means "Or".

As for GDDR5/6 comparisons, they state a 35% reduction in power per bit, but at a much higher frequency. An 864 GB/s GDDR6 setup would use slightly more power than the 548 GB/s GDDR5X setup in the Titan Xp.
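Both estimates above can be checked with simple scaling assumptions: dynamic power scales roughly with f x V^2 for the voltage claim, and total power is energy-per-bit times bandwidth for the GDDR comparison. A sketch (the 35% figure is Samsung's energy-per-bit claim; the rest is back-of-the-envelope):

```python
# Voltage scaling: at fixed frequency, dynamic power ~ V^2.
v_power_ratio = (1.2 / 1.35) ** 2
print(f"1.35 V -> 1.2 V at fixed frequency: {1 - v_power_ratio:.0%} lower power")  # ~21%

# GDDR6 vs. Titan Xp: 35% less energy per bit, but ~1.58x the bits moved.
energy_per_bit_ratio = 1 - 0.35   # Samsung's claimed reduction
bandwidth_ratio = 864 / 548       # 864 GB/s GDDR6 vs 548 GB/s in Titan Xp
total_power_ratio = energy_per_bit_ratio * bandwidth_ratio
print(f"GDDR6 setup draws {total_power_ratio:.2f}x the power")  # ~1.02x, slightly more
```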

Any time the CPU has to reach out to RAM beyond the core for new instructions, it adds latency. So the faster and lower-latency the RAM, the faster overall computing will be.
Future is said to be:
-HBM2 and successors for on-package fast memory
-DRAM for capacity expansion
-NV technologies like 3D XPoint for greater capacities

The latency benefits for on-package and HBM aren't guaranteed though. Implementation details matter, but eventually we should see it. It's really about the bandwidth.
 

goldstone77

Senior member
Dec 12, 2017
217
93
61
IntelUser2000 said: "The latency benefits for on-package and HBM aren't guaranteed though. Implementation details matter, but eventually we should see it. It's really about the bandwidth."
Bandwidth, frequency, and latency all play their parts. And as process nodes shrink we will approach theoretical limits, which suggests that if you can't go smaller, go wider.
 
