"Hybrid memory cube" - on-CPU memory (sort of)

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
The press release for the "hybrid memory cube" came out in early December, but I hadn't heard of it until someone mentioned it to me today. I searched and no one else has posted anything.

It looks like it's a three-dimensional silicon structure in which you take flash memory and DRAM and stack them onto a CPU like a miniature silicon sandwich, then connect them with a technology called through-silicon vias (TSVs).

You get ridiculously high bandwidth from it. Since the traces are really short and you can use more pins, you can drive them in parallel and at higher frequencies without the issues you run into routing across a PCB to a DRAM. You also get much better latency and much better power efficiency: since the wires are much shorter, it takes less power to drive them. And the whole motherboard will be much smaller because you won't have the DRAM portion any more.
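To make the bandwidth point concrete, here's a back-of-the-envelope sketch in Python. The transfer rates and bus widths are illustrative assumptions (typical DDR3 figures and a hypothetical wide TSV interface), not numbers from the HMC spec:

```python
# Back-of-the-envelope peak bandwidth: transfers/sec * bytes per transfer.
# All figures below are illustrative assumptions, not HMC spec numbers.

def peak_bandwidth_gbs(transfer_rate_mts, bus_width_bits, channels=1):
    """Peak bandwidth in GB/s for a DRAM-style interface."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) * channels / 1e9

# Conventional dual-channel DDR3-1600: two 64-bit channels over PCB traces.
ddr3 = peak_bandwidth_gbs(1600, 64, channels=2)   # 25.6 GB/s

# A stacked part can afford a far wider interface, since TSVs are much
# cheaper than package pins: e.g. a hypothetical 512-bit link at the
# same transfer rate.
stacked = peak_bandwidth_gbs(1600, 512)           # 102.4 GB/s

print(f"DDR3-1600 x2: {ddr3:.1f} GB/s  wide stacked link: {stacked:.1f} GB/s")
```

Same transfer rate, 4x the bandwidth, just from the wider interface the stack makes affordable.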

On the flipside, if this happens then you won't be adding RAM to your computer any more. You'll buy your new 4GHz CPU with 4GB of onboard memory.

Here's a CNet article about it with a neat graphic of what it looks like:
http://news.cnet.com/8301-13924_3-57332954-64/micron-to-tap-ibm-chip-stacking-tech-for-fast-memory/

Here's a blog about it from Intel with a nifty looking photo of what it looks like in real life:
http://blogs.intel.com/research/2011/09/15/hmc/

Here's a web page from the consortium working on it along with a video of it:
http://hybridmemorycube.org/technology.html


Looks like they'll have a specification later this year.

Oh, for full disclosure, I work as an engineer at Intel, but I had never even heard that Intel was involved in this. I just posted it because I thought it was a neat topic to discuss.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
That's awesome!

It will really be the next step in improving the utility of RAM. First was the IMC (integrated memory controller), now it will be integrated memory itself.

Who knows, 10yrs from now they might move the SSD on-die with stacked NAND as well.
 

jihe

Senior member
Nov 6, 2009
747
97
91
That's awesome!

It will really be the next step in improving the utility of RAM. First was the IMC (integrated memory controller), now it will be integrated memory itself.

Who knows, 10yrs from now they might move the SSD on-die with stacked NAND as well.

They should just make a ridiculously large cache.
 

grkM3

Golden Member
Jul 29, 2011
1,407
0
0
How sick would it be if, at 18nm, they have an integrated GPU/DRAM and NAND flash for the OS to run off of.

Like 8GB DRAM and an 8GB SSD for the OS
 

dma0991

Platinum Member
Mar 17, 2011
2,723
2
0
With the stack being much thicker than a conventional piece of silicon, cooling it might be an issue.
 

MrTeal

Diamond Member
Dec 7, 2003
3,919
2,708
136
This would be really interesting technology in the desktop space. In addition to the technology benefits, there should also be cost benefits once the process is nailed down. A big portion of the cost of a stick of ram is the packaging of the dies and the PCB. I'm not sure I'd want to remove all expandability, but I guess you could say the same about the move to APUs.

It might not be the same as actually having all the RAM at full speed, but I wonder how much of the benefit could be realized with appropriate OS/app support if it was implemented as a huge L4 cache. Maintain the existing L1/L2/L3 structure, and add 4GB of DDR3 running at 10x the bandwidth of standard DDR3 as another level of memory. For expandability, leave one or two DDR3 channels. Maybe if you have applications where you really need 32GB of RAM all running as fast as possible, you might be better off with a quad-channel solution, but for most people I would guess that 8GB on a single channel of DDR3-1866 plus 4GB at DDR3-16000 or whatever would be much faster than two or even four channels of DDR3-1866.
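That mixed-pool idea can be sketched as a simple weighted-average model. The 128 GB/s and 14.9 GB/s figures and the hit rates below are made-up assumptions, purely for illustration:

```python
# A weighted-average sketch of the "fast on-package pool + one slow DIMM
# channel" idea. The 128 and 14.9 GB/s figures and the hit rates are
# made-up assumptions, purely for illustration.

def effective_bandwidth(fast_gbs, slow_gbs, fast_fraction):
    """Blended bandwidth when `fast_fraction` of traffic hits the fast pool.
    Average time per byte is the weighted sum of the inverse bandwidths."""
    time_per_byte = fast_fraction / fast_gbs + (1 - fast_fraction) / slow_gbs
    return 1.0 / time_per_byte

# 4GB on-package at 128 GB/s vs. one DDR3-1866 channel at ~14.9 GB/s:
for hit in (0.5, 0.9, 0.99):
    print(f"{hit:.0%} of traffic in the fast pool -> "
          f"{effective_bandwidth(128, 14.9, hit):.1f} GB/s effective")
```

The takeaway is the usual caching story: the blend only approaches the fast pool's speed if the vast majority of traffic stays in it, which is why OS/app support for placement would matter.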
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,740
156
106
I remember hearing about this tech from an IBM article a few years ago.
It just seems like the way to go. Socketed memory with a heatsink on it would be awesome. Or better yet, the complete realization of an SoC.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Absolutely useless for games. I bet latencies have stayed the same.

I've read quite a few articles showing a strong correlation between graphics performance and memory bandwidth, such as this one at RealWorldTech. So instead of being useless for games, I think it would be a huge advantage. But why did you say the latencies are the same? They shouldn't be. Latency should be lower.
 

dawp

Lifer
Jul 2, 2005
11,347
2,710
136
Here's a web page from the consortium working on it along with a video of it:
http://hybridmemorycube.org/technology.html

In the third video from the OP, the CPU and memory are shown as separate entities with two configurations: "near", which mounts the HMC close to the CPU but limits how much memory you can have, and "far", which appears to limit speed but greatly increases the amount you can have.
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
234
106
I've read quite a few articles showing a strong correlation between graphics performance and memory bandwidth, such as this one at RealWorldTech. So instead of being useless for games, I think it would be a huge advantage. But why did you say the latencies are the same? They shouldn't be. Latency should be lower.
Okay, maybe I don't see the full picture, I only read the Intel blog.

Nowhere in the article did latency come up in the spotlight. High bandwidth, yes. Power consumption? That too.

Rationally, higher bandwidth invites higher operating temperatures... so do lower latencies... and since the whole thing was mentioned in a server context (where lower latency is a lesser gain vs. power consumption), I have presumed they are similar to what we have now.

Not sure, how "enthusiastic" this thing will turn out to be.
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
234
106
It's just like my HP all-in-one, it can print, it can copy and it can fax. But doesn't actually excel in anything. But there is certainly a market for it, no question.
 

MrTeal

Diamond Member
Dec 7, 2003
3,919
2,708
136
It's just like my HP all-in-one, it can print, it can copy and it can fax. But doesn't actually excel in anything. But there is certainly a market for it, no question.

I'm not sure that analogy is entirely fair, if the gains in bandwidth they saw can be translated to the real world. If you had a separate scanner, printer and fax machine, the interface between them really wouldn't bottleneck any of them. In this case, they're seeing huge performance benefits from going with this approach.

It might be a big challenge to keep the number of SKUs under control, but a product like the 2500K that is unlocked and has a larger-than-standard amount of RAM would be really cool. DRAM is selling for $1 per 2Gb chip. Even if it added $30 to the price of a CPU, 8GB of super fast RAM on a 2500K-level chip would definitely be worth it, especially if it shaved $10 or $20 off the cost of a motherboard. While some might not want a board with no external memory slots, it sure would make the fanout simpler if they could go back to an 800-pin package and didn't have to worry about memory traces.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Okay, maybe I don't see the full picture, I only read the Intel blog.

Nowhere in the article did latency come up in the spotlight. High bandwidth, yes. Power consumption? That too.

Rationally, higher bandwidth invites higher operating temperatures... so do lower latencies... and since the whole thing was mentioned in a server context (where lower latency is a lesser gain vs. power consumption), I have presumed they are similar to what we have now.

Not sure, how "enthusiastic" this thing will turn out to be.

Stop and think it through. Everything you said above is just wrong thinking for this revolutionary 3D memory stack. Read the first link, the IBM one, then read the Samsung link and watch the video.
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
234
106
Hey... I am not perfect! Just a bit short of time at the moment... but I will give it another thought later.

The joint efforts are designed to result in memory chips that realize the full performance potential of DRAM, or dynamic random access memory, resolving a long-standing problem referred to as the "memory wall." Initially, the technology will be used in areas such as networking and high-performance computing, but Micron and IBM say it will eventually appear in consumer products.
Arghh... there is still some time left before an "inevitable" adoption. Off to buy some bandwidth-starving DDR3 before it gets replaced by these magical cubes :D

Obviously, I am not the one affected by memory bandwidth. I want to see actual consumer products that take advantage of it, other than LinX and other synthetic software.
 
Last edited:

grkM3

Golden Member
Jul 29, 2011
1,407
0
0
Intel could probably get 100GB/s of memory bandwidth out of this on a 22nm Ivy Bridge, vs. about 30GB/s on very fast DDR3-2400 with very tight timings (8-9-9, 1T).

They could also eliminate the memory bottleneck on the GPU side and give the GPU so much to play with.
 

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
There is a memory bandwidth problem, a really big one. A single-core CPU runs at about 3500 MHz today; it has the ability to run at least 3 instructions in parallel, and quite possibly could run 6 or more every clock cycle at 64 bits. At 6 instructions that is about 168,000 MB/s, and that is for a single core. Modern CPUs are all about hiding the limitations of memory; they are basically multi-level caches with some minimal-sized processing logic attached.

Even the latency is killing us. A trip out to RAM is like 70 clock cycles, and often a lot more. That is a long time to be stuck waiting for some data to arrive. Pulling the RAM closer will reduce that even if it's not 1:1 with the CPU.
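The arithmetic above can be checked in a few lines of Python, using the post's own assumptions (3500 MHz clock, 6 instructions per cycle, 8-byte operands, ~70-cycle DRAM trip):

```python
# Checking the demand-side arithmetic from the post, using its own
# assumptions: a 3500 MHz core issuing 6 instructions per cycle, each
# touching a 64-bit (8-byte) operand.

clock_hz = 3.5e9
instructions_per_cycle = 6
bytes_per_operand = 8

demand_mbs = clock_hz * instructions_per_cycle * bytes_per_operand / 1e6
print(f"worst-case single-core demand: {demand_mbs:,.0f} MB/s")  # 168,000 MB/s

# And the latency figure: a ~70-cycle trip to DRAM at 3.5 GHz.
miss_ns = 70 / clock_hz * 1e9
print(f"one DRAM access: ~{miss_ns:.0f} ns, "
      f"up to {70 * instructions_per_cycle} issue slots lost per miss")
```

That 168 GB/s worst-case demand for one core is why the caches exist at all, and why a miss costing hundreds of issue slots hurts so much.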
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,740
156
106
I agree, bandwidth is perhaps the largest limiting factor when it comes to computing.

big advances in computers recently illustrate this:

DDR memory
integrated memory controllers
dual channel memory support
64-bit vs. 128-bit mem on vid cards
SSDs
onchip L2, larger caches, L3 cache
etc.