AMD and Hynix announce joint development of HBM memory stacks

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
3DIC memory, and therefore all of 2.5/3D technology, took one step closer to full commercialization last week with the HBM joint development announcement from AMD and Hynix at the RTI 3D ASIP meeting in Burlingame CA.

Bryan Black, Sr Fellow and 3D program manager at AMD noted that while die stacking has caught on in FPGAs and image sensors “..there is nothing yet in mainstream computing CPUs, GPUs or APUs” but that “HBM (high bandwidth memory) will change this.” Black continued, “Getting 3D going will take a BOLD move and AMD is ready to make that move.” Black announced that AMD is co-developing HBM with SK Hynix which is currently sampling the HBM memory stacks and that AMD “…is ready to work with customers.”
http://electroiq.com/blog/2013/12/amd-and-hynix-announce-joint-development-of-hbm-memory-stacks/

It's been known that AMD has been flirting with Hynix's HBM for a while, but now they're a full-blown "co-developer." I wonder if we'll see products from AMD utilizing this by the end of 2014. It'll be the year of stacked memory, after all.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
It's certainly looking good. Xeon Phi with stacked memory, nVidia Volta and now AMD joining too.

Can't be long (in relative terms) till the memory module is gone.
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
I always cringe about Hynix :(

Got a bad shipment of RAM once. We're talking a 30% failure rate out of nearly 1TB of ram. I know it's not fair, but now I am biased against them.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I always cringe about Hynix :(

Got a bad shipment of RAM once. We're talking a 30% failure rate out of nearly 1TB of ram. I know it's not fair, but now I am biased against them.

But did you have to set their fab on fire? :D

For those wondering about the actual speed benefit. This is what Hynix roughly estimates.

v2wlqw.jpg

25rfuy9.jpg


It also fuels the notion that there is no successor besides stacked memory to GDDR5.
 

R0H1T

Platinum Member
Jan 12, 2013
2,582
163
106
But did you have to set their fab on fire? :D

For those wondering about the actual speed benefit. This is what Hynix roughly estimates.

v2wlqw.jpg

25rfuy9.jpg


It also fuels the notion that there is no successor besides stacked memory to GDDR5.
I bet the likes of Samsung & Co have a thing or two to say about that claim ~

Coalition of 20+ Tech Firms Backs MRAM as Potential DRAM, NAND Replacement

The thing is, unless Intel &/or AMD backs one of these innovations, they aren't going to take off in a big way anytime soon. Even if they do back them, pricing & compatibility will probably be the biggest factors in making them universal, unlike, say, Intel's Thunderbolt.

I'm not arguing against the notion that HBM is the next logical step in the evolution of memory; however, there are viable (better) alternatives that just need the right amount of financial & technical support to take the PC industry forward, & that is certainly something I'm looking forward to.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Can someone explain 'stacked memory' to me? I assume it is what it sounds like it is. But how does heat from the bottom memory module escape? Does each memory chip get its own 'lane'? Is this stacked DDR3? DDR4? GDDR5? Something else completely? Lastly, for all this bandwidth, doesn't the CPU/GPU/APU still need to be at least a certain size? Or does the memory controller communicate in a different way than we're used to?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
In terms of Xeon Phi, I assume Intel uses Micron's Hybrid Memory Cube technology for their stacking.

8.jpg

1.jpg

2.jpg

7.jpg


So...nVidia and Samsung for Volta? :p

It will be weird though, to buy CPUs and APUs with a preset memory amount. We are already used to GPUs with a preset amount.

Also stacked memory will make IGPs fly.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
I wonder if this will be available in time for Excavator?
In theory yes. In reality unlikely :(. It would probably make Carrizo's iGPU the fastest on the market and would push all the lower-end discrete GPUs out of the competition. Add a year or two to anything they claim and you will get the approximate time frame for products.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I wonder if this will be available in time for Excavator?

Excavator is already set for DDR3.

Xeon Phi with stacked memory is early 2015, on an enterprise-class product. Volta with stacked memory is set for 2017 for end users on GPUs (high-end?). So expect affordable products in that timeframe. By 2020 I guess stacked memory will be the standard.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
Excavator is already set for DDR3.

You're referring to the BS 2015 "roadmap" that AMD has already said is fake (and that doesn't match their presentation style at all)?

Given the importance of APUs in the AMD business strategy, crippling Excavator with DDR3 would make no sense at all. I'm confident that it will have something better - maybe not this, but probably at least DDR4, or the option for inclusion of GDDR5 on the motherboard. (Kaveri was reported to already support GDDR5 in the memory controller, though this was fused off in the final release.)
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
You're referring to the BS 2015 "roadmap" that AMD has already said is fake (and that doesn't match their presentation style at all)?

Given the importance of APUs in the AMD business strategy, crippling Excavator with DDR3 would make no sense at all. I'm confident that it will have something better - maybe not this, but probably at least DDR4, or the option for inclusion of GDDR5 on the motherboard. (Kaveri was reported to already support GDDR5 in the memory controller, though this was fused off in the final release.)

That roadmap was not fake ;)

http://www.sweclockers.com/nyhet/17995-amd-forbereder-kabini-for-stationara-datorer-med-sockel-fs1b
 

NaroonGTX

Member
Nov 6, 2013
106
0
76
Indeed the roadmap was real. Carrizo = DDR3 and Socket FM2+ still. No idea on what Basilisk will be... My guess is slightly-enhanced Excavator cores with DDR4 support on FM3?
 

NaroonGTX

Member
Nov 6, 2013
106
0
76
Basilisk isn't on any current roadmaps. I was just saying in general. Basilisk is apparently a SoC design. That's all I know about it.
 

sefsefsefsef

Senior member
Jun 21, 2007
218
1
71
I was at MICRO last week and Bryan Black gave the first morning keynote talk. It was the most interesting keynote I've ever attended at a computer architecture conference. Anyway, I snapped this picture during his talk (that's Bryan in the bottom right):

http://imgur.com/hxQCe8N.jpg

He said he was going to show a photograph of something that has never been shown before, but he ran out of time, so I have no idea if this is the thing that hasn't been shown publicly before or if that was something he didn't get to yet.

My picture is a little fuzzy, because I was far away, but that slide shows a genuine photograph of a working prototype, it's not just a cartoon figure.
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
Where do you see Basilisk mentioned in the roadmaps?

http://webcache.googleusercontent.c...ub/curtis-rantz/9/432/b58+&cd=2&hl=en&ct=clnk

RTL Verification Engineer (Contract)

AMD


Public Company; 10,001+ employees; AMD; Semiconductors industry
January 2012 – July 2012 (7 months) Austin, Texas
• Worked in the APU Emulation group on the Single Emulation/Simulation Work Area and Build Flow project for the Kaveri, Carrizo and Basilisk APUs.

http://webcache.googleusercontent.c...m/pub/yun-zhang/14/11/786+&cd=1&hl=en&ct=clnk

Sect Manager of Verification Methodology Team

AMD


Public Company; 10,001+ employees; AMD; Semiconductors industry
August 2010 – Present (3 years 3 months) Shanghai
April 2012- Present Sect Manager of Verification Methodology Team
August 2010- April 2012 Staff verification engineer/lead of Verification Methodology Team
Project: Judo(Komodo)/WANI(Basilisk)/Cipher(Kryptos)


http://www.allthingscustomized.com/product/amd-basilisk-17381-751/
6822-front-456-tshirt.png



Hans
 

sefsefsefsef

Senior member
Jun 21, 2007
218
1
71
Can someone explain 'stacked memory' to me? I assume it is what it sounds like it is. But how does heat from the bottom memory module escape? Does each memory chip get its own 'lane'? Is this stacked DDR3? DDR4? GDDR5? Something else completely? Lastly, for all this bandwidth, doesn't the CPU/GPU/APU still need to be at least a certain size? Or does the memory controller communicate in a different way than we're used to?

The stacked chips are bonded together, so they behave like one thick chip, and heat goes through that chip just like it does any other chip. To get the power map of a stacked chip, you just sum up the power maps of all the individual layers. The thickness of each layer is so small that even when they're stacked on top of each other, it's like they're one layer of a conventional chip. Bryan said in his talk that thermals are absolutely not a concern for 3D stacking.
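
To make the "sum up the power maps" idea concrete, here is a minimal Python sketch. The 2x2 grids and the wattage numbers are made up purely for illustration (they are not from Bryan's talk), and it assumes every layer is described on the same grid.

```python
def stacked_power_map(layers):
    """Sum per-layer power maps (watts per grid cell) into one map for the whole stack.

    Assumes every layer uses the same rows x cols grid.
    """
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same grid")
    return [
        [sum(layer[r][c] for layer in layers) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical numbers: one logic layer with a hot corner, two uniform DRAM layers.
logic = [[2.0, 0.5], [0.5, 0.25]]
dram = [[0.25, 0.25], [0.25, 0.25]]

total = stacked_power_map([logic, dram, dram])
print(total)
```

The hot spot of the logic layer stays the hot spot of the stack; the DRAM layers only add a uniform amount on top of it.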

It is not stacked DDRx. DDRx refers to the data rate of the *IO pins,* so what goes on inside the DRAM chip doesn't matter as long as it can keep up with the IO rate. HBM will have a layout and organization that will allow it to achieve about 1 Tb/s of throughput per stack (terabits, so roughly 128 GB/s; as per the slide above, 1024 IOs x 1GHz each), compared to the 1-2 GB/s that a DDR3 chip provides.
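
A quick sanity check of that arithmetic, as a rough Python sketch. It assumes "1GHz each" on the slide means 1 Gb/s per IO pin, and the DDR3 line uses a hypothetical x8 DDR3-1600 chip as the comparison point; neither number comes from a datasheet quoted in this thread.

```python
def peak_bandwidth_gbytes(io_pins, gbit_per_pin):
    """Peak bandwidth in GB/s: pin count x per-pin rate in Gb/s, divided by 8 bits per byte."""
    return io_pins * gbit_per_pin / 8


# Assumption: 1024 IOs at 1 Gb/s per pin, as read off the slide.
hbm_stack = peak_bandwidth_gbytes(1024, 1.0)

# Assumption: a single x8 DDR3-1600 chip (8 data pins at 1.6 Gb/s each).
ddr3_chip = peak_bandwidth_gbytes(8, 1.6)

print(hbm_stack, "GB/s per HBM stack")
print(ddr3_chip, "GB/s per DDR3 chip")
```

Under those assumptions one stack comes out at 128 GB/s, which is about one terabit per second, against 1.6 GB/s for the single DDR3 chip.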

HBM is a 2.5D technology, where there is a stack of memory *next to* the chip, connected by a silicon interposer. But in general, the various layers of a 3D stack do not need to be the same size. Smaller layers can go on top of larger layers.
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
Looking now at the HBM Jedec standard from Oct 2013:

(can be downloaded after registration)
http://www.jedec.org/standards-documents/results/jesd235

jedec said:
The HBM DRAM is tightly coupled to the host compute die with a distributed interface. The interface is divided into independent channels. Each channel is completely independent of one another. Channels are not necessarily synchronous to each other. The HBM DRAM uses a wide-interface architecture to achieve high-speed, low-power operation. The HBM DRAM uses differential clock CK_t/CK_c. Commands are registered at the rising edge of CK_t, CK_c. Each channel interface maintains a 128b data bus operating at DDR data rates.

Hans
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
Hynix will present this at ISSCC 2014:


http://www.miracd.com/ISSCC2014/PDF/ISSCC2014AdvanceProgram.pdf

SESSION 25 Wednesday February 12th

A 1.2V 8Gb 8-Channel 128GB/s High-Bandwidth Memory 2:00 PM
(HBM) Stacked DRAM with Effective Microbump I/O Test
Methods Using 29nm Process and TSV

D. U. Lee, K. W. Kim, K. W. Kim, H. Kim, J. Y. Kim, Y. J. Park,
J. H. Kim, D. S. Kim, H. B. Park, J. W. Shin, J. H. Cho, K. H. Kwon,
M. J. Kim, J. Lee, K. W. Park, B. Chung, S. Hong
SK Hynix, Icheon, Korea
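
For what it's worth, the numbers in that title can be broken down per channel. The sketch below is only my own arithmetic: the 8 channels, 128 GB/s and 8 Gb come from the ISSCC title, the 128b bus per channel and the DDR signalling from the JEDEC text quoted earlier in the thread, and the resulting clock is a derived guess rather than a published figure.

```python
CHANNELS = 8                # from the ISSCC title
BITS_PER_CHANNEL = 128      # from the JEDEC description quoted earlier
TOTAL_GBYTES_PER_S = 128    # from the ISSCC title
CAPACITY_GBIT = 8           # from the ISSCC title

total_pins = CHANNELS * BITS_PER_CHANNEL             # data pins across the stack
per_channel_gbytes = TOTAL_GBYTES_PER_S / CHANNELS   # bandwidth of one channel
per_pin_gbit = TOTAL_GBYTES_PER_S * 8 / total_pins   # data rate of one pin
clock_mhz = per_pin_gbit * 1000 / 2                  # DDR: two transfers per clock
capacity_gbytes = CAPACITY_GBIT / 8                  # capacity of one stack

print(total_pins, "data pins")
print(per_channel_gbytes, "GB/s per channel")
print(per_pin_gbit, "Gb/s per pin")
print(clock_mhz, "MHz clock, assuming DDR")
print(capacity_gbytes, "GB per stack")
```

That gives 1024 data pins at 1 Gb/s each, which lines up with the 1024-IO figure on the slide posted earlier.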


Hans
 

sniffin

Member
Jun 29, 2013
141
22
81
Nice to hear this. I wonder how soon we'll see this implemented in APUs. When it happens we'll probably see the biggest uplift in iGPU performance ever for AMD.

I wonder if this is something Intel will also consider. Surely this will end up cheaper than daughter dies?
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
I was at MICRO last week and Bryan Black gave the first morning keynote talk. It was the most interesting keynote I've ever attended at a computer architecture conference. Anyway, I snapped this picture during his talk (that's Bryan in the bottom right):

http://imgur.com/hxQCe8N.jpg

He said he was going to show a photograph of something that has never been shown before, but he ran out of time, so I have no idea if this is the thing that hasn't been shown publicly before or if that was something he didn't get to yet.

My picture is a little fuzzy, because I was far away, but that slide shows a genuine photograph of a working prototype, it's not just a cartoon figure.


Very interesting, thank you for posting!

That's 8 GByte of memory on the interposer with a 256 GByte/sec
bandwidth, apparently together with two equal controller slices(?)
that make up one big "die" via the interposer.

Hans.