[Sweclockers] SK Hynix is showcasing next-gen HBM memory for Nvidia's Pascal


nvgpu

Senior member
Sep 12, 2014
629
202
81
Use HBM now and be limited to 4GB on Quadro, Tesla & GeForce cards? That's a good laugh.

AMD may have been first to GDDR5, but their GDDR5 controllers suck: they're power inefficient and can't support higher speeds compared to Nvidia's GDDR5 controller, which can run at 7GHz and even supports 8GHz memory speeds.

http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2

Perhaps the icing on the cake for NVIDIA though is how many revisions it took them to get to 6GHz: one. NVIDIA was able to get 6GHz on the very first revision of GK104, which after Fermi’s lackluster performance is a remarkable turn of events. And ultimately while NVIDIA says that they’re most proud of the end result of GK104, the fact of the matter is that everyone seems just a bit prouder of their memory controller, and for good reason.

Nvidia will have the better memory controller again for the HBM era.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Nvidia will have the better memory controller again for the HBM era.

Just because they had it for GDDR5 doesn't mean they automatically will for HBM. So let's wait and see until we can compare products.
 

wand3r3r

Diamond Member
May 16, 2008
3,180
0
0
Use HBM now and be limited to 4GB on Quadro, Tesla & GeForce cards? That's a good laugh.

AMD may have been first to GDDR5, but their GDDR5 controllers suck: they're power inefficient and can't support higher speeds compared to Nvidia's GDDR5 controller, which can run at 7GHz and even supports 8GHz memory speeds.

http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2



Nvidia will have the better memory controller again for the HBM era.

GeForce cards? The only card with more memory is the Titan and possibly some custom 780 flavor.

What does a memory controller on a 2012 card demonstrate about the lack of HBM in 2015? It certainly doesn't prove the superiority you claim. :confused:

Why are you trying to spin the fact that nv is stuck on older memory as somehow being smart and superior? :rolleyes:
 

Skurge

Diamond Member
Aug 17, 2009
5,195
1
71
HBM has higher bandwidth and lower latency, uses half the power of GDDR5, shrinks the space the memory interface takes up on the die, makes for smaller, simpler PCBs, and runs cooler.

Please tell me it was smart for Nvidia to stick with GDDR5 for 2015 instead of HBM. We have no idea what other changes AMD have made to the architecture now that they have 50% more bandwidth to take advantage of.
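Rough numbers to put that 50% in context (my assumptions: first-gen HBM with 4 stacks at 1Gbps/pin vs. a 384-bit GDDR5 card at 7Gbps):

```python
# Back-of-the-envelope bandwidth comparison. Assumed figures:
# HBM1 = 4 stacks x 1024 bits x 1 Gbps/pin; GDDR5 = 384-bit bus at 7 Gbps.
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) / 8 * per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps

gddr5 = bandwidth_gb_s(384, 7.0)        # 336.0 GB/s (384-bit GDDR5 card)
hbm1  = bandwidth_gb_s(4 * 1024, 1.0)   # 512.0 GB/s (4 stacks, 1 Gbps/pin)
print(f"GDDR5: {gddr5:.0f} GB/s, HBM1: {hbm1:.0f} GB/s "
      f"(+{(hbm1 / gddr5 - 1) * 100:.0f}%)")  # ~52% more
```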
 

FatherMurphy

Senior member
Mar 27, 2014
229
18
81
I don't think there is any evidence that Nvidia is "behind" AMD with regard to HBM. What I mean is that, there is nothing to say the absence of HBM on GM200 is a technological deficiency on Nvidia's part rather than a deliberate decision based on the relative cost of GDDR5 v. HBM, the ability to stack more GDDR5 v. HBM, etc. The reviews I have seen of the Titan X show that it is not bandwidth bound when overclocking the core (admittedly, these are OCs done on the stock cooler, which appears to limit GM200). Perhaps GM200 simply would not substantially benefit from the increased memory bandwidth.

Sometimes you have to pick and choose competing technologies, and it isn't absolutely clear right now why Nvidia launched GM200 with GDDR5 instead of HBM, though we are all free to speculate.
 

AnandThenMan

Diamond Member
Nov 11, 2004
3,949
504
126
It's all about timing; plus, Nvidia has not been involved with HBM from the beginning like AMD has. And yes, Nvidia will be behind when the 3xxx cards hit. HBM is the future: AMD will have it, Nvidia not yet.
 

Despoiler

Golden Member
Nov 10, 2007
1,966
770
136
This is Nvidia marketing fluff at its finest. They always try to create relevance on a topic even when they have none. Nvidia has said that it's going to use HBM2 on Pascal, but there is nothing exclusive nor noteworthy about that. They are going to get beaten to the punch on HBM just like they did with GDDR5. AMD will be using HBM2 at the same time as Nvidia, which is entirely down to availability. I can only laugh at Nvidia every time they do something like this.
 

DownTheSky

Senior member
Apr 7, 2013
787
156
106
They are directly related though. I'll try to explain this with my limited knowledge.

Going by the rule of energy conservation - all energy is transformed, none is destroyed - all the electrical energy used is released as thermal energy. Electricity is the organized movement of electrons. A chip functions by changing the state of the transistors inside it from 0 to 1 (and -1?). This is done by controlling the flow of electrons passing through them. Some of those electrons escape the current (leakage) trying to find their way out of the chip; they start bouncing off the atoms inside the chip, which in turn start vibrating (heat is vibrating atoms). The more electrons that escape per unit of circuit, and the bigger the chip, the more wattage you'll need to properly operate the chip.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Not sure how Sweclockers can know which of AMD and Nvidia will get their FinFET GPUs out the door first. If they have solid information about the FinFET production wafer schedules, that would be far more interesting than "there will be HBM2 and it will be better than HBM1".
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Use HBM now and be limited to 4GB on Quadro, Tesla & GeForce cards? That's a good laugh.

AMD may have been first to GDDR5, but their GDDR5 controllers suck: they're power inefficient and can't support higher speeds compared to Nvidia's GDDR5 controller, which can run at 7GHz and even supports 8GHz memory speeds.

http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2



Nvidia will have the better memory controller again for the HBM era.

nVidia has to run their memory that fast because they use narrow buses on most of their cards. The 290X has a 512-bit bus; it doesn't need more expensive memory to get a given amount of bandwidth. If there is one thing AMD GPUs have excelled at, it is having plenty of bandwidth. Both Tahiti and Hawaii have more bandwidth than they need.
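To put some illustrative numbers on that (the clocks below are my assumptions, roughly matching real cards):

```python
# Bus width vs. memory speed trade-off. Peak bandwidth (GB/s) =
# bus width (bits) / 8 * per-pin data rate (Gbps). Clocks are illustrative.
def bandwidth_gb_s(bus_bits, rate_gbps):
    return bus_bits / 8 * rate_gbps

print(bandwidth_gb_s(512, 5.0))  # 320.0 GB/s - 290X-style: wide bus, cheaper 5 Gbps chips
print(bandwidth_gb_s(384, 7.0))  # 336.0 GB/s - 780Ti-style: narrower bus needs 7 Gbps chips
print(bandwidth_gb_s(256, 7.0))  # 224.0 GB/s - 680/770-style: narrow bus even with fast chips
```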

Also, saying nVidia is "smarter" for sticking with older, slower, higher latency memory is a bit laughable. If anything, part of AMD's deal with Hynix is that they have exclusive rights for the first release.

AMD helped develop HBM just like they helped to develop GDDR5 (and own patents on it).
 

Hitman928

Diamond Member
Apr 15, 2012
5,262
7,890
136
They are directly related though. I'll try to explain this with my limited knowledge.

Going by the rule of energy conservation - all energy is transformed, none is destroyed - all the electrical energy used is released as thermal energy. Electricity is the organized movement of electrons. A chip functions by changing the state of the transistors inside it from 0 to 1 (and -1?). This is done by controlling the flow of electrons passing through them. Some of those electrons escape the current (leakage) trying to find their way out of the chip; they start bouncing off the atoms inside the chip, which in turn start vibrating (heat is vibrating atoms). The more electrons that escape per unit of circuit, and the bigger the chip, the more wattage you'll need to properly operate the chip.

No offense, but none of this is really relevant to what was said, nor is it very accurate.
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
Not smarter, it's called being behind. AMD helped design HBM and as such will have the first GPUs out using it, very similar to GDDR5. Make no mistake, Nvidia would love nothing more than to be using HBM now.

I wouldn't be too sure about that. Why? HBM is very expensive right now. There is only one supplier, and the technology is in its first iteration. There are bandwidth/power gains to be had, but they come in return for much higher cost/risk compared to conventional DRAM technology.

By the time HBM2 is around, prices will have come down and there will perhaps be more suppliers to boot. The technology will have matured a lot more than it has right now. It's a similar argument to process nodes: nVIDIA and AMD have on several occasions learned the hard way that introducing a new GPU architecture on a brand-new process for a hypothetical 10~30% reduction in power consumption (with higher clocks) can backfire, resulting in lots of leakage and higher losses.

And history has shown us time and time again that, other than power consumption (which is good of course!), usually no high-end cards are bandwidth limited even in extreme scenarios, because the shader core becomes bottlenecked way before you start hitting bandwidth problems.

Think R600 with its GDDR4/512-bit bus, or the HD4870 with GDDR5 vs. its competition all using GDDR3... I look forward to the new tech, but I have my doubts, especially when the card has ALL that bandwidth to gobble up yet the main core is held back by 28nm.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
I wouldn't be too sure about that. Why? HBM is very expensive right now. There is only one supplier, and the technology is in its first iteration. There are bandwidth/power gains to be had, but they come in return for much higher cost/risk compared to conventional DRAM technology.

By the time HBM2 is around, prices will have come down and there will perhaps be more suppliers to boot. The technology will have matured a lot more than it has right now. It's a similar argument to process nodes: nVIDIA and AMD have on several occasions learned the hard way that introducing a new GPU architecture on a brand-new process for a hypothetical 10~30% reduction in power consumption (with higher clocks) can backfire, resulting in lots of leakage and higher losses.

And history has shown us time and time again that, other than power consumption (which is good of course!), usually no high-end cards are bandwidth limited even in extreme scenarios, because the shader core becomes bottlenecked way before you start hitting bandwidth problems.

Think R600 with its GDDR4/512-bit bus, or the HD4870 with GDDR5 vs. its competition all using GDDR3... I look forward to the new tech, but I have my doubts, especially when the card has ALL that bandwidth to gobble up yet the main core is held back by 28nm.


So...no one should buy/invest in new tech because something better is around the corner? Makes perfect sense.
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
So...no one should buy/invest in new tech because something better is around the corner? Makes perfect sense.

I'm not sure how you came to that conclusion.

Investment in new tech leads to better tech. Commercialization, on the other hand, is a different animal altogether.

A good example is GDDR4. Why didn't GDDR4 take off yet its successor GDDR5 did?

Hynix and all the other companies involved in making HBM a viable commercial product will reach a point where it becomes financially sensible (and very mature) to use it for its performance benefits over tried-and-proven DRAM technology. Yet for its early customers, e.g. AMD, it is still a big risk. I'm sure they have done many analyses on this, but I still have my doubts, because being an early adopter of any new technology can be punishing!
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
Use HBM now and be limited to 4GB on Quadro, Tesla & GeForce cards? That's a good laugh.

Proof? If Hynix increases density from 1GB to 2GB per stack, you can have 8GB HBM1 with interlinking on the same 4096-bit bus.
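The arithmetic, assuming the standard four-stack HBM1 layout:

```python
# HBM1 capacity = number of stacks x GB per stack (4-stack layout assumed).
stacks = 4
print(stacks * 1)  # 4 GB with today's 1GB stacks - the much-quoted "limit"
print(stacks * 2)  # 8 GB if Hynix doubles density to 2GB per stack
# The bus is unchanged either way: 4 stacks x 1024 bits = 4096-bit total.
```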

AMD may have been first to GDDR5, but their GDDR5 controllers suck: they're power inefficient and can't support higher speeds compared to Nvidia's GDDR5 controller, which can run at 7GHz and even supports 8GHz memory speeds.

GK104 had a 256-bit bus vs. a 384-bit bus on Tahiti.

My 7970 has a 384-bit bus, and the memory can run at 1700-1750MHz, for an effective rate of 6800-7000MHz! That's 50% more memory bandwidth than on GK104, while packing a whopping 1.2TFLOPS of DP performance in a 354mm2 die, and beating a 680/770 in gaming performance.

Do you honestly even research? How do you not understand that GDDR5 speed itself ties into the bus width of the memory controller? It's a lot easier to achieve 7-8GHz GDDR5 speeds on a 256-bit memory controller than on a 384-bit one, and it's easier to hit 7GHz on a 384-bit than on a 512-bit.

Despite a 384-bit memory controller, the HD7970, a GPU from January 2012, has been known to hit 320-365GB/sec of memory bandwidth with overclocking. The 780Ti and Titan X, on the same 384-bit bus, only got to that level 2-3 years later! 3 years have passed and the Titan X sits at 336GB/sec. Whoops.

[Image: http://www.techpowerup.com/img/12-04-03/24k.jpg]
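For anyone checking the maths (GDDR5's effective rate is 4x the memory clock; 1700-1750MHz are overclocks, 1375MHz is stock):

```python
# HD7970 memory bandwidth on its 384-bit bus. GDDR5 transfers 4 bits per pin
# per memory clock, so effective rate (Gbps) = 4 * clock (MHz) / 1000.
BUS_BITS = 384
for clock_mhz in (1375, 1700, 1750):  # stock, then the overclocks quoted above
    rate_gbps = 4 * clock_mhz / 1000
    print(f"{clock_mhz} MHz -> {BUS_BITS / 8 * rate_gbps:.0f} GB/s")
# 1375 MHz -> 264 GB/s (stock); 1750 MHz -> 336 GB/s: Titan X territory, in 2012.
```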

NV struggled to get GDDR5 memory up to speed with Fermi because, outside of the GT 240, it was their first generation of GPUs with GDDR5. AMD struggled much less with the 5850/6970/7970. Your statement that AMD can't design memory controllers is based on no evidence.

Nvidia will have the better memory controller again for the HBM era.

More conjecture with 0 evidence. If anything, it's more likely that AMD will have a better chance of having a superior memory controller in terms of die size and efficiency, since HBM2 will be their 2nd generation.

Also, saying nVidia is "smarter" for sticking with older, slower, higher latency memory is a bit laughable. If anything, part of AMD's deal with Hynix is that they have exclusive rights for the first release.

AMD helped develop HBM just like they helped to develop GDDR5 (And own patents on).

Ya, seriously. If GM200 could have had HBM1, the card would have used way less power and the die size would have been smaller due to a more compact memory controller.

There is a ~55W reduction in power usage going from a 512-bit memory controller paired with 8Gbps GDDR5 to 1st-gen HBM, making HBM1 nearly 3X more efficient than GDDR5 at a similar level of memory bandwidth!

[Image: l29o6zV.png]
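If you want rough numbers behind that chart: using the approximate GB/s-per-watt figures from AMD/Hynix marketing slides (~10.66 for GDDR5, ~35 for HBM; these are marketing estimates, not measurements), the ratio lands near 3x:

```python
# Memory interface power at equal bandwidth, using approximate GB/s-per-watt
# figures from AMD/Hynix HBM marketing slides (not measured numbers).
bandwidth_gb_s = 512                   # 4-stack HBM1, or 512-bit GDDR5 at 8 Gbps
gddr5_watts = bandwidth_gb_s / 10.66   # ~48 W
hbm1_watts = bandwidth_gb_s / 35.0     # ~15 W
print(f"GDDR5 ~{gddr5_watts:.0f} W vs HBM1 ~{hbm1_watts:.0f} W, "
      f"~{gddr5_watts / hbm1_watts:.1f}x more efficient")  # ~3.3x - the 'nearly 3X'
```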
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
Just to correct you - the picture of Nvidia's incorporation of 2.5D memory isn't actually 2.5D memory. If you look closely (well, you don't need to with the giant picture you posted), you'll notice that there is no interposer. This is simply on-package DRAM, which has been done before. It will be interesting to see how easily NVIDIA can implement actual 2.5D into their architecture.

From what I've read, when the DRAM sits next to or around the GPU, the interposer is underneath, and that's called 2.5D interposer stacking. Again, this is a render, so it doesn't have to be 100% accurate in depicting the interposer. 3D stacking is when the DRAM sits right on top of/underneath the GPU die. By all indications from that picture, Pascal is 2.5D stacking. That's how AMD defines it as well. Even Jonah Alben of NV confirmed that Pascal uses 2.5D stacking.

[Image: amd_stacked_memory.png]


In order to have 3D vertical stacking, you'd need to be able to dissipate the heat from both the DRAM and the GPU die simultaneously. How would you accomplish that? You would need to have the entire 3D DRAM stack on the other side of the PCB, where the back-plate is, while the GPU die stays exposed where it normally sits. However, under such a scenario, you would need an insane amount of DRAM density to fit 32GB of memory in a single stack on Pascal. Alternatively, the memory dies would need to be so small that you could fit 4 stacks on the other side of a huge 600mm2 die. That sounds very complex, because the current DRAM manufacturing/fabrication process isn't advanced enough to go that small!

"Nvidia’s senior vice president of GPU engineering, Jonah Alben, didn’t want to comment on the manufacturing process, or if the chip has already taped out. He was clear that the Pascal uses 2.5D HBM memory, which you can tell from the Pascal renders that we saw at GTC in March 2014 and again just hours ago.

He didn’t want to comment if the Volta card with new architecture will use the real 3D memory, where the memory chips are stacked on top of the GPU. Volta according to latest Nvidia roadmaps can be expected around 2018."

http://www.fudzilla.com/news/graphics/37294-pascal-uses-2-5d-hbm-memory

As I said, based on the information currently available from various sources and the Pascal renders, I stand firm that Pascal is 2.5D stacking.

It is correct. HBM is a 3D memory.

3D stacking is a different thing.

OK, well, I guess you could in theory call a 2.5D interposer design "3D memory", since the memory itself is stacked vertically. However, the logical definition of "real" 3D memory is a 3D vertical stack directly on top of/underneath the GPU, and this is the definition AMD agrees with too. That's coming with Volta in 2018, not Pascal, and yet NV wants to market Pascal with "3D memory", which sounds misleading indeed.
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
Do you honestly even research? How do you not understand that GDDR5 speed itself ties into the bus width of the memory controller? It's a lot easier to achieve 7-8GHz GDDR5 speeds on a 256-bit memory controller than on a 384-bit one, and it's easier to hit 7GHz on a 384-bit than on a 512-bit.

I wouldn't be so sure about that. I think it's almost entirely related to the memory controller and the bus design/layout, not the actual VRAM IC itself. You're also forgetting that it's not just one 512-bit memory controller, but rather, say, 8x64-bit memory controllers. I'd think it's more to do with how well an individual memory controller can handle higher clocks, because the rest are just duplicates of it.

If AMD had had the chance of going 256-bit with a higher-clocked part, they would most definitely have taken that route, because complexity as a whole goes down, i.e. cost. I think nVIDIA does hold an advantage here, hence why they constantly get away with narrower memory buses than the competition.
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
How expensive? How much will HBM add to the cost of the card versus GDDR5?

Some good articles:
http://semiengineering.com/time-to-revisit-2-5d-and-3d/
http://chipdesignmag.com/display.php?articleId=5279

And some numbers:
-2x the cost of LPDDR3 (LPDDR3 is not cheap either)
-~25c per mm2 for the interposer (~5 times the cost of DRAM, for something that's just wires and at 65nm!)

If you do some maths, it doesn't look too good. And we don't know what the yield rates will be for the separate parts, nor once they are assembled together...
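A sketch of that maths using the ~25c/mm2 figure above (the interposer area is my guess: a big GPU plus four HBM stacks needs something in the 800-1000mm2 range; yields ignored):

```python
# Interposer cost floor from the ~25c/mm2 figure quoted above. The areas are
# guesses (big GPU + 4 HBM stacks); assembly/yield losses are not included.
COST_PER_MM2 = 0.25
for area_mm2 in (800, 1000):
    print(f"{area_mm2} mm2 interposer -> ~${area_mm2 * COST_PER_MM2:.0f}")
# ~$200-250 of extra BOM before you even buy the DRAM stacks themselves.
```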
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
I wouldn't be so sure about that. I think it's almost entirely related to the memory controller and the bus design/layout, not the actual VRAM IC itself. You're also forgetting that it's not just one 512-bit memory controller, but rather, say, 8x64-bit memory controllers. I'd think it's more to do with how well an individual memory controller can handle higher clocks, because the rest are just duplicates of it.

You confused me. You said the exact same thing as me, but in different words. I just said that the complexity of the memory controller determines how well it can hit higher clocks, because you can easily source 5-8Gbps GDDR5 chips. The more complex the memory controller, the harder it is to achieve higher clock speeds. That's been the general rule for AMD/NV for several generations, but it's not always 100% true, as I've shown with the 384-bit 7970 hitting 365GB/sec!

If AMD had had the chance of going 256-bit with a higher-clocked part, they would most definitely have taken that route, because complexity as a whole goes down, i.e. cost.

No, they would not. That's the #1 misconception about Hawaii. AMD reduced the memory controller's die area by 20% from Tahiti's 384-bit bus, and because the controller is 512-bit, they could use less power-hungry, slower GDDR5 chips. The end result is a 50% increase in memory bandwidth/mm2. That's winning engineering 101. Your suggestion that AMD would have been better off with a 256-bit or 384-bit memory controller on the 290X doesn't fly. The 290X keeps up with the 780Ti while having VASTLY superior DP performance, similar perf/watt, and a 438mm2 die vs. the 561mm2 die of Kepler GK110!
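Sanity-checking that 50% with stock numbers (Tahiti at 384-bit/5.5Gbps, Hawaii at 512-bit/5Gbps, and the ~20% smaller controller area mentioned above):

```python
# Memory bandwidth per mm2 of controller, Hawaii vs. Tahiti. Stock clocks;
# the 0.8x area factor is the ~20% controller shrink cited above.
tahiti_gb_s = 384 / 8 * 5.5   # 264 GB/s
hawaii_gb_s = 512 / 8 * 5.0   # 320 GB/s
area_factor = 0.8             # Hawaii's controller area relative to Tahiti's
gain = (hawaii_gb_s / tahiti_gb_s) / area_factor
print(f"~{(gain - 1) * 100:.0f}% more bandwidth per mm2")  # ~52%, i.e. the ~50% figure
```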

So it's clear that when comparing efficiency per mm2 of two head-to-head competing architectures (290X vs. 780Ti), Hawaii completely smashed its direct competition!! AMD's engineers designed a crazy-efficient 512-bit memory controller which allowed the 290X to be just 438mm2, only 24% larger in die size than a 7970, yet pack 21% more memory bandwidth (on a 33% wider bus) and 37.5% more functional units (SPs and TMUs), with 100% more ROPs! That's incredible in hindsight.

From an engineering point of view (SP/DP compute and perf/mm2), the 290X is by far superior to the 780Ti. Just think about it: 550mm2 is what the 390X is rumoured to be, which is another way of asking: if you scaled Hawaii/290X to 550mm2, how would the 561mm2 780Ti compare? It wouldn't stand a chance! All this time NV has been 'lucky' that AMD didn't have the b**lls to make a 500mm2+ GPU. Once that happens, NV's 15-20% historical advantage is going to disappear.

[Image: Memory.jpg]


I think nVIDIA does hold an advantage here, hence why they constantly get away with narrower memory buses than the competition.

NV might hold an advantage in more efficient colour compression, but not in the design of the memory controller. As I already said, the 290X matches or beats the 780Ti in performance despite a 438mm2 die size, while still packing a ton of DP performance and a 512-bit memory controller. Despite a 561mm2 die, the 780Ti could only manage a 384-bit memory controller and far inferior SP and DP performance, and can't even outperform the 290X!



About the only thing the 780Ti can rightfully claim over the 290X from an engineering point of view is about an 11% advantage in perf/watt at 1440/4K. That's nothing, considering AMD's engineers designed a way better all-around gaming+compute chip at only $550, gave it 4GB of VRAM, and packed it into a die just 78% the size of the 780Ti's. That's why members of our forum who are so quick to write off AMD, comparing Maxwell against the outdated R9 200 series without understanding just what AMD's engineers achieved with the 290X, are going to be in for a major surprise when the 390X drops.

Just wait until the 390X - it should level the Titan X in perf/mm2 and SP and DP compute performance, and provide >50% more memory bandwidth, and then you'll see just how well AMD can design a memory controller. :p

That's why this idea that Pascal will use HBM2 but AMD won't is some fluffy BS alright.
 

Adul

Elite Member
Oct 9, 1999
32,999
44
91
danny.tangtam.com
If I recall correctly, AMD did have a much better GDDR5 controller in earlier generations of its cards that could reach higher speeds, but it was deemed more than what was needed when they went with a wider bus instead.