> It's not true at all that Intel or AMD made any comments on this. Their comments were strictly on licensing.
> There is too much smoke, and while there doesn't seem to be any licensing deal, that has nothing to do with putting two dies in a single package and maybe using Intel's silicon bridge to link them.
> Intel makes the CPU, AMD makes the GPU, and that's it. Not much different than pairing an Intel CPU with an AMD discrete card.
> It's cost effective, it makes some sense, and it's likely happening.
> Stacking memory on logic is the path forward and will happen, there is no doubt about it, but we are not there yet. Cost and thermals are issues, but you save power and gain performance.

Not until spintronics. Thermal density sets the limit for performance. 2.5D doesn't try to circumvent this, and that is why it is so successful: it beat every timetable, whereas HMC failed. I even got a response that they weren't competing. 🙂
I'm sorry, but that still contradicts Lisa's statement. An Intel package with Intel CPU & AMD GPU would compete against an AMD APU.
Source: AMD
Anyways, sorry everybody for being off-topic.
> Not if it's done at the request of a customer, without availability outside of that.
> It's no different than selling discrete, and it's a gain for AMD at Nvidia's expense.
> It would also be followed by similar AMD solutions for other OEMs, taking share away from discrete and Nvidia.

I can speak from direct correspondence that their CPUs are more important for AMD than their GPUs at this time.
> EPYC multi-die sockets aren't even for HEDT; they are for servers. Match!

Sorry, I thought we were talking about Raven Ridge? As in: the mobile/low-to-mid-power desktop chip? And besides, stacking makes even less sense in the server space, where CPUs typically have ~125-140 W TDPs and very strict thermal tolerances due to high density. Good luck shoving any kind of secondary IC in between the CPU and the IHS/HSF there without toasting the entire chip.
> Ofc they are, because that's where they need the BW.
> However, they are not moving away from stacked chips, as you call it; they are moving towards them, just like everybody else.
> They are moving away from the silicon interposer, as those are too costly.
> Even the fan-out solution for mobile shows stacked DRAM.
> 2.5D and 3D packaging is a great thing and will be used more and more, as monolithic 3D is not quite here yet and process shrinks don't provide much any more.
> But nobody will use it for fun; it has to make sense.

Moving away from the interposer and moving towards PoP stacking are not necessarily the same thing. Also, funny that you use the "fan-out" solution as an example, given that the 2nd-gen fan-out clearly shows them not stacking ICs on top of each other any more. Now, I wonder why they might stop doing that? Might it be related to thermals? Hmmmm.
> What?! Did I miss the news that talks about the technology going in this direction, HBM on top of CPUs? Can I have a source, please?

Yeah, I'd like one too. And I'd like to see one demonstration of stacked memory on top of a power-consuming chip like a GPU, CPU or SoC where it doesn't thermally limit performance, even at very low wattages.
> Depends on the market. Cost is the bigger problem in some markets. It's not the same scenario in a 100 W ASIC, a 4 W SoC or a 200 mW SoC, and of course there are all kinds of memories.
> Logic on logic is hard, but that will come too.

No, it won't. That is an NP-hard question.
> Are you high?
> In mobile they move from PoP to fan-out with stacked DRAM. DRAM is an IC, BTW.
> In the high-end solution they just replace the Si interposer for lower cost but still use HBM.
> You really have no idea what's in that slide; maybe Google "PoP", since you clearly don't know what that is.

Okay. Let me be clear: the first "fan-out" picture shows DRAM stacked on top of the application processor, i.e. PoP. The second clearly doesn't. Now, I might be misreading the slide, but to my eyes that's a pretty clear illustration of no longer stacking RAM on top of the SoC. Of course DRAM is an IC (don't insult my intelligence, you dolt), but it's not one that produces any noticeable amount of heat. Instead, if stacked on top of a heat-producing IC such as a CPU, GPU or SoC, it acts as an insulator. Is that so hard to grasp? The second "fan-out" image is labeled "side by side"; what do you suppose they mean by that? And to be even clearer: I've never, ever talked about stacking DRAM on top of DRAM (which is an utterly obvious thing to do, and which all HBM does), in case your reading comprehension is that awful. I've been talking about stacking DRAM and other, hotter ICs together, which is the whole core of this discussion, of HBM and Raven Ridge. Oh, and the Server & Network part of the image only shows the use of a cheaper, easier-to-produce material for the substrate, no other change than that.
One's mobile tech; the other is cutting-edge, power-restricted, budget-restricted, feasibility-limited prototype runs. Completely different. The only way it comes is when HBM is cheaper than regular memory, a.k.a. the inverse of the proposition.
> PoP is not stacking chips; it's "stacking" packages. Two very, very different things.
> You claim that DRAM isn't producing any "noticeable amount of heat" and I'm the one that you deem necessary to insult? ROFL.

Very different only in production, not in thermal characteristics. Sure, removing the substrate of the top package removes one layer of thermal insulation; there are quite a few left. And yes, in comparison to an SoC, CPU or GPU, RAM produces negligible amounts of heat. Show me a DRAM stack/chip with a TDP above 2-3 W, even in a high-power scenario like a GPU, and I'll buy you an ice cream. In mobile, where RAM is extremely heavily optimized for power (rather than performance, like in a GPU), RAM might still be one of the more power-consuming parts, but nowhere near the SoC or display. If that were the case, phones with more RAM would have significantly worse battery life...

As an example, check out this paper. Sure, memory is a measurable power draw, and in certain scenarios a significant one. But compared to the SoC, even in a smartphone with a ridiculously low SoC TDP, it's only a fraction.
> You really need to stop talking; you keep saying crazier and crazier things.
> Here's an old slide from the AMD Fury stack. In mobile it's much worse, where the DRAM uses a couple of W, but this is more than enough. You are going on the ignore list.

Well, I guess you won't see this (ignoring people that disagree with you makes for fun discussions!), but thanks for demonstrating my point, I suppose. That slide shows that a huge 512-bit GDDR5 bus (consisting of 16 32-bit interfaces, I assume) consumes gobs of power (duh), while four stacks of HBM drastically lower that. And that's in a high-power environment (GPU) where the RAM is constantly being hammered with use, i.e. those are maximum numbers or close to it. From that, we can tell that a single stack of HBM1 would consume something in the realm of 7 W max. Now, HBM2 is supposed to be noticeably more power efficient than that, ~20% at the same bandwidth, IIRC. That brings us down to 5.5 W max, while still running at high clocks. And as power draw with increased clocks is non-linear for RAM, as for anything else, lowering clocks would cut that drastically. I wouldn't be surprised at all if you could get a single HBM2 stack down to ~2-3 W sustained max at mobile-suitable clocks. Will it be as fast? Of course not. But that's neither necessary nor desirable in the mobile space. Oh, and that 3 W figure would rarely, if ever, actually be hit, due to the nature of mobile workloads (see the paper above).
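The stack-power arithmetic above can be put in a few lines. All inputs are assumptions taken from the discussion (roughly 30 W for four HBM1 stacks, in line with the ~7 W per stack estimated above; a ~20% HBM2 efficiency gain; and a hypothetical mobile operating point), not datasheet values:

```python
# Back-of-the-envelope version of the estimate above.
# All inputs are assumptions from the discussion, not datasheet values.

hbm1_four_stacks_w = 30.0                       # assumed total for 4 HBM1 stacks
hbm1_per_stack = hbm1_four_stacks_w / 4         # ~7.5 W per stack
hbm2_per_stack = hbm1_per_stack * (1 - 0.20)    # HBM2 ~20% more efficient -> ~6 W

# DRAM power scales roughly with clock and with voltage squared, so
# lowering clocks (and voltage with them) cuts power faster than linearly.
def scaled_power(p_max_w, clock_frac, voltage_frac):
    return p_max_w * clock_frac * voltage_frac ** 2

# Hypothetical mobile operating point: 60% clocks at 85% voltage.
mobile_w = scaled_power(hbm2_per_stack, 0.6, 0.85)
print(round(hbm1_per_stack, 1), round(hbm2_per_stack, 1), round(mobile_w, 1))
```

With these assumed inputs, the sketch lands in the same ~2-3 W sustained range argued above for a downclocked HBM2 stack.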
> It wouldn't be low-perf (i.e. not the ~11 CUs that RR will now ship with); more like 20+ CUs.
> It'd be for use (1) in areas which benefit from limited acceleration across GPUs, so it presents a more balanced solution than dGPUs, and (2) as a high-end APU for consumers/prosumers.
> HBM is obviously needed to feed case (2), and in terms of (1), having a local on-die area that can act as a... I'll not say cache, as the pure latency isn't there for that... but more a second storage using separate memory "channels" can lift the bandwidth load off the main memory access, and that would apply to both x86 and the GPU.
> Erm... just had a thought.
> With the way Infinity Fabric is... isn't that virtually the same as a CCX+GPU? Would the HBCC in Vega allow for seamless integration into IF, I wonder?
> Hmmmm....

I always understood HBCC to be parsed as (HBC)C, in other words a (high-bandwidth cache) controller. No HBC, a.k.a. HBM2 or equivalent, makes an HBCC ineffective or inefficient. Just my thoughts.
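The "separate channels" idea above can be given rough numbers. Everything here is an illustrative assumption (dual-channel DDR4-3200 for the system pool, a single HBM2 stack at 1.6 Gb/s per pin for the local pool), not a product spec:

```python
# Rough numbers for the "separate channels" argument.
# Figures are illustrative assumptions, not product specs.

# Dual-channel DDR4-3200: 2 channels x 8 bytes x 3.2 GT/s.
ddr4_gbs = 2 * 8 * 3.2                # 51.2 GB/s for the system pool

# One HBM2 stack: 1024-bit bus at an assumed 1.6 Gb/s per pin.
hbm2_gbs = 1024 / 8 * 1.6             # 204.8 GB/s for the local pool

# Traffic steered to the local pool adds bandwidth instead of
# contending for the same DDR4 channels.
total_gbs = ddr4_gbs + hbm2_gbs
print(round(ddr4_gbs, 1), round(hbm2_gbs, 1), round(total_gbs, 1))
```

The point being that a local HBM pool on its own channels serves the GPU's streaming traffic while leaving the DDR4 channels largely to the CPU, rather than both sharing one bus.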
> No, they can't. That's the issue. Even Kaby-G is probably an Apple semi-exclusive.

Some of us predicted this last year. Strangely, I find myself agreeing with you on this.
That's what I am predicting AMD is doing with Navi.
> No way. You know that stacked memory leads to cooling issues for tablets, right? With ~2.5-4 W SoCs? Stacking memory on top of a 150+ W GPU would be a recipe for disaster. It would probably destroy itself rather quickly, even if it had some sort of thermal vias embedded into the HBM.

You are making the error of assigning the total power budget to a single stack. The power of scaling is to distribute power and computation. What if each stack is 30 W or 50 W? Workable then?
You mean "3x better performance for the fraction of a second it takes it to overheat and throttle". The announcements made sense in regard to further developing an up-and-coming memory technology, making it suitable for more use cases in the future.
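The "3x until it throttles" point above can be sketched as a toy average: a short burst at high speed followed by a throttled crawl yields little sustained gain. The durations and speed factors below are made up purely for illustration:

```python
# Toy model of "3x faster until it throttles": a short burst at high speed
# followed by a throttled crawl averages out to little sustained gain.
# All durations and speed factors are made-up for illustration.

def average_speedup(burst_x, burst_s, throttled_x, total_s):
    throttled_s = total_s - burst_s
    return (burst_x * burst_s + throttled_x * throttled_s) / total_s

# 3x for 2 seconds, then 0.8x (throttled) for the rest of a 60 s workload:
print(round(average_speedup(3.0, 2.0, 0.8, 60.0), 2))
```

Under these assumed numbers the sustained average is actually below 1x, i.e. worse than never bursting at all.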
I tend to agree, but I also think that depends on how powerful it turns out to be. If it's comparable to a low-end dGPU, I'd see some takers. The problem with Iris is that while it's far faster than HD 620 and the like, it's not fast enough that that actually matters. 620 does all the acceleration and media playback you want, but Iris can't really game even at 720p. If KBL-G could replace a RX 550 or GT 1030, I would want one.
> You are making the error of assigning the total power budget to a single stack. The power of scaling is to distribute power and computation. What if each stack is 30 W or 50 W? Workable then?

No memory cell will ever have a 30-50 W TDP, unless sizes change dramatically.
> You are making the error of assigning the total power budget to a single stack. The power of scaling is to distribute power and computation. What if each stack is 30 W or 50 W? Workable then?

It sounds like you're mixing up VRAM power and GPU power (or you're not expressing yourself quite clearly). I (we?) was talking about stacking HBM on top of a relatively high-powered chip like a CPU, GPU or APU. Even though the HBM does consume some power, it's bound to be far lower than the chip beneath, unless you're talking about a Core m competitor, in which case a couple of stacks of low-clocked HBM2 would probably be roughly equal. The problem here is twofold: the first (and lesser of the two) is that you're concentrating heat-generating chips into a smaller surface area, i.e. concentrating the heat. The second, and bigger, issue is that the stacked-on-top HBM acts as a thermal insulator between the CPU/GPU/APU die and the IHS (or cold plate in a laptop), making thermal transfer (cooling) far more difficult. In essence, you're trapping part of the heat from the lower chip, not allowing it to be carried away by the cooling system.
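The insulation argument above can be illustrated with a minimal 1-D steady-state thermal model, where the temperature rise equals power times the summed thermal resistance of the layers in the heat path. The layer resistance values below are invented for illustration, not measured:

```python
# Minimal 1-D steady-state model: T_junction = T_ambient + P * sum(R_layers).
# Layer resistances (K/W) are invented for illustration only.

def junction_temp_c(power_w, ambient_c, layer_resistances):
    return ambient_c + power_w * sum(layer_resistances)

direct_path = [0.10]                # die -> TIM -> IHS (assumed TIM resistance)
through_dram = [0.10, 0.40, 0.10]   # die -> TIM -> DRAM stack -> TIM -> IHS

t_direct = junction_temp_c(15.0, 25.0, direct_path)    # 15 W SoC, 25 C ambient
t_stacked = junction_temp_c(15.0, 25.0, through_dram)
print(round(t_direct, 1), round(t_stacked, 1))
```

Even at identical power, the extra series resistance of a DRAM stack in the heat path raises the die temperature; in practice the stack also hinders lateral heat spreading, so a real case would look worse than this sketch.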
> Raven Ridge looks to be exactly what an AMD APU was supposed to be in the first place: a cheap yet powerful CPU+iGPU, and now at low wattage as well.
> With 40% higher iGPU perf than Bristol Ridge at 50% less power, it could easily become the best APU for laptops in 2017.

Yeah, despite wishing for an HBM-equipped APU, I think RR will be pretty great. I'll probably be in the market for a new laptop in the next year, so hopefully they get some good design wins. I'd like something Surface Pro-like (with pen support, and preferably some sort of docking that enables a cTDP-up state), or a simple 13-14" ultrabook like the XPS 13 with great battery life. Should be very doable, and should be able to play some decent games. I'm not looking for AAA on my laptop, but the odd round of Rocket League (without it looking like crap) would be nice.
> Yeah, despite wishing for an HBM-equipped APU, I think RR will be pretty great. I'll probably be in the market for a new laptop in the next year, so hopefully they get some good design wins. I'd like something Surface Pro-like (with pen support, and preferably some sort of docking that enables a cTDP-up state), or a simple 13-14" ultrabook like the XPS 13 with great battery life. Should be very doable, and should be able to play some decent games. I'm not looking for AAA on my laptop, but the odd round of Rocket League (without it looking like crap) would be nice.

I am looking very much at AA on my laptop, in case I ever buy one. The quoted 40% GPU improvement is a direct count from GPU cores. I can see where that fits, but there is still the question of AA: it requires bandwidth.
> With 40% higher iGPU perf than Bristol Ridge at 50% less power, it could easily become the best APU for laptops in 2017.

We both know how that 40% number is calculated; I wonder how it translates in real life, especially with the power claim.
> I am looking very much at AA on my laptop, in case I ever buy one. The quoted 40% GPU improvement is a direct count from GPU cores. I can see where that fits, but there is still the question of AA: it requires bandwidth.

I think you misread my post entirely. I was talking about "triple-A" games (e.g. CoD, Tomb Raider, that kind of stuff), not anti-aliasing.
I cannot live without narrow-tent: SMAA + MLAA + 4x adaptive MSAA with a narrow-tent filter. It would be great if it returned to the filtering stack. It really is something for me.