Linus Torvalds: Discrete GPUs are going away


sao123

Lifer
May 27, 2002
12,653
205
106
discrete is not going anywhere any time soon... However, I expect to see GPU/VRAM sockets on the motherboard, similar to how CPUs and RAM are socketed now.

This will shift manufacturing costs from the GPU manufacturers to the motherboard manufacturers, but will allow for a wider range of video configurations.
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
Well, ARM and Atom may kill off x64 Core CPUs, at least at the low end. There is talk that Intel will kill off Core-based Celeron and Pentium CPUs, and fill those low-end brands with purely Atom CPUs.

Whether this will cause Intel to slowly kill off their higher-end CPUs would be interesting to watch, since it's roughly analogous to what's happening between "good enough" IGPs and the best, or at least better, dGPUs. ("Good enough" Atoms eliminating the market for Core-based CPUs, first at the low end, then... the high end too?)

Except that this isn't happening with dGPUs beyond the $100 mark yet...
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
discrete is not going anywhere any time soon... However, I expect to see GPU/VRAM sockets on the motherboard, similar to how CPUs and RAM are socketed now.

This will shift manufacturing costs from the GPU manufacturers to the motherboard manufacturers, but will allow for a wider range of video configurations.

I don't expect this ever. We'll have stacked RAM before long.
 

jpiniero

Lifer
Oct 1, 2010
16,868
7,310
136
I agree with what Linus said, but it's going to be more about Intel strong-arming than anything else. Will cutting off dGPU access to the dual-core laptops be enough? Probably not. They'll have to do more, but they do have to be delicate about it. But hey, it's Intel; they'll figure it out.

5 years from now? It looks more like 1-2 years from now, with Skylake GT4 and its 144 Gen9 EUs. Gen10 GT5, if Intel decides to make one, could be a serious competitor for dGPUs in 2016.

Intel's going to have to fix the fill rate problems first. The gap between the 5200 and the 30W 840M widens as you raise the resolution, although that could be a bandwidth issue. Don't know how many EUs Skylake GT4 will have, but 144 seems extremely optimistic. Should be great at SP compute, though.

That doesn't speak to the idea of a socketed GPU, though.

There is no way in hell Intel will allow anything like that. Even something like NVLink isn't happening. The Phi is uncompetitive on single precision, so I imagine it wouldn't make much sense to use it in games. I could see AMD doing it.
 

Zodiark1593

Platinum Member
Oct 21, 2012
2,230
4
81
What would be interesting is a standardized blade-style board design with an SoC and everything on board. While basic users could get away with one, power users could combine multiple boards in a sort of micro blade PC.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Intel's going to have to fix the fill rate problems first. The gap between the 5200 and the 30W 840M widens as you raise the resolution, although that could be a bandwidth issue. Don't know how many EUs Skylake GT4 will have, but 144 seems extremely optimistic. Should be great at SP compute, though.
We'll see how Intel improves its IGPs with Gen8 and Gen9. Skylake is rumored to have 72 EUs, but the GT level apparently wasn't mentioned, so now people think it's GT4.

But there's a presentation somewhere on Intel's graphics which says it can scale to GenX: GT4 - 96 EUs. That's obviously referring to Broadwell, because GT2 and GT3 will have 24 and 48 EUs, so GT4 continues the 2x trend.

If Skylake has a GT level with 72 EUs, it can't be GT4 (a GT4 couldn't have fewer than 96 EUs), so it would be GT3; GT4 would then be 2x that, 144 EUs, and GT2 would have 36 EUs.

If every GT step were +24 EUs, GT2 would have 24 EUs, GT3 48 EUs and GT4 72 EUs as rumored, but that wouldn't fit with the exponential nature of the semiconductor industry.
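To make the two hypotheses concrete, here is a rough sketch of the EU counts each scheme implies and the single-precision throughput they would give. The 16 FP32 FLOPs per EU per clock figure and the ~1 GHz clock are my own assumptions for illustration, not confirmed Skylake specs.

```python
# Quick sketch of the two Skylake GT scaling hypotheses discussed above.
# Assumptions (mine, not confirmed specs): each Gen EU does 2 x SIMD-4 FMA
# = 16 FP32 FLOPs per clock, and the GPU runs at roughly 1.0 GHz.

FLOPS_PER_EU_PER_CLOCK = 16
CLOCK_GHZ = 1.0

def sp_gflops(eus, clock_ghz=CLOCK_GHZ):
    """Theoretical single-precision GFLOPS for a given EU count."""
    return eus * FLOPS_PER_EU_PER_CLOCK * clock_ghz

# Hypothesis A: GT tiers keep doubling (Skylake GT3 = 72 -> GT4 = 144, GT2 = 36)
doubling = {"GT2": 36, "GT3": 72, "GT4": 144}
# Hypothesis B: each GT tier adds 24 EUs (GT2 = 24, GT3 = 48, GT4 = 72)
additive = {"GT2": 24, "GT3": 48, "GT4": 72}

for name, tiers in (("doubling", doubling), ("+24 per tier", additive)):
    print(name)
    for gt, eus in tiers.items():
        print(f"  {gt}: {eus:>3} EUs -> ~{sp_gflops(eus):.0f} GFLOPS SP")
```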
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,847
136
If every GT step were +24 EUs, GT2 would have 24 EUs, GT3 48 EUs and GT4 72 EUs as rumored, but that wouldn't fit with the exponential nature of the semiconductor industry.

Semiconductor scaling is exponential over time, not at a given moment in time. At any fixed point in time, the cost increases (approximately) linearly with the number of transistors, until you start hitting serious yield issues with mega-dies.
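A toy model of that cost argument (every constant here is invented purely for illustration): cost per transistor falls exponentially over the years, but at any fixed moment the cost of a die grows roughly linearly with transistor count until a Poisson-style yield term starts punishing mega-dies.

```python
import math

# Toy cost model; all constants are invented for illustration only.
COST_PER_MM2 = 0.10          # wafer cost per mm^2 of die area (arbitrary units)
TRANSISTORS_PER_MM2 = 10e6   # density at some fixed point in time
DEFECT_DENSITY = 0.001       # defects per mm^2 -- drives yield loss on big dies

def cost_per_good_die(transistors):
    """Linear in area, divided by a Poisson yield term that punishes mega-dies."""
    area_mm2 = transistors / TRANSISTORS_PER_MM2
    die_yield = math.exp(-DEFECT_DENSITY * area_mm2)
    return area_mm2 * COST_PER_MM2 / die_yield

def cost_per_transistor(years_from_now, halving_period_years=2.0):
    """The exponential part: cost per transistor falls ~2x every couple of years."""
    return 0.5 ** (years_from_now / halving_period_years)

for t in (1e9, 2e9, 4e9, 8e9):
    print(f"{t/1e9:.0f}B transistors -> ~{cost_per_good_die(t):.0f} units per good die")
```

With these made-up numbers the die cost roughly doubles with each doubling of transistors until the yield term takes over, at which point it blows up much faster.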
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Semiconductor scaling is exponential over time, not at a given moment in time. At any fixed point in time, the cost increases (approximately) linearly with the number of transistors, until you start hitting serious yield issues with mega-dies.

Assuming transistor density stays the same?
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
I bet Linus also thinks Linux is viable in the desktop space. Meanwhile we're looking at ~1.5% desktop market share since it was released in 1991, even with the Vista/8 disaster, and even without factoring in the vast number of pirated Windows installations in the likes of China, which aren't counted in market share statistics.

Still, maybe in a hundred years it'll reach 5%. :awe:

This is a fantasy. Except for perhaps the very slowest discrete parts, a CPU socket has neither the space nor the TDP budget necessary for the cooling or power needed for a GPU, much less when combined with an already ~100W CPU. You can't install a dual-slot 10.5" cooler around a CPU socket.

Socketed GPU/VRAM solutions are an equal fantasy; I have 6GB @ 288.4 GB/sec on my GPU. How much will that cost to put on a motherboard? Mobo makers certainly won't foot that bill, nor will the "good enough" crowd. And you still run into the same problem of cooling restrictions.

Sure, you might have some eDRAM/L4 to use, but that's so small it basically can't store anything but the frame buffer. That means all your textures, geometry, shaders, and other assets will be constantly fetched from system memory, which is an order of magnitude slower than even what a low-end GPU has onboard.
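Some back-of-the-envelope numbers on that, assuming 32-bit colour and depth, double buffering, no MSAA, and a 128 MB Crystalwell-class eDRAM pool (all assumptions for illustration):

```python
# Back-of-the-envelope: how much of a 128 MB eDRAM/L4 pool do render targets use,
# versus the multi-GB texture/geometry footprint of a current game?
# Assumptions: 32-bit colour, 32-bit depth, double buffered, no MSAA,
# Crystalwell-class 128 MB eDRAM.

EDRAM_MB = 128

def render_target_mb(width, height):
    # two colour buffers + one depth buffer, 4 bytes per pixel each
    return width * height * 4 * 3 / (1024 ** 2)

for w, h in ((1920, 1080), (2560, 1440), (3840, 2160)):
    rt = render_target_mb(w, h)
    print(f"{w}x{h}: ~{rt:.0f} MB of render targets ({rt / EDRAM_MB:.0%} of eDRAM)")

# Everything else -- textures, geometry, shaders -- runs to gigabytes in modern
# titles, so it spills to system RAM (~25 GB/s dual-channel DDR3-1600) instead
# of the hundreds of GB/s of GDDR5 on a discrete card.
```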
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
40W CPU + 60(+)W GPU is plenty of thermal headroom to take out the mid-range of the graphics card market. Bandwidth might be another matter.

You know, I seem to remember talk of Nvidia including ARM chips in at least some versions of Maxwell. Or did that get scrapped somewhere along the line? That'd formally make all of those iGPUs :) Unless, of course, viewed as a dGPU with an iCPU ;)
(But then some of AMD's APUs might very well trend that way anyway.)
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,847
136
This is a fantasy. Except for perhaps the very slowest discrete parts, a CPU socket has neither the space nor the TDP budget necessary for the cooling or power needed for a GPU, much less when combined with an already ~100W CPU. You can't install a dual-slot 10.5" cooler around a CPU socket.

Socketed GPU/VRAM solutions are an equal fantasy; I have 6GB @ 288.4 GB/sec on my GPU. How much will that cost to put on a motherboard? Mobo makers certainly won't foot that bill, nor will the "good enough" crowd. And you still run into the same problem of cooling restrictions.

Sure, you might have some eDRAM/L4 to use, but that's so small it basically can't store anything but the frame buffer. That means all your textures, geometry, shaders, and other assets will be constantly fetched from system memory, which is an order of magnitude slower than even what a low-end GPU has onboard.

Have you paid any attention at all to the Xeon Phi? Knights Landing is a socketed CPU with >3 TFLOPS of GPU-style compute power and 16GB of on-package faster-than-GDDR5 memory, backed up with DDR4 to add more capacity. Seriously, the socket is not a limitation.
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
55
91
40W CPU + 60(+)W GPU is plenty of thermal headroom to take out the mid-range of the graphics card market. Bandwidth might be another matter.

You know, I seem to remember talk of Nvidia including ARM chips in at least some versions of Maxwell. Or did that get scrapped somewhere along the line? That'd formally make all of those iGPUs :) Unless, of course, viewed as a dGPU with an iCPU ;)
(But then some of AMD's APUs might very well trend that way anyway.)

No. The mid-range will always be several steps ahead of integrated. By the time a 40W CPU + 60W GPU hits, the mid-range of that time will waste it. Kind of like it does today. Both make progress every gen. Like one car chasing another, but one is always a few car lengths behind.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
You know, I seem to remember talk of Nvidia including ARM chips in at least some versions of Maxwell. Or did that get scrapped somewhere along the line?
From what I understand the ARM chip is still coming, and it'll be used as a scheduler to reduce the reliance on game-specific optimizations in the drivers.

I personally think this is a fantastic idea. A CPU uses OoOE to attain the best possible performance with any arbitrary code; a GPU should have something similar.

Have you paid any attention at all to the Xeon Phi? Knights Landing is a socketed CPU with >3 TFLOPS of GPU-style compute power and 16GB of on-package faster-than-GDDR5 memory, backed up with DDR4 to add more capacity.
This part is based on Larrabee. Do I need to say any more? This is the same part Intel was claiming would dominate GPU space, with everybody moving to ray tracing. Years later, all we got was some Quake 4 demo running shiny reflections at a slideshow.

Also FLOPS is only one part of the performance equation. What's its texturing fillrate on FP render targets? How many MSAA samples per cycle can the ROPs perform? What (if any) hardware HSR/culling does it employ for rasterization?

Seriously, the socket is not a limitation.
Really? So any off-the-shelf LGA1150 cooler is all it needs?

Let's see it then. Show us the socketed version's gaming performance in today's AAA game engines (real engines like Battlefield and Crysis, not Goat Simulator or Minecraft), along with the associated motherboard cost when it's integrated, and what cooler it needs.

After years of empty promises from Larrabee I'll assume it's a dud unless hard data is provided otherwise.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
No. The mid-range will always be several steps ahead of integrated. By the time a 40W CPU + 60W GPU hits, the mid-range of that time will waste it. Kind of like it does today. Both make progress every gen. Like one car chasing another, but one is always a few car lengths behind.

Wrong. Why do people always forget Intel's manufacturing prowess? When it launched 22nm Ivy Bridge, Intel was for the first time 1 full node ahead of the competition which didn't have 32/28nm yet.

At the end of 2014, Intel will launch 14nm. It won't be until 2018 that the foundries will have a competitor for that node in terms of density and performance, a difference of 4 years (note: desktop 14nm SKUs won't arrive until 2015, but if it's like 20nm, GPUs with TSMC's 10nm won't arrive until 2019 either).
 

Pottuvoi

Senior member
Apr 16, 2012
416
2
81
This part is based on Larrabee. Do I need to say any more? This is the same part Intel was claiming would dominate GPU space, with everybody moving to ray tracing. Years later, all we got was some Quake 4 demo running shiny reflections at a slideshow.

Also FLOPS is only one part of the performance equation. What's its texturing fillrate on FP render targets? How many MSAA samples per cycle can the ROPs perform? What (if any) hardware HSR/culling does it employ for rasterization?
It's a CPU; it doesn't have ROPs or TEX units anymore.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
Wrong. Why do people always forget Intel's manufacturing prowess? When it launched 22nm Ivy Bridge, Intel was for the first time 1 full node ahead of the competition which didn't have 32/28nm yet.

At the end of 2014, Intel will launch 14nm. It won't be until 2018 that the foundries will have a competitor for that node in terms of density and performance, a difference of 4 years (note: desktop 14nm SKUs won't arrive until 2015, but if it's like 20nm, GPUs with TSMC's 10nm won't arrive until 2019 either).
And yet the performance gap between GPUs and CPUs just keeps growing wider and wider:

[image: cpu-vs-gpu.png]



It's a CPU; it doesn't have ROPs or TEX units anymore.
That's my whole point - the equivalent software functions are an order of magnitude slower.

We already saw how slow Larrabee was without those hardware units, which is why Intel were trying so hard to push their ray tracing fantasy.
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,847
136
This part is based on Larrabee. Do I need to say any more? This is the same part Intel was claiming would dominate GPU space, with everybody moving to ray tracing. Years later, all we got was some Quake 4 demo running shiny reflections at a slideshow.

The actual processor in question is utterly irrelevant to what I was saying. :\ The point is the socket, the platform: a CPU socket which can cool a >200W processor and provide more bandwidth than GDDR5. If it makes you happier, imagine a theoretical NVidia motherboard with a socket for an ARM + Pascal APU, or an AMD K12 + GCN APU.

The point is that being in a socket is no limitation for a high-TDP, high-bandwidth processor.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
The actual processor in question is utterly irrelevant to what I was saying. :\ The point is the socket, the platform: a CPU socket which can cool a >200W processor and provide more bandwidth than GDDR5. If it makes you happier, imagine a theoretical NVidia motherboard with a socket for an ARM + Pascal APU, or an AMD K12 + GCN APU.

The point is that being in a socket is no limitation for a high-TDP, high-bandwidth processor.
Your example is irrelevant because it doesn't apply to the desktop/consumer space we're talking about, which is apparently what Linus is talking about too.

That's not a consumer LGA1150 CPU or socket. You can't put something like that on a Z97 motherboard (for example). In consumer space, sockets have neither the TDP nor the cost budget of a Knights Landing.

And after all that, there's absolutely no indication that it'll be anything other than a flop for graphics performance, and that's exactly where it needs to compete if it's going to replace discrete GPUs.
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
55
91
Wrong. Why do people always forget Intel's manufacturing prowess? When it launched 22nm Ivy Bridge, Intel was for the first time 1 full node ahead of the competition which didn't have 32/28nm yet.

At the end of 2014, Intel will launch 14nm. It won't be until 2018 that the foundries will have a competitor for that node in terms of density and performance, a difference of 4 years (note: desktop 14nm SKUs won't arrive until 2015, but if it's like 20nm, GPUs with TSMC's 10nm won't arrive until 2019 either).

Not sure why you used this as an example, as it actually serves to weaken your argument and reinforce mine. EVEN WITH Intel's process advantages, they can still only barely compare with Nvidia's or AMD's low-end 28nm discrete GPUs. If they're lucky.
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,847
136
Your example is irrelevant because it doesn't apply to the desktop/consumer space we're talking about, which is apparently what Linus is talking about too.

That's not a consumer LGA1150 CPU or socket. You can't put something like that on a Z97 motherboard (for example). In consumer space, sockets have neither the TDP nor the cost budget of a Knights Landing.

And after all that, there's absolutely no indication that it'll be anything other than a flop for graphics performance, and that's exactly where it needs to compete if it's going to replace discrete GPUs.

Where do you think the X99 platform is coming from? Workstations and servers. Obviously there would be lower-bandwidth, lower-power alternatives for consumers with less graphical need - just like there already are today.

And the Phi isn't a GPU - it no longer has the texture units Larrabee had, and it does not support DirectX or OpenGL. So I don't expect great graphical performance either :) Don't fixate on the specific processor - the interesting part for the future of GPUs is that you can fit that much TDP and that much bandwidth into a socket.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
And yet the performance gap between GPUs and CPUs just keeps growing wider and wider:

[image: cpu-vs-gpu.png]

Nice straw man.

You don't need a graph that nice to know that the gap between CPUs and GPUs is getting wider. A GPU's core count keeps increasing exponentially, while a CPU still has just a few cores. There are obvious reasons for that. You can't compare a GPU to a CPU because they're made for different purposes.
If you had a graph of a single CPU core versus a single GPU core, they would probably look quite similar.

This has nothing to do with Intel's ability to shrink the gap between IGPs and GPUs enough to make most GPUs obsolete.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Not sure why you used this as an example, as it actually serves to weaken your argument and reinforce mine. EVEN WITH Intel's process advantages, they can still only barely compare with Nvidia's or AMD's low-end 28nm discrete GPUs. If they're lucky.

The process node isn't the only thing that determines the performance of a GPU. There's also the architecture, which it seems Intel is going to fix with Gen8 and further improve with Gen9. Also, the best IGP currently available is a GT2 with 20 EUs. That's much different from the Broadwell GT3 with 48 Gen8 EUs we'll see, not to mention Skylake's GT4 with 72 or 144 EUs. Those SKUs could have the potential to compete in the mid-range market.

Also, Intel's current process advantage applies mostly to transistor performance and power, but less to density, which is very important for GPUs. Density (and transistor price) of foundries' nodes won't improve after 20nm until around 2019. Intel, on the other hand, is now strongly focusing on also getting a distinct density advantage. Both 14 and 10nm will scale aggressively and even 7nm will likely be released before TSMC's 10nm (which is equivalent to Intel's 14nm).

Those three things - a much better architecture, the competition's lack of density improvement for about four years, and Intel's aggressive scaling - will enable Intel to close the gap between IGPs and GPUs in a way that could make a lot of GPUs obsolete.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
Your example is irrelevant because it doesn't apply to the desktop/consumer space we're talking about, which is apparently what Linus is talking about too.

Let me put it in a simple way with an example.

AMD 2016 APU - 4 next-gen x86 cores + 1024 GCN 2.0 cores connected to an HBM stack with 128 - 150 GB/s (first-gen HBM at 1 - 1.2 GHz speeds)

Roughly 220 - 240 sq mm, of which 50 - 55% is GPU (110 - 130 sq mm); the CPU cores, cache, memory controller and the rest of the SoC make up the remainder of the die. BTW, these APUs and the dGPUs are manufactured on the same process, and the GPU architecture used in AMD's APUs derives from their dGPUs. Until now the biggest bottleneck has been bandwidth, which is eliminated with HBM. AMD is making the move to HBM for both APUs and GPUs. Read the conclusion on page 52 of the presentation below:

http://www.microarch.org/micro46/files/keynote1.pdf
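As a sanity check on that bandwidth figure: taking the quoted 1 - 1.2 GHz as the effective per-pin data rate and assuming a single first-gen HBM stack with a 1024-bit interface, the arithmetic lands in exactly that range.

```python
# Sanity check on the quoted HBM bandwidth, assuming a single first-gen stack
# with a 1024-bit interface at 1.0 - 1.2 Gbps per pin.

BUS_WIDTH_BITS = 1024

def hbm_gbytes_per_sec(gbit_per_pin):
    """Bandwidth in GB/s for one stack at the given per-pin data rate."""
    return BUS_WIDTH_BITS * gbit_per_pin / 8

for rate in (1.0, 1.2):
    print(f"{rate} Gbps/pin -> {hbm_gbytes_per_sec(rate):.0f} GB/s per stack")
# -> 128 and ~154 GB/s, matching the 128 - 150 GB/s range above.
```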

1. 15W - ultrabooks
2. 35W - notebooks
3. 55W - gaming notebooks
4. 65W - SFF PCs
5. 95W - standard desktops
6. 125W - unlocked with hybrid cooler (they could still sell it for USD 280 and undercut a $200 CPU + lowest dGPU)

Discrete market - 4 chips
Lowest SKU - 60/90W
Mid range - 120/150W
High end - 175/200W
Ultra high end - 225/250W

Notebook market - 3 chips
Lowest SKU - 30/40W
Mid range - 55/70W
High end - 85/100W

The 95W and 125W SKUs take out the lowest dGPU chips, which sell for USD 100 - 120. In notebooks, the 55W SKUs take out the notebook versions of the lowest dGPU chips. Remember, these bring the highest volume and the bulk of sales. You are left with only 2 chips for the notebook market, which is not a healthy situation for sales volume and revenue. Intel already has 55W SKUs. With 4 next-gen x86 cores, AMD would provide competitive CPU performance and could easily sell 55W SKUs that eliminate the smallest chip, which is the highest-volume seller.
 