Linus Torvalds: Discrete GPUs are going away


Qwertilot

Golden Member
Nov 28, 2013
The thing with Intel is also that they haven't really been trying to produce iGPUs that match low/mid-end gaming GPUs yet.

That Haswell R part that just appeared is, I think, the first time they've even tried to sell a desktop CPU on the strength of its iGPU performance. Clearly not that serious an attempt - a 65W total TDP processor, for one thing!

Broadwell K looks a bit like it should be their first genuine effort to see whether they can make a chip that sells partly on its iGPU performance. It then sounds like Skylake might eventually go even bigger in some configurations.
 

mrmt

Diamond Member
Aug 18, 2012
Discrete shipments are shrinking, and neither GPU player is putting more money into discrete development; they're putting it into developing other market niches instead.

Once both players have voted with their wallets, what's left for discussion?
 

_Rick_

Diamond Member
Apr 20, 2012
Linus's point is that there's so much to gain from a unified CPU-GPU address space that eventually there's no way around putting them on a single chip. And as GPUs keep growing and CPUs keep shrinking, combining the two is simply inevitable.

Monster GPUs aren't there for gaming but for HPC workloads. Intel's Knights Landing is the first that can run an OS without a master CPU; Nvidia's Maxwell will be the second. AMD's solution is still unknown, but there's no real option left other than putting a CPU into the HPC GPU.

I agree with him on this point. Being able to write code that gets scheduled automatically across the available compute resources, without resorting to external libraries, is the future.
But there are still a bunch of limitations to overcome.
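
To make the shared-address-space point concrete, here's a minimal sketch using CUDA 6's managed memory (my own illustration, not something from the thread; an HSA-style system would go further and schedule the work automatically). On a discrete card the runtime still migrates pages over PCIe behind the scenes, while on a true shared-memory APU the same code would involve no copies at all.

Code:
#include <cstdio>
#include <cuda_runtime.h>

// Scale every element of x by a, one GPU thread per element.
__global__ void scale(float *x, int n, float a)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main()
{
    const int n = 1 << 20;
    float *x = nullptr;

    // One allocation, visible to both the CPU and the GPU.
    // No cudaMemcpy anywhere; the runtime migrates pages as needed.
    cudaMallocManaged(&x, n * sizeof(float));

    for (int i = 0; i < n; ++i)   // written by the CPU
        x[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);   // processed by the GPU
    cudaDeviceSynchronize();

    printf("x[0] = %f\n", x[0]);  // read back on the CPU through the same pointer
    cudaFree(x);
    return 0;
}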

I think the future isn't so much dGPUs disappearing, but rather that they end up connected to a next-gen QPI/HT like high speed bus, and are detected as just another processor in the system. Basically as a form of asymmetric multi-processing. Physical integration isn't going to fly, because, quite simply, demands and potential performance packages vary too widely. The product matrix would be too large for the result to be profitable.

Furthermore, there's little interest in using common memory, which is where iGPUs excel. Sure, copying memory is ugly, but CPUs already do that all the time with caches. GPUs stream a lot of data, which means they require different RAM than CPUs, where a local cache a few MB in size can reduce memory bandwidth usage massively. Using fast memory for the CPU is a waste of resources, though; that may be overcome by economies of scale, as the console makers are trying to make work, but 64GB of RAM at 3-4 GHz (not GT/s!) sounds like it will remain extremely expensive compared to 1-1.3 GHz DDR4.
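
To put rough numbers on that bandwidth gap (era-typical figures, my own back-of-the-envelope): a dual-channel DDR3-1600 CPU platform versus a 256-bit GDDR5 card running at 7 Gbps gives

\[
2 \times 8\,\mathrm{B} \times 1600\,\mathrm{MT/s} \approx 25.6\ \mathrm{GB/s}
\qquad \text{vs.} \qquad
\tfrac{256\,\mathrm{bit}}{8} \times 7\,\mathrm{GT/s} = 224\ \mathrm{GB/s},
\]

nearly an order of magnitude, which is the sort of gap a few MB of cache can hide for a CPU but not for a streaming GPU workload.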

So, at least currently, there remain a lot of engineering and economic challenges to building large-scale (10B-transistor-class) fully integrated "APU" systems. The economic side especially looks bad. The current dGPU markets, HPC and gaming, will probably continue to want the ability to balance their CPU and GPU performance against the tasks they face, with the budget they have.

What may make the gamer market an easier target for such integration is the fact that CPU and GPU development has slowed significantly over the past few years, making upgrades less desirable; getting a single high-performance chip with a mid-life memory upgrade could then be viable.

HPC, on the other hand, loves cards. Adding some CPU to the cards makes some sense, for management reasons, but you'd still want a dedicated master CPU per box of APUs to manage scheduling, I/O, networking and such.

If vendors were able to standardize their CPU interconnects and move GPUs away from PCIe 3.0, asymmetric multiprocessing could be an easier sell, with almost all the advantages of physical colocation and fewer of the economic disadvantages. But it would require AMD and Intel to drop HT and QPI in favor of a common protocol. Hell will freeze over before that happens.
 

BFG10K

Lifer
Aug 14, 2000
Nice straw man.
Show us then, how does this manufacturing prowess come to bear when Intel is falling exponentially behind every generation in graphics performance?

You don't have to have such a nice graph to know that the gap is getting wider between CPUs and GPUs. The GPU's number of cores keeps increasing exponentially, while a CPU still has just a few. There are obvious reasons for that. You can't compare a GPU to a CPU because they're made for different purposes.
You are participating in the same thread as us, right? With the title "Linus Torvalds: Discrete GPUs are going away"?

If you don't want to make such a comparison, why even post here? It's the whole point of the thread.

If you had a graph of a single-core CPU and GPU, they would probably look quite similar.
How is that relevant to reality? Or is your whole argument that Intel can compete with an imaginary single-core GPU that doesn't even exist?

This has nothing to do with Intel's ability to shrink the gap between IGPs and GPUs to make most GPUs obsolete.
How is Intel going to make anything obsolete if they're exponentially falling further and further behind, despite having a process advantage?

The process node isn't everything that determines the performance of a GPU. There's also the architecture, which it seems Intel is going to fix with Gen8 and further improve upon with Gen9. Also, the best IGP that is available is a GT2 with 20 EUs. That's very different from the Broadwell GT3 with 48 Gen8 EUs we'll see, not to mention Skylake's GT4 with 72 or 144 EUs. Those SKUs could potentially compete with the mid-range market.

Also, Intel's current process advantage applies mostly to transistor performance and power, but less to density, which is very important for GPUs. Density (and transistor price) of foundries' nodes won't improve after 20nm until around 2019. Intel, on the other hand, is now strongly focusing on also getting a distinct density advantage. Both 14 and 10nm will scale aggressively and even 7nm will likely be released before TSMC's 10nm (which is equivalent to Intel's 14nm).

Those three things (a much better architecture, the competition's lack of density improvement for about four years, and Intel's aggressive scaling) will enable Intel to close the gap between IGPs and GPUs in a way that could make a lot of GPUs obsolete.
More theoretical fluff. We heard similar for years with Larrabee and it absolutely failed.

Going by past scaling, in four years' time, GPU performance will be up by a factor of about 4x-5x across the board. That shifts Intel's target because even low-end parts will be that much faster.
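
For reference, 4x-5x over four years corresponds to roughly 45-50% improvement compounded annually (my arithmetic, not a figure from the thread):

\[
1.45^{4} \approx 4.4, \qquad 1.5^{4} \approx 5.1.
\]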

Again, show us, in practical terms, when Intel will be able to beat today's Titan for graphics processing in a discrete CPU socket made for consumer parts.
 

BFG10K

Lifer
Aug 14, 2000
Where do you think the X99 platform is coming from? Workstations and servers. Obviously there would be lower-bandwidth, lower-power-consumption alternatives for consumers with less graphical need - just like there already are today.
Where is the graphics prowess of the X99 you speak of? Have you seen benchmarks of it knocking out discrete GPUs?

And the Phi isn't a GPU - it no longer has the texture units Larrabee had, and does not support DirectX or OpenGL. So I don't expect great graphical performance either :) Don't fixate on the specific processor - the interesting part for the future of GPUs is that you can fit that much TDP and that much bandwidth into a socket.
What does the socket matter if it's (a) not available in the consumer space and (b) still can't compete with discrete GPUs on a performance basis in games?

That's what Linus is claiming will happen, and that's what this entire thread is about. All I see is a bunch of people rattling on about theoretical manufacturing advantages while ignoring the reality of:

(1) Larrabee was a flop.
(2) Intel's current GPUs are unplayably slow and are an order of magnitude slower than many low-end parts, much less anything faster.
(3) Discrete GPUs continue to dramatically increase in performance and exponentially widen the gap with Intel.
 

BFG10K

Lifer
Aug 14, 2000
We're not comparing GPUs to CPUs here, we are comparing iGPUs to dGPUs.
Right, but as soon as you try to stick a GPU alongside a CPU in a CPU socket, you suddenly have a bunch of limitations you didn't have before.

Whichever way you cut it, Intel is exponentially behind something like a Titan in terms of graphics performance. You could even argue that Larrabee was their best attempt to date at taking out the competition, yet it failed miserably.
 

ShintaiDK

Lifer
Apr 22, 2012
Show us then, how does this manufacturing prowess come to bear when Intel is falling exponentially behind every generation in graphics performance?

IGPs, especially Intel's IGPs, are advancing faster than dGPUs currently. And that trend seems to favour IGPs even more in the future, when the ROI of shrinking dGPUs, not to mention developing new uarchs, goes down the drain due to low volume.

The gap is closing, not expanding.
 

Cerb

Elite Member
Aug 26, 2000
This is a fantasy. Except for perhaps the very slowest discrete parts, a CPU socket has neither the space nor the TDP budget necessary for the cooling or power needed for a GPU, much less when combined with an already ~100W CPU. You can't install a dual-slot 10.5" cooler around a CPU socket.
1. The lowest are exactly what's being talked about, at least in any meaningful terms. Today, a GTX 750 is about the slowest thing worth buying, due to IGP improvements thus far. That's pretty rapid improvement, which includes AMD (it's just that we kind of expected something decent from them), and now both basically need faster RAM, more than anything else, to keep going. Intel's eDRAM gives them an edge, but the implementation as they have it is not likely good enough for the long run (it basically fixes the problem for low-graphics games at low resolutions, but once buffers exceed it, no amount of IGP processing power will help, and it is kind of small).

2. CPU sockets are already designed to hold heatsinks of enough size and weight for the job. TDP may be an economic limitation, but it's not a technical one.

Sure, you might have some eDRAM/L4 to use, but that's so small it basically can't store anything but the frame buffer. That means all your textures, geometry, shaders, and other assets will be constantly fetched from system memory, an order of magnitude slower than even what a low-end GPU has onboard.
Except that both Intel and nVidia have been talking about stacking real amounts of RAM right near the chip in the near future, with decent bandwidth, not just adding a small cache (prior mobile on-package RAM has consistently been slow, and only a space-saving measure).

Show us then, how does this manufacturing prowess come to bear when Intel is falling exponentially behind every generation in graphics performance?
They aren't. They are catching up (your graph is comparing simple vector FPU performance of CPUs, nothing to do with the GPUs). You're just looking at it from the opposite end. The likes of a Titan are not going to be available integrated unless we have some RAM tech that is unpredictably disruptive (not, "if they have this, then they can do this," but, "if somebody makes this, we might be in for a computing revolution."). The next moderately disruptive tech looks to be getting fast RAM on the package, rather than in another package on the PCB. If it doesn't keep costing a mint to do (it's not a new idea, but is supposedly going to get economical ASAP), it could more or less take care of the constant bandwidth limits.
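
For rough context on that last point (approximate public figures of the time, my addition): Haswell's Crystalwell eDRAM provides on the order of 50 GB/s in each direction, while a single first-generation stacked HBM stack is specced at

\[
1024\,\mathrm{bit} \times 1\,\mathrm{Gbps} / 8 = 128\ \mathrm{GB/s},
\]

so one stack already delivers roughly the total memory bandwidth of a midrange discrete card.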
 

sontin

Diamond Member
Sep 12, 2011
IGPs, especially Intel's IGPs, are advancing faster than dGPUs currently. And that trend seems to favour IGPs even more in the future, when the ROI of shrinking dGPUs, not to mention developing new uarchs, goes down the drain due to low volume.

The gap is closing, not expanding.

The gap is going nowhere. A GM108 is beating Intel's fastest iGPU easily and costs much less.

And GM108 is around 80mm^2 big...

[Image: NVIDIA-GeForce-800M-Slides (26).jpg]
 

Cerb

Elite Member
Aug 26, 2000
The gap is going nowhere. A GM108 is beating Intel's fastest iGPU easily and costs much less.
1. Where are 3rd-party scores for this GM108?
2. 80mm^2 puts it at not quite double the size of Haswell's.
3. I am dubious of any performance metric in which adding a normally power-hungry component decreases power by a factor of over 2. That will take much more than a line on a marketing slide to explain.
4. How is it going to cost less? It's an extra card or package, going into a slot or on a mobo, and the OEM could save money by simply not using it (like they normally do). If the new Geforces are particularly good, and affect sales of Iris Pro CPUs, Intel will be forced to make them more competitively priced.
 

exar333

Diamond Member
Feb 7, 2004
I think it's great that the 'average' iGPU is rapidly improving with each new iteration. When we hit ~console power (think 30-45fps reliably at 1080p with decent quality) that will only be FANTASTIC for PC gaming. Be that Steam boxes, Linux, Windows, DX, Mantle, whatever. :)

On the other hand, I really don't think discrete will ever go away. Just like 'mainstream' Intel CPUs are all dual- or quad-cores, while enthusiasts want/use 6-8+ cores for more powerful setups or specialized applications. I don't want my choices to go away. If I want a 140W TDP CPU + 2x300W TDP GPUs, that is my choice. Even a 240W 'APU' couldn't touch that for performance...

I think the more compelling argument is what the GPU landscape will look like in 10 years. Will the R&D for 'mainstream' components support higher-end discrete cards? What will happen to companies like NV who don't have a strong revenue stream from CPU and iGPU products? Will they adapt or be bought-out?

It feels to me like we are moving back to circa-1993, when a normal Pentium CPU was more than enough to play a lot of mainstream games very well. Add-in cards were expensive, but useful to a smaller group of folks. Contrast that to the early 2000s, where you NEEDED a discrete GPU to pretty much play anything. I like the simplicity of knowing a normal CPU can install and play mainstream games like Sims X or the like. That's great for the business. I would argue the 'complexity' of needing to understand what a 'good' GPU was 10-12 years ago gave rise to the console. NV and AMD were partially to blame for this...they sold so many horrible GPUs <$100 that essentially didn't do anything.
 

sontin

Diamond Member
Sep 12, 2011
1. Where are 3rd-party scores for this GM108?
2. 80mm^2 puts it at not quite double the size of Haswell's.
3. I am dubious of any performance metric in which adding a normally power-hungry component decreases power by a factor of over 2. That will take much more than a line on a marketing slide to explain.

http://www.notebookcheck.net/Asus-Zenbook-UX32LN-R4053H-Ultrabook-Review.118273.0.html

Nearly twice as fast as 5100.
And Iris Pro (5200?) is only available in a few notebooks...
 

NTMBK

Lifer
Nov 14, 2011
Where is the graphics prowess of the X99 you speak of? Have you seen benchmarks of it knocking out discrete GPUs?

Are you actually trolling, or what? :\ X99 is a CPU socket. I don't even know what this post was meant to mean. I was using X99 as an example of a server-oriented socket coming to the (high-end) consumer market.

What does the socket matter if it's (a) not available in the consumer space and (b) still can't compete with discrete GPUs on a performance basis in games?

Because we're discussing the theoretical possibilities and limitations of an integrated CPU/GPU versus the GPU on a card. One of the arguments brought up time and again is that "an APU can never compete with a dGPU, because it can't have the same TDP and bandwidth!" This is why I am discussing Knights Landing.
 

Cerb

Elite Member
Aug 26, 2000
http://www.notebookcheck.net/Asus-Zenbook-UX32LN-R4053H-Ultrabook-Review.118273.0.html

Nearly twice as fast as 5100.
And Iris Pro (5200?) is only available in a few notebooks...
Where is the control unit? Yes, Iris Pro is not widely available, but that is because Intel can get away with it, not because it's costing them all that much more money. If the competition gets popular again, with Maxwell, the same features will make their way into cheaper models in new generations. It was something special when it came out, but it may become something minimally necessary in the not-too-distant future.
 

sontin

Diamond Member
Sep 12, 2011
Where is the control unit? Yes, Iris Pro is not widely available, but that is because Intel can get away with it, not because it's costing them all that much more money. If the competition gets popular again, with Maxwell, the same features will make their way into cheaper models in new generations. It was something special when it came out, but it may become something minimally necessary in the not-too-distant future.

I don't see your point?!
Iris Pro is limited to a few models - mostly Apple. The Asus Ultrabook with the 5100 costs much more than the Asus Ultrabook with the 840M. And we're talking here about the smallest and slowest Maxwell GPU from nVidia...
 

Cerb

Elite Member
Aug 26, 2000
I don't see your point?!
1. That comparing power consumption correctly requires known similar hardware.

2. That Intel's next IGP is going to be faster, too. By how much, we'll have to see. This point in time is not necessarily indicative of all time.

3. Following #2, that Intel can and will have more options with L4. It's just a question of cost v. performance.

IoW, it's reasonable that the 830M, and maybe 840M, will be met or bested by IGPs soon to come. And that marketing slides aren't worth much. The x60M, and desktop x50, are likely 'safe' for a couple years yet.
 

witeken

Diamond Member
Dec 25, 2013
Show us then, how does this manufacturing prowess come to bear when Intel is falling exponentially behind every generation in graphics performance?
Do I really have to explain this?

Oh, wait, you're saying "in graphics performance". Sorry, I thought your graph compared CPU FLOPS vs GPU FLOPS; I misunderstood the graph.

First things first, it's hard to tell whether Intel is falling behind exponentially when this graph isn't logarithmic.
Secondly, 2X the FLOPS doesn't equate to a 2X higher frame rate. Your GTX Titan, which shows a nice FLOPS improvement, doesn't show the same performance increase in benchmarks.

Intel claims a 75X gaming improvement since 2006, outpacing both Moore's Law and your graph:

[Image: Haswell_GPU_03.jpg]


The gap really is closing. Intel's fastest Sandy Bridge IGP, the HD 3000, performed much worse against the dGPU competition of its time.

You are participating in the same thread as us, right? With the title "Linus Torvalds: Discrete GPUs are going away"?

If you don't want to make such a comparison, why even post here? It's the whole point of the thread.

How is that relevant to reality? Or is your whole argument that Intel can compete with an imaginary single-core GPU that doesn't even exist?
I hope my comment above clarified the confusion.

How is Intel going to make anything obsolete if they're exponentially falling further and further behind, despite having a process advantage?
They're not falling behind further, and certainly not exponentially. And if you want to compare FLOPS, I think even a theoretical Gen7 IGP with 72-144 EUs can give you a decent understanding of how Intel will catch up in the coming 1-2 years.
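
Back-of-the-envelope, assuming the usual 16 FLOPs per clock per EU for Intel Gen graphics and a ~1.15-1.25 GHz clock (my assumptions, not figures from the post):

\[
\mathrm{GFLOPS} \approx N_{\mathrm{EU}} \times 16 \times f_{\mathrm{GHz}}
\;\;\Rightarrow\;\;
20 \times 16 \times 1.25 \approx 400, \qquad
72 \times 16 \times 1.15 \approx 1300.
\]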

I already told you. Intel will improve its microarchitecture so that it isn't much behind anymore, and because its manufacturing lead is expanding, Intel will be able to get much better IGPs than what would have been possible without this 2-3 node advantage. Just look at how good or bad GPUs were 2-3 nodes (or 4-6 years) ago.


More theoretical fluff. We heard similar for years with Larrabee and it absolutely failed.
Why do you have to refer to the situation from multiple years ago? Why is that relevant? Roadmaps change, plans change, targets change, all sorts of things change. If you're going to refer to the past, when Intel was much further behind, you're always going to come to the conclusion that Intel will never catch up, obviously.

Going by past scaling, in four years' time, GPU performance will be up by a factor of about 4x-5x across the board. That shifts Intel's target because even low-end parts will be that much faster.
I'm not so sure about that. I already told you that TSMC will have 10nm only in H2'18 or '19, and since Dennard scaling has been dead for a decade, the difference between 28nm and 16FF+ won't be that staggering. But Intel will have 14nm, 10nm and 7nm with which to catch up before Nvidia has a new node.

Again, show us, in practical terms, when Intel will be able to beat today's Titan for graphics processing in a discrete CPU socket made for consumer parts.
A GTX Titan? I think 10nm is very likely: 10nm is 5x denser than 28nm, so your massive 550mm² GTX Titan is reduced to 110mm². Add the CPU and your APU is about the size of Ivy Bridge/Haswell. I don't know how high it will be able to clock, but note that Intel will use germanium at 10nm, which could potentially improve power consumption and performance quite dramatically.
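
Spelling out that arithmetic (the 5x density figure is the post's own; the ideal-scaling number assumes a perfect halving of area per full node over three shrinks, which is my assumption):

\[
\frac{550\,\mathrm{mm}^2}{5} = 110\,\mathrm{mm}^2,
\qquad
550\,\mathrm{mm}^2 \times 0.5^{3} \approx 69\,\mathrm{mm}^2,
\]

so if anything 5x is the conservative end of the range.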

This APU with a GTX Titan is quite small, so a bigger 14nm IGP (like the ~260mm² of GT3 Haswell) might also be able to come close. I don't think a Titan will be low-end in 2016.
 

Techhog

Platinum Member
Sep 11, 2013
1. That comparing power consumption correctly requires known similar hardware.

2. That Intel's next IGP is going to be faster, too. By how much, we'll have to see. This point in time is not necessarily indicative of all time.

3. Following #2, that Intel can and will have more options with L4. It's just a question of cost v. performance.

IoW, it's reasonable that the 830M, and maybe 840M, will be met or bested by IGPs soon to come. And that marketing slides aren't worth much. The x60M, and desktop x50, are likely 'safe' for a couple years yet.

To you and everyone else agreeing with Torvalds, what is your prediction on how long it'll be until the high-end gamer market becomes too small to be worth supporting? 10 years?
 

ShintaiDK

Lifer
Apr 22, 2012
The gap is going nowhere. A GM108 is beating Intel's fastest iGPU easily and costs much less.

And GM108 is around 80mm^2 big...

If we look beyond your marketing slide, real-world numbers show that Nvidia's and AMD's notebook GPU shipments are in free fall.

IGPs reign over the mobile world.

I am sure you are also familiar with these numbers:
[Image: Table1Rev3JPG.jpg]
 

ShintaiDK

Lifer
Apr 22, 2012
To you and everyone else agreeing with Torvalds, what is your prediction on how long it'll be until the high-end gamer market becomes too small to be worth supporting? 10 years?

Remember there are two points, and one will come some years before the other.

At some point there will be no incentive to develop or shrink the dGPU. However, it will still be sold and will still be the fastest.

The second point is when IGPs actually outperform dGPUs.

I don't think we're many years away from the first point. Certainly less than 10, maybe just 5. It mainly depends on when stacked memory hits the IGP.
 

witeken

Diamond Member
Dec 25, 2013
The gap is going nowhere. A GM108 is beating Intel's fastest iGPU easily and costs much less.

And GM108 is around 80mm^2 big...

[Image: NVIDIA-GeForce-800M-Slides (26).jpg]

This slide is not apples to apples. They're using a 15W CPU with Maxwell while Iris Pro is 45W or so. No surprise that it consumes more. Since this is a slide from Nvidia, they probably also took the best-case scenario for Maxwell. And Maxwell's main competitor will be Gen8, which is going to be a massive upgrade.
 

Techhog

Platinum Member
Sep 11, 2013
If we look beyond your marketing slide, real-world numbers show that Nvidia's and AMD's notebook GPU shipments are in free fall.

IGPs reign over the mobile world.

I am sure you are also familiar with these numbers:
[Image: Table1Rev3JPG.jpg]

That doesn't say a ton for the overall market. It just says that thinner, lighter notebooks only need the bare minimum graphics performance and that the low-end is being phased out. If you were to restrict this to gaming and high-end workstation laptops, it would tell a different story.

It is obvious that what we currently consider the entry level will die. Will it go much further than that? Only time will tell. Since Intel doesn't even release drivers to optimize gaming performance, and still pretty much only offers Iris Pro to bow to Apple's whims, it's really impossible to make a call about what will happen. All we have is theories and assumptions. Is Intel even putting out the best IGP they can in relation to its die size? I doubt it. And to say that we only have two generations of GPUs left? I'll believe it when I see it. There's no evidence to even support the idea that GT2 will be phased out in favor of GT3 anytime soon. Both Broadwell and Skylake will be mostly GT2. Broadwell GT2 will match, but will not exceed, Haswell Iris Pro. Meanwhile, Broadwell Iris Pro will, at best, trade blows with a 750 Ti, and that's if you ignore drivers.
 

Exophase

Diamond Member
Apr 19, 2012
This slide is not apples to apples. They're using a 15W CPU with Maxwell while Iris Pro is 45W or so. No surprise that it consumes more. Since this is a slide from Nvidia, they probably also took the best-case scenario for Maxwell. And Maxwell's main competitor will be Gen8, which is going to be a massive upgrade.

In a GPU-limited situation, which this should very much be, there's no reason why the CPU side of the Iris Pro part shouldn't consume at levels similar to whatever the 15W CPU does, given that it's the same CPU uarch with the same power management. And that's probably the case. IF this is a proper and fair measurement, and that's a big if, then this would qualify as a pretty legitimate slam on the GPU's perf/W.

But just for this one particular test case.