Are GPUs ending up like CPUs?

CrazyElf

Member
May 28, 2013
88
21
81
The recent launch of Maxwell (the ~400mm^2 GTX 980) did not bring a much faster chip than the existing ~560mm^2 Kepler 780 Ti - perhaps 5% faster at stock.

With an overclock, that gap will undoubtedly be bigger - say 10-15% (assuming the top 780 Ti can sustain perhaps ~1300-1350 MHz on water 24/7; any higher and you'll be using voltages that will seriously reduce the life of your chip). Perhaps 1600-1700 MHz might be realistic for "little" Maxwell on water?

But here's the issue: the leap from the 580 to the 680 brought a roughly ~30% gain in gaming performance, going from a ~520mm^2 die (580) to a ~300mm^2 die (GTX 680). Granted, that came with a die shrink from 40nm to 28nm, but it was still a substantial gain in performance, even more if you factored in overclocking headroom.

The leap from 780Ti > 980 does not appear to be bringing anything close to the 580 > 680 leap.

A "big >550mm^2 Maxwell" might perform perhaps 30-40% faster than the current 980 (there's less headroom than 780Ti to 680 because the 680 was a ~300mm^2 chip and the 980 is a ~400mm^2 chip), unless they make a >600mm^2 chip.

We have yet to see what AMD has to offer at this point, but I'd imagine the gains will be comparable to Maxwell's - perhaps somewhat better with GCN.

So this leaves the question - where do GPUs go from here?


  • Moore's Law seems to have died out at 28nm. Price per transistor is going up with each new generation of node. In fact, low- and medium-end stuff may stay on 28nm for good. Fab costs are rising while, at the same time, the marginal benefit of moving to the next node is dropping. Things like EUV and 450mm wafers appear problem-plagued. It's been claimed that 20nm FD-SOI may give a new lease on life to Moore's Law (STMicro especially says this), but I remain skeptical (I hope they are right though).
  • This means the industry will have to rely mostly on architectural gains per generation. There are some exciting technologies, like HBM, but for how long will we continue to see performance gains based mostly on architecture? See my thoughts above on Kepler to Maxwell. Whatever happens next at both AMD and Nvidia will probably bring even smaller marginal gains.
  • We might see 16nm FinFET based GPUs, but I'd imagine that they'd only be ~15% faster at a given power level compared to their 28nm counterparts. Compounding the issue, their price per transistor might be higher and the OC headroom might be lower owing to leakage.
  • How big could a die get? As the 28nm process continues to mature we could see higher yields, but eventually that will flatten out. The largest die I have ever heard of was Intel's Tukwila at ~700mm^2, which must be near the reticle limit. Could we ever see a big 700mm^2 GPU at 28nm? Could dies get any bigger (we're talking >1000mm^2 here)?

Does this mean that barring a breakthrough, like in III-V materials, we are starting to see GPUs end up like CPUs?

On one hand, the stagnation is disappointing. On the other hand, it may finally make sense, for the first time, to go quad-GPU, knowing that next year's GPUs may not be much faster.

I'm thinking it may end up like CPUs. Let me put it this way: imagine you own a 2600K and you had good luck with the silicon lottery (e.g. 5GHz+ at 1.45V or under, stable under Intel Burn Test). Haswell, even with Devil's Canyon, might prove a sidegrade at 4 cores, unless you need the new instruction sets, in which case an "E" series might be justifiable.

Are GPUs starting to end up like that? Granted, GPUs are much more parallel in nature than CPUs, but they still are limited by architecture and die shrinks.



Historically, AMD's GPUs have generally offered better 3-way and 4-way GPU scaling.

[Attached image: multi-GPU scaling chart]


Considering Maxwell SLI offered only about ~60% scaling at 4K, I do not see that changing.
 
Last edited:

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
The leap from 780Ti > 980 does not appear to be bringing anything close to the 580 > 680 leap.


Why compare Ti to non-Ti?

Is it because we are talking from a technical/physics standpoint?
Then let's stick to the same physical descriptors. Like die size.

The leap from the 550mm2 780 Ti to a 550-600mm2 980 Ti will be ~50%.

And that's on the same node. So no, the GPU situation does not look anything like the current CPU situation.
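A back-of-the-envelope way to get to that ~50%, assuming GM204's performance per mm^2 carries over roughly linearly to a bigger die (a simplification - big dies rarely scale 1:1 with area):

```python
# Extrapolate a hypothetical 550-600mm^2 "big Maxwell" from the 980's per-area
# performance, using the rough figures in this thread (not benchmark data).
perf_980_vs_780ti = 1.05   # 980 ~5% faster than the 780 Ti at stock
die_980_mm2 = 400          # approx. GM204 die size

for big_die_mm2 in (550, 600):
    # naive linear scaling of performance with die area
    est = perf_980_vs_780ti * big_die_mm2 / die_980_mm2
    print(f"{big_die_mm2}mm^2 Maxwell vs 780 Ti: ~{(est - 1) * 100:.0f}% faster on paper")
# 550mm^2 lands around +44% and 600mm^2 around +58%, so ~50% sits in the middle of that range.
```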
 

CrazyElf

Member
May 28, 2013
88
21
81
You do have a point there in terms of performance per mm^2, about 50%.

Hmm ... let's do an approximation here.

Assuming an OC, let's say that the 980 performs better than the 780Ti by about 15% (5% stock).

So that would mean:
Performance per mm^2 = 550 (780 Ti die size) / 400 (rough 980 die size) x 1.15 ≈ 1.58
That's roughly 60% more performance per unit of die area.

Of course a large chip will not scale 1:1 in performance per mm^2, owing to other factors, so let's say 50%, as you say.
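To make that arithmetic explicit, here's a minimal sketch (the die sizes and the 15% OC figure are the rough numbers from this thread; the derating factor at the end is just my assumption to land near your ~50%):

```python
# Rough area-normalized comparison of the 980 (GM204) vs the 780 Ti (GK110),
# using the ballpark figures quoted in this thread.
die_780ti_mm2 = 550      # approx. 780 Ti die size
die_980_mm2   = 400      # approx. 980 die size
perf_ratio_oc = 1.15     # assumed: 980 ~15% faster than the 780 Ti with both overclocked

# Performance per mm^2 of die area, relative to the 780 Ti
perf_per_mm2 = (die_780ti_mm2 / die_980_mm2) * perf_ratio_oc
print(f"Area-normalized gain: {perf_per_mm2:.2f}x")             # ~1.58x on paper

# Big dies don't scale 1:1 with area (power, yields, uncore overhead),
# so derate the paper figure toward the ~50% estimate above.
derating = 0.95                                                  # my assumption
print(f"Derated estimate:     {perf_per_mm2 * derating:.2f}x")   # ~1.50x
```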

What did the 580 to 680 leap bring? Was it more than 50%?

Edit:
Remember, there may not be a node shrink for a while because Moore's Law has ended at 28nm and 20nm has a higher cost per transistor (at least for the next couple of years).
 
Last edited:

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
580 -> 680 was ~35-40%
IMHO big 28nm Maxwell being 50% over 780Ti is not being too generous to Nvidia.

All I'm saying is you took the worst possible time to bring up the stagnation theory.
Absolute perf gains over last-gen high-end are not impressive, that's a fact.
But are we not talking from the tech/arch standpoint?
Also take a look at their respective launch prices.

And we don't have to go deep with numbers.
Let's keep it simple:
  • On the big-medium die (GM204) they are bringing a half-generation performance jump over the big die (GK110) - on the same node (*). At a much smaller TDP (!)

Sounds to me like you forgot that last part(*) when coming up with the stagnation theory.
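For a rough sense of what "much smaller TDP" buys, here's a quick perf/W sketch. The TDP figures (250W for the 780 Ti, 165W for the 980) are the official ratings, not numbers from this thread, so treat them as my addition:

```python
# Rough perf/W comparison on the same 28nm node.
# TDPs are official board ratings (my numbers); the 5% perf delta is this thread's figure.
tdp_watts = {"780 Ti": 250, "980": 165}
rel_perf  = {"780 Ti": 1.00, "980": 1.05}

base = rel_perf["780 Ti"] / tdp_watts["780 Ti"]
for card in tdp_watts:
    perf_per_watt = rel_perf[card] / tdp_watts[card]
    print(f"{card}: {perf_per_watt / base:.2f}x the 780 Ti's perf/W")
# The 980 comes out around 1.6x the 780 Ti's perf/W - on the same node.
```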
 
Last edited:

CrazyElf

Member
May 28, 2013
88
21
81
I don't dispute that power efficiency has made some substantial gains from Fermi to Kepler to Maxwell.
 

Yuriman

Diamond Member
Jun 25, 2004
5,530
141
106
I think we'll see GPUs moving to new nodes, even if it isn't cost effective. nVidia is designing mobile-first now - imagine how well their smartphone GPUs will compete if they stay on 28nm. Intel, Qualcomm and friends will not be staying on 28nm with their GPUs. Desktop will get dragged along with it.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
Ppl kept saying who cares about power efficiency.

Perf/W is pretty much a knockout argument in ANYTHING but desktops and workstations,
and even in those two segments, once you max out the die area, perf/W translates pretty directly into perf itself.
So yes - that's why perf/W is king.

But OK, to hell with power efficiency :rolleyes:

Take a look at the consumer's king metric, perf/$.

For half the 780 Ti's price you are getting a product that's 5% slower
(and which may very well end up faster in DX12) with 33% more VRAM.
That's pretty damn impressive.
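Putting quick numbers on that perf/$ point (relative figures only, taken from the claim above):

```python
# Relative perf/$ of the new card vs the 780 Ti, using the figures in this post.
rel_perf  = 0.95   # "5% slower product"
rel_price = 0.50   # "half of 780 Ti price"

print(f"~{rel_perf / rel_price:.1f}x the 780 Ti's perf per dollar")   # ~1.9x
```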
 

wand3r3r

Diamond Member
May 16, 2008
3,180
0
0
They do appear to be slowing in progression. It doesn't help that they have been milking out every tiny bump as a "new" GPU.

580 -> +35% / $0 -> 680
680 -> +30% / +$500 -> Titan
Titan -> -10% / -$350 -> 780
780 -> +13% / +$50 -> 780 Ti
780 Ti -> +5% / -$150 -> 980

Single digit performance increases will mean pushing other things such as price cuts, efficiency and whatever else they can market since the performance isn't there.

Overall this has meant that since the 580's release in 2010 we've gotten barely over a 100% increase in performance. They've milked it out over 5 "releases", though, without dropping the price in the end (still full-priced like the 580). We finally get over 100% more, but it still costs $330-550. The "glory days" of nearly doubling every 18 months or so (24?) appear to be gone.
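Compounding the rough per-step gains listed above gives a quick sanity check on that cumulative figure (these are the ballpark numbers from this post, not benchmark data):

```python
# Compound the rough per-step gains listed above (580 -> 680 -> Titan -> 780 -> 780 Ti -> 980).
steps = {
    "580 -> 680":    1.35,
    "680 -> Titan":  1.30,
    "Titan -> 780":  0.90,   # the 780 is listed as ~10% slower than the Titan
    "780 -> 780 Ti": 1.13,
    "780 Ti -> 980": 1.05,
}

cumulative = 1.0
for step, gain in steps.items():
    cumulative *= gain
    print(f"{step:14s} cumulative: {cumulative:.2f}x over the 580")
# Ends up around 1.9x the 580 - in the same neighborhood as the
# "barely over a 100% increase" figure above.
```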
 
Last edited:

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
For half the 780 Ti's price you are getting a product that's 5% slower
(and which may very well end up faster in DX12) with 33% more VRAM.
That's pretty damn impressive.

Oh I forgot to mention - on the same node.

TAKE THAT, INTEL :D
 

wand3r3r

Diamond Member
May 16, 2008
3,180
0
0
Ppl kept saying who cares about power efficiency.

Perf/W is pretty much a knockout argument in ANYTHING but desktops and workstations,
and even in those two segments, once you max out the die area, perf/W translates pretty directly into perf itself.
So yes - that's why perf/W is king.

But OK, to hell with power efficiency :rolleyes:

Take a look at the consumer's king metric, perf/$.

For half the 780 Ti's price you are getting a product that's 5% slower
(and which may very well end up faster in DX12) with 33% more VRAM.


Sure, it matters in a laptop or in a university cluster. Most people here seem to be enthusiasts and aren't concerned about $10/year in electric costs. Sure, there is a niche in the enthusiast community that likes small computers or whatever, but in general we're waiting on performance increases and don't fall for marketing as easily. If it performed +50% over the Ti (without a 50% price increase) it would be snapped up regardless of the marketing slogans on the box.

On the "5% slower than the Ti" point - well, something within ~15% has been available for nearly half the price for nearly a year (the 290). Of course it's great that the price got cut and performance went up, but it's only incremental and it's a year later. Not to downplay the 970 - it's currently the best high-end budget card - but it's not that big of a game changer in the end.

On top of all of that, the Titan, which is the same silicon, has been around forever. A few percent bump is nice, but the 290 already brought the rip-off prices down to earth. The 970 just improved upon that.
 

dragantoe

Senior member
Oct 22, 2012
689
0
76
imagine how well their smartphone GPUs will compete if they stay on 28nm.

Considering the Tegra K1 is almost 2x the performance of the Snapdragon 805, and Atom is a joke in terms of GPU perf, I see no reason for them to move to a different node any time soon...
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
Single digit performance increases will mean pushing other things such as price cuts, efficiency and whatever else they can market since the performance isn't there.

This is getting silly.

All of a sudden price is just a nuisance and a side argument.
Then just buy two 980s and compare that to last gen.

Suddenly we don't care about price, we don't care about technical aspects, we don't give a damn about all the new toys.

Most people here seem to be enthusiasts and aren't concerned about $10/year in electric costs.

Who even mentions electric costs here, other than you?
And all of a sudden you are only concerned about ppl with unlimited $$.
Then the answer is still the same: 4x Titan Blacks on a water loop. But have you ever given that advice to anyone?

Never give credit where credit is due :rolleyes:
 
Last edited:

CrazyElf

Member
May 28, 2013
88
21
81
They do appear to be slowing in progression. It doesn't help that they have been milking out every tiny bump as a "new" GPU.

680 -> +30% / +$500 -> Titan
Titan -> -10% / -$350 -> 780
780 -> +13% / +$50 -> 780 Ti
780 Ti -> +5% / -$150 -> 980

Single digit performance increases will mean pushing other things such as price cuts, efficiency and whatever else they can market since the performance isn't there.

Compare that to, say, what we were getting a few years ago. It does seem to be slowing down, relatively speaking.

The "big" Maxwell is looking like another uber-expensive Titan-like card, unless AMD has something pretty solid to respond with. Let's hope they have another 290, only a bit more power-efficient this time around.
 

Pariah

Elite Member
Apr 16, 2000
7,357
20
81
Efficiency appears to be the be-all end-all in the industry right now, unless your name is AMD. The 900 series reviews showed there would be plenty of clock room left on the 970 and 980 had Nvidia chosen to raise the TDP slightly. They easily could have increased performance by 10% with higher clock rates without significantly increasing power usage. Nvidia obviously knows what these chips are capable of; it looks like they are artificially holding clock rates down so they can release a Ti/Titan in the near future.

Sandy Bridge 2600k/2500k was released almost 4 years ago. Was there any sample from day one that wasn't able to hit the 4GHz barrier without crazy voltage increases? When did Intel finally release a 4GHz CPU? About 3 months ago. Why did it take that long? I have no idea, maybe someone else does.
 

Grooveriding

Diamond Member
Dec 25, 2008
9,147
1,329
126
It is going that way. The performance improvement from the 780 Ti to the 980, with both at stock, is about 8% on average. That is, for all intents and purposes, useless if you have a 780 Ti and want better performance. I run my 780 Tis at 1300/7500 and the only 980 I have seen that was faster was one running at close to 1600MHz. It's just not an upgrade. The review Linus Tech Tips did is interesting, as they ran all the cards overclocked, and the 780 Ti is very slightly faster half the time and about the same in most of the other benches.

There is still room for actual performance gains to come, though. We'll likely see a 550mm2+ Maxwell in a few months that should give us 35-40% over a 780 Ti. But what can they do when they can't afford TSMC's 20nm? If not for the efficiency of Maxwell we would have gotten nothing better than Kepler. Unlike Intel, which chooses to deliver more efficiency and slight performance increases even with a new process, Nvidia and AMD are at the mercy of TSMC, competing against other customers for fab space and for what they can manufacture profitably.

I think where it will get really ugly is in a year, when TSMC's 16nm FinFET is available and they finally want to get off five-plus years of 28nm: they will still be competing against the likes of Apple for fab space. So they may not be able to afford the move immediately and we wait even longer, plus GPUs will probably carry truly insane retail prices to compensate for the manufacturing cost.
 
Last edited:

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Sandy Bridge 2600k/2500k was released almost 4 years ago. Was there any sample from day one that wasn't able to hit the 4GHz barrier without crazy voltage increases?

Probably because the 2500K / 2600K are only 99.9% stable at 4GHz, and there would be class-action lawsuits if they sold the CPUs guaranteeing 100% error-free use at that speed. The first 1.x GHz P3, released to play catch-up with AMD, was basically overclocked, and Tom's got it to fail doing Linux kernel builds.

When you OC to play a game and get a crash or corrupted save now and then, you can shrug and reboot or load an older save. If someone's "real work" gets corrupted, they could sue.
 

cytg111

Lifer
Mar 17, 2008
25,663
15,164
136
Good question. Of course CPUs are targeting watts above all else atm, trying to get into those smaller form factors - and we also know that a new process node is targeted at specific traits, i.e. perf/watt, high clocks, low leakage, etc. So when GloFo and TSMC are designing their new nodes, what sets the direction? Demand, I'm sure... but the point being, being fabless means you've got to build with the bricks available to you, and those bricks may be perf/watt-oriented atm.

fronzentundra has a point here

http://forums.anandtech.com/showpost.php?p=36729927&postcount=47

Who knows, we may see 3-way SLI on a single card in the future :).
 
Last edited:

Pariah

Elite Member
Apr 16, 2000
7,357
20
81
Probably because the 2500K / 2600K are only 99.9% stable at 4GHz, and there would be class-action lawsuits if they sold the CPUs guaranteeing 100% error-free use at that speed.

Wouldn't the binning process catch that 0.1%?

Intel never released a socket 1155 Sandy Bridge above a base clock of 3.5GHz. Socket 2011 made it to 3.6GHz. Ivy Bridge on 1155 never made it past 3.5GHz either. If AMD had anything worthwhile at the high end, you can bet these two lines would not have topped out at 3.5GHz.
 

CrazyElf

Member
May 28, 2013
88
21
81
At this point, I'd say that we are going to be on 28nm for a while. That's the key conclusion that I have drawn from all of this.

Big Maxwell will bring another 35-40% at stock, maybe up to 50% (assuming more OC headroom than a 780 Ti, with a third-party PCB and water cooling), and its pricing ... will depend on AMD. Otherwise we can expect to be milked. For a single generation, I guess that's a decent gain overall.

But it's not Maxwell that makes me interested, nor whatever AMD has. It's the generation beyond this one, and the one after that. It's all down to architectural gains from now on.
 
Last edited:

xpea

Senior member
Feb 14, 2014
458
156
116
Considering the Tegra K1 is almost 2x the performance of the Snapdragon 805, and Atom is a joke in terms of GPU perf, I see no reason for them to move to a different node any time soon...
And still, Erista (Tegra M1 with Denver CPU + Maxwell GPU) taped out this summer on 20nm...
Can't wait for this one in a tablet :thumbsup:
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Why compare Ti to non-Ti?

Is it because we are talking from a technical/physics standpoint?
Then let's stick to the same physical descriptors. Like die size.

The leap from the 550mm2 780 Ti to a 550-600mm2 980 Ti will be ~50%.

And that's on the same node. So no, the GPU situation does not look anything like the current CPU situation.

The 580 was the equivalent of the 780 Ti in the stack, and the 980 is also the equivalent of the 680 (as far as we know).
 

escrow4

Diamond Member
Feb 4, 2013
3,339
122
106
If a 980 Ti is released and it's at least ~30% faster consistently over a 780 Ti, I'll buy one day 1, guaranteed, assuming Gigabyte has a decent Windforce model out. $1K or not. A 980 is pffft.
 

CrazyElf

Member
May 28, 2013
88
21
81
So far, the more I think about this, the more pessimistic I'm becoming. There will continue to be gains, but the rate of gain per generation will drop rapidly. It already has, I would argue, to an extent.

If a 980 Ti is released and it's at least ~30% faster consistently over a 780 Ti, I'll buy one day 1, guaranteed, assuming Gigabyte has a decent Windforce model out. $1K or not. A 980 is pffft.

That's why Big Maxwell will cost so much. They know that there are people that are willing to be milked.

Not sure why you think so highly of the Windforce. I highly doubt that the G1 or SOC Force will have the top custom PCB. Most likely that will go to the MSI Lightning or EVGA Classified.
 

SoulWager

Member
Jan 23, 2013
155
0
71
I wouldn't worry too much; CPUs are still getting more power-efficient. The reason CPUs can't turn that into more performance is that CPU workloads are difficult to parallelize across multiple cores. GPU workloads don't have that problem, so you can get more performance by just throwing a lot more cores at them in parallel.
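One way to put numbers on that (my framing, not the poster's) is Amdahl's law: speedup is capped by the serial fraction of the workload, which is why extra parallel units keep paying off for graphics but not for typical CPU code.

```python
# Amdahl's law: speedup from N parallel units given a serial fraction s.
# (Illustrative sketch of the point above; the workload fractions are made up.)
def amdahl_speedup(serial_fraction: float, units: int) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / units)

# A CPU-like workload with, say, 20% serial work barely benefits past a few cores,
# while a GPU-like workload that is ~99.9% parallel keeps scaling with more shaders.
for label, s in [("CPU-like (20% serial)", 0.20), ("GPU-like (0.1% serial)", 0.001)]:
    for n in (4, 16, 2048):
        print(f"{label:22s} {n:5d} units -> {amdahl_speedup(s, n):7.1f}x speedup")
```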