About the misconception of "compute" in games

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
It is often said that AMD cards are so fast in certain titles because of "compute". Usually no one says what they mean by compute; it is just thrown in as a buzzword. I've always thought that this touted special compute ability was rather irrelevant and that it was instead all about raw power (SP GFLOPs) and bandwidth.

I recently had a look at some "Tahiti LE" reviews which are quite interesting, because Tahiti LE cards have about the same SP GFLOPs as a 670/680 hybrid and the same memory bandwidth.
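As a rough sanity check on the "same SP GFLOPs, same bandwidth" claim, the theoretical peaks can be worked out from shader count, clock and memory configuration. This is only a sketch: the spec figures below are assumed reference values (base clocks, no boost) and are not taken from the reviews mentioned above.

```python
# Peak single-precision throughput and memory bandwidth from card specs.
# All spec figures are ASSUMED reference values, not numbers from the thread.

def peak_sp_gflops(shaders: int, clock_mhz: float) -> float:
    """Each shader can issue one fused multiply-add (2 FLOPs) per clock."""
    return 2 * shaders * clock_mhz / 1000.0

def bandwidth_gbs(bus_bits: int, data_rate_gbps: float) -> float:
    """Bus width in bytes times the per-pin data rate in Gbit/s."""
    return bus_bits / 8 * data_rate_gbps

cards = {
    # name: (shaders, base clock in MHz, bus width in bits, Gbit/s per pin)
    "Tahiti LE (7870 XT)": (1536, 925, 256, 6.0),
    "GTX 670": (1344, 915, 256, 6.0),
    "GTX 680": (1536, 1006, 256, 6.0),
}

def summarize(specs):
    """Return {name: (rounded GFLOPs, rounded GB/s)} for every card."""
    return {
        name: (round(peak_sp_gflops(s, c)), round(bandwidth_gbs(b, r)))
        for name, (s, c, b, r) in specs.items()
    }

for name, (gflops, gbs) in summarize(cards).items():
    print(f"{name}: {gflops} GFLOPs SP, {gbs} GB/s")
```

With these assumed specs, Tahiti LE lands between the 670 and the 680 on paper FLOPs and matches both on bandwidth, which is what the "670/680 hybrid" description suggests.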

Now Tahiti LE is still Tahiti and thus should possess quite some compute prowess, but when we look at the results, the card lands at 670/680 levels or even below. The sizable advantage in titles like Sleeping Dogs, AvP, Arma2, Metro2033 etc. is completely gone.

So my conclusion:
Kepler and GCN as architectures are equally good when it comes to compute-heavy games; there is no difference. What matters more, and what sets the two apart, is the actual amount of raw power their individual SKUs have.

Any thoughts?
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
The Kepler flagship card has the same or better "compute" performance in games as the third/fourth best GCN card. Therefore, Kepler is equal to GCN at compute performance.

Is that what you're trying to say?
 

Haserath

Senior member
Sep 12, 2010
793
1
81
If there are a lot of interacting objects in the scene AMD has the inter-core bandwidth advantage.

In ray tracing, where light interacts and bounces several times in the scene, GCN is many times faster than Kepler.

No games really take advantage of this. Dirt Showdown has a lighting engine that gives AMD a bit of an advantage, but it's nowhere near ray tracing's intensity.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
The Kepler flagship card has the same or better "compute" performance in games as the third/fourth best GCN card. Therefore, Kepler is equal to GCN at compute performance.

Is that what you're trying to say?

In a way, yes. The comparison is a bit unfair, because it's just logical that a GPU with more raw power will perform better if that power can be used. But that doesn't say anything about the architecture itself. GK110 is Kepler too, the architecture is the same. It's just bigger, has more units and memory controllers. From the beginning I've considered the Kepler lineup misplaced in the market. GK104 should have been positioned above Pitcairn but below Tahiti where it belongs according to its specs.

If anything, Kepler seems to be more efficient than GCN regarding the use of raw power, since on average it is 20% faster with the same resources (FLOPs and bandwidth). Now perf/W and perf/mm2 are other factors where things are different and in favor of AMD. I'm just talking about architectural efficiency.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
In order to measure the compute performance between the two cards we'd have to look at compute benchmarks, not games. Especially when you consider in games the 680 is about 25% faster than the 7870XT. The fact that the 7870XT keeps up with it in compute intensive games just shows how weak the 680's compute capability is.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
In order to measure the compute performance between the two cards we'd have to look at compute benchmarks, not games. Especially when you consider in games the 680 is about 25% faster than the 7870XT. The fact that the 7870XT keeps up with it in compute intensive games just shows how weak the 680's compute capability is.

We're talking about compute in games, look at the thread title. You cannot just arbitrarily change the subject ;)
The 7870XT keeps up because it has the same compute power in terms of GFLOPs and the same bandwidth. That's it. If Kepler were so bad at compute in games, it would be slower in those games, but it isn't.

Aside from that, what is "compute" and how do you know for certain that it is "compute" that is bottlenecking performance in game X and not memory bandwidth?
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
We're talking about compute in games, look at the thread title. You cannot just arbitrarily change the subject ;)
The 7870XT keeps up because it has the same compute power in terms of GFLOPs and the same bandwidth. That's it.

How can you say the 680 and 7870XT have the same compute power when there are so many other functions going on while gaming? If you compare OpenCL benchmarks alone, for example, Pitcairn, which is slower than the 7870XT, creams the 680. That would be a pure compute comparison.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
How can you say the 680 and 7870XT have the same compute power when there are so many other functions going on while gaming? If you compare OpenCL benchmarks alone, for example, Pitcairn, which is slower than the 7870XT, creams the 680. That would be a pure compute comparison.

I never said they have the same compute power in general. I contested that it is this compute thing that is the reason why Tahiti/GCN is quite a bit faster in certain gaming titles. Many here think so, at least that is my impression. The thread title clearly states "compute in games". Cannot get any clearer than that.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0
Here is another take on current GPU architectures and compute performance:
Why AMD FirePro Still Cannot Compete Against NVIDIA Quadro, Old or New?


The K5000M is essentially a 100 W TDP GTX 670.

Apples to Oranges - Desktop AMD FirePro W9000 Beaten by Laptop-based Kepler - K5000M
While we aren't currently able to measure apples-to-apples e.g. workstation parts, we can give you an insight into what Kepler-based Quadros can do. K5000M 4GB is almost identical to its desktop version, with some sacrifices made in order to fit the 100 Watt thermal limit. 22% less power means that NVIDIA had to fuse off one SMX cluster from the GK104, resulting in a 1344-core part. Furthermore, the 4GB GDDR5 memory is clocked down to just 750MHz QDR (three billion transfers per second), giving 96GB/s of compute/video bandwidth. Our benchmark system consisted of an Intel Core i7-2960XM, 16GB DDR3-1333 memory and a single SSD. Note that this system is significantly weaker than the Intel Core i7-3770K processor. 16GB DDR3-1333 memory and the solid state drive were almost identical in both cases.
In SPEC ViewPerf 11, the results are as follows:
As you can see, a notebook part with a 32nm notebook CPU operating at 2.7GHz (3.7GHz Turbo) was competitive against a 22nm desktop CPU operating at 3.5GHz with a 3.9GHz Turbo mode.
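For reference, the 96GB/s figure in the quoted article follows from the memory clock if a 256-bit bus is assumed. The bus width is not stated in the quote, so treat it as an assumption in this small sketch:

```python
# Memory bandwidth from the quoted 750 MHz QDR memory clock.
# The 256-bit bus width is an ASSUMPTION; the quoted article does not state it.

def gddr5_bandwidth_gbs(mem_clock_mhz: float, bus_bits: int,
                        transfers_per_clock: int = 4) -> float:
    """QDR signalling moves four transfers per memory clock on every pin."""
    transfers_per_second = mem_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_second = transfers_per_second * bus_bits / 8
    return bytes_per_second / 1e9

print(gddr5_bandwidth_gbs(750, 256), "GB/s")
```

750 MHz times four gives the "three billion transfers per second" from the article, and 32 bytes per transfer across the assumed bus gives the 96 figure.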
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
Somehow...someway....there must be a way to make this generation of NV cards appear as good or faster than Tahiti.
Keep trying Boxy....it's just a hoot watching it.:)
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
I never said they have the same compute power in general. I contested that it is this compute thing that is the reason why Tahiti/GCN is quite a bit faster in certain gaming titles. Many here think so, at least that is my impression. The thread title clearly states "compute in games". Cannot get any clearer than that.
Notty musta missed that bit...(interoffice memo down?):whiste:
 

Jaydip

Diamond Member
Mar 29, 2010
3,691
21
81
I believe people often use SP and DP compute interchangeably, but they are totally different. The DP rate of Kepler consumer cards is 1/24th of the SP rate, compared to 1/8th on Fermi. But very few applications use DP; the medical imaging software we write uses it, but still, most are SP in nature. I think Kepler does fairly well in SP compute, so it will do quite well in games that use it extensively.
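To put the 1/24 versus 1/8 ratios in perspective, here is a minimal sketch that derives the DP rate from an SP rate. The 3000 GFLOPs SP figure is a made-up round number used for both ratios, purely so the ratio is the only thing that changes; it is not a measured value for any card.

```python
from fractions import Fraction

# DP throughput derived from SP throughput and the DP:SP ratio.
# The 3000 GFLOPs SP input is a HYPOTHETICAL round number, not a real card.

def dp_gflops(sp_gflops: int, dp_ratio: Fraction) -> float:
    """Scale the single-precision rate by the double-precision ratio."""
    return float(sp_gflops * dp_ratio)

KEPLER_CONSUMER_RATIO = Fraction(1, 24)  # ratio quoted in the post
FERMI_RATIO = Fraction(1, 8)             # ratio quoted in the post

sp = 3000
print("1/24 ratio:", dp_gflops(sp, KEPLER_CONSUMER_RATIO), "GFLOPs DP")
print("1/8 ratio: ", dp_gflops(sp, FERMI_RATIO), "GFLOPs DP")
```

At equal SP throughput the 1/8 part has three times the DP rate, which only matters for the few applications that actually use DP.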
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
In a way, yes. The comparison is a bit unfair, because it's just logical that a GPU with more raw power will perform better if that power can be used. But that doesn't say anything about the architecture itself. GK110 is Kepler too, the architecture is the same. It's just bigger, has more units and memory controllers. From the beginning I've considered the Kepler lineup misplaced in the market. GK104 should have been positioned above Pitcairn but below Tahiti where it belongs according to its specs.

If anything, Kepler seems to be more efficient than GCN regarding the use of raw power, since on average it is 20% faster with the same resources (FLOPs and bandwidth). Now perf/W and perf/mm2 are other factors where things are different and in favor of AMD. I'm just talking about architectural efficiency.

GK110 is not simply a bigger GK104 like Tahiti is a bigger Pitcairn. There are physical components that are completely absent from GK104, such as a hardware scheduler. GK110 is somewhat irrelevant when talking about the architecture of GK104.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
Somehow...someway....there must be a way to make this generation of NV cards appear as good or faster than Tahiti.
Keep trying Boxy....it's just a hoot watching it.:)

If you have nothing of substance to contribute to this thread, please refrain from posting and trolling here. Thank you.

GK110 is not simply a bigger GK104 like Tahiti is a bigger Pitcairn. There are physical components that are completely absent from GK104, such as a hardware scheduler. GK110 is somewhat irrelevant when talking about the architecture of GK104.

Source? It makes no sense for Nvidia to change scheduling again with GK110. And remember, I was talking about Kepler, not GK104 specifically. GK110 is a bigger Kepler. Of course there are differences like HyperQ, DP etc., but these have no relevance for gaming. Caches are a different thing, though. To my knowledge Tahiti is a cache monster and has lots of it. But looking at the Tahiti LE results, that doesn't seem to make any difference in gaming either. So I doubt the improvements of GK110 in this area will matter much or at all compared to GK104.
 

Mopetar

Diamond Member
Jan 31, 2011
8,154
6,871
136
Would be useful to have a solid definition for what you mean by "compute" before going any further. If nothing else, give several examples where "compute" performance is important in games. As in titles and specific aspects of that game.

Otherwise this is just going to turn into the usual he said she said crapfest, flame war, and circle jerk.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
That is exactly the problem! You would have to ask those members here that have used this term in the context of discussing gaming performance.
Personally, I have no idea which is why I've put the word in double quotes in my first post.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106

Performance in SPEC ViewPerf is a function of driver optimizations. It has nothing to do with compute capabilities, per se.

Given that AMD skipped us from their FirePro briefing, let alone review list, we had to wait until the first results came out. Thanks to our colleague Joel at HotHardware, you are able to see that a $4000 FirePro W9000 cannot keep up with Kepler-based Quadro 6000 in numerous tests, and in some tests the W9000 suffered an indignity of being defeated by a $749.99 Quadro 4000.

Without going back and reading the original review from HotHardware, which your link references for its AMD performance as they didn't have a card, IIRC AMD said they hadn't optimized drivers for the SPEC ViewPerf 11 benchmark suite. Even if they had though, I doubt they'd beat nVidia's performance. I'm pretty certain if you took all of the workstations at Autodesk, NewTek, and any other 3D software company, you'd be lucky to find any of them running AMD graphics cards. The software is developed running nVidia graphics cards.
 

mango123

Senior member
Sep 1, 2012
214
0
0
they are talking about Direct Compute which is a feature of I believe AMD only GPUs... which is why all the BITCOIN miners use AMD setups, and not crossfire 680s. The new gear from that one company that is made specifically for mining is a GPU that literally JUST directcomputes.

It's a form of... processing.

They are not talking about "Compute" power.

Direct Compute power.

http://en.wikipedia.org/wiki/DirectCompute :colbert:
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
DirectCompute works on every DX11 GPU.
The algorithm used for bitcoin mining uses specific functions that AMD GPUs can do faster than Nvidia GPUs. I believe it is some shuffle function, but I don't remember where I've read that. Anyway, this has nothing to do with DirectCompute.
 

VulgarDisplay

Diamond Member
Apr 3, 2009
6,188
2
76
It is often said that AMD cards are so fast in certain titles because of "compute". Usually no one says what they mean by compute; it is just thrown in as a buzzword. I've always thought that this touted special compute ability was rather irrelevant and that it was instead all about raw power (SP GFLOPs) and bandwidth.

I recently had a look at some "Tahiti LE" reviews which are quite interesting, because Tahiti LE cards have about the same SP GFLOPs as a 670/680 hybrid and the same memory bandwidth.

Now Tahiti LE is still Tahiti and thus should possess quite some compute prowess, but when we look at the results, the card lands at 670/680 levels or even below. The sizable advantage in titles like Sleeping Dogs, AvP, Arma2, Metro2033 etc. is completely gone.

So my conclusion:
Kepler and GCN as architectures are equally good when it comes to compute-heavy games; there is no difference. What matters more, and what sets the two apart, is the actual amount of raw power their individual SKUs have.

Any thoughts?

Wait, so a $240 GPU performs the same as a $500 GPU, therefore GCN is slower than Kepler? What are you even trying to say? My brain just melted. o_O
 

Pottuvoi

Senior member
Apr 16, 2012
416
2
81
Would be useful to have a solid definition for what you mean by "compute" before going any further. If nothing else, give several examples where "compute" performance is important in games. As in titles and specific aspects of that game.
Compute means calculations on the GPU that do not go through the standard pixel/vertex shading phases.
You can also do 'GPGPU' by rendering a quad to a buffer and calculating whatever you want with a pixel shader (most trick effects on PS2 did this, as did particle physics in Halo: Reach).

Battlefield 3 uses DC for lighting the environment and possibly for other things as well.
(forward rendering to g-buffer, DC for lighting, forward rendering for transparent surfaces)
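To make the "g-buffer first, compute for lighting" idea a bit more concrete, here is a CPU-side sketch of what such a lighting pass does per pixel. It is a simplified illustration with a plain Lambert term and inverse-square falloff, not Battlefield 3's actual implementation; the function names and the g-buffer layout are made up for the example.

```python
import math

# CPU sketch of a deferred lighting pass over a g-buffer.
# HYPOTHETICAL layout: each g-buffer entry is (world position, unit normal).
# This is NOT how any specific game implements it.

def light_pixel(position, normal, lights):
    """Sum diffuse (Lambert) contributions of all point lights for one pixel."""
    total = 0.0
    for light_pos, intensity in lights:
        to_light = [l - p for l, p in zip(light_pos, position)]
        dist = math.sqrt(sum(c * c for c in to_light))
        if dist == 0.0:
            continue  # light sits exactly on the surface point; skip it
        direction = [c / dist for c in to_light]
        n_dot_l = max(0.0, sum(n * d for n, d in zip(normal, direction)))
        total += intensity * n_dot_l / (dist * dist)  # inverse-square falloff
    return total

def lighting_pass(gbuffer, lights):
    """Every pixel is independent, so on a GPU each one maps to its own thread."""
    return [light_pixel(pos, nrm, lights) for pos, nrm in gbuffer]

gbuffer = [((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
           ((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))]
lights = [((0.0, 0.0, 2.0), 4.0)]  # (position, intensity)
print(lighting_pass(gbuffer, lights))
```

The point of doing this as a compute pass is that the work per pixel is just arithmetic plus buffer reads, with no rasterization involved, so its cost scales with pixels times lights.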

Compute performance depends a lot on the methods the software writer uses, and small changes in the software can cause a catastrophic performance drop.
These pitfalls differ on each architecture, as do the best-case scenarios, which makes looking at a single small benchmark worthless in the grand scheme.

Currently AMD holds the GFLOP crown, but Nvidia parts are faster in some cases.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
Wait, so a $240 GPU performs the same as a $500 GPU, therefore GCN is slower than Kepler? What are you even trying to say? My brain just melted. o_O

Take a deep breath and read my posts again, carefully. There is nothing there about price. Nothing. Price has nothing to do with architecture whatsoever.
 

ICDP

Senior member
Nov 15, 2012
707
0
0
Take a deep breath and read my posts again. There is nothing there about price. Nothing. Do you have trouble staying on a certain topic?

The problem is that the cards you are comparing are not in the same price bracket. So people will automatically jump on that, regardless of your true reason for the comparison.

Honestly, most don't know or don't care why the 7970 is faster than the GTX 680 in most games. All that matters is that it is, and on top of that it is cheaper.