|
|
 |
|
02-06-2013, 10:56 PM
|
#1
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
About the misconception of "compute" in games
It is often said that AMD cards are so fast in certain titles because of "compute". Usually no one says what they actually mean by compute; it is just thrown in as a buzzword. I've always thought that this touted special compute ability was rather irrelevant and that it was instead all about raw power (SP GFLOPs) and bandwidth.
I recently had a look at some "Tahiti LE" reviews which are quite interesting, because Tahiti LE cards have about the same SP GFLOPs as a 670/680 hybrid and the same memory bandwidth.
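For reference, the back-of-the-envelope arithmetic behind that comparison can be sketched in a few lines of Python. The shader counts, base clocks (no boost), and memory data rates below are reference specs quoted from memory and should be treated as assumptions; the function names are made up for illustration.

```python
def sp_gflops(shaders, clock_mhz):
    """Peak single-precision GFLOPs: one FMA (2 FLOPs) per shader per clock."""
    return shaders * 2 * clock_mhz / 1000.0


def bandwidth_gbs(bus_width_bits, data_rate_mtps):
    """Peak memory bandwidth in GB/s from bus width and effective data rate (MT/s)."""
    return bus_width_bits / 8 * data_rate_mtps / 1000.0


# Assumed reference specs: (shaders, base clock MHz, bus width bits, data rate MT/s)
CARDS = {
    "GTX 670": (1344, 915, 256, 6008),
    "GTX 680": (1536, 1006, 256, 6008),
    "Tahiti LE (7870 XT)": (1536, 925, 256, 6000),
}


def summarize(cards):
    """Map each card name to (rounded SP GFLOPs, rounded GB/s)."""
    return {
        name: (round(sp_gflops(s, c)), round(bandwidth_gbs(b, r)))
        for name, (s, c, b, r) in cards.items()
    }


for name, (gflops, gbs) in summarize(CARDS).items():
    print(f"{name}: ~{gflops} SP GFLOPs, ~{gbs} GB/s")
```

Under these assumed specs Tahiti LE sits between the 670 and the 680 in peak SP GFLOPs, with essentially identical memory bandwidth.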
Since Tahiti LE is still Tahiti, it should possess considerable compute prowess, but when we look at the results, the card lands at 670/680 levels or even below. The sizable advantage in titles like Sleeping Dogs, AvP, Arma 2, Metro 2033 etc. is completely gone.
So my conclusion:
Kepler and GCN as architectures are equally good when it comes to compute-heavy games; there is no difference. What matters more, and what sets the two apart, is the actual amount of raw power their individual SKUs have.
Any thoughts?
|
|
|
02-06-2013, 11:07 PM
|
#2
|
|
Diamond Member
Join Date: Jan 2005
Location: Cape Cod MA
Posts: 5,502
|
My thoughts are that most people have no idea what they are talking about.
__________________
I'm a Video card fanboy- 60+ old school to present top end cards
Heat 123-0-0
|
|
|
02-06-2013, 11:27 PM
|
#3
|
|
Golden Member
Join Date: Jan 2011
Posts: 1,525
|
The Kepler flagship card has the same or better "compute" performance in games as the fourth-best GCN card. Therefore, Kepler is equal to GCN at compute performance.
Is that what you're trying to say?
__________________
Desktop: Thermaltake V-4 Black case | Gigabyte GA-Z68AP-D3 | Core i5 2500k @ 4 GHz | ASUS Radeon HD 7870 DirectCU II 2 GB @ 1010 MHz
8 GB G.Skill DDR3 RAM 1333 MHz | 120GB OCZ Vertex 3 SSD & Western Digital 500 GB HDD | Antec 650w PSU | Acer 1080p 60 Hz 21.5'' | Windows 8 Pro
Laptop: ASUS K52Jr-X5 | Core i3-350m @ 2.26 GHz| Mobility Radeon HD 5470 1 GB @ 750 MHz
4 GB DDR3 RAM | 90 GB OCZ Agility 3 SSD | 1366x768 15.6'' | Windows 8 Pro /Ubuntu 12
|
|
|
02-06-2013, 11:28 PM
|
#4
|
|
Senior Member
Join Date: Sep 2010
Posts: 537
|
If there are a lot of interacting objects in the scene, AMD has the inter-core bandwidth advantage.
In ray tracing, where light interacts and bounces several times in the scene, GCN is many times faster than Kepler.
No games really take advantage of this yet. Dirt Showdown has a lighting engine that gives AMD a bit of an advantage, but it's nowhere near ray tracing's intensity.
|
|
|
02-06-2013, 11:31 PM
|
#5
|
|
Diamond Member
Join Date: Jan 2005
Location: Cape Cod MA
Posts: 5,502
|
Maybe if GE takes off we will see more of these advantages put to good use
|
|
|
02-06-2013, 11:41 PM
|
#6
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
Quote:
Originally Posted by Red Hawk
The Kepler flagship card has the same or better "compute" performance in games as the fourth-best GCN card. Therefore, Kepler is equal to GCN at compute performance.
Is that what you're trying to say?
|
In a way, yes. The comparison is a bit unfair, because it's only logical that a GPU with more raw power will perform better if that power can be used. But that doesn't say anything about the architecture itself. GK110 is Kepler too; the architecture is the same. It's just bigger, with more units and memory controllers. From the beginning I've considered the Kepler lineup misplaced in the market. GK104 should have been positioned above Pitcairn but below Tahiti, where it belongs according to its specs.
If anything, Kepler seems to be more efficient than GCN in its use of raw power, since on average it is 20% faster with the same resources (FLOPs and bandwidth). Perf/W and perf/mm² are other factors where things are different and in favor of AMD. I'm just talking about architectural efficiency.
Last edited by boxleitnerb; 02-06-2013 at 11:55 PM.
|
|
|
02-07-2013, 12:04 AM
|
#7
|
|
Diamond Member
Join Date: Aug 2009
Location: Christchurch, NZ
Posts: 5,149
|
To measure the compute performance of the two cards we'd have to look at compute benchmarks, not games, especially when you consider that in games the 680 is about 25% faster than the 7870XT. The fact that the 7870XT keeps up with it in compute-intensive games just shows how weak the 680's compute capability is.
|
|
|
02-07-2013, 12:09 AM
|
#8
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
Quote:
Originally Posted by 3DVagabond
To measure the compute performance of the two cards we'd have to look at compute benchmarks, not games, especially when you consider that in games the 680 is about 25% faster than the 7870XT. The fact that the 7870XT keeps up with it in compute-intensive games just shows how weak the 680's compute capability is.
|
We're talking about compute in games; look at the thread title. You cannot just arbitrarily change the subject.
The 7870XT keeps up because it has the same compute power in terms of GFLOPs and the same bandwidth. That's it. If Kepler were so bad at compute in games, it would be slower in those games, but it isn't.
Aside from that, what is "compute" and how do you know for certain that it is "compute" that is bottlenecking performance in game X and not memory bandwidth?
Last edited by boxleitnerb; 02-07-2013 at 12:12 AM.
|
|
|
02-07-2013, 12:12 AM
|
#9
|
|
Diamond Member
Join Date: Aug 2009
Location: Christchurch, NZ
Posts: 5,149
|
Quote:
Originally Posted by boxleitnerb
We're talking about compute in games; look at the thread title. You cannot just arbitrarily change the subject.
The 7870XT keeps up because it has the same compute power in terms of GFLOPs and the same bandwidth. That's it.
|
How can you say the 680 and 7870XT have the same compute power when there are so many other functions going on while gaming? If you compare OpenCL benchmarks alone, for example, Pitcairn, which is slower than the 7870XT, creams the 680. That would be a pure compute comparison.
|
|
|
02-07-2013, 12:15 AM
|
#10
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
Quote:
Originally Posted by 3DVagabond
How can you say the 680 and 7870XT have the same compute power when there are so many other functions going on while gaming? If you compare OpenCL benchmarks alone, for example, Pitcairn, which is slower than the 7870XT, creams the 680. That would be a pure compute comparison.
|
I never said they have the same compute power in general. I contested that it is this compute thing that is the reason why Tahiti/GCN is quite a bit faster in certain gaming titles. Many here think so, at least that is my impression. The thread title clearly states "compute in games". Cannot get any clearer than that.
|
|
|
02-07-2013, 12:27 AM
|
#11
|
|
Diamond Member
Join Date: Jan 2010
Location: Beantown
Posts: 3,146
|
Here is another take on the current GPU architectures and compute performance:
Why AMD FirePro Still Cannot Compete Against NVIDIA Quadro, Old or New?
The K5000M is essentially a 100-watt TDP GTX 670.
Quote:
Apples to Oranges - Desktop AMD FirePro W9000 Beaten by Laptop-based Kepler - K5000M
While we aren't currently able to measure apples-to-apples e.g. workstation parts, we can give you an insight into what Kepler-based Quadros can do. K5000M 4GB is almost identical to its desktop version, with some sacrifices made in order to fit the 100 Watt thermal limit. 22% less power means that NVIDIA had to fuse off one SMX cluster from the GK104, resulting in 1344-core part. Furthermore, the 4GB GDDR5 memory is clocked down to just 750MHz QDR (three billion transfers per second), giving 96GB of compute/video bandwidth. Our benchmark system was consisted out of Intel Core i7-2960XM, 16GB DDR3-1333 memory and a single SSD. Note that this system is significantly weaker than Intel Core i7-3770K processor. 16GB DDR3-1333 memory and the solid state drive were almost identical in both cases.
In SPEC ViewPerf 11, the results are as follow:
As you can see, a notebook part with a 32nm notebook CPU operating at 2.7GHz (3.7GHz Turbo) was competitive against a 22nm desktop CPU operating at 3.5GHz with a 3.9GHz Turbo mode.
|
__________________
I5 750@3940mhz , Gigabyte p55 ud4p
1600mhz ddr3 4GB
GTX 660 2gb SC
Let's make sure history never forgets... the name... 'Enterprise'. Picard out.
|
|
|
02-07-2013, 12:31 AM
|
#12
|
|
Golden Member
Join Date: Dec 2009
Posts: 1,019
|
Somehow...someway....there must be a way to make this generation of NV cards appear as good or faster than Tahiti.
Keep trying Boxy....it's just a hoot watching it.
__________________
Intel Core i7 860 Quad @ 3.8GHz |Gigabyte P55-UD4 MB
Corsair H50 Liquid Cooling|Patriot Wildfire SSD 120GB
XFX Radeon HD5850 |Corsair DDR3 RAM x 12GB
2x HDD 2.75TB |X-Fi Extreme Gamer Fatal1ty Pro
Dell Ultrasharp U2410 |Logitech Z323 2.1 Audio
Abee 700W PSU |Windows 7 64bit
Last edited by Will Robinson; 02-07-2013 at 12:37 AM.
|
|
|
02-07-2013, 12:35 AM
|
#13
|
|
Golden Member
Join Date: Dec 2009
Posts: 1,019
|
Quote:
|
I never said they have the same compute power in general. I contested that it is this compute thing that is the reason why Tahiti/GCN is quite a bit faster in certain gaming titles. Many here think so, at least that is my impression. The thread title clearly states "compute in games". Cannot get any clearer than that.
|
Notty musta missed that bit...(interoffice memo down?)
|
|
|
02-07-2013, 12:36 AM
|
#14
|
|
Platinum Member
Join Date: Mar 2010
Posts: 2,038
|
I believe people often use SP and DP compute interchangeably, but they are totally different. The DP rate of Kepler consumer cards is 1/24 of the SP rate, compared to 1/8 on Fermi. But very few applications use DP; the medical imaging software we write uses it, but most of it is still SP in nature. I think Kepler does fairly well in SP compute, so it should do quite well in games that use it extensively.
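To put rough numbers on the SP/DP distinction, here is a minimal Python sketch. The peak SP figures (about 3090 GFLOPs for a GTX 680 and about 1581 GFLOPs for a GTX 580) are assumptions based on reference specs, and the helper name is hypothetical.

```python
def dp_gflops(sp_gflops, dp_ratio):
    """Peak double-precision GFLOPs given peak SP GFLOPs and the DP:SP rate."""
    return sp_gflops * dp_ratio


# Assumed peak SP throughput (GFLOPs) and DP:SP ratios for consumer parts
kepler_dp = dp_gflops(3090, 1 / 24)  # GTX 680 (GK104)
fermi_dp = dp_gflops(1581, 1 / 8)    # GTX 580 (GF110)

print(f"GTX 680: ~{kepler_dp:.0f} DP GFLOPs")
print(f"GTX 580: ~{fermi_dp:.0f} DP GFLOPs")
```

Under these assumptions the older Fermi card has the higher peak DP throughput despite having roughly half the SP throughput, which is why the two should not be lumped together.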
__________________
Windows 7 Home Premium 64 bit || i7 920 D0 @3.8 with CM V6-GT || Sabertooth X58||Asus Reference 680@1218 || Corsair Vengeance 12GB 1600 || WD Cavier Black 1TB FAEX|| HAF-X || Corsair TX750 V2 ||AL MX 5021E || BenQ E2420HD
|
|
|
02-07-2013, 12:38 AM
|
#15
|
|
Golden Member
Join Date: Jan 2011
Posts: 1,525
|
Quote:
Originally Posted by boxleitnerb
In a way, yes. The comparison is a bit unfair, because it's only logical that a GPU with more raw power will perform better if that power can be used. But that doesn't say anything about the architecture itself. GK110 is Kepler too; the architecture is the same. It's just bigger, with more units and memory controllers. From the beginning I've considered the Kepler lineup misplaced in the market. GK104 should have been positioned above Pitcairn but below Tahiti, where it belongs according to its specs.
If anything, Kepler seems to be more efficient than GCN in its use of raw power, since on average it is 20% faster with the same resources (FLOPs and bandwidth). Perf/W and perf/mm² are other factors where things are different and in favor of AMD. I'm just talking about architectural efficiency.
|
GK110 is not simply a bigger GK104 like Tahiti is a bigger Pitcairn. There are physical components that are completely absent from GK104, such as a hardware scheduler. GK110 is somewhat irrelevant when talking about the architecture of GK104.
|
|
|
02-07-2013, 12:46 AM
|
#16
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
Quote:
Originally Posted by Will Robinson
Somehow...someway....there must be a way to make this generation of NV cards appear as good or faster than Tahiti.
Keep trying Boxy....it's just a hoot watching it.
|
If you have nothing of substance to contribute to this thread, please refrain from posting and trolling here. Thank you.
Quote:
Originally Posted by Red Hawk
GK110 is not simply a bigger GK104 like Tahiti is a bigger Pitcairn. There are physical components that are completely absent from GK104, such as a hardware scheduler. GK110 is somewhat irrelevant when talking about the architecture of GK104.
|
Source? It makes no sense for Nvidia to change scheduling again with GK110. And remember, I was talking about Kepler, not GK104 specifically. GK110 is a bigger Kepler. Of course there are differences like Hyper-Q, DP etc., but these have no relevance for gaming. Caches are a different thing, though. To my knowledge Tahiti is a cache monster with lots of it. But looking at the Tahiti LE results, that doesn't seem to make any difference in gaming either. So I doubt the improvements of GK110 in this area will matter much, if at all, compared to GK104.
Last edited by boxleitnerb; 02-07-2013 at 12:53 AM.
|
|
|
02-07-2013, 01:01 AM
|
#17
|
|
Platinum Member
Join Date: Jan 2011
Posts: 2,525
|
Would be useful to have a solid definition for what you mean by "compute" before going any further. If nothing else, give several examples where "compute" performance is important in games. As in titles and specific aspects of that game.
Otherwise this is just going to turn into the usual he said she said crapfest, flame war, and circle jerk.
|
|
|
02-07-2013, 01:05 AM
|
#18
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
That is exactly the problem! You would have to ask those members here that have used this term in the context of discussing gaming performance.
Personally, I have no idea, which is why I put the word in double quotes in my first post.
|
|
|
02-07-2013, 01:21 AM
|
#19
|
|
Diamond Member
Join Date: Aug 2009
Location: Christchurch, NZ
Posts: 5,149
|
Quote:
Originally Posted by notty22
|
Performance in SPEC ViewPerf is a function of driver optimizations. It has nothing to do with compute capabilities, per se.
Quote:
|
Given that AMD skipped us from their FirePro briefing, yet alone review list, we had to wait until the first results came out. Thanks to our colleague Joel at HotHardware, you are able to see that a $4000 FirePro W9000 cannot keep up with Kepler-based Quadro 6000 in numerous tests, and in some tests the W9000 suffered an indignity of being defeated by a $749.99 Quadro 4000.
|
I haven't gone back and reread the original review from HotHardware, which your link references for its AMD performance numbers as they didn't have a card, but IIRC AMD said they hadn't optimized drivers for the SPEC ViewPerf 11 benchmark suite. Even if they had, though, I doubt they'd beat nVidia's performance. I'm pretty certain that if you took all of the workstations at Autodesk, NewTek, and any other 3D software company, you'd be lucky to find any of them running AMD graphics cards. The software is developed running nVidia graphics cards.
|
|
|
02-07-2013, 01:29 AM
|
#20
|
|
Member
Join Date: Sep 2012
Location: Milwaukee Area
Posts: 189
|
They are talking about DirectCompute, which I believe is a feature of AMD GPUs only... which is why all the Bitcoin miners use AMD setups, and not 680s in SLI. The new gear from that one company that is made specifically for mining is a GPU that literally JUST directcomputes.
It's a form of... processing.
They are not talking about "Compute" power.
Direct Compute power.
http://en.wikipedia.org/wiki/DirectCompute
__________________
|P8z77 V-LK|3570k @ 4.4|8Gb Sniper 1866|Sapphire 7970 Dual-X|840 Pro 128GB|Seagate 7200 1TB|
|
|
|
02-07-2013, 01:48 AM
|
#21
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
DirectCompute works on every DX11 GPU.
The algorithm used for bitcoin mining uses specific functions that AMD GPUs can do faster than Nvidia GPUs. I believe it is some shuffle function, but I don't remember where I've read that. Anyway, this has nothing to do with DirectCompute.
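To illustrate what kind of function this is about: the mining algorithm is double SHA-256, whose round functions are built from 32-bit rotates and bitwise selects. The commonly cited explanation, stated here as an assumption rather than something established in this thread, is that AMD GPUs can execute a 32-bit rotate as a single instruction while Nvidia GPUs of this generation need a couple of shifts plus a logical op. A minimal Python sketch of the rotate and one SHA-256 function built from it (the function names are my own):

```python
MASK32 = 0xFFFFFFFF


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (0 <= n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(x):
    """SHA-256 Sigma0: three rotations of the same word XORed together."""
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)


print(hex(rotr32(0x00000001, 1)))
print(hex(big_sigma0(0x00000001)))
```

Written with shifts and an OR, as above, every rotate costs three operations; hardware with a native rotate does the same work in one, and SHA-256 performs several rotates per round.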
|
|
|
02-07-2013, 02:00 AM
|
#22
|
|
Diamond Member
Join Date: Apr 2009
Location: Chicago
Posts: 3,357
|
Quote:
Originally Posted by boxleitnerb
It is often said that AMD cards are so fast in certain titles because of "compute". Usually no one says what they actually mean by compute; it is just thrown in as a buzzword. I've always thought that this touted special compute ability was rather irrelevant and that it was instead all about raw power (SP GFLOPs) and bandwidth.
I recently had a look at some "Tahiti LE" reviews which are quite interesting, because Tahiti LE cards have about the same SP GFLOPs as a 670/680 hybrid and the same memory bandwidth.
Since Tahiti LE is still Tahiti, it should possess considerable compute prowess, but when we look at the results, the card lands at 670/680 levels or even below. The sizable advantage in titles like Sleeping Dogs, AvP, Arma 2, Metro 2033 etc. is completely gone.
So my conclusion:
Kepler and GCN as architectures are equally good when it comes to compute-heavy games; there is no difference. What matters more, and what sets the two apart, is the actual amount of raw power their individual SKUs have.
Any thoughts?
|
Wait, so a $240 GPU performs the same as a $500 GPU, therefore GCN is slower than Kepler? What are you even trying to say? My brain just melted.
|
|
|
02-07-2013, 02:05 AM
|
#23
|
|
Member
Join Date: Apr 2012
Posts: 124
|
Quote:
Originally Posted by Mopetar
Would be useful to have a solid definition for what you mean by "compute" before going any further. If nothing else, give several examples where "compute" performance is important in games. As in titles and specific aspects of that game.
|
Compute means calculations on the GPU that do not go through the standard pixel/vertex shading phases.
You can also do 'GPGPU' by rendering a quad to a buffer and calculating whatever you want in a pixel shader (most trick effects on the PS2 did this, as did the particle physics in Halo: Reach).
Battlefield 3 uses DirectCompute (DC) for lighting the environment and possibly for other things as well.
(forward rendering to g-buffer, DC for lighting, forward rendering for transparent surfaces)
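To make the "DC for lighting" step more concrete, here is a heavily simplified CPU-side Python sketch of a tiled lighting pass of the kind a compute shader could run once the g-buffer is filled. It is an illustration under my own assumptions (2D screen-space lights, linear falloff, 16x16 tiles, a g-buffer reduced to an albedo value per pixel, made-up function names), not Battlefield 3's actual implementation. On a GPU each tile would map to one thread group and each pixel to one thread.

```python
TILE = 16  # tile edge in pixels; assumed to match a 16x16 thread group


def cull_lights(tile_x, tile_y, lights):
    """Return the lights whose circle of influence overlaps this screen tile."""
    x0, y0 = tile_x * TILE, tile_y * TILE
    x1, y1 = x0 + TILE, y0 + TILE
    visible = []
    for lx, ly, radius, intensity in lights:
        # Closest point of the tile rectangle to the light centre
        cx = min(max(lx, x0), x1)
        cy = min(max(ly, y0), y1)
        if (lx - cx) ** 2 + (ly - cy) ** 2 <= radius ** 2:
            visible.append((lx, ly, radius, intensity))
    return visible


def shade_tile(tile_x, tile_y, albedo, lights):
    """Light every pixel of one tile using only the lights that survived culling."""
    tile_lights = cull_lights(tile_x, tile_y, lights)
    out = {}
    for y in range(tile_y * TILE, (tile_y + 1) * TILE):
        for x in range(tile_x * TILE, (tile_x + 1) * TILE):
            total = 0.0
            for lx, ly, radius, intensity in tile_lights:
                dist = ((x - lx) ** 2 + (y - ly) ** 2) ** 0.5
                total += intensity * max(0.0, 1.0 - dist / radius)  # linear falloff
            out[(x, y)] = albedo[(x, y)] * total
    return out


# Tiny demo: a 32x32 "g-buffer" with uniform albedo and two point lights
albedo = {(x, y): 0.5 for x in range(32) for y in range(32)}
lights = [(8, 8, 4, 1.0), (100, 100, 5, 1.0)]  # (x, y, radius, intensity)
lit = shade_tile(0, 0, albedo, lights)
print(len(cull_lights(0, 0, lights)), lit[(8, 8)], lit[(0, 0)])
```

The point of the tiling is that each pixel only loops over the lights that can actually reach its tile, which is the sort of work that does not fit the standard pixel/vertex pipeline and is handed to a compute pass instead.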
Performance in compute depends a lot on the methods the software writer uses, and small changes in the software can cause a catastrophic performance drop.
These pitfalls are different on each architecture, as are the best-case scenarios, which makes looking at a single small benchmark worthless in the larger scheme of things.
Currently AMD holds the GFLOP crown, but Nvidia parts are faster in some cases.
Last edited by Pottuvoi; 02-07-2013 at 02:19 AM.
|
|
|
02-07-2013, 02:35 AM
|
#24
|
|
Golden Member
Join Date: Oct 2011
Posts: 1,839
|
Quote:
Originally Posted by VulgarDisplay
Wait, so a $240 GPU performs the same as a $500 GPU, therefore GCN is slower than Kepler? What are you even trying to say? My brain just melted.
|
Take a deep breath and read my posts again, carefully. There is nothing there about price. Nothing. Price has nothing to do with architecture whatsoever.
Last edited by boxleitnerb; 02-07-2013 at 02:41 AM.
|
|
|
02-07-2013, 03:01 AM
|
#25
|
|
Senior Member
Join Date: Nov 2012
Posts: 445
|
Quote:
Originally Posted by boxleitnerb
Take a deep breath and read my posts again. There is nothing there about price. Nothing. Do you have trouble staying on a certain topic?
|
The problem is that the cards you are comparing are not in the same price bracket, so people will automatically jump on that, regardless of your true reason for the comparison.
Honestly, most don't know or don't care why the 7970 is faster than the GTX 680 in most games. All that matters is that it is, and on top of that it is cheaper.
|
|
|