boxleitnerb
Platinum Member
There is always a tradeoff between power and die space. A much larger but lower clocked GPU should always be more efficient.
I meant you can achieve a certain goal with two different approaches. Each one has its own pros and cons regarding power and die size.
I believe Titan could have better performance/W than GTX680, but in turn, GTX680 should have better perf/mm2.
Well said. Given that graphics is a massively parallel problem, by going wider (more CUDA cores) but slower (core clocks) Nvidia will gain on perf/watt but lose on perf/sq mm. It looks likely that Nvidia will gain 50% performance over the GTX 680 for a 30% higher TDP. Such an increase in performance is logically possible.
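Rough arithmetic behind that claim, taking the rumored +50% performance / +30% TDP figures from the post above as given and the GTX 680's roughly 195 W TDP and 294 mm2 die as the baseline (a sketch, not confirmed specs):

```python
# Back-of-the-envelope perf/W and perf/mm^2 for the rumored GK110 vs. GTX 680.
# "titan" uses the rumored +50% performance, +30% TDP, ~550 mm^2 quoted in this thread.
gtx680 = {"perf": 1.00, "tdp_w": 195.0,        "die_mm2": 294.0}
titan  = {"perf": 1.50, "tdp_w": 195.0 * 1.30, "die_mm2": 550.0}

def perf_per_watt(card):
    return card["perf"] / card["tdp_w"]

def perf_per_mm2(card):
    return card["perf"] / card["die_mm2"]

print("perf/W change:   %+.0f%%" % (100 * (perf_per_watt(titan) / perf_per_watt(gtx680) - 1)))
print("perf/mm2 change: %+.0f%%" % (100 * (perf_per_mm2(titan) / perf_per_mm2(gtx680) - 1)))
# -> roughly +15% perf/W but about -20% perf/mm^2: wider-but-slower trades area for efficiency.
```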
One other thing to keep in mind is that the GTX 680 is clocked past optimal Perf/Watt levels. It's a mid-range die from Nvidia that had its TDP budget pushed past ideal levels to compete with the 7970.
I don't know, I actually think Nvidia was conservative on TDP.
The GTX680 is already bandwidth limited compared to the GTX670 (that is why they are so close). A 1536 ALU card with 10-15% lower clocks would be quite a bit more efficient. I think Nvidia had this target in mind originally, but had to increase clocks to match Tahiti.
Considering that more people play at 1080p or less, this is a sound strategy and probably what they're hoping for. It tends to work against Nvidia's last few card generations that many review sites test at higher resolutions, as that's where AMD's cards seem to have an advantage. I'm not sure what in particular about Nvidia's designs makes them perform better at lower resolutions, maybe it's the extra ROPs or efficiently loading the cores, or maybe something else, but it's a trend I've noticed for a while now.

Seems like this card will be limited by memory bandwidth just like GK104. It has roughly the same memory bandwidth as Tahiti despite having way more processing power. GPGPU applications don't benefit as much from memory bandwidth as rendering graphics does, which is why NV didn't go with a wider 512-bit bus or even 448-bit. Sacrificing some shader performance for another memory controller would have made that card way faster in games, but it's clear that games are not the primary focus of GK110. On top of that, a wider memory bus would have made its PCB more complex. That's why I find claims that it will be 2x faster than the GTX680 absurd. I think it will be 40%-60% faster than the GTX680 depending on the game. In games using compute extensively it can be a lot faster than that, but AFAIK we don't have such games just yet.
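For reference, GDDR5 bandwidth works out as bus width times effective data rate; the sketch below uses 6 Gbps memory throughout, and the 448-bit and 512-bit rows are hypothetical configurations, not announced parts:

```python
# GDDR5 bandwidth in GB/s = (bus width in bits / 8) * effective data rate in Gbps
def bandwidth_gb_s(bus_bits, data_rate_gbps):
    return bus_bits / 8 * data_rate_gbps

print(bandwidth_gb_s(256, 6.0))  # GTX 680 (GK104):           192 GB/s
print(bandwidth_gb_s(384, 6.0))  # Tahiti GE / rumored GK110: 288 GB/s
print(bandwidth_gb_s(448, 6.0))  # hypothetical 448-bit bus:  336 GB/s
print(bandwidth_gb_s(512, 6.0))  # hypothetical 512-bit bus:  384 GB/s
```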
One way to increase its performance without increasing its TDP would be to make the turbo boost much larger percentage-wise. Clearly the TDP must account for stressing both SP and DP shaders, unless they artificially limit its DP performance, which doesn't seem far-fetched. Games don't need DP shaders at all. It would even make some sense aside from product segmentation: it could let them raise the base clock within the same TDP without resorting to a large, unpredictable turbo boost.
I think it will be 40%-60% faster than GTX680 depending on the game. In games using compute extensively it can be a lot faster than that, but AFAIK we don't have such games just yet.
Well there are 4 compute heavy titles out (Sleeping Dogs, Dirt Showdown, Sniper Elite V2, Hitman Absolution). HD7970 OC is > 50% faster than GTX680 in Sleeping Dogs.
I am going to take a wild stab and say the Titan has 905-925MHz GPU clocks with 235W power consumption in the same game. But then I can't reconcile how a 550mm2 chip with 6GB of VRAM can hit 925MHz on the 28nm node when a 365mm2, 925MHz 7970 used 189W with 3GB of GDDR5. Could the 28nm node have matured that much?
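A naive area-scaling estimate shows why that is hard to reconcile; this sketch assumes power scales roughly linearly with die area at a fixed clock, which ignores voltage, leakage, binning, and architecture, so it is an illustration rather than a prediction:

```python
# Naive estimate: scale the HD 7970's quoted 189 W gaming power by die area,
# holding the 925 MHz clock fixed.
hd7970_power_w = 189.0   # gaming power figure quoted in the post above
hd7970_die_mm2 = 365.0
gk110_die_mm2  = 550.0   # rumored die size

naive_titan_w = hd7970_power_w * (gk110_die_mm2 / hd7970_die_mm2)
print("%.0f W" % naive_titan_w)  # ~285 W -- well above a ~235 W board target,
# so 925 MHz at that die size only works if clocks/voltage come down,
# the 28nm process matured, or parts of the chip idle under gaming loads.
```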
RussianSensation has no clue. He thinks that OGSSAA is the same as compute (DirectCompute). Or that MSAA is the same as compute (DirectCompute). And he thinks it's some scientific breakthrough that a card with 32% more compute performance is actually faster...
What are you on about? Where did you pull the claim that the HD7970 GE is 32% faster in compute than the GTX680? Single-precision floating point != DirectCompute performance. It doesn't work like that. I suggest you head over to this thread and start doing some research/reading. boxleitnerb was open-minded about it, but you seem to be stuck in Denial Land, insisting that GK104 doesn't have an issue with DirectCompute / Compute Shaders.
Tahiti XT has between 25% and 40% more compute performance than a GTX680. I explain it in detail using examples from several compute games, and show why memory bandwidth and GFLOPS alone do not explain the discrepancies, i.e. why GK104/VLIW architectures have issues in Compute Shader heavy titles compared to GCN parts. The mathematics, the discussion of what DirectCompute means for games, the graphs, it's all there. You should start reading about how the GCN architecture actually works to understand what was so special about its redesign for Compute. It appears most people didn't read that article, because to them it's still shocking to accept that GCN Tahiti XT is far more advanced than GK104/VLIW architectures are for Compute Shaders.
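For what it's worth, the 25-40% range falls out of the paper specs at base clocks (a sketch; boost behavior, bandwidth, and how well the ALUs are kept fed are ignored):

```python
# Theoretical single-precision throughput: shaders * 2 ops/clock (FMA) * clock in GHz -> GFLOPS
def sp_gflops(shaders, clock_ghz):
    return shaders * 2 * clock_ghz

gtx680     = sp_gflops(1536, 1.006)  # ~3090 GFLOPS, 192 GB/s
hd7970     = sp_gflops(2048, 0.925)  # ~3789 GFLOPS, 264 GB/s
hd7970_ghz = sp_gflops(2048, 1.050)  # ~4301 GFLOPS, 288 GB/s

print("HD 7970 vs GTX 680:    +%.0f%%" % (100 * (hd7970 / gtx680 - 1)))      # ~+23%
print("HD 7970 GE vs GTX 680: +%.0f%%" % (100 * (hd7970_ghz / gtx680 - 1)))  # ~+39%
```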
Right, and in the 15 others it is only a few % faster. But I guess these are not "compute games". BTW: why are Sleeping Dogs, Hitman and Sniper Elite V2 "compute heavy titles"? Crysis 2 looks much better than Sniper Elite V2...

Don't get your panties in a bunch when a stock HD7970GE destroys the GTX680 in 3 of the 4 compute games I listed and obliterates it in Dirt Showdown (which just happens to be the most compute-heavy title this generation).
Yeah, MSAA in Sleeping Dogs.

Also, in case you didn't notice, I said HD7970 OC. The 1050MHz 7970 has a 30%+ lead already, and piling on more shader performance (1330MHz) will extend the lead to > 50% in Sleeping Dogs in cases outside of MSAA and 1600p too.
These numbers are not real. Wizzard made a mistake.

Care to explain how a GTX690 has vastly superior theoretical performance in every metric possible, including memory bandwidth and GFLOPS, compared to a single HD7970GE but gets obliterated in Sleeping Dogs?
And please don't say it's VRAM bottlenecked because an HD7850 2GB is still beating it. Also don't say SLI scaling doesn't work because it's 77% compared to a single 680.
Yeah, and a Golf with 140PS is not faster than my BMW 120d with 177PS. I guess my BMW 120d is a wonder of a car, right...

There is not a single game released so far that makes heavy use of compute shaders where the GTX680 is beating the HD7970GHz. Known memory bandwidth hogs are Metro 2033 and Aliens vs. Predator. Don't start mixing and matching different game engines to prove a point.
Lol, "pants down"? You mean like AMD with the 7970 for $549, which got beaten by the GTX680 with its inferior compute performance?

Admit the fact that GK104 is inferior for DirectCompute / Compute Shaders and move on. NV caught AMD with its pants down with tessellation for 2 generations, and NV got caught with its pants down with Compute.
Wow, now we're using Showdown to show that the GK104 is a "garbage chip for DirectCompute"?

Comparing the GFLOPS, memory bandwidth and other theoretical parameters of the HD7870 vs. the GTX670, and then looking at benchmarks of Dirt Showdown, shows how flawed your entire argument is.
Keep digging in your arsenal of excuses why GK104 is not a garbage chip for DirectCompute. You might as well spend a month trying to prepare a rebuttal.
Keep digging in your arsenal of excuses why [GCN] is not a garbage chip for DirectCompute. You might as well spend a month trying to prepare a rebuttal.
EDIT: I just ran the internal benchmark on a GTX 670 OC'd @ 1440p. Exact frame rates were: 82.5 average, 109.8 max, 50.0 minimum with AA on LOW; 23.5 average, 29.0 maximum, 15.7 minimum with AA maxed out. I don't think compute is what is killing Kepler in that game. Pretty sure memory bandwidth is what is killing frame rates on Kepler in Sleeping Dogs.
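For context on why the maxed AA setting hits so hard at 1440p: the game's higher AA levels reportedly add supersampling, so the pixel-count scaling alone looks roughly like this (the 2x2 factor below is an assumption for illustration, not a measured setting):

```python
# Pixel-count scaling at 1440p if the top AA setting supersamples the frame.
base_w, base_h = 2560, 1440
ssaa_factor = 2.0  # assumed 2x2 supersampling (4x the samples); exact factor is setting-dependent

base_pixels = base_w * base_h                 # ~3.7 million
ssaa_pixels = base_pixels * ssaa_factor ** 2  # ~14.7 million

print("samples shaded per frame: %.1fx more" % (ssaa_pixels / base_pixels))
# Every extra sample is shaded, written to the framebuffer and resolved, so bandwidth
# demand scales with it -- in the same ballpark as the ~3.5x FPS drop reported above.
```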
it only took about 2 hours of playing this sloppy PC port before I sent it to the doghouse.
Severely sloppy execution in numerous areas made me want to take a rolled up newspaper to Sleeping Dogs. The entirety of the first two hours I played was punctuated with sloppy programming, poor execution, and inexcusable bugs.
Once I’m in the game and switch it to full screen mode, however, I immediately notice that the user interface is a crappy console port that sort of lets you use the mouse for some functions (like clicking on buttons), but also insists that you navigate with the keyboard as well.
But my experience with the game suggests it's a lousy afterthought of a console port. Maybe it's a great game on the consoles. I'd recommend you try it there if you must play it. But on the PC...
Sleeping Dogs (PC) is Chinese water torture in gaming form.
These numbers are not real. Wizzard made a mistake.
Look at these:
I hope you find the difference...
Tahiti XT has between 25% and 40% more compute performance than a GTX680. Maybe I should use a car comparison to show that, theoretically, a car with more hp is faster than a car with less...
Why are Sleeping Dogs, Hitman and Sniper Elite V2 "compute heavy titles"? Crysis 2 looks much better than Sniper Elite V2...
Yeah, MSAA in Sleeping Dogs.
These numbers are not real. Wizzard made a mistake. I hope you find the difference...
Lol, "pants down"? You mean like AMD with the 7970 for $549, which got beaten by the GTX680 with its inferior compute performance?
Wow, now we're using Showdown to show that the GK104 is a "garbage chip for DirectCompute"?
Man, here is Assassin's Creed 3:
GTX680 is 2x faster than the 7870.
Wait, we can't find a good thing to say about Crysis 2-3, and now we are showcasing a game because it has lopsided performance results?
What explains the 7870 or 7850 dominating the GK-104 [in Dirt Showdown]?
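The paper specs behind that question, at base clocks (approximate published figures; boost ignored):

```python
# Approximate published specs: HD 7870 GHz Edition vs. GTX 670, base clocks only.
cards = {
    #            (shaders, clock_ghz, mem_bw_gb_s)
    "HD 7870": (1280, 1.000, 154),
    "GTX 670": (1344, 0.915, 192),
}

for name, (shaders, clock_ghz, bw) in cards.items():
    gflops = shaders * 2 * clock_ghz  # theoretical SP throughput
    print("%-7s  %4.0f GFLOPS  %3d GB/s" % (name, gflops, bw))
# The GTX 670 has comparable theoretical FLOPS and ~25% more bandwidth, so if the
# 7870 still pulls ahead in Dirt Showdown's compute-based lighting, raw throughput
# alone doesn't explain it -- which is the whole point of the GCN-vs-GK104 argument.
```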
