A hint of the new king: GeForce GTX 790 and its calculated performance

DGLee

Member
(Contact : leedaeguen [at] kaist.ac.kr)


AMD and NVIDIA have been evenly matched since Q4 2013 with their newest flagship silicon, Hawaii and GK110. This standoff might last a couple of quarters, since the transition to the 20nm fabrication process, which is possibly the key to next-generation flagship GPUs such as Maxwell, is being delayed by TSMC. So it is more likely that each company's dual-GPU SKU, rather than a next-generation SKU, will be the nearest 'balance breaker' in their competition. Yesterday, NVIDIA threw its first hook for that 'balance breaker': GeForce GTX 790, a dual-GK110 SKU replacing NVIDIA's current king, GTX 690.

According to VideoCardz, GTX 790 includes:
- 2 x 2496 CUDA cores (13 SMX per GK110 GPU, 1 fewer than TITAN / 1 more than GTX 780)
- 2 x 320-bit GDDR5 memory interface (strongly implies that each GPU has only 40 ROPs, not 48)
- 10GB of memory
(Source : http://*******/pwiybI)
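
The numbers in that list can be cross-checked with simple arithmetic. Below is a minimal sketch, assuming 192 CUDA cores per Kepler SMX and 8 ROPs tied to each 64-bit memory controller on GK110. The function names are made up for illustration only.

```python
CORES_PER_SMX = 192      # assumption: CUDA cores in one Kepler SMX
ROPS_PER_CLUSTER = 8     # assumption: ROPs tied to each 64-bit memory controller
BITS_PER_CLUSTER = 64    # width of one memory controller in bits


def cuda_cores(smx_count):
    """CUDA core count for a given number of enabled SMX units."""
    return smx_count * CORES_PER_SMX


def rop_count(bus_width_bits):
    """ROP count implied by the memory bus width."""
    return (bus_width_bits // BITS_PER_CLUSTER) * ROPS_PER_CLUSTER


print(cuda_cores(13))    # 2496, per GPU on the rumored GTX 790
print(rop_count(320))    # 40
print(rop_count(384))    # 48, a full GK110 memory interface
```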

In this post I'll calculate its speculative performance with a 'VGA calculator' I designed. The calculator is actually a simple multivariable fractional equation whose variables are the SP/TMU/ROP counts and the GPU/VRAM frequencies. Each term represents the 'simulated' GPU's shader, texture mapping or rendering performance, and the terms are combined harmonically (meaning each term takes a '1/n' form), so we can easily see through specification-wise bluff to each GPU's real performance.

For example, Radeon HD 5830 has more SPs/TMUs than 4890 and the same number of ROPs, while their clock rates are 800 and 850MHz respectively, so it would seem natural for 5830 to be faster than 4890. But it's not. This is because 5830's ROP partition is (a bit) slower than 4890's, which affects gaming performance despite its complete dominance in SP/TMU. With this calculator, however, I correctly predicted that 5830 wouldn't overwhelm 4890. Indeed, more than 3 years later, it still works for recent GPUs such as the Volcanic Islands/Kepler families: prior to Hawaii's release, I predicted that Hawaii would be faster than GTX TITAN/780 but would not compete with a full-blown GK110, which has now been released as GTX 780 Ti. (See: http://udteam.tistory.com/535)
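
To illustrate what a harmonic combination does, here is a minimal sketch. It is not the actual calculator: the exact equation and its weights are not given in this post, so the function, the weights and the 1.3 shader/texture ratio below are all hypothetical. Only the ROP ratio of 800/850 follows from the 5830 vs 4890 example (same ROP count, different clocks).

```python
def harmonic_score(shader, texture, rop, weights=(1.0, 1.0, 1.0)):
    """Weighted harmonic mean of three throughputs.

    Inputs are throughputs relative to a baseline GPU (baseline = 1.0 each).
    The weights are hypothetical placeholders, not the calculator's values.
    """
    ws, wt, wr = weights
    return (ws + wt + wr) / (ws / shader + wt / texture + wr / rop)


# Baseline GPU: every term is 1.0, so the score is 1.0.
baseline = harmonic_score(1.0, 1.0, 1.0)

# Illustrative challenger: 30% more shader/texture throughput (made-up figure),
# but ROP throughput scaled by 800/850 because of the lower clock.
challenger = (1.3, 1.3, 800 / 850)

equal_weights = harmonic_score(*challenger)
rop_heavy = harmonic_score(*challenger, weights=(1.0, 1.0, 10.0))

print(baseline, equal_weights, rop_heavy)
```

With equal weights the challenger still scores above the baseline; with a strongly ROP-weighted sum it drops slightly below it. Which case is closer to reality depends on weights this sketch does not know.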


Well, enough of the lecture. The results are below:
(Assume that the GPU/VRAM clock rates remain unchanged. GPU clock reflects the max boost frequency.)

EGbXYAw.png


▲ It's obvious that the new card beats all of its predecessors, including GTX 690, a dual-GK104 SKU. The margin over GTX 690 seems to be roughly 25%, and the card is almost half again as fast as GTX 780 Ti. In terms of SLI scaling, however, it is a bit disappointing when we compare it to actual SLI configurations. Let's look at a 2 x GTX 780 Ti config.

IJ4dnL7.png


▲ See what I mean? GTX 790 doesn't actually exploit the full potential of two monstrous GPUs. Let's figure this out component by component. First, let's compare one half of GTX 790 to the other single-GK110 SKUs.

GSzna6e.png


▲ It seems that one half of GTX 790 can't even match GTX 780, though it has 1 more SMX. The only other difference is the ROP count and memory interface, which are bound together (8 ROPs and a 64-bit interface are blocked together in GK110), so it is reasonable to attribute the performance gap to that. Let's try to prove this.

nQRM107.png


▲ The result above is for one half of GTX 790 plus 1 more ROP-IMC cluster (meaning +8 ROPs and +64 bits of GDDR5 memory controller). The extra ROP-IMC cluster yields an almost 12% increase in performance, enough to overtake GTX 780 and come very close to GTX TITAN.
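
As a rough sanity check on that figure: one extra cluster raises the ROP count and bus width by 20% (40 to 48 ROPs, 320 to 384 bits), yet the score rises by only about 12%. The sketch below assumes a purely harmonic model with a single ROP-IMC term and everything else held fixed, which is a simplification and not necessarily how the calculator works. The function names are hypothetical.

```python
def harmonic_gain(scale, share):
    """Overall speedup when one term of a harmonic sum is scaled by `scale`.

    `share` is the fraction of the original harmonic sum owed to that term.
    """
    return 1.0 / ((1.0 - share) + share / scale)


def implied_share(gain, scale):
    """Share of the harmonic sum that one term must hold to produce `gain`."""
    return (1.0 - 1.0 / gain) / (1.0 - 1.0 / scale)


rop_scale = 48 / 40      # one extra ROP-IMC cluster
observed_gain = 1.12     # "almost 12%" from the result above, rounded

share = implied_share(observed_gain, rop_scale)
print(share)                             # roughly 0.64
print(harmonic_gain(rop_scale, share))   # recovers about 1.12
```

Under those assumptions the ROP-IMC term would account for roughly two thirds of the sum, which is at least consistent with the card being limited by its ROP-IMC part.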

Let's look at the opposite case: a full-blown SP/TMU configuration (the same as GTX 780 Ti) with a cut-down ROP-IMC.

yPjjw3V.png


▲ It becomes obvious that the ROP-IMC part affects performance more than SP/TMU does.

So, my conclusions are as follows:
- GTX 790 will take the crown, but it will not be faster than an SLI config of today's highest-end single-GPU SKU.
- GTX 790 will actually be slower than even 2 x GTX 780, not just 2 x GTX 780 Ti or TITAN, because of its lack of ROPs.

Well, the post is over. Thanks for reading. Have a nice day 🙂
 
LOL. Who cares?

Warning issued for thread crapping.
-- stahlhart
 
If I was going to deal with multi-GPU I'd rather do so in the form of two physical GPUs. I cringe a bit just thinking about the sacrifices that would need to be made to cram this much silicon on a single PCB.
 
I've enjoyed my SLI setups in the past and I certainly enjoy reading about these dual chip cards. But I can't think of a situation, other than SFF, where choosing this over two separate cards would make sense.
 
It's pretty obvious when you stick 2 midrange (GK104) or high-end (GK110) GPUs on a single PCB that you have to reduce the potency of each chip a little in order to accommodate a similar-size PCB with the same number of fans on what is still a dual-slot graphics card.

Naturally 2 x GTX 680 (GK104) takes up 4 slots and has a lot more surface area to cool each chip, with each having its own dedicated fans, and hence the SLI performance is greater than that of the single dual-chip card.
 
They're going to have to aggressively bin and reduce frequencies to get two GK110 dies on a PCI-E expansion card. Compare the GTX 580 vs the GTX 590 for an example of trying to cram two 250W TDP cards into a <400W package.

I really can't see the reasoning behind disabling ROPs and memory controllers for a card that's designed to run at high resolutions or with SSAA.
 
Me neither. GK110 is already a bit light on ROPs compared to the competition, and it shows. This will only make it worse.
 
Pretty cool for a single card solution! AMD will most likely have to do more than shrink Hawaii in 2014 in order to regain the performance crown.
 
Not sure how I feel. If it comes out early Feb I might be able to step-up, but will have to see the results.
 
Sadly, unless shrunken Hawaii is the R7 370 series, we're probably going to run into the same issue that we have in the x86 CPU market.
 