[SemiAccurate] Tesla K20 specs: 13 SMX, GeForce probably 12-13 SMX


boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
Because there are not enough chips after the ramping process. Why do you think the mobile versions of the mid- and high-end chips come out much later? They are limited by power. nVidia and AMD bin enough chips before they sell them to the OEMs.

Yes, so binning yields for a given performance target (e.g. 1.31 TFLOPS DP) are not good enough to sell parts immediately in volume. Yields affect power consumption: 30% of your chips might run at 1.0V at 800MHz, but 70% won't unless you disable something. (Binning) yields have everything to do with it.

So do you really believe that in both cases the yields were so bad for the 15 SM (GF100) and 16 SM (GF110) chips that nVidia needed to sell them in the GeForce market first? o_O

In the case of GF100, they had to; what else would they have been able to compete with? GF110 came on a much more mature process than 28nm is now. My point is that if they can only supply 13 SMX parts in volume now, most chips cannot reach their target performance at the target TDP. As Arzachel said, you have to cut them down somehow.
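
To illustrate the binning argument, here's a toy sketch. Every number in it is invented for illustration, not an actual GK110 figure; the point is just that when per-die power varies with process variation, the fraction of dice that fit a TDP target rises sharply once you disable units:

Code:
import random

# Toy model of power binning. All numbers are made up for illustration.
TDP_TARGET = 225.0   # watts, assumed board power target
BASE_POWER = 60.0    # watts for memory/IO/uncore (assumed)

def die_power(active_smx):
    # Per-SMX power varies die to die with process variation (assumed spread).
    per_smx = max(random.gauss(11.5, 1.5), 8.0)
    return BASE_POWER + active_smx * per_smx

N = 100_000
for smx in (15, 13):
    ok = sum(die_power(smx) <= TDP_TARGET for _ in range(N)) / N
    print(f"{smx} SMX: {ok:.0%} of dice fit in {TDP_TARGET:.0f}W")
# With these made-up numbers, roughly a third of dice fit at 15 SMX,
# but around 80% fit once two SMX are disabled.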
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
We are talking about a 7.1 billion transistor monster. I guess it's normal that they need time to bin chips for the K20X.
The K20 is the product available from the get-go: easy to bin and possible to produce in huge numbers.

I think you misunderstand nVidia's binning process. They are limited by power, not by the defect rate of the process. More functional units mean more parts that cannot run at a specific vcore; fewer units mean less trouble.
There is another reason why nVidia designs these huge chips: they don't need all units enabled to get the performance they want, and that offsets the defect rate of the process.

So even if half of all chips could run with all 15 SMX, that does not mean they will fit into the power target.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
I think you misunderstand nVidia's binning process. They are limited by power, not by the defect rate of the process. More functional units mean more parts that cannot run at a specific vcore; fewer units mean less trouble.
There is another reason why nVidia designs these huge chips: they don't need all units enabled to get the performance they want, and that offsets the defect rate of the process.

So even if half of all chips could run with all 15 SMX, that does not mean they will fit into the power target.

First, how can you be sure of the part quoted in bold? Unless you work for Nvidia, you cannot know that. Second, what you're saying doesn't make sense at all. If they can get the performance with a 13 SMX part, why pay for precious wafer space by disabling portions of the chip instead of designing a smaller one from the start? That is just a waste of money right there.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
PCPER.com breaks the NDA:
http://pcper.com/news/General-Tech/...X-Accelerator-Card-Powers-Titan-Supercomputer

Besides the known numbers, the K20X has 250GB/s of bandwidth with 6GB of memory. That alone is 30% more bandwidth than the GTX 680.

Titan is the fastest supercomputer on the planet, and the greenest.

And throughput performance is 1.22 TFLOPS, which makes the card nearly 68% faster than AMD's S9000 server product.
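
For reference, the 30% bandwidth figure above follows straight from the memory specs (a quick sketch, taking the GTX 680's 256-bit bus at 6GHz effective):

Code:
# Bandwidth = bus width in bytes * effective data rate
gtx680_bw = (256 / 8) * 6.008   # ~192 GB/s
k20x_bw = 250.0                 # GB/s, quoted above
print(f"+{k20x_bw / gtx680_bw - 1:.0%}")   # -> +30%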

So, that was it. :'(
Now it's time for Maxwell. :cool:
 
Last edited:

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
The K20 with 2496 cores at 705MHz and 5GB of GDDR5 was listed on a workstation vendor's website with a 225W TDP before it was pulled. Here is another article:

http://hexus.net/tech/news/graphics/47805-nvidia-tesla-k20-kepler-gk110-wheres-at/

Anyway, the information is going to be out on Monday at SC12, so we will get official specs from Nvidia to end all speculation.

It's official: the K20X has a 235W TDP, much closer to 225W than the 300W TDP you suggested. At the same time, sontin was also incorrect in saying it had the same TDP as the K20.


Now that we have that out of the way: the GTX 580 was able to achieve 19% higher core clocks than its Tesla counterpart while staying within the same power envelope. I'm not sure how much of a direct comparison can be made here between Kepler @ 28nm and Fermi @ 40nm, but if Nvidia can raise core clocks the same 19% for its GeForce card, then even if GK110's TDP scales higher with clock speed and ends up matching the GTX 580's TDP, we're looking at 875MHz. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680.
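
A quick sanity check of that math (my assumptions: 14 SMX enabled as on the K20X, the GTX 680 at its 1058MHz boost clock, and a 384-bit bus at the GTX 680's 6GHz memory speed):

Code:
k20x_clock = 732                         # MHz, K20X core clock
geforce_clock = k20x_clock * 1.19        # ~871MHz, call it 875

cores = 14 * 192                         # 2688 CUDA cores at 14 SMX
tput = cores * 875 / (1536 * 1058) - 1   # vs GTX 680 boost -> ~+45%
rops = 48 * 875 / (32 * 1058) - 1        # 48 vs 32 ROPs -> ~+24%
bw = 384 / 256 - 1                       # same memory clock -> +50%
print(f"~{geforce_clock:.0f}MHz: +{tput:.0%} throughput, "
      f"+{rops:.0%} ROP, +{bw:.0%} bandwidth")

That lands within rounding of the 46%/25%/50% figures above.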
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
The K20X has a 4.4% higher TDP but delivers 12% more compute performance. I guess nearly all of the extra power over the K20 comes from the +1GB of memory and the ~20% higher bandwidth.
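
Those ratios follow from the published figures, assuming the K20's listed 225W and 1.17 TFLOPS DP against the K20X's 235W and 1.31 TFLOPS, and 208 vs. 250GB/s of bandwidth:

Code:
k20_tdp, k20_dp = 225, 1.17      # watts, TFLOPS double precision
k20x_tdp, k20x_dp = 235, 1.31
print(f"TDP +{k20x_tdp / k20_tdp - 1:.1%}")     # -> +4.4%
print(f"compute +{k20x_dp / k20_dp - 1:.0%}")   # -> +12%
print(f"bandwidth +{250 / 208 - 1:.0%}")        # -> +20%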
 
Feb 19, 2009
10,457
10
76
Cray, HP, Tyan etc. won't change their server racks just for Nvidia. 225W is usually the max.

Exactly, they can't stray too far beyond 225W, because with server racks and that density the water cooling solutions are custom built, and the entire room again is custom cooled with air conditioning.

This bodes really well for the consumer variant: they can clock it up and we get to enjoy a beefy 300W card with no sacrifices for gaming and compute, perfect for the enthusiast segment.

PS: It's hard to believe TSMC could deliver "good yields" for a massive die on 28nm.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Ok, don't be naive. Of course they would say that ;)
It may be so, but it may also not be.

Sure, but their gross margin is a good indicator that "their yields are terrific".

And the fact that they can ship the K20X right from the start shows that they have no production problems. Now they only need more wafers.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
Yields != yields. Yields for their smaller 28nm chips may be very good while yields for GK110 may be subpar. Gross margin and profits concern the whole product portfolio, do they not? We cannot be sure how those values apply to specific products or product lines.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Yields != yields. Yields for their smaller 28nm chips may be very good while yields for GK110 may be subpar. Gross margin and profits concern the whole product portfolio, do they not? We cannot be sure how those values apply to specific products or product lines.

Nvidia is no stranger to complaining about yields. They did it at 40nm, they did it when 28nm was ramping up, they did it during conference calls. If they say yields are bad, then everyone believes them. But if they say yields are good, then they are lying and I guess that means yields are still bad?
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Just a healthy dose of skepticism :D

I do realize that sontin tries to wring and spin every bit of possible good out of anything Nvidia related, like saying GF100 was the first ever Nvidia die to be released with fused-off parts, or saying that the K20X has the exact same TDP as the K20... and so on, and so on.

I'm just trying to find the middle ground, believing some things at face value and using common sense as much as possible. I'm sure the percentage of functional dies per wafer for GK110 is not the same as for GK107, but perhaps when Nvidia says yields are good, they are speaking in reference to their own internal goals or in comparison to what GF110's yields were.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
Sure, but their gross margin is a good indicator that "their yields are terrific".

And the fact that they can ship the K20X right from the start shows that they have no production problems. Now they only need more wafers.

Nvidia's gross margin is good because they have been selling the 294mm² GK104 (GTX 680) for the same price they were selling the 530mm² GF110 (GTX 580) last year. With GK110 the die size is going to be even bigger than GF110's, and the fact that they can't get a fully enabled 15 SMX GK110 to yield at decent volumes even for the HPC crowd, on a relatively mature 28nm process, means yields are nowhere near good. Do you think Nvidia would say "the yields suck" even if they really did? ;)
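
To put rough numbers on the die-size point, here's a sketch using the standard dies-per-wafer approximation (gross candidates only, ignoring defect density):

Code:
import math

def dies_per_wafer(die_mm2, wafer_mm=300):
    # Gross candidates: wafer area / die area, minus an edge-loss term.
    r = wafer_mm / 2
    return int(math.pi * r * r / die_mm2
               - math.pi * wafer_mm / math.sqrt(2 * die_mm2))

print(dies_per_wafer(294))   # GK104-sized die: ~201 candidates
print(dies_per_wafer(530))   # GF110-sized die: ~104 candidates

Roughly twice as many candidate dies per wafer at GK104's size, before yield differences are even considered.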
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Do you think Nvidia would say "the yields suck" even if they really did? ;)

Yes, I do. They say it all the time. They said it when GK104 was ramping up (at back-to-back conference calls, no less), and they said it several times at 40nm. SOOOO
Nvidia is no stranger to complaining about yields. They did it at 40nm, they did it when 28nm was ramping up, they did it during conference calls. If they say yields are bad, then everyone believes them. But if they say yields are good, then they are lying and I guess that means yields are still bad?

Seriously lets just move on.

People here are missing out on the better part of this news. The K20X has a lower TDP than what it is replacing (the M2090). The GTX 580 was able to increase its core clocks by 19% over the M2090 within the same power envelope (actually lower). If GK110 does the same, we're looking at an 875MHz GK110 GeForce card at a 235-250W TDP. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680. In other words, a f0cking beast!
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
It's official: the K20X has a 235W TDP, much closer to 225W than the 300W TDP you suggested. At the same time, sontin was also incorrect in saying it had the same TDP as the K20.

Now that we have that out of the way: the GTX 580 was able to achieve 19% higher core clocks than its Tesla counterpart while staying within the same power envelope. I'm not sure how much of a direct comparison can be made here between Kepler @ 28nm and Fermi @ 40nm, but if Nvidia can raise core clocks the same 19% for its GeForce card, then even if GK110's TDP scales higher with clock speed and ends up matching the GTX 580's TDP, we're looking at 875MHz. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680.

Yeah, I was wrong. But if a highly binned K20X with 14 SMX is drawing 235W, what can the expectations be for a 15 SMX consumer GeForce GTX 780 running at 775-800MHz? I think it's realistically 250-260W. Also don't forget the GTX 580 and Tesla M2090 were fully enabled chips; the only difference was core and memory clocks. Here the Tesla K20X is not a fully enabled chip.
I think Nvidia is doing slightly better on power but not so well on yields. Remember, the GK110-based Tesla K20X is releasing a year after 28nm production started. So with a year of 28nm learning and yield improvements, if you can't get a full chip for your flagship Tesla, which costs 3000+ USD, it means you are constrained by yields.
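
Crude scaling from the K20X's numbers supports the 250-260W ballpark above (a sketch only; power won't scale perfectly linearly with unit count and clock, and a GeForce board with 3GB instead of 6GB saves some watts):

Code:
k20x_w, k20x_smx, k20x_clk = 235, 14, 732
for clk in (775, 800):
    est = k20x_w * (15 / k20x_smx) * (clk / k20x_clk)
    print(f"15 SMX @ {clk}MHz: ~{est:.0f}W")   # ~267W and ~275W
# Knock off 10-20W for 3GB instead of 6GB of GDDR5 and you land
# close to the 250-260W estimate above.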
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Hm, you know what is funny? You are always ignoring AMD.
The K20X has 14 of 15 SMX enabled. That is a higher fraction than AMD's 28 of 32 CUs. And you have never said one bad word about AMD's yields.

Hm :colbert:

If you are right then let Ryan Smith know because he got it wrong: http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last

I found that from HP:
Power consumption:
Tesla M2075: 200W TDP
Tesla M2070Q: 225W TDP
Tesla M2090: 250W TDP
Tesla K10: 225W TDP
http://h18006.www1.hp.com/products/quickspecs/13743_div/13743_div.html

So maybe it is 250W... my fault, sorry.
 
Last edited:

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
One more SMX would not impact power consumption much, maybe barely at all. Subtract some watts for 3GB less memory and I think we're good. I would expect 850MHz for a 14 SMX part and 800-850MHz for a 15 SMX part @ 250W.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Yeah, I was wrong. But if a highly binned K20X with 14 SMX is drawing 235W, what can the expectations be for a 15 SMX consumer GeForce GTX 780 running at 775-800MHz? I think it's realistically 250-260W.

This again? Really? http://en.expreview.com/2010/08/09/world-exclusive-review-512sp-geforce-gtx-480/9070.html OH NOES! THE GTX 480 drew 8 million more watts when fully unlocked! There is NO WAY IT CAN EVER BE RELEASED. EVER. Except that it was released six months later after a respin, it was named the GTX 580, and it came with lower power consumption and 15-20% more performance. WEIRD.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
To be honest, Fermi was a bad example since GF100 was broken. GK110 on the other hand doesn't seem to be broken, so there is not as much room for improvement there.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
To be honest, Fermi was a bad example since GF100 was broken. GK110 on the other hand doesn't seem to be broken, so there is not as much room for improvement there.

Agreed for the most part, but since we're probably looking at mid to late 2014 until GK110's replacement comes out, I am guessing GK110 as it is now will not be the same at the end of 2013. Either through manufacturing improvements and/or a respin, I think GK110 will end up with more capable chips later in its life than what is being produced today.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
If people spent billions of dollars on Kinects with 500+ms of lag, that's basically already a promising testament to the prospective market for cloud gaming with *some* lag. Simply upgrading the HDTV to a really fast SmartTV would probably make it even "faster" for anybody with a <50ms ping connection.

If people actually cared about a good experience and quality games (in terms of graphics), there wouldn't be any consoles...