[SemiAccurate] Tesla K20 specs: 13 SMX, GeForce probably 12-13 SMX


boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
Because there are not enough chips after the ramping process. Why do you think the mobile versions of the mid- and high-end chips come out much later? They are limited by power. nVidia and AMD bin enough chips before they sell them to the OEMs.

Yes, so binning yields for a given performance target (e.g. 1.31 TFLOPS DP) are not good enough to sell parts immediately in volume. Yields affect power consumption: 30% of your chips might run at 1.0V at 800MHz, but 70% won't unless you disable something. (Binning) yields have everything to do with it.

So do you really believe that in both cases the yields were so bad for the 15 SM (GF100) and 16 SM (GF110) chips that nVidia needed to sell them in the GeForce market first? o_O

In the case of GF100, they had to; what else would they have been able to compete with? GF110 came on a much more mature process than 28nm is now. My point is that if they can only supply 13 SMX parts in volume now, most chips cannot reach their target performance at the target TDP. As Arzachel said, you have to cut them down somehow.
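
To illustrate the binning argument, here's a toy sketch. Every number in it is invented for illustration, not an actual GK110 figure; the point is just that when per-die power varies with process variation, the fraction of dice that fit a TDP target rises sharply once you disable units:

Code:
import random

# Toy model of power binning. All numbers are made up for illustration.
TDP_TARGET = 225.0   # watts, assumed board power target
BASE_POWER = 60.0    # watts for memory/IO/uncore (assumed)

def die_power(active_smx):
    # Per-SMX power varies die to die with process variation (assumed spread).
    per_smx = max(random.gauss(11.5, 1.5), 8.0)
    return BASE_POWER + active_smx * per_smx

N = 100_000
for smx in (15, 13):
    ok = sum(die_power(smx) <= TDP_TARGET for _ in range(N)) / N
    print(f"{smx} SMX: {ok:.0%} of dice fit in {TDP_TARGET:.0f}W")
# With these made-up numbers, roughly a third of dice fit at 15 SMX,
# but around 80% fit once two SMX are disabled.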
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
We are talking about a 7.1 billion transistor monster. I guess it's normal that they need time to bin chips for the K20X.
The K20 is the product available from the get-go: easy to bin and possible to produce in huge numbers.

I think you misunderstand nVidia's binning process. They are limited by power, not by the defect rate of the process. More functional units mean more parts that cannot run at a specific vcore; fewer units mean less trouble.
There is another reason why nVidia designs these huge chips: they don't need all units enabled to get the performance they want, and that offsets the defect rate of the process.

So even if half of all chips could run with all 15 SMX, that does not mean they will fit into the power target.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
I think you misunderstand nVidia's binning process. They are limited by power, not by the defect rate of the process. More functional units mean more parts that cannot run at a specific vcore; fewer units mean less trouble.
There is another reason why nVidia designs these huge chips: they don't need all units enabled to get the performance they want, and that offsets the defect rate of the process.

So even if half of all chips could run with all 15 SMX, that does not mean they will fit into the power target.

First, how can you be sure of the part quoted in bold? Unless you work for Nvidia, you cannot know that. Second, what you're saying doesn't make sense at all. If they can get the performance with a 13 SMX part, why pay for precious wafer space by disabling portions of the chip instead of designing a smaller one from the start? That is just a waste of money right there.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
PCPER.com breaks the NDA:
http://pcper.com/news/General-Tech/...X-Accelerator-Card-Powers-Titan-Supercomputer

Besides the known numbers, the K20X has 250GB/s of bandwidth with 6GB of memory. That alone is 30% more bandwidth than the GTX 680.

Titan is the fastest supercomputer on the planet, and the greenest.

And throughput performance is 1.22 TFLOPS, which makes the card nearly 68% faster than AMD's S9000 server product.
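
For reference, the 30% bandwidth figure above follows straight from the memory specs (a quick sketch, taking the GTX 680's 256-bit bus at 6GHz effective):

Code:
# Bandwidth = bus width in bytes * effective data rate
gtx680_bw = (256 / 8) * 6.008   # ~192 GB/s
k20x_bw = 250.0                 # GB/s, quoted above
print(f"+{k20x_bw / gtx680_bw - 1:.0%}")   # -> +30%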

So, that was it. :'(
Now it's time for Maxwell. :cool:
 
Last edited:

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
The K20 with 2496 cores at 705MHz and 5GB of GDDR5 was listed on a workstation vendor's website with a 225W TDP before it was pulled. Here is another article:

http://hexus.net/tech/news/graphics/47805-nvidia-tesla-k20-kepler-gk110-wheres-at/

Anyway, the information is going to be out on Monday at SC12, so we will get official specs from Nvidia to end all speculation.

It's official: the K20X has a 235W TDP, much closer to 225W than the 300W TDP you suggested. At the same time, sontin was also incorrect in saying it had the same TDP as the K20.


Now that we have that out of the way: the GTX 580 was able to achieve 19% higher core clocks than its Tesla counterpart while staying within the same power envelope. I'm not sure how much of a direct comparison can be made here between Kepler @ 28nm and Fermi @ 40nm, but if Nvidia can raise core clocks the same 19% for its GeForce card, then even if GK110's TDP scales higher with clock speed and ends up matching the GTX 580's TDP, we're looking at 875MHz. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680.
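
A quick sanity check of that math (my assumptions: 14 SMX enabled as on the K20X, the GTX 680 at its 1058MHz boost clock, and a 384-bit bus at the GTX 680's 6GHz memory speed):

Code:
k20x_clock = 732                         # MHz, K20X core clock
geforce_clock = k20x_clock * 1.19        # ~871MHz, call it 875

cores = 14 * 192                         # 2688 CUDA cores at 14 SMX
tput = cores * 875 / (1536 * 1058) - 1   # vs GTX 680 boost -> ~+45%
rops = 48 * 875 / (32 * 1058) - 1        # 48 vs 32 ROPs -> ~+24%
bw = 384 / 256 - 1                       # same memory clock -> +50%
print(f"~{geforce_clock:.0f}MHz: +{tput:.0%} throughput, "
      f"+{rops:.0%} ROP, +{bw:.0%} bandwidth")

That lands within rounding of the 46%/25%/50% figures above.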
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
The K20X has a 4.4% higher TDP but delivers 12% more compute performance. I guess nearly all of the extra power over the K20 comes from the +1GB of memory and the ~20% higher bandwidth.
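
Those ratios follow from the published figures, assuming the K20's listed 225W and 1.17 TFLOPS DP against the K20X's 235W and 1.31 TFLOPS, and 208 vs. 250GB/s of bandwidth:

Code:
k20_tdp, k20_dp = 225, 1.17      # watts, TFLOPS double precision
k20x_tdp, k20x_dp = 235, 1.31
print(f"TDP +{k20x_tdp / k20_tdp - 1:.1%}")     # -> +4.4%
print(f"compute +{k20x_dp / k20_dp - 1:.0%}")   # -> +12%
print(f"bandwidth +{250 / 208 - 1:.0%}")        # -> +20%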
 
Feb 19, 2009
10,457
10
76
Cray, HP, Tyan etc. won't change their server racks just for Nvidia. 225W is usually the max.

Exactly, they can't stray too far beyond 225W, because with server racks and that density the water cooling solutions are custom built, and the entire room again is custom cooled with air conditioning.

This bodes really well for the consumer variant: they can clock it up and we get to enjoy a beefy 300W card with no sacrifices for gaming and compute, perfect for the enthusiast segment.

PS: It's hard to believe TSMC could deliver "good yields" for a massive die on 28nm.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Ok, don't be naive. Of course they would say that ;)
It may be so, but it may also not be.

Sure, but their gross margin is a good indicator that "their yields are terrific".

And the fact that they can ship the K20X right from the start shows that they have no production problems. Now they only need more wafers.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
Yields != yields. Yields for their smaller 28nm chips may be very good while yields for GK110 may be subpar. Gross margin and profits concern the whole product portfolio, do they not? We cannot be sure how those values apply to specific products or product lines.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Yields != yields. Yields for their smaller 28nm chips may be very good while yields for GK110 may be subpar. Gross margin and profits concern the whole product portfolio, do they not? We cannot be sure how those values apply to specific products or product lines.

Nvidia is no stranger to complaining about yields. They did it at 40nm, they did it when 28nm was ramping up, they did it during conference calls. If they say yields are bad, then everyone believes them. But if they say yields are good, then they are lying and I guess that means yields are still bad?
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Just a healthy dose of skepticism :D

I do realize that sontin tries to wring and spin every bit of possible good out of anything Nvidia related, like saying GF100 was the first ever Nvidia die to be released with fused-off parts, or saying that the K20X has the exact same TDP as the K20... and so on, and so on.

I'm just trying to find the middle ground, believing some things at face value and using common sense as much as possible. I'm sure the percentage of functional dies per wafer for GK110 is not the same as for GK107, but perhaps when Nvidia says yields are good, they are speaking in reference to their own internal goals or in comparison to what GF110's yields were.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
Sure, but their gross margin is a good indicator that "their yields are terrific".

And the fact that they can ship the K20X right from the start shows that they have no production problems. Now they only need more wafers.

Nvidia's gross margin is good because they have been selling the 294mm² GK104 (GTX 680) for the same price they were selling the 530mm² GF110 (GTX 580) last year. With GK110 the die size is going to be even bigger than GF110's, and the fact that they can't get a fully enabled 15 SMX GK110 to yield at decent volumes even for the HPC crowd, on a relatively mature 28nm process, means yields are nowhere near good. Do you think Nvidia would say "the yields suck" even if they really did? ;)
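
To put rough numbers on the die-size point, here's a sketch using the standard dies-per-wafer approximation (gross candidates only, ignoring defect density):

Code:
import math

def dies_per_wafer(die_mm2, wafer_mm=300):
    # Gross candidates: wafer area / die area, minus an edge-loss term.
    r = wafer_mm / 2
    return int(math.pi * r * r / die_mm2
               - math.pi * wafer_mm / math.sqrt(2 * die_mm2))

print(dies_per_wafer(294))   # GK104-sized die: ~201 candidates
print(dies_per_wafer(530))   # GF110-sized die: ~104 candidates

Roughly twice as many candidate dies per wafer at GK104's size, before yield differences are even considered.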
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Do you think Nvidia would say "the yields suck" even if they really did? ;)

Yes, I do. They say it all the time. They said it when GK104 was ramping up (at back-to-back conference calls, no less), and they said it several times at 40nm. SOOOO
Nvidia is no stranger to complaining about yields. They did it at 40nm, they did it when 28nm was ramping up, they did it during conference calls. If they say yields are bad, then everyone believes them. But if they say yields are good, then they are lying and I guess that means yields are still bad?

Seriously lets just move on.

People here are missing out on the better part of this news. The K20X has a lower TDP than what it is replacing (the M2090). The GTX 580 was able to increase its core clocks by 19% over the M2090 within the same power envelope (actually lower). If GK110 does the same, we're looking at an 875MHz GK110 GeForce card at a 235-250W TDP. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680. In other words, a f0cking beast!
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
It's official: the K20X has a 235W TDP, much closer to 225W than the 300W TDP you suggested. At the same time, sontin was also incorrect in saying it had the same TDP as the K20.

Now that we have that out of the way: the GTX 580 was able to achieve 19% higher core clocks than its Tesla counterpart while staying within the same power envelope. I'm not sure how much of a direct comparison can be made here between Kepler @ 28nm and Fermi @ 40nm, but if Nvidia can raise core clocks the same 19% for its GeForce card, then even if GK110's TDP scales higher with clock speed and ends up matching the GTX 580's TDP, we're looking at 875MHz. That is a 46% increase in throughput, a 25% increase in ROP performance, and a 50% increase in bandwidth over the GTX 680.

Yeah, I was wrong. But if a highly binned K20X with 14 SMX is drawing 235W, what can the expectations be for a 15 SMX consumer GeForce GTX 780 running at 775-800MHz? I think it's realistically 250-260W. Also don't forget the GTX 580 and Tesla M2090 were fully enabled chips; the only difference was core and memory clocks. Here the Tesla K20X is not a fully enabled chip.
I think Nvidia is doing slightly better on power but not so well on yields. Remember, the GK110-based Tesla K20X is releasing a year after 28nm production started. So with a year of 28nm learning and yield improvements, if you can't get a full chip for your flagship Tesla, which costs 3000+ USD, it means you are constrained by yields.
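
Crude scaling from the K20X's numbers supports the 250-260W ballpark above (a sketch only; power won't scale perfectly linearly with unit count and clock, and a GeForce board with 3GB instead of 6GB saves some watts):

Code:
k20x_w, k20x_smx, k20x_clk = 235, 14, 732
for clk in (775, 800):
    est = k20x_w * (15 / k20x_smx) * (clk / k20x_clk)
    print(f"15 SMX @ {clk}MHz: ~{est:.0f}W")   # ~267W and ~275W
# Knock off 10-20W for 3GB instead of 6GB of GDDR5 and you land
# close to the 250-260W estimate above.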
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Hm, you know what is funny? You are always ignoring AMD.
The K20X has 14 of 15 SMX enabled. That is a higher fraction than AMD's 28 of 32 CUs. And you have never said one bad word about AMD's yields.

Hm :colbert:

If you are right then let Ryan Smith know because he got it wrong: http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last

I found that from HP:
Power consumption:
Tesla M2075: 200W TDP
Tesla M2070Q: 225W TDP
Tesla M2090: 250W TDP
Tesla K10: 225W TDP
http://h18006.www1.hp.com/products/quickspecs/13743_div/13743_div.html

So maybe it is 250W... my fault, sorry.
 
Last edited:

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
One more SMX would not impact power consumption much, maybe barely at all. Subtract some watts for 3GB less memory and I think we're good. I would expect 850MHz for a 14 SMX part and 800-850MHz for a 15 SMX part @ 250W.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Yeah, I was wrong. But if a highly binned K20X with 14 SMX is drawing 235W, what can the expectations be for a 15 SMX consumer GeForce GTX 780 running at 775-800MHz? I think it's realistically 250-260W.

This again? Really? http://en.expreview.com/2010/08/09/world-exclusive-review-512sp-geforce-gtx-480/9070.html OH NOES! THE GTX 480 drew 8 million more watts when fully unlocked! There is NO WAY IT CAN EVER BE RELEASED. EVER. Except that it was released six months later after a respin, it was named the GTX 580, and it came with lower power consumption and 15-20% more performance. WEIRD.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
To be honest, Fermi was a bad example since GF100 was broken. GK110 on the other hand doesn't seem to be broken, so there is not as much room for improvement there.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
To be honest, Fermi was a bad example since GF100 was broken. GK110 on the other hand doesn't seem to be broken, so there is not as much room for improvement there.

Agreed for the most part, but since we're probably looking at mid to late 2014 until GK110's replacement comes out, I am guessing GK110 as it is now will not be the same at the end of 2013. Either through manufacturing improvements and/or a respin, I think GK110 will end up with more capable chips later in its life than what is being produced today.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
If people spent billions of dollars on Kinects with 500+ms of lag, that's basically already a promising testament to the prospective market for cloud gaming with *some* lag. Simply upgrading the HDTV to a really fast SmartTV would probably make it even "faster" for anybody with a <50ms ping connection.

If people actually cared about a good experience and quality games (in terms of graphics), there wouldn't be any consoles...