GK110--when do you think it will be released?

raghu78 · Sep 2, 2012

The GTX 780 is going to be a GK110 based card. As to what would be the tradeoff we have to wait for the final product to draw conclusions. The GK110 will run at lower clock speeds. How low is not yet determined. If Nvidia hit 850 Mhz they will have a very powerful chip. but definitely the chip will run hot and draw lot of power like the GTX 580 did. At 7.1 billion transistors and close to 550 sq mm its a huge chip. I think Nvidia went aggressive on die size to combat Intel Knights Corner which will be a 22nm product.

GotNoRice · Sep 2, 2012

I just don't see them as having bothered with the 690 if the 780 was so soon in the works. Dual-GPU solutions like that are what companies go for when there isn't any other option.

bunnyfubbles · Sep 2, 2012

tviceman said:
LMFAO! I can see it now. Nvidia engineers: "I know we've spent $150+ million in developing this chip, but lets keep it to the HPC market, where we'll only sell 75,000 total units. I realize we won't make any significant money on it, but JHH's salary is only $1 per year, so it's all good."

Seriously though. December / January as a Tesla and/or Quadro card. Vr-zone said Big K as a Geforce card in March, and that sounds plausible. And like Tahiti, it's power draw will hold it back. Hopefully Nvidia won't lock down voltage control like they have done with GK104.

GK104 was so fast for its size, I'm convinced nVidia held it back on purpose with the voltage shenanigans in order to give themselves some breathing room to release either a faster GK104 part and/or not infringe too closely to a stock GK110 which would likely be clocked much lower to keep temps down.

That being said, even if they do lock down voltage control on GK110, it will at least have the voltage without hardware mods that a software/firmware mod to get voltage control would be all one would need to unlock the true potential of the card.

bunnyfubbles · Sep 2, 2012

GotNoRice said:
I just don't see them as having bothered with the 690 if the 780 was so soon in the works. Dual-GPU solutions like that are what companies go for when there isn't any other option.

right, but even if we see a relatively best-case scenario and GeForce GK110 gets here sometime Nov/Dec, that will have been 7 months that nVidia would have been able to double up with the GTX 690.

And to look at things realistically, the 690 would still have a place to be sold given that it would still likely be significantly faster than a single GK110 which would fall somewhere halfway between a GTX680 and the 690.

RussianSensation · Sep 2, 2012

If we assume that GK110 is easily manufacturable and profitable by Q4 2012 when NV intends to launch K20, and admit that GK104 is a mid-range Kepler, then NV is likely to sell as many GK110's in the professional markets to achieve the highest profit margins possible and only release GK110 in the consumer market when they have to (i.e., when HD8970 forces them so to speak). Why cannibalize the profit margins on both of your GK104 parts and reduce profits in the professional markets all at the same time? As long as every single GK110 can be sold for $3-5K, there is no need to launch GK110 until AMD forces their hand.

If we assume that GK110 is not easy to manufacture, that 28nm wafers are expensive, and the yields aren't great yet, then NV will sell GK110 to professionals where they can justify it financially and may at a later date start building up a supply of failed GK110 chips that couldn't meet the 15 SMX spec. After harvesting enough failed GK110 chips, they'd have a better idea if they can launch a 12-13 SMX cluster consumer product.

My guess is NV won't launch GK110-based product until after HD8970. Since from what I've read HD8970 won't launch until Q1 2013, I wouldn't expect GK110 until then at the earliest.

The other possibility is some other NV chip that we haven't heard about. Is it out of the realm of possibility that NV could simply build an entirely new chip from scratch based on GK104 1 SMX cluster = 192 SPs layout? Could NV build a 2304 SP "GK104" and call it GKxxx? Does it have to be based on GK110?

bunnyfubbles said:
GK104 was so fast for its size, I'm convinced nVidia held it back on purpose with the voltage shenanigans in order to give themselves some breathing room to release either a faster GK104 part and/or not infringe too closely to a stock GK110 which would likely be clocked much lower to keep temps down.

GTX680 with only 1536 SPs already uses 185W of power or so. How much power do you think a 1Ghz core clock + GPU Boost 2880 SP 550-600mm^2 die chip would use? It would be > 250W for sure. Since NV is selling 294mm^2 @ 2 units = GTX690 for $1,000 USD, they probably needed to sell GK110 for $900-1000 I bet to justify their 50%+ margins. And how many people would have cared for a $1,000 GK110? At first I thought NV held back GK110 on purpose but now 6 months passed since GTX680 launched and GK110 is nowhere on the horizon for another quarter; and then as a $3k+ part. NV didn't hold back GK110 on purpose, but because they couldn't release it for any reasonable price for the consumer market to justify the low volume of this part. If NV sells you 294mm^2 GK104 for $499 MSRP, what incentive do they have right now to sell 550-600mm^2 die for $700 for example? They'd be losing $ / mm^2 when instead they can sell 2x GTX680s from the same wafer.

While the Tesla K20 GK110 has 7.1 billion transistors, it is not guaranteed that the GeForce card would feature the same specificaiton. The GK110 is both difficult and expensive to manufacture. Also, backing into K10's single-precision performance of 4.58 Tflops / (1536 SPs / 2 GPUs on K10 / 2 Ops per clock) = 745mhz GPU clock. That means the K20 GK110 may also only be clocked at 750, maybe 850mhz. You guys can't just assume no GPU core clock penalty for a chip that's significantly larger in size.

2880 @ 850mhz vs. 1536 @ 1058mhz (GTX680 with GPU Boost at minimum) = 51% shading power.
2304 @ 1 Ghz vs. 1536 @ 1058mhz = 42% shading power

I think the greatest performance increase will come from more ROPs and memory bandwidth. We've seen that while GTX670 has 14% less shaders than GTX680, it's not a direct relationship in performance. That means Kepler is currently more limited somewhere else, not just shaders. NV can increase performance a lot without needing 2880 SP GK110.

ViRGE · Sep 2, 2012

RussianSensation said:
The other possibility is some other NV chip that we haven't heard about. Is it out of the realm of possibility that NV could simply build an entirely new chip from scratch based on GK104 1 SMX cluster = 192 SPs layout? Could NV build a 2304 SP "GK104" and call it GKxxx? Does it have to be based on GK110?

Hence GK114. With AMD and NV facing a 2+ year cycle on new process nodes, they will have to take up a tick-tock style product development cycle. Produce a fairly small GPU when a node is new and volumes/yields are low, and then in the next year produce a slightly larger chip taking advantage of the fact that volumes and yields have improved.

All NVIDIA needs to do for 2013 is build a larger GK104. Add another memory controller and ROPs partition, maybe another SMX, and call it a day. Consumers don't need the GK110 feature set, and if NVIDIA can make GK110 profitable without selling it to consumers then they're better off with another small(ish) chip with higher margins.

RussianSensation · Sep 2, 2012

ViRGE, I agree. Most consumers don't seem to care for compute performance and welcomed NV stripping Keper of that 'fat' Fermi had. NV can continue to focus on performance/watt which would be pretty difficult to do with a GK110 15 SMX part.

tviceman · Sep 2, 2012

ViRGE said:
All NVIDIA needs to do for 2013 is build a larger GK104. Add another memory controller and ROPs partition, maybe another SMX, and call it a day. Consumers don't need the GK110 feature set, and if NVIDIA can make GK110 profitable without selling it to consumers then they're better off with another small(ish) chip with higher margins.

I agree that a GK104, with some efficiency tweaks, natural node process improvement, and another 64-bit memory controller would do just fine for them on the high end, but I think that is much easier said than done. Has a chip ever been retro fitted with more cores and / or memory controllers on the same node and left architecturally unchanged (other than very small tweaks)?

Nvidia has released a larger chip than what GK110 is rumored to be at, and never before have they made a chip exclusively for HPC. Anything is possible, but I think it's highly likely GK110 will be released as very high end Geforce cards. I think the situation is Nvidia saw hd7970's performance in it's December paper launch, knew they were going to be able to match or beat it with GK104, and they didn't want a Fermi GF100 repeat so they opted to take extra time now to get big Kepler's finalization as sound as possible. Had AMD released an HD7970GE (or slightly faster) in December, I think we would have been looking at a cutdown big Kepler Geforce in the October / November time frame.

tviceman · Sep 2, 2012

GotNoRice said:
I just don't see them as having bothered with the 690 if the 780 was so soon in the works. Dual-GPU solutions like that are what companies go for when there isn't any other option.

This is completely untrue. Dual-GPU solutions have existed in Nvidia's lineup since the 7900 series. 7950GX2, 9800GX2, GTX295, GTX590, and GTX690. AMD's had just as many, if not more dual-card solutions. It's nothing new and isn't considered a last resort option. I think the only instance where a company's dual-card solution came out only to compete with another company's top end single solution was with the hd3870x2.

dangerman1337 · Sep 2, 2012

I don't think we will have the GK114 be buffed up but the chip more power optimized as it will probably be the 760(Ti and non-ti) whereas the GK110 takes the 770 and 780 monkier.

Though I do wonder if Nvidia made a GK102/112 that is basically a GK104 but upscaled the number of shader count and the memory bandwidth rather than having the GK110 for both consumer and HPC markets. I'm not sure how finacially viable that would be to develop 2 types of high end chips with one being HPC focused while the other focused on gaming.

Anarchist420 · Sep 2, 2012

RussianSensation said:
ViRGE, I agree. Most consumers don't seem to care for compute performance and welcomed NV stripping Keper of that 'fat' Fermi had. NV can continue to focus on performance/watt which would be pretty difficult to do with a GK110 15 SMX part.

I don't see any evidence that compute is useless for gaming. That said, GK110 could always be released as a consumer product with 8 SMXs and more bandwidth... and if we're lucky, then full speed RGBA16 blending or at least a 1.5x higher general fill rate (through higher clock speeds or the same clock speed and more ROPs) than GK104 if we're lucky.

tviceman said:
I agree that a GK104, with some efficiency tweaks, natural node process improvement, and another 64-bit memory controller would do just fine for them on the high end, but I think that is much easier said than done.[...]

I think it would be far wiser to just make a GK110 except with 8 SMXs instead of 15 than it is to add to a design that doesn't seem (to me, at least) like it will hold up in the future... future games will benefit from good compute (and DP) performance. Compute vs. gaming performance is a false dichotomy.

tviceman · Sep 2, 2012

Anarchist420 said:
I think it would be far wiser to just make a GK110 except with 8 SMXs instead of 15 than it is to add to a design that doesn't seem (to me, at least) like it will hold up in the future... future games will benefit from good compute (and DP) performance. Compute vs. gaming performance is a false dichotomy.

It is going to cost > 85% more die space for 50% more memory controllers and shaders for GK110 vs. GK104. Economically, that doesn't make any sense at all to add all that back into GK104. Had Nvidia kept the compute functionality in GK104, they would have likely ended up with a bigger, less efficient and slightly slower die than Tahiti. As it is right now, they have a die that is 50mm^2 smaller, 15% more efficient performance per watt, and within 5-10% of hd7970GE speed.

Another thing to look at is that Nvidia has been compute / GPGPU focused with their high end die since they unified their shaders with the 8800GTX back in 2008. Yet no game has utilized GPGPU functionality, no upcoming game has been said to be doing so, and on top of those two very obvious points, GK104 doesn't suffer any abnormal performance hits vs. GF110 (gtx580) and it's GPGPU-heavy architecture. The closest thing to compute capabilities in a game is probably physx, and even that doesn't display any significant performance deviations from the chip not having GPGPU functionality.

AdamK47 · Sep 2, 2012

I'm hoping ATI or nVidia release a new high end card at the end of this year. The end of fall / start of winter is when I get the upgrade itch.

blackened23 · Sep 2, 2012

Anarchist420 said:
I don't see any evidence that compute is useless for gaming.

This is actually an interesting question. Can compute be useful for games going forward? Secondly, if it is - is it worth the tradeoff in terms of power consumption?

I have no idea, but if anyone has technical information on this it would be interesting. My gut tells me that since consoles drive game development nowadays, compute could perhaps be useful
but whether it comes to fruition is very questionable (due to next gen consoles not using compute)

Anarchist420 · Sep 2, 2012

tviceman said:
Yet no game has utilized GPGPU functionality, no upcoming game has been said to be doing so, and on top of those two very obvious points, GK104 doesn't suffer any abnormal performance hits vs. GF110 (gtx580) and it's GPGPU-heavy architecture.

I could've sworn that AMD's forward+ rendering was compute based, but maybe I was imagining things or maybe I don't understand forward+ correctly.

I think double precision will probably be used in the future, and nvidia is crippled there.

blackened23 said:
This is actually an interesting question. Can compute be useful for games going forward? Secondly, if it is - is it worth the tradeoff in terms of power consumption?

See the reply above.

tviceman said:
Had Nvidia kept the compute functionality in GK104, they would have likely ended up with a bigger, less efficient and slightly slower die than Tahiti. As it is right now, they have a die that is 50mm^2 smaller, 15% more efficient performance per watt, and within 5-10% of hd7970GE speed.

If they also made it so they had Tahiti's bandwidth, then they would probably be faster.

RussianSensation · Sep 2, 2012

Anarchist420 said:
I don't see any evidence that compute is useless for gaming.

I didn't say I thought compute was useless for gaming (or otherwise). What I am saying is NV traded in compute for performance/watt and most consumers seem to prefer that based on how much they hated Fermi and assign little value to compute functionality of HD7900 series. Not every game will use compute like Dirt Showdown or Sniper Elite. NV still has the time to figure out what they want to do. HD7970 GE isn't faster by that much that NV has an argent need to launch GK110. They didn't even lower prices on GTX680 yet, that's how much they don't seem to care about the 7970 GE beating them on performance or price/performance.

AdamK47 said:
I'm hoping ATI or nVidia release a new high end card at the end of this year. The end of fall / start of winter is when I get the upgrade itch.

What about the new Corvette when it launches?

ViRGE · Sep 3, 2012

RussianSensation said:
I didn't say I thought compute was useless for gaming (or otherwise). What I am saying is NV traded in compute for performance/watt and most consumers seem to prefer that based on how much they hated Fermi and assign little value to compute functionality of HD7900 series. Not every game will use compute like Dirt Showdown or Sniper Elite. NV still has the time to figure out what they want to do.

I would even go so far as to clarify that to say that NV traded in some compute for performance/watt. GK104 isn't all-around terrible at compute; importantly its DirectCompute performance tends to hold up well. It's just not well-rounded like GF110. Depending on how well GK104's limited compute capabilities map to what developers will be doing with compute shaders, those limits may not be a problem.

To that end I'm interested in seeing if NVIDIA can do something about their DiRT: Showdown performance in the upcoming R310 drivers. I think if they can make DiRT's lighting compute shader map well to GK104, then GK104 will have passed an important litmus test as far as compute performance in games goes.

3DVagabond · Sep 3, 2012

ViRGE said:
I would even go so far as to clarify that to say that NV traded in some compute for performance/watt. GK104 isn't all-around terrible at compute; importantly its DirectCompute performance tends to hold up well. It's just not well-rounded like GF110. Depending on how well GK104's limited compute capabilities map to what developers will be doing with compute shaders, those limits may not be a problem.

To that end I'm interested in seeing if NVIDIA can do something about their DiRT: Showdown performance in the upcoming R310 drivers. I think if they can make DiRT's lighting compute shader map well to GK104, then GK104 will have passed an important litmus test as far as compute performance in games goes.

Agreed. I also think that if they can't it'll be an easy thing for AMD to exploit. Like gobs of tessellation was for nVidia. I wouldn't put it past Rory to do just that either.

BenSkywalker · Sep 3, 2012

it's held back by slow Double Precision performance (double precision will certainly be used for games in the future)

DP doesn't have much use in gaming, I honestly can't think of *any* use, even potential, that it would have. Forget "next gen" graphics, not even Hollywood CGI level. Medical imaging uses integer still, not even SP. Typical physics modeling, even the non real time type, uses SP(general purpose type physics, get into any advanced/theoretical level and it is mandatory).

Scientific and HPC is pretty much the only market for DP-

http://en.wikipedia.org/wiki/Single-precision_floating-point_format

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

Just so people can understand what is being talked about. Also, DP without ECC just doesn't make much sense, with ECC being slower and more expensive then non ECC I don't see it making it into the consumer market in any large scale way for probably at least the next century(not that we won't have parts that can do it, obviously we already do, there just isn't any use for it).

tviceman · Sep 3, 2012

Anarchist420 said:
If they also made it so they had Tahiti's bandwidth, then they would probably be faster.

I concede this point. Had it just been fitted with a 320-bit memory bus and left otherwise unchanged, it would trade blows evenly with an hd7970GE.

SirPauly · Sep 3, 2012

ViRGE said:
To that end I'm interested in seeing if NVIDIA can do something about their DiRT: Showdown performance in the upcoming R310 drivers. I think if they can make DiRT's lighting compute shader map well to GK104, then GK104 will have passed an important litmus test as far as compute performance in games goes.

Excellent point! Curious about this as well!

Anarchist420 · Sep 3, 2012

BenSkywalker said:
DP doesn't have much use in gaming, I honestly can't think of *any* use, even potential, that it would have. [...]not even Hollywood CGI level. [...]

You know more than I do so you're probably right

... what I don't understand is how FP32 could be enough precision to emulate MSAA, with 100% accuracy, when a RGBA16FP color buffer and 64 bit FP depth buffer is used. Perhaps you could explain it to me

Also, a 32 bit FP zbuffer can't be 100% linear in eye space (error rate can be rather high with FP32 1 - (z/w) z buffer), so I don't understand how double precision shaders couldn't be useful for reducing error rate when trying to emulate a 32 bit fixed point logarithmic z buffer.

Also, how are CG movies not rendered with double precision if it's standard with SSE2? I was wondering because I couldn't imagine that the floating point hardware used for designing those movies wouldn't be double precision as standard yet an intel consumer processor's would be.

lopri · Sep 4, 2012

It is difficult to guesstimate what GK114 is going to look like in theory. GF104 -> GF114 was possible due to the fail nature of the first stepping Fermi. All Geforce 400 series were "defective" Geforce 500 series in a sense, so GF104 began with 768 MB GTX 460 and the last GF100 was savaged by 362 SP GTX 465.

On the other hand, Kepler began nicely with fully functional die in GK104 with 8 clusters and 256-bit memory controller. GK110 will have 15 clusters (I think it's a reasonable guess that there is 1 redundant cluster for better yield) and 384-bit memory controller.

http://www.anandtech.com/show/5840/...gk104-based-tesla-k10-gk110-based-tesla-k20/2

Where will the improvement of GK114 come from, leaving GK110 out of consumer space? There is nothing to "unlock" there like NV did going from GF104 to GF114. And simply tacking in one or two more clusters to GK104 while keeping 256-bit interface may result in undesirable performance. (Note that availability of GDDR5 is not exactly under NV's control)

And I can't imagine NV designing a brand new chip in a 660Ti-like configuration (320-bit?). So there has to be something more than that. Maybe someone more knowledgeable can chime in here what a theoretical GK114 might look like.

The worst case scenario imaginable is NV pulling 8800 GT -> 9800 GTX. Keep the inventory of GTX 680 low and the price high, while filling the "high-end" with GTX 670. Then introduce a slightly higher clocked, better-turboing GK104 (which then will be GK114) with beefed up VRMs as the GTX 780 for $350. With lots of Powerpoint pages explaining how NV cares enthusiasts' wallets, consumers be will happy to get a shiny GTX 780 for mere $350. (add extra ~$50 for custom designs ^_^ ) It's NV's small-die strategy, a la HD 4870!

Highly unlikely scenario, I know, but I think much will depend on what GCN 2.0 will turn out to be.

lopri · Sep 4, 2012

Well, reading the thread again, RussianSensation and ViRGE have already suggested a possible GK114. (quite thorougly) I largely missed it. I suppose it's possible for it to have 9 or 10, or even 12 Clusters with 320-bit memory interface. (How much memory? 2GB or 3GB? Maybe 2.5 GB?) That sounds like quite a bit of work, though.

NTMBK · Sep 4, 2012

BenSkywalker said:
Medical imaging uses integer still, not even SP.

Well I can tell you for a fact that that's wrong. It uses both SP and DP, where appropriate.

GK110--when do you think it will be released?

Diamond Member

Senior member

Lifer

Lifer

Elite Member

Elite Member, Moderator Emeritus

Elite Member

Diamond Member

Diamond Member

Senior member

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Elite Member

Elite Member, Moderator Emeritus

Lifer

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Elite Member

Elite Member

Lifer