What happens to nvidia?


brybir

Senior member
Jun 18, 2009
241
0
0
I see where the confusion is now. I wasn't talking about disabling extra SPs to increase yields; what I meant was that AMD seems to be able to fit 110-120 xtors in the same space where NV fits 100. From what Scali said, there was some positive reason why NV chose to fit only 100 xtors in that space.

Without getting overly complicated, it is possible on any given process to have significant sub-micron defects for various reasons. Oftentimes these defects are the result of numerous factors, but they are often exacerbated by things like voltage, current, temperature, and EM field fluctuations, etc., which can ultimately turn what would be a trivial defect into a big problem. By having lower densities it is possible to avoid some of the problems caused by the stresses related to denser placements (high-density areas can create "hot spots" which cause surrounding areas to fail, for example). I sort of think of it as a lower density allowing the part to "breathe" better; even though it's an absolutely terrible analogy, it works in my head.

Also consider that the problem of transistor placement (and by association, density) is a balancing act between meeting your goals and making money. Within those two factors sits what is essentially a constrained optimization problem, where the constraints are the wire-ability of the design, the wire length for interconnects, and the total size required by the transistors (among many other issues). What this all means is that a more complex part will often require more complex routing, wiring, and strategic placement decisions in order for the additional operations to be completed. If you have such a complex part, it may serve you better to have a lower density in order to accommodate the other associated parts of the design more easily.
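To make that "constrained optimization" framing a bit more concrete, here is a toy Python sketch. Everything in it (the transistor budget, the power figure, the congestion model, the limits) is made up purely for illustration, and real place-and-route is vastly more involved; the point is just that the density you end up with falls out of the constraints rather than being a goal in itself:

```python
# Toy illustration of density-as-constrained-optimization.
# All numbers and the cost/congestion model are made up for illustration only.

XTORS = 3.0e9            # transistor budget (roughly a Fermi-class chip)
MAX_POWER_DENSITY = 0.5  # hypothetical hot-spot limit, W per mm^2
MAX_CONGESTION = 1.0     # hypothetical routing-congestion limit (arbitrary units)
TOTAL_POWER = 250.0      # watts, assumed roughly fixed for the design

def evaluate(density_m_per_mm2):
    """Return (area_mm2, power_density, congestion) for a candidate density."""
    area = XTORS / (density_m_per_mm2 * 1e6)
    power_density = TOTAL_POWER / area
    # Pretend routing congestion grows super-linearly with packing density.
    congestion = (density_m_per_mm2 / 5.0) ** 2
    return area, power_density, congestion

best = None
for density in [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]:   # million xtors / mm^2
    area, pd, cong = evaluate(density)
    feasible = pd <= MAX_POWER_DENSITY and cong <= MAX_CONGESTION
    print(f"{density:.1f} M/mm^2 -> {area:6.0f} mm^2, "
          f"{pd:.2f} W/mm^2, congestion {cong:.2f}, "
          f"{'ok' if feasible else 'violates constraints'}")
    if feasible and (best is None or area < best[1]):
        best = (density, area)

print("Chosen density (M/mm^2, mm^2):", best)
```

With these made-up constraints the densest feasible option is not the densest possible one, which is the whole point of the paragraph above.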

In the case of Fermi, they added quite a bit of logic to the design for the HPC market. These changes were fairly significant redesigns of the core logic and certainly made the IC design more complex. This complexity likely led to a lot of design and manufacturing trade-offs that ultimately resulted in the parts we can purchase today. Perhaps one issue they were facing was the "hotspot" problem mentioned a bit ago, which occurs when certain parts of the IC heat up from use while other areas are not engaged and so stay cooler. It could be that Fermi, in certain applications at some point, was having hotspot issues such that their design engineers predicted the issue would cause a certain amount of failures at manufacturing, or would reduce the life of the IC and increase warranty claims. To combat those hotspots, perhaps they spread things out a bit on the IC as best they could. But, as stated above, the IC design is a constrained optimization problem, so when you move one thing, everything else has to move with it. Maybe you end up with a somewhat bigger chip. Maybe you end up removing features, or maybe you end up with a more or less dense chip after all is said and done.

Point being, in the end, each issue that comes up has to be addressed, and the more significant ones require revisions of the design. Those revisions, over time, ultimately dictate the density you see in ICs. In that sense, density is often a goal (whatever that may be for any given IC), but at the same time it is often something that follows function and practicality in order to achieve a working part at the desired price and in the desired time frame.
 

Madcatatlas

Golden Member
Feb 22, 2010
1,155
0
0
But all of this "let's make a bigger die for better yields" still translates into a more expensive chip, doesn't it?

If there were no competing product, how do you think the price/expense of this bigger die would materialize?
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
From what Scali said, there was some positive reason why NV chose to fit only 100 xtors in that space.

Let's say you are designing a chip and you want to put a decent amount of on-die cache in to help with bandwidth. Most people think the obvious choice is of course eDRAM. There are other options: 1T SRAM and 6T SRAM. eDRAM is tops for density but is the slowest; 6T SRAM is much faster, with 1T being about the same speed as 6T. In terms of what you pay in die space, eDRAM is about 10%-15% smaller than 1T SRAM, which is ~15%-50% smaller than 6T SRAM. In transistor count, 1T is actually lower than eDRAM because of the way the circuitry is laid out per Mbit, despite taking up more space. The way chips are fabricated normally requires eDRAM to be handled at a different stage than the rest of the logic (temperature variances), which can have a negative impact on yields and increases the overall fabrication costs a considerable amount. 1T SRAM requires a certain type of build process using particular types of metal layering due to the way it works. 6T SRAM uses the least specialized fabrication technique; however, its areal density is by far the weakest in terms of mm²/Mbit.
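To put those relative sizes side by side, here is a quick sketch using only the percentage ranges quoted above; the 6T baseline is an arbitrary placeholder, not a real process figure:

```python
# Rough area-per-Mbit comparison using only the relative sizes quoted above.
# The 6T baseline value is an arbitrary placeholder, not a real process figure.

SIX_T_MM2_PER_MBIT = 1.00   # assumed baseline, arbitrary units of mm^2/Mbit

# 1T SRAM: ~15%-50% smaller than 6T SRAM
one_t_range = (SIX_T_MM2_PER_MBIT * 0.50, SIX_T_MM2_PER_MBIT * 0.85)

# eDRAM: ~10%-15% smaller than 1T SRAM
edram_range = (one_t_range[0] * 0.85, one_t_range[1] * 0.90)

for name, (lo, hi) in [("6T SRAM", (SIX_T_MM2_PER_MBIT,) * 2),
                       ("1T SRAM", one_t_range),
                       ("eDRAM",   edram_range)]:
    print(f"{name:8s}: {lo:.2f} - {hi:.2f} mm^2/Mbit (relative)")
```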

I mention this mainly as a generic example, and that is just discussing the cache, which is a fairly small portion of the GPU die. Everything has trade-offs in how you approach it. As engineers, each area is going to have pros and cons in how it is laid out and how the circuit is designed, and one of the many variables is going to be xtor density. Out of every possible concern people have, I think xtor density is one of the least important (outside of how it relates to power consumption, although it is of course possible for a larger physical chip to consume less power).

But all of this "let's make a bigger die for better yields" still translates into a more expensive chip, doesn't it?

Not necessarily. A tiny chip that yields 3% is going to be more expensive than a large one that yields 90%, by a lot too.

If there were no competing product, how do you think the price/expense of this bigger die would materialize?

It would be priced based on what the market would pay. It wasn't that long ago that the 58xx parts were considerably more expensive than they launched at; how much good did their smaller die do consumers then? No matter what, a product will be priced at what the market will pay, or it will fail.
 
Last edited:

Scali

Banned
Dec 3, 2004
2,495
0
0
Adding to brybir's post above...
nVidia's architecture is vastly different from AMD's anyway. It could be that nVidia's approach just cannot be implemented with the same density because of different requirements in routing/interconnects/etc.

And as a chip gets larger, it takes longer for electrons to travel from one end to the other. This can cause clock issues, which means you have to add extra clock repeater logic to the chip... and of course there are issues of power consumption and routing enough power to all the right places, etc. As I already mentioned earlier, using larger vias (fat/double/triple/etc.) takes up more space as well...

There really is no way to make a fair comparison.
What would be interesting to see is whether the density between GF100 and GF104 is different.
GF100: 3000m/529mm^2 = 5.67m/mm^2
GF104: 1950m/332mm^2 = 5.87m/mm^2

See? The smaller chip with lower power consumption also has a higher xtor density. Coincidence?
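For anyone who wants to redo the arithmetic, a trivial sketch with the transistor counts and die sizes quoted above:

```python
# Transistor density from the figures quoted above (millions of xtors per mm^2).
chips = {
    "GF100": (3000, 529),   # ~3.0B transistors, ~529 mm^2
    "GF104": (1950, 332),   # ~1.95B transistors, ~332 mm^2
}

for name, (xtors_millions, area_mm2) in chips.items():
    print(f"{name}: {xtors_millions / area_mm2:.2f} M xtors/mm^2")
```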
 

Scali

Banned
Dec 3, 2004
2,495
0
0
But all of this "let's make a bigger die for better yields" still translates into a more expensive chip, doesn't it?

Not really, because the yields are the most important factor in your expenses.
How many *usable* dies can we get out of a wafer?
You still have to pay for defective dies. You pay per wafer, not per working die.
So you want to design your chip and binning process in a way that gives you the best possible balance between working dies from a wafer, and performance per working die.

If you can make a die a bit smaller, you may be able to fit a few more on a wafer... But if the result of shrinking the chip is that fewer of the dies from the wafer actually work (or bin to the same clockspeeds, etc.), you've effectively made it more expensive.
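A back-of-the-envelope sketch of that point; the wafer price, die counts and yields below are invented purely to illustrate the "you pay per wafer" logic:

```python
# You pay per wafer, not per working die, so cost per *usable* die is what counts.
# All numbers below are invented for illustration (wafer price, die counts, yields).

WAFER_COST = 5000.0   # assumed price per wafer in dollars

def cost_per_good_die(gross_dies_per_wafer, yield_fraction):
    """Effective cost of one working die when the whole wafer has to be paid for."""
    good_dies = gross_dies_per_wafer * yield_fraction
    return WAFER_COST / good_dies

# A tiny die fits many more candidates per wafer, but if almost none of them work...
tiny_poor_yield = cost_per_good_die(gross_dies_per_wafer=400, yield_fraction=0.03)
large_good_yield = cost_per_good_die(gross_dies_per_wafer=100, yield_fraction=0.90)

print(f"tiny die,  3% yield : ${tiny_poor_yield:.0f} per working die")
print(f"large die, 90% yield: ${large_good_yield:.0f} per working die")
```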
 

Scali

Banned
Dec 3, 2004
2,495
0
0
It would be priced based on what the market would pay. It wasn't that long ago that the 58xx parts were considerably more expensive than they launched at; how much good did their smaller die do consumers then? No matter what, a product will be priced at what the market will pay, or it will fail.

I'd like to add that production cost generally has VERY little to do with what a chip or video card ends up costing.
If you look up the pricing per wafer from companies like TSMC and try to fill in the blanks, you'll come up with an estimated cost per die of about $50-$100 for a high-end GPU.
The rest of the videocard costs peanuts as well. So the total manufacturing cost of a GTX480 is probably below the $150 mark (if you think about it, you'll realize that it costs no more to manufacture a GTX480 than it does to manufacture a GTX465... same components, the difference is just 'cherry picking' the die, so assuming the GTX465 still makes some profit, the GTX480 at least costs less to make than what a GTX465 costs in the store).
Is the rest all profit? Yes and no.
It has taken nVidia YEARS to design and test this GPU, which means a lot of R&D costs from a large team of the most competent engineers in the business.
And THAT is where the real cost is. A lot of the 'profit' on the GTX480 is return on investment for designing it. And of course they will also have to finance the future GPU R&D which is ongoing as we speak. So even once the investment is repaid, they will continue with the higher margins in order to build up funds for the future, as long as the market allows.

The whole idea that die size dictates cost was invented by... well, I don't know who...
 
Last edited:
Sep 9, 2010
86
0
0
But I heard that the GT200 cost up to $60.00 per GPU when it launched, so how much would the GF100 have cost at launch? The difference probably isn't much, especially since both chips are similar in size. So assuming the GF100 costs $50 per chip, plus memory costs, PCB items like wiring and routing, voltage regulators, capacitors and resistors, a GTX 480/470 should cost at least $135 as a guesstimate. And considering relative size compared to Cypress, which is almost half the size of GF100, with a cheaper and less complex PCB thanks to the 256-bit bus, less complicated power circuitry, similar components, and more dies per wafer, a single HD 5870 GPU should cost less than $20.00, and the whole videocard shouldn't exceed $65.00 as a guesstimate.
 

SolMiester

Diamond Member
Dec 19, 2004
5,330
17
76
I would like to know: what do you have against gaming on a modern multi-GPU setup when you currently game on a 9600GT? I don't think you can complain about micro-stutter; I'm sure with a 9600GT you get plain old stuttering in games. Many modern games will likely slow down tremendously at times with a 9600-level card. Otherwise you have to turn the details, AA, and resolution way down, right? So can you really say that you dislike multi-GPU any more than you dislike gaming on a really low-end part?

And what was your terrible experience with AMD drivers? You seem to hate them. I don't remember the details, but weren't you the one who was trying to use some sort of unsupported configuration with an X1900 or something?

I have a young family, two under three, so I don't really have that much time for it. I don't use AA, and this version of the 9600GT is up there with the 8800GT. At 16x10 I have no problems with COD4; I don't play online. Racing games do slow down, but my wheel is also showing its age, so I don't do much of that either... I play golf now when I can get out of the house, and the wife commandeers the PC mostly anyhow. Working with PCs every day has taken a bit of the shine off gaming anyway!
Regarding ATi drivers, I first had to put up with them back on NT4 on a helpdesk: ATi and the dreaded Intel i740.
I moved to nvidia after Voodoo was bought out; my mate had the 7000 and 9000 series cards and we always had issues with the bloody things. I tried an AGP X1900 Pro but gave up after about 5 sets of drivers... I also work with Linux and found the driver just crap. CF was crap with profiles, though I understand they have made that easier after how many years?
So yeah, I don't like ATi software... the cards are great, but IMO the software is not. With NV, I think the cards are great too, but the software is streets ahead IMO... simple as that!
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
So yeah, I don't like ATi software... the cards are great, but IMO the software is not. With NV, I think the cards are great too, but the software is streets ahead IMO... simple as that!

You need to move on, man. Drivers have improved, generally speaking.

I can understand if you use Linux; other than that, I see no difference between the drivers today.
 

SolMiester

Diamond Member
Dec 19, 2004
5,330
17
76
You need to move on, man. Drivers have improved, generally speaking.

I can understand if you use Linux; other than that, I see no difference between the drivers today.

Were there not issues from 10.5 all the way to 10.8, with the latest release giving a huge 50% increase in CF performance?...
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
Were there not issues from 10.5 all the way to 10.8, with the latest release giving a huge 50% increase in CF performance?...

You are being totally biased. What about the GTX 460 SLI issues? Nvidia has not released any driver fix since the launch of the 460 three months ago.

AMD at least fixed that issue a few days after that [H] article.
 

SolMiester

Diamond Member
Dec 19, 2004
5,330
17
76
You are being totally biased. What about the GTX 460 SLI issues? Nvidia has not released any driver fix since the launch of the 460 three months ago.

AMD at least fixed that issue a few days after that [H] article.

That's not answering the question! There's no point releasing drivers if they don't fix the issue... I'm not up to speed on any SLI issue; I mentioned the CF one as that took 3/4 attempts to fix!
 

Paratus

Lifer
Jun 4, 2004
17,636
15,822
146
Thanks guys - I feel I now understand the GPU design trade space a little better.

:beers: :)
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
That's not answering the question! There's no point releasing drivers if they don't fix the issue... I'm not up to speed on any SLI issue; I mentioned the CF one as that took 3/4 attempts to fix!
Why are you whining about CrossfireX and ATi drivers when you run an obsolete, low-performance NVDA card?
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
Why are you whining about CrossfireX and ATi drivers when you run an obsolete, low-performance NVDA card?

That kind of attitude is uncalled for. Try to counter his arguments in a mature manner, not by looking down on a person based on the products he owns.

I am sure he has a reason for using that old card.
 
Last edited:

Dark Shroud

Golden Member
Mar 26, 2010
1,576
1
0
Was there not issues with 10.5 all the way to 10.8?, the latest release giving a huge 50% increase in CF performance?....

That wasn't an issue with the driver. Some people were having issues with the game profiles, and most of those people were running some type of Crossfire setup. For the last six years I've had an ATI card in at least one of my PCs, and I've never had a problem with drivers for single cards.

ATI doesn't have Nvidia's resources, yet AMD gets fixes out while Nvidia has not done anything for the GTX 460, the very card they need to sell a lot of. Not to mention that Nvidia's beta drivers usually suck and their normal drivers have bricked cards.

I don't like CCC, yet AMD still updates on a normal monthly schedule and releases hotfixes & profile updates in between those releases.

Honestly, the people who complain the loudest about AMD drivers usually don't even own AMD video cards, and haven't owned one in years.
 

brybir

Senior member
Jun 18, 2009
241
0
0
Thanks guys - I feel I now understand the GPU design trade space a little better.

:beers: :)

One thing to note also is that AMD and Nvidia are on the same 40nm process at the same foundry, which means they are likely exposed to the same "synthetic issues", i.e. process flaws that are inherent in TSMC's 40nm process.

The real question though, is how does each design handle the flaws introduced? That is the true balancing act that goes on, and one that can often be unpredictable.

If we believe that AMD has had 9-12 months more experience on the 40nm process, that means they have a very good idea of *exactly* what kinds of synthetic flaws (those introduced by the process) exist and how to design around them. That was basically what that engineer from AMD was talking about in the Anandtech article some time ago. They used the 4770 as a testbed to "feel out" the process, and designed the 5XXX in a way that they believed would hit their performance, power, and cost goals while minimizing the risk that the synthetic errors would result in dead chips because of the IC design itself.

As they move to the 6XXX, this experience *should* help them further optimize their design so as to avoid crippled or non-working parts.

In the end Scali is right, though: the physical making of the chips is a smaller overall cost. What matters more are the advantages one can possibly get from it, like being able to ship a product earlier (if it is true that Fermi required a second revision because synthetic flaws were giving the first design horrid yields), especially if AMD can stick to a yearly release cadence.

If we had concrete yield numbers (i.e. real numbers) and could say, okay, AMD's yield on its Juniper core is 75%, we could then say something like: on a 300mm wafer, a 100% yield would be 200 working cores (making up numbers here), therefore a 75% yield is 150 working cores. If we were to look at Nvidia and say the same thing but change the yield to 50%, we could say they have 100 working cores. So, at this point, the larger die looks bad in comparison. But in actuality, the difference in the cost of that chunk of silicon is pretty low, and if Nvidia can sell its 100 ICs for $100 each, but AMD can only sell its 150 ICs for $20 each, Nvidia still makes more money.
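Plugging those made-up numbers into a quick sketch (the 200-die wafer capacity, yields and prices are the illustrative figures from the paragraph above, not real data):

```python
# Working dies and revenue per wafer, using the made-up figures from the post above.
# (For simplicity the same 200-die wafer capacity is assumed for both vendors.)

def wafer_output(dies_at_full_yield, yield_fraction, price_per_die):
    working = dies_at_full_yield * yield_fraction
    return working, working * price_per_die

amd_dies, amd_revenue = wafer_output(200, 0.75, 20)    # 150 dies at $20 each
nv_dies,  nv_revenue  = wafer_output(200, 0.50, 100)   # 100 dies at $100 each

print(f"AMD   : {amd_dies:.0f} working dies -> ${amd_revenue:.0f} per wafer")
print(f"Nvidia: {nv_dies:.0f} working dies -> ${nv_revenue:.0f} per wafer")
```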

So, yes, Nvidia would LOVE 100% yields, but in the end the only thing they care about is making as much money as possible. A design that had 100% yields is 1. probably not possible and 2. would cost a fortune to design, so it's not expected. And at the end of the day, the cost of a defective die is some fraction of the cost of a wafer, which is not all that much in the overall picture. As stated above, the REAL problem it can represent is in lost sales if there was demand to sell them, image problems, investors getting their panties all knotted up, etc.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
But I heard that the GT200 cost up to $60.00 per GPU when it launched, so how much would the GF100 have cost at launch? The difference probably isn't much, especially since both chips are similar in size. So assuming the GF100 costs $50 per chip, plus memory costs, PCB items like wiring and routing, voltage regulators, capacitors and resistors, a GTX 480/470 should cost at least $135 as a guesstimate. And considering relative size compared to Cypress, which is almost half the size of GF100, with a cheaper and less complex PCB thanks to the 256-bit bus, less complicated power circuitry, similar components, and more dies per wafer, a single HD 5870 GPU should cost less than $20.00, and the whole videocard shouldn't exceed $65.00 as a guesstimate.

Well, if we take those guesstimates and set them against the total prices:
GTX480: street price ~$500, manufacturing costs: $135. Margin: $365
HD5870: street price ~$400, manufacturing costs: $65. Margin: $335

So does it really matter that the GTX480 may cost about twice as much to make? No. In the total product price, the difference is small, and it can easily be compensated for by the fact that the GTX480 performs better.
If these guesstimates are true, then nVidia would make about $30 more per GTX480 than AMD does per HD5870.

In other words... die size? Useless metric for cost/profit/etc.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
If we believe that AMD has had 9-12 months more experience on the 40nm process, that means they have a very good idea of *exactly* what kinds of synthetic flaws (those introduced by the process) exist and how to design around them. That was basically what that engineer from AMD was talking about in the Anandtech article some time ago. They used the 4770 as a testbed to "feel out" the process, and designed the 5XXX in a way that they believed would hit their performance, power, and cost goals while minimizing the risk that the synthetic errors would result in dead chips because of the IC design itself.

But that's the FUD spread by AMD, making it sound as if they invented double vias.
nVidia started building 40nm DX10.1 parts at about the same time as AMD started with the 4770, the same testbed approach. Fermi had been using double vias from the beginning; nVidia just never made a big deal about it, unlike AMD (to insiders, AMD's remark about moving to double vias actually sounds pretty amateurish. Double vias have been around for many years).
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
I work with FreeBSD...

I don't understand your comment? I know you have a 5770. Does AMD have driver support for FreeBSD? I have never used FreeBSD before, just curious.

I read your posts Scali, you have been pretty vocal about AMD's non-inclusion of OpenCL runtime in their drivers.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
I don't understand your comment? I know you have a 5770. Does AMD have driver support for FreeBSD? I have never used FreeBSD before, just curious.

Nope, AMD has never released a single driver for FreeBSD... and the open source support for most GPUs (especially newer ones) is horrible.
nVidia has supported FreeBSD x86 for a few years now, and a few months ago, they also started releasing x64 drivers. So I'm pretty happy about that (I've been running the x64 version for quite a while now).
nVidia also supports Solaris btw.

All these OSes use the same Xorg stuff and mainly require OpenGL/OpenCL support. So once you have a working Linux driver, it should not be THAT difficult to also add support for FreeBSD and Solaris to your codebase. But nVidia is the only one who has made the effort.
For all other GPUs, you're completely dependent on the bundled open source drivers in Xorg.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
One thing to note also is that AMD and Nvidia are on the same 40nm process at the same foundry, which means they are likely exposed to the same "synthetic issues", i.e. process flaws that are inherent in TSMC's 40nm process.

The real question though, is how does each design handle the flaws introduced? That is the true balancing act that goes on, and one that can often be unpredictable.
I agree

If we believe that AMD has had 9-12 months more experience on the 40nm process, that means they have a very good idea of *exactly* what kinds of synthetic flaws (those introduced by the process) exist and how to design around them. That was basically what that engineer from AMD was talking about in the Anandtech article some time ago. They used the 4770 as a testbed to "feel out" the process, and designed the 5XXX in a way that they believed would hit their performance, power, and cost goals while minimizing the risk that the synthetic errors would result in dead chips because of the IC design itself.

As they move to the 6XXX, this experience *should* help them further optimize their design so as to avoid crippled or non-working parts.
I believe you read Charlie's articles too much. What 9-12 months more experience? Both parties have been working with TSMC for years, and the chip-making process requires them to work together closely. The testbed theory is nothing but an after-the-fact story. Cypress is simply a shrunk version of the R700 chip, and the R700 design had already been modified for maximum yield. Fermi, by contrast, is an extremely new design with parts that had never been placed on a die before. Naturally, Cypress yields will be greater than Fermi yields, and if Nvidia had chosen to shrink GT200, its yield would have been better than Fermi's.

It is hard to avoid those types of problems. The laser head simply wasn't calibrated correctly, and what was supposed to be 40nm came out more like 45nm. Due to the complexity of the Fermi design (3B transistors), it was doomed to have problems. Cypress is less complex and smaller in size, and therefore had a slightly better yield.

Generally speaking, yield loss is the part of the wafer that can't form a complete die. Try to draw squares within a circle and you will get roughly 35% yield loss around the edges. Defects are not yield loss, as defects are very random; usually defects account for less than 1% if there are no design flaws or laser calibration problems.
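As a rough sketch of that "squares within a circle" effect, here is a small Python estimate; the wafer diameter and die sizes are arbitrary assumptions, the exact loss depends heavily on how big the die is relative to the wafer, and real layouts also lose area to scribe lines and edge exclusion zones:

```python
import math

# Count square dies that fit entirely inside a circular wafer and see how much
# wafer area is left unused around the edge. Wafer diameter and die sizes are
# arbitrary assumptions for illustration only.

WAFER_DIAMETER_MM = 300.0

def edge_loss(wafer_d, die_side):
    r = wafer_d / 2.0
    steps = int(wafer_d // die_side) + 2
    dies = 0
    for i in range(-steps, steps):
        for j in range(-steps, steps):
            # The corner of this grid cell furthest from the wafer centre
            # decides whether the whole die fits on the wafer.
            x = max(abs(i * die_side), abs((i + 1) * die_side))
            y = max(abs(j * die_side), abs((j + 1) * die_side))
            if math.hypot(x, y) <= r:
                dies += 1
    unused = 1.0 - dies * die_side ** 2 / (math.pi * r ** 2)
    return dies, unused

for side in (13.0, 18.2, 23.0):   # roughly 169, 331 and 529 mm^2 dies
    dies, unused = edge_loss(WAFER_DIAMETER_MM, side)
    print(f"{side**2:5.0f} mm^2 die: {dies:4d} whole dies, "
          f"{unused:.0%} of wafer area unused")
```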

In the end, defects + yield loss affect the production quantity, which wasn't hit that hard, at least not as hard as people believed. What was hit was the quality of the chips that made it out. Since the wires were thicker than designed, they generated excessive heat. They didn't disable SPs because of defects, but to cut down heat generation. Even with 80% of the chip running, it is still extremely hot.

The 460 is a cut-down design of the original Fermi, but that isn't the reason it is a better chip. First, ECC is meaningless for gamers, and second, it was made after the TSMC issue had been resolved. That means the chip was actually closer to its design, and each of them performs above standard. Look at the 480 and 470: 470s are 480s that don't really meet the standard, and by cutting down the active SPs further, heat generation is reduced to a tolerable level. 465s are the ones that were still too hot even after that cut. We don't have a 450 cut down from the 460, and people are overclocking the 460 like crazy, because there is only one bin coming out of the wafer. If even the worst dies meet the 460 standard, then the 460 is a dream to OC.

ATI had the same problem: the 5850 comes from the 5870, and the 5750 from the 5770. The difference is that the Cypress design is less complicated than Fermi, and therefore consumes less power and produces less heat. It's not, as people like to say, that "ATI knows better."

I bet Cypress was supposed to run faster and cooler than it does now and ATI's engineers were crying in their labs, but luckily their opponent had it worse.

ATI didn't have it easy either. At the time (before September last year) there were actually no DirectX 11 games, and therefore they didn't know how their chip was going to perform in a DirectX 11 environment. In the end, all the public cares about is FPS. They were more or less blindfolded, and the testing methods used by QA may have been off. They pulled through with a very successful product regardless of the difficulty. The 460 was created at a time when Nvidia could already see the height of the bar, and so they modified the design into a version optimized for gaming.

The 6xxx series will be quite interesting if you are looking solely at gaming. However, Nvidia too is working hard on fine-tuning the Fermi design.
 
Last edited:

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,250
136
It is hard to avoid those types of problems. The laser head simply wasn't calibrated correctly, and what was supposed to be 40nm came out more like 45nm. Due to the complexity of the Fermi design (3B transistors), it was doomed to have problems.

I don't know, but it seems like Nvidia is trying to cram too much into the little space of a GPU. Wouldn't they be better off just going with a dual-chip setup instead? After all, they are striving for world domination! From a business point of view, wouldn't it be better to spin CUDA, PhysX, or whatever else they have planned off into a separate chip? Wouldn't it help with things like yield, temps, power, etc.?
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
I don't know, but it seems like Nvidia is trying to cram too much into the little space of a GPU. Wouldn't they be better off just going with a dual-chip setup instead? After all, they are striving for world domination! From a business point of view, wouldn't it be better to spin CUDA, PhysX, or whatever else they have planned off into a separate chip? Wouldn't it help with things like yield, temps, power, etc.?
Dual-chip setups have been plagued with problems since the beginning; lots of games don't like CF and SLI. Having a separate GPU for CUDA and PhysX is a good idea, but it really isn't like ATI is going to handicap itself. ATI has been all out against Nvidia in terms of chip quality, and Nvidia really doesn't have the room to play fancy.

Actually, Nvidia did try to play fancy with Fermi, and landed face first...

It really wasn't up to ATI, but AMD. AMD is a CPU manufacturer and doesn't believe in gaming. If GITG had not been cut, TWIMTBP wouldn't be the way it is now and ATI users would have their own version of the eye candy.

Real-time tessellation originated with ATI. Guess what that would have been like...
 
Last edited: