nVidia GT300's Fermi architecture unveiled: 512 cores, up to 6GB GDDR5


BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Nice attack on the messenger, but no dice. The Fudzilla link simply reports the PCGA's Horizon revenue report

The PCGA is a bunch of liars. There is no splitting hairs here; they lie, badly. They don't have any basis in reality at all, no marketing data to support their obnoxiously obscene claims, and nothing resembling a shred of credibility. By contrast, I linked NPD, which is able to demand thousands of dollars per annual subscription, and corporate financial reports that are bound by law to be accurate.

Thanks for proving my point for me, namely that digital transactions aren't included in your figures.

They are reported in the financials I linked. My numbers are correct; the NPD number was just to use as a guideline of retail sales versus actual total sales numbers. Outside of MMOs, online sales are significantly smaller than most people seem to think. Again, they are listed, as a matter of criminal law, accurately in the financial links I provided.

There's no need for me to respond to the rest of your figures given they likely misrepresent the PC gaming revenue in the same way.

They give absolute dollar values, and they are legally bound to report them accurately. They aren't a bunch of marketers trying to push an agenda in a financial report; they are accountants filing with the SEC. That is the difference: my numbers are real, the ones you are quoting are completely fabricated.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
As I said before, this is all NV hype to stop people from buying the 5870/5850. Latest from NV:

"[The Fermi chip was] not on the prototype that was held up [at the keynote and press conference]. Modifications may be made in this before the product is released." -Luciano Alibrandi

He goes on to say the demos were done on Fermi, but with all the lies NV puts out I doubt it. Why should we believe the great deceiver?
 

GaiaHunter

Diamond Member
Jul 13, 2008
3,697
397
126
Originally posted by: LCD123
I agree, the 5870 has DX11 and more features, and I'm sure people could overlook the 10% slower performance. Even at the same price, I bet it would be a tough choice between the two: take 10% more performance, or take future-proofing and extra features? As for ATI having two cards in Crossfire, compare that vs. Nvidia SLI please.

He is talking about the 5870X2, as in a SINGLE PCB with 2 GPUs, not 2 5870s in Crossfire.

That is the same as the GTX295, which is 2 GPUs on a SINGLE PCB.




 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: MODEL3
With the 2.249X scaling in transistor count, the 5870 (40nm) was only 1.284X bigger in die size than the 4870 (55nm) (334mm2/260mm2).

So the hypothetical GTX380 (40nm), with 3.15 billion transistors, will also have 1.284X scaling in die size.

So the GTX380 (3.15 billion transistors) will be 603mm2 (470mm2 * 1.284).

So the real GTX380, with 3 billion transistors (with the logic that you are using), is going to be less than 603mm2.

In the best case scenario it will be 603mm2 / 3.15 * 3 = 575mm2.
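
(For anyone who wants to follow that arithmetic, here it is as a quick sketch; the GTX380 transistor count and die size are, of course, purely hypothetical at this point.)

```cpp
#include <cstdio>

int main() {
    // RV770 (4870, 55nm) -> Cypress (5870, 40nm): the ATI reference point
    const double rv770_mm2   = 260.0;
    const double cypress_mm2 = 334.0;
    const double area_scale  = cypress_mm2 / rv770_mm2;  // ~1.284x area for ~2.249x the xtors

    // Apply the same area scaling to the GTX285 die (~470mm2);
    // 1.4B xtors * 2.249 is where the ~3.15B figure comes from.
    const double gtx285_mm2  = 470.0;
    const double est_3p15b   = gtx285_mm2 * area_scale;   // ~603 mm2 at 3.15B xtors
    const double est_3p0b    = est_3p15b / 3.15 * 3.0;    // ~575 mm2 at 3.0B xtors

    printf("area scaling (ATI, 55nm -> 40nm): %.3fx\n", area_scale);
    printf("hypothetical GTX380 @ 3.15B xtors: ~%.0f mm2\n", est_3p15b);
    printf("hypothetical GTX380 @ 3.00B xtors: ~%.0f mm2\n", est_3p0b);
    return 0;
}
```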

I don't want to come across as trying to harass you over your estimates and logic here, as it is all plausible, but I did want to quote you as a segue to interjecting the following info on xtor density, and on why comparing xtor density between two architectures, or even between two ICs designed by the same company on the same architecture, is a tricky thing to do with any degree of confidence: there are so many unknowns that we have to make assumptions about in order to perform such an analysis.

It is the absence of this info, the trade-off decisions made by the project managers, that can result in such estimates being critically flawed; at best they will be right for all the wrong reasons.

Not that we shouldn't try, but we should be clear(er) about what we are assuming must be true when making xtor density comparisons (such as "assuming the design optimization budget was the same for RV770 and Cypress, the xtor density of Cypress was as optimized for 40nm as the xtor density of RV770 was for 55nm", etc.).

The following is a dialogue I had with another forum member in a PM; it seems generically applicable enough to the current topic that I thought it would add value to repost it here in toto. It is not meant to be all-encompassing on the subject of xtor density (a 600-page book could be written in pursuit of that), but it is meant to communicate the salient points regarding such metrics.

Die Density: Will more tightly packed transistors create more heat? Require less voltage to run? And how was ATI able to cram 900+ million transistors into such a small space, while Nvidia, with 50% more transistors, took up more than 50% more die space?

Xtor density doesn't really equate to power consumption (heat) directly. Transistors are two-dimensional creatures: they have a length and a width.

The length, or more specifically the minimum length possible for a given node, tends to be the metric that catches a lot of headlines. But the width is also important.

Drive currents are normalized per unit of transistor width: nano-amps per micron.

http://www.realworldtech.com/p...ID=RWT072109003617&p=5

What is the relevance of drive current? It determines the amount of current that leaves the transistor, which is then used to drive (turn on) subsequent transistors.

The higher your drive current, the smaller (narrower) you can make the xtor (resulting in higher density) while still getting the same amount of current out of it to drive more xtors (the act of computing).

Now the architecture is what determines how many more xtors you need to drive to do your computation. This is where NV and ATI diverge and is why their xtor density can be so different.

Also drive current is voltage dependent, so you can use smaller (narrower) xtors but increase the operating voltage and get more drive current out of them that way.

Now increasing voltage will increase heat and power-consumption. So if you implement an architecture that needs lots of drive current but you want high xtor density (for lower manufacturing costs, higher yield) then you increase the voltage.

Or you could optimize your architecture to not need so much drive current and then you could use higher xtor density with lower volts.

ATI is able to cram so many xtors into such a small area because they use narrower xtors with less net Idrive per xtor, which means they either up the voltage to boost the net Idrive per xtor or they implemented an architecture that is less demanding of drive current.

(Incidentally, if you check out Anand's article on Intel's choice of 8T SRAM on Nehalem versus 6T SRAM on Penryn, you'll see it comes down to similar architecture vs. power consumption vs. xtor density trade-offs.)

The architecture dependence is why you'll see me repeatedly stating in the forums that making xtor density comparisons between AMD and NV is pointless unless we know far more technical/intimate details about the architecture and design tradeoffs between Vcc and GHz that were made by the project managers.

I don't understand: if both ATI and NV use TSMC processes, why are the transistor densities so different? The transistors of the ATI design and the Nvidia design are the same size on 40nm, correct?

The smallest xtor size (Lg, or gate length, as it's called) is the same, as is the Idrive (nA per um) for any given voltage. This is true because they both use the same 40nm TSMC process tech.

But the architecture dictates the needs for Idrive, which then requires choices to be made in terms of voltage (power-consumption trade-off) versus die size (cost, yield trade-off).

Also, both voltage and the minimum gate length determine clockspeed. So if you find you need a certain operating voltage to hit your targeted clockspeeds but you used needlessly wide xtors (too much Idrive generated from the voltage and your selected xtor width), then you are just generating extra power consumption and heat. The design tools they have nowadays are good enough to eliminate much of this uncertainty, though.
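
To put some toy numbers on that width/voltage trade-off, here is a little sketch. The constants are invented purely for illustration (they are not real TSMC 40nm parameters), and the model is first-order at best:

```cpp
#include <cstdio>

// Toy model only: these constants are invented for illustration and are NOT
// real TSMC 40nm numbers. Drive current is normalized per micron of gate
// width and treated, to first order, as proportional to (Vcc - Vt).
const double K_UA_PER_UM_PER_V = 900.0;  // assumed drive strength
const double VT                = 0.35;   // assumed threshold voltage

// Gate width needed to deliver a required drive current at a given Vcc.
double width_for_idrive(double idrive_ua, double vcc) {
    return idrive_ua / (K_UA_PER_UM_PER_V * (vcc - VT));
}

// Relative dynamic power ~ C * Vcc^2 (capacitance taken as proportional to
// width, clock held constant); arbitrary units, leakage ignored.
double rel_dynamic_power(double width_um, double vcc) {
    return width_um * vcc * vcc;
}

int main() {
    const double needed_ua = 500.0;  // Idrive the architecture demands (made up)
    const double vccs[]    = {0.9, 1.1};

    for (double vcc : vccs) {
        double w = width_for_idrive(needed_ua, vcc);
        printf("Vcc=%.1fV -> width %.2f um (density ~ 1/width), rel. power %.2f\n",
               vcc, w, rel_dynamic_power(w, vcc));
    }
    return 0;
}
```

For the same required Idrive, the higher-voltage option gets by with a narrower xtor (better density) but pays for it in dynamic power, and leakage also rises with voltage. That is the ATI-vs-NV packing question in a nutshell.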
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Nemesis 1
As I said before, this is all NV hype to stop people from buying the 5870/5850. Latest from NV:

"[The Fermi chip was] not on the prototype that was held up [at the keynote and press conference]. Modifications may be made in this before the product is released." -Luciano Alibrandi

He goes on to say the demos were done on Fermi, but with all the lies NV puts out I doubt it. Why should we believe the great deceiver?

Nemesis, regardless of whether deception is afoot, the fact is that none of it (the deception, real or perceived) changes the reality of when we can buy Fermi-based graphics products. The cause and effect are reversed here.

Deception does not cause a delay; a delay causes deception.

And unless I am mistaken, the hype so far is in regard to Tesla, the GPGPU stuff, not hype about graphics products that would impact Cypress sales. So if anything, the deception here is simply going to hurt sales of GT200-based Tesla products that would have gone through if it weren't for the promise of Fermi-based Tesla coming out soon.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: Idontcare
Snip

Yes, I remember that. (I think you had posted something similar, if I remember correctly, a month or so ago.)

That's why I started my post with:


Originally posted by: MODEL3
With this logic:

..............................................

and ending it with:

Originally posted by: MODEL3
..............................................

The thing is that with a new architecture, a new process technology, and the way that NV counts transistors (do the old numbers include cache transistors? does the 3 billion figure in this case include cache transistors?), it is too difficult to say what the die size will be.

It doesn't matter if we are accurate.
We are just examining the possibilities of what the die size may be, for fun.
I like your 530mm2 estimate.
 

Zstream

Diamond Member
Oct 24, 2005
3,395
277
136
Originally posted by: Idontcare
What is important here? That Fermi can support the execution of compiled C++ code, or that some dude mistakenly equates the native aspects of this feature to those of the hardware? I program in C++ and I run programs that are compiled from Fortran (I have the source and I compile for my own unique hardware combos); personally, I am quite excited about the prospect of getting my hands on a Fermi compiler. Could be super fun. Getting all excitable over semantics gaffes, not so much.

There are large differences between hardware support and being able to run C++. Unfortunately, it appears you are saying the differences are trivial. If you do not understand the differences... well, that is pretty bad. Trying to pump up a feature that you do not grasp is not only sad but pathetic.

C++ is a programming language, plain and simple; there are no ifs or buts about it. In order to run C++ natively, the hardware will need a built-in assembler, I/O capabilities, etc. just for some C/C++ to run. To compound the problem even further, in order to run programs from start to finish, you need another layer of code to take the x86/x64 instructions and port them over to whatever language the hardware takes.

If you want to say Nvidia can handle code from C++ on an OS and assist in running those programs, then yes, I agree. If you want to say Nvidia can run an entire OS or native x86/x64 programs, then no way.
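
To make the layering concrete, here is roughly what "C++ on the GPU" already looks like with CUDA C/C++ today; this is a generic sketch, not anything Fermi-specific. The point is that a host-side toolchain (nvcc) compiles this down to the GPU's own instruction set, the GPU never parses C++ itself, and the host still owns the I/O and the launch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The toolchain (nvcc), not the GPU, turns this C++ into GPU machine code.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // The host side still owns I/O, allocation, and the launch.
    float* hx = new float[n];
    float* hy = new float[n];
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);  // launched from the host

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %.1f (expect 5.0)\n", hy[0]);

    cudaFree(dx); cudaFree(dy);
    delete[] hx; delete[] hy;
    return 0;
}
```

Something like `nvcc -arch=sm_20 saxpy.cu` would target a Fermi-class (compute capability 2.0) part; the GPU only ever sees the machine code that comes out the other end.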
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
When Nvidia says "native", it implies it will run without a recompile, but you know that's bullshit.
 

LCD123

Member
Sep 29, 2009
90
0
0
Originally posted by: GaiaHunter
Originally posted by: LCD123
I agree, the 5870 has DX11 and more features, and I'm sure people could overlook the 10% slower performance. Even at the same price, I bet it would be a tough choice between the two: take 10% more performance, or take future-proofing and extra features? As for ATI having two cards in Crossfire, compare that vs. Nvidia SLI please.

He is talking about the 5870X2, as in a SINGLE PCB with 2 GPUs, not 2 5870s in Crossfire.

That is the same as the GTX295, which is 2 GPUs on a SINGLE PCB.

We don't know if it's yet possible to slap two cores on a 5870X2. Doesn't one core already take so much power that you need two connectors supplying power to the card?
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
The 4870, 9800 GTX, and GTX 275 each have two power connectors, and yet there are cards with two of these chips.

Power consumption with the 5870 has also decreased over past generations:

http://www.xbitlabs.com/articl...on-hd5870_7.html#sect0

I would imagine the FurMark data point of 161 watts is a peak that is not commonly reached during normal gaming. You can see the peak 3D data point is only 107 watts, a significant reduction from the 130 watts of the 4870. There should be no problem with a 5870 X2 as far as power goes. My concern would be the width and quality of the bus between the GPUs.
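
A rough back-of-the-envelope with those Xbit numbers, just doubling the single-GPU figures as a worst-case bound (a real X2 would likely run lower clocks/volts, so treat these as upper limits):

```cpp
#include <cstdio>

int main() {
    // Xbit Labs single-5870 measurements quoted above
    const double peak_3d_w = 107.0;  // peak 3D (gaming) load
    const double furmark_w = 161.0;  // worst-case synthetic load

    // Naive upper bound for a dual-GPU board: double the single-GPU numbers.
    printf("5870 X2, doubled peak 3D : ~%.0f W\n", 2 * peak_3d_w);  // ~214 W
    printf("5870 X2, doubled FurMark : ~%.0f W\n", 2 * furmark_w);  // ~322 W

    // PCIe power budgets: 75 W from the slot, 75 W per 6-pin, 150 W per 8-pin.
    printf("slot + 6-pin + 6-pin budget: %d W\n", 75 + 75 + 75);    // 225 W
    printf("slot + 6-pin + 8-pin budget: %d W\n", 75 + 75 + 150);   // 300 W
    return 0;
}
```

The doubled FurMark number would exceed a slot + 6-pin + 8-pin budget, but doubled typical gaming load lands comfortably under it, which is why an X2 looks feasible on power.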
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Originally posted by: LCD123
Originally posted by: GaiaHunter
Originally posted by: LCD123
I agree, the 5870 has DX11 and more features, and I'm sure people could overlook the 10% slower performance. Even at the same price, I bet it would be a tough choice between the two: take 10% more performance, or take future-proofing and extra features? As for ATI having two cards in Crossfire, compare that vs. Nvidia SLI please.

He is talking about the 5870X2, as in a SINGLE PCB with 2 GPUs, not 2 5870s in Crossfire.

That is the same as the GTX295, which is 2 GPUs on a SINGLE PCB.

We don't know if it's yet possible to slap two cores on a 5870X2. Doesn't one core already take so much power that you need two connectors supplying power to the card?

What do you mean we don't know? The 5870X2 and the 5850X2 are real cards that have been officially announced by ATI and will be out before Christmas.
 

happy medium

Lifer
Jun 8, 2003
14,387
480
126
Originally posted by: dguy6789
Originally posted by: LCD123
Originally posted by: GaiaHunter
Originally posted by: LCD123
I agree, the 5870 has DX11 and more features, and I'm sure people could overlook the 10% slower performance. Even at the same price, I bet it would be a tough choice between the two: take 10% more performance, or take future-proofing and extra features? As for ATI having two cards in Crossfire, compare that vs. Nvidia SLI please.

He is talking about the 5870X2, as in a SINGLE PCB with 2 GPUs, not 2 5870s in Crossfire.

That is the same as the GTX295, which is 2 GPUs on a SINGLE PCB.

We don't know if it's yet possible to slap two cores on a 5870X2. Doesn't one core already take so much power that you need two connectors supplying power to the card?

What do you mean we don't know? The 5870X2 and the 5850X2 are real cards that have been officially announced by ATI and will be out before Christmas.

I didn't know they were officially announced. Could you give me the official link? Benchmarks, power consumption, length of the card, clock speeds, and what kind of power connectors are needed? What's the official release date?
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
Originally posted by: happy medium

I didn't know they were officially announced. Could you give me the official link? Benchmarks, power consumption, length of the card, clock speeds, and what kind of power connectors are needed? What's the official release date?

Since when does AMD release benchmarks with a product announcement? Search Google if you want information. There are plenty of pictures going around of AMD execs posing like douchebags, each with a 5870 X2 in his hands.
 

happy medium

Lifer
Jun 8, 2003
14,387
480
126
Originally posted by: alyarb
Originally posted by: happy medium

I didn't know they were officially announced. Could you give me the official link? Benchmarks, power consumption, length of the card, clock speeds, and what kind of power connectors are needed? What's the official release date?

Since when does AMD release benchmarks with a product announcement? Search Google if you want information. There are plenty of pictures going around of AMD execs posing like douchebags, each with a 5870 X2 in his hands.

All those pictures of those cards and not a single spec or benchmark?

The 4850X2 was not an official card (as far as I remember) but was made by aftermarket companies, right? I googled the 5850X2 and didn't find anything official or official-looking.
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
I don't understand why you think benchmarks should immediately follow product meet-and-greets. We have no Fermi benchmarks, just pictures of people with the alleged card in their hands. We have the same exact thing with Hemlock: just photos and little else. When AMD launched RV770 they didn't show us all the goods on the same day, and I don't know why you think things should be any different now. The card isn't due for another ten weeks at least.

What specifications are left that you cannot surmise? People are saying the card is a foot long. The TDP is supposed to be 376 watts but I imagine the average consumption during gaming will be closer to 260-290 watts given the 5870's figures. And you're right, the 4850 and 3850 X2 were not official products but enjoyed a little success in their own right. I don't know if the 5850 X2 is official or not, but you will probably find one somewhere eventually.

The big question is whether the frame buffer is shared or segregated. AMD has garnered criticism for their segregated memory systems for years because they require an expensive and slower-than-PCIe bridge chip to mediate between the processors. If they can have one large frame buffer that is shared, they can eliminate this bridge chip and save money on the card, provided the proper on-die I/O was prepared ahead of time. They know they have to do something about it eventually, and this is the big unanswered question with the 5870 X2. It is way too early to be asking for benchmarks.
 

Cogman

Lifer
Sep 19, 2000
10,284
138
106
Originally posted by: Idontcare
Yes the lingo "native C++" is intended to be restricted to characterization of the compiler and not meant to be descriptive of the attributes of the underlying hardware.

At the same time it is clear that Keys misstated this but it is also clear what he meant/intended to communicate (C++ code can be compiled to run on Fermi)...Cogman I know you are intelligent enough to see what he meant and understanding enough to give some room for the possibility of a person simply accidentally conflating the technical terms.

What is important here? That Fermi can support the execution of compiled C++ code, or that some dude mistakenly equates the native aspects of this feature to those of the hardware? I program in C++ and I run programs that are compiled from Fortran (I have the source and I compile for my own unique hardware combos); personally, I am quite excited about the prospect of getting my hands on a Fermi compiler. Could be super fun. Getting all excitable over semantics gaffes, not so much.

I guess the reason it gets to me is that the dude touts it like some major architectural change is allowing this when, in reality, it is more of a driver update that really should be retroactive (Nvidia's GPGPUs of the past should also be able to run C++ code if Fermi can).

It makes it more of an issue to me because I start to question the legitimacy of the site that is touting the features. (I've never heard of BSN; they may be legitimate, but I just haven't heard of them. Thus, when a reviewer says stupid stuff, I lose faith in their site.)

*edit* I've probably just been hanging around P&N for too long. I didn't really mean for the response to be read as excited or anything like that; I was just trying to point out that the review statement was especially dumb. */edit*
 

HurleyBird

Platinum Member
Apr 22, 2003
2,800
1,528
136
Originally posted by: MODEL3
The thing is that in the 5870-4870 case the memory controller is the same 256-bit, while in the GTX380-GTX285 case the new memory controller has a smaller width than the old one. (GTX380 = 384-bit, GTX285 = 512-bit)

So you have to deduct a little bit of die space from the new design (GTX380).

Well, first, you're assuming that the memory controller is less dense than the rest of the chip, which may not be the case.

Secondly, you're leaving out the ECC bits of the memory controller. 72-bit (for ECC, vs. 64-bit without ECC) x 6 = a 432-bit memory controller. Performance-wise it still behaves like a 384-bit controller, but size-wise it's really 432-bit.

Really, this being a new (or at least for the most part new) architecture, we don't really have any idea whether Nvidia has improved transistor density or not. The fact that the chip doesn't have a 512-bit (576 with ECC) memory controller hints that it's probably closer to G80 than to GT200, but on the other hand Nvidia could have gone with the smaller memory controller not because of pad limits but because a larger interface simply had no performance benefit.
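
Writing the bookkeeping out (whether the ECC bits actually cost extra physical interface width, rather than being handled some other way, is my assumption here, not something Nvidia has confirmed):

```cpp
#include <cstdio>

int main() {
    const int channels  = 6;   // six 64-bit GDDR5 channels = the quoted "384-bit"
    const int data_bits = 64;  // data width per channel
    const int ecc_bits  = 8;   // assumed extra check bits per channel (72-bit style ECC)

    printf("logical width          : %d-bit\n", channels * data_bits);               // 384
    printf("physical width w/ ECC  : %d-bit\n", channels * (data_bits + ecc_bits));  // 432
    printf("GT200-style comparison : %d-bit (%d-bit with the same ECC scheme)\n",
           8 * data_bits, 8 * (data_bits + ecc_bits));                               // 512 / 576
    return 0;
}
```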
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Originally posted by: Zstream
Originally posted by: Idontcare
What is important here? That Fermi can support the execution of compiled C++ code, or that some dude mistakenly equates the native aspects of this feature to those of the hardware? I program in C++ and I run programs that are compiled from Fortran (I have the source and I compile for my own unique hardware combos); personally, I am quite excited about the prospect of getting my hands on a Fermi compiler. Could be super fun. Getting all excitable over semantics gaffes, not so much.

C++ is a programming language, plain and simple; there are no ifs or buts about it. In order to run C++ natively, the hardware will need a built-in assembler, I/O capabilities, etc. just for some C/C++ to run. To compound the problem even further, in order to run programs from start to finish, you need another layer of code to take the x86/x64 instructions and port them over to whatever language the hardware takes.

Your post is confusing me a little. "Assembler" is the word used instead of "compiler" when talking about assembly language. So what you probably meant is that in order to support C++ natively, the hardware would need a built-in compiler. Although I agree with what you said, I'm not sure such hardware exists.

About the x86/x64 instructions, can't all this be handled by the Fermi compiler? I'm just asking because I don't know how all this works.

If you want to say Nvidia can handle code from C++ on an OS and assist in running those programs, then yes, I agree. If you want to say Nvidia can run an entire OS or native x86/x64 programs, then no way.

Well, of course Fermi won't be able to run x86 software, but it's possible a future GPU could emulate it, albeit very slowly.
 

Keysplayr

Elite Member
Jan 16, 2003
21,211
50
91
Originally posted by: HurleyBird
Originally posted by: MODEL3
The thing is that in the 5870-4870 case the memory controller is the same 256-bit, while in the GTX380-GTX285 case the new memory controller has a smaller width than the old one. (GTX380 = 384-bit, GTX285 = 512-bit)

So you have to deduct a little bit of die space from the new design (GTX380).

Well, first, you're assuming that the memory controller is less dense than the rest of the chip, which may not be the case.

Secondly, you're leaving out the ECC bits of the memory controller. 72-bit (for ECC, vs. 64-bit without ECC) x 6 = a 432-bit memory controller. Performance-wise it still behaves like a 384-bit controller, but size-wise it's really 432-bit.

Really, this being a new (or at least for the most part new) architecture, we don't really have any idea whether Nvidia has improved transistor density or not. The fact that the chip doesn't have a 512-bit (576 with ECC) memory controller hints that it's probably closer to G80 than to GT200, but on the other hand Nvidia could have gone with the smaller memory controller not because of pad limits but because a larger interface simply had no performance benefit.

Sounds about right to me. With GDDR5, the 4870 seemed to do all right with a 256-bit memory interface. This continues with the 5870/5850. Now, with GDDR3, like G80 and GT200 used, there probably was a trade-off: wider bus, slower memory. Regardless, a 384-bit bus and GDDR5 should provide all the bandwidth needed and then some. Or 432-bit, as you say, with ECC for the parity bits.
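
To put rough numbers on "all the bandwidth needed": peak bandwidth is just bus width times per-pin data rate, so with a made-up GDDR5 data rate (Fermi's actual memory clocks haven't been disclosed) it looks like this:

```cpp
#include <cstdio>

int main() {
    // Assumed effective GDDR5 data rate per pin, in Gbit/s -- a guess, since
    // Fermi's memory clocks have not been disclosed.
    const double gbps_per_pin = 4.0;
    const int widths[] = {256, 384, 512};

    for (int bits : widths) {
        double gb_per_s = bits / 8.0 * gbps_per_pin;  // bits -> bytes, per second
        printf("%3d-bit bus @ %.1f Gbps/pin -> %.0f GB/s peak\n",
               bits, gbps_per_pin, gb_per_s);
    }
    return 0;
}
```

The per-pin rate is the assumption doing all the work here; swap in whatever GDDR5 speed you believe.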
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: HurleyBird
Well, first, you're assuming that the memory controller is less dense than the rest of the chip, which may not be the case.

I think that usually it is.
But anyway, this is not the point of my post.
With this assumption I deducted less than 15mm2 (x < 591mm2 - 576mm2).
My point was to show that if we use that logic, the result is not 650-700mm2 (like some members were saying).

------------------------------------------------------

Originally posted by: HurleyBird
Secondly, you're leaving out the ECC bits of the memory controller. 72-bit (for ECC, vs. 64-bit without ECC) x 6 = a 432-bit memory controller. Performance-wise it still behaves like a 384-bit controller, but size-wise it's really 432-bit.

Correct.
In essence the 384-bit is like a 432-bit.
What does this change?
Is it lower than 512-bit or not?
My assumption was that the memory controller in Fermi's case has a smaller width, which is true (384-bit is the figure Nvidia used, but yes, in essence it is like a 432-bit).

-----------------------------------------------------

Originally posted by: HurleyBird
Really, this being a new (or at least for the most part new) architecture, we don't really have any idea whether Nvidia has improved transistor density or not.

That's what i meant with:

Originally posted by: MODEL3
The thing is that with a new architecture, a new process technology, and the way that NV counts transistors (do the old numbers include cache transistors? does the 3 billion figure in this case include cache transistors?), it is too difficult to say what the die size will be.

-----------------------------------------------------

Originally posted by: HurleyBird
The fact that the chip doesn't have a 512-bit (576 with ECC) memory controller hints that it's probably closer to G80 than to GT200, but on the other hand Nvidia could have gone with the smaller memory controller not because of pad limits but because a larger interface simply had no performance benefit.

If I remember correctly, G80 is 484mm2.
So "closer to G80 than to GT200" only means less than 530mm2, which is a possibility.

The most probable thing is that Nvidia went for a 384-bit memory controller because the small performance benefit of a larger interface simply could not justify all the other disadvantages.


 

Keysplayr

Elite Member
Jan 16, 2003
21,211
50
91
Originally posted by: MODEL3
Originally posted by: HurleyBird
Well, first, you're assuming that the memory controller is less dense than the rest of the chip, which may not be the case.

I think that usually it is.
But anyway, this is not the point of my post.
With this assumption I deducted only 15mm2 (591mm2 - 576mm2).
My point was to show that if we use that logic, the result is not 650-700mm2.

------------------------------------------------------

Originally posted by: HurleyBird
Secondly, you're leaving out the ECC bits of the memory controller. 72-bit (for ECC, vs. 64-bit without ECC) x 6 = a 432-bit memory controller. Performance-wise it still behaves like a 384-bit controller, but size-wise it's really 432-bit.

Correct.
In essence the 384-bit is like a 432-bit.
What does this change?
Is it lower than 512-bit or not?
My assumption was that the memory controller in Fermi's case has a smaller width, which is true (384-bit is the figure Nvidia used, but yes, in essence it is like a 432-bit).

-----------------------------------------------------

Originally posted by: HurleyBird
Really, this being a new (or at least for the most part new) architecture, we don't really have any idea whether Nvidia has improved transistor density or not.

That's what i meant with:

Originally posted by: MODEL3
The thing is that with a new architecture, a new process technology, and the way that NV counts transistors (do the old numbers include cache transistors? does the 3 billion figure in this case include cache transistors?), it is too difficult to say what the die size will be.

-----------------------------------------------------

Originally posted by: HurleyBird
The fact that the chip doesn't have a 512-bit (576 with ECC) memory controller hints that it's probably closer to G80 than to GT200, but on the other hand Nvidia could have gone with the smaller memory controller not because of pad limits but because a larger interface simply had no performance benefit.

If I remember correctly, G80 is 484mm2.
So "closer to G80 than to GT200" only means less than 530mm2, which is a possibility.

The most probable thing is that Nvidia went for a 384-bit memory controller because the performance benefits of a larger interface simply could not justify all the other disadvantages.

You both could be right: no performance benefit, or couldn't justify all the other disadvantages. You guys are basically saying the same thing, just in different ways.
 

Forumpanda

Member
Apr 8, 2009
181
0
0
Originally posted by: Cogman
Originally posted by: Idontcare
Yes the lingo "native C++" is intended to be restricted to characterization of the compiler and not meant to be descriptive of the attributes of the underlying hardware.

At the same time it is clear that Keys misstated this but it is also clear what he meant/intended to communicate (C++ code can be compiled to run on Fermi)...Cogman I know you are intelligent enough to see what he meant and understanding enough to give some room for the possibility of a person simply accidentally conflating the technical terms.

What is important here? That Fermi can support the execution of compiled C++ code, or that some dude mistakenly equates the native aspects of this feature to those of the hardware? I program in C++ and I run programs that are compiled from Fortran (I have the source and I compile for my own unique hardware combos); personally, I am quite excited about the prospect of getting my hands on a Fermi compiler. Could be super fun. Getting all excitable over semantics gaffes, not so much.

I guess the reason it gets to me is that the dude touts it like some major architectural change is allowing this when, in reality, it is more of a driver update that really should be retroactive (Nvidia's GPGPUs of the past should also be able to run C++ code if Fermi can).

It makes it more of an issue to me because I start to question the legitimacy of the site that is touting the features. (I've never heard of BSN; they may be legitimate, but I just haven't heard of them. Thus, when a reviewer says stupid stuff, I lose faith in their site.)

*edit* I've probably just been hanging around P&N for too long. I didn't really mean for the response to be read as excited or anything like that; I was just trying to point out that the review statement was especially dumb. */edit*
I think the biggest reason people barf about it is that in this case the marketing people used terminology which has a specific technical meaning that is obviously different from what they really mean.

Most of the time forum posters (or at least I) couldn't care less about all the silly marketing speak coming from both camps, except for the rare occasions when
A) they show something that is actually cool/interesting, or
B) they screw up by saying stuff that is laughably wrong or misuse technical terms.

This just happens to be one of those times, so it gives us something to make fun of them about (and all the people who just nod and repeat the marketing talk as the truth).
 

Keysplayr

Elite Member
Jan 16, 2003
21,211
50
91
Originally posted by: Forumpanda
Originally posted by: Cogman
Originally posted by: Idontcare
Yes the lingo "native C++" is intended to be restricted to characterization of the compiler and not meant to be descriptive of the attributes of the underlying hardware.

At the same time it is clear that Keys misstated this but it is also clear what he meant/intended to communicate (C++ code can be compiled to run on Fermi)...Cogman I know you are intelligent enough to see what he meant and understanding enough to give some room for the possibility of a person simply accidentally conflating the technical terms.

What is important here? That Fermi can support the execution of compiled C++ code, or that some dude mistakenly equates the native aspects of this feature to those of the hardware? I program in C++ and I run programs that are compiled from Fortran (I have the source and I compile for my own unique hardware combos); personally, I am quite excited about the prospect of getting my hands on a Fermi compiler. Could be super fun. Getting all excitable over semantics gaffes, not so much.

I guess the reason it gets to me is that the dude touts it like some major architectural change is allowing this when, in reality, it is more of a driver update that really should be retroactive (Nvidia's GPGPUs of the past should also be able to run C++ code if Fermi can).

It makes it more of an issue to me because I start to question the legitimacy of the site that is touting the features. (I've never heard of BSN; they may be legitimate, but I just haven't heard of them. Thus, when a reviewer says stupid stuff, I lose faith in their site.)

*edit* I've probably just been hanging around P&N for too long. I didn't really mean for the response to be read as excited or anything like that; I was just trying to point out that the review statement was especially dumb. */edit*
I think the biggest reason people barf about it is that in this case the marketing people used terminology which has a specific technical meaning that is obviously different from what they really mean.

Most of the time forum posters (or at least I) couldn't care less about all the silly marketing speak coming from both camps, except for the rare occasions when
A) they show something that is actually cool/interesting, or
B) they screw up by saying stuff that is laughably wrong or misuse technical terms.

This just happens to be one of those times, so it gives us something to make fun of them about (and all the people who just nod and repeat the marketing talk as the truth).

OK, then let's straighten some stuff out.

So Fermi cannot run C++ natively on-chip? OK, I don't actually know what the difference between running on the chip or not is. Some are saying it needs specific hardware for this to actually happen. Could it be that what was meant by running C++ natively on the GPU is actually code created directly with C++ and not any other third-party compilers thereafter?
I've heard that Fermi can now be directly programmed on, much the same as a CPU can.
And I think C++ can be used. Visual Basic, Fortran, others.
This isn't my area, to be honest, so what I'm typing here is more or less a set of questions for those who understand this type of programming.

So don't throw any Molotovs my way; I'm just asking some questions. ;)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Keysplayr
OK, then let's straighten some stuff out.

So Fermi cannot run C++ natively on-chip? OK, I don't actually know what the difference between running on the chip or not is. Some are saying it needs specific hardware for this to actually happen. Could it be that what was meant by running C++ natively on the GPU is actually code created directly with C++ and not any other third-party compilers thereafter?
I've heard that Fermi can now be directly programmed on, much the same as a CPU can.
And I think C++ can be used. Visual Basic, Fortran, others.
This isn't my area, to be honest, so what I'm typing here is more or less a set of questions for those who understand this type of programming.

So don't throw any Molotovs my way; I'm just asking some questions. ;)

Keys, I don't know if you had a chance to read through the somewhat lengthy Fermi whitepaper I linked above, but it does have a page dedicated to discussing something Nvidia is calling "Nexus":

Introducing NVIDIA Nexus

NVIDIA Nexus is the first development environment designed specifically to support massively parallel CUDA C, OpenCL, and DirectCompute applications. It bridges the productivity gap between CPU and GPU code by bringing parallel-aware hardware source code debugging and performance analysis directly into Microsoft Visual Studio, the most widely used integrated application development environment under Microsoft Windows.

Nexus allows Visual Studio developers to write and debug GPU source code using exactly the same tools and interfaces that are used when writing and debugging CPU code, including source and data breakpoints, and memory inspection. Furthermore, Nexus extends Visual Studio functionality by offering tools to manage massive parallelism, such as the ability to focus and debug on a single thread out of the thousands of threads running parallel, and the ability to simply and efficiently visualize the results computed by all parallel threads.

http://redirectingat.com/?id=5...itectureWhitepaper.pdf

Also since some folks are concerned with the topic, here is another quote from the whitepaper:

Provide a machine-independent ISA for C, C++, Fortran, and other compiler targets.

I really do hope people find the time to plow through the whitepaper; I think it would do well to help educate some well-meaning folks on what exactly Nvidia is claiming Fermi brings to the table.

IMO it's exactly what Nvidia needed to do to keep their GPGPU trajectory on course. They know they have to close these gaps in capability relative to what Larrabee is capable of doing out of the box; they aren't idiots, and Fermi is yet another step in that direction.

I'm still really curious where AMD is going with their GPGPU strategy. They give me the impression that they are content with focusing on making great video cards and great CPUs, and that Fusion is the future: not GPGPU but CPU+GPU. If that is the case, then it makes a lot of sense why they haven't put as much emphasis on their GPUs being GPGPU as Nvidia has, IMO.