Anyone care to revisit an old discussion?

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,001
126
Originally posted by: Wreckage
Originally posted by: alyarb
i'm just coming in to say i seriously laughed out loud (i'm by myself) when I read wreckage's response to the OP. he didn't respond to a single question the OP presented.

just "GTX 295 is still the fastest. check out batman with physx. Radeon 4000 was always behind, and they are outsold 2:1."

heh, thank you wreckage. everyone appreciates it.

I answered both his questions. Why not address that instead of chasing after me? :confused:

My statements were in direct response to his summary. I think he and the other "zoners" just wanted a one-sided discussion.

"AMD GPU's are smaller than Nvidia GPU's but performance is similar.
Are Nvidia GPU's really more aimed towards GPGPU?

"


Both statements are false as any gaming benchmark will show you.

"AMD's GPU's are smaller than Nvidia's GPU's but performance is similar." - 100% true, not at all meant to be a 'zoner' comment, it's simply true.

"Are Nvidia GPU's really more aimed toward GPGPU" (than AMD GPU's). Seems like a pretty neutral question. I'd like to know what makes people say that Nvidia GPU's are more GPGPU capable than AMD GPU's. Nvidia's GT200 uses something close to 400 million more transistors for what amounts to very similar performance in the gaming world. Is this extra silicon used for GPGPU? If so, how? How do the changes both companies made to their next gen GPU's add to the abilities of those GPU's to act as GPGPU?

Thanks again for going out of your way to wreck a thread.
 

Wreckage

Banned
Jul 1, 2005
5,529
0
0
Originally posted by: SlowSpyder

- 100% true, not at all meant to be a 'zoner' comment, it's simply true.
The GTX 285 was the fastest GPU, and the GTX 295 is still the fastest card. "Similar performance" is not true, sorry.

"Are Nvidia GPU's really more aimed toward GPGPU" (than AMD GPU's).
Ah, now you add "than AMD GPUs". The original question read more like an implication that the GPGPU focus hampered their gaming performance.


I'd like to know what makes people say that Nvidia GPU's are more GPGPU capable than AMD GPU's. Nvidia's GT200 uses something close to 400 million more transistors for what amounts to very similar performance in the gaming world. Is this extra silicon used for GPGPU? If so, how? How do the changes both companies made to their next gen GPU's add to the abilities of those GPU's to act as GPGPU?
That's why I pointed to PhysX, Folding@home, video transcoding (without the CPU), the hundreds of CUDA apps, etc., etc. Enough proof to fill a galaxy, really.
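
For anyone who hasn't looked at what those CUDA apps actually do, here's a minimal made-up sketch of the pattern nearly all of them share: copy the data to the card, run one thread per element, copy the result back. None of this is from any real application; all the names are invented.

```cuda
// Toy SAXPY sketch: y = a*x + y across a million elements on the GPU.
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;    // one element per thread
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    float *dx, *dy;
    cudaMalloc((void **)&dx, bytes);                  // allocate on the card
    cudaMalloc((void **)&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy); // ~4096 blocks of 256 threads
    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

    printf("y[0] = %f\n", hy[0]);                     // expect 4.0
    cudaFree(dx); cudaFree(dy); free(hx); free(hy);
    return 0;
}
```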

Thanks again for going out of your way to wreck a thread.
By wreck, you mean give an opposing viewpoint. That's what an open discussion is all about.
 

alyarb

Platinum Member
Jan 25, 2009
2,444
0
76
most people consider performance within 5% of the leader to be similar, and the difference negligible given the huge gap in cost, die area, etc. AMD's architecture is simply more efficient.

the number of transistors per TPC spent on CUDA (the only GPGPU API supported by GT200) was not addressed, and there is some real interest in this question. but you just keep banging out the same old rhetoric about brute speed, physx, and sales. there are no technical/engineering posts from you, just unconditional love.

this is coming from a centrino 2 laptop on a GMA4500. i'm a paula deen fanboy and that's about it.
 

SSChevy2001

Senior member
Jul 9, 2008
774
0
0
Originally posted by: Wreckage
That's why I pointed to PhysX, Folding@home, video transcoding (without the cpu), the hundreds of CUDA apps, etc, etc. Enough proof to fill a galaxy really.
Actually you have it backwards about the video encoding. Nvidia's solution depends more on the CPU.

Since the ATI Radeon HD 4770 was faster in the benchmarks, it goes to show more work is being offloaded and done on the GPU than the CPU. It seems that on the NVIDIA solution more of the work is being dumped onto the processor, and the dual-core AMD Athlon X2 4850e isn't that quick.
http://www.legitreviews.com/article/978/2/
 

alyarb

Platinum Member
Jan 25, 2009
2,444
0
76
i won't consider transcoding on the GPU, regardless of who is faster, until the output quality is up to the same level as CPU encoders.
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
Originally posted by: SlowSpyder
Thanks again for going out of your way to wreck a thread.

Hence, his nickname, Wreckage aka nVckage aka Crapckage aka derailckage aka FUDckage etc etc etc...

Now, back on topic: the stream processors on nVidia are simply bigger than their ATi counterparts. nVidia's approach is more hardware based, while ATi's needs more driver optimization to perform at its fullest. That's why when the HD 4870 debuted it was as fast as, or slightly slower than, the GTX 260+, and look at it now: even in Crysis the HD 4870 is slightly ahead of the GTX 260, and the HD 4890 = GTX 285, a much more expensive card, lol.

http://www.anandtech.com/video/showdoc.aspx?i=3658&p=5

So as the drivers mature, ATi's performance in games and GPGPU software will outshine the more hardware-oriented nVidia counterpart.

As it happened, the HD 4870 won 4 of the 8 games in that review, tied 2, and lost only 1 against the GTX 260+.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Originally posted by: Idontcare
Originally posted by: SlowSpyder
Cliffs:
AMD GPU's are smaller than Nvidia GPU's but performance is similar.

It seems to me that you are assuming both architectures have been equally optimized in their respective implementations when making comparisons that involve things like die-size.

Let me use an absurd example to show what I mean.

Suppose NV's decision makers decided they were going to fund GT200 development but gave the project manager the following constraints: (1) development budget is $1m, (2) timeline budget is 3 months, and (3) performance requirements were that it be on-par with anticipated competition at time of release.

Now suppose AMD's decision makers decided they were going to fund RV770 development but gave the project manager the following constraints: (1) development budget is $10m, (2) timeline budget is 30 months, (3) performance requirements were that it be on-par with anticipated competition at time of release, and (4) make it fit into a small die so as to reduce production costs.

Now in this absurd example the AMD decision makers are expecting a product that meets the stated objectives, and having resourced it 10x more so than NV did their comparable project, one would expect the final product to be more optimized (fewer xtors, higher xtor density, smaller die, etc) than NV's.

In industry jargon the concepts I am referring to here are called R&D Efficiency and Entitlement.

Now of course we don't know whether NV resourced the GT200 any less than AMD resourced the RV770, and likewise for Fermi vs. Cypress. But what we can't conclude from die-size and xtor-density comparisons is that one should be superior to the other in those metrics, not without access to the budgetary information that factored into the project-management decisions and tradeoff downselection.

This is no different than comparing, say, AMD's PhII X4 versus the nearly identical-in-die-size Bloomfield. You could argue that Bloomfield shows AMD should/could have implemented PhII X4 as a smaller die, or that they should/could have made PhII X4 performance higher (given that Intel did)...or you could argue that AMD managed to deliver 90% of the performance while only spending 25% of the coin.

It's all how you want to evaluate the metrics of success in terms of entitlement or R&D efficiency (spend 25% the budget and you aren't entitled to expect your engineers to deliver 100% the performance, 90% the performance is pretty damn good).

So we will never know how much of GT200's die size is attributable to GPGPU constraints versus simply being the result of timeline and budgetary tradeoffs made at NVIDIA's project-management level, versus how similar tradeoff decisions were made at AMD's project-management level.

good point, but if anything here nvidia is the one with 4x the R&D budget. how bad would it be to come in 6 months late, be larger, AND cost 4x the R&D budget? that could be the ultimate gpu trifecta.
 

Genx87

Lifer
Apr 8, 2002
41,095
513
126
Originally posted by: Vertibird
Originally posted by: Wreckage
Originally posted by: SlowSpyder
AMD GPU's are smaller than Nvidia GPU's but performance is similar.

Are Nvidia GPU's really more aimed towards GPGPU?
Not really. NVIDIA was faster in games than ATI, in fact the GTX295 is still the fastest card available. The HD4xxx series was always behind.

The extra GPGPU capability is just icing on top of the gaming cake. Look how well Batman AA plays when PhysX is enabled.

With many games playable on even mid range cards, you need to offer your customers something more. I think this is why NVIDIA outsells ATI 2 to 1.

If PhysX is so great why don't they let customers mix cards on mobos? (ie, we buy ATI cards for the superior gaming value, but then mix it with a Nvidia card for Physx)

Or maybe Nvidia doesn't want us to use Lucid hydra? But why would they care since they don't even want to make chipsets anymore?

I can think of support issues being a real pita. Especially trying to fix an issue with a competitor's card that doesn't support PhysX.
 

SSChevy2001

Senior member
Jul 9, 2008
774
0
0
Originally posted by: Genx87
Originally posted by: Vertibird
Originally posted by: Wreckage
Originally posted by: SlowSpyder
AMD GPU's are smaller than Nvidia GPU's but performance is similar.

Are Nvidia GPU's really more aimed towards GPGPU?
Not really. NVIDIA was faster in games than ATI, in fact the GTX295 is still the fastest card available. The HD4xxx series was always behind.

The extra GPGPU capability is just icing on top of the gaming cake. Look how well Batman AA plays when PhysX is enabled.

With many games playable on even mid range cards, you need to offer your customers something more. I think this is why NVIDIA outsells ATI 2 to 1.

If PhysX is so great why don't they let customers mix cards on mobos? (ie, we buy ATI cards for the superior gaming value, but then mix it with a Nvidia card for Physx)

Or maybe Nvidia doesn't want us to use Lucid hydra? But why would they care since they don't even want to make chipsets anymore?

I can think of support issues being a real pita. Especially trying to fix an issue with a competitor's card that doesn't support PhysX.
The Nvidia card is just doing PhysX calculations, not rendering.

That's like AMD saying well we don't support Nvidia GPUs with our CPUs, because it might cause problems.

Funny how it works just fine once the limitations are removed by a patch.
http://www.youtube.com/watch?v=Fgp1mYRYLS0
 

dev0lution

Senior member
Dec 23, 2004
472
0
0
Originally posted by: SlowSpyder

Cliffs:
AMD GPU's are smaller than Nvidia GPU's but performance is similar.
Are Nvidia GPU's really more aimed towards GPGPU?

Theoretical performance has always favored AMD, if you believe the marketing slides. However, there are plenty of examples where this theoretical performance still hasn't been realized - care to ask F@H 4xxx card owners? Or how about where high performance (read: speed) is realized but the output is unacceptable (Google AVIVO Transcoder reviews)? Sadly, this transcoding library was what AMD supplied to many third parties, so their "stream" features end up with similar results.

Or you could pull some havok and bullet physics benches... oh nevermind.

Yes, both software and hardware are important for GPU computing. Since AMD hasn't really managed to fix the software problem, it's a bit hard to tell whether their past generation's hw is limited by the architecture or whether a lack of sw limits their ability to tap the hw's theoretical performance. As for the "standards"-based GPU compute flavors available (OpenCL & DirectCompute), AMD has already stated that anything older than the 4xxx generation won't support them - which would lead one to believe there is some hw dependence in the AMD architecture.

Maybe we'll see how that plays out this cycle...
 

dev0lution

Senior member
Dec 23, 2004
472
0
0
Pointing to a year-old review doesn't really illuminate how much traction each side has been able to gain since then.

"Even while staying still at 4:1 with RV770, AMD's ratio is still more aggressively geared towards compute than NVIDIA's is."

This has not been borne out in real world compute applications.

Again:

"AMD offers an advantage in the SPMD paradigm in that it maintains a global store (present since RV670) where all threads can share result data globally if they need to (this is something that NVIDIA does not support). This feature allows more flexibility in algorithm implementation and can offer performance benefits in some applications."

Over a year later, this theory still hasn't shown up in a real-world app or credible review. Again, AMD looks impressive on paper, but in terms of real-world compute adoption, or even getting the theoretical performance out of the hw, there hasn't been a whole lot of progress beyond the ppt's.
 

cusideabelincoln

Diamond Member
Aug 3, 2008
3,268
11
81
Filling the execution units of each to capacity is a challenge but looks to be more consistent on NVIDIA hardware, while in the cases where AMD hardware is used effectively (like Bioshock) we see that RV770 surpasses GTX 280 in not only performance but power efficiency as well. Area efficiency is completely owned by AMD, which means that their cost for performance delivered is lower than NVIDIA's (in terms of manufacturing -- R&D is a whole other story) since smaller ICs mean cheaper to produce parts.

While shader/kernel length isn't as important on GT200 (except that the ratio of FP and especially multiply-add operations to other code needs to be high to extract high levels of performance), longer programs are easier for AMD's compiler to extract ILP from. Both RV770 and GT200 must balance thread issue with resource usage, but RV770 can leverage higher performance in situations where ILP can be extracted from shader/kernel code which could also help in situations where the GT200 would not be able to hide latency well.

We believe based on information found on the CUDA forums and from some of our readers that G80's SPs have about a 22 stage pipeline and that GT200 is also likely deeply piped, and while AMD has told us that their pipeline is significantly shorter than this they wouldn't tell us how long it actually is. Regardless, a shorter pipeline and the ability to execute one wavefront over multiple scheduling cycles means massive amounts of TLP isn't needed just to cover instruction latency. Yes massive amounts of TLP are needed to cover memory latency, but shader programs with lots of internal compute can also help to do this on RV770.

All of this adds up to the fact that, despite the advent of DX10 and the fact that both of these architectures are very good at executing large numbers of independent threads very quickly, getting the most out of GT200 and RV770 requires vastly different approaches in some cases. Long shaders can benefit RV770 due to increased ILP that can be extracted, while the increased resource use of long shaders may mean less threads can be issued on GT200 causing lowered performance. Of course going the other direction would have the opposite effect. Caches and resource availability/management are different, meaning that tradeoffs and choices must be made in when and how data is fetched and used. Fixed function resources are different and optimization of the usage of things like texture filters and the impact of the different setup engines can have a large (and differing with architecture) impact on performance.

We still haven't gotten to the point where we can write simple shader code that just does what we want it to do and expect it to perform perfectly everywhere. Right now it seems like typical usage models favor GT200, while relative performance can vary wildly on RV770 depending on how well the code fits the hardware. G80 (and thus NVIDIA's architecture) did have a lead in the industry for months before R600 hit the scene, and it wasn't until RV670 that AMD had a real competitor in the market place. This could be part of the reason we are seeing fewer titles benefiting from the massive amount of compute available on AMD hardware. But with this launch, AMD has solidified their place in the market (as we will see the 4800 series offers a lot of value), and it will be very interesting to see what happens going forward.

How is this article irrelevant? It's not entirely so. One has to understand what has happened and what is happening before attempting to predict what will happen.
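
To put the ILP point from that quote in concrete terms, here's a made-up pair of kernels. CUDA syntax is used only because it's readable; nothing here comes from a real app, and the same idea applies to the shader code AMD's compiler actually sees.

```cuda
// Dependent chain: every operation needs the previous result, so a 5-wide
// VLIW unit like RV770's can only keep one of its five slots busy per step.
__global__ void dependent_chain(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float a = in[i];
    a = a * a + 1.0f;
    a = a * a + 1.0f;
    a = a * a + 1.0f;
    a = a * a + 1.0f;
    out[i] = a;
}

// Independent work: these four calculations don't depend on each other, so a
// compiler that packs independent ops into one instruction word (AMD's
// approach) can fill most of the five slots each cycle. NVIDIA's scalar SPs
// see much less difference between the two versions.
__global__ void independent_work(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float v = in[i];
    float a = v * 2.0f + 1.0f;
    float b = v * 3.0f + 1.0f;
    float c = v * 4.0f + 1.0f;
    float d = v * 5.0f + 1.0f;
    out[i] = a + b + c + d;
}
```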
 

dev0lution

Senior member
Dec 23, 2004
472
0
0
Originally posted by: cusideabelincoln

Right now it seems like typical usage models favor GT200, while relative performance can vary wildly on RV770 depending on how well the code fits the hardware. G80 (and thus NVIDIA's architecture) did have a lead in the industry for months before R600 hit the scene, and it wasn't until RV670 that AMD had a real competitor in the market place. This could be part of the reason we are seeing fewer titles benefiting from the massive amount of compute available on AMD hardware. But with this launch, AMD has solidified their place in the market (as we will see the 4800 series offers a lot of value), and it will be very interesting to see what happens going forward.

I'm not disputing the relevance of the article at the time it was written, but even then the author concedes that 4xxx hardware performance varies wildly in real-world applications. I'm just pointing out that there hasn't been a flurry of situations since then that have contradicted that point.

Note that the "offers a lot of value" hasn't been disputed, but you could say that the "titles benefiting from the massive amount of compute available on AMD hardware" could be. Very little since this architecture review was written have changed the original conclusions. If anything, the GT200 architecture has gained in acceptance in the GPU compute market vs the equivalent AMD hw.

 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: Astrallite
Is Wreckage an Nvidia Focus Group member or is he just happy to be here?

He isn't a focus group member. I often believe that he is part of an AMD campaign for reverse viral marketing though. Makes more sense to me as he consistently goes overboard in an obvious fashion. So, the more you "dislike" him and his posts, the more you may associate Nvidia with him. Might be reverse psychological warfare. :D

OT: GT200, for example, has about 20% of its transistor budget devoted to GPGPU purposes.
I mentioned this a few times in the past, but I had discussions with Nvidia engineers about this. 20% of the transistor budget for GPGPU, and they were fighting for more. This 20% cannot be utilized for graphics. This, as well as its 512-bit interface, probably contributed significantly to the overall die size.
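
For rough scale: 20% of GT200's roughly 1.4 billion transistors works out to somewhere around 280 million transistors that can't help with graphics, which by itself accounts for a good chunk of the transistor gap SlowSpyder brought up earlier.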

Fermi will most likely devote quite a bit more of its transistor budget to GPGPU. The larger L1, L2, and shared caches alone would take up more real estate. And it appears that Fermi does have a 512-bit bus for future use, but will initially only utilize 3/4 of it at 384-bit.
 

cusideabelincoln

Diamond Member
Aug 3, 2008
3,268
11
81
Originally posted by: Keysplayr
Originally posted by: Astrallite
Is Wreckage an Nvidia Focus Group member or is he just happy to be here?

He isn't a focus group member. I often believe that he is part of an AMD campaign for reverse viral marketing though. Makes more sense to me as he consistently goes overboard in an obvious fashion. So, the more you "dislike" him and his posts, the more you may associate Nvidia with him. Might be reverse psychological warfare. :D

Just like what you're doing right now, RIGHT!?

Icee hot judith hair.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: cusideabelincoln
Originally posted by: Keysplayr
Originally posted by: Astrallite
Is Wreckage an Nvidia Focus Group member or is he just happy to be here?

He isn't a focus group member. I often believe that he is part of an AMD campaign for reverse viral marketing though. Makes more sense to me as he consistently goes overboard in an obvious fashion. So, the more you "dislike" him and his posts, the more you may associate Nvidia with him. Might be reverse psychological warfare. :D

Just like what you're doing right now, RIGHT!?

Icee hot judith hair.

What am I doing now?
 

HurleyBird

Platinum Member
Apr 22, 2003
2,670
1,250
136
Rule 1:

Don't feed the troll.

No matter how stupid his comment, or how brilliant your response, Wreckage wins the moment you decide to reply to one of his inflammatory posts.
Ignoring a troll has never caused a thread to derail, guys.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: HurleyBird
Rule 1:

Don't feed the troll.

No matter how stupid his comment, or how brilliant your response, Wreckage wins the moment you decide to reply to one of his inflammatory posts.
Ignoring a troll has never caused a thread to derail, guys.

Seconded. Ignore from this point on?
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
Why is GT200 so large?

512-bit memory interface, 1.4 billion transistors, 240 SPs/80 TMUs/32 ROPs (that's roughly two G92s) on an old 65nm process. G92 was roughly 334 mm² big. I'm not sure how big an impact the GPGPU functionality had on the transistor budget/die size, but you can't discount the fact that GT200 was almost double G80/G92.
 

Genx87

Lifer
Apr 8, 2002
41,095
513
126
Originally posted by: SSChevy2001
Originally posted by: Genx87
Originally posted by: Vertibird
Originally posted by: Wreckage
Originally posted by: SlowSpyder
AMD GPU's are smaller than Nvidia GPU's but performance is similar.

Are Nvidia GPU's really more aimed towards GPGPU?
Not really. NVIDIA was faster in games than ATI, in fact the GTX295 is still the fastest card available. The HD4xxx series was always behind.

The extra GPGPU capability is just icing on top of the gaming cake. Look how well Batman AA plays when PhysX is enabled.

With many games playable on even mid range cards, you need to offer your customers something more. I think this is why NVIDIA outsells ATI 2 to 1.

If PhysX is so great why don't they let customers mix cards on mobos? (ie, we buy ATI cards for the superior gaming value, but then mix it with a Nvidia card for Physx)

Or maybe Nvidia doesn't want us to use Lucid hydra? But why would they care since they don't even want to make chipsets anymore?

I can think of support issues being a real pita. Especially trying to fix an issue with a competitor's card that doesn't support PhysX.
The Nvidia card is just doing PhysX calculations, not rendering.

That's like AMD saying well we don't support Nvidia GPUs with our CPUs, because it might cause problems.

Funny how it works just fine once the limitations are removed by a patch.
http://www.youtube.com/watch?v=Fgp1mYRYLS0

Except what could a CPU calculate that would cause a rendering issue? So it really isn't the same now, is it?