Architectural Direction of GPUs

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
What about MilkyWay@Home? Doesn't that get ridiculous numbers for ati?

Yes it does, and that goes back to what I was saying about the loose comparison to Cell. If you have a code base that is very friendly to a Vec5 layout and optimize it properly, it will be very fast. If you don't have both of those factors at play, then the situation changes significantly.

Isn't folding at home blatantly unoptimized for newer ati hardware?

It may very well be true, but if the code base is poorly suited to a Vec5 layout, then there may be little point in them taking the time to improve it, as the performance increase isn't likely to be worth the effort. Their interest is simply in getting the packets crunched, and they do have limited resources. More than likely, MW@H will remain the distributed platform of choice for ATi until they head in a direction other than Vec5 for their shader layout.
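As a loose illustration of why code shape matters so much on a Vec5 layout, here is a toy C sketch (plain CPU code with made-up function names, not actual GPU kernel code): the first loop is the kind of wide, independent math a VLIW5 compiler can pack into full instruction bundles, while the second is a serial dependency chain that leaves most of the five slots empty.

```c
#include <stddef.h>

/* Vec5-friendly: four independent multiply-adds per iteration
 * (assumes n is a multiple of 4). A VLIW5 compiler can co-issue
 * independent operations, keeping most of its five slots busy. */
void vec5_friendly(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        out[i + 0] = a[i + 0] * b[i + 0] + 1.0f;
        out[i + 1] = a[i + 1] * b[i + 1] + 1.0f;
        out[i + 2] = a[i + 2] * b[i + 2] + 1.0f;
        out[i + 3] = a[i + 3] * b[i + 3] + 1.0f;
    }
}

/* Vec5-hostile: every step depends on the previous result, so the
 * operations cannot be packed and most slots sit idle each cycle. */
float vec5_hostile(const float *a, size_t n)
{
    float acc = 1.0f;
    for (size_t i = 0; i < n; i++)
        acc = acc * a[i] + 1.0f;   /* serial dependency chain */
    return acc;
}
```

By this account, the MW@H workload is closer to the first shape; code that looks like the second gains little from the Vec5 hardware no matter how much time is spent optimizing it.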

Isn't Nvidia's huge lead in Metro 2033 due to an AA bug?

It doesn't appear that way based on what Kyle reported. Both ATi and nV exhibit the same issue when MSAA is enabled, and nV is still considerably faster. It appears that the bug is just that, a bug.

Is Unigine anything other than a controversial benchmark at this point?

Perhaps it is, but using ATi's SDK nVidia has an even larger performance advantage.

http://ixbtlabs.com/articles3/video/gf100-2-p11.html

Using ATi's SDK, the 480 is close to four times faster than the 5970. That is using ATi's own benchmark. Also, Heaven only really became controversial when the company that was dominating it fell far behind the other. From an analytical point of view, if anything, the bench is far more friendly to ATi than any of the other tessellation benches we have available, including ATi's own.
 

ugaboga232

Member
Sep 23, 2009
144
0
0
But now you can't say one architecture is "better" than the other. We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
But now you can't say one architecture is "better" than the other.

I haven't said one is better than the other, I've pointed out the strengths and weaknesses of each.

We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

In terms of GPGPU uses, the parts aren't even close. Under a very narrow set of parameters the 5xxx parts can perform extremely well. The GF100 parts, by comparison, are very fast under all circumstances. The penalty for this is the increased die size, power draw, and heat. Saying one is better than the other depends on where your priorities are, but there are very clearly defined, distinct differences between the two parts on this front.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.

While I wouldn't disagree with any of that, I would say that the bench that shows the biggest disparity in tessellation performance between the two parts is ATi's own bench. nVidia's bench is actually far more representative of what we see in games, and has ATi's parts performing relatively closer than the bench ATi wrote themselves.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
At the end of the day it seems GF100 is a nice architecture but unfortunately broken by the manufacturing process.
 

AzN

Banned
Nov 26, 2001
4,112
2
0
A balanced card is best. The 58xx series is just that: a well-balanced part that isn't particularly limited by bandwidth, fillrate, or processing power for that matter.

GF100 pushes a lot of color fill, bandwidth, processing power, and tessellation but lacks texture fillrate. So you get results that barely squeeze by the 58xx series when it should be kicking the shit out of it. One thing it does do better than the 5xxx series is minimum frame rates, due to its bandwidth.

Tessellation is still in its infancy. Putting so much emphasis on it before games make use of it serves no real purpose, nor will this generation of hardware be enough for the next generation of tessellation games to come.

Nvidia is in big trouble. They know it. ATI can easily make one gigantic chip like GF100 and spank the shit out of it. Nvidia really needs to go back to the drawing board, stop putting so much emphasis on GPU computing, and get back to their roots.
 

badb0y

Diamond Member
Feb 22, 2010
4,015
30
91
Isn't ATi also going to change its architecture with the Northern Islands/Southern Islands cards?
 

BFG10K

Lifer
Aug 14, 2000
22,709
2,971
126
Regarding texture fillrate, things aren't quite cut and dried there. Yes, according to the specs derived from TMU count and clock speed, the fillrate is lower. But nVidia claims texturing performance is actually higher overall compared to the GT200 because of better caching and TMU arrangement. I covered this in ABT's Fermi architecture analysis.
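For reference, those paper numbers fall out of a simple TMU-count times clock-speed calculation; here is a quick C sketch using the commonly published launch specs:

```c
#include <stdio.h>

/* Theoretical texel fillrate = TMU count x core clock.
 * Figures below are the commonly published launch specs. */
static double gtexels(int tmus, double core_mhz)
{
    return tmus * core_mhz / 1000.0;  /* GTexels/s */
}

int main(void)
{
    printf("GTX 285: %.1f GTexels/s\n", gtexels(80, 648.0)); /* ~51.8 */
    printf("GTX 480: %.1f GTexels/s\n", gtexels(60, 700.0)); /* ~42.0 */
    printf("HD 5870: %.1f GTexels/s\n", gtexels(80, 850.0)); /* ~68.0 */
    return 0;
}
```

On paper the GTX 480 actually trails the GTX 285, which is exactly why nVidia points to caching and TMU arrangement rather than raw throughput.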

Perhaps the drivers simply aren’t tuned in that area yet. Also we have yet to determine if the games where the GTX480 doesn’t do well are actually texture bound.

As for architecture in general, nVidia should’ve released a GTX480 "gamer edition" with the fluff that gamers don’t need removed (e.g. DP). The transistor savings could go towards a smaller die size, better thermals, and better yields (hence better prices). ATi’s 57xx (and lower) parts don’t have DP.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
But nVidia claims texturing performance is actually higher overall compared to the GT200 because of better caching and TMU arrangement.

Which is obvious in the numbers we have seen: clearly it bests the 285 and even the 295, but that isn't what it has to compete with. The close-to-linear drop-off based on pixel requirements is more than a bit odd (36%, 36%, 37%, 34% in the games I checked at AT, moving from 19x12 to 25x16).
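For context, a back-of-the-envelope check of what a purely pixel-bound drop over that resolution jump would look like (a quick C sketch):

```c
#include <stdio.h>

int main(void)
{
    double lo = 1920.0 * 1200.0;   /* 2,304,000 pixels */
    double hi = 2560.0 * 1600.0;   /* 4,096,000 pixels */

    /* If frame rate scaled purely with pixel count, moving up in
     * resolution would cost 1 - lo/hi of the original frame rate. */
    printf("pixel increase:   %.0f%%\n", (hi / lo - 1.0) * 100.0);  /* ~78% */
    printf("pixel-bound drop: %.0f%%\n", (1.0 - lo / hi) * 100.0);  /* ~44% */
    return 0;
}
```

A part that was completely pixel-bound would lose roughly 44% over that jump, which puts the consistent ~36% figures much nearer to fill-bound scaling than to flat, shader-bound scaling.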

Also we have yet to determine if the games where the GTX480 doesn’t do well are actually texture bound.

A game that is heavily shader bound on one of the other parts could very easily be nigh entirely texel bound on the Fermi parts.

As for architecture in general, nVidia should’ve released a GTX480 "gamer edition" with the fluff that gamers don’t need removed (e.g. DP).

That is a huge reworking of the die for relatively small die-space gains, and it also destroys their design philosophy in terms of binning. I understand where you are coming from, but a more realistic angle would have been to cut the shader hardware roughly in half. Based on current games, that seems about where it should be to match up with what we currently consider balanced.

ATi’s 57xx (and lower) parts don’t have DP.

But their higher-end parts do, and the GeForce is going to be used in certain workstation applications (CS5 as a general example).
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
But now you can't say one architecture is "better" than the other. We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.

Well, you can say one is better than the other with some accompanying statements.

NV seems to have the better architecture going forward (obviously you can't say that as an absolute fact unless you can see the future).
ATI has the better architecture from an efficiency standpoint at the moment.

The main thing is that ATI are going to have to develop their architecture more than NV, while NV are going to be able to refine theirs, at least in the short term (until we see a further fundamental change).
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Definitely, Ben. GF100 might have higher texture efficiency than the GTX 285 while having fewer TMUs, but it just can't compete with the 58xx in this department, and that's what it's competing with, not the GTX 285.

Again FILLRATE is still king.

[Image: 3dm-texture.gif — 3DMark texture fillrate chart]
 

AzN

Banned
Nov 26, 2001
4,112
2
0
I don't particularly agree with this. Just like Ben and others have mentioned, it isn't as simple as that. No reason to rehash what has already been said.

You don't have to agree that fillrate is king, but time after time history has repeated itself: a card with higher fillrate and more bandwidth has been crowned king.
 

Mr. Pedantic

Diamond Member
Feb 14, 2010
5,039
0
76
But that doesn't necessarily mean that fillrate is king, does it? It just means that the card that happens to have the highest fillrate is king.
 

ArchAngel777

Diamond Member
Dec 24, 2000
5,223
61
91
You don't have to agree that fillrate is king, but time after time history has repeated itself: a card with higher fillrate and more bandwidth has been crowned king.

Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

I have $2,000 set aside in my computer fund and could easily afford a 30" monster display, but have absolutely no desire to upgrade to such a display. I absolutely love my 22" at 1680x1050. I have owned 30", 24", 22", etc... I like the 22" best.

But not to go on a tangent, I guess where I am going is that even though the 5870 in theory should take the advantage as fill rate demands increase, it currently doesn't at the highest resolution available on the market (2560x1600). So perhaps if a display with an even higher resolution comes out it could, in theory, take the crown... but that is a non-issue.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.
 

ArchAngel777

Diamond Member
Dec 24, 2000
5,223
61
91
But that doesn't necessarily mean that fillrate is king, does it? It just means that the card that happens to have the highest fillrate is king.

Which is still demonstrably false, with all the reviews showing the GTX 480 beating the 5870 in games while having the theoretical fillrate disadvantage.
 

Anarchist420

Diamond Member
Feb 13, 2010
8,645
0
76
www.facebook.com
I think what they ought to do is have double- or quad-precision floating-point shaders emulate the ROPs and depth/stencil units, and have fixed-function TMUs. Example:
32 TMUs @ 3.0 GHz
3,200 stream processors @ 3.0 GHz.

Raytracing would also be a reality that way.
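For a sense of what emulating the ROPs and depth/stencil units in shader code would involve per fragment, here is a toy C sketch (CPU-side pseudocode with made-up names, not real shader code) of the read-modify-write that dedicated ROP hardware currently handles:

```c
#include <stdint.h>

typedef struct { float r, g, b, a; } rgba;

/* Toy sketch of a shader-emulated ROP: a depth test followed by a
 * fixed-function-style alpha blend (src*a + dst*(1-a)), done as an
 * explicit read-modify-write on the framebuffer arrays. */
void emulated_rop(rgba *color_buf, float *depth_buf, int idx,
                  rgba src, float src_depth)
{
    if (src_depth >= depth_buf[idx])      /* depth test: less-than passes */
        return;
    depth_buf[idx] = src_depth;           /* depth write */

    rgba dst = color_buf[idx];            /* framebuffer read */
    rgba out = {
        src.r * src.a + dst.r * (1.0f - src.a),
        src.g * src.a + dst.g * (1.0f - src.a),
        src.b * src.a + dst.b * (1.0f - src.a),
        1.0f
    };
    color_buf[idx] = out;                 /* framebuffer write */
}
```

Done in shaders, every one of those framebuffer reads and writes comes out of general ALU time and memory bandwidth, which is the trade-off such a design would have to absorb.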
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.

I don't think you can dictate what is or isn't correct when, time after time, this has held true.

The 58xx has more texture fillrate, but GF100 has much more pixel fillrate and bandwidth. Now if GF100 had less pixel fillrate and bandwidth, guess which one would be the clear victor?
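For reference, the paper numbers behind that claim, from the published launch specs (a quick C sketch):

```c
#include <stdio.h>

/* Paper pixel fillrate (ROPs x core clock) and memory bandwidth
 * (bus width x effective memory rate), per the published specs. */
int main(void)
{
    /* HD 5870: 32 ROPs @ 850 MHz, 256-bit bus @ 4800 MT/s */
    printf("HD 5870: %.1f GPixels/s, %.1f GB/s\n",
           32 * 850.0 / 1000.0, 256 / 8 * 4800.0 / 1000.0);
    /* GTX 480: 48 ROPs @ 700 MHz, 384-bit bus @ 3696 MT/s */
    printf("GTX 480: %.1f GPixels/s, %.1f GB/s\n",
           48 * 700.0 / 1000.0, 384 / 8 * 3696.0 / 1000.0);
    return 0;
}
```

That works out to roughly a 24% pixel-fill and 15% bandwidth edge for the GTX 480, against the 5870's roughly 62% texel-fill edge.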

Fact: core overclocking nets you the best results as long as it's not constrained by bandwidth or SPs.
 

deimos3428

Senior member
Mar 6, 2009
697
0
0
The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

Could the reason for the relatively large fill rate on the 5xxx be Eyefinity support? Even at a lowly 1680x1050, three monitors would require ~30% more pixels than a single one at 2560x1600.
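Checking the arithmetic behind that claim (a quick sketch):

```c
#include <stdio.h>

int main(void)
{
    double eyefinity = 3.0 * 1680 * 1050;  /* 5,292,000 pixels */
    double single_30 = 2560.0 * 1600;      /* 4,096,000 pixels */

    /* Three 1680x1050 panels vs one 2560x1600 panel. */
    printf("extra pixels: %.0f%%\n",
           (eyefinity / single_30 - 1.0) * 100.0);  /* ~29% */
    return 0;
}
```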
 

blanketyblank

Golden Member
Jan 23, 2007
1,149
0
0
Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

I have $2,000 set aside in my computer fund and could easily afford a 30" monster display, but have absolutely no desire to upgrade to such a display. I absolutely love my 22" at 1680x1050. I have owned 30", 24", 22", etc... I like the 22" best.

But not to go on a tangent, I guess where I am going is that even though the 5870 in theory should take the advantage as fill rate demands increase, it currently doesn't at the highest resolution available on the market (2560x1600). So perhaps if a display with an even higher resolution comes out it could, in theory, take the crown... but that is a non-issue.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.

I disagree with this statement. I believe there are very few people with a single 1920x1200 monitor who would be, or should be, in the market for a $500 video card, since it is overkill for their needs. So in all honesty, 2560x1600 or dual-screen should be the standard by which these cards are judged.

1920x1200 might be a good benchmark for the step down to the $300-350 range where the 5850 and 470 compete, however.

I wonder how many of the advantages of NV's architecture could be negated by ATI simply overclocking and binning their own chips further. The 5870 is known to overclock pretty well, and the Asus Matrix is supposed to add additional power circuitry and hardware to the PCB to help overclocking. Just how much extra clockspeed could ATI get out of their chips if they were willing to accept quantities comparable to nVidia's and undergo a stricter binning process?
In other words, what is the full potential of ATI's current architecture?
 

Skurge

Diamond Member
Aug 17, 2009
5,195
1
71
Could the reason for the relatively large fill rate on the 5xxx be Eyefinity support? Even at a lowly 1680x1050, three monitors would require ~30% more pixels than a single one at 2560x1600.

The AnandTech article about the RV870 said Eyefinity was kept a secret from most of the guys at ATi, so I don't think they made the GPU have such a high fill rate for Eyefinity.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
The AnandTech article about the RV870 said Eyefinity was kept a secret from most of the guys at ATi, so I don't think they made the GPU have such a high fill rate for Eyefinity.

The people that designed Cypress knew about Eyefinity (according to the RV870 article), but the ATI software people didn't.

Speaking of 1080p as the target resolution for both upcoming consoles and PCs, maybe we should be looking for a breakthrough in monitor tech that lets PCs go beyond it?