Architectural Direction of GPUs

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
What about MilkyWay@Home? Doesn't that get ridiculous numbers for ati?

Yes it does, and that goes back to what I was saying about the loose comparison to Cell. If you have a code base that is very friendly to a Vec5 layout and optimize it properly, it will be very fast. If you don't have both of those factors at play, then the situation changes significantly.

Isn't folding at home blatantly unoptimized for newer ati hardware?

It may very well be true, but if the code base is poorly suited to a Vec5 layout, then there may be little point in them taking the time to improve it, as the performance increase isn't likely to be worth the effort. Their interest is simply in getting the packets crunched, and they do have limited resources. More than likely, MW@H will remain the distributed platform of choice for ATi until they head in a direction other than Vec5 for their shader layout.
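As a loose illustration of why code shape matters so much on a Vec5 layout, here is a toy C sketch (plain CPU code with made-up function names, not actual GPU kernel code): the first loop is the kind of wide, independent math a VLIW5 compiler can pack into full instruction bundles, while the second is a serial dependency chain that leaves most of the five slots empty.

```c
#include <stddef.h>

/* Vec5-friendly: four independent multiply-adds per iteration
 * (assumes n is a multiple of 4). A VLIW5 compiler can co-issue
 * independent operations, keeping most of its five slots busy. */
void vec5_friendly(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        out[i + 0] = a[i + 0] * b[i + 0] + 1.0f;
        out[i + 1] = a[i + 1] * b[i + 1] + 1.0f;
        out[i + 2] = a[i + 2] * b[i + 2] + 1.0f;
        out[i + 3] = a[i + 3] * b[i + 3] + 1.0f;
    }
}

/* Vec5-hostile: every step depends on the previous result, so the
 * operations cannot be packed and most slots sit idle each cycle. */
float vec5_hostile(const float *a, size_t n)
{
    float acc = 1.0f;
    for (size_t i = 0; i < n; i++)
        acc = acc * a[i] + 1.0f;   /* serial dependency chain */
    return acc;
}
```

By this account, the MW@H workload is closer to the first shape; code that looks like the second gains little from the Vec5 hardware no matter how much time is spent optimizing it.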

Isn't Nvidia's huge lead in Metro 2033 due to an AA bug?

It doesn't appear that way based on what Kyle reported. Both ATi and nV exhibit the same issue when MSAA is enabled, and nV is still considerably faster. It appears that the bug is just that, a bug.

Is Unigine anything other than a controversial benchmark at this point?

Perhaps it is, but using ATi's SDK nVidia has an even larger performance advantage.

http://ixbtlabs.com/articles3/video/gf100-2-p11.html

Using ATi's SDK, the 480 is close to four times faster than the 5970. That is using ATi's own benchmark. Also, Heaven only really became controversial when the company that was dominating it fell far behind the other. From an analytical point of view, if anything, the bench is far more friendly to ATi than any of the other tessellation benches we have available, including ATi's own.
 

ugaboga232

Member
Sep 23, 2009
144
0
0
But now you can't say one architecture is "better" than the other. We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
But now you can't say one architecture is "better" than the other.

I haven't said one is better than the other, I've pointed out the strengths and weaknesses of each.

We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

In terms of GPGPU uses, the parts aren't even close. Under a very narrow set of parameters the 5xxx parts can perform extremely well. The GF100 parts, by comparison, are very fast under all circumstances. The penalty for this is the increased die size, power draw, and heat. Saying one is better than the other depends on where your priorities are, but there are very clearly defined, distinct differences between the two parts on this front.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.

While I wouldn't disagree with any of that, I would say that the bench that shows the biggest disparity in tessellation performance between the two parts is ATi's own bench. nVidia's bench is actually far more representative of what we see in games, and has ATi's parts performing relatively closer than the bench ATi wrote themselves.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
At the end of the day it seems GF100 is a nice architecture but unfortunately broken by the manufacturing process.
 

AzN

Banned
Nov 26, 2001
4,112
2
0
A balanced card is best. The 58xx series is just that: a well-balanced part that isn't particularly limited by bandwidth, fillrate, or processing power for that matter.

GF100 pushes a lot of color fill, bandwidth, processing power, and tessellation but lacks texture fillrate. So you get results that barely squeeze by the 58xx series when it should be kicking the shit out of it. One thing it does do better than the 5xxx series is minimum frame rates, due to its bandwidth.

Tessellation is still in its infancy. Putting so much emphasis on it before games make use of it serves no real purpose, nor will this generation of hardware be enough for the next generation of tessellation games to come.

Nvidia is in big trouble. They know it. ATI can easily make one gigantic chip like GF100 and spank the shit out of it. Nvidia really needs to go back to the drawing board, stop putting so much emphasis on GPU computing, and get back to their roots.
 

badb0y

Diamond Member
Feb 22, 2010
4,015
30
91
Isn't ATi also going to change its architecture with the Northern Islands/Southern Islands cards?
 

BFG10K

Lifer
Aug 14, 2000
22,709
2,971
126
Regarding texture fillrate, things aren't quite cut and dried there. Yes, according to the specs derived from TMU count and clock speed, the fillrate is lower. But nVidia claims texturing performance is actually higher overall compared to the GT200 because of better caching and TMU arrangement. I covered this in ABT's Fermi architecture analysis.
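For reference, those paper numbers fall out of a simple TMU-count times clock-speed calculation; here is a quick C sketch using the commonly published launch specs:

```c
#include <stdio.h>

/* Theoretical texel fillrate = TMU count x core clock.
 * Figures below are the commonly published launch specs. */
static double gtexels(int tmus, double core_mhz)
{
    return tmus * core_mhz / 1000.0;  /* GTexels/s */
}

int main(void)
{
    printf("GTX 285: %.1f GTexels/s\n", gtexels(80, 648.0)); /* ~51.8 */
    printf("GTX 480: %.1f GTexels/s\n", gtexels(60, 700.0)); /* ~42.0 */
    printf("HD 5870: %.1f GTexels/s\n", gtexels(80, 850.0)); /* ~68.0 */
    return 0;
}
```

On paper the GTX 480 actually trails the GTX 285, which is exactly why nVidia points to caching and TMU arrangement rather than raw throughput.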

Perhaps the drivers simply aren’t tuned in that area yet. Also we have yet to determine if the games where the GTX480 doesn’t do well are actually texture bound.

As for architecture in general, nVidia should’ve released a GTX480 "gamer edition" with the fluff that gamers don’t need removed (e.g. DP). The transistor savings could go towards a smaller die size, better thermals, and better yields (hence better prices). ATi’s 57xx (and lower) parts don’t have DP.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
But nVidia claims texturing performance is actually higher overall compared to the GT200 because of better caching and TMU arrangement.

Which is obvious in the numbers we have seen: clearly it bests the 285 and even the 295, but that isn't what it has to compete with. The close-to-linear drop-off based on pixel requirements is more than a bit odd (36%, 36%, 37%, 34% in the games I checked at AT, moving from 19x12 to 25x16).
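For context, a back-of-the-envelope check of what a purely pixel-bound drop over that resolution jump would look like (a quick C sketch):

```c
#include <stdio.h>

int main(void)
{
    double lo = 1920.0 * 1200.0;   /* 2,304,000 pixels */
    double hi = 2560.0 * 1600.0;   /* 4,096,000 pixels */

    /* If frame rate scaled purely with pixel count, moving up in
     * resolution would cost 1 - lo/hi of the original frame rate. */
    printf("pixel increase:   %.0f%%\n", (hi / lo - 1.0) * 100.0);  /* ~78% */
    printf("pixel-bound drop: %.0f%%\n", (1.0 - lo / hi) * 100.0);  /* ~44% */
    return 0;
}
```

A part that was completely pixel-bound would lose roughly 44% over that jump, which puts the consistent ~36% figures much nearer to fill-bound scaling than to flat, shader-bound scaling.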

Also we have yet to determine if the games where the GTX480 doesn’t do well are actually texture bound.

A game that is heavily shader bound on one of the other parts could very easily be nigh entirely texel bound on the Fermi parts.

As for architecture in general, nVidia should’ve released a GTX480 "gamer edition" with the fluff that gamers don’t need removed (e.g. DP).

That is a huge reworking of the die for relatively small die-space gains, and it also destroys their design philosophy in terms of binning. I understand where you are coming from, but a more realistic angle would have been to cut the shader hardware roughly in half. Based on current games, that seems about where it should be to match up with what we currently consider balanced.

ATi’s 57xx (and lower) parts don’t have DP.

But their higher-end parts do, and the GeForce is going to be used in certain workstation applications (CS5 as a general example).
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
But now you can't say one architecture is "better" than the other. We need to define better, and even the 5XXX series, which wasn't made for GPGPU at all, can still get some very good results.

Regarding nVidia's greater tessellation performance on the Unigine benchmark, the supposed controversy is that nVidia bought their own license for it, that it uses far too much tessellation, that it doesn't use the tessellation well, and that, as seen in other games with tessellation plus other effects going on, the 4XX series loses some of its tessellation performance.

Well, you can say one is better than the other with some accompanying statements.

NV seems to have the better architecture going forward (obviously you can't say that as an absolute fact unless you can see the future).
ATI has the better architecture from an efficiency standpoint at the moment.

The main thing is that ATI are going to have to develop their architecture more than NV, while NV are going to be able to refine theirs, at least in the short term (until we see a further fundamental change).
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Definitely, Ben. GF100 might have higher texture efficiency than the GTX 285 while having fewer TMUs, but it just can't compete with the 58xx in this department, and that's what it's competing with, not the GTX 285.

Again FILLRATE is still king.

[Image: 3dm-texture.gif — 3DMark texture fillrate chart]
 

AzN

Banned
Nov 26, 2001
4,112
2
0
I don't particularly agree with this. Just like Ben and others have mentioned, it isn't as simple as that. No reason to rehash what has already been said.

You don't have to agree that fillrate is king, but time after time history has repeated itself: a card with higher fillrate and more bandwidth has been crowned king.
 

Mr. Pedantic

Diamond Member
Feb 14, 2010
5,039
0
76
But that doesn't necessarily mean that fillrate is king, does it? It just means that the card that happens to have the highest fillrate is king.
 

ArchAngel777

Diamond Member
Dec 24, 2000
5,223
61
91
You don't have to agree that fillrate is king, but time after time history has repeated itself: a card with higher fillrate and more bandwidth has been crowned king.

Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

I have $2,000 set aside in my computer fund and could easily afford a 30" monster display, but have absolutely no desire to upgrade to such a display. I absolutely love my 22" at 1680x1050. I have owned 30", 24", 22", etc... I like the 22" best.

But not to go on a tangent, I guess where I am going is that even though the 5870 in theory should take the advantage as fill rate demands increase, it currently doesn't at the highest resolution available on the market (2560x1600). So perhaps if a display with an even higher resolution comes out it could, in theory, take the crown... but that is a non-issue.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.
 

ArchAngel777

Diamond Member
Dec 24, 2000
5,223
61
91
But that doesn't necessarily mean that fillrate is king, does it? It just means that the card that happens to have the highest fillrate is king.

Which is still demonstrably false, with all the reviews showing the GTX 480 beating the 5870 in games while having the theoretical fillrate disadvantage.
 

Anarchist420

Diamond Member
Feb 13, 2010
8,645
0
76
www.facebook.com
I think what they ought to do is have double- or quad-precision floating-point shaders emulate the ROPs and depth/stencil units, and have fixed-function TMUs. Example:
32 TMUs @ 3.0 GHz
3,200 stream processors @ 3.0 GHz.

Raytracing would also be a reality that way.
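For a sense of what emulating the ROPs and depth/stencil units in shader code would involve per fragment, here is a toy C sketch (CPU-side pseudocode with made-up names, not real shader code) of the read-modify-write that dedicated ROP hardware currently handles:

```c
#include <stdint.h>

typedef struct { float r, g, b, a; } rgba;

/* Toy sketch of a shader-emulated ROP: a depth test followed by a
 * fixed-function-style alpha blend (src*a + dst*(1-a)), done as an
 * explicit read-modify-write on the framebuffer arrays. */
void emulated_rop(rgba *color_buf, float *depth_buf, int idx,
                  rgba src, float src_depth)
{
    if (src_depth >= depth_buf[idx])      /* depth test: less-than passes */
        return;
    depth_buf[idx] = src_depth;           /* depth write */

    rgba dst = color_buf[idx];            /* framebuffer read */
    rgba out = {
        src.r * src.a + dst.r * (1.0f - src.a),
        src.g * src.a + dst.g * (1.0f - src.a),
        src.b * src.a + dst.b * (1.0f - src.a),
        1.0f
    };
    color_buf[idx] = out;                 /* framebuffer write */
}
```

Done in shaders, every one of those framebuffer reads and writes comes out of general ALU time and memory bandwidth, which is the trade-off such a design would have to absorb.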
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.

I don't think you can dictate what is or isn't correct when, time after time, this has held true.

The 58xx has more texture fillrate, but GF100 has much more pixel fillrate and bandwidth. Now if GF100 had less pixel fillrate and bandwidth, guess which one would be the clear victor?
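For reference, the paper numbers behind that claim, from the published launch specs (a quick C sketch):

```c
#include <stdio.h>

/* Paper pixel fillrate (ROPs x core clock) and memory bandwidth
 * (bus width x effective memory rate), per the published specs. */
int main(void)
{
    /* HD 5870: 32 ROPs @ 850 MHz, 256-bit bus @ 4800 MT/s */
    printf("HD 5870: %.1f GPixels/s, %.1f GB/s\n",
           32 * 850.0 / 1000.0, 256 / 8 * 4800.0 / 1000.0);
    /* GTX 480: 48 ROPs @ 700 MHz, 384-bit bus @ 3696 MT/s */
    printf("GTX 480: %.1f GPixels/s, %.1f GB/s\n",
           48 * 700.0 / 1000.0, 384 / 8 * 3696.0 / 1000.0);
    return 0;
}
```

That works out to roughly a 24% pixel-fill and 15% bandwidth edge for the GTX 480, against the 5870's roughly 62% texel-fill edge.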

Fact: core overclocking nets you the best results as long as it's not constrained by bandwidth or SPs.
 

deimos3428

Senior member
Mar 6, 2009
697
0
0
The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

Could the reason for the relatively large fill rate on the 5xxx be Eyefinity support? Even at a lowly 1680x1050, three monitors would require ~30% more pixels than a single one at 2560x1600.
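Checking the arithmetic behind that claim (a quick sketch):

```c
#include <stdio.h>

int main(void)
{
    double eyefinity = 3.0 * 1680 * 1050;  /* 5,292,000 pixels */
    double single_30 = 2560.0 * 1600;      /* 4,096,000 pixels */

    /* Three 1680x1050 panels vs one 2560x1600 panel. */
    printf("extra pixels: %.0f%%\n",
           (eyefinity / single_30 - 1.0) * 100.0);  /* ~29% */
    return 0;
}
```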
 

blanketyblank

Golden Member
Jan 23, 2007
1,149
0
0
Actually, this is not correct unless we're talking performance per watt. The 5870, as demonstrated in your graph, has way more fillrate than the GTX 480, but is still slower in 85% of the scenarios and test settings. It could be argued, I suppose, that if you increase resolution further than what is available on today's mainstream monitors the 5870 would become the victor, but that is where practicality comes into play.

The current thought is that ATI competes well at 2560x1600, or rather improves as resolution increases. The only problem I have with that is that not very many people run 30" monitors, or even want to run 30" monitors. Which puts 1920x1200 as the practical resolution a card should shoot for. That doesn't mean I discredit the 2560x1600 scores, just that they represent a very tiny user base.

I have $2,000 set aside in my computer fund and could easily afford a 30" monster display, but have absolutely no desire to upgrade to such a display. I absolutely love my 22" at 1680x1050. I have owned 30", 24", 22", etc... I like the 22" best.

But not to go on a tangent, I guess where I am going is that even though the 5870 in theory should take the advantage as fill rate demands increase, it currently doesn't at the highest resolution available on the market (2560x1600). So perhaps if a display with an even higher resolution comes out it could, in theory, take the crown... but that is a non-issue.

To me, the benchmarks at 1920x1200 are the most relevant for most people, and even though the 5870 has double the fill rate, it still loses almost every time to the GTX 480. We could take the argument to performance per watt... but I am not going there.

I disagree with this statement. I believe there are very few people with a single 1920x1200 monitor who would be, or should be, in the market for a $500 video card, since it is overkill for their needs. So in all honesty, 2560x1600 or dual-screen should be the standard by which these cards are judged.

1920x1200 might be a good benchmark for the step down to the $300-350 range where the 5850 and 470 compete, however.

I wonder how many of the advantages of NV's architecture could be negated by ATI simply overclocking and binning their own chips further. The 5870 is known to overclock pretty well, and the Asus Matrix is supposed to add additional power circuitry and hardware to the PCB to help overclocking. Just how much extra clockspeed could ATI get out of their chips if they were willing to accept quantities comparable to nVidia's and undergo a stricter binning process?
In other words, what is the full potential of ATI's current architecture?
 

Skurge

Diamond Member
Aug 17, 2009
5,195
1
71
Could the reason for the relatively large fill rate on the 5xxx be Eyefinity support? Even at a lowly 1680x1050, three monitors would require ~30% more pixels than a single one at 2560x1600.

The AnandTech article about the RV870 said Eyefinity was kept a secret from most of the guys at ATi, so I don't think they made the GPU have such a high fill rate for Eyefinity.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
The AnandTech article about the RV870 said Eyefinity was kept a secret from most of the guys at ATi, so I don't think they made the GPU have such a high fill rate for Eyefinity.

The people that designed Cypress knew about Eyefinity (according to the RV870 article), but the ATI software people didn't.

Speaking of 1080p as the target resolution for both upcoming consoles and PCs, maybe we should be looking for a breakthrough in monitor tech that lets PCs go beyond it?