G70 and Fillrate...

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Reading all these articles recently brings this question.

ATI has a higher pixel fill rate as a result of a higher core clock frequency, while Nvidia continues to use lower-clocked cores. Can someone please explain why Nvidia wouldn't increase the core speed further? It would seem that ATI holds the upper hand in fill rate since their core is clocked at 520MHz.
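Here's the quick math as I understand it (a rough sketch in Python; I'm assuming 16 ROPs on both parts and the commonly quoted core clocks, so treat the numbers as illustrative):

```python
# Back-of-the-envelope pixel fill rate: core clock (MHz) * ROPs = Mpixels/s.
# Clocks are the commonly quoted ones; 16 ROPs assumed on both parts.
def pixel_fill_rate(core_mhz, rops):
    return core_mhz * rops  # Mpixels/s

print(pixel_fill_rate(520, 16))  # ATI at 520MHz: 8320 Mpixels/s
print(pixel_fill_rate(430, 16))  # G70 at 430MHz: 6880 Mpixels/s
```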

Additionally, why did Nvidia choose to remain with 16 ROPs? Would it be expensive to match the number of ROPs to the number of pixel pipelines? Finally, would there have been large gains over the performance we already have if they had used 24 ROPs instead of 16?

-Kevin
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Can someone please explain why Nvidia wouldn't increase the core speed further?

Power usage, heat, and yields. The first two can grow at significantly greater than linear rates as clocks rise, while yields can fall just as quickly. This increases cost considerably.
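To put rough numbers on the power side: dynamic power scales roughly with C*V^2*f (the standard CMOS approximation), and pushing frequency usually means pushing voltage too, so power grows faster than linearly with clock. A quick sketch, with made-up voltages just for illustration:

```python
# Dynamic power ~ C * V^2 * f (standard CMOS approximation).
# Raising f usually requires raising V as well, so power grows faster
# than linearly with clock. Voltages below are illustrative, not real specs.
def dynamic_power(cap, volts, freq_mhz):
    return cap * volts ** 2 * freq_mhz

base   = dynamic_power(1.0, 1.4, 430)   # hypothetical baseline
bumped = dynamic_power(1.0, 1.5, 520)   # ~21% more clock, ~7% more voltage
print(round(bumped / base, 2))          # ~1.39x the power for 1.21x the clock
```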

Additionally, why did Nvidia choose to remain with 16 ROPs?

Transistor usage. Moving forward, having 24 ROPs with 16 ALUs is clearly inferior to having it the way they chose. For that matter, having 32 ROPs with 16 ALUs would likely still be inferior to having 16 ROPs with 24 ALUs.

Finally, would there have been large gains over the performance we already have if they had used 24 ROPs instead of 16?

Not at all. 16 ROPs paired with a 256-bit bus is already pushing it (barring RAM far faster than what we have): 32 bits * 16 ROPs = 512 bits per clock, and a 256-bit bus double-pumped delivers a comparable 512 bits per clock. And that is just writes, never mind the bandwidth needed for reads.
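The same math in code form, under those assumptions (32-bit color writes, memory clocked comparably to the core):

```python
# ROP write demand vs. what a 256-bit double-pumped (DDR) bus can deliver,
# assuming 32-bit color writes and memory clocked comparably to the core.
rops, bits_per_pixel = 16, 32
write_demand = rops * bits_per_pixel   # 512 bits of writes per clock
bus_width, pumps = 256, 2              # DDR: two transfers per clock
bus_supply = bus_width * pumps         # 512 bits per clock
print(write_demand, bus_supply)        # 512 512 -- writes alone can fill the bus
```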
 

Genx87

Lifer
Apr 8, 2002
41,091
513
126
Lower clocks usually mean higher yields. For the last two generations Nvidia has beaten ATI in terms of availability. The X800 PEs were backordered from ATI themselves for nearly six months after the card's release, and this generation they are completely non-existent.

 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
So they did this to cut costs, reduce transistor count, and improve yields.

Another hypothetical question then.

If they were to use 24 ROPs paired with the full 24 ALUs, would that not help a lot? It seems that ATI sometimes edges ahead because of raw fill rate.

-Kevin
 

Keysplayr

Elite Member
Jan 16, 2003
21,211
50
91
Originally posted by: Gamingphreek
So they did this to cut costs, reduce transistor count, and improve yields.

Another hypothetical question then.

If they were to use 24 ROPs paired with the full 24 ALUs, would that not help a lot? It seems that ATI sometimes edges ahead because of raw fill rate.

-Kevin

According to a few articles, the 16 ROPs are not yet saturated even by the 24 pipelines. Kind of like the AGP 4x/8x BS.
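A toy model of why that can be true (my own sketch, not from the articles; the cycles-per-pixel figure is made up):

```python
# Toy model: pixels leaving the shader pipelines per clock vs. ROP capacity.
# avg_cycles is an illustrative figure for clocks spent shading each pixel.
def pixels_out_per_clock(pipes, avg_cycles, rops):
    shader_rate = pipes / avg_cycles   # pixels/clock leaving the shaders
    return min(shader_rate, rops)      # ROPs only limit if shader_rate > rops

print(pixels_out_per_clock(24, 2.0, 16))  # 12.0 -- shaders are the limit, not ROPs
print(pixels_out_per_clock(24, 1.0, 16))  # 16.0 -- only trivial shaders hit the ROP cap
```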

 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
But if the GPU can run 24 shader pipelines at a time yet only output 16 pixels at a time, that leads me to believe there is a bottleneck. Am I missing something fundamental here?

-Kevin
 

Genx87

Lifer
Apr 8, 2002
41,091
513
126
Many shaders do not complete in a single clock cycle. The more resources you can throw at the problem, the less of a bottleneck that becomes.

Think of it like an Athlon64 processor.

It has 9 execution units but can only retire 3 operations per clock. The 9 execution units work as hard as they can to try and sustain the maximum of 3 operations per clock. Usually, however, due to code limitations and timing, the A64 hovers right around 1.2-1.5 operations retired per clock.

It doesn't make much sense to keep your back end wide open when your front end can't even feed it enough to keep it busy. AMD could have designed a 9-wide retire processor, but what a waste of silicon and transistors that would be.
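The same min() logic as the ROP case, in a sketch, using the rough IPC figures above (everything else is illustrative):

```python
# Sustained throughput is capped by the smaller of what the execution side
# can feed and the retire width. IPC figures are the rough ones quoted above.
def sustained_ipc(fed_per_clock, retire_width):
    return min(fed_per_clock, retire_width)

print(sustained_ipc(1.4, 3))  # 1.4 -- typical code: retire width isn't the limit
print(sustained_ipc(1.4, 9))  # 1.4 -- a 9-wide retire would buy nothing
```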
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
About the whole clock issue, I'm sure we all know that the GF6 and GF7 series already perform better clock for clock than the R430/R480 cards. That means they don't really need an equivalently high clock rate, and more importantly, achieving an equivalent clock rate would be a lot harder due to the large increase in heat dissipation that would follow. Sort of like the A64 vs. P4 clock war: an A64 would have a very hard time reaching 3.8GHz, and it might not even be possible at all with the current design, but it doesn't need to in order to stay competitive.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Sort of like the A64 vs. P4 clock war: an A64 would have a very hard time reaching 3.8GHz, and it might not even be possible at all with the current design, but it doesn't need to in order to stay competitive.

Superior? :)
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
Originally posted by: BenSkywalker
Sort of like the A64 vs. P4 clock war: an A64 would have a very hard time reaching 3.8GHz, and it might not even be possible at all with the current design, but it doesn't need to in order to stay competitive.

Superior? :)

In the case of the A64 vs. the P4, yeah, the A64 is already superior even at lower clock speeds. For the graphics cards, we'll have to wait for the R520 to decide which one is superior.