6600GT "fragment crossbar"

Marsumane

Golden Member
Mar 9, 2004
1,171
0
0
Basically, my question is self-explanatory. Does this only apply to the 6600 series, or is it also in the 6800 series? Also, does this make the fillrate of the 6600 series clock speed times 8 or times 4 for multi/single texture situations?

-Info on this topic can be found here: http://www.bit-tech.net/review/352/
 

Pete

Diamond Member
Oct 10, 1999
4,953
0
0
It has eight pixel pipelines and the corresponding shaders, but four ROPs means it can only output four pixels per clock. This becomes less of an issue with more computationally-intensive titles, that won't be cranking out a pixel per pipe per clock, but rather will need to "shade" pixels for multiple clocks.
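To put rough numbers on that: a toy model, assuming sustained output is simply the smaller of "pixels the pipes can finish per clock" and "pixels the ROPs can write per clock". The function name and the clocks-per-pixel values are made up for illustration, and real hardware is certainly messier than this.

```python
def pixels_per_clock(pipes: int, rops: int, shader_clocks_per_pixel: int) -> float:
    """Sustained pixels per clock in a simplified model (an assumption, not a spec).

    Each pipe needs `shader_clocks_per_pixel` clocks to finish one pixel,
    and each ROP can write at most one pixel per clock.
    """
    if shader_clocks_per_pixel < 1:
        raise ValueError("shader_clocks_per_pixel must be at least 1")
    shading_rate = pipes / shader_clocks_per_pixel  # pixels finished per clock
    return min(shading_rate, float(rops))           # ROPs cap what gets written


# Compare an 8-pipe/4-ROP layout with a hypothetical 8-pipe/8-ROP one.
for clocks in (1, 2, 4, 8):
    print(clocks, pixels_per_clock(8, 4, clocks), pixels_per_clock(8, 8, clocks))
```

In this model the two layouts only differ when a pixel takes a single clock to shade. At two or more clocks per pixel, eight pipes cannot finish more than four pixels per clock, so four ROPs keep up.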

I've been told the 6600 has more of a FIFO buffer than a crossbar. The 6800 has a crossbar b/c it has 16 pipes and 16 ROPs, so as soon as a fragment/pixel is done it's shunted to an available ROP (as there's always one available). The 6600 has more pipes than ROPs, so it would seem to need to queue finished pixels if it finishes more than four per clock. I'd imagine there are measures taken to prevent that buffer from overflowing, like maybe a pipe or quad being "locked" if the FIFO is near full.

You can think of it as being basically 4x1 or 4x2 in terms of texture fillrate, but it's a full eight pipes when it comes to shaders. The main limitation seems to be in older games (the X700XT beats the 6600GT in Quake 3, IIRC, and that would be b/c the X700 has a ROP attached to each pipe, just like the X800).
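As a back-of-the-envelope check on the "times 8 or times 4" question, here is a minimal sketch. It assumes one texture unit per pipe and one pixel per ROP per clock, and it plugs in a 500 MHz core clock for the 6600GT, which is the commonly cited figure and not something stated in this thread. The function name is made up.

```python
def peak_fillrate(pipes: int, rops: int, core_mhz: int) -> tuple[int, int]:
    """Return (Mpixels/s, Mtexels/s) under the simplifying assumptions above."""
    pixel_rate = rops * core_mhz   # pixels written per second, limited by ROPs
    texel_rate = pipes * core_mhz  # textures sampled per second, limited by pipes
    return pixel_rate, texel_rate


# 8 pipes, 4 ROPs, assumed 500 MHz core clock.
pixels, texels = peak_fillrate(pipes=8, rops=4, core_mhz=500)
print(f"pixel fillrate: {pixels} Mpixels/s")  # clock x 4
print(f"texel fillrate: {texels} Mtexels/s")  # clock x 8
```

So under these assumptions it is clock times 4 for single-texture output (the "4x1" case) and clock times 8 for texture fetches when each pixel gets two textures (the "4x2" case).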
 

Marsumane

So it seems to be the opposite strategy of the 5xxx series. It concentrates more on shading performance than fillrate, while the 5xxx series was weak on shading performance but strong on fillrate (e.g., Quake 3).
-What actually is the difference between a FIFO buffer and a crossbar? What do these do? From what I understand, it seems to be the first stage in the pipeline, or some sort of buffer that sits before the pipeline actually starts?
-Based on what you said, a ROP is the last stage in the pipeline for rendering pixels. Am I correct on this?
-Also, the X700 seems to barely beat the 6600 series in fillrate-limited games. Why do you think this is? It would seem that its theoretical maximum fillrate would be significantly higher than that of the 6600 series in single-textured situations. Then again, based on your explanation above, it would also seem that the limiting factor in modern games has less to do with fillrate and more to do with shader calculations within the pipeline, which is why more pipelines improve framerate as they do.
-Where did you read this? Nobody ever talks about graphics cards in this much depth on AnandTech. I'd be interested in reading whatever site/forum you learned this info from :)
 

Pete

AFAIK:

Both the FIFO and crossbar would be between the pixel pipes and the ROPs. FIFO is a first in, first out buffer. This makes sense with more pipes than ROPs, as the quads (groups of four pipes) just "drop" in their pixels as they finish them, and the buffer maintains a queue of pixels to be rendered by the available ROPs. You'd want a crossbar if there were as many ROPs as pipes, which there are in a 6800. The crossbar isn't a buffer; it's more of a traffic cop, directing pixels to free ROPs. I learned this myself not too long ago here.
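For what it's worth, the FIFO idea can be sketched as a toy simulation. Everything here is hypothetical: the queue depth of 16, the stall rule (a pipe simply holds its finished pixel while the queue is full) and the function name are guesses at how such a buffer could behave, not a description of the actual 6600 hardware.

```python
from collections import deque


def simulate_fifo(pipes=8, rops=4, capacity=16, clocks=100, shader_clocks_per_pixel=1):
    """Toy model: pipes push finished pixels into a FIFO, ROPs drain it.

    Returns (pixels_written, stalled_pipe_clocks).
    """
    fifo = deque()
    written = 0
    stalled = 0  # pipe-clocks spent waiting because the FIFO was full
    countdown = [shader_clocks_per_pixel] * pipes  # clocks left per pipe

    for _clock in range(clocks):
        # ROPs take up to `rops` pixels per clock, oldest first.
        for _rop in range(min(rops, len(fifo))):
            fifo.popleft()
            written += 1
        # Each pipe works on its pixel, then tries to queue it when done.
        for p in range(pipes):
            if countdown[p] > 0:
                countdown[p] -= 1
            if countdown[p] == 0:
                if len(fifo) < capacity:
                    fifo.append(p)
                    countdown[p] = shader_clocks_per_pixel  # start next pixel
                else:
                    stalled += 1  # FIFO full: pipe is "locked" this clock
    return written, stalled


print(simulate_fifo(shader_clocks_per_pixel=1))  # simple pixels: pipes stall
print(simulate_fifo(shader_clocks_per_pixel=2))  # longer shaders: no stalls
```

With one clock per pixel the queue fills within a few clocks and half the pipes end up waiting. With two clocks per pixel the eight pipes produce four pixels per clock on average, the four ROPs keep pace, and nothing stalls in this model.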

Yes, the ROP is last in the pipeline (although for the definition of "pipeline," see http://www.beyond3d.com/forum/...ic.php?p=378404#378404). You can see that in the NV40 (6800) diagrams posted at AnandTech (and everywhere else) in the initial 6800 p/reviews.

As for the X700XT barely beating the 6600GT in "fillrate-limited games," there are a number of possible explanations. One, those games aren't that fillrate limited. ;) Two, it's the drivers, particularly with OGL games. Three, the X700XT doesn't seem to have the bandwidth to really pull ahead of the 6600GT in single-texture situations, where its extra ROPs would be apparent. (I guess this applies to alpha blends, too, but I'm really out of my depth here, more so than in the rest of this post.)

Edit: As for your first point, the 5800 and 5900 had the same theoretical fillrate simply b/c they had the same number of ROPs. Of course, they had far less pixel shader power, something that was corrected in the 6800 by both adding more pipes and making the shaders more capable. Someone noted how similar the 6600GT is to the 5800U. It is an interesting comparison, and shows how much nV has learned since then.