Originally posted by: stelleg151
I think I get it now, thanks munky. Although I'm unsure how you could get ahold of stats like that? How did you find out that the 6600gt has 4 ROPs? (also what does ROP stand for?)
Either nV tells you, or you discover it through synthetic tests, like here. As you can see, the 6600GT only hits a fillrate of 2000 Mpixels/sec. Given a core clock of 500MHz (million clocks/sec), that gives you 2000/500 = 4 pixels/clock. You can also see it achieves 4000 Mz-ops/sec, or 8 z-ops/clock, which corresponds to nV's double z-ops per ROP. So, 4 ROPs.
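If you want to redo that arithmetic for another card, here's a rough sketch. It assumes the benchmark figures are in millions per second (Mpixels/sec and Mz-ops/sec) and the clock is in MHz; the function name is just something I made up for illustration.

```python
def per_clock(rate_millions_per_sec, core_mhz):
    """Units processed per core clock, given a rate in millions/sec and a clock in MHz."""
    return rate_millions_per_sec / core_mhz

# 6600GT figures quoted above: 2000 Mpixels/sec, 4000 Mz-ops/sec, 500MHz core
pixels_per_clock = per_clock(2000, 500)   # colour pixels written per clock
z_ops_per_clock = per_clock(4000, 500)    # z-only ops per clock
z_ops_per_rop = z_ops_per_clock / pixels_per_clock

print(pixels_per_clock, z_ops_per_clock, z_ops_per_rop)
```

With one colour pixel per ROP per clock, the pixels/clock figure is the ROP count, and the z-ops ratio is the per-ROP z rate.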
IIRC, ROP stands for raster output processor (edit: Render OutPut), and here's what AT says about 'em:
The end of the pipeline consists of the ROP pixel pipeline. These are the units that take care of antialiasing, as well as z and color compression and final drawing of a pixel. There are 16 of these units, and they are capable of either computing one color+z pixel, or calculating 2 z/stencil operations per clock. This means that 32 z or stencil operations (think shadowing), or 16 pixels can be drawn per clock cycle. Thus NVIDIA has dubbed this architecture a 16x1 / 32x0 architecture. On a side note, they have retroactively dubbed the NV3x a 4x2 / 8x0 architecture.
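The per-clock numbers in that quote fall straight out of the ROP count. A tiny sketch, assuming one colour+z pixel or two z/stencil ops per ROP per clock as AT describes (the function name and the default of 2 are my own shorthand, not anything from nV):

```python
def rop_throughput(rops, z_ops_per_rop=2):
    """Return (colour pixels per clock, z/stencil ops per clock) for a given ROP count."""
    colour_per_clock = rops                   # one colour+z pixel per ROP per clock
    z_only_per_clock = rops * z_ops_per_rop   # two z/stencil ops per ROP per clock
    return colour_per_clock, z_only_per_clock

colour_per_clock, z_only_per_clock = rop_throughput(16)
print(f"{colour_per_clock}x1 / {z_only_per_clock}x0")
```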
You can only draw as many pixels per clock as you have ROPs. Having fewer ROPs than pixel pipelines isn't much of an issue: both nV and ATI have said that you lose maybe 5-10% performance in certain situations, but you can put the transistors saved to better use elsewhere.
The X1600XT's problem isn't its four ROPs, but rather its four texture units. At least, that's the only thing I can think of to explain its relatively poor performance, considering its immense shader power (600MHz * 12 pixel shader units, plus 5 vertex shader units, gives it theoretical power on a par with most previous-gen 16-pipe parts, like the X800XL and 6800GT). Even shaders need data, and if you only have four texture units feeding data to 12 pixel shaders, I guess you end up with pixel shaders waiting on the texture units.
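To put rough numbers on that, here's a back-of-the-envelope sketch. The X1600XT figures are the ones above. The X800XL and 6800GT clocks (400MHz and 350MHz) and their one-texture-unit-per-pipe layout are my assumptions from memory of the reference specs, so treat those rows as illustrative. "Shader rate" here is just clock * pixel shader units, which ignores any per-unit differences between architectures.

```python
# (core MHz, pixel shader units, texture units)
cards = {
    "X1600XT": (600, 12, 4),    # figures from the post above
    "X800XL":  (400, 16, 16),   # assumed reference clock and 16 texture units
    "6800GT":  (350, 16, 16),   # assumed reference clock and 16 texture units
}

def shader_rate(core_mhz, pixel_units):
    """Crude shader throughput: millions of pixel-shader-unit clocks per second."""
    return core_mhz * pixel_units

def shaders_per_texture_unit(pixel_units, texture_units):
    """How many pixel shader units share each texture unit."""
    return pixel_units / texture_units

for name, (mhz, ps, tmu) in cards.items():
    print(name, shader_rate(mhz, ps), shaders_per_texture_unit(ps, tmu))
```

On these numbers the X1600XT has the highest raw shader rate of the three, but three pixel shaders contend for each texture unit instead of one, which fits the guess that the shaders end up waiting on texture fetches.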
Edit: You can probably learn even more about ROPs by reading the Xenos/R500/C1 articles (the Xbox360's GPU), as the Xenos takes the unusual step of moving the ROPs off the main GPU die onto a daughter die and surrounding it with fast EDRAM, thereby (theoretically) reducing the typical AA hit to practically nil.