Originally posted by: BenSkywalker
The R580 has 48 shader units, each with its own ALU; the 7900 has 24 shader units, each with 2 ALUs. The R580 and 7900 have 48 ALUs each for fragment shading.
R580 = 36 FLOPs per channel (three ALUs per channel, each good for 8 FLOPs from a 4D MADD plus 4 FLOPs from a 4D ADD) * 16 SIMD channels * 0.65 GHz (pixel shaders only).
Compared to the NV40:
- Each pipeline is capable of performing a four-wide, coissue-able multiply-add (MAD)
or four-term dot product (DP4), plus a four-wide, coissue-able and dual-issuable
multiply instruction per clock in series, as shown in Figure 30-11. In addition, a
multifunction unit that performs complex operations can replace the alpha channel
MAD operation. Operations are performed at full speed on both fp32 and fp16 data,
although storage and bandwidth limitations can favor fp16 performance sometimes.
In practice, it is sometimes possible to execute eight math operations and a texture
lookup in a single cycle.
- Dedicated fp16 normalization hardware exists, making it possible to normalize a
vector at fp16 precision in parallel with the multiplies and MADs just described.
- An independent reciprocal operation can be performed in parallel with the multiply,
MAD, and fp16 normalization described previously.
How exactly are you figuring the R580 has an edge in shader performance? Compare the raw numbers; it isn't there.
Like this:
Taking as given that:
multiply (MUL) = 4 floating point operations (FLOPs)
add (ADD) = 4 FLOPs
multiply + add (MADD) = 8 FLOPs
Also, a quad != a quad and an ALU != an ALU across architectures.
Don't confuse ALUs with FLOPs; you cannot measure shader performance the way you attempted. Again:
G71: 24 SIMD channels * 16 FLOPs * 0.65 GHz = 249.6 GFLOPs
R580: 16 SIMD channels * 36 FLOPs * 0.65 GHz = 374.4 GFLOPs
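To make the arithmetic concrete, here is a minimal Python sketch that reproduces both peak figures from the definitions above. The per-channel ALU mix (two 4D MADD units per channel on G71; three ALUs per channel on R580, each doing a 4D MADD plus a 4D ADD) is my reading of the numbers in this thread, not an official spec, and the helper name peak_gflops is just for illustration.

```python
# Minimal sketch: theoretical peak pixel-shader GFLOPS from per-ALU issue rates.
# Assumptions (my reading of the numbers in this thread, not an official spec):
#   - a 4D MADD counts as 8 FLOPs, a 4D ADD as 4 FLOPs
#   - G71: 24 channels, each with 2 ALUs issuing a 4D MADD per clock
#   - R580: 16 channels, each with 3 ALUs issuing a 4D MADD plus a 4D ADD per clock

MADD_4D = 8   # FLOPs per clock for a 4-wide multiply-add
ADD_4D  = 4   # FLOPs per clock for a 4-wide add

def peak_gflops(channels, flops_per_channel, clock_ghz):
    """Theoretical peak = channels * FLOPs-per-channel-per-clock * clock (GHz)."""
    return channels * flops_per_channel * clock_ghz

g71_per_channel  = 2 * MADD_4D             # 16 FLOPs per clock
r580_per_channel = 3 * (MADD_4D + ADD_4D)  # 36 FLOPs per clock

print(f"G71 : {peak_gflops(24, g71_per_channel, 0.65):.1f} GFLOPs")   # 249.6
print(f"R580: {peak_gflops(16, r580_per_channel, 0.65):.1f} GFLOPs")  # 374.4
```

Running it prints 249.6 and 374.4, matching the two figures above.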
FLOPs are a theoretical measure of internal shader fillrate. They are measurable in practice, but you need an application like GPUBench to see the difference.
ATI has a massive advantage in internal shader fillrate; what remains to be seen is an application (a game) that shows it.
Things get more complicated once you take into account that the extra ADDs on Radeons can only be used under certain circumstances, while on GeForces part of the shader fillrate is consumed by texture operations. That's why we see major differences between the theoretical numbers and "real" games.
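Those caveats can be folded into the same arithmetic as simple derating factors. The sketch below is purely illustrative: the two utilization numbers (how often a Radeon shader can actually co-issue the extra ADD, and what fraction of GeForce issue slots are lost to texture addressing) are hypothetical placeholders, not measured values; real figures would have to come from a microbenchmark such as GPUBench.

```python
# Illustrative only: derate the theoretical peaks by assumed utilization factors.
# The factors passed in below are hypothetical placeholders, NOT measured data.

MADD_4D, ADD_4D = 8, 4
CLOCK_GHZ = 0.65

def effective_gflops_r580(add_unit_utilization):
    """R580: the MADD units are always available; the extra ADD units only
    contribute on the fraction of clocks where an ADD can be co-issued."""
    per_channel = 3 * (MADD_4D + ADD_4D * add_unit_utilization)
    return 16 * per_channel * CLOCK_GHZ

def effective_gflops_g71(texture_overlap_fraction):
    """G71: assume one of the two ALUs shares issue slots with texture
    addressing, so that fraction of its throughput is lost to texture fetches."""
    per_channel = MADD_4D * (1.0 - texture_overlap_fraction) + MADD_4D
    return 24 * per_channel * CLOCK_GHZ

# Placeholder utilizations -- swap in real measurements before drawing conclusions.
print(f"R580 (ADD usable 50% of clocks): {effective_gflops_r580(0.5):.1f} GFLOPs")
print(f"G71  (25% of slots on textures): {effective_gflops_g71(0.25):.1f} GFLOPs")
```

The point is only that both peaks shrink by different amounts depending on the workload, which is exactly why the theoretical numbers and in-game results diverge.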
Originally posted by: BenSkywalker
By a huge margin, looking at the comparable-generation products. Compare the NV2x parts to the R2x0 parts and they leave a bloody smear on the side of the road. SplinterCell, Halo and pretty much every other port that went from console to PC was an absolute annihilation for the GF4 over the R8500. Current-gen titles for the 360 were developed using the R580 platform; the next round will be aimed at the R500 explicitly, which should be interesting as ATi's R600 part will be around to capitalize on that (once the PC catches up to where consoles were last year), although clearly it is a staggering amount closer this time around than when nVidia had the built-in edge.
Ben, this was my quote:
"It's not a myth that ATI has been handling DX better than Nvidia since R3xx."
The vast majority of the Xbox ports were already out when R3xx was introduced, and that was an equal if not worse annihilation, this time of the GeForce FX series. From the moment DX9 was introduced, Nvidia's comparable products have always lagged behind in high-end single cards, with a small or large difference in performance depending on the case.
I'm not a fan of ATI; I'm just saying that your theory sounds far too speculative to my ears without sufficient evidence to back it up.
Both companies have very good products right now, and I'm really glad to see the competition heating up.
For my own reasons I found the R580 solution more suitable for my needs, and I'm glad I did. The R580 is a magnificent piece of hardware, and it doesn't need the backing of the SexBox to showcase its power, IMHO.