are dx10 games likely to benefit ati or nvidia?

her34

Senior member
Dec 4, 2004
http://www.anandtech.com/video/showdoc.aspx?i=2988&p=6

However, more work (up to 5x) is potentially getting done on each of those 64 threads than on NVIDIA's 128 threads. This is because R600 can execute up to five parallel operations per thread while NVIDIA hardware is only able to handle one operation at a time per SP (in most cases). But maximizing throughput on the AMD hardware will be much more difficult, and we won't always see peak performance from real code. On the best case level, R600 is able to do 2.5x the work of G80 per clock (320 operations on R600 and 128 on G80). Worst case for code dependency on both architectures gives the G80 a 2x advantage over R600 per clock (64 operations on R600 with 128 on G80).
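The per-clock arithmetic in that quote can be checked with a quick sketch; the thread and slot counts come straight from the article, everything else is just multiplication:

```python
# Per-clock shader throughput comparison using the AnandTech figures.
# R600: 64 threads in flight per clock, each a 5-wide VLIW issue.
# G80: 128 scalar SPs, one operation per SP per clock.

r600_threads, r600_slots = 64, 5
g80_sps = 128

# Best case: fully independent code fills all five VLIW slots.
r600_best = r600_threads * r600_slots   # 320 ops/clock
g80_best = g80_sps                      # 128 ops/clock
print(r600_best / g80_best)             # 2.5x in R600's favor

# Worst case: fully dependent code fills only one slot per thread.
r600_worst = r600_threads * 1           # 64 ops/clock
print(g80_best / r600_worst)            # 2.0x in G80's favor
```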

The real difference is in where parallelism is extracted. Both architectures make use of the fact that threads are independent of each other by using multiple SIMD units. While NVIDIA focused on maximizing parallelism in this area of graphics, AMD decided to try to extract parallelism inside the instruction stream by using a VLIW approach. AMD's average case will be different depending on the code running, though so many operations are vector based, high utilization can generally be expected.
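As a toy illustration of why R600's utilization depends on the code mix (the shader mixes below are made up for illustration, not from the article):

```python
# Toy estimate of R600's effective per-clock throughput for different
# shader code mixes. A 5-wide VLIW issue is only full when the compiler
# can pack five independent scalar operations together.
# The mixes below are hypothetical examples.

R600_THREADS = 64
VLIW_WIDTH = 5

def effective_ops_per_clock(packed_ops_per_issue):
    """Ops/clock when each thread packs this many independent ops."""
    return R600_THREADS * packed_ops_per_issue

# vec4 math (common in graphics) packs 4 of the 5 slots:
print(effective_ops_per_clock(4))  # 256 ops/clock, 80% utilization
# fully serial scalar code packs only 1 slot:
print(effective_ops_per_clock(1))  # 64 ops/clock, 20% utilization
```

This is why the article expects "generally high utilization" for vector-heavy graphics code but worse results for dependent scalar code.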

is there any reason to believe that the best looking dx10 games would be written one way or the other?

 

destrekor

Lifer
Nov 18, 2005
from HardOCP
The ATI Radeon HD 2900 XT has 16 texture units and can perform 16 bilinear filtered FP16 pixels per clock. In comparison, the GeForce 8800 GTX has twice as many texture units, 32, and does 32 FP16 pixels per clock, while the GTS has 50% more than the Radeon, doing 24 FP16 pixels per clock.
...
There are also 16 ROPs in the ATI Radeon HD 2000 series. The GeForce 8800 GTS has 20 ROPs and the GTX has 24. The Radeon HD 2900 XT can perform 32 pixels per clock for Z, the GeForce 8800 GTS can do 40 and the GTX does 48.
...
All of this sounds great on paper, but the fact is we never really saw any major specific examples of this new memory subsystem making an impact in games with the previous generation. We may be looking at a rather unbalanced GPU. The memory subsystem potential is incredible, but if the GPU cannot keep the memory fed, the point is lost.
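The per-clock fixed-function figures quoted above line up neatly when collected in one place (numbers from the HardOCP quote; clock speeds are not included, so these are per-clock only):

```python
# Per-clock texture and ROP figures from the HardOCP quote.
gpus = {
    "HD 2900 XT": {"tex_fp16": 16, "rops": 16, "z_per_clock": 32},
    "8800 GTS":   {"tex_fp16": 24, "rops": 20, "z_per_clock": 40},
    "8800 GTX":   {"tex_fp16": 32, "rops": 24, "z_per_clock": 48},
}

for name, s in gpus.items():
    # On all three parts the Z-only rate is exactly twice the ROP count.
    print(name, s["tex_fp16"], s["rops"], s["z_per_clock"] / s["rops"])
```

Note that the Z-only fill rate is 2x the ROP count on all three parts, so the quoted Z numbers are consistent; the big per-clock gap is in texturing, where the GTX has double the 2900 XT's units.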