
FS: ATi Ready to Respond / R350 Recently Taped Out

Very true. Seeing the paper launch of the NV30 makes me not want it. It shows that the numbers are more impressive than the product itself. But I could be wrong.
 
What I don't understand is why nVidia is not listening to Microsoft's DirectX 9 specifications. Sure, it has EXTRA stuff beyond what DirectX 9 needs, but why does it NOT even meet the BASIC requirements? I don't care if it's a 500 MHz core that can run rings around any video card; to me it's still not "fully" DirectX 9 compatible.

I mean, ATI seems to have two aces up their sleeve: one has been played, and the next is ready for when the GeForce FX is released. Man, competition is great. I cannot wait to pick up a $200 Radeon 9700 Pro in March (hopefully) and be set. I was so close to buying one for $315 retail the second week it came out, but I held firm; I hope I didn't make a mistake. I like my GeForce2 Ti, and some other nVidia cards I've got, but did nVidia also pick up 3dfx's curse when they acquired it?
 
ATi's comments to Firing Squad suddenly got very interesting indeed folks. It is amazing how much clearer everything becomes when you discover exactly why ATi said what they did.

... yet a 400-500MHz chip with 8 pixel pipelines running very long shaders would spend all of its time in geometry, bringing frame rate to a crawl. ATI feels that with RADEON 9700's multi-pass capability, having native support for thousands of shaders is useless ...
What ATi didn't tell us is that their vertex shaders are dumb - they have no flow control of their own.

You may recall that DirectX 9 allows vertex programs to be 256 instructions in length, with a total execution limit of 65536 instructions per shader program.

This is so GPUs like ATi's can totally unroll the vertex program in the driver (resolving all the loops and jumps, on the CPU, before the shader runs) and then upload the unrolled program to the GPU, which executes it by simply stepping through it. ATi's vertex shader can only read jump/branch points and control flags - it cannot set them itself.
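The unrolling described above can be sketched like this. To be clear, this is an illustrative model in Python, not real driver code: the instruction tuples ("loop", "if", register indices) and the function names are entirely hypothetical, chosen only to show how loops and branches can be flattened ahead of time when all counter (I) and boolean (B) registers are application-set constants.

```python
# Hypothetical sketch of driver-side shader unrolling.
# Instruction encoding is illustrative, not real DX9 shader assembly.

def find_matching(program, start, open_op, close_op):
    """Find the index of the close_op matching the open_op at `start`."""
    depth = 0
    for j in range(start, len(program)):
        if program[j][0] == open_op:
            depth += 1
        elif program[j][0] == close_op:
            depth -= 1
            if depth == 0:
                return j
    raise ValueError("unbalanced " + open_op)

def unroll(program, int_consts, bool_consts):
    """Flatten loops and branches into a straight-line instruction list,
    using loop counts (I registers) and branch flags (B registers) that
    the application fixed before the shader runs."""
    out = []
    i = 0
    while i < len(program):
        op = program[i]
        if op[0] == "loop":             # ("loop", I_register_index)
            end = find_matching(program, i, "loop", "endloop")
            body = program[i + 1:end]
            for _ in range(int_consts[op[1]]):
                out.extend(unroll(body, int_consts, bool_consts))
            i = end + 1
        elif op[0] == "if":             # ("if", B_register_index)
            end = find_matching(program, i, "if", "endif")
            if bool_consts[op[1]]:
                out.extend(unroll(program[i + 1:end], int_consts, bool_consts))
            i = end + 1
        else:                           # plain ALU instruction: pass through
            out.append(op)
            i += 1
    return out
```

The GPU then just steps through the flat output list; all the "flow control" happened on the CPU, which is exactly the shortcut being described.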

Every time the application needs to change the vertex shader program, the driver must unroll the new program and send it up to the GPU. This is a lot of work for the CPU, especially if it is not a fast one.

Little wonder ATI dislikes long Vertex/geometry shaders so much - they show up a nasty little shortcut in their own hardware. Now we know why they think things will slow to a crawl...

Isn't it the job of the GPU to do hardware acceleration of 3D, not the CPU? If not why is the card so expensive?

source
...From this point of view, a vertex processor resembles any other general-purpose processor. But what about programmability? A shader is a program which controls a vector ALU processing 4D vectors. A shader program can be 256 ops long and can contain loops and jumps. For organizing loops there are 16 integer counter registers (I), which are read-only from within the shader, i.e. they are constants assigned outside, by the application. For conditional jumps there are 16 one-bit boolean registers (B). Again, they can't be changed from the shader. As a result, all jumps and loops are predetermined and can be controlled only from outside, by the application. Remember that this is the basic model defined by DX9.

Besides, the overall number of instructions which can be executed within the shader, with all loops and branches/jumps taken into account, is limited to 65536. Why such strict limitations? Because with these requirements the chip can do without any logic controlling the execution of loops and jumps. It's enough to execute shaders of up to 65536 instructions sequentially, unrolling all conditions and loops in advance in the driver. In fact, every time the constants controlling a program's branch and jump parameters change, a new shader has to be loaded into the chip. The R300 uses exactly this approach. It lets the chip have only one set of control logic and one copy of the vertex program shared by all vertex processors. But this approach doesn't make the vertex processor a true general-purpose one - it can't make on-the-fly decisions unique to each vertex based on criteria calculated in the shader itself. Moreover, this unrolling of jumps and loops can make replacing the shader, or the parameters controlling its jumps, quite demanding in terms of CPU resources. That is why ATI recommends changing vertex shaders as seldom as possible - the cost of such a replacement is comparable to changing an active texture...
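The replacement cost the article describes follows directly from the unrolling scheme: since loop counts and branch flags are baked into the uploaded instruction stream, changing any of those constants invalidates the whole program. A minimal sketch of that consequence, with an entirely hypothetical cache class (not real driver code):

```python
# Illustrative sketch: because control constants (I/B registers) are baked in
# at unroll time, changing any of them forces a fresh unroll + upload, just
# like binding a different shader would.

class VertexShaderCache:
    def __init__(self):
        self.uploads = 0          # counts CPU-side unroll + upload events
        self._cached_key = None

    def bind(self, program_id, int_consts, bool_consts):
        # The cache key must include the control constants, because they
        # change the unrolled instruction stream itself, not just its inputs.
        key = (program_id,
               tuple(sorted(int_consts.items())),
               tuple(sorted(bool_consts.items())))
        if key != self._cached_key:
            self._cached_key = key
            self.uploads += 1     # expensive: re-unroll and send to the GPU
```

Binding the same shader with the same constants is free, but merely changing a loop count triggers a full re-upload, which is why ATI's advice was to switch vertex shaders as rarely as possible.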

Greg
 
Isn't it the job of the GPU to do hardware acceleration of 3D, not the CPU? If not why is the card so expensive?

Spoken like a true nVidiot! Seriously, though, talk about nit-picking when DX-9 based games are 1+ years away! The benches speak for themselves.
 