About nVidia and Det 50

dragonic

Senior member
May 2, 2003
254
0
0
User Balderdash posted some quite interesting stuff in the Tom's Hardware forums regarding the FX series and the Det 50s

ATI utilizes an 8x1 pixel shader path, with one path at 8 bits. Nvidia, on the other hand, uses a 4x2 path, with two paths each 4 bits wide. Currently, any game using PS 2.0 with the FX cards is only accessing shaders at 4x1, due to driver and DX9b limitations (we will see DX9c soon, mark my words), so DX9 games on the 45.23 driver are effectively ignoring the second PS 2.0 path. The preview 51.75 driver alleviates this problem, enabling the second path as fully as possible until an update to DX9 allows true dual channels as intended by the design.

We see these HL2 benchmark results now because HL2 is seriously dependent on pixel shaders in their current form, and that alone is responsible for the framerate discrepancies. The fix coming with the Det 50 should bring the numbers in line with ATI's, and the updated DX9c from Microsoft will likely make the FX cards the winner once true dual channel shaders are implemented and the dual channel benefits can be accessed.

The next incarnation of DX9 should include the ability to use simultaneous wait states for PS 2.0 textures in DX9 applications, which will greatly reduce the 'problem' shown in these 'benchmarks.' The DX9 SDK was built (without any hardware available, mind you) to favor one long pipe, and thus currently favors the ATI 8x1 version: because each texture has to go through a myriad of callback and wait/check states, and there is a fixed FIFO for all textures in the pipe, the nV 4x2 pipe is crippled during these operations. With the next version of DX9 you'll see paired texture waits included in the shader process, allowing the nV part to actually utilize the 4x2 pipe simultaneously instead of a defined FIFO for each.
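For what it's worth, here's a little Python sketch of the arithmetic in that post as written (the 8x1 / 4x2 / 4x1 figures are the post's own claims, not something I'm vouching for): with a strict one-texture-at-a-time FIFO the 4x2 layout only gets fed like a 4x1 part, while paired texture issue would feed both units on every pipe.

def textures_per_clock(pipes, tex_units_per_pipe, paired_issue):
    # Toy model of the post's claim: a strict one-at-a-time FIFO feeds only
    # one texture unit per pipe each clock; paired issue feeds both.
    per_pipe = tex_units_per_pipe if paired_issue else 1
    return pipes * per_pipe

# ATI-style 8x1: one texture unit per pipe, so pairing changes nothing.
print("8x1, FIFO  :", textures_per_clock(8, 1, False))  # 8
print("8x1, paired:", textures_per_clock(8, 1, True))   # 8

# NV3x-style 4x2 as the post describes it: a strict FIFO leaves it at 4x1,
# paired issue would restore the full 4x2.
print("4x2, FIFO  :", textures_per_clock(4, 2, False))  # 4
print("4x2, paired:", textures_per_clock(4, 2, True))   # 8

Whether DX9 or the drivers actually work like that is exactly what the rest of this thread is arguing about.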

Hopefully this will be true and all the people who have bought FX cards will be able to play DX9 games.

EDIT: Also here's the Link
 

Genx87

Lifer
Apr 8, 2002
41,091
513
126
Yet another explanation?

My brain is hurting. I will wait on all these variables to play out before deciding what is going on lol.

 

TourGuide

Golden Member
Aug 19, 2000
1,680
0
76
Originally posted by: NYHoustonman
Che

That does not look good. I would guess by looking at those screenshots that nV's "performance enhancements" come from shaving image quality just enough that the engineers figure they can get away with it. They're going to get crucified for it, too. Sheesh. Looks like my next card is going to be an ATI, quirky drivers or not.

 

oldfart

Lifer
Dec 2, 1999
10,207
0
0
Wow! Another big OUCH! for Nvidia
Now, I'm sure most of you have read Gabe's recent comments regarding the Detonator 51.75s, and Nvidia's official response, but I really do have to say, having seen this first hand, it confirms to both myself and Veridian that the new Detonators are far from a high-quality IQ set. A lot of negative publicity is currently surrounding Nvidia, and here at driverheaven we like to remain as impartial and open minded as we possibly can, but after reading all the recent articles, such as here, coming from good sources, and experiencing this ourselves first hand, I can no longer recommend an Nvidia card to anyone. I'll be speaking with Nvidia about this over the coming days, and if I can make anything public I will.
 

ParrotHeadToo

Junior Member
Jul 27, 2003
21
0
0
I'm returning my new unopened GeForce 5900 Ultra. Saturday of next week is the last day to return this thing.

IMO the management at Nvidia doesn't have its act together, doesn't care, or doesn't have competitive skills. Or all three. They got blindsided on this and they are probably in damage control mode. Not a good place to be when trying to manufacture QUALITY products. Updated drivers aside, there is no reason for me to keep their top-of-the-line card for results like this. Even though I purchased this card on sale at $400, I expect Nvidia (or any company, for that matter, at this price) to take "care" of its product during its life. To take 'care' of its products, a company must have an excellent management team. IMO this is Nvidia's REAL problem, and that is what needs to be addressed. They need a shake-up in management.

This doesn't mean I'm jumping to ATI, as I'm sure they have their management issues too. That said, I also believe they have worked better as a management team. It shows in their products and in how they position them in the market.

For now and the next 3 months, I'm waiting until all this dust settles.
 

ponyo

Lifer
Feb 14, 2002
19,688
2,811
126
Go return that 5900 Ultra you bought at CompUSA for $400 plus tax and buy a ~$250 9800 NP. A simple BIOS flash and you have a 9800 Pro. Great card at half the price of your 5900 Ultra.
 
Apr 17, 2003
37,622
0
76
Originally posted by: Naustica
Go return that 5900 Ultra you bought at CompUSA for $400 plus tax and buy a ~$250 9800 NP. A simple BIOS flash and you have a 9800 Pro. Great card at half the price of your 5900 Ultra.

That's what I would do. The only reason I got a 9800 Pro instead of a 9800 NP is because my friend gave me his 9800 Pro for my 9700 Pro in return for putting together his system.
 

Rogodin2

Banned
Jul 2, 2003
3,219
0
0
Not according to Carmack.

If you want to run at the highest possible IQ with decent speed, the default ARB2 pathway (which the NV3x chokes on) only runs decently on a 9700/9800.

rogo
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,689
31,557
146
Give this a read link
R350 shows a curve similar to R300's, just shifted up a bit because of the higher clock. A maximum of 6080 MI/s at a 1:1 ratio and a minimum of 3040 MI/s when there are only instructions of one kind.

NV35 shows a different behavior than NV30. In the ideal case of paired texture instructions followed by an arithmetic instruction, it can reach a maximum of 5400 MI/s. The second line shows a shader without paired texture operations. The last curve shows how NV35 behaves when using PS1.4 and PS2.0 in the form preferred by ATi. Because this means either 8 texture or 8 arithmetic instructions per clock, we get a constant 3600 MI/s. At a 1:6 ratio and below, NV35 is able to beat R350.

With shaders optimized for both architectures, NV35 does a much better job than its predecessor did. NV35 beats R350 outside the range of 2:1 to 1:3. But in between, ATi dominates and even R300 is able to beat NV35 here. If we consider the bigger performance hit of R350 when doing dependent reads, we can conclude that NV35 and R350 are competitors of equal weight if both get fed with optimized shader code.

But nVidia can't expect an application to always deliver such code. At this point we can only preach that nVidia has to put instruction reordering capabilities into their drivers, but without changing the function of the code. The method the current driver uses is nothing more than a stop-gap solution that is acceptable as long as the number of applications containing unfavorable shader code is small. But with a growing number of titles using shader technology, nVidia can't expect GeForceFX users to wait for new drivers to enjoy higher performance through replacement shaders.

That's the first possibility for nVidia to improve utilization of the limited shading power. Experiments with a new shader compiler showed that most shaders can be converted into a more favorable form. Unlike the old shader compiler that only supported the ATi shader model, the new compiler uses fewer temporary registers and produces the instruction order that fits CineFX best.

CineFX II has been considered, too, so every pair of texture instructions is followed by an arithmetic instruction if possible. The CineFX I pipeline also seems to gain performance from this. The loopback controller of the shader core seems to be able to perform several shader instructions in a row without depending on a "big loopback" over the whole pipeline. Because this loopback is punished with an extra cycle per quad group, a reduction of these big loopbacks is desirable.

Additionally, this new compiler is able to use the extended features of CineFX compared to SmartShader II to realize the same effect with fewer instructions. Tests with an early version of this compiler showed that these measures alone can lead to up to 40% higher frame rates in certain cases, without modifications in the driver. Using this compiler is up to the software developers, though, and nVidia shouldn't always expect cooperation. But a great deal of these techniques could also be implemented in the driver and used on shaders already compiled for the ATi model.

Pure scalar operations leave the biggest part of the processing power unused. We don't want to raise hopes too high here, since we don't have any really reliable information on how flexible the VLIW aspect of the CineFX pipeline really is. Maybe the suggestion of an nVidia employee that there are still many ways to improve shader performance without using lower precision aimed at exactly this point.
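The peak figures in that quote are just clock x pipes x instructions issued per pipe per clock, and the pairing idea is easy to picture too. A quick Python sanity check, assuming the stock clocks of the day (380 MHz for R350, 450 MHz for NV35) and taking the per-pipe issue widths straight from the article; the reordering function at the bottom is only a toy that ignores the data dependencies a real compiler or driver would have to respect.

def peak_mi_per_s(clock_mhz, pipes, instr_per_pipe_per_clock):
    # MI/s = millions of shader instructions per second
    return clock_mhz * pipes * instr_per_pipe_per_clock

# R350: 8 pipes, one texture + one arithmetic instruction per pipe per clock
# at a 1:1 mix, dropping to one instruction per pipe when only one kind is left.
print(peak_mi_per_s(380, 8, 2))  # 6080 MI/s, the quoted maximum
print(peak_mi_per_s(380, 8, 1))  # 3040 MI/s, the quoted minimum

# NV35: 4 pipes; a paired texture fetch plus an arithmetic instruction is
# 3 issues per pipe per clock in the best case, 2 for ATi-style shader code.
print(peak_mi_per_s(450, 4, 3))  # 5400 MI/s
print(peak_mi_per_s(450, 4, 2))  # 3600 MI/s

def pair_textures_with_arithmetic(instrs):
    # Toy version of "every pair of texture instructions is followed by an
    # arithmetic instruction if possible" -- greedy, dependency-blind.
    tex = [i for i in instrs if i.startswith("tex")]
    arith = [i for i in instrs if not i.startswith("tex")]
    out = []
    while tex or arith:
        out.extend(tex[:2])
        del tex[:2]
        if arith:
            out.append(arith.pop(0))
    return out

print(pair_textures_with_arithmetic(["tex0", "tex1", "tex2", "tex3", "mul", "mad", "add"]))
# ['tex0', 'tex1', 'mul', 'tex2', 'tex3', 'mad', 'add']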


 

McArra

Diamond Member
May 21, 2003
3,295
0
0
Originally posted by: TourGuide
Originally posted by: NYHoustonman
Che

That does not look good. I would guess by looking at those screenshots that nV's "performance enhancements" come from shaving image quality just enough that the engineers figure they can get away with it. They're going to get crucified for it, too. Sheesh.

 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Not according to Carmack.

If you want to run at the highest possible IQ with decent speed, the default ARB2 pathway (which the NV3x chokes on) only runs decently on a 9700/9800.

It's a good thing you are here to translate for people-

Your .plan indicates that the NV30-path that you use implements only 16-bit floating point (FP), i.e. half-precision FP, for most computation, which should be sufficient for most pixel shading. The ARB2-path does not have 16-bit FP, so all computations are done with 32-bit FP on the NV30. With regard to the R300, there shouldn't be a difference, since it is always 24-bit FP on the R300. According to your .plan, the NV30 is twice as slow on 32-bit FP - that is why the NV30 is slower than the R300 on the ARB2-path, but faster on the NV30-path. The question is, what sort of quality difference are we talking about (in DOOM3) for such a difference between FP formats?

There is no discernable quality difference, because everything is going into an 8 bit per component framebuffer. Few graphics calculations really need 32 bit accuracy. I would have been happy to have just 16 bit, but some texture calculations have already been done in 24 bit, so it would have been sort of a step back in some cases. Going to full 32 bit will allow sharing the functional units between the vertex and pixel hardware in future generations, which will be a good thing.

Carmack's words.

All this time I read 'no discernable quality difference' as meaning no discernable quality difference, when apparently I should have been reading it as 'there are major quality differences between the modes.' My English must be getting real rough.
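Sarcasm aside, Carmack's framebuffer point is easy to play with numerically. Here's a toy Python/NumPy sketch (a made-up multiply-add standing in for a shader, with float16 and float32 standing in for the FP16 and FP32 paths; NumPy has no 24-bit float, so the R300 isn't modelled): after quantizing to 8 bits per component, the low- and full-precision results land on the same value for the vast majority of inputs, and where they differ it is by a single step.

import numpy as np

def shade(x, dtype):
    # A made-up lighting-ish multiply-add, just to have something to round.
    x = np.asarray(x, dtype=dtype)
    albedo = np.asarray(0.73, dtype=dtype)
    light = np.asarray(0.61, dtype=dtype)
    ambient = np.asarray(0.05, dtype=dtype)
    return np.clip(x * albedo * light + ambient, 0, 1)

def to_8bit(v):
    # What an 8-bit-per-component framebuffer ultimately stores.
    return np.round(np.asarray(v, dtype=np.float64) * 255).astype(np.uint8)

x = np.linspace(0.0, 1.0, 10000)
lo = to_8bit(shade(x, np.float16))  # "half precision" path
hi = to_8bit(shade(x, np.float32))  # full precision path

diff = np.abs(lo.astype(int) - hi.astype(int))
print("pixels that differ after 8-bit quantization:", int(np.count_nonzero(diff)), "of", x.size)
print("largest difference, in 8-bit steps:", int(diff.max()))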
 

nRollo

Banned
Jan 11, 2002
10,460
0
0
All this time I read 'no discernable quality difference' as meaning no discernable quality difference, when apparently I should have been reading it as 'there are major quality differences between the modes.' My English must be getting real rough.

LOL
 

oLLie

Diamond Member
Jan 15, 2001
5,203
1
0
Originally posted by: oldfart
Wow! Another big OUCH! for Nvidia
Now, I'm sure most of you have read Gabe's recent comments regarding the Detonator 51.75s, and Nvidia's official response, but I really do have to say, having seen this first hand, it confirms to both myself and Veridian that the new Detonators are far from a high-quality IQ set. A lot of negative publicity is currently surrounding Nvidia, and here at driverheaven we like to remain as impartial and open minded as we possibly can, but after reading all the recent articles, such as here, coming from good sources, and experiencing this ourselves first hand, I can no longer recommend an Nvidia card to anyone. I'll be speaking with Nvidia about this over the coming days, and if I can make anything public I will.

I briefly read through that link as well, and I did not see any comments from Gabe about the Detonator 51.75s. Did I miss it?
 

nRollo

Banned
Jan 11, 2002
10,460
0
0
I want to see more screenshots that expose the difference between DX 8.1 and DX9. What I've seen so far hasn't made DX9 seem worth the performance hit.
 

Alkaline5

Senior member
Jun 21, 2001
801
0
0
Originally posted by: Rollo
I want to see more screenshots that expose the difference between DX 8.1 and DX9. What I've seen so far hasn't made DX9 seem worth the performance hit.

I agree that DX9 doesn't seem to look that much better. Based on what I've seen so far, it looks just like DX 8.1, only shinier. As for the performance hit, if you look at these benches, it only applies to the FX cards. ATI hardware isn't affected by moving from 8.1 to 9 at all.