HL2 performance boost for nV owners?


Pete

Diamond Member
Oct 10, 1999
4,953
0
0
Originally posted by: Rollo
If this is the case, FX series cards should be able to run HL2 as fast or faster than 9700/9800Pros, and 6800s should be as fast or faster than their ATI counterparts.
Well, that's an interesting proposition. Considering that Radeons are 100% faster than FX cards in HL2 DX9 mode, how will the "20/30%" speedup reported in that NVN thread result in FX cards running HL2 as fast as Radeons?

Remember, again, that the four-pipe FX line has proven to be only as fast as the four-pipe RV3x0 line in heavy-PS2.0 games. I'm not sure I'd expect to see a FX5900 be much faster than an equally-clocked 9600 as a result of this.

Anyway, I'm looking forward to a fuller exploration, but I don't expect a 5950U to bridge the gap to a 9800XT, or a 5700 to a 9600.

And, Genx87, the FX's shortcoming was mainly lack of temp registers, tho--and this is a very rough explanation--the 5900 replaced the 5800's FX12 pixel shader units with FP16 ones, so a 5800 probably won't benefit as much as a 5900.
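For reference, partial precision is something a D3D9 title can request when it compiles its shaders. Here's a minimal C++ sketch using D3DX's compile-time flag; the file name and entry point are placeholders, and this isn't necessarily how the HL2 tweak itself works, just an illustration of the mechanism:

```cpp
// Minimal sketch, assuming the DirectX 9 SDK (d3d9.lib / d3dx9.lib).
// D3DXSHADER_PARTIALPRECISION asks the compiler to mark every instruction
// _pp, which NV3x hardware can execute at FP16 instead of FP32.
#include <d3d9.h>
#include <d3dx9.h>

ID3DXBuffer* CompilePixelShader(bool forcePartialPrecision)
{
    ID3DXBuffer* byteCode = NULL;
    ID3DXBuffer* errors   = NULL;
    DWORD flags = forcePartialPrecision ? D3DXSHADER_PARTIALPRECISION : 0;

    // "water_ps.hlsl" and "PSMain" are made-up names, not Valve's files.
    HRESULT hr = D3DXCompileShaderFromFileA(
        "water_ps.hlsl",   // HLSL source (hypothetical)
        NULL, NULL,        // no #defines, no custom include handler
        "PSMain",          // entry point (hypothetical)
        "ps_2_0",          // the SM2.0 profile HL2's DX9 path targets
        flags,
        &byteCode, &errors,
        NULL);             // constant table not needed here

    if (errors) errors->Release();
    return SUCCEEDED(hr) ? byteCode : NULL;  // feed to CreatePixelShader()
}
```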

BTW, priceless quote from the NVN thread:

Originally Posted by BioHazZarD
Isnt this illegal considering they have on purpose tryied to ruin the game for Nvidia owners ?

:laugh:

I'm not trying to make excuses here, but someone at (surprise!) B3D pointed out that JC did something similar with his texture lookup, apparently to make things easier on his artists. ATi "fixed" that in a subsequent driver, so you can expect nVidia to do the same to HL2 if this precision-lowering turns out to be feasible.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
32,067
32,593
146
Originally posted by: nitromullet
Oh yeah, I forgot, their next planned move is to re-release their old core again with a slight speed bump.
<Kelso> Burn!</>

Anyway, back on topic, has anyone seen any benches or screenshots of NV30 running 16-bit floating point precision under DirectX 9? My old FX is in my fiancée's PC, and I really don't feel like loading HL2 via Steam on her rig to test this.
Congrats on your pending Nuptials :beer:

I'm not trying to make excuses here, but someone at (surprise!) B3D pointed out that JC did something similar with his texture lookup, apparently to make things easier on his artists. ATi "fixed" that in a subsequent driver, so you can expect nVidia to do the same to HL2 if this precision-lowering turns out to be feasible.
That is what I was getting at, thanks for articulating it properly :)
 

PrayForDeath

Diamond Member
Apr 12, 2004
3,478
1
76
You can't say: "If this is the case, FX series cards should be able to run HL2 as fast or faster than 9700/9800Pros" Simply because the FX and the 9800/9700 are not running the same precision, it's like saying the FX5900U running DX8.1 is faster than the 9800Pro running DX9.0, it's an apples to oranges comparison, same goes for 6800 vs X800.
And even when running 16bit precision the FX series wouldn't match the 9xxx series because of its poor DX9.0 support, it's a fact. FarCry and 3DMark05 are good examples.
Still, it's an interesting find, Rollo, this should give some hope for the FX users running H-L2 with its full eye-candy.
 

Pete

Diamond Member
Oct 10, 1999
4,953
0
0
Originally posted by: Rollo
Originally posted by: gururu
blaming other companies for this apparent nVidia deficiency is silly. If their cards weren't fast enough for 32-bit DX 9, why the heck did they implement it?

Perhaps to give developers the tools they needed to code for what everyone knew the spec would end up with? (32bit)

Unlike ATI who don't really seem to care about vendor relations, nVidia has a huge partnership with the software community. They were the first to bring hardware T/L to market, the first to bring FP32, and the first to bring SM3. (among others)

Games are in development for years. You can't just expect developers to work in 24-bit SM2 for all of 2003 and 2004 because ATI can't get 32-bit SM3 together, can you, Gururu? It would stagnate the industry at the level the tech has been at since fall 2002.

Quoted in case Rollo wonders why people call him an nV fanboy. I didn't realize FP32 was holding the industry back to such a degree, Rollo. :p ;)

Lest you forget, ATi will apparently be the first to bring "fast" conditionals to SM3 hardware with the Xbox 2 GPU. IIRC, you derided the 9700P for not being fast enough, despite being the first to SM2?
 

nRollo

Banned
Jan 11, 2002
10,460
0
0
Originally posted by: Pete
Originally posted by: Rollo
Originally posted by: gururu
blaming other companies for this apparent nVidia deficiency is silly. If their cards weren't fast enough for 32-bit DX 9, why the heck did they implement it?

Perhaps to give developers the tools they needed to code for what everyone knew the spec would end up with? (32bit)

Unlike ATI who don't really seem to care about vendor relations, nVidia has a huge partnership with the software community. They were the first to bring hardware T/L to market, the first to bring FP32, and the first to bring SM3. (among others)

Games are in development for years. You can't just expect developers to work in 24-bit SM2 for all of 2003 and 2004 because ATI can't get 32-bit SM3 together, can you, Gururu? It would stagnate the industry at the level the tech has been at since fall 2002.

Quoted in case Rollo wonders why people call him an nV fanboy. I didn't realize FP32 was holding the industry back to such a degree, Rollo. :p ;)

Lest you forget, ATi will apparently be the first to bring "fast" conditionals to SM3 hardware with the Xbox 2 GPU. IIRC, you derided the 9700P for not being fast enough, despite being the first to SM2?

Kneel before your green nVidia OVERLORD, Pete!

LOL

This thread is not meant to be an ATI vs nVidia thread, I was honestly attempting to answer Gururu's question: Why did nVidia implement FP32 before it worked well on the card?

While I'm just regurgitating what I have read, I thought that when the NV30 was in development, nVidia basically bet wrong that MS would make 16-bit the standard precision for DX9, and implemented FP32 and much longer instruction sets with a focus toward developers and the Quadro line?

If I'm on crack and someone has a link to an interview where they state,"It was costly and foolish to make those first steps toward FP32, we should have waited for 2005 like ATI" I'll gladly recant.

You're right 20-30% won't close the gap between nV3x and R300, but it is a huge gain.

 

nRollo

Banned
Jan 11, 2002
10,460
0
0
Drayvn:
I wanted this thread to be about the HL2 tweak; I probably should have let Gururu's question about why nVidia implements features before they're seemingly useful go.

You make some good points.
 

Drayvn

Golden Member
Jun 23, 2004
1,008
0
0
Originally posted by: Rollo
Drayvn:
I wanted this thread to be about the HL2 tweak; I probably should have let Gururu's question about why nVidia implements features before they're seemingly useful go.

You make some good points.


Sorry if I seemed a bit flamey there, though. When I read it, it sounded like a flame, and your first post did also, but I just read the post above and the other one above that, and I was wrong. Sorry!

I just hope someone decent like AnandTech comes along and does some benchmarking, but instead of doing an FP16-to-FP24 comparison, an FP16-to-FP16 one, since we know the ATi cards can do FP16, and then we can see who edges out whom.

Again, sorry if my post sounded too much like a flame post.
 

nRollo

Banned
Jan 11, 2002
10,460
0
0
Originally posted by: Drayvn
Originally posted by: Rollo
Drayvn:
I wanted this thread to be about the HL2 tweak; I probably should have let Gururu's question about why nVidia implements features before they're seemingly useful go.

You make some good points.


Sorry if I seemed a bit flamey there, though. When I read it, it sounded like a flame, and your first post did also, but I just read the post above and the other one above that, and I was wrong. Sorry!

I just hope someone decent like AnandTech comes along and does some benchmarking, but instead of doing an FP16-to-FP24 comparison, an FP16-to-FP16 one, since we know the ATi cards can do FP16, and then we can see who edges out whom.

Again, sorry if my post sounded too much like a flame post.


No problem Drayvn, didn't take it as a flame at all.

I also didn't know ATI cards could do FP16, live and learn.
 

LocutusX

Diamond Member
Oct 9, 1999
3,061
0
0
I'm personally more interested in Valve sorting out their stuttering/framerate-drop problem. Anyway, it seems a lot of nV owners are suffering from that too (i.e., it has nothing to do with your graphics card or video drivers), so it's in all our best interest for them to deliver a Real Fix(tm) for that issue.
Also, I'm sure that in time we'll get a definitive answer on this FP16 vs. FP24 shader issue.
 

James3shin

Diamond Member
Apr 5, 2004
4,426
0
76
Yeah, those shots were pretty helpful. Perhaps FP24 should be compared to FP32; 16 to 32 had a few noticeable differences, and I'd like to see 24 to 32.
 

fbrdphreak

Lifer
Apr 17, 2004
17,555
1
0
When my HL2 CE arrives, I'll run benchies with both for all concerned. Problem is, who knows when it'll get here; it shipped "media mail" on that GoGamer deal that was in HD :roll: I'll keep you all updated.
 

Marsumane

Golden Member
Mar 9, 2004
1,171
0
0
Originally posted by: Genx87
Originally posted by: gururu
blaming other companies for this apparent nVidia deficiency is silly. If their cards weren't fast enough for 32-bit DX 9, why the heck did they implement it?

There have been several opinions raised on the subject, ranging from wanting to go with standard IEEE precision to offloading professional rendering duties onto their Quadro cards.

I think looking at the FX series of cards it is pretty clear partial precision was designed for the gaming card and FP32 was for the Quadro cards.

Edit:

BTW, my understanding is that the 6800 pipeline is designed for FP32 from the ground up, meaning it will run FP16 and FP32 at the same speed, whereas the FX pipeline can run twice as many ops through its pipeline at FP16 vs. FP32. So I am not sure how much of a performance increase the 6800s can get out of this. The FX cards could see a substantial increase in performance.

Your last quote, regarding FP16 vs. FP32 on the 6xxx series running at the same speed, is what I'm questioning. I'm not quite sure on this, but I don't think that's the case. I just think that at FP32 the card is much more ABLE to run FP32 PLUS DX9 shaders simultaneously, compared to the FX series. For that matter, FP32 isn't even what's killing the FX series. It definitely hurts, but it's more the order in which it did things within the shader pipeline that killed the FX. I'm thinking FP16 would speed up the 6xxx series as well. I don't see a reason to do so, because it can already be run at high settings, and FP16 degrades IQ much more than the difference between FP24 and FP32. It may be worth the tradeoff for the FX series so that you can run DX9 shaders, but I really don't see the benefit of running a 6800GT (for example) at FP16.

I should be receiving a 5700 in the mail soon from buy.com for a computer I'm building for someone, so I may run some benches and IQ tests then. A 5700 running DX9 code in HL2 at a decent framerate at a res above 800x600 would be impressive, yet I don't think even this will get it there.
 

jiffylube1024

Diamond Member
Feb 17, 2002
7,430
0
71
Originally posted by: Rollo
Kneel before your green nVidia OVERLORD, Pete!

LOL

This thread is not meant to be an ATI vs nVidia thread, I was honestly attempting to answer Gururu's question: Why did nVidia implement FP32 before it worked well on the card?

While I'm just regurgitating what I have read, I thought that when the NV30 was in development, nVidia basically bet wrong that MS would make 16-bit the standard precision for DX9, and implemented FP32 and much longer instruction sets with a focus toward developers and the Quadro line?

If I'm on crack and someone has a link to an interview where they state,"It was costly and foolish to make those first steps toward FP32, we should have waited for 2005 like ATI" I'll gladly recant.

You're right 20-30% won't close the gap between nV3x and R300, but it is a huge gain.

These threads turn into flame fests whenever someone says something disagreeable to others, but I'll contribute what I read.

When Nvidia designed the NV3x line (which was several years before it launched, as you are well aware), they seemed to be in their 3dfx phase of "force-feed the industry what we think it needs," i.e., with Cg and the like. From what I heard, Nvidia essentially said 16/32-bit precision is 'what the industry needs,' and designed FP16 into their cores knowing full well FP24 was the standard for full precision (or was going to become the standard). Moreover, with their design of NV30, FP24 performance may have been inferior (possibly drastically) to FP16 (more on this below), so they went with the FP16/FP32 design with the intention of using FP16 now and FP32 on future generations of cards. But hey, they'd still support FP32 so you'd feel future-proof ;) .

ATI, on the other hand, was mimicking Nvidia's past success by "designing a GPU around Microsoft's DirectX standards" and went with a strict FP24 design (which was part of DirectX 9.0's spec for "full precision." I believe Microsoft has amended DX9.0c to now include FP32 for "full precision," although this is a couple of years after DirectX 9.0 launched).

One of the biggest surprises on the 9700 Pro (and one of the reasons it's one of the longest-surviving cards in the top performance brackets) is its 256-bit memory architecture, which matched very well with the FP24 spec as it turns out (although honestly, full DX9 games like Half-Life 2 are best played on an X800 or 6800 series card). This last point, for the record, I'm totally pulling out of my @ss, so I'm probably wrong or at least drastically stretching the truth. Alas, since I'm not an engineer for either company, I don't possess intimate knowledge of the inner workings of the GPUs (forgive me, please!).

Anyway, back to Nvidia: unless you believe the DustbusterFX cooling was their original intention, Nvidia had less memory bandwidth to work with, being on a standard 128-bit bus, and the GPU was probably designed to launch at a slower core clock than the 500 MHz the DustbusterFX-equipped, Rollo-approved 5800U debuted with.

---------------------------------------------------------

This is all hearsay, but from what I gathered it's as close to the truth as I can find. ATI did gamble correctly by going with FP24 for their 9700/9800 series cards (sticking to this on their current generation of cards is an entirely different argument), and Nvidia did, apparently, make a misstep by adopting FP16/32 so early.

Of course, FP16/32 is a necessary transition for the industry, and will most definitely be the standard next generation (or is it already? Regardless, all of the competitors will support it next generation. Unless Rollo deems ATI unfit to call competition ;) ). Crytek, for one, already uses FP blending for HDR (which relies on FP16 blending, AFAIK), and thus it only works on Nvidia cards. HDR doesn't exactly perform that great even on current generation cards, though, so again ATI's choice of FP24 (at least a generation plus a refresh ago) looks wise.
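As an aside on that FP-blending point: a DX9 game has to query the capability explicitly before enabling that style of HDR, and it's this check that GeForce 6 hardware passes while the R3x0/R420 parts don't. A rough C++ sketch (not Crytek's actual code); the blendable format in question is the FP16 A16B16G16R16F surface:

```cpp
// Rough sketch: test whether the device can alpha-blend into an FP16
// render target, the capability floating-point HDR paths depend on.
#include <d3d9.h>

bool SupportsFP16Blending(IDirect3D9* d3d, UINT adapter, D3DFORMAT displayFormat)
{
    HRESULT hr = d3d->CheckDeviceFormat(
        adapter,
        D3DDEVTYPE_HAL,
        displayFormat,                             // current desktop/mode format
        D3DUSAGE_RENDERTARGET |
        D3DUSAGE_QUERY_POSTPIXELSHADER_BLENDING,   // "can I blend into it?"
        D3DRTYPE_TEXTURE,
        D3DFMT_A16B16G16R16F);                     // 64-bit FP16 surface

    return SUCCEEDED(hr);
}
```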


--------------------------------------------------------


I have a couple more points I'd like to discuss later, regarding HDR in HL2 (or the fact that it may be missing) and regarding how FP24 became the initial standard for DX9 but I'll save them for a later thread because I've spread enough hearsay for one post :) .
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
force 16bit shader precision in HL2 and no difference in IQ over 24/32 bit.
I find that highly unlikely.

(which is why they've always been slower at HL2- they're doing more "work")
And ATi will be doing more work if nV is forced into 16 bit mode.
 

Pete

Diamond Member
Oct 10, 1999
4,953
0
0
Rollo, I don't think FP32 was an afterthought for nV. I don't know anything about GPU engineering, but surely tacking on the ability to do one FP32 or two FP16 calculations per clock isn't something you do at the end of a GPU's design cycle. Or maybe it is, with the temp registers being the clue that nV aimed for full-speed FP16. I agree with the idea that FP16/32 and SM2.0"+" were workstation features, not really meant to be used hard in games for quite some time. But it wasn't that nV was counting on FP16 as standard precision, as their cards are slower than ATi's even when comparing FP16 vs. FP24 (for many reasons: 4 vs. 8 pipes, harder-to-optimize shader units, etc.). I think it's that nV just didn't count on the 9700P accelerating the FP schedule like it did. nV made what looks to be a GF4+/DX8+ card, whereas ATi produced a DX9 card.

ATi cards can accept FP16/_PP commands, but they'll still do the calculations at FP24 precision. AFAIK, the only benefit is reduced memory usage. Otherwise, ATi's GPUs appear to be designed for full-speed FP24.
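To make that concrete, here's roughly what the hint looks like at the source level; a small C++/D3DX sketch with a made-up shader. Declaring values as 'half' in HLSL makes the ps_2_0 compiler emit _pp modifiers on the corresponding instructions, which NV3x can exploit and R3x0 simply runs at FP24 anyway:

```cpp
// Sketch of a per-variable partial-precision hint. The hint is non-binding:
// the driver/hardware may honor it (NV3x -> FP16) or ignore it (R3x0 -> FP24).
#include <d3d9.h>
#include <d3dx9.h>

static const char g_ps[] =
    "sampler2D baseMap : register(s0);                       \n"
    "half4 main(float2 uv : TEXCOORD0) : COLOR               \n"
    "{                                                       \n"
    "    half4 c = tex2D(baseMap, uv);  // FP16 is enough    \n"
    "    return c * half4(0.5, 0.5, 0.5, 1.0);               \n"
    "}                                                       \n";

ID3DXBuffer* CompileHintedShader()
{
    ID3DXBuffer* byteCode = NULL;
    // Compile from memory; the emitted ps_2_0 code carries _pp modifiers
    // wherever 'half' was used.
    if (FAILED(D3DXCompileShader(g_ps, sizeof(g_ps) - 1,
                                 NULL, NULL, "main", "ps_2_0",
                                 0, &byteCode, NULL, NULL)))
        return NULL;
    return byteCode;
}
```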

Early reviews of the GF6 with RightMark and ShaderMark showed that the 6800 didn't gain much speed at all with FP16 (in fact, it was actually slower in one or two sub-tests). Apparently nV supplied the pixel shader units with sufficient temp registers for FP32 speed that was limited only by the ALUs, not by lack of memory/register space.
 

VIAN

Diamond Member
Aug 22, 2003
6,575
1
0
What if Nvidia gave FX owners the option to choose partial or full precision, just for cases such as this, where some wise guy thinks it's funny to kick Nvidia in the nuts?

Kickin' 'em when they're down. Shame on you, Valve.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Few random things-

FP24/FP32 was not finalized when the R3x0 and NV3x were at that stage of development; the spec actually wasn't finalized until some time after it was too late to change them. NV went IEEE; ATi was hoping MS would go FP24, which they did.

You can't say: "If this is the case, FX series cards should be able to run HL2 as fast or faster than 9700/9800Pros" Simply because the FX and the 9800/9700 are not running the same precision

Using that logic, you can't ever compare the FX to the R3x0, as they are always incapable of running at the same precision.

As far as Valve's part in this-

The sacrifices that you encounter by running either the mixed mode path or the DX8 path are obviously visual. The 5900 Ultra, running in mixed mode, will exhibit some banding effects as a result of a loss in precision (FP16 vs. FP32), but still looks good - just not as good as the full DX9 code path. There is a noticeable difference between this mixed mode and the dx82 mode, as well as the straight DX8 path. For example, you'll notice that shader effects on the water aren't as impressive as they are in the native DX9 path.

Are the visual tradeoffs perceptive? Yes. The native DX9 path clearly looks better than anything else, especially the DX8.0/8.1 modes.

That was September 12, 2003. They HAD the PP hints in back then. Why did they get pulled out? There appear to be several shaders in the game that are not impacted by the reduction in accuracy, and some that clearly are. Leaving in the PP support they already had over a year before the game came out sounds like it would be a bit less work than going through and pulling it all out (even if it wasn't quite optimal in terms of which shader needed what level of accuracy).
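For anyone who hasn't poked at shader code: the hints in question are literally just _pp suffixes on individual instructions in the shaders, so leaving them in costs nothing on hardware that ignores them. A rough C++ sketch assembling a trivial, made-up ps_2_0 shader with the modifiers in place:

```cpp
// Illustrative only: a tiny ps_2_0 shader with partial-precision modifiers.
// ATi hardware ignores _pp and runs FP24; NV3x runs those ops at FP16.
#include <d3d9.h>
#include <d3dx9.h>

static const char g_asm[] =
    "ps_2_0                      \n"
    "dcl t0.xy                   \n"   // texture coordinate input
    "dcl_2d s0                   \n"   // 2D sampler
    "def c0, 0.5, 0.5, 0.5, 1.0  \n"
    "texld_pp r0, t0, s0         \n"   // sample at partial precision
    "mul_pp  r0, r0, c0          \n"   // modulate at partial precision
    "mov     oC0, r0             \n";  // write the output color

ID3DXBuffer* AssembleHintedShader()
{
    ID3DXBuffer* byteCode = NULL;
    if (FAILED(D3DXAssembleShader(g_asm, sizeof(g_asm) - 1,
                                  NULL, NULL, 0, &byteCode, NULL)))
        return NULL;
    return byteCode;
}
```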

What if Nvidia gave FX owners the option to choose partial or full precision, just for cases such as this, where some wise guy thinks it's funny to kick Nvidia in the nuts?

If nVidia did this it would be considered cheating; after the whole FX debacle I don't think they are up for another round of that.
 

daveybrat

Elite Member
Super Moderator
Jan 31, 2000
5,818
1,032
126
I still want to know if anyone has tried this yet on their 5900XT or other FX cards??
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
55
91
Originally posted by: sbuckler
Originally posted by: Matthias99
Originally posted by: Genx87
While that may be true, the FX does support FP32, which is higher than the full precision of DX9.

:confused: ???

I didn't think that was in dispute. To recap:

MS says DX9 (at least with SM2.0) = FP24, with optional FP16 support. NVIDIA does only FP16 and FP32 -- so they have to run FP24 shaders in FP32 (which they do very slowly on GeForceFX hardware). ATI runs everything at full speed at FP24.

HL2 normally requests that everything is run at FP24, which hoses GeForceFX cards in DX9 mode. Someone discovered a way to force it to run in FP16 and says there's no/minimal image quality loss (although it appears nobody has done any in-depth testing), and it makes GeForceFX hardware actually run acceptably in DX9 mode. Various parties are crying "foul" and claiming huge ATI/Valve conspiracies and the like because Valve didn't bend over backwards and support partial precision shaders for NVIDIA.

I think that about sums it up.


FP24 is partial precision when compared to FP32; it's all relative, and I'm sure MS doesn't make up its mind and then force ATI and nVidia to follow some spec. They just play middle man and try to come up with something that supports the features on the cards both companies produce.

If the tweak really works (I await an in-depth review), Valve could have added a simple checkbox in the renderer options that made the engine use FP16 everywhere, and switched FX cards to use the DX9 path with this option ticked by default. It seems a bit unfair on all those tens of thousands of paying customers using FX hardware that they should needlessly have to run with worse visuals.

It would be interesting to try the same trick with Far Cry.


Why would you say something like this? What crevice did you reach down and pull this out of? If you don't know what you're saying, then just don't say it. Microsoft is the BOSS when it comes to DX specs, and Nvidia/ATI both have to scramble to comply if they want that precious WHQL certificate for Windows.
I'm sure they all work together to some extent, but MS has the final word.
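
To put Matthias99's recap above in concrete terms, here's a rough C++ sketch of the kind of caps-based path selection an engine might do. It is not Source's actual logic, and the "force FP16" flag stands in for the hypothetical checkbox sbuckler describes; the forced path would simply compile its SM2.0 shaders with the partial-precision flag.

```cpp
// Illustrative path selection, not Valve's code.
#include <d3d9.h>

enum RenderPath { PATH_DX8, PATH_DX9_FULL, PATH_DX9_FORCED_FP16 };

RenderPath PickRenderPath(IDirect3DDevice9* device, bool userForcedFP16)
{
    D3DCAPS9 caps;
    if (FAILED(device->GetDeviceCaps(&caps)))
        return PATH_DX8;                        // be conservative on failure

    // No SM2.0 support at all -> fall back to the DX8.x path.
    if (caps.PixelShaderVersion < D3DPS_VERSION(2, 0))
        return PATH_DX8;

    // SM2.0 hardware: full precision (FP24/FP32) unless the user opted
    // into the hypothetical "force partial precision" mode.
    return userForcedFP16 ? PATH_DX9_FORCED_FP16 : PATH_DX9_FULL;
}
```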