Interesting read about nVidia's hardware bug


Dark4ng3l

Diamond Member
Sep 17, 2000
5,061
1
0
Yes, ATi used a hack, but at least they fixed the problem. If the HSR drivers for the V5 had been completed they would have been considered a hack too, but the fact is that they worked, so who cares?
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Doomguy-

Is he still saying that? I don't know him from Ars (except that one thread you pointed me to), but I do know him from other forums. nVidia's boards are not displaying any sort of Z errors the way ATi's boards are (which he dances around by saying the Mac drivers don't have that problem), and there isn't an application I have seen yet that backs his claim. I thought I cleared that up in that thread a while back? Give me some linkage if he is at it again;):)
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,007
126
BenSkywalker:

nVidia does not use 16bit ZBuffer, they are by far the best in this aspect

OK, so please explain two things for me:

Firstly, the GF2 was losing to a Radeon when using the Det 2 drivers in memory bandwidth limited situations. Then when nVidia released the Det 3 drivers suddenly they pulled ahead. Tell me, how did nVidia circumvent the problem of memory bandwidth via driver update?

ATi used Hyper-Z which was already available to them in their hardware. What did nVidia use? Did they unlock some magic HSR feature which was lying dormant or something?

Secondly, why are a number of people reporting visual artifacts with the Det 3 drivers at long distances in Quake 3? I've spoken to a few guys who know what to look for and they told me that was the first thing they noticed. I'm just kicking myself for forgetting to test it on my GF2 MX when I had it.

they are by far the best in this aspect and do not have the rather serious flaws that ATi displays in this particular area with the Radeon.

Which errors are these? Not one ATi user has complained about such errors. The only game I know of that has errors is Mercedes-Benz Truck Racing, and I'm pretty sure that ATi said they had fixed the problems in the latest beta driver release.

Another quick point, you take on average a whopping 2FPS hit using DXT3 instead of DXT1(running Quaver, not Timedemo1),

What were the system specs? What were the game settings?

Doomguy:

BFG: You believe MrNSX at arstechnica about the NVidia 20bit zbuffer thing? HAHA. Why is NVidia's Zbuffer so much more accurate than ATI's if it's a 20 bit zbuffer?

Well, I said nVidia use 16 bits and he said they use 20 bits, with the upper 4 bits being used for some special trick. If anything he was kinder to them than I was. He also says he has several scenarios which prove that he is correct.

I agree with him because there are a lot of Quake 3 players who also complained about the same thing.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
BFG-

"Firstly, the GF2 was losing to a Radeon when using the Det 2 drivers in memory bandwidth limited situations. Then when nVidia released the Det 3 drivers suddenly they pulled ahead. Tell me, how did nVidia circumvent the problem of memory bandwidth via driver update?"

By optimizing the drivers for the GeForce series of boards. The Detonator3 drivers were completely new, they were not the traditional update.

"ATi used Hyper-Z which was already available to them in their hardware. What did nVidia use? Did they unlock some magic HSR feature which was lying dormant or something?"

I would wager that ATi still has quite a bit of performance to gain through drivers. 3dfx was also keeping up with nVidia quite well, surpassing ATi in many situations using default settings.

"Secondly, why are a number people reporting visual artifacts with the Det 3 drivers at long distances in Quake 3? I've spoken to a few guys who know what to look for and they told me that was the first thing they noticed. I'm just kicking myself for forgetting to test it on my GF2 MX when I had it."

I can't say that I have noticed it in Quake3, though I have in UnrealTournament using the earlier Det3 drivers(pre-6.31). Why is it doing it? Perhaps some sort of occlusion culling.

"What were the system specs? What were the game settings?"

Multiple systems. My main test rig was an Athlon 550, Asus K7M, 192MB and 320MB of RAM (tried both), GeForce DDR (130/301 default for my board), with at least half a dozen different versions of the Det2 and Det3 drivers. Game settings were UHQ (everything cranked) with anisotropic both on and off, at resolutions up to and including 1600x1200. Other systems included various PIIIs and T-Bird/Duron/Athlons (no Celerons though) with GF2 and GF2MX graphics boards. This only works on the nVidia boards that I have tested, although it should also work on S3 boards if anyone wants to try it out. There are some files up on Gamebasement for anyone who wants to try it with any of the Quake3 versions prior to the latest (certain problems currently pop up for some reason for people who have TeamArena installed using the latest fix I have whipped up; I need to test a bit more).

"Which errors are these? Not one ATi user has complained about such errors. The only game I know of that has errors is Mercedes Beinz Truck Racing, and I'm pretty sure that ATi said they had fixed the problems in the latest beta driver release."

Games, heh. I have already told him to load up ViewPerf and compare the two, something that truly stresses ZBuffer accuracy. If nVidia were using 16 or 20 bit Z it would show up with some serious flaws under the tests in that application (in fact it does if you use 16bit color, which uses 16bit Z). The errors are very clear on both the Radeon and the nV boards when running in 16bit; only the Radeon still displays them when running in 32bit. A precise ZBuffer is very important to me; I would revert back to the 5.xx Dets if there were any truth to his assertion.
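
To put rough numbers on why the bit depth matters: with a standard perspective projection, one quantization step of an n-bit ZBuffer covers more world-space distance the farther you get from the camera, which is exactly where artifacts show up first. A minimal sketch of the arithmetic (my own illustration, assuming a D3D-style [0,1] depth mapping and made-up near/far planes):

#include <stdio.h>
#include <math.h>

/* Illustration only: world-space size of one depth-buffer step at eye
   distance z for an n-bit ZBuffer, assuming the depth mapping
   d(z) = f/(f-n) * (1 - n/z) into [0,1]. Near/far values are made up. */
static double z_step(double z, double nearp, double farp, int bits)
{
    double levels = pow(2.0, bits);
    /* d'(z) = (f/(f-n)) * n / z^2, so one step of 1/levels spans: */
    double dd_dz = (farp / (farp - nearp)) * nearp / (z * z);
    return 1.0 / (levels * dd_dz);
}

int main(void)
{
    const double nearp = 4.0, farp = 4096.0;
    const int bits[] = { 16, 20, 24 };
    for (int i = 0; i < 3; i++)
        printf("%2d-bit Z: one step at z=1000 spans %.4f world units\n",
               bits[i], z_step(1000.0, nearp, farp, bits[i]));
    return 0;
}

At 16 bits one step spans nearly four world units at that distance, at 20 bits about a quarter of a unit, and at 24 bits about a hundredth; any precision shortcut betrays itself on distant geometry first.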
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,007
126
BenSkywalker:

By optimizing the drivers for the GeForce series of boards. The Detonator3 drivers were completely new, they were not the traditional update.

Yes but what exactly did they optimise? Driver optimisations only have an effect on CPU limited situations (ie low res) and not in hardware limited situations (ie high res).

Unless the drivers utilise some hardware that was previously unused (some tiling scheme for example) or adopt a radically different approach to rendering (ie HSR), hardware limitations are not affected by driver updates.
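
To illustrate with a toy model (made-up numbers, nothing measured): if each frame costs roughly max(CPU time, GPU time), then a driver change that only trims CPU-side work helps where the CPU is the bottleneck and does nothing once the card is:

#include <stdio.h>

/* Toy model: frame time is whichever of CPU or GPU finishes last. */
static double fps(double cpu_ms, double gpu_ms)
{
    double frame_ms = cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
    return 1000.0 / frame_ms;
}

int main(void)
{
    double cpu_old = 12.0, cpu_new = 9.0;   /* driver shaves 3 ms of CPU work */
    double gpu_low = 6.0, gpu_high = 25.0;  /* e.g. 640x480 vs 1600x1200 */

    printf("low res:  %.0f -> %.0f fps\n", fps(cpu_old, gpu_low),  fps(cpu_new, gpu_low));
    printf("high res: %.0f -> %.0f fps\n", fps(cpu_old, gpu_high), fps(cpu_new, gpu_high));
    return 0;
}

In this model low res goes from 83 to 111 fps while high res stays pinned at 40, which is why a genuine gain at 1600x1200 points to something beyond plain CPU-side optimisation.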

So what exactly did nVidia change to save memory bandwidth and beat the Radeon?

Other systems included various PIIIs and T-Bird/Duron/Athlons

Specifically, what MHz was the fastest CPU operating at? And what was the difference between using DXT1 and DXT3 in the benchmarks with that CPU? I want to make sure there's no chance of being CPU limited in these tests.

Games, heh. I have already told him to load up ViewPerf and compare the two, something that truly stresses ZBuffer accuracy. If nVidia were using 16 or 20 bit Z it would show up with some serious flaws under the tests in that application (in fact it does if you use 16bit color, which uses 16bit Z).

Professional OpenGL apps utilise OpenGL extensions that are not used in games, correct? Could it be possible that nVidia detects the difference between a professional app (where image quality is more important than performance) and hence uses a full Z-buffer in that situation?
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
Ben, Ben, Ben...once again you are missing the point.

We're talking about REAL WORLD GAMING here bud. You can quote your professional OGL applications and white papers and technical specification data sheets all you want.

The fact is GTS TC looks like $hit, ATi and 3dfx's doesn't.

Period.

Next?
 

OneOfTheseDays

Diamond Member
Jan 15, 2000
7,052
0
0
Oh please, with the new sky fix for Q3A, the GeForce's TC looks just as good as the competition's. Anybody that says otherwise obviously isn't playing Q3A at all; they are probably staring at the wall or looking for artifacts. And RoboTECH, since you're talking about real world performance, let's talk real world here. No Quake 3 player I have talked to can honestly tell the difference between TC on all the different cards when they are fraggin' away.
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
Sudheer, I can tell. I dumped the 32MB GTS for a 64MB GTS just so I could disable TC. Take a look at the newsgroup alt.games.quake3. Lotsa Quake3 geeks there. The "fix" (of turning off TC) is in our FAQ. Yes, "frequently asked questions"

In other words, EVERYBODY can notice it, newbies and experienced players alike.

Besides, did you pay a few hundred $$$ to have a game look like crap? It kills me to see people with this super-high end hardware go 640x480 w/vertex lighting and r_picmip 5.

WHY THE HELL DID YOU BUY A DECENT VIDEO CARD???

a V3 or a Ge256 SDR will do 640x480 with lowa$$ ugly settings @ 200 fps easily. Why bother with a damn high-end video card?

and the "fix" you talk about, i.e. the hack which changes the Q3 executable, won't work for pure servers, RA3 servers, 1.27 servers, Team Arena servers, CPM servers, CPMA servers or OSP servers

in other words, that hack is almost worthless if you play online.


 

PeAK

Member
Sep 25, 2000
183
0
0
...just a historical footnote on the "turbo" drivers from ATI for the "Rage Pro" (3rd generation/0.35um graphics chip).

The driver actually did increase the benchmark numbers by 40%. Why the problem? It is embarrassing, but it came down to too much trust in WinBench (in 1998) and not testing on real gaming applications, which would have revealed an "off by one" coding error. That's it.

The benchmark program WinBench automatically disables VSYNC during its operation. The "Wait on VSYNC" option in those days of hardware design was controlled by the gaming application (which invariably followed MS's guideline and turned it on). The problem: the driver code waited for two successive VSYNC signals (instead of one) before processing more data. Check out the following ATI flipping.
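
A minimal sketch of the kind of off-by-one being described, in illustrative C (hypothetical helper names, not ATI's actual driver source):

#include <stdio.h>

static int vsync_enabled = 1;     /* games followed MS's guideline: on */

static void wait_for_vsync(void)  /* stand-in for "block until retrace" */
{
    puts("  ...waited one full retrace");
}

static void present_frame(void)
{
    if (vsync_enabled) {
        /* Intended: wait for exactly one retrace before flipping.
           The off-by-one: the loop runs twice, so every flip eats two
           retraces and the in-game frame rate is roughly halved. */
        for (int i = 0; i <= 1; i++)   /* should have been i < 1 */
            wait_for_vsync();
    }
    /* WinBench disables VSYNC, so the benchmark never touched this path
       and the numbers looked far better than actual gameplay. */
    puts("frame flipped");
}

int main(void) { present_frame(); return 0; }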

P.S. Two years later, almost all hardware today can turn off waiting on VSYNC. Very few sites, however, talk about the false image problems (tearing) that can occur when the game framerate exceeds the monitor refresh rate.

P.P.S. If you look at all the high click-through sites that run advertisements, you would have thought that one of them would have asked about the quality tradeoffs that might have occurred with a 30% increase in benchmark performance. This particular issue may become the "trojan horse" that uncovers just how simplistic graphics hardware review sites are in their use of framerate counters to rank video cards.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Robo-

"We're talking about REAL WORLD GAMING here bud."

How's the TC on UT with the V5 again? Are you saying that isn't a game? In Quake3 when you use the FIX (which isn't disabling it), you can't tell the difference in the sky between a Radeon and a GeForce board.

"The fact is GTS TC looks like $hit, ATi and 3dfx's doesn't."

Wrong. When an application makes a call for DXT1 the GeForce looks poor; every other compression method, including many that neither the V5 nor the Radeon support, works without the slightest problem.

"The "fix" (of turning off TC) is in our FAQ. Yes, "frequently asked questions"

That is not a FIX, that is the DEFAULT SETTING. The fix switches the S3TC call to DXT3 (instead of 1), which is what should have been used in the first place, but with the other offerings falling short in feature support (big surprise) it wasn't an option.
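
For reference, DXT3 differs from DXT1 only by an extra 64 bits of explicit alpha per block; the colour half is encoded identically, so swapping formats costs texture memory, not colour information. The widely circulated explanation for the GeForce's DXT1 banding is that its DXT1 decoder interpolates colours at 16-bit precision while the DXT3/DXT5 path runs at 32 bits. A sketch of the two block layouts (field names are my own, not from any real header):

#include <stdint.h>
#include <stdio.h>

/* Each block encodes a 4x4 texel tile. */

struct dxt1_block {              /* 8 bytes = 4 bits per texel */
    uint16_t color0, color1;     /* two RGB565 endpoints; two more colours
                                    are interpolated between them */
    uint32_t selectors;          /* 16 x 2-bit indices into the 4-colour palette */
};

struct dxt3_block {              /* 16 bytes = 8 bits per texel */
    uint64_t alpha;              /* 16 x 4-bit explicit alpha values */
    struct dxt1_block color;     /* colour half encoded exactly like DXT1 */
};

int main(void)
{
    printf("DXT1: %zu bytes/block, DXT3: %zu bytes/block\n",
           sizeof(struct dxt1_block), sizeof(struct dxt3_block));
    return 0;
}

So the cost of the fix is doubled texture memory for those textures, which squares with the roughly 2FPS difference mentioned earlier.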

"and the "fix" you talk about, i.e. the hack which changes the Q3 executable, won't work for pure servers, RA3 servers, 1.27 servers, Team Arena servers, CPM servers, CPMA servers or OSP servers

in other words, that hack is almost worthless if you play online."


I take it you are not following the video card market very closely? Since you must have missed it: you can simply use a freely available tweak utility that will intercept DXT1 calls and switch them to the compression format of your choice (for Det3 users), without changing the Quake3 exe (Quake3 being the only game that has a problem).
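
The thread doesn't name the utility, but in principle such a tweak can work as a proxy OpenGL driver that rewrites the requested internal format on its way to the real driver. A hypothetical sketch (the GL_* tokens are the real ones from EXT_texture_compression_s3tc; the plumbing is assumed, and Quake3 actually requests compression through the older S3_s3tc extension, handled the same way):

#include <stdio.h>

typedef unsigned int GLenum;
typedef int          GLint;
typedef int          GLsizei;

#define GL_COMPRESSED_RGB_S3TC_DXT1_EXT  0x83F0
#define GL_COMPRESSED_RGBA_S3TC_DXT1_EXT 0x83F1
#define GL_COMPRESSED_RGBA_S3TC_DXT3_EXT 0x83F2

/* Pointer to the real entry point, resolved from the driver at load time
   (resolution details omitted in this sketch). */
static void (*real_glTexImage2D)(GLenum, GLint, GLint, GLsizei, GLsizei,
                                 GLint, GLenum, GLenum, const void *);

/* Our exported replacement: rewrite DXT1 requests to DXT3, pass the rest. */
void glTexImage2D(GLenum target, GLint level, GLint internalformat,
                  GLsizei width, GLsizei height, GLint border,
                  GLenum format, GLenum type, const void *pixels)
{
    if (internalformat == GL_COMPRESSED_RGB_S3TC_DXT1_EXT ||
        internalformat == GL_COMPRESSED_RGBA_S3TC_DXT1_EXT)
        internalformat = GL_COMPRESSED_RGBA_S3TC_DXT3_EXT;

    real_glTexImage2D(target, level, internalformat, width, height,
                      border, format, type, pixels);
}

/* Stand-in driver entry so the sketch runs on its own: */
static void log_upload(GLenum t, GLint l, GLint ifmt, GLsizei w, GLsizei h,
                       GLint b, GLenum f, GLenum ty, const void *p)
{
    printf("upload: internalformat=0x%04X\n", (unsigned)ifmt);
}

int main(void)
{
    real_glTexImage2D = log_upload;
    glTexImage2D(0, 0, GL_COMPRESSED_RGBA_S3TC_DXT1_EXT, 256, 256, 0, 0, 0, 0);
    return 0;  /* prints 0x83F2: the DXT1 request went out as DXT3 */
}

The straight token swap only works because Quake3 uploads uncompressed pixels and asks the driver to compress them; textures shipped pre-compressed as DXT1 data would need actual transcoding.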

Why not list a game that isn't Quake3 that has a problem with TC on the GeForce boards, and then explain exactly why it is that every other game works just fine and only Quake3 has problems? NOLF uses TC and doesn't have any problems for me at all; I would say that overall it is the equal of, perhaps slightly better than, Quake3 in terms of graphics (particularly textures).

BFG-

"Yes but what exactly did they optimise? Driver optimisations only have an effect on CPU limited situations (ie low res) and not in hardware limited situations (ie high res)."

Wrong. See the Voodoo5 launch benches compared to current drivers, or even the Radeon. That is a myth that has been held for some time; it certainly isn't the case.

"Unless the drivers utilise some hardware that was previously unused (some tiling scheme for example) or adopt a radically different approach to rendering (ie HSR), hardware limitations are not affected by driver updates."

Effective usage of available board level resources can certainly greatly increase the effectiveness of memory bandwidth utilization.

"So what exactly did nVidia change to save memory bandwidth and beat the Radeon?"

I don't know the exacts, but perhaps they built a driver with the full intention of maximizing the cache available to the GeForce boards. Think of disabling the L2 cache on your CPU and the type of performance hit you will take, particularly in memory intensive applications. Your system memory bandwidth wouldn't be affected, but you would lose quite a bit of performance.

"Specifically, what MHz was the fastest CPU operating at? And what was the difference between using DXT1 and DXT3 in the benchmarks with that CPU? I want to make sure there's no chance of being CPU limited in these tests."

A 900MHz T-Bird was the fastest, not that it matters. At 1600x1200 32bit UHQ under Quaver with a GeForce DDR I could likely run a Pentium 166 and not be CPU limited (hitting somewhere between 15-20FPS; I don't recall exactly off the top of my head, except that it is very slow). The FPS fluctuation, if it were any other bench, I wouldn't even mention; Quake3 is so extremely repeatable that it is worth noting. Under UT I see much larger variations running back-to-back benches changing nothing.

"Professional OpenGL apps utilise OpenGL extensions that are not used in games, correct? Could it be possible that nVidia can detect the difference between a professional app (where image quality is more important than performance) and hence use a full Z-buffer in that situation?"

That would be an interesting idea except for one very important factor: the TNT2s are slower in everything (or pretty much everything) using the Det3s. If nVidia were reverting to 16bit Z then the TNT2 would benefit significantly; instead it is slower. With the lower performance in the lower resolutions I think it is likely that they are doing some sort of very basic "HSR" on top of overall optimizations for the GeForce.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86


<< Next thing you know, people will start calling
all those nVidia driver releases what they
truly are, patches.
>>



Nah...let's try not to piss hardware off too much ;)
If there is a new version, it is a patch, update, service pack, etc.
I personally like that NVidia does this, as they seem to be able to fix more than 3Dfx did...even if they did try to cover up their bugs (Like they are the first to do so... :p).
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
Ben, DXT1 doesn't work properly on a GTS. I'm sure you will admit that. That's my point. Q3 uses it.

Just recently read about the registry setting for the latest drivers.

If it's not a problem with nvidia's, then why are they able to fix it in their drivers?

about time, too!
 

AdamK47

Lifer
Oct 9, 1999
15,846
3,638
136
BFG10K, you'll grow tired of this "anything is better than nVidia" mentality sooner or later.
 

OneOfTheseDays

Diamond Member
Jan 15, 2000
7,052
0
0
RoboTECH and BFG, give it up will ya. 3dfx is DEAD. Quit trying to put Nvidia down at every chance you get. TC on the GeForce is FINE in all games. Quake 3, with the fix, looks FINE. If you think otherwise, YOU OBVIOUSLY haven't seen it in action. And with a 64 meg GeForce you can disable TC because you have enough memory to run at high res. With TC off it is more than fast enough for online play. You need to stop with all this anti-Nvidia crap because guess what, nvidia is all we have left.
 

AdamK47

Lifer
Oct 9, 1999
15,846
3,638
136


<< nvidia is all we have left. >>



You better maneuver those bombers into a safe area. I hear the flak guns getting ready.
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
I was not aware of a registry hack that allows the GTS to use DXT3 instead of DXT1.

All I knew of was the altered Q3 executable, which doesn't work for online play basically.

Aside from the sky, how do the walls, floors and lighting look?

and please don't say "it looks fine" unless you recognized what it looks(ed?) like when it *wasn't* fine.

thx.
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
and BTW Sudheer, 3dfx the company is dead.

My 5500 still works. Go figure.

As far as the 64MB GTS running Q3 with TC disabled, I'm quite well aware of that. That's the setup I had for awhile. :) But thanks anyway.

As far as nvidia being "all that is left", guess what? You're wrong. My 5500 runs fine. Support is dead (unfortunately), but as it stands, it runs every game that I know of quite well. I know of no present game incompatibilities. Will that change in the future? Perhaps...we'll just have to wait and see.

I bet a bunch of ATi guys are torquing up their flamethrowers for you over that statement tho, hehehe....:D
 

Taz4158

Banned
Oct 16, 2000
4,501
0
0


<< I bet a bunch of ATi guys are torquing up their flamethrowers for you over that statement tho, hehehe.... >>



He's not even worth the time. Never thought I'd say that. Delusional probably equals happiness in his case.
 

OneOfTheseDays

Diamond Member
Jan 15, 2000
7,052
0
0
Well if you look at the future, we see only two companies in the 3d card business. The voodoo 5 will be eclipsed in terms of performance and its driver support will be gone. Ati has been known for NOT executing on time, and odds are that Nvidia will have the better product out there. So in reality, we really do only have Nvidia, that is unless Ati pulls out something really surprising. However, critics said this about the Radeon and the Radeon wasn't really better than its competition. Sure, it beat the geforce in some areas, but in others the geforce wins.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Robo-

"Aside from the sky, how do the walls, floors and lighting look?

and please don't say "it looks fine" unless you recognized what it looks(ed?) like when it *wasn't* fine."


The "rainbowing"? All gone. id disabled TC on lightmaps in the latest patch; you can verify this by using the r_lightmap 1 command. Completely fixed. No game should compress lightmaps, and id has fixed that mistake:)

"Ben, DXT1 doesn't work properly on a GTS. I'm sure you will admit that. That's my point. Q3 uses it."

Compared to what exactly? The V5 uses FXT1, and no one knows exactly what the Radeon is doing. I can show you screenshots from online reviews of S3 boards that have the exact same problems the GeForces do (an old Sharky review, I think of the S2K, pops into my head right off; January of this year if memory serves). When S3, who created the standard, is showing the exact same problems as the GeForce series of boards....

"Just recently read about the registry setting for the latest drivers.

If it's not a problem with nvidia's, then why are they able to fix it in their drivers?"


Perhaps a better question would be how they could possibly fix a hardware problem via drivers:)
 

oldfart

Lifer
Dec 2, 1999
10,207
0
0
Hmm...Playing around with Serious Sam Demo. There is a scripts\addons directory. There is a script file called Gfx-NVS3TC.scr, and a .des file called GfxS3TC.des.

The des files reads:

GFX: nVidia S3TC fix
fixes compressed textures on GeForce boards

The Gfx-NVS3TC.scr reads:

// set all rendering console variables to initial values

ogl_iTextureCompressionType=1;
tex_iCompressionFormat=60;
tex_iDithering=3;

RefreshTextures();

I'll be honest. I have no idea what this means. Maybe nothing. It's a little curious that there is this "fix" file in the game only for GeForce cards.


I'll let Ben take it from here :)
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,007
126
BenSkywalker:

Wrong. See the Voodoo5 launch benches compared to current drivers, or even the Radeon. That is a myth that has been held for some time; it certainly isn't the case.

That would only be incorrect if the company isn't already utilising their hardware to its maximum potential. For example, 3dfx's early drivers disabled the second chip in the V5, which hurt performance. In this case a driver update *did* improve performance in high resolution situations as well, but only because the hardware wasn't already pushed to the limit.

In order for the Det 3 drivers to improve performance, the Det 1 and Det 2 drivers must have been causing nVidia's boards to underperform. Surely nVidia wouldn't have left it like that for so long, fixing the problem only when ATi started to beat them?

Effective usage of available board level resources can certainly greatly increase the effectiveness of memory bandwidth utilization.

That statement is pure theory and does not explain exactly what it is that nVidia did.

I don't know the exacts, but perhaps they built a driver with the full intention of maximizing the cache available to the GeForce boards.

Ah, so GF based boards have a hardware cache? Now we're getting somewhere. :)
What sort of a cache is it, how big is it and how does it work? I thought only the NV20 has a cache?

900MHZ T-Bird was the fastest, not that it matters.

OK, so you're telling me that DXT3 incurs little or no performance hit compared to DXT1? In that case it really doesn't matter whether nVidia's DXT1 compression scheme has problems or not.

That would be an interesting idea except for one very important factor: the TNT2s are slower in everything (or pretty much everything) using the Det3s.

Yes, but the TNT2 is limited by its core, not by its memory. A 16 bit Z-buffer won't help it at all.

Also, in theory nVidia's new drivers should leave the TNT2's performance untouched (unless your cache theory is correct; I can see the TNT2 dropping in performance in that case). It's bizarre that the performance drops at all.

AdamK47-3DS:

BFG10K, you'll grow tired of this "anything is better than nVidia" mentality sooner or later.

What mentality is that? nVidia has a potential issue and I have a "mentality" because I want to start a discussion to see if it's true?
 

Packet

Senior member
Apr 24, 2000
557
0
0
If a few inaccuracies in TC are really that distracting from a game, then are you concentrating on the game itself, or looking for flaws in particular hardware/drivers?