VIAN

Diamond Member
Aug 22, 2003
6,575
1
0
They should never leave OGL in the dust. Not after they spent a lot of their time developing a really good one.
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Originally posted by: McArra
Originally posted by: Jeff7181
Don't NFSU and KOTOR use PS 2.0?

NFSU does, I think. I don't know if KOTOR does.

KOTOR has some metallic effects... like on armor and whatnot... I haven't played it much though, so I don't know about other stuff... I just played for a few minutes, till I got to the point where I steal the armor from those guards in the apartments.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Don't NFSU and KOTOR use PS 2.0?

KoTOR uses shaders, although I'm not aware of them being changed from the XBox version (1.1 level). Underground I honestly don't know; did EA ever get their head out of their @ss when it comes to allowing the NFS games to install under Win2K?
 

DaveBaumann

Member
Mar 24, 2000
164
0
0
None of the current boards match full DX9 specs; they actually still have a ways to go.

FYI Ben, they do meet the specifications that have HAL targets. PS/VS3.0 only has software render targets in the current DX9 runtime - there are currently no HAL targets for PS/VS3.0, and DX9 will require a runtime update to expose them.
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
Originally posted by: BenSkywalker
Native FP24? Why not go all the way and support 4bit color, or grayscale even? ;)

There is no way they are going to regress to a more primitive core at this point. ATi will be moving to FP32 at some point in time; they will have no choice in the matter.

native FP-24 precision on the FX design would yield massive performance increases if they changed NOTHING else.

No, it wouldn't. The NV3X does not perform the way we have seen simply because it uses FP32; it was other design decisions, combined with that choice, that landed it where it is. Running FP24, all else being equal, the NV3X would perform just as it does now in FP32, except with lower quality.

but I think a 100% increase is a little optimistic

It doesn't require much optimism to see a 100% increase in pixel shader performance as completely viable for the NV40 (v NV30). The big advantage of using FP32 over any other current standard is that they can combine the functionality of the vertex and pixel shaders into one large shared shader unit. Besides that, they will certainly rectify the register limitations that made themselves so apparent on the NV3X core. Those factors alone could easily see pixel shader performance up ~100%, but the NV40 is a new core, so it is far from unthinkable that they will also add more shader units anyway. How much of that will transfer to games is another matter; we need to start seeing some more shader-heavy games before we can really speak much on that aspect (as it stands now we have two, TRAoD and Halo).

Then please explain NVIDIA's stellar OpenGL performance across the board on the FX line, and their dismal DX9 shader performance across the board. It has always been my understanding that because of the FP-24 shader precision (this has NOTHING to do with color depth), the FX line has always suffered from software converting shaders from 24 up to 32, rendering, then converting back to FP24 for display in the game/bench.
 

Quixfire

Diamond Member
Jul 31, 2001
6,892
0
0
Originally posted by: Jeff7181
Originally posted by: Quixfire
I believe the next level of GPUs will be geared towards PCI Express. The sheer increase in bandwidth would allow them to stomp the last introduced video cards and start an upgrade frenzy. :D

Is the 8X AGP bus REALLY being saturated right now though?
No, because the current GPUs can process the data faster than the CPU can feed it through the AGP bus. PCI Express will allow enough bandwidth for the CPU to help render some of the data and pass it directly to the GPU.
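
For rough scale (a back-of-the-envelope sketch; the per-frame upload figure below is invented purely for illustration): AGP 8x peaks at about 2.1 GB/s in one direction at a time, while a PCI Express x16 link offers about 4 GB/s in each direction simultaneously.

```python
# Theoretical peak bus bandwidth in GB/s.
AGP_8X = 2.133               # half duplex: one direction at a time
PCIE_X16 = 4.0               # per direction, full duplex

# Hypothetical per-frame traffic (invented number, purely illustrative).
upload_mb_per_frame = 20.0   # dynamic vertex/texture uploads
fps = 60
required = upload_mb_per_frame * fps / 1024.0  # GB/s

print(f"required:          {required:.2f} GB/s")
print(f"AGP 8x headroom:   {AGP_8X - required:.2f} GB/s")
print(f"PCIe x16 headroom: {PCIE_X16 - required:.2f} GB/s (each way)")
```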

 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Originally posted by: Quixfire
Originally posted by: Jeff7181
Originally posted by: Quixfire
I believe the next level of GPUs will be geared towards PCI Express. The sheer increase in bandwidth would allow them to stomp the last introduced video cards and start an upgrade frenzy. :D

Is the 8X AGP bus REALLY being saturated right now though?
No, because the current GPUs can process the data faster than the CPU can feed it through the AGP bus. PCI Express will allow enough bandwidth for the CPU to help render some of the data and pass it directly to the GPU.

If that's the case then it's the CPU that isn't fast enough... not the bus.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Dave-

Why didn't MS include the HAL for all of the features supported? Seems odd to have refrast be able to handle the features, but not to allow hardware to use them.

Acanthus-

Then please explain NVIDIA's stellar OpenGL performance across the board on the FX line, and their dismal DX9 shader performance across the board.

They don't have dismal DX9 performance across the board; that's just the popular line of thought. ShaderMark is based on a demo written by ATi for ATi hardware - and it won't even run properly on refrast for all the tests. 3DM2K3's pixel shader test performance is quite comparable between ATi and nV using the latest Futuremark-approved driver/patch combination. TRAoD does have serious performance issues on the FX, no doubt about that, but Halo does quite well. In some instances the FX5950 is faster than the 9800XT in that PS 2.0 shader-limited game. With all of that said, nVidia certainly has the potential to run into a lot of performance issues ATi can somewhat avoid, due to their limited number of registers.

It has always been my understanding that because of the FP-24 shader precision (this has NOTHING to do with color depth), the FX line has always suffered from software converting shaders from 24 up to 32, rendering, then converting back to FP24 for display in the game/bench.

First off, 12INT, FP16, FP24 and FP32 are color depths. 12 bits integer per color component - RGBA - is 48-bit color. FP16 uses 16 bits of floating point per component - 64-bit; FP24 uses 24 bits of floating point per color component - 96-bit, etc.

If you want overbright pixels, or if you want accurate color representation after performing a lengthy series of calculations, you need higher than 32-bit accuracy in order to keep accuracy at that level for the final output. As far as the FX having to convert up and then back down, it doesn't use FP24 at all anywhere. It either uses FP16 or FP32 and then writes the end result of the shader to the framebuffer in 32-bit color (8INT). ATi's part also writes to the framebuffer in 32-bit color when using FP24 (most of the time, anyway).

The hardware isn't converting color accuracy for shader usage; it uses the level of accuracy it is set to and that's it. In order to start off with a particular exacting color of a given accuracy you would need precomputed state, and using shaders is about not having to use precomputed state (it wouldn't be viable).
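
To put numbers on the color-depth arithmetic above, here is a minimal sketch (an editorial illustration of the post, not vendor documentation): total color depth is bits per component times the four RGBA components, and whatever the internal precision, the final INT8 framebuffer write rounds each channel to one of 256 levels.

```python
# Color depth = bits per component x 4 components (RGBA).
formats = {"12INT": 12, "FP16": 16, "FP24": 24, "FP32": 32}
for name, bits in formats.items():
    print(f"{name}: {bits} bits/component -> {4 * bits}-bit color")
# 12INT -> 48-bit, FP16 -> 64-bit, FP24 -> 96-bit, FP32 -> 128-bit

# Framebuffer writeback: the shader result is rounded to 8 bits/channel.
internal = 0.7361902               # a high-precision channel value in [0, 1]
stored = round(internal * 255)
print(f"framebuffer byte: {stored}")  # 188; nearby values round the same way
```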
 

Blastman

Golden Member
Oct 21, 1999
1,758
0
76
3DM2K3's pixel shader test performance is quite comparable between ATi and nV using the latest Futuremark-approved driver/patch combination.

Futuremark found the cheats in the PS 2.0 tests?

The 52.16 drivers have 3DMark03-specific optimization for the Pixel Shader 2.0 test, and that score is solely comparable between nvidia cards.

 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
Originally posted by: BenSkywalker
Dave-

Why didn't MS include the HAL for all of the features supported? Seems odd to have refrast be able to handle the features, but not to allow hardware to use them.

Acanthus-

Then please explain NVIDIA's stellar OpenGL performance across the board on the FX line, and their dismal DX9 shader performance across the board.

They don't have dismal DX9 performance across the board; that's just the popular line of thought. ShaderMark is based on a demo written by ATi for ATi hardware - and it won't even run properly on refrast for all the tests. 3DM2K3's pixel shader test performance is quite comparable between ATi and nV using the latest Futuremark-approved driver/patch combination. TRAoD does have serious performance issues on the FX, no doubt about that, but Halo does quite well. In some instances the FX5950 is faster than the 9800XT in that PS 2.0 shader-limited game. With all of that said, nVidia certainly has the potential to run into a lot of performance issues ATi can somewhat avoid, due to their limited number of registers.

It has always been my understanding that because of the FP-24 shader precision (this has NOTHING to do with color depth), the FX line has always suffered from software converting shaders from 24 up to 32, rendering, then converting back to FP24 for display in the game/bench.

First off, 12INT, FP16, FP24 and FP32 are color depths. 12 bits integer per color component - RGBA - is 48-bit color. FP16 uses 16 bits of floating point per component - 64-bit; FP24 uses 24 bits of floating point per color component - 96-bit, etc.

If you want overbright pixels, or if you want accurate color representation after performing a lengthy series of calculations, you need higher than 32-bit accuracy in order to keep accuracy at that level for the final output. As far as the FX having to convert up and then back down, it doesn't use FP24 at all anywhere. It either uses FP16 or FP32 and then writes the end result of the shader to the framebuffer in 32-bit color (8INT). ATi's part also writes to the framebuffer in 32-bit color when using FP24 (most of the time, anyway).

The hardware isn't converting color accuracy for shader usage; it uses the level of accuracy it is set to and that's it. In order to start off with a particular exacting color of a given accuracy you would need precomputed state, and using shaders is about not having to use precomputed state (it wouldn't be viable).

If this is true, NVIDIA would have a noticeable IQ advantage over ATi, which they don't.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Hadn't seen the PS2.0 comments from FM, haven't checked back since they did their last major update. Interesting.

If this is true, NVIDIA would have a noticeable IQ advantage over ATi, which they don't.

They would if pushed far enough. Because they are writing back to an INT8 framebuffer, you need a situation where rounding causes enough error to remain visible even after the write back to an INT8 FB. This is the big concern with nV's FP16 not being enough accuracy: it took some time for people to figure out how to show a disparity between FP16 and FP24, and it would be much more difficult to do the same between FP24 and FP32. The advantages of FP24 over FP16 are questionable in the majority of situations we have seen to date; FP32 over FP24 would be even more difficult to show conclusively to be required with anything current or on the horizon, as that level of accuracy simply isn't needed for consumer applications (in the pro market it isn't difficult at all to write shaders that can show the difference, however these run far too slow to be viable for real-time gaming).
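
As a toy illustration of that rounding argument (this models only mantissa width - FP16 keeps 10 mantissa bits, ATi's FP24 keeps 16, FP32 keeps 23 - not real GPU arithmetic, and the operation chain is invented): run the same long chain of dependent operations at each precision and see whether the differences survive the final INT8 writeback.

```python
import math

def quantize(x, mant_bits):
    """Crudely round x to mant_bits mantissa bits (ignores exponent
    range and denormals - good enough for illustration)."""
    if x == 0.0:
        return 0.0
    exp = math.floor(math.log2(abs(x)))
    scale = 2.0 ** (mant_bits - exp)
    return round(x * scale) / scale

FORMATS = {"FP16": 10, "FP24": 16, "FP32": 23}  # mantissa bits

def run_chain(mant_bits, steps=50):
    """An invented chain of dependent multiply-adds, rounding every
    intermediate result, ending with a write to an INT8 channel."""
    x = 0.123456
    for _ in range(steps):
        x = quantize(x * 1.113 + 0.0071, mant_bits)
        x -= math.floor(x)        # wrap into [0, 1) like a color channel
    return round(x * 255)         # INT8 framebuffer writeback

for name, bits in FORMATS.items():
    print(name, run_chain(bits))
```

With only a few steps all three typically produce the same byte; differences only emerge after long dependent chains - which is the point about needing to push hard before a precision gap shows up in an INT8 output.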

The big advantage to using FP32 for pixel shaders is that vertex shaders require FP32, and some of the functionality of the units can be combined to make a more general-purpose unit moving forward.

You don't have to believe me if you don't want to; check any quarter-way decent review of the new cores from when they debuted, or check the IHVs' websites, or ask any knowledgeable member of any forum: 12INT, FP16, FP24 and FP32 are all color standards.
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
Originally posted by: BenSkywalker
Hadn't seen the PS2.0 comments from FM, haven't checked back since they did their last major update. Interesting.

If this is true, NVIDIA would have a noticeable IQ advantage over ATi, which they don't.

They would if pushed far enough. Because they are writing back to an INT8 framebuffer, you need a situation where rounding causes enough error to remain visible even after the write back to an INT8 FB. This is the big concern with nV's FP16 not being enough accuracy: it took some time for people to figure out how to show a disparity between FP16 and FP24, and it would be much more difficult to do the same between FP24 and FP32. The advantages of FP24 over FP16 are questionable in the majority of situations we have seen to date; FP32 over FP24 would be even more difficult to show conclusively to be required with anything current or on the horizon, as that level of accuracy simply isn't needed for consumer applications (in the pro market it isn't difficult at all to write shaders that can show the difference, however these run far too slow to be viable for real-time gaming).

The big advantage to using FP32 for pixel shaders is that vertex shaders require FP32, and some of the functionality of the units can be combined to make a more general-purpose unit moving forward.

You don't have to believe me if you don't want to; check any quarter-way decent review of the new cores from when they debuted, or check the IHVs' websites, or ask any knowledgeable member of any forum: 12INT, FP16, FP24 and FP32 are all color standards.

That makes a lot more sense. When I said it had nothing to do with color depth, I meant that it had nothing to do with desktop/overall game color depth.
 

DaveBaumann

Member
Mar 24, 2000
164
0
0
Why didn't MS include the HAL for all of the features supported? Seems odd to have refrast be able to handle the features, but not to allow hardware to use them.

Probably because there is no hardware that can support it at the moment, so there is no way of testing a HAL layer. Plus, MS get to release some nice PR and force lots of people to visit their site to update their runtime if/when PS/VS3.0 hardware becomes available.

ShaderMark is based on a demo written by ATi for ATi hardware - and it won't even run properly on refrast for all the tests.

ShaderMark 1 was; ShaderMark 2 was written from scratch, AFAIK, and mainly via HLSL.

Halo does quite well. In some instances the FX5950 is faster than the 9800XT in that PS 2.0 shader-limited game.

Actually, Halo isn't necessarily running the same code for ATI and NVIDIA. There are some features (the predator effect, for example) that wouldn't run on NVIDIA, so they did something different for FX hardware. It would be interesting to find out exactly what path differences there are.

The big advantage to using FP32 for pixel shaders is that vertex shaders require FP32 and some of the functionality of the units can be combined to make a more general purpose unit moving foreward.

The fact that the shader is FP32 or not isn't really the stumbling block here - it's pretty much an inconsequential issue. The real issue for unified shader models is the control of program execution, and that's going to be one of the real focuses for DX 10/Next hardware to get right. This could end up being one of the largest performance differentiators for the 2005/6 generation of hardware (given roughly equivalent processes / die sizes, etc.).
 

kylebisme

Diamond Member
Mar 25, 2000
9,396
0
0
Originally posted by: VIAN
I expect nVIDIA to pull ahead in this next battle. I feel that they are being tight lipped for a very good reason.

blind fanboyism?
 

Regs

Lifer
Aug 9, 2002
16,666
21
81
Nvidia pull ahead? I thought they were already ahead, apart from a misfortune with Microsoft.

"Also, current rumors put the R420 as eight-pipeline but much-enhanced (meaning more shader power per clock). As it seems both companies will stick with 256-bit memory buses for the upcoming gen, more than eight pixel pipes is somewhat wasteful. " - Pete

Very handy piece of information. I'm sure there are still many people out there thinking the R420 will have double the pipelines, forgetting that a 256-bit bus won't be able to feed them.

Even when these rumored specs are solidified for both Nvidia's and ATi's new generation of cards, we're still a long way from finding out which core and which company will offer the prime performance for that generation of games. Even today, it seems people are still debating which card (ati vs nvidia) is better for 3D games.
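
A rough sketch of the bus-feeding arithmetic (the clock speeds are round illustrative numbers, not leaked specs): doubling the pipes on the same 256-bit bus halves each pixel's share of memory bandwidth.

```python
# Available memory bandwidth on a 256-bit DDR bus (illustrative clocks).
bus_bits, mem_mhz = 256, 500
bandwidth = bus_bits / 8 * mem_mhz * 1e6 * 2   # bytes/s (DDR = two transfers)

core_mhz = 500
for pipes in (8, 16):
    pixels_per_sec = pipes * core_mhz * 1e6
    per_pixel = bandwidth / pixels_per_sec
    print(f"{pipes} pipes: {per_pixel:.1f} bytes of bandwidth per pixel")
# A color write alone is 4 bytes; add Z traffic and texture reads and
# the 16-pipe configuration is clearly starved on the same bus.
```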
 

Mingon

Diamond Member
Apr 2, 2000
3,012
0
0
The current NV35 has quite a powerful vertex shader unit; it's the pixel shaders that currently hold it back. I would guess that 25 million of the extra transistors added another two pipelines onto the NV35 (so 6 x 2) and that the other 25 million added transistors were mostly aimed at increasing pixel-shading power.

So my guess is 175 million transistors, a 550MHz ultra version with a 6 x 2 pipeline config, and double the PS units alongside the additional VS units for the other two pipelines.
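
Taking that guess at face value, the raw fill arithmetic would look something like this (the 6 x 2 / 550MHz figures are Mingon's speculation; the 450MHz NV35 clock is the shipping 5900 Ultra):

```python
nv35_fill = 4 * 450e6    # NV35: four color pixels per clock at 450MHz
guess_fill = 6 * 550e6   # speculated part: six pixels per clock at 550MHz

print(f"NV35:  {nv35_fill / 1e9:.2f} Gpix/s")
print(f"guess: {guess_fill / 1e9:.2f} Gpix/s ({guess_fill / nv35_fill:.2f}x)")
```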
 

kylebisme

Diamond Member
Mar 25, 2000
9,396
0
0
lol, I suppose in that case nvidia would have to claim it is a 12-pipe card unless they want to admit that their 5800-5950s are 4-pipers.
 

Mingon

Diamond Member
Apr 2, 2000
3,012
0
0
It's an odd one, the nvidia layout; I am sure Dave Baumann could explain it better. I thought it was more flexible than a traditional fixed 4-pipeline design and able to process as an 8-pipeline unit under most circumstances.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Dave-

Probably because there is no hardware that can support it at the moment, so there is no way of testing a HAL layer.

It wouldn't be the first time that MS has shipped features in DX before they were supported in hardware though.

Actually, Halo isn't necessarily running the same code for ATI and NVIDIA.

We have heard from Bungie (or was it someone from Gearbox, not sure which) that ATi isn't running all of the features, although I put little credit in the statement; it seemed too Newellesque to give it much weight.

There are some features (the predator effect, for example) that wouldn't run on NVIDIA, so they did something different for FX hardware.

Active Camouflage? What exactly is different? Does it look noticeably worse on ATi's parts, like the flashlight, or is it something that isn't that noticeable? I'll have to look for it next time I get a chance to play the game with an R3x0.

The fact that the shader is FP32 or not isn't really the stumbling block here - it's pretty much an inconsequential issue.

The hardware being FP32 is rather important, and I was talking about the shader units in particular. Obviously shader precision is not the only factor.

TheSnowman-

lol, I suppose in that case nvidia would have to claim it is a 12-pipe card unless they want to admit that their 5800-5950s are 4-pipers.

Not necessarily. If they did move to 6x2 (which I doubt), it doesn't mean that they would be able to process twelve pixels per clock, any more than the NV30/35's layout means it can always process eight pixels per clock.

blind fanboyism?

It could be argued that ATi's refresh might have trouble with nV's new core just due to the way their lineups match up for this round (not that it will, but based on history that wouldn't be a shocking outcome). I certainly wouldn't rule out PVR, however (which I assume VIAN did while making that comment).

Mingon-

I thought i was more flexible than a traditional fixed 4 pipeline design and able to processs as an 8 pipeline unit under most circumstances.

Not under most, just in special cases. If the chip is processing pixels with no color data it can operate as an eight-pixel-pipe part, which is one of the reasons why the performance of the NV3X parts is out of whack in Doom 3 relative to everything else (in a good way for them).
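
For concreteness, the arithmetic behind that, using the FX 5800 Ultra's 500MHz core clock: the same chip doubles its pixel rate when there is no color data to write.

```python
core_mhz = 500                  # GeForce FX 5800 Ultra core clock
color_pixels_per_clock = 4      # normal rendering: four color pixels
z_only_pixels_per_clock = 8     # depth/stencil-only passes: eight

print(f"color fill:  {core_mhz * 1e6 * color_pixels_per_clock / 1e9:.1f} Gpix/s")
print(f"z-only fill: {core_mhz * 1e6 * z_only_pixels_per_clock / 1e9:.1f} Gpix/s")
# Doom 3 draws its depth/stencil shadow passes with no color data,
# which is exactly the case where the NV3X runs at the doubled rate.
```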
 

kylebisme

Diamond Member
Mar 25, 2000
9,396
0
0
lol Ben, sure it is quite possible that nvidia could take the lead, but best I can tell, VIAN saying that he thinks they will without any supporting evidence is nothing more than wishful thinking. Also, if you are going with the opinion that the R420 is a "refresh" and your idea of history leads you to presume the performance increase will be similar to the R350 and R360 refreshes, you are greatly underestimating the situation.
 

Regs

Lifer
Aug 9, 2002
16,666
21
81
presume the performance increase will be similar to the R350 and R360 refreshes, you are greatly underestimating the situation.

That could just be pure marketing as well, just like how some got caught buying a 9800 with the expectation of playing HL2 on it.
 

kylebisme

Diamond Member
Mar 25, 2000
9,396
0
0
na, the R420 is confirmed to be 0.13 micron, so it is clearly more than a refresh of the 0.15 micron R3xx core. It is more of a bastard child of the R3xx, being basically a souped-up double RV360, with the RV3xx being a half R3xx redesigned for the 0.13 micron process. Regardless, how the R420 will compare to the NV40 is a matter of idle speculation for us until we start to see some solid confirmation of specs.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
you are greatly underestimating the situation

Kind of like underestimating when ATi said the 9600XT would be faster than the 9700Pro (forget the 9500Pro) in all situations and I laughed openly at them? ;)

I have no doubt that the "R420" will be a bigger, errr, improvement than the 98xx was over the 97xx, but what is that really saying? ATi has increased their performance by about 10% over the last ~18 months and they still don't have a new core. I'm aware of them rolling portions of the R400 into the "R420" - a core so well designed they dropped it completely (inspires awe, to be sure ;) ).

If you have faith that ATi is going to release the fastest part this upcoming gen, you must assume that nVidia is going to screw up badly. ATi should be able to reclaim the crown this fall, when the generational misalignment will fall in their favor and the R500 goes up against the NV45. I see the battle this round being between PowerVR (if they can ship) and nVidia. nVidia has one real weakness this generation: pixel shader 2.0 performance. They haven't been known to repeat a performance deficit for two generations.

PowerVR's part should be very interesting; as long as they have their long-standing issues with geometric throughput solved, it could be a very, very solid part. TBRs should offer AA with nigh no performance hit this gen (~1%). Being able to run 1600x1200 w/6x AA at the same speed as no AA should give them a major edge over the competition. Also, due to the complete removal of overdraw, their pixel shader performance should be competitive with, if not superior to, nV's or ATi's (depending on where they aim the part). Geometric throughput and texture filtering have to be improved significantly over the prior generation, but their drivers for the Kyro2 were far more solid than ATi's drivers are to date on any of their parts, so they have that going for them also (not quite nV level, but that is in no small part due to having to offer workarounds for so many games because of their rendering technique).
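
To illustrate the overdraw point with made-up numbers (and note that immediate-mode parts claw some of this back with early-Z rejection): an immediate-mode renderer shades every covered fragment, while a tile-based deferred renderer shades roughly one fragment per visible pixel.

```python
pixels = 1600 * 1200
avg_overdraw = 3.0      # invented scene depth complexity
cost = 20               # invented shader ops per shaded fragment

imr_ops = pixels * avg_overdraw * cost   # IMR shades hidden fragments too
tbr_ops = pixels * 1.0 * cost            # TBR defers: visible pixels only

print(f"IMR: {imr_ops / 1e6:.0f}M shader ops/frame")
print(f"TBR: {tbr_ops / 1e6:.0f}M shader ops/frame "
      f"({imr_ops / tbr_ops:.1f}x less pixel work)")
```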

PowerVR's big problem is likely to come from the fanatics who will accuse them of cheating on every game they offer workarounds for, due to the nature of their part (which will be BS, of course). Other than that, they should have a very competitive part to keep pressure on nVidia in terms of pricing for this upcoming gen.