G80 Physx ETA

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Originally posted by: lopri
Originally posted by: taltamir
You don't need to hack the PhysX drivers... you install them as is; whether they work or not depends on your video card drivers, which can be hacked. I got it working on an 8800GTS 512 (G92), and I hear some people got it working on G80.

You could also just wait a little bit and nvidia will release official support, eventually.
What I'm wondering is:

1. There is an unused part of a GPU while rendering 3D scenes, and that is what accelerates PhysX.
2. The processing unit is the same for 3D rendering and PhysX and there is a compromise on rendering, but the net result is a plus. (say lose 3 FPS on rendering but gain 5 FPS from PhysX? or something like that)

Which scenario is it? If it's #2, I'd think there'd be a drivers nightmare.

I think an ideal scenario is having a dedicated GPU. This of course would not be SLI because the GPUs need not be the same. And you would be able to keep using your old video card for physX while your new card did the rendering.

But I am still unsure what the exact mechanism of GPU PhysX is.

In a PhysX-intensive situation:
A 3GHz quad core working at 100% can only calculate the PhysX + graphics precalculations for 5-10 fps.
The video card is sitting idle, only receiving the "precalculations" from the CPU for that many frames.

With CUDA PhysX the CPU only calculates AI + graphics precalculations... say, at 20% utilization... while the GPU dedicates, say, 70% of its power to the video and 30% to the PhysX. And it achieves, say, 25fps... had the video card had more performance, it could speed up both the PhysX and the graphics rendering further.

Actually, v2 of the PhysX engine (which was out before nvidia bought Ageia) can calculate on both the CPU + GPU AT ONCE...
So in the above situation the CPU uses, say, 20% for AI and precalculations and 80% for PhysX, which leaves the GPU spending only 20% of its power on PhysX (instead of 30%), raising its graphics share from 70% to 80% of GPU utilization and increasing performance accordingly.

These numbers are all made up, but they are about right. The GPU has much, much more power when it comes to raw arithmetic, which is exactly what is needed for graphics and for PhysX.

AFAIK the 9800GTX has about 10-20 times the raw arithmetic power (i.e., what is used in physics, graphics, cryptography, and a few other types of calculations) of a $1000 3GHz quad-core Extreme, and the GTX 280 twice that.

http://en.wikipedia.org/wiki/FLOPS
Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.2 GHz. The 80-core chip can increase this to 1.8 TFLOPS at 5.6 GHz, although the thermal dissipation at this frequency exceeds 260 watts.

As of 2007, the fastest PC processors (quad-core) perform over 30 GFLOPS.[11] GPUs in PCs are considerably more powerful in pure FLOPS. For example, in the GeForce 8 Series the nVidia 8800 Ultra performs around 576 GFLOPS on 128 Processing elements. This equates to around 4.5 GFLOPS per element, compared with 2.75 per core for the Blue Gene/L. It should be noted that the 8800 series performs only single precision calculations, and that while GPUs are highly efficient at calculations they are not as flexible as a general purpose CPU.

Note that the very latest cards from AMD and nVidia reach nearly a teraflop by themselves,
while the very latest quad core from Intel today is about 40 gigaflops (the article says 30, but it is outdated).

So... a 900+ gigaflop video card for $200 compared to a 40 gigaflop $1000+ CPU...
Or a 900+ gigaflop video card for $200 compared to a 20-30 gigaflop $200 CPU.

As you can see, for things that require raw floating-point operations per second, a CPU is simply not even close to comparable!

Intel's experimental 80-core 3.2GHz Polaris chip gets about the same FLOPS as a single video card.
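
To make the arithmetic behind these figures concrete, here is a minimal host-side sketch of how the theoretical peak numbers quoted in this thread are derived: ALU count × shader clock × FLOPs issued per ALU per clock. The clocks and issue widths are the commonly cited specs for these parts, and the 40 GFLOPS CPU figure is simply the thread's own number taken as given, so treat the ratios as rough.

```cpp
// Rough sketch of where the thread's GFLOPS figures come from.
// Peak single-precision FLOPS = ALUs * shader clock (GHz) * FLOPs per ALU per clock.
#include <cstdio>

static double peak_gflops(int alus, double shader_clock_ghz, int flops_per_clock) {
    return alus * shader_clock_ghz * flops_per_clock;
}

int main() {
    // GeForce 8800 Ultra: 128 SPs at ~1.5 GHz, MAD+MUL = 3 FLOPs/clock -> ~576 GFLOPS
    double g80   = peak_gflops(128, 1.5, 3);
    // GTX 280 (GT200): 240 SPs at ~1.3 GHz -> ~933 GFLOPS
    double gt200 = peak_gflops(240, 1.296, 3);
    // Radeon HD 4870 (RV770): 800 ALUs at 750 MHz, MAD = 2 FLOPs/clock -> 1200 GFLOPS
    double rv770 = peak_gflops(800, 0.750, 2);
    // Quad-core CPU figure quoted in the thread, taken as given rather than derived.
    double cpu = 40.0;

    printf("8800 Ultra: %.0f GFLOPS (%.0fx the CPU)\n", g80,   g80   / cpu);
    printf("GTX 280:    %.0f GFLOPS (%.0fx the CPU)\n", gt200, gt200 / cpu);
    printf("HD 4870:    %.0f GFLOPS (%.0fx the CPU)\n", rv770, rv770 / cpu);
    return 0;
}
```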
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: MarcVenice
Still not convinced, Keys. How many games are like that one single UT3 map? I'd still rather enable AA than PhysX that my CPU could run just as well. And I still don't know what settings were used in that test. And I'm not convinced about your 8800GTS 640MB story either. Two things: first, why did new-gen cards double or triple their shaders if they sit idle in CoD4, one of the better-looking games out right now? ATI cards perform very well in CoD4; it might have something to do with the increase in shader power, might not. Second thing, although I might be completely wrong here (I don't know too much about GPU architecture), but what if 30-some shaders sit idle on ALL G80 8800GTSs, 320/640MB cards? Simply because they are perhaps memory-bandwidth starved, or short on ROPs or TMUs, in essence an unbalanced card, whereas the 9600GT was a better-balanced card?

Why don't you just ask Azn if you don't believe me. Save me the trouble of going back and forth 50 posts with ya.

EDIT: And SickBeast. He tried it as well. And Marc, if you are wondering what part of the GPU runs PhysX, it is the shaders, whether they are otherwise in use or not. That has already been said. If there are free shaders for any given game, chances are there will be less of a performance hit when PhysX is enabled. The opposite is also true: if all shaders are being used in a given game, then enabling PhysX will obviously have a larger performance hit. No need for you to be convinced. That is basically how it is.

If you're looking for a detailed explanation of how scheduling for PhysX instructions is done, I can ask, if you REALLY need to know. Otherwise, I don't see how it would matter anyway. It's done the way it's done.

As for your "two things":

"first, why did new-gen cards double or triple their shaders if they sit idle in CoD4,"

Maybe so they would run PhysX better? And more compute power for other things? These cards aren't only about gaming, remember.

"Second thing, although I might be completely wrong here (I don't know too much about GPU architecture), but what if 30-some shaders sit idle on ALL G80 8800GTSs, 320/640MB cards?"

Well, that is a big if. I'm sure that every single game out there has its own "weight" on any given graphics card. CoD4 did not diminish in performance until we went below 64 shaders. Another game may benefit from more shaders being available to it. I mean, of course there are games that do; the 9600GT doesn't equal the 8800GT very often. It comes close, but not often.





 

MarcVenice

Moderator Emeritus
Apr 2, 2007
5,664
0
0
I didn't say I didn't believe your results in CoD4; they just didn't convince me. And SickBeast more or less confirmed it for me. Read my previous post and respond to that. That should get us somewhere.
 

hooflung

Golden Member
Dec 31, 2004
1,190
1
0
Originally posted by: taltamir
*snip*

That is a lot of technobabble to say you haven't the foggiest idea what it does yet.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: MarcVenice
I didn't say I didn't believe your results in CoD4; they just didn't convince me. And SickBeast more or less confirmed it for me. Read my previous post and respond to that. That should get us somewhere.

EDITED my previous post. Take a look.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: JPB
Originally posted by: keysplayr2003
Originally posted by: MarcVenice
Looks like something pretty useless to me, at least on anything G80-derived. If the 9800GTX doesn't show any real improvements, no other card will either. We'll just have to wait for someone to review the GTX2*0 with CUDA. That MIGHT improve the value of those cards by a little, since they might be too powerful for most games at 1680*1050, but then again, as soon as they aren't powerful enough anymore CUDA will be useless again. And how many games actually support it?

UT3 performance gain running PhysX on GPU instead of CPU

Performance jumped from 31fps when physics were run on the CPU to around 51fps at 1680x1050.
At this early stage, it doesn't sound too useless to me. You will lose a few fps when enabling PhysX in any given game, but that is akin to adding eye candy of any sort, just as increasing AA will give you a performance hit. This was on a 9800GTX.

That is kind of interesting. So PhysX *does* run on the 9800GTX?

Yes, it runs on the 9800GTX, GTX260 and GTX280. Nvidia will release support soon for all cards G8x and up. So you could use your 8800GTS640 or 8800GT, etc. etc.
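
Since the question of which cards qualify keeps coming up: GPU PhysX requires a CUDA-capable part, i.e. compute capability 1.0 or higher (G80 and later). Below is a minimal, hypothetical sketch using the standard CUDA runtime API to list what the driver sees on your system; it only illustrates the capability check and is not anything the PhysX installer itself runs.

```cpp
// List CUDA-capable devices and their compute capability (1.0+ = G8x or later).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable GPU found.\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s, compute capability %d.%d, %d multiprocessors\n",
               dev, prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    }
    return 0;
}
```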
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Originally posted by: hooflung
*snip*

That is a lot of technobabble to say you haven't the foggiest idea what it does yet.

No, that is what is called a "hypothetical situation".

The reason the numbers are made up is because IT VARIES BY GAME, GPU, AND CPU! It was a rough example.
The important numbers are the FPS in a PhysX-intensive situation, like the tornado level, where the FPS really does go up by that much when you enable GPU PhysX.

Read the bottom of the post for some harder numbers.
Also, "technobabble" and "jargon" are when a person uses complicated terms and obfuscates the truth. I was simplifying things so anyone could understand them, eschewing a specific example and numerical accuracy.
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Originally posted by: MarcVenice
SickBeast, that conclusion is still +1 for me. Only in CoD4 were there shaders not being used, and in a 'bunch' (which ones?) of games disabling SPs immediately resulted in lower performance? Meaning no SPs to do physics acceleration? Or am I wrong in the assumption that only the shaders can run the physics? If someone happens to have a nice, understandable link where I can see what part of a video card does what, it'd be nice :p
We ran these tests prior to the existence of PhysX acceleration within the NV driver.

The conclusion we reached was that certain games do not require all of the shaders within the GPU. There is a limit to the complexity of the code for each application.

What this tells us is that there are situations where several of the SPs in a GPU are not being utilized at all. If NV can get them to perform PhysX instructions instead of sitting there idle, I personally think it's a great idea.

As an aside, I only score between 5-10fps on the PhysX levels in UT3. I have a crappy CPU. For people like me, PhysX acceleration is incredibly useful and I'm hoping it can help stave off a platform upgrade.

In terms of the games we tested, I personally used Far Cry and COD4. Far Cry slowed way down when the shaders were turned off, but there was no difference whatsoever in COD4.

As far as I remember, several others tested a bunch of the modern games at the time (this is going back 6 months or so). Disabling the SPs usually had an impact, but it varied. BFG did the most extensive testing.
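
For anyone wondering what kind of work those otherwise-idle shaders would actually be doing, here is a toy sketch (plain CUDA, not the PhysX SDK) of the sort of embarrassingly parallel math a physics step boils down to: one thread per particle, a few multiply-adds each. The Particle struct, kernel, and launch parameters are all made up for illustration.

```cpp
// Toy data-parallel physics step: each thread integrates one particle under gravity.
#include <cuda_runtime.h>

struct Particle { float x, y, z, vx, vy, vz; };

__global__ void integrate(Particle* p, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    p[i].vy -= 9.81f * dt;      // gravity
    p[i].x  += p[i].vx * dt;    // explicit Euler step
    p[i].y  += p[i].vy * dt;
    p[i].z  += p[i].vz * dt;
    if (p[i].y < 0.0f) {        // crude ground-plane bounce
        p[i].y  = 0.0f;
        p[i].vy = -0.5f * p[i].vy;
    }
}

// Host side, one launch per simulation step, e.g.:
//   integrate<<<(n + 255) / 256, 256>>>(d_particles, n, 1.0f / 60.0f);
```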
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Originally posted by: SickBeast
I've managed to run Folding@Home on my G80 card and I'm getting 1100 iter/sec vs. 50 or so when I just use my CPU.

That said, I can't get Physx games like UT3 working properly. The lighthouse map is a complete slideshow, along with the other Physx maps.

Does anyone here know when we can expect physx support for G80? Rollo and Keysplayr, are you guys privy to that sort of info, and are you allowed to share it with us?

Thanks. :)

:beer:

That thing is so borged right now. I'm getting 1300+ iter/sec on my 3870 but only ~2100 PPD when I don't use the GPU at all... :| Can't wait for the next version...
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Originally posted by: plion
Interesting. Does it apply to any game or only PhysX-supported games? Can it work on a game like Company of Heroes or World of Warcraft?

world of warcraft barely has GRAPHX, much less PHYSX.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Originally posted by: MarcVenice
Still not convinced, Keys. How many games are like that one single UT3 map? I'd still rather enable AA than PhysX that my CPU could run just as well. And I still don't know what settings were used in that test. And I'm not convinced about your 8800GTS 640MB story either. Two things: first, why did new-gen cards double or triple their shaders if they sit idle in CoD4, one of the better-looking games out right now? ATI cards perform very well in CoD4; it might have something to do with the increase in shader power, might not. Second thing, although I might be completely wrong here (I don't know too much about GPU architecture), but what if 30-some shaders sit idle on ALL G80 8800GTSs, 320/640MB cards? Simply because they are perhaps memory-bandwidth starved, or short on ROPs or TMUs, in essence an unbalanced card, whereas the 9600GT was a better-balanced card?

Keys, bfg, and another (azn maybe?) got really involved in that shader debate. I just did a brief search without success but it's back there somewhere... I was very surprised to see the results that they got, as were many of us, but I can tell you that what keys says is true. I think that bfg ended it with a long post about WHY they weren't using all the shaders...maybe he'll chime in here in a little while.
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
Originally posted by: taltamir
oh, found a little gem:
http://www.anandtech.com/video/showdoc.aspx?i=3341&p=7
AMD RV770 1200 gigaflops
NVIDIA GT200 933 gigaflops

Compare that to the 40 GFLOPS of a $1000 Intel quad core and you realize why graphics and physics NEED a GPU to calculate them.
The G92 is ~670 GFLOPS and the 3870 is ~900, if I remember correctly.

Actually... ignoring the fact that theoretical peak performance in GFlops isn't always the best measure of performance, what kind of calculations does physics require?

http://www.anandtech.com/video/showdoc.aspx?i=3341&p=3
Slightly OT-ish, but since the Radeons' stream processors come in groups of five, four of one type and one of another, can physics calculations run on any of the five, or are they limited to the one more advanced stream processor?
Because AT says that for ATI there's a range of 160 to 800 usable stream processors depending on the instruction types. So if physics is one of the workloads that can run on any stream processor, then they could definitely benefit from physics code on the GPU in any situation where they are running at less than maximum throughput due to the graphics code's instruction mix.

(If I managed to explain myself clearly)
 

Sylvanas

Diamond Member
Jan 20, 2004
3,752
0
0
Originally posted by: Lonyo
*snip*

I had this train of thought after reading the review, and I think you have hit the nail on the head. Keeping as many stream processors in use at one time is the job of ATI's compiler, and it is obviously doing a good job now, but if physics could be offloaded to the GPU, I would imagine that in situations where not all 800 stream processors were in use they could achieve even better results. Now that would be awesome! I think in the coming years GPU-accelerated physics is going to be a hot topic, be it Havok vs PhysX, or whether Microsoft will put the hammer on it all and make it unified in DX11.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
This is exactly what PhysX needs; that is why I pointed it out.
CPUs can do many different kinds of calculations and are far more flexible than a GPU. That is why your software runs on a CPU and not on a GPU. But physics, graphics, and the like run better on the GPU.
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
I'm not convinced at all that GPU physics is a good idea. Given that the GPU is much more powerful at number crunching than a CPU, in most modern games you will need all those FLOPS for rendering, while we still have games that barely make full use of 2 CPU cores, never mind 4. So instead of making full use of the available CPU power, you have a case where the game takes away from the GPU's rendering to do physics.

And what exactly will the extra physics look like? If it's a bunch of extra debris from explosions and such, you will need to render all the extra stuff as well, so that's two things taking away from rendering. I'd much rather see things like fully destructible environments, or something else that makes for a new gameplay experience, and those kinds of effects will need to interact with the player and AI, which means they should eventually end up on the CPU anyway, not just a bunch of extra stuff to render which otherwise has no effect on gameplay.
 

aka1nas

Diamond Member
Aug 30, 2001
4,335
1
0
You've already seen what CPU physics looks like, i.e. Half-Life 2 and Oblivion. You'll end up with a handful of objects that have physics enabled, because that's all the CPU can handle. We won't get CPUs that can handle fully destructible environments until both ISVs switch to fully heterogeneous multi-core architectures. Regular old x86 cores probably won't scale fast enough to handle the workload.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
The GPU needs that extra power to render graphics, true. But diverting 10% of the GPU's render power still gives you about 100 gigaflops...
While a CPU has 20 (a normal dual core) to 40 (Nehalem, not out yet).

The thing is, that is STILL not enough. From what I can see in existing PhysX usage, it is entirely underwhelming, and until you have 3 times the power of a GTX280 or 4870 dedicated to physics it will remain underwhelming.
CPU physics = non-existent.
GPU physics = underwhelming. Whoop, the wall can collapse in large bricks when I hit it with a rocket.
Semi-realistic physics = way in the future. (The wall collapses realistically, bricks break apart, items never disappear from the level, and the entire building collapses realistically instead of just a specific wall marked as "destructible".)

CPU: Cloth moves like steel. Damage causes a predefined cloth tear that has nothing to do with the actual hit (i.e., you hit him on the head, he gets cloth torn off his sleeve).
GPU: Cloth moves, not realistically, but it moves. It tears where it is hit, but not in a realistic way yet.
Future: Cloth will tear realistically where you hit it.

CPU: Liquids are non-interactive. They go through predetermined patterns; at best a switch can cause a predetermined effect to take place.
GPU: Liquid can spill and move anywhere, but not very realistically yet.
Future: Liquids will move realistically.

CPU: A non-destructible environment, or a few destructible objects that simply overlay an explosion and disappear behind the smoke.
GPU: Actual destruction can be observed: a wall will collapse rather than just disappear behind a canned explosion, but it will not do so realistically, nor will the entire building collapse.
Future: Everything will be destructible, and realistically so.
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
Originally posted by: munky
I'm not convinced at all that GPU physics is a good idea. Given that the GPU is much more powerful at number crunching than a CPU, in most modern games you will need all those FLOPS for rendering, while we still have games that barely make full use of 2 CPU cores, never mind 4. So instead of making full use of the available CPU power, you have a case where the game takes away from the GPU's rendering to do physics.

And what exactly will the extra physics look like? If it's a bunch of extra debris from explosions and such, you will need to render all the extra stuff as well, so that's two things taking away from rendering. I'd much rather see things like fully destructible environments, or something else that makes for a new gameplay experience, and those kinds of effects will need to interact with the player and AI, which means they should eventually end up on the CPU anyway, not just a bunch of extra stuff to render which otherwise has no effect on gameplay.

Actually, jim1976 made an incredibly good point about utilisation in the HD4870X2 thread, in reply to you.

Originally posted by: jim1976
Besides that, theoretical GFLOPS are approximately 930 for the 280 and 1200 for the RV770... Can you say that this is reflected in the benchmarks? A 280 is on average 10-15% faster than a 4870 nowadays, and if you narrow the benchmarks to heavy D3D9 and "D3D10" ones you get a 20-25% lead on average...

Basically, the RV770 core has around 20% more power than the G200, which means that the RV770 is being used at probably less than 75% of its potential in most situations.
Assuming that is usable for physics calculations (i.e. they will run on the "simple" SPs in the RV770 and don't require the 160 more complex ones), you could be looking at something in the realm of 250 GFLOPS of unused power in the RV770 which could do the physics calculations.
Since the theoretical power of the RV770 isn't translating into actual performance, it's reasonable to assume there's a bottleneck somewhere. If there's a bottleneck (and it's not the SPs), then running physics code on the RV770 (assuming the physics code doesn't encroach on the bottleneck) could be almost free, i.e. it wouldn't take away from graphics processing.
That all depends on where the bottlenecks are and what sort of code physics is, though (i.e. which SPs it requires).

With NV the situation is different, because (it would seem) the hardware is being used more effectively, given that worse theoretical performance (in FLOPS terms) is being translated into better actual real-world performance. That implies (although it's by no means even close to certain) that physics code on NV GPUs is more likely to take away from graphics processing power than on ATI cards.

Obviously there are a lot of theoreticals here, but there is almost certainly some untapped power somewhere on at least some GPUs. The main question is: can physics code use that power without affecting graphics too much?
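
A back-of-the-envelope version of that estimate, using only the rough figures quoted in this thread (933 vs 1200 theoretical GFLOPS, a ~10-15% real-world lead for the GTX 280) and assuming the GT200's theoretical number roughly matches what games extract from it. The exact headroom figure swings a lot with the assumed gap, so treat it as an order-of-magnitude check rather than a measurement.

```cpp
// How much of RV770's theoretical FLOPS might a typical game be leaving on the table?
#include <cstdio>

int main() {
    const double gt200_gflops = 933.0;   // theoretical peak, from the AnandTech article
    const double rv770_gflops = 1200.0;
    const double gt200_lead   = 0.125;   // GT200 ~10-15% faster in games (thread's figure)

    // RV770's delivered throughput, measured in "GT200-equivalent" GFLOPS.
    double rv770_effective = gt200_gflops / (1.0 + gt200_lead);
    double utilization     = rv770_effective / rv770_gflops;
    double headroom        = rv770_gflops - rv770_effective;

    // The post above lands at ~250 GFLOPS with slightly different assumptions;
    // either way the headroom comes out to a few hundred GFLOPS.
    printf("RV770 effective: ~%.0f GFLOPS (%.0f%% of peak), headroom: ~%.0f GFLOPS\n",
           rv770_effective, 100.0 * utilization, headroom);
    return 0;
}
```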
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Actually, it was explained in depth by AnandTech's review.

The RV770 has either more or less usable power depending on INSTRUCTION-level parallelism (not thread-level). NVIDIA uses simple scalar SPs. AMD uses 5-wide SPs; that is, each SP is actually 5 ALUs trying to do work in parallel.

This means that in a worst-case scenario the RV770 has 160 usable SPs compared to the 240 of the GTX280, and in a best-case scenario it has 800 vs 240. It depends on how much INSTRUCTION-level parallelism exists.
Both use thread-level parallelism as well, meaning there is fluctuation in performance there too.

So basically, AMD performance fluctuates a lot more depending on the game being run.
So basically AMD performance fluctuates a lot more depending on the game ran.