PhysX CUDA transition almost ready

biostud

Lifer
Feb 27, 2003
18,251
4,764
136
http://www.tomshardware.com/20...p_and_running_almost_/

While Intel's Nehalem demo had 50,000-60,000 particles and ran at 15-20 fps (without a GPU), the particle demo on a GeForce 9800 card resulted in 300 fps. In the very likely event that Nvidia's next-gen parts (G100: GT100/200) double their shader units, this number could top 600 fps, meaning that Nehalem at 2.53 GHz is lagging 20-40x behind 2006/2007/2008 high-end GPU hardware. However, you can't ignore the fact that Nehalem can run physics at all.
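For a rough idea of the kind of workload being compared, here is a minimal CUDA sketch of a brute-force particle integration step; the struct layout, particle count, and launch configuration are illustrative assumptions, not code from Nvidia's or Intel's actual demos:

// particles.cu -- illustrative sketch only, not the real demo code
#include <cuda_runtime.h>
#include <cstdio>

struct Particle { float3 pos; float3 vel; };

// One thread per particle: apply gravity and integrate one timestep.
__global__ void step(Particle* p, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    p[i].vel.y -= 9.81f * dt;            // gravity
    p[i].pos.x += p[i].vel.x * dt;
    p[i].pos.y += p[i].vel.y * dt;
    p[i].pos.z += p[i].vel.z * dt;
    if (p[i].pos.y < 0.0f) {             // crude ground-plane bounce
        p[i].pos.y = 0.0f;
        p[i].vel.y = -0.5f * p[i].vel.y;
    }
}

int main()
{
    const int n = 60000;                 // roughly the particle count quoted for the Nehalem demo
    Particle* d_p;
    cudaMalloc(&d_p, n * sizeof(Particle));
    cudaMemset(d_p, 0, n * sizeof(Particle));
    for (int frame = 0; frame < 1000; ++frame)
        step<<<(n + 255) / 256, 256>>>(d_p, n, 1.0f / 60.0f);
    cudaDeviceSynchronize();
    printf("done\n");
    cudaFree(d_p);
    return 0;
}

Each particle update is independent, which is why this kind of demo scales with shader count and makes the CPU-vs-GPU fps gap so lopsided.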
 

bunnyfubbles

Lifer
Sep 3, 2001
12,248
3
0
But Nehalem is the CPU...it's Larrabee and its potential performance that we'd want to know about...

And 300 or 600 fps is worthless; however, 250,000 or 500,000 particles @ 60 fps would be nice :p
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
now we know why nvidia has been pushing such unrealistically high shader counts without anything to keep them busy...
 

hooflung

Golden Member
Dec 31, 2004
1,190
1
0
Originally posted by: bunnyfubbles
But Nehalem is the CPU...it's Larrabee and its potential performance that we'd want to know about...

And 300 or 600 fps is worthless; however, 250,000 or 500,000 particles @ 60 fps would be nice :p

The point is that the Nehalem tested was an 8-core CPU, so if things were built with that in mind, the GPU the system would have could run all the standard rasterization and rendering while spare cores were used for physics. Nvidia now has a valid case for SLI motherboards. Why buy an 8-core CPU that will probably cost 300 bucks when you could buy an 8800GT for half the price, pop it into your SLI motherboard, and have SLI and/or physics as viable options for established and future games?
 

aka1nas

Diamond Member
Aug 30, 2001
4,335
1
0
Once it's available, I'll dig out some of my PhysX-enabled games and compare performance between the PPU and the CUDA implementation.
 

Piuc2020

Golden Member
Nov 4, 2005
1,716
0
0
I wonder if SLI is going to be a necessity or if the GPU can cope with graphics and physics at the same time. Doubtful, but otherwise it could be troublesome, since nvidia chipsets suck and they are certainly not in a position to force it upon their users.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
Originally posted by: Piuc2020
I wonder if SLI is going to be a necessity or if the GPU can cope with graphics and physics at the same time. Doubtful, but otherwise it could be troublesome, since nvidia chipsets suck and they are certainly not in a position to force it upon their users.
The hardware and software are capable of running CUDA and graphics threads at the same time (or rather, interleaved via scheduling), so that's not the problem. If there's a problem, it's going to be the hit to rendering performance that results from weighing down the GPU with CUDA work on top of graphics.
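As a rough sketch of what "at the same time via scheduling" looks like from the application side, here is a minimal CUDA example: the kernel launch returns immediately and the driver interleaves the queued CUDA work with whatever graphics work is pending. The kernel, stream name, and sizes are illustrative assumptions, not PhysX code:

// async_physics.cu -- illustrative sketch of asynchronous scheduling
#include <cuda_runtime.h>
#include <cstdio>

__global__ void physicsKernel(float* state, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) state[i] += dt;   // stand-in for a real physics update
}

int main()
{
    const int n = 1 << 20;
    float* d_state;
    cudaMalloc(&d_state, n * sizeof(float));
    cudaMemset(d_state, 0, n * sizeof(float));

    cudaStream_t physicsStream;
    cudaStreamCreate(&physicsStream);

    // The launch is asynchronous: control returns to the CPU immediately,
    // so the application can keep issuing Direct3D/OpenGL draw calls while
    // the driver schedules the CUDA work and the graphics work on one GPU.
    physicsKernel<<<(n + 255) / 256, 256, 0, physicsStream>>>(d_state, n, 1.0f / 60.0f);

    // ... render the previous frame here ...

    // Block only when the physics results are actually needed.
    cudaStreamSynchronize(physicsStream);

    cudaStreamDestroy(physicsStream);
    cudaFree(d_state);
    printf("frame done\n");
    return 0;
}

The GPU still has to execute both workloads, which is exactly the rendering-performance hit described above; scheduling only hides the latency from the CPU, it doesn't create extra shader cycles.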
 

superbooga

Senior member
Jun 16, 2001
333
0
0
Just think of it as another graphical option. You turn it up, and your framerates will drop, just like any other graphical option.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: superbooga
Just think of it as another graphical option. You turn it up, and your framerates will drop, just like any other graphical option.

according to ATi - when they first announced they were also doing physics on their x800-x1900[?] - they claimed that the "extra unused" cycles would manage it

did they ever demonstrate it, and was the performance penalty more limited than with nVidia's method?

 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
Originally posted by: apoppin
Originally posted by: superbooga
Just think of it as another graphical option. You turn it up, and your framerates will drop, just like any other graphical option.

according to ATi - when they first announced they were also doing physics on their x800-x1900[?] - they claimed that the "extra unused" cycles would manage it

did they ever demonstrate it, and was the performance penalty more limited than with nVidia's method?
They never demonstrated simultaneous operation in the first place AFAIK. The only time you'd have spare cycles would be if you were bottlenecked by something other than the GPU (i.e. the CPU), whereas the whole idea of GPU physics is to get such work off of the CPU, in effect reducing the CPU bottleneck. Using spare cycles would be counterproductive, in other words.

GPU-accelerated <almost anything> has been a joke so far. Commercial GPGPU use has matured, but there's been no real use of the GPU in the consumer space (video decoding is a GPU feature, but it's handled with dedicated hardware, not as a GPGPU program as first intended). HavokFX, Quantum Effects, video encode acceleration, etc. have all been duds.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: ViRGE
Originally posted by: apoppin
Originally posted by: superbooga
Just think of it as another graphical option. You turn it up, and your framerates will drop, just like any other graphical option.

according to ATi - when they first announced they were also doing physics on their x800-x1900[?] - they claimed that the "extra unused" cycles would manage it

did they ever demonstrate it, and was the performance penalty more limited than with nVidia's method?
They never demonstrated simultaneous operation in the first place AFAIK. The only time you'd have spare cycles would be if you were bottlenecked by something other than the GPU (i.e. the CPU), whereas the whole idea of GPU physics is to get such work off of the CPU, in effect reducing the CPU bottleneck. Using spare cycles would be counterproductive, in other words.

GPU-accelerated <almost anything> has been a joke so far. Commercial GPGPU use has matured, but there's been no real use of the GPU in the consumer space (video decoding is a GPU feature, but it's handled with dedicated hardware, not as a GPGPU program as first intended). HavokFX, Quantum Effects, video encode acceleration, etc. have all been duds.

thanks, i really have not been keeping up .. and what ATI said back then didn't seem to make any sense to me either


However, to question what you said [that i bolded] - isn't this the "fault" of the programs? If they are written properly with the GPU in mind, they should be far better than now - and nearly equal to what dedicated HW can do - at least that appears to be the 'theory': the "extra" cycles not being extra, but dedicated to processing a program written specifically to take advantage of the parallelism inherent in GPU architecture?
- and that would also mean they are really working on Fusion - this is all mostly "theoretical" and brand new, right?


 

aka1nas

Diamond Member
Aug 30, 2001
4,335
1
0
To the best of my understanding, the main issue with having the GPU process these sorts of tasks is that data has to be constantly sent back to the rest of the system. 3d graphics are mostly a one-way operation, in that data is sent to the GPU for processing and then dumped directly to the screen.

Communications in the other direction have always been relatively computationally expensive due to the latency of the bus connecting the GPU to the system, especially if the GPU has to then wait for the system to deliver more data to it. PCI-E probably helps in that regard, but latency is likely still a killer in some situations.

A lot of the performance-related aspects of DX10 are intended to do things like send multiple operations to the GPU as a single batch, so there is less communication over the bus and the GPU can stay fully utilized instead of waiting around for the rest of the system. First-order physics on the GPU seems to run counter to that, as you would need to be communicating data back and forth constantly.
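As a rough illustration of the batching idea, here is a minimal CUDA sketch that reads simulation results back in one asynchronous transfer per frame using pinned host memory, instead of many small synchronous per-object copies. Buffer names and sizes are illustrative assumptions, not from any real engine:

// readback.cu -- illustrative sketch of batched, latency-hiding readback
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    const int nBodies = 10000;
    const size_t bytes = nBodies * 3 * sizeof(float);   // xyz per body

    float* d_positions;
    cudaMalloc(&d_positions, bytes);
    cudaMemset(d_positions, 0, bytes);

    // Pinned (page-locked) host memory lets the copy run as a single DMA
    // transfer instead of being staged through pageable memory.
    float* h_positions;
    cudaMallocHost(&h_positions, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // One big asynchronous copy per frame: the bus latency is paid once,
    // rather than once per object as a naive per-body readback would pay it.
    cudaMemcpyAsync(h_positions, d_positions, bytes,
                    cudaMemcpyDeviceToHost, stream);

    // ... CPU-side game logic that doesn't need the new positions yet ...

    cudaStreamSynchronize(stream);   // wait only when the data is required
    printf("first y = %f\n", h_positions[1]);

    cudaStreamDestroy(stream);
    cudaFreeHost(h_positions);
    cudaFree(d_positions);
    return 0;
}

The transfer still costs bandwidth and latency; batching just keeps the GPU and CPU from stalling on each other for every individual object.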
 

jjzelinski

Diamond Member
Aug 23, 2004
3,750
0
0
Why couldn't the focus shift from maximum frame rate to minimum frame rate? If ATI or NV were to design their PPU integration with maintaining minimum frame rates in mind, then they in fact *would* have excess cycles to devote to physics processing. In fact, once that relatively simple step is taken, everything should feel rather familiar as far as handling image quality is concerned, meaning we would simply continue to play with shader levels, AA, AF, etc. in order to achieve the highest possible minimum frame rates while utilizing PPU capabilities.

Furthermore, I pointed out in a similar thread only a week or so back that game designers can easily take advantage of physics processing by tailoring their "screenplays" around the trade-off between typical eye candy and more physics-driven content. Not every "scene" may require beefy physics, and not every scene may require high levels of eye candy; the trade-off could be quite convincing if "scripted" properly.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
Originally posted by: apoppin
Originally posted by: ViRGE
Originally posted by: apoppin
Originally posted by: superbooga
Just think of it as another graphical option. You turn it up, and your framerates will drop, just like any other graphical option.

according to ATi - when they first announced they were also doing physics on their x800-x1900[?] - they claimed that the "extra unused" cycles would manage it

did they ever demonstrate it, and was the performance penalty more limited than with nVidia's method?
They never demonstrated simultaneous operation in the first place AFAIK. The only time you'd have spare cycles would be if you were bottlenecked by something other than the GPU (i.e. the CPU), whereas the whole idea of GPU physics is to get such work off of the CPU, in effect reducing the CPU bottleneck. Using spare cycles would be counterproductive, in other words.

GPU-accelerated <almost anything> has been a joke so far. Commercial GPGPU use has matured, but there's been no real use of the GPU in the consumer space (video decoding is a GPU feature, but it's handled with dedicated hardware, not as a GPGPU program as first intended). HavokFX, Quantum Effects, video encode acceleration, etc. have all been duds.

thanks, i really have not been keeping up .. and what ATI said back then didn't seem to make any sense to me either


However, to question what you said [that i bolded] - isn't this the "fault" of the programs? If they are written properly with the GPU in mind, they should be far better than now - and nearly equal to what dedicated HW can do - at least that appears to be the 'theory': the "extra" cycles not being extra, but dedicated to processing a program written specifically to take advantage of the parallelism inherent in GPU architecture?
- and that would also mean they are really working on Fusion - this is all mostly "theoretical" and brand new, right?
I'm not quite sure what you mean, apoppin.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: ViRGE
I'm not quite sure what you mean, apoppin.


OK, Let me begin again .. too many nested quotes

IF a program was written specially to take advantage of the GPU's incredible parallelism - and even perhaps to take advantage of the fact that 3d graphics are mostly a one-way operation - completely UNLIKE Programs written for the CPU calculations; THEN perhaps we would see real use of the GPU in the consumer space - almost as well as what is currently handled with dedicated hardware. And *later* when FUSION is complete and the one-way GPU communicates with the CPU much more effectively - with memory fully integrated and the MB changed from what we know it - it will finally come into its own. AMD's Vision

i am talking about the "future"
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
Originally posted by: apoppin
Originally posted by: ViRGE
I'm not quite sure what you mean, apoppin.


OK, Let me begin again .. too many nested quotes

IF a program was written specially to take advantage of the GPU's incredible parallelism - and even perhaps to take advantage of the fact that 3d graphics are mostly a one-way operation - completely UNLIKE Programs written for the CPU calculations; THEN perhaps we would see real use of the GPU in the consumer space - almost as well as what is currently handled with dedicated hardware. And *later* when FUSION is complete and the one-way GPU communicates with the CPU much more effectively - with memory fully integrated and the MB changed from what we know it - it will finally come into its own. AMD's Vision

i am talking about the "future"
This implies that there's a lack of ability with current hardware, which I would argue is not the case. Certainly accessing data from a GPU isn't as fast as, say, local memory, but judging by what people have been doing with CUDA and Brook+ it doesn't seem like a serious problem. The limiting factor is not the hardware, IMHO, it's the development software. Until a year ago (even less on AMD's side) there was no practical way to write GPGPU software; you had to write it as Cg/HLSL shader code, which was picky about hardware and required an in-depth knowledge of how graphical rendering works. The realization of GPU acceleration is going to come from the fact that we finally have real high-level language development tools that can be easily integrated into current development practices.

For what it's worth, I don't see Fusion changing any of this.
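As a rough illustration of what those high-level tools look like, here is a minimal CUDA version of a simple y = a*x + y computation; before CUDA/Brook+, the same arithmetic would have had to be expressed as a pixel shader rendering into a floating-point texture. The names and sizes here are illustrative, not from any shipping toolkit:

// saxpy.cu -- illustrative sketch of plain C-like GPGPU code
#include <cuda_runtime.h>
#include <cstdio>

__global__ void saxpy(int n, float a, const float* x, float* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;
    float *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemset(d_x, 0, n * sizeof(float));   // fill with real data in practice
    cudaMemset(d_y, 0, n * sizeof(float));

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
    cudaDeviceSynchronize();

    printf("done\n");
    cudaFree(d_x);
    cudaFree(d_y);
    return 0;
}

No render targets, no texture coordinates, no full-screen quads; an ordinary C programmer can read it, which is the accessibility argument being made above.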
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: ViRGE
Originally posted by: apoppin
Originally posted by: ViRGE
I'm not quite sure what you mean, apoppin.


OK, Let me begin again .. too many nested quotes

IF a program was written specially to take advantage of the GPU's incredible parallelism - and even perhaps to take advantage of the fact that 3d graphics are mostly a one-way operation - completely UNLIKE Programs written for the CPU calculations; THEN perhaps we would see real use of the GPU in the consumer space - almost as well as what is currently handled with dedicated hardware. And *later* when FUSION is complete and the one-way GPU communicates with the CPU much more effectively - with memory fully integrated and the MB changed from what we know it - it will finally come into its own. AMD's Vision

i am talking about the "future"
This implies that there's a lack of ability with current hardware, which I would argue is not the case. Certainly accessing data from a GPU isn't as fast as, say, local memory, but judging by what people have been doing with CUDA and Brook+ it doesn't seem like a serious problem. The limiting factor is not the hardware, IMHO, it's the development software. Until a year ago (even less on AMD's side) there was no practical way to write GPGPU software; you had to write it as Cg/HLSL shader code, which was picky about hardware and required an in-depth knowledge of how graphical rendering works. The realization of GPU acceleration is going to come from the fact that we finally have real high-level language development tools that can be easily integrated into current development practices.

For what it's worth, I don't see Fusion changing any of this.

i must be stupid tonight

:confused:

IF a program was written specially to take advantage of the GPU's incredible parallelism .. THEN perhaps we would see real use of the GPU in the consumer space - almost as well as what is currently handled with dedicated hardware

that is what i thought i said [^^this is me, my quote^^ - without the confusing stuff that i should have put in another sentence]



i *expect* to see CUDA take off this year!! - for sure
.. what about AMD? .. that IS the question
- *my question*