Originally posted by: akugami
Maybe if you had actually tried to understand what I was saying instead of opening with a somewhat snide reply, it would have fostered better discussion. Try looking up math coprocessors, AltiVec/VMX, and SSE 1/2/3/4.
These were technologies developed to enhance the CPU. Older CPUs could do math, but math coprocessors were more powerful at crunching numbers. The math coprocessor could handle instructions offloaded from the main CPU, freeing it up to crunch other instructions.
AltiVec and SSE and similar instruction set extensions are sets of specialized instructions that handle certain floating-point and integer operations to speed up particular workloads, such as video encoding. Sure, SSE4 only added a few extra instructions over SSE3, which added only a few extra instructions over SSE2. However, early benchmarks of SSE2 vs. SSE4 are showing a 40% increase in performance for DivX encoding.
Ramblings indeed.
You were arguing: "I don't believe nVidia has really designed a GPU for PhysX yet."
For all these ramblings you posted... all these extensions are STILL general purpose. Yes, SSE4 lets developers improve performance in DivX encoding. That doesn't mean encoding is the thing SSE4 was specifically designed for, or that it's the only thing these instructions are good for. SSE is also great for accelerating linear algebra and geometry-related math, to name but a few things.
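To make that concrete, here's a minimal sketch (my own illustrative code, nothing lifted from any codec) of SSE doing plain linear-algebra work, a scaled vector add, using only standard SSE1 intrinsics:

#include <xmmintrin.h>  // SSE intrinsics

// out = a + s * b, four floats at a time. Bread-and-butter linear algebra,
// the same kind of math a game or a physics solver does all day long.
void scaled_add_sse(float* out, const float* a, const float* b, float s, int n)
{
    __m128 vs = _mm_set1_ps(s);                          // broadcast s into all 4 lanes
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);                 // load 4 floats from a
        __m128 vb = _mm_loadu_ps(b + i);                 // load 4 floats from b
        _mm_storeu_ps(out + i, _mm_add_ps(va, _mm_mul_ps(vs, vb)));  // 4 results at once
    }
    for (; i < n; ++i)                                   // scalar tail for the leftovers
        out[i] = a[i] + s * b[i];
}

The very same instructions a DivX encoder leans on are just as happy crunching vertex positions or matrix rows.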
Perhaps you meant to say "I think nVidia will extend its architecture to improve GPGPU performance in tasks like PhysX". That is quite different from what you said. What you said sounded more like nVidia would just drop a PPU core into the GPU. That most certainly is not going to happen. There is a clear trend away from fixed-function units and towards more programmability and flexibility, with Larrabee being the most obvious step in that direction.
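To be concrete about what "GPGPU performance in tasks like PhysX" looks like in practice: game physics is mostly data-parallel math over thousands of bodies, which is exactly what the unified shaders already chew through via Cuda. A toy sketch, entirely my own and nowhere near what the real PhysX solver does, just forward-Euler integration of particles under gravity:

#include <cuda_runtime.h>

// One thread per particle: apply gravity to the velocity, then step the position.
__global__ void integrate(float3* pos, float3* vel, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    vel[i].y -= 9.81f * dt;        // gravity
    pos[i].x += vel[i].x * dt;
    pos[i].y += vel[i].y * dt;
    pos[i].z += vel[i].z * dt;
}

int main()
{
    const int n = 10000;                                 // ten thousand particles
    float3 *pos, *vel;
    cudaMalloc(&pos, n * sizeof(float3));
    cudaMalloc(&vel, n * sizeof(float3));
    cudaMemset(pos, 0, n * sizeof(float3));              // start everything at the origin, at rest
    cudaMemset(vel, 0, n * sizeof(float3));

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    for (int frame = 0; frame < 60; ++frame)             // simulate one second at 60 fps
        integrate<<<blocks, threads>>>(pos, vel, n, 1.0f / 60.0f);

    cudaDeviceSynchronize();
    cudaFree(pos);
    cudaFree(vel);
    return 0;
}

No dedicated PPU block anywhere in sight; it's the same ALUs that run your pixel shaders.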
Originally posted by: akugami
The fact that the G80 GPU cores and up were well designed with GPGPU in mind, and in line with how Ageia used hardware to accelerate PhysX, doesn't mean there isn't room for enhancements. If Ageia did not have tech that nVidia coveted (PhysX, both software AND hardware), then nVidia would not have shelled out good money.
I see no reason why nVidia would be interested in the hardware, to be honest.
Their PhysX performance is already VERY good. Much better than the PPU ever was.
The software was obviously important to nVidia, since PhysX (formerly NovodeX) was already a widely used API, especially on consoles. So they had a working solution that developers were already familiar with: an ideal tool for leveraging their GPU acceleration.
Originally posted by: akugami
I'm not a hardware engineer. I'm not even a programmer. I simply refuse to believe that there isn't hardware and software that nVidia obtained when they bought Ageia that can be integrated into their GPUs to further accelerate PhysX.
You also have to realize that technology becomes outdated very quickly. The PPU was already aging when nVidia bought Ageia. It's just like when nVidia bought out 3DFX but never did much with their technology either (both nVidia and 3DFX use a technology named SLI, but the acronym means something different in each case: Scan-Line Interleave for 3DFX, Scalable Link Interface for nVidia, and the underlying technology differs just as much). The technology simply became irrelevant as GPU designs continued to evolve.
Originally posted by: akugami
I agree integrated GPUs will not be competitive with discrete video cards any time soon. However, with the pace at which technology can move, who can say what is possible in two to three years' time? While discrete GPUs are larger than today's CPUs, AMD has shown that a smaller GPU core can be made competitive with a larger one.
It's just a case of conflicting interests. CPUs and GPUs want different types of memory for optimal performance: a CPU cares mostly about low latency, while a GPU cares about raw bandwidth. I don't see any solution for that emerging anytime soon. Consoles get away with using memory aimed at graphics because CPU performance is of secondary interest there. Besides, their GPU performance isn't as high as a high-end discrete card's anyway, so the difference between CPU-oriented and GPU-oriented memory matters less. But it remains a compromise between the two.
Bottom line is: there's no point in making super-powerful integrated GPUs when they're going to be bandwidth-starved anyway.
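Some quick back-of-envelope numbers to show how big that gap is. I'm assuming ballpark 2008 figures here: a 384-bit, 1800 MT/s GDDR3 bus on an 8800 GTX versus dual-channel DDR2-800 shared with the CPU:

#include <cstdio>

int main()
{
    // Ballpark figures; treat the exact numbers as assumptions, not gospel.
    double discrete   = (384.0 / 8.0) * 1800e6;   // 8800 GTX: 384-bit bus at 1800 MT/s
    double integrated = (128.0 / 8.0) *  800e6;   // dual-channel DDR2-800: 2 x 64-bit at 800 MT/s

    std::printf("Discrete 8800 GTX : %5.1f GB/s, all of it for the GPU\n", discrete / 1e9);
    std::printf("Integrated, shared: %5.1f GB/s, and the CPU wants its cut too\n", integrated / 1e9);
    return 0;
}

That's roughly 86 GB/s dedicated versus under 13 GB/s shared; no amount of shader power papers over that.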
Originally posted by: akugami
I don't believe the decreased GPU power of integrated CPU/GPUs vs. discrete GPUs will hurt Intel or AMD as much as you seem to think. Most general consumers simply won't care. Furthermore, OEMs will definitely like having one less part to stock. Don't discount the fact that Intel ships the most GPU chipsets even though their current GPUs are, comparatively speaking, crap for gaming.
Except this discussion was never about that type of system.
It's about Cuda and PhysX, which are of no interest to the average office user or casual home user.
They're only interesting for gaming and high-performance computing. Especially with high-performance computing, the name alone should be indication enough that people won't accept a slower integrated GPU.
Originally posted by: akugami
As for the memory issue: HyperTransport, Intel's QuickPath Interconnect, or a similar bus technology could be made to deliver a high-bandwidth bus along with plug-in memory modules on a specially designed daughtercard port. Or maybe it's some other solution.
Seems rather pointless. If you're going to use a separate module anyway, it might as well be a PCIe card, with the GPU and memory sitting close together and directly connected.
HT and QuickPath are still no substitute for an on-die memory controller with a direct connection to the memory, which is what both GPUs and CPUs have. A bus always adds extra overhead, and that's overhead you aren't interested in when your memory is dedicated anyway.