Originally posted by: Scali
Originally posted by: akugami
And as an aside, I don't believe nVidia has really designed a GPU for PhysX yet.
Don't you get it?
The "GP" in GPGPU stands for General Purpose.
They don't HAVE to design a GPU for PhysX, because their GPU is designed for General Purpose processing.
Just like Intel and AMD don't design their CPUs with any specific task in mind. They are designed to run pretty much anything.
As such, nVidia will NEVER design a GPU for PhysX. There's no need. They'll just continue to improve on the GPGPU features.
Originally posted by: akugami
I think some of their GPU design was aimed at their GPGPU uses, which also happened to help PhysX. nVidia didn't buy Ageia until early 2008, and by then most of the design work on what would go into the GT200 GPU cores was likely already set in stone.
It's worse than that. PhysX works on everything from the G80 up. Any Cuda GPU.
And those are over 2 years old.
The GT200 isn't all that different from the G80; it's mostly just bigger and faster. Aside from that, they added a few features to Cuda, but nothing specific to physics. And they probably never will.
In fact, if you study the Ageia PPU design, it's not too different from the G80's original design. The key to the PPU was not so much the parallelism (it didn't have that many cores, only about 12 I believe, and they weren't that fast) as how the architecture could shuffle data around through a sort of packet-switching bus. It was almost like a network switch.
nVidia's G80 added shared memory between its stream processors, which also allows stream processors to quickly communicate with each other.
And that's what you want for physics. You want to propagate the forces of one object to the objects that it acts upon.
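To give a concrete picture of that pattern, here's a minimal sketch in CUDA (purely illustrative, not PhysX or Ageia code; the Body struct and kernel name are made up for the example). Threads in a block stage a tile of bodies in shared memory, then every thread reads every body in the tile from fast on-chip memory while accumulating forces:

```
// Illustrative sketch only -- not PhysX/Ageia code. It shows how CUDA's
// __shared__ memory lets the threads of a block exchange data cheaply,
// which is the "stream processors communicating" point made above.
// For brevity the sketch assumes n is a multiple of blockDim.x.
#include <cuda_runtime.h>

struct Body { float x, y, z, mass; };        // made-up layout for the sketch

__global__ void accumulateForces(const Body* bodies, float3* forces, int n)
{
    extern __shared__ Body tile[];           // one tile of bodies per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    Body self = bodies[i];
    float3 f = make_float3(0.0f, 0.0f, 0.0f);

    // Walk over all bodies one tile at a time.
    for (int base = 0; base < n; base += blockDim.x) {
        tile[threadIdx.x] = bodies[base + threadIdx.x];
        __syncthreads();                     // tile is now visible to the whole block

        // Each thread reads every body in the tile from fast on-chip memory.
        for (int k = 0; k < blockDim.x; ++k) {
            float dx = tile[k].x - self.x;
            float dy = tile[k].y - self.y;
            float dz = tile[k].z - self.z;
            float d2 = dx*dx + dy*dy + dz*dz + 1e-6f;   // softening term
            float inv = rsqrtf(d2);
            float s = tile[k].mass * inv * inv * inv;   // roughly m / r^3
            f.x += dx * s;
            f.y += dy * s;
            f.z += dz * s;
        }
        __syncthreads();                     // don't overwrite the tile too early
    }
    forces[i] = f;
}
```

Launched as something like accumulateForces<<<blocks, threadsPerBlock, threadsPerBlock * sizeof(Body)>>>(bodies, forces, n). Without that shared tile, every read in the inner loop would have to go out to device memory instead of staying on-chip.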
Maybe if you actually tried to understand what I was saying, instead of opening with a somewhat snide reply, it would foster better discussion. Try looking up math coprocessors, AltiVec/VMX, and SSE 1/2/3/4.
These were technologies developed to enhance the CPU. Older CPUs could do math, but math coprocessors were more powerful at crunching numbers. A math coprocessor could handle instructions offloaded from the main CPU, freeing it up to crunch other instructions.
AltiVec, SSE, and similar instruction set extensions are sets of specialized instructions that handle certain floating-point and integer operations to speed up tasks such as video encoding. Sure, SSE4 only added a few extra instructions over SSE3, which itself added only a few over SSE2. Yet early benchmarks of SSE2 vs SSE4 are showing a 40% increase in performance for DivX encoding.
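To give a rough idea of what these extensions actually buy you, here's a hand-written sketch in plain C/C++ with SSE intrinsics (illustrative only, not code from any real encoder): the scalar loop adds one pair of floats per instruction, while the SSE loop handles four at a time.

```
// Host-side illustration only: the same float addition written scalar and
// with SSE intrinsics, which operate on four packed floats per instruction.
#include <xmmintrin.h>   // SSE

// Scalar: one addition per iteration.
void add_scalar(const float* a, const float* b, float* out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

// SSE: four additions per iteration (sketch assumes n is a multiple of 4).
void add_sse(const float* a, const float* b, float* out, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);                 // load 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));      // add 4 at once
    }
}
```

Real encoders get their gains from hand-tuned inner loops in this style, applied to the operations they actually spend their time on.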
The point of all these ramblings is that I believe whatever advantages and intellectual property they were buying when they purchased Ageia have not yet been integrated into the GPUs produced by nVidia. This means that PhysX can only get better once they do properly integrate Ageia's tech with their own.
The fact that the G80 GPU cores and up were well designed with GPGPU in mind, and in line with how Ageia used hardware to accelerate PhysX, doesn't mean there isn't room for enhancements. If Ageia did not have tech that nVidia coveted (PhysX, both software AND hardware), then nVidia would not have shelled out good money.
I'm not a hardware engineer. I'm not even a programmer. I simply refuse to believe that there isn't hardware and software nVidia obtained when they bought Ageia that can be integrated into their GPUs to further accelerate PhysX.
Originally posted by: Scali
Originally posted by: akugami
I beg to differ. nVidia's products are wildly successful now, but the landscape is set to change dramatically in the next two years. First, Intel is heading into the market, and while it would be extremely hard for them to win market share among hardcore gamers, they can easily use their CPU business as something for their GPUs to piggyback on. And we all know which physics product Intel will be supporting. Second, both Intel and AMD will be moving towards integrated CPU/GPUs, in which the multi-core processor contains not only two or more CPU cores but likely at least one GPU core. As processes get smaller, one can even imagine multiple CPU and GPU cores in one package. This cuts nVidia out completely.
Integrated GPUs will not be competitive with discrete cards anytime soon.
Aside from the fact that discrete GPUs are FAR larger than a CPU itself (so you can't really integrate such a large chip into a regular CPU anyway), another huge problem is the shared memory of an integrated GPU.
A discrete videocard has its own memory, which is different from the main memory in a computer. It's specially designed for graphics (GDDR) and delivers high bandwidth at high latencies. Regular memory is designed to deliver low latencies, and the bandwidth is much lower.
So any integrated GPU will have MUCH lower bandwidth than a discrete card, which means it is impossible to get competitive performance.
This is also why Intel is launching its Larrabee as a discrete card.
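To put rough numbers on that (approximate figures, just for scale): a GTX 280's 512-bit GDDR3 interface running at about 1.1 GHz (~2.2 GT/s effective) delivers roughly 512/8 × 2.2 ≈ 140 GB/s, while dual-channel DDR2-800 system memory tops out around 2 × 64/8 × 0.8 = 12.8 GB/s. An integrated GPU sharing that system memory with the CPU starts roughly an order of magnitude behind on bandwidth alone.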
I agree integrated GPUs will not be competitive with discrete video cards any time soon. However, with the pace at which technology moves, who can say what will be possible in two to three years' time? While discrete GPUs are larger than today's CPUs, AMD has shown that a smaller GPU core can be designed to compete with a larger one.
While AMD's current GPUs may not individually match nVidia's top GPUs, it is arguable that, with the way they set about designing their lineup, AMD stays very competitive by integrating two GPUs into one Xfire card to combat a single nVidia GPU. These cards are competitive not only from a performance standpoint but from a price standpoint as well.
Furthermore, there will be process shrinks. This means it should be easier to implement multiple GPU cores in the future, assuming the transistor count doesn't rise too rapidly as the node at which the CPU/GPUs are produced shrinks.
By going with less powerful cores you might be able to fit two or three GPU cores in the same die space as four CPU cores. You'd sacrifice sheer power in each individual GPU core, but you'd make it up by having more than one. Sure, this solution may never be as powerful as discrete GPUs, but as we move further and further ahead, the gains in game realism seem to get smaller and smaller.
CryEngine 2 (Crysis) showed some amazing graphics, and I think we'll be hard pressed to improve on that enough for the average gamer to really notice much difference. I do believe that CPU/GPUs can be made to run a game like Crysis at decent resolutions, so that _most gamers_ won't worry or care about extra levels of detail. Case in point: the Xbox 360 and PS3 are pretty close to what will max out the useful graphics updates for general consumers. After that, updated graphics simply become another checkbox feature for them.
I don't believe the decreased GPU power of integrated CPU/GPUs vs discrete GPUs will hurt Intel or AMD as much as you seem to think. Most general consumers simply won't care. Furthermore, OEMs will definitely like having one less part to stock. Don't discount the fact that Intel ships the most GPU chipsets even though their current GPUs are, comparatively speaking, crap for gaming.
As for the memory issue: HyperTransport, Intel QuickPath Interconnect, or a similar bus technology could be made to deliver a high-bandwidth link to plug-in memory modules on a specially designed daughtercard port. Maybe it's some other solution. Regardless, current GPUs accessing memory chips still have to go through circuit boards; the memory is not directly on the GPU die. The motherboard is just another circuit board.
***EDIT***
I think I'll quit while the getting's good. Too much arguing in circles, as usually happens when the fanboys get into it. I've been sucked into arguing with the fanboys a few times already, so I'm going to quit. I got my point out; you can agree or disagree. If anyone wants to ask something or get further clarification, send a PM or two.