Originally posted by: Concillian
The REAL question is:
where would you get enough 25 - 60 GB/sec throughput memory for all that?
That's the thing, of course. Most modern GPUs are actually *still* remarkably memory bandwidth-limited (although this will change with increased use of shader programs containing loops). That's why overclocking the GPU core clock generally has much less of an effect on performance than overclocking the memory.
I think most current high-end cards have, what, roughly 30 GB/s of theoretical max bandwidth?
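Just to put a rough number on that (back-of-the-envelope only; I'm assuming a 256-bit bus and ~550 MHz GDDR3, which is in the ballpark of current high-end cards):

```c
#include <stdio.h>

int main(void)
{
    /* Illustrative figures only: 256-bit bus, ~550 MHz memory clock,
       double data rate (2 transfers per clock). */
    double bus_width_bytes   = 256.0 / 8.0;  /* 32 bytes per transfer */
    double mem_clock_hz      = 550e6;
    double transfers_per_clk = 2.0;

    double bw = bus_width_bytes * mem_clock_hz * transfers_per_clk;
    printf("Theoretical peak bandwidth: %.1f GB/s\n", bw / 1e9);  /* ~35.2 */
    return 0;
}
```

So roughly 35 GB/s at the theoretical peak, which is exactly why the memory clock matters so much.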
I think that if NV was able to design a quad-crossbar memory controller for their GPUs, then they (or someone else) could certainly do the same to support a quad-CPU arrangement with shared access to high-speed local memory. The alternative would be to give each CPU its own high-speed memory pool with a dedicated memory bus, and then have some sort of shared interconnect to the frame-buffer/general DRAM pool. Something like AMD's Opteron multi-CPU arrangement would probably work well here.
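To make that second arrangement concrete, here's a very rough sketch in C of how a frame might be divided among four CPUs, each rasterizing its own screen tile into a private buffer (standing in for a dedicated local memory pool) and only copying finished pixels into the shared frame-buffer. The tile sizes, the flat-shaded "rasterizer", and all of the names are made up purely for illustration; a real system would obviously be far more involved.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_CPUS      4
#define SCREEN_WIDTH  1024
#define SCREEN_HEIGHT 768
#define TILE_HEIGHT   (SCREEN_HEIGHT / NUM_CPUS)

/* Stand-in for a per-CPU rasterizer; a real one would walk triangles,
   interpolate, texture, etc.  Here each CPU just flat-shades its tile. */
static void rasterize_tile(int cpu, uint32_t *tile)
{
    for (int i = 0; i < SCREEN_WIDTH * TILE_HEIGHT; ++i)
        tile[i] = 0xFF000000u | (uint32_t)(cpu * 0x40);
}

int main(void)
{
    uint32_t *shared_fb =
        malloc((size_t)SCREEN_WIDTH * SCREEN_HEIGHT * sizeof *shared_fb);

    for (int cpu = 0; cpu < NUM_CPUS; ++cpu) {
        /* In the proposed hardware this tile would live in that CPU's own
           memory pool, so rasterization traffic stays off the shared bus. */
        uint32_t *local_tile =
            malloc((size_t)SCREEN_WIDTH * TILE_HEIGHT * sizeof *local_tile);

        rasterize_tile(cpu, local_tile);

        /* Only the completed tile crosses the interconnect to shared memory. */
        memcpy(shared_fb + (size_t)cpu * TILE_HEIGHT * SCREEN_WIDTH,
               local_tile,
               (size_t)SCREEN_WIDTH * TILE_HEIGHT * sizeof *local_tile);
        free(local_tile);
    }

    free(shared_fb);
    return 0;
}
```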
Originally posted by: Concillian
And how much MORE memory than a single GPU card would you need for that to be effective? I mean, most multi-processor systems have the biggest problems with memory management. AMD seems to have circumvented the issue with dedicated memory for each processor in their Opterons. So how much would 2 gigs of 600 MHz GDDR cost you so you could have 256MB per CPU in your multi CPU solution?
I don't actually see why you would necessarily need a greater amount of memory with this sort of arrangement. You could still have a 256MB pool of high-speed shared memory, same as today's cards, and also have, say, a 64MB local pool of higher-speed non-shared memory for each CPU. With four CPUs that's 256MB + 4 x 64MB = 512MB of fast memory in total, nowhere near 2 gigs.
Originally posted by: Concillian
You can't even get 2 gigs of regular run of the mill 200MHz DDR for $180, let alone 600MHz GDDR.
I really don't see where it was implied that a multi-CPU 'software GPU' setup would require RAM in quantities comparable to today's PC main-memory amounts. This isn't designed to be a server, but rather a dedicated embedded system, albeit a more flexible one, built specifically for graphics processing.
Originally posted by: Concillian
Face it, CPUs and video cards are WAY different. It's like comparing apples and donkeys: They're both living, but that's where similarities end.
The only real considerations are price/performance ratios and the practical limitations of the implementations (such as excess heat from multiple CPUs, though with newer low-power, high-speed chips that's not such a problem anymore).
The 3D pipeline can be implemented in either software or hardware; there's no inherent advantage to either implementation, aside from the issues I just noted.
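Just to illustrate the "software" half of that claim, here's about the simplest possible piece of a software pipeline: a flat-shaded span fill, the kind of inner loop a CPU-based rasterizer spends most of its time in. This is purely illustrative, not anyone's actual renderer, and it also shows why memory bandwidth dominates: there's one write per pixel and almost no arithmetic.

```c
#include <stdint.h>
#include <stddef.h>

/* Fill one horizontal span of a scanline with a single color.
   framebuffer points at the top-left pixel; pitch is the width of a
   row in pixels. */
void fill_span(uint32_t *framebuffer, int pitch,
               int y, int x_start, int x_end, uint32_t color)
{
    uint32_t *row = framebuffer + (size_t)y * (size_t)pitch;
    for (int x = x_start; x < x_end; ++x)
        row[x] = color;   /* one memory write per pixel: bandwidth-bound */
}
```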