Why can't GPUs/VPUs get to 1 GHz?


Concillian

Diamond Member
May 26, 2004
Originally posted by: VirtualLarry
This thought is mostly driven by the fact that modern video cards only run at around 500 MHz and cost $200-500, and yet you can purchase a 1.8 GHz Duron for around $40. For $500 you could purchase 8 of them, each with three effective pipelines at 1.8 GHz, which would yield, let's say conservatively, the equivalent of a 32 GHz single-pipeline GPU/CPU. You would still have $180 left over for some high-speed DRAM, interconnect switching hardware, and PCB fabrication.

The REAL question is:
where would you get memory with 25-60 GB/sec of throughput to feed all that?
And how much MORE memory than a single GPU card would you need for that to be effective? I mean, most multi-processor systems have their biggest problems with memory management. AMD seems to have circumvented the issue with dedicated memory for each processor in their Opterons. So how much would 2 gigs of 600 MHz GDDR cost you so you could have 256MB per CPU in your multi-CPU solution?

You can't even get 2 gigs of regular run-of-the-mill 200 MHz DDR for $180, let alone 600 MHz GDDR.

Face it, CPUs and video cards are WAY different. It's like comparing apples and donkeys: they're both living, but that's where the similarities end.
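
A back-of-envelope Python sketch of the numbers being traded here. The Duron price and clock, the three-pipelines figure, the $500 budget, and the 256MB-per-CPU split all come from the posts above; nothing else is a real price quote.

```python
# Rough arithmetic behind both sides of the argument above.
# All figures are the ballpark 2004 numbers quoted in this thread.

duron_price = 40        # $ per 1.8 GHz Duron
duron_clock = 1.8       # GHz
pipelines = 3           # effective pipelines per CPU
budget = 500            # $ total
cpus = 8

cpu_cost = cpus * duron_price                    # $320
aggregate_ghz = cpus * pipelines * duron_clock   # ~43 GHz "single-pipeline equivalent"
leftover = budget - cpu_cost                     # $180 for RAM, interconnect, PCB

# Concillian's counterpoint: the memory dominates. A per-CPU split of
# 256 MB each needs 2 GB of GDDR, which the $180 leftover can't buy.
gddr_needed_mb = cpus * 256                      # 2048 MB

print(f"{cpus} Durons: ~{aggregate_ghz:.0f} GHz-equivalent for ${cpu_cost}, ${leftover} left over")
print(f"but a per-CPU memory split needs {gddr_needed_mb} MB of GDDR")
```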
 

everman

Lifer
Nov 5, 2002
This type of work will definitely be one of the "killer apps" for quantum computing. On massively parallel systems, this kind of stuff will be trivial. A 32-qubit GPU *drools*
 

Ages120

Senior member
May 28, 2004
Too bad development times for software seem to increase exponentially =( We will need AI to make our programs eventually, and then after that to create the content.
 

VirtualLarry

No Lifer
Aug 25, 2001
Originally posted by: Concillian
The REAL question is:
where would you get memory with 25-60 GB/sec of throughput to feed all that?

That's the thing, of course. Most modern GPUs are actually *still* remarkably memory-bandwidth-limited (although this will change with increased use of shader programs containing loops). That's why overclocking the GPU core clock generally has much less of an effect on performance than overclocking the memory.

I think that most current high-end cards have what, roughly 30GB/s theoretical max bandwidth?
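
A quick sketch of why that ~30 GB/s ceiling matters; the resolution, overdraw, frame rate, and per-pixel texture traffic below are illustrative assumptions, not measurements of any particular card.

```python
# Toy estimate of per-frame memory traffic. All of the per-pixel
# and per-frame figures here are illustrative assumptions.

width, height = 1600, 1200
overdraw = 3            # average number of times each pixel is touched
color_z_bytes = 12      # 32-bit color write + 32-bit Z read + Z write
texture_bytes = 16      # rough texel traffic for a couple of filtered textures
fps = 100

frame_bytes = width * height * overdraw * (color_z_bytes + texture_bytes)
gb_per_sec = frame_bytes * fps / 1e9
print(f"~{gb_per_sec:.1f} GB/s of framebuffer + texture traffic")
# ~16 GB/s -- already over half of a ~30 GB/s theoretical peak before
# antialiasing, vertex data, or bus inefficiency, which is why memory
# overclocks tend to help more than core overclocks.
```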

I think that if NV was able to design a quad-crossbar memory controller for their GPUs, then they (or someone else) could certainly do it to support a quad-CPU arrangement with shared access to high-speed local memory. The alternative would be to have each CPU get its own high-speed memory pool with a dedicated memory bus, and then have some sort of shared interconnect to the frame-buffer/general DRAM pool. Something like AMD's Opteron multi-CPU arrangement would probably be good here.
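
A toy Python model of the two layouts described here, just to show the shape of the trade-off. The bandwidth numbers (30 GB/s shared, 10 GB/s local, a 6.4 GB/s HyperTransport-like link) are assumptions for illustration, not specs of any real part.

```python
# Toy comparison: one shared crossbar pool vs. Opteron-style private pools
# plus an interconnect. Bandwidth figures are illustrative assumptions.

SHARED_CROSSBAR = {
    "shared_bw_gbs": 30.0,   # one 256 MB pool that every CPU competes for
    "local_bw_gbs": 0.0,
}

OPTERON_STYLE = {
    "local_bw_gbs": 10.0,        # private per-CPU memory, no contention
    "interconnect_bw_gbs": 6.4,  # link to the shared frame-buffer/DRAM pool
}

def worst_case_per_cpu_bw(layout, cpus=4):
    """Bandwidth one CPU sees if all CPUs stream at once (toy estimate)."""
    shared = layout.get("shared_bw_gbs", layout.get("interconnect_bw_gbs", 0.0))
    return shared / cpus + layout.get("local_bw_gbs", 0.0)

print(worst_case_per_cpu_bw(SHARED_CROSSBAR))  # 7.5 GB/s, all of it contended
print(worst_case_per_cpu_bw(OPTERON_STYLE))    # 11.6 GB/s, mostly private
```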

Originally posted by: Concillian
And how much MORE memory than a single GPU card would you need for that to be effective? I mean, most multi-processor systems have their biggest problems with memory management. AMD seems to have circumvented the issue with dedicated memory for each processor in their Opterons. So how much would 2 gigs of 600 MHz GDDR cost you so you could have 256MB per CPU in your multi-CPU solution?

I don't actually see why you would necessarily need a greater amount of memory with this sort of arrangement. You could still have a 256MB pool of high-speed shared memory, same as today's cards, and also have, say, a 64MB local pool of higher-speed non-shared memory for each CPU.
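
For what it's worth, that arrangement doesn't add much total memory. Using the sizes mentioned above (256MB shared, 64MB local per CPU, eight CPUs):

```python
# Total memory under the split described above (sizes as quoted in the post).
cpus = 8
shared_pool_mb = 256              # one shared high-speed pool, as on today's cards
local_pool_mb = 64                # small private pool per CPU

total_mb = shared_pool_mb + cpus * local_pool_mb
print(total_mb)                   # 768 MB -- well short of the 2 GB figure above
```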

Originally posted by: Concillian
You can't even get 2 gigs of regular run-of-the-mill 200 MHz DDR for $180, let alone 600 MHz GDDR.

I really don't see where it was implied that a multi-CPU "software GPU" setup would require RAM in quantities comparable to today's PCs' main memory. This isn't designed to be a server; it's more like a dedicated embedded system, but a more flexible one, specifically for graphics processing.

Originally posted by: Concillian
Face it, CPUs and video cards are WAY different. It's like comparing apples and donkeys: they're both living, but that's where the similarities end.

The only real considerations are price/performance ratios and the practical limitations of implementation (such as excess heat from multiple CPUs, though with newer low-power, high-speed chips that's not such a problem anymore).

The 3D pipeline can be implemented in either software or hardware; there's no inherent advantage to either implementation, save for the issues I just noted.
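
To make that last point concrete, here is a minimal software-pipeline sketch: the same stages a GPU hard-wires (transform, rasterize, shade) written as ordinary CPU code. It is flat-shaded, has no Z-buffer, and is purely illustrative.

```python
# Minimal software 3D-pipeline sketch: transform -> rasterize -> shade,
# all on the CPU. Flat shading, no Z-buffer, purely illustrative.

def transform(vertex, scale, offset):
    """'Vertex stage': a trivial 2D transform standing in for T&L."""
    x, y = vertex
    return (x * scale + offset, y * scale + offset)

def rasterize(tri):
    """'Rasterizer stage': yield the integer pixels inside a triangle
    using edge functions (works for either winding order)."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    def edge(ax, ay, bx, by, px, py):
        return (px - ax) * (by - ay) - (py - ay) * (bx - ax)
    min_x, max_x = int(min(x0, x1, x2)), int(max(x0, x1, x2))
    min_y, max_y = int(min(y0, y1, y2)), int(max(y0, y1, y2))
    for py in range(min_y, max_y + 1):
        for px in range(min_x, max_x + 1):
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                yield (px, py)

def shade(pixel, color, framebuffer):
    """'Pixel stage': write a flat color into a dict standing in for VRAM."""
    framebuffer[pixel] = color

fb = {}
tri = [transform(v, scale=8, offset=2) for v in [(0, 0), (4, 0), (0, 4)]]
for p in rasterize(tri):
    shade(p, (255, 0, 0), fb)
print(len(fb), "pixels shaded")
```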
 

Concillian

Diamond Member
May 26, 2004
Shared memory has significant programming challenges and potential latency problems. The easiest (and highest-performance) way out is a dedicated set of memory for each CPU, as AMD did with the Opterons. I think you'd have a hard time getting 8 CPUs sharing 256MB to perform well at all. I could be wrong, though.

The issue is that your memory bandwidth is divided among the CPUs. If you have 40 GB/sec, that gets split between 8 CPUs. What happens when 3 CPUs request the same memory at the same time? Extra latency, data collisions, and wasted time. What would have been 40 GB/sec for a single processor becomes 30-35 GB/sec when it's shared between multiple processors, because of the overhead involved. In an environment where memory throughput is SO important, I don't think you could get away with 256MB shared. You'd need at least 2-4 sets of 256MB to get reasonable performance. What I did not consider is that you might be able to use slower memory if you did that, because you'd have multiple buses.
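
A toy simulation of that effect: several CPUs pulling from one shared bus, with a made-up penalty whenever two of them hit the same bank in the same step. The 40 GB/sec figure comes from the post above; the bank count and penalty are assumptions chosen only to illustrate the 30-35 GB/sec range.

```python
import random

# Toy model of bandwidth lost to contention on a shared bus.
# Bank count and per-collision penalty are made-up illustrative numbers.

def effective_bandwidth(total_gbs=40.0, cpus=8, banks=16,
                        collision_penalty=0.1, trials=100_000):
    """Average usable bandwidth after charging a penalty whenever two or
    more CPUs target the same bank in the same step (toy model)."""
    random.seed(1)
    usable = 0.0
    for _ in range(trials):
        picks = [random.randrange(banks) for _ in range(cpus)]
        collisions = cpus - len(set(picks))
        usable += total_gbs * (1.0 - collision_penalty * collisions)
    return usable / trials

print(f"{effective_bandwidth():.1f} GB/sec effective")
# Lands around 33-34 GB/sec -- inside the 30-35 GB/sec range guessed above.
```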