- May 9, 2001
Recently I've been trying to understand the difference between CPUs and GPUs. There's a lot of technical material out there, but I'm just aiming for a basic conceptual understanding.
It's a bit difficult because the articles I find are either too basic (CPUs are programmable, GPUs are massively parallel) or too technical (getting into how code is handled).
So here's my understanding so far. Please make corrections, fill in the gaps, etc.:
The basic work units of a CPU are the ALU (integer math and logic) and the FPU (floating-point math). The fundamental work unit of a GPU is the FPU.
The CPU is general purpose and handles any type of processing. Because the work is so varied, programs left to themselves would stall at various stages of the pipeline and waste many cycles. Maximum performance is achieved by dedicating transistors to organizational work (branch prediction, out-of-order execution, prefetching, cache, etc.) that keeps the work units busy at all times.
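To make the branch prediction part a bit more concrete, here is a toy sketch of one textbook scheme, a 2-bit saturating counter. It is only an illustration under simplifying assumptions: a real CPU tracks many branches with far more elaborate predictors, and the function name, starting state, and branch patterns below are all made up for the example. The point is just that a regular branch is almost always guessed correctly, while an irregular one is not, and every wrong guess costs cycles.

```python
import random

def count_mispredictions(outcomes):
    """Run one 2-bit saturating counter over a sequence of branch outcomes.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    Returns how many predictions were wrong.
    """
    state = 2  # start at "weakly taken" (an arbitrary choice for this sketch)
    wrong = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            wrong += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return wrong

# A regular loop branch: taken 99 times, then the loop exits once.
regular = [True] * 99 + [False]

# An irregular, data-dependent branch: outcomes behave like coin flips.
rng = random.Random(0)  # fixed seed so the run is repeatable
irregular = [rng.random() < 0.5 for _ in range(100)]

regular_misses = count_mispredictions(regular)
irregular_misses = count_mispredictions(irregular)
print(regular_misses, irregular_misses)
```

The regular pattern is mispredicted only once, at the loop exit. The coin-flip pattern is mispredicted far more often, which is the kind of workload the extra "organizational" hardware is there to cope with.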
The GPU is more specialized, and its workload is very regular and predictable. When a CPU does this kind of work, the resources it spends on organizing are wasted. GPUs get much better performance than CPUs on this work because the organizing transistors are thrown out and more work units are put in their place.
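A rough way to picture "regular and predictable" work: the same small operation is applied to every element, and no element depends on any other. The sketch below is only a model in plain Python, not real GPU code. The `shade` function, the pixel values, and the lane counts are hypothetical, and "lanes" simply stands in for the number of work units running side by side.

```python
def shade(pixel):
    """Hypothetical per-pixel operation: brighten a value and clamp it to 255."""
    return min(pixel + 40, 255)

def run_in_lockstep(pixels, lanes):
    """Process pixels in groups of `lanes`, as if each group ran at the same time.

    Returns the results and the number of lockstep passes that were needed.
    """
    results = []
    steps = 0
    for start in range(0, len(pixels), lanes):
        group = pixels[start:start + lanes]
        # Every element in the group gets the same operation, independently.
        results.extend(shade(p) for p in group)
        steps += 1
    return results, steps

pixels = list(range(0, 256, 4))  # 64 made-up pixel values

one_lane, one_lane_steps = run_in_lockstep(pixels, lanes=1)
many_lanes, many_lane_steps = run_in_lockstep(pixels, lanes=16)

print(one_lane == many_lanes, one_lane_steps, many_lane_steps)  # True 64 4
```

Because the elements are independent, adding more work units cuts the number of passes without changing the answer, and none of it needs branch prediction or reordering to stay busy.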
Special CPU instruction sets like SSE4 improve performance by acting as shortcut commands (one instruction operating on several values at once) that skip unnecessary steps, which makes the CPU act more GPU-like.
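As an example of the "shortcut" idea, SSE4.1 includes a packed dot-product instruction (DPPS) that works on registers holding four 32-bit floats. The sketch below only models the instruction count in plain Python and does not use real SSE. The function names and the operation counting are assumptions made for illustration, and fewer instructions does not translate directly into proportionally fewer cycles.

```python
def dot4_scalar(a, b):
    """Dot product of two 4-element vectors, one multiply and one add at a time."""
    total = 0.0
    ops = 0
    for x, y in zip(a, b):
        total += x * y
        ops += 2  # counted as one multiply plus one add
    return total, ops

def dot4_packed(a, b):
    """Models a single packed instruction that consumes both 4-float registers."""
    assert len(a) == 4 and len(b) == 4  # one register holds four 32-bit floats
    return sum(x * y for x, y in zip(a, b)), 1

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.25, 2.0, 1.0]

scalar_result, scalar_ops = dot4_scalar(a, b)
packed_result, packed_ops = dot4_packed(a, b)
print(scalar_result, scalar_ops, packed_result, packed_ops)
```

Both versions give the same answer; the packed one just expresses it as a single operation over four lanes, which is the GPU-like part.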
Because GPU work is much simpler, the pipeline stages are shorter, and this is why GPU clock speeds are so much lower than those of CPUs built on the same manufacturing process.
PPUs (physics processors) are very much like GPUs, and their fundamental work unit is also the FPU. A PPU tries to improve on GPU performance by removing the unnecessary rendering resources and dedicating more transistors to physics processing. Whereas GPUs allow a vast improvement over CPUs, many believe the advantage of a PPU over a GPU is minimal because the two are so similar, so it would be better to run physics on a second GPU card, if not just rely on a multi-core CPU.