Originally posted by: Gamingphreek
Well, I remember someone somewhere said that.
Well, if "someone" who was "somewhere" said it, it MUST be true.
Also, you can make inferences as to how long the instruction pipeline is. I mean:
#1 GPUs cannot get much over 600 MHz or so.
#2 GPUs are much more complex than CPUs; otherwise we would just start tacking CPUs onto video cards.
-Kevin
Keep in mind the second point I added above -- a modern GPU gets 8-16 times more work done per clock than its raw clock speed would suggest, because it runs 8-16 pixel pipelines in parallel (less than that for vertex shader operations, although those can run in parallel with pixel shader operations). On top of that, your average modern GPU does more "work" (in a computational sense) per clock cycle than your average CPU, since it is working on more specialized tasks and has dedicated hardware for most of them. It might, for instance, compute the transformed coordinates of a triangle vertex in one GPU clock cycle -- something that would probably take dozens (if not hundreds) of micro-ops on a CPU.
Essentially, the GPU makers have already done what you want them to do -- they've trimmed their GPUs down to the bare minimum required for 3D rendering and made that one task as fast as possible.
I wouldn't necessarily say they're "more complex" than modern CPUs, but they're definitely complex in different ways. CPUs have a single very long, complicated pipeline (30+ stages on some designs), fairly large instruction sets, and multiple layers of on-die cache, as well as complex execution units such as ALUs, FPUs, and the SIMD units that implement instruction sets like SSE. GPUs have, in essence, a large number of relatively simple fixed-function pixel pipelines (if I had to estimate, I'd say maybe 10 stages each), vertex and pixel shader units, and some hardware to help with AA and AF.
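To put rough numbers on the two claims above (parallel pipelines multiplying per-clock throughput, and a vertex transform costing dozens of scalar operations in software), here's a quick Python sketch. Everything in it is an assumption for illustration: the 400 MHz clock and 16 pipelines are hypothetical figures, the function names are made up, and "one result per pipeline per clock" is an idealized peak rather than a measured number for any real card.

```python
# Back-of-the-envelope sketch. All figures and names here are illustrative
# assumptions, not specifications of any real GPU or CPU.


def peak_pixel_results_per_second(clock_hz, pixel_pipelines):
    """Idealized peak: every pipeline retires one pixel result per clock."""
    return clock_hz * pixel_pipelines


def transform_vertex(matrix, vertex):
    """Multiply a 4x4 matrix by a 4-component vertex in plain software.

    Returns the transformed vertex plus the number of scalar multiplies
    and adds performed, to show how much work one transform involves.
    """
    result = []
    multiplies = 0
    adds = 0
    for row in matrix:
        total = row[0] * vertex[0]
        multiplies += 1
        for m, v in zip(row[1:], vertex[1:]):
            total += m * v
            multiplies += 1
            adds += 1
        result.append(total)
    return result, multiplies, adds


# Hypothetical GPU: 400 MHz core clock, 16 parallel pixel pipelines.
peak = peak_pixel_results_per_second(400_000_000, 16)
print("peak pixel results per second:", peak)

# Translation by (10, 20, 30) applied to the homogeneous vertex (1, 2, 3, 1).
translate = [
    [1, 0, 0, 10],
    [0, 1, 0, 20],
    [0, 0, 1, 30],
    [0, 0, 0, 1],
]
moved, muls, adds = transform_vertex(translate, [1, 2, 3, 1])
print("transformed vertex:", moved)
print("scalar multiplies:", muls, "scalar adds:", adds)
```

Under those assumptions, a single pipeline would need to run at 6.4 GHz to match the 16-wide part clocked at 400 MHz, and the software transform alone takes 16 multiplies and 12 adds before you count any loads, stores, or loop overhead.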
If you're interested in looking into GPU architecture in more detail, AT and Tom's Hardware have both published a number of articles on it (other sites, I'm sure, have done similar things -- check out Beyond3D and FiringSquad as well). Exact figures such as the number of pipeline stages generally aren't public (and you couldn't compare them directly to a CPU pipeline anyway, since the two do different things), but you can get a general sense of how the data flow works.
AT article on 6800
AT article on NV30
AT article on R420
AT article on R300
THG NV40 article
THG R420 article
THG R300 article