I don't understand the concept behind Intel's integrated GPUs, even for people who want a low-end GPU (which is what they're designed for).
The reason is that Intel could just beef up the AVX units and core count and add texture units. With FMA, 512-bit-wide FPUs per core, and good drivers, I think they could do a lot more by emulating the ROPs and depth/stencil units in software.
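To make the ROP-emulation point concrete, here's a minimal sketch (my own illustration, not anything Intel ships) of software alpha blending on 8 pixels at once with AVX2 + FMA intrinsics; the function name and float-per-channel layout are just assumptions for the example:

```c
// Sketch: "over" blend, dst = a*src + (1-a)*dst, rewritten as a single
// FMA: dst + a*(src - dst). Processes 8 float pixels per call.
#include <immintrin.h>

void blend8(float *dst, const float *src, const float *alpha)
{
    __m256 d = _mm256_loadu_ps(dst);
    __m256 s = _mm256_loadu_ps(src);
    __m256 a = _mm256_loadu_ps(alpha);
    // One fused multiply-add covers the whole blend equation for 8 lanes.
    __m256 out = _mm256_fmadd_ps(a, _mm256_sub_ps(s, d), d);
    _mm256_storeu_ps(dst, out);
}
```

With FMA this is one arithmetic op per 8 pixels for the blend itself, which is exactly why I think software ROPs stop being crazy.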
Instead of Ivy Bridge, they should've just done this:
- 8 or 10 cores (16 or 20 threads) with AVX2 and 12-16 MB of low-latency L3 cache (or even per-core L3 plus a shared L4) @ 3.5 GHz or so. Most games don't take advantage of more than 4 cores, so 4 or 6 cores (plus the texture units) would be left for blending, pixel/vertex/geometry shaders, and depth testing (see the sketch right after this list). Via the driver, the number of cores doing graphics could be user-assigned.
- A fast QPI (QuickPath) link.
- On-die TMDS for display output.
- On-die texture addressing/filtering units with 1 MB of L2 cache @ 800 MHz.
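For the blending/depth item above, here's a minimal sketch of an emulated depth test over 8 fragments with AVX intrinsics (the function name and plain float z-buffer layout are assumptions for illustration):

```c
// Sketch: emulated depth test (LESS depth func) for 8 fragments at once.
// Returns a lane mask of fragments that passed; the z-buffer is updated
// only where the test (and the incoming coverage mask) passed.
#include <immintrin.h>

__m256 depth_test8(float *zbuf, __m256 frag_z, __m256 coverage)
{
    __m256 zold = _mm256_loadu_ps(zbuf);
    __m256 pass = _mm256_and_ps(_mm256_cmp_ps(frag_z, zold, _CMP_LT_OQ),
                                coverage);
    // Write the new depth only in lanes that passed.
    __m256 znew = _mm256_blendv_ps(zold, frag_z, pass);
    _mm256_storeu_ps(zbuf, znew);
    return pass;  // reuse as the color write mask for the blend stage
}
```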
That would result in a higher TDP and cost more, but if the iGPU is going to be low-end anyway, then I see no reason not to just emulate all the traditional GPU elements other than the texture units.
Anyway, I was looking for a critique of this idea. Programmable blending/depth should be the future, IMO: dedicated hardware is not always faster than generic hardware, especially now that AVX2 brings FMA, given good programming and drivers. Intel is quite a bit ahead of TSMC in process technology, so they should use that to their advantage.
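For the user-assigned graphics core count from the spec list above, I picture something like this minimal sketch (Linux/pthreads; the raster_worker loop, the gfx_cores parameter, and the idea of a driver exposing such a knob are all my own assumptions, not an existing API):

```c
// Sketch: pin a pool of software-rasterizer threads to the cores the
// user reserved for graphics, leaving the rest for game logic.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static void *raster_worker(void *arg)
{
    /* shade/blend/depth loop would live here */
    return NULL;
}

void start_gfx_pool(int first_core, int gfx_cores)
{
    for (int i = 0; i < gfx_cores; i++) {
        pthread_t t;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(first_core + i, &set);  // one worker per reserved core
        pthread_create(&t, NULL, raster_worker, NULL);
        pthread_setaffinity_np(t, sizeof(set), &set);
        pthread_detach(t);
    }
}
```

A driver control panel could just re-run this partitioning with a different gfx_cores value, which is all "user-assigned" really needs to mean here.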
