I don't understand the concept behind Intel's integrated GPUs, even for people who want a low-end GPU (which is what they're designed for).
The reason is that Intel could just beef up the AVX units and core count and add texture units. With FMA, 512-bit-wide FPUs per core, and good drivers, I think they could do a lot more by emulating the ROPs and depth/stencil units in software.
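To make the ROP-emulation point concrete, here's a minimal sketch (my own illustration, not anything Intel ships) of software alpha blending on 8 pixels at once with AVX2 + FMA intrinsics; the function name and float-per-channel layout are just assumptions for the example:

```c
// Sketch: "over" blend, dst = a*src + (1-a)*dst, rewritten as a single
// FMA: dst + a*(src - dst). Processes 8 float pixels per call.
#include <immintrin.h>

void blend8(float *dst, const float *src, const float *alpha)
{
    __m256 d = _mm256_loadu_ps(dst);
    __m256 s = _mm256_loadu_ps(src);
    __m256 a = _mm256_loadu_ps(alpha);
    // One fused multiply-add covers the whole blend equation for 8 lanes.
    __m256 out = _mm256_fmadd_ps(a, _mm256_sub_ps(s, d), d);
    _mm256_storeu_ps(dst, out);
}
```

With FMA this is one arithmetic op per 8 pixels for the blend itself, which is exactly why I think software ROPs stop being crazy.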
Instead of Ivy Bridge, they should've just done this:
- 8 or 10 cores (16 or 20 threads) with AVX2 and 12-16 MB of low-latency L3 cache (or even per-core L3 plus a shared L4) @ 3.5 GHz or so. Most games don't take advantage of more than 4 cores, so 4 or 6 cores (plus the texture units) would be left for blending, pixel/vertex/geometry shaders, and depth testing (see the sketch right after this list). Via the driver, the number of cores doing graphics could be user-assigned.
- A fast QPI (QuickPath) link.
- On-die TMDS for display output.
- On-die texture addressing/filtering units with 1 MB of L2 cache @ 800 MHz.
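For the blending/depth item above, here's a minimal sketch of an emulated depth test over 8 fragments with AVX intrinsics (the function name and plain float z-buffer layout are assumptions for illustration):

```c
// Sketch: emulated depth test (LESS depth func) for 8 fragments at once.
// Returns a lane mask of fragments that passed; the z-buffer is updated
// only where the test (and the incoming coverage mask) passed.
#include <immintrin.h>

__m256 depth_test8(float *zbuf, __m256 frag_z, __m256 coverage)
{
    __m256 zold = _mm256_loadu_ps(zbuf);
    __m256 pass = _mm256_and_ps(_mm256_cmp_ps(frag_z, zold, _CMP_LT_OQ),
                                coverage);
    // Write the new depth only in lanes that passed.
    __m256 znew = _mm256_blendv_ps(zold, frag_z, pass);
    _mm256_storeu_ps(zbuf, znew);
    return pass;  // reuse as the color write mask for the blend stage
}
```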
That would result in a higher TDP and cost more, but if the iGPU is going to be low-end anyway, then I see no reason not to just emulate all the traditional GPU elements other than the texture units.
Anyway, I was looking for a critique of this idea. Programmable blending/depth should be the future, IMO: dedicated hardware is not always faster than generic hardware, especially now that AVX2 brings FMA, given good programming and drivers. Intel is quite a bit ahead of TSMC in process technology, so they should use that to their advantage.
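For the user-assigned graphics core count from the spec list above, I picture something like this minimal sketch (Linux/pthreads; the raster_worker loop, the gfx_cores parameter, and the idea of a driver exposing such a knob are all my own assumptions, not an existing API):

```c
// Sketch: pin a pool of software-rasterizer threads to the cores the
// user reserved for graphics, leaving the rest for game logic.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static void *raster_worker(void *arg)
{
    /* shade/blend/depth loop would live here */
    return NULL;
}

void start_gfx_pool(int first_core, int gfx_cores)
{
    for (int i = 0; i < gfx_cores; i++) {
        pthread_t t;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(first_core + i, &set);  // one worker per reserved core
        pthread_create(&t, NULL, raster_worker, NULL);
        pthread_setaffinity_np(t, sizeof(set), &set);
        pthread_detach(t);
    }
}
```

A driver control panel could just re-run this partitioning with a different gfx_cores value, which is all "user-assigned" really needs to mean here.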
