Yeah, I'm pretty sure you're right. My old laptop's GPU was more than enough to play WoW on medium settings, but even at ultra-low it could only muster about 13 FPS due to a memory bandwidth bottleneck. Sometimes I wish they would integrate even 64 MB of dedicated video memory...would it be that hard? Then you could at least get some mildly serious gaming done with older titles.
😕
My experience as of late with an A6 APU has been quite good, and the idea has certainly shown its validity: for the most part, it makes decent entry-level gaming performance cheap for the masses.
Easily the three things that hold AMD's APUs back (especially in laptops) are 1) memory bandwidth, 2) GPU clock speed, and 3) CPU clock speed.
Graphics performance has been hit and miss across titles. Surprisingly, Crysis performed quite well: despite the popular assertion that it is badly optimized, the A6 managed just over 20 FPS average at 1366 x 768, no AA, medium settings, with physics and shaders on high. Left 4 Dead 2 and Team Fortress 2, as expected, could be run at just about max settings, though I would leave AA off since it is a very bandwidth-hungry feature.
The main "miss" was the lack of acceptable performance in Battlefield 3 and Call of Duty 4, though those were probably just too hard on the A6's CPU cores, which run at only 1.5 GHz and, even with turbo, reach only 2.4 GHz when the TDP is within acceptable limits. In my experience, CoD4 depends heavily on raw clock speed, not architecture. My CoD4 test was on a 40-player server with lots going on, at max settings without AA or AF, but that is something a modern system should be able to handle without issue.
Between more memory bandwidth and on-chip video memory, I think it's more feasible to keep boosting the bandwidth, especially since that also benefits the CPU; the exception would be if on-die video memory could hit 512 MB and at least 40 GB/s for something like a Trinity-class GPU while still being cost effective. The proposition of using that on-chip RAM as a cache for the CPU is an interesting idea, but that's just another layer programmers would have to figure out how to exploit, when PC configurations already vary quite wildly.
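For context on why shared system RAM is the chokepoint, here's a quick back-of-the-envelope sketch. The DDR3 speeds are my assumption for a typical A6 laptop of that era, not something from the posts above; the point is just how far dual-channel DDR3 falls short of the ~40 GB/s figure, and that whatever it delivers is split between the CPU and the GPU:

```python
def ddr_bandwidth_gbs(transfers_mt_s: float, channels: int = 2,
                      bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s for a DDR setup:
    transfers/sec x channels x bytes per transfer."""
    return transfers_mt_s * 1e6 * channels * (bus_width_bits / 8) / 1e9

# Dual-channel DDR3, shared by CPU *and* GPU on an APU:
print(f"DDR3-1333: {ddr_bandwidth_gbs(1333):.1f} GB/s")  # ~21.3 GB/s
print(f"DDR3-1600: {ddr_bandwidth_gbs(1600):.1f} GB/s")  # ~25.6 GB/s
```

Even the theoretical peak of dual-channel DDR3-1600 is well under the 40 GB/s that would make on-die video memory worthwhile, and real-world numbers are lower still once the CPU takes its share.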