
Engineers boost Llano GPU performance by 20% without overclocking

Olikan

Platinum Member
Found this on the web...

To achieve the 20% boost, the researchers reduce the CPU to a fetch/decode unit, and the GPU becomes the primary computation unit. This works out well because CPUs are generally very strong at fetching data from memory, and GPUs are essentially just monstrous floating point units. In practice, this means the CPU is focused on working out what data the GPU needs (pre-fetching), the GPU’s pipes stay full, and a 20% performance boost arises.

look at the link, the journalist is very confused 🙄

http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu-performance-by-20-without-overclocking

Our experiments on a set of benchmarks show that our proposed preexecution improves the performance by up to 113% and 21.4% on average
from the link of the university
http://news.ncsu.edu/releases/wmszhougpucpu/
 
So are we using the CPU and main memory as one giant branch predictor too? It makes no sense. I don't see how this 20% could have any relevance in real-world computing.
 
So are we using the CPU and main memory as one giant branch predictor too? It makes no sense. I don't see how this 20% could have any relevance in real-world computing.
so now the new Llano will have a better GPU and a worse CPU!?
so before we had a good CPU with integrated graphics as a joke
now we will have good graphics with a joke CPU?
 
This looks essentially like run-ahead execution, allowing otherwise idle CPU resources to speed up execution by doing memory fetches for the GPU and warming up the cache. Previously this idea was only applied between threads within the CPU. Interesting idea, although I'm sure the quoted improvements are extremely optimistic 🙂

For some reason I got the initial impression that this was going to be some hardware trick but it's just clever software optimization.
 
so now the new Llano will have a better GPU and a worse CPU!?
so before we had a good CPU with integrated graphics as a joke
now we will have good graphics with a joke CPU?
An integrated GPU with the power of a 3870 is NOT a joke.
 
Updated @ 17:54: The co-author of the paper, Huiyang Zhou, was kind enough to send us the research paper. It seems production silicon wasn't actually used; instead, the software tweaks were carried out on a simulated future AMD APU with shared L3 cache (probably Trinity). It's also worth noting that AMD sponsored and co-authored this paper.

Erm, ok. Larrabee destroys all when it comes to these kinds of research results. 😀
 

wow this is really deep stuff. I wonder if IDC, CTho, Intel etc. will be able to make sense of these tips! Seems strange to just throw this performance-enhancing mod to the wolves. Do you think we'll be able to apply a BIOS update to do the same thing on our desktops?
 
Horsepower means very little though, doesn't it? It's my understanding that what is holding integrated GPUs back is memory-to-GPU bandwidth. Or is that wrong?

yeah I'm pretty sure you're right. My old laptop had more than enough GPU to play WoW on medium settings but even with it at ultra low it could only muster about 13fps due to bandwidth bottleneck. Sometimes I wish they would give you 64MB video memory integrated...would it be that hard? Then you could at least get some mildly serious gaming done with older games.

The real world still uses dedicated video for ultimate performance... gl

😕
 
wow this is really deep stuff. I wonder if IDC, CTho, Intel etc. will be able to make sense of these tips! Seems strange to just throw this performance-enhancing mod to the wolves. Do you think we'll be able to apply a BIOS update to do the same thing on our desktops?

What I get out of this is that they are basically making GPGPU stuff faster with the help of the CPU, not making CPU stuff faster.

This paper presents a novel approach to utilize the CPU resource to facilitate the execution of GPGPU programs on fused CPU-GPU architectures.

So you need to start with an app that already lends itself nicely to GPGPU compute, then rewrite it to take advantage of the CPU, and your GPGPU application will run faster.

Sounds like good stuff, but it's not going to benefit x86 apps as far as I can tell.
 
Wouldn't it be faster still to make the whole die GPU blocks? Using the whole CPU power for one part of the GPU's function seems inefficient on power.
 
It's ironic that the CPU is being used to speed up the GPU.

Things like this reinforce that the future will be homogeneous architectures, not heterogeneous ones. The CPU simply needs high-throughput execution units like a GPU. And Intel will offer just that with the AVX2 support in Haswell next year. GPGPU's days are numbered.
 
Die shot? I didn't see it, do you have full access to the article?

[Image: amd-llano-die-348x196.jpg]
 
Things like this reinforce that the future will be homogeneous architectures, not heterogeneous ones. The CPU simply needs high-throughput execution units like a GPU. And Intel will offer just that with the AVX2 support in Haswell next year. GPGPU's days are numbered.

This is something that I am very curious about...

AMD's GCN supports x86 instructions, while AVX-1024 seems to be a holy grail.
Which one is better?
 
Obviously Intel employees are going to pump whatever they have. But the work MS put into C++ AMP would suggest that GPGPU has a strong future; it isn't dead. lol There's more potential in 10,000 shaders than there is in AVX2.
 