How many years till we see APUs as standalone gaming PCs?

cbn

Lifer
Mar 27, 2009
12,968
221
106
This is being written in response to the memory bandwidth criticisms raised in the "Sandy Bridge and Llano bad for gamers?" thread.

It was brought to my attention here that Intel stipulates memory bandwidth based on the number of cores/total CPU computational power. This makes sense to me: greater CPU throughput = greater need for memory bandwidth.

However, 3.5 years after the release of the Q6600 quad core, we are seeing 75% scaling in only one game so far (Dragon Age: Origins). All other games scale 50% better at best. Process node development and the ability to multiply cores seem to be moving faster than multi-threaded game programming.

Therefore, could we end up seeing a migration of RAM from higher-core-count systems to lower-core-count APU boards for the purpose of increasing video/graphics bandwidth?

Intel's 22nm should be upon us in 2012, with eight cores as the default processor. How much system memory bandwidth will that require? How much memory bandwidth for a 12-core single-socket system? Quad-channel DDR3? DDR5?
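For rough context, peak theoretical DIMM bandwidth is just channels x bytes per transfer x transfer rate. A quick back-of-the-envelope sketch in Python (the per-core figure at the end is purely my own placeholder, not an Intel spec):

Code:
# Peak theoretical bandwidth for a few DDR3 configurations.
# Each DDR3 channel is 64 bits (8 bytes) wide.
def peak_gbs(channels, mt_per_s):
    return channels * 8 * mt_per_s / 1000.0  # GB/s

print(peak_gbs(2, 1333))  # dual-channel DDR3-1333   -> ~21.3 GB/s
print(peak_gbs(3, 1333))  # triple-channel DDR3-1333 -> ~32.0 GB/s
print(peak_gbs(4, 1600))  # quad-channel DDR3-1600   -> ~51.2 GB/s

# If each core "wanted" roughly 3 GB/s (placeholder guess), an 8-core chip
# would want ~24 GB/s and a 12-core chip ~36 GB/s, already past dual channel.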

Next, consider how multi-threaded games could be by 2012. Will quad core still be sufficient? If so, would reusing the faster memory from these full-size/high-core-count mainboards be desirable for smaller Fusion systems, if only to help the GPU render graphics?

P.S. I realize gaming isn't exactly the most "noble" use of a computer. However, young kids like puzzles/entertainment. Maybe learning to operate/tinker with a PC is more educational than just playing on a console? Or maybe I am completely wrong and consoles will evolve into something greater than what they are now.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Oops, my mistake. I meant to put this thread in the CPU forum. (Moderators, can you please move it for me?)
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Quad core will be sufficient in 2012. There are only so many calculations that actually need to be made.

I agree. Quad core could still very well be the standard in 2012.

But then how many cores would be possible on a full-size (non-Fusion) die at 22nm? 12 cores? 16 cores? What kind of bandwidth is needed for that throughput?

If the memory bandwidth for these higher-core-count chips is indeed greater, could we see a strategy where the very same high-speed memory sticks *might* be used on lower-core-count APUs for a different reason?

The smallest Sandy Bridge CPU is supposed to be a dual core. Will we see quad core as the smallest CPU at 22nm? Could sharing high-speed DDR5 from a 16-core CPU be useful for a 22nm quad core's fused GPU?
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106

AMD Llano is a quad-core "Stars" CPU with *supposedly* 480 stream processors fused together on a single die. They are calling this fusion of CPU/GPU an "APU".

This is a very large IGP. The AMD 790GX, in contrast, has only 40 stream processors.

However, people have said system RAM bandwidth will hold this massive IGP back as far as 3D gaming goes. (I.e., the purpose of including large Fusion IGPs has more to do with heterogeneous computing than 3D graphics.)

Others have said AMD's "sideport memory" might be a way around this bottleneck. If so, could GDDR be used for this purpose?

P.S. If you are already an IT expert, please forgive my lame explanations. I'd rather you tell me how I am screwing this thread up instead.
 
Last edited:

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
Quad core will be sufficient in 2012. Because of Amdahl's Law, scaling to higher and higher core counts becomes tougher and tougher as more cores are added. So the time it takes to scale software to 8 cores efficiently will be ~2x the time it took to scale to 4 cores. That means it'll take maybe 10 years (I'm guessing here)!
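To put rough numbers on that: Amdahl's Law says the speedup on n cores is 1 / ((1 - p) + p/n), where p is the fraction of the work that parallelizes. A quick sketch (the 80% figure is just an illustration):

Code:
# Amdahl's Law: speedup on n cores when a fraction p of the work is parallel.
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

p = 0.8  # assume 80% of the game loop parallelizes (illustrative only)
for n in (2, 4, 8, 16):
    print(n, "cores:", round(speedup(p, n), 2), "x")
# -> 2 cores: 1.67x, 4 cores: 2.5x, 8 cores: 3.33x, 16 cores: 4.0x
# Each doubling of cores buys less, and the serial 20% caps speedup at 5x.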

Memory bandwidth is a big problem for APU systems. I expect that we will see a jump in memory bandwidth to accommodate APU systems rather than to accommodate high-end PCs or servers. Things like triple channel are really useful for APU systems, but unfortunately APU systems need to be low cost. The computer engineering guys really need to upgrade the physical interface used by memory modules. DIMM packaging isn't going to be good enough for really high clock speeds and bit widths. Too bad Rambus owns so many patents on low-voltage differential serial interfaces as applied to RAM, because moving RAM to a serial interface would be very useful.
 
Dec 30, 2004
12,553
2
76
You know, I imagine it will function something like an IGP: great potential, but bandwidth starved. Feh, my Radeon Xpress 200M would have run WoW at twice the framerate or more if it just had a single 64MB slice of dedicated memory.

We'll see when it gets here.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
High-bandwidth dedicated RAM is expensive, though. Current sideport memory is not very fast, nor is it high capacity. I think they'll probably remain band-aid solutions until real alternatives arrive.

And the real alternative, I believe, will come with Haswell, or near its timeframe. Those should be the first CPUs with eDRAM or other technologies like 2T SRAM to pack in more MB. At that point it probably becomes mandatory rather than just an "oh, we should do this" kind of thing. :p

512MB of DRAM as an MCM with a 100GB/s link to the CPU, and in later iterations fully stacked DRAM with larger capacities and hundreds of GB/s up to 1TB/s of bandwidth, should solve the memory problem.

Next is purely my opinion:
-By then graphics would be fully integrated into the CPU pipeline, just like FPUs were on the 486. We might be wondering at that point whether opting for dedicated GPUs is worth it at all. With the console-ization of upcoming games and such, is PC gaming fading into obscurity?

CPU cores:
Sandy Bridge: 6, 8 cores for high-end
Bulldozer: 8 cores

22nm: Die shrink, cores only increasing in enterprise servers
16nm: Possibly a hybrid of a few large cores + many small cores?
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
zephyrprime said:
Quad core will be sufficient in 2012. Because of Amdahl's Law, scaling to higher and higher core counts becomes tougher and tougher as more cores are added. So the time it takes to scale software to 8 cores efficiently will be ~2x the time it took to scale to 4 cores. That means it'll take maybe 10 years (I'm guessing here)!

http://games.on.net/article/7836/Metro_2033_-_Technical_QA

Take a look at what the developer of Metro 2033 says about core scaling in this interview. Not only does he claim the game will scale 100% going from dual core to quad core, he also mentions scaling beyond 8 cores.

from the linked article said:
games.on.net: So if I go from a dual to quad core, what performance increase can I expect?

Olez: If you are not bottlenecked by the video card, we have linear scaling – so double the performance. This goes all the way up to eight and sixteen cores.


If this is true, then a significantly less powerful video card would be needed to achieve playable frame rates at a given resolution and detail settings.

I can't wait to see the benchmarks.
 
Last edited:
Dec 30, 2004
12,553
2
76
zephyrprime said:
Quad core will be sufficient in 2012. Because of Amdahl's Law, scaling to higher and higher core counts becomes tougher and tougher as more cores are added. So the time it takes to scale software to 8 cores efficiently will be ~2x the time it took to scale to 4 cores. That means it'll take maybe 10 years (I'm guessing here)!

Memory bandwidth is a big problem for APU systems. I expect that we will see a jump in memory bandwidth to accommodate APU systems rather than to accommodate high-end PCs or servers. Things like triple channel are really useful for APU systems, but unfortunately APU systems need to be low cost. The computer engineering guys really need to upgrade the physical interface used by memory modules. DIMM packaging isn't going to be good enough for really high clock speeds and bit widths. Too bad Rambus owns so many patents on low-voltage differential serial interfaces as applied to RAM, because moving RAM to a serial interface would be very useful.

cbn said:
http://games.on.net/article/7836/Metro_2033_-_Technical_QA

Take a look at what the developer of Metro 2033 says about core scaling in this interview. Not only does he claim the game will scale 100% going from dual core to quad core, he also mentions scaling beyond 8 cores.

If this is true, then a significantly less powerful video card would be needed to achieve playable frame rates at a given resolution and detail settings.

I can't wait to see the benchmarks.

wonder what they're computing
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,227
126
IntelUser2000 said:
Next is purely my opinion:
-By then graphics would be fully integrated into the CPU pipeline, just like FPUs were on the 486. We might be wondering at that point whether opting for dedicated GPUs is worth it at all. With the console-ization of upcoming games and such, is PC gaming fading into obscurity?
Scary, but highly likely. Look at what happened to the dedicated FPU number-crunching cards that had one or more i860 RISC chips onboard for their high FPU performance. I don't really see them selling anymore. Once integrated on-die GPUs reach sufficient performance, I see dedicated GPUs dying off.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Haswell.

By then we'll have so many cores that we'll be limited by memory bandwidth for performance scaling in most segments. Since we can't have dozens of memory channels, the only logical solutions are things like eDRAM.

Rumors of a "revolutionary" cache hierarchy and such make me believe Haswell, and whatever AMD has by then, will have massive eDRAM. 8 cores and above is also generally believed to be the point where memory bandwidth becomes a real problem.

Imagine what'll happen to the performance of iGPUs when eDRAM sits next to the CPU with 100-200GB/s of bandwidth. Sure, you'll still have system memory, but that DRAM will serve most CPU and GPU functions.

EDIT: Sandy Bridge is supposed to have a version with 2 GPUs in it. I don't know if I want to believe an SLI-like configuration ON THE DIE with iGPUs, but that's not the significant part of the story.

The Core i3/i5 GPU is roughly at GeForce 9400 and AMD 790GX performance.

Sandy Bridge's iGPU clock is said to be 1-1.4GHz, and it can also share the L3 cache. Along with some minor architecture improvements, it's safe to say it'll be 2x faster than the Core i3/i5 iGPU.

Now that is not enough to get close to Llano's iGPU, but with 2x the cores it might be. So from both camps we'll see 3x+ faster iGPUs. That'll probably be enough to run most games playably at high settings at 17" monitor resolutions.
 
Last edited:

grimpr

Golden Member
Aug 21, 2007
1,095
7
81
You've got that wrong, buddy, if you believe that whatever Intel puts into the Haswell die (fast interconnects, massive eDRAM, and iGPUs as you call them) will stand a chance against whatever Nvidia and AMD have up their sleeves in discrete GPUs, concerning bandwidth and raw performance. Moore's Law works for those two companies too, you know. Don't be surprised to see 2011/2012 GPUs with 500GB/s of bandwidth and oceans of ALUs, packing 8GB of GDDR5 RAM onboard. 28nm at GlobalFoundries and TSMC is slowly rolling out.

No offense, but you should start exploring the graphics world a little more. It's nice and cosy living and breathing Intel's manifesto.

 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Yes, you are right there. But I never said it'll be as fast as the top-end GPUs.

Look at how well the 9600GT runs at more sane and common resolutions and settings: http://www.anandtech.com/video/showdoc.aspx?i=3742&p=3

Right now IGPs do that only at low or medium settings. At those settings you do not need the 500GB/s or so of bandwidth that high-end GPUs will have by then. Don't you think discrete GPUs will start feeling squeezed when even new games run that well?

I'm pretty sure the reality won't be that rosy for iGPUs. AMD will probably realize that their profit margins might decrease when that happens. But who knows, maybe the cost of getting an iGPU won't be $5 like today, but $30-40 like a really low-end card.
 
Last edited:

Martimus

Diamond Member
Apr 24, 2007
4,490
157
106
IntelUser2000 said:
Yes, you are right there. But I never said it'll be as fast as the top-end GPUs.

Look at how well the 9600GT runs at more sane and common resolutions and settings: http://www.anandtech.com/video/showdoc.aspx?i=3742&p=3

Right now IGPs do that only at low or medium settings. At those settings you do not need the 500GB/s or so of bandwidth that high-end GPUs will have by then. Don't you think discrete GPUs will start feeling squeezed when even new games run that well?

I'm pretty sure the reality won't be that rosy for iGPUs. AMD will probably realize that their profit margins might decrease when that happens. But who knows, maybe the cost of getting an iGPU won't be $5 like today, but $30-40 like a really low-end card.

Modern games will change with time as well. The IGP on my laptop runs Quake 1 great, and has no problems when I run NetHack or X-Com. The point is that the software will adjust when the hardware is there to adjust to. Look at how Microsoft has increased the number of background tasks in its OSes over the years. They do that because they can, with more modern equipment than they had in the past.

AMD may see a short-term decrease in GPU sales when Llano is released, but I expect software to adjust and make those IGPs the new minimum in supported graphics.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Martimus said:
AMD may see a short-term decrease in GPU sales when Llano is released, but I expect software to adjust and make those IGPs the new minimum in supported graphics.

Not only that, but we have to consider the sheer number of lower-resolution LCDs still on the used market that could make these APUs popular as gaming PCs.

Heck, to this day the video card forum gets flooded with questions about using HD 58xx cards with older 12x10 or 14x9 LCD monitors. This is very surprising to me, considering 1080p monitors start at only $150.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
The following question is a little bit off topic:

Does anyone think the era of heterogeneous computing (i.e., APUs) could stimulate the development of wider (i.e., higher-IPC) x86 CPU cores?

If so, why?
 

grimpr

Golden Member
Aug 21, 2007
1,095
7
81
x86 cores are more than sufficient as of now; I doubt we will see major advances there, maybe 3-5%. Intel and AMD are focusing their major efforts on the uncore section. Llano is all about the uncore, like Sandy Bridge.
 
Last edited:

dac7nco

Senior member
Jun 7, 2009
756
0
0
VirtualLarry said:
Scary, but highly likely. Look at what happened to the dedicated FPU number-crunching cards that had one or more i860 RISC chips onboard for their high FPU performance. I don't really see them selling anymore. Once integrated on-die GPUs reach sufficient performance, I see dedicated GPUs dying off.

I've been running server-based, realtime en/decrypt systems based on headless G8x Nvidia cards for a number of years now. They make comparable Quadro/Tesla machines look like shit... they are only 8800-series NV cards, with the software to give them balls. I thought I'd post this, as VirtualLarry's post reminded me of the old Fortezza cards we used way back when I had a balls-out ISDN PRI setup.

Daimon
 

hans007

Lifer
Feb 1, 2000
20,212
18
81
cbn said:
AMD Llano is a quad-core "Stars" CPU with *supposedly* 480 stream processors fused together on a single die. They are calling this fusion of CPU/GPU an "APU".

This is a very large IGP. The AMD 790GX, in contrast, has only 40 stream processors.

However, people have said system RAM bandwidth will hold this massive IGP back as far as 3D gaming goes. (I.e., the purpose of including large Fusion IGPs has more to do with heterogeneous computing than 3D graphics.)

Others have said AMD's "sideport memory" might be a way around this bottleneck. If so, could GDDR be used for this purpose?

P.S. If you are already an IT expert, please forgive my lame explanations. I'd rather you tell me how I am screwing this thread up instead.

It probably won't be that big. The CPU part is probably the Propus core, which is 140mm² at 45nm.

Llano will probably be on 32nm, so that should cut the die size down quite a bit.

If the GPU components run at the CPU's speed, say 2-2.5GHz, it will be fairly fast for a 400-480 shader chip. So basically the GPU part would be pretty similar to, say, a 5570 or 5670 core, but using DDR3 system RAM (which would be 128-bit, say DDR3-1333). So it would have tons of shader power if it runs at 2.5GHz or more. Given the memory constraints, maybe they could go with 200 shaders running at CPU speed and the RAM would still bottleneck it (like how DDR3 bottlenecks a 5570, while the 5670 had GDDR5).
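That bandwidth gap is easy to put numbers on. A quick sketch (assuming nominal dual-channel DDR3-1333 for the APU and a 5670-class card with 4 GT/s GDDR5):

Code:
# Peak bandwidth = bus width (bits) / 8 * transfer rate (MT/s), in GB/s.
def gbs(bus_bits, mt_per_s):
    return bus_bits / 8 * mt_per_s / 1000.0

print(gbs(128, 1333))  # dual-channel DDR3-1333 shared with the CPU -> ~21.3 GB/s
print(gbs(128, 4000))  # HD 5670-class 128-bit GDDR5 at 4 GT/s      -> ~64.0 GB/s

So a Llano-class IGP sharing system DDR3 would see roughly a third of the bandwidth a similar discrete card gets from its own GDDR5.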

Anyhow, it will be interesting to see how it turns out. Also, a 40nm 5670 die is 104mm², and that has 400 shaders. If it really is just a Propus core plus a 5670 core or something like that, it would still not be that big (around the size of a Core i7 920, which is 263mm²), even at the current 45 and 40nm. At 32nm it would probably not be excessively large.