
Asymmetric (Heterogeneous) Cores

tynopik

There are certain workloads that benefit from lots of small cores while others prefer fewer big, fast cores

Why not the best of both worlds?

http://software.intel.com/en-us/art...core-hardware-asymmetric-heterogeneous-cores/

At a crude level, integrating cpu and gpu on the same die already does this.

Then there's NVIDIA's Kal-El, which is advertised as quad-core but actually has 5 cores. The fifth is a special low-power core that handles near-idle tasks and then hands over to the real quad core if something more demanding comes up.
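
To make the hand-off idea concrete, here is a minimal sketch of how a switch between a low-power core and the main cores could be decided. This is not NVIDIA's actual logic: the function name, the cluster labels and both thresholds are made up for illustration.

```python
def pick_cluster(load, current, wake_above=0.6, sleep_below=0.2):
    """Return "companion" or "main" for the next interval.

    load: recent CPU utilisation in [0, 1].
    current: the cluster in use right now.
    wake_above / sleep_below: hypothetical thresholds; the gap between
    them is hysteresis so the chip doesn't flip back and forth.
    """
    if current == "companion" and load > wake_above:
        return "main"
    if current == "main" and load < sleep_below:
        return "companion"
    return current


def trace(loads, start="companion"):
    """Replay a list of load samples and record the cluster chosen each step."""
    current, history = start, []
    for load in loads:
        current = pick_cluster(load, current)
        history.append(current)
    return history
```

Feeding it a list of load samples with `trace(...)` shows the main cores waking on a spike and staying on until load drops well below the wake threshold.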

How long till we see something similar on mainstream Intel chips?

Perhaps 2-4 'big' cores and 10 'mini' cores?

Obviously a huge problem is getting Windows to support this properly

Or maybe only make it accessible if you program specifically for it like CUDA?
 
ARM doesn't have a max throughput strategy like we do in x86. ARM is about minimizing power and everything else comes second. The 5th core in Kal-El is for receiving push notifications and keeping the clock ticking while the device sleeps. This philosophy doesn't carry over to x86 because there are very few usage models shared by either machine.

Minimizing power consumption in x86 is done with downclocking and power gating, but when it comes time to perform, every resource is called and, like you said, for the way software works *today* there is a minimum amount of symmetry that is preferred, and too much asymmetry can be bad (but we have enough threads about Bulldozer so forget that).

Vectorizing any given job for ten tiny cores isn't always the best thing either, otherwise Knights Corner would've been a successful 45nm part. Non-serial workloads still have dependencies and branches of their own, and the large number (it'll be more than ten) of P54C cores it takes to surpass a single SNB core for some task may not be worth implementing (certainly not if you are frugal on xtors and watts). If your workload is the exception to this rule, you should buy Knights Corner as a peripheral accelerator. It doesn't mean your OS should run entirely on such a restricted architecture.
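
A rough way to put numbers on that trade-off is Amdahl's law. The sketch below assumes a hypothetical small core running at some fraction of the big core's single-thread speed, and a workload with a given parallel fraction. Both inputs are illustrative guesses, not measurements of P54C or SNB.

```python
def small_core_speedup(n_small, rel_speed, parallel_fraction):
    """Throughput of n_small small cores relative to one big core (Amdahl's law).

    rel_speed: single-thread speed of one small core relative to the big
        core (assumed value).
    parallel_fraction: share of the work that scales across cores (assumed).
    """
    serial = 1.0 - parallel_fraction
    # Run time on the small cores, in units of the big core's run time.
    time = (serial + parallel_fraction / n_small) / rel_speed
    return 1.0 / time


def cores_to_break_even(rel_speed, parallel_fraction, limit=1024):
    """Smallest number of small cores that beats one big core, or None."""
    for n in range(1, limit + 1):
        if small_core_speedup(n, rel_speed, parallel_fraction) > 1.0:
            return n
    return None


# Hypothetical inputs: small core at 1/4 the speed of the big core.
print(cores_to_break_even(0.25, 0.95))  # 95% parallel work
print(cores_to_break_even(0.25, 0.70))  # 70% parallel work
```

With these made-up inputs, the 95% parallel job needs five small cores to get ahead, while the 70% parallel job never gets there no matter how many cores are added, because the serial part alone already takes longer on a small core than the whole job takes on the big one.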

I think AMD and Intel are on the best possible track for heterogeneity without being too radical. The big general purpose cores do the best they can with the software they have, and the display controller has a graphics major with a math minor. It'll be up to the software guys to decide what workloads belong where, and by the time that comes the display controller will be more than capable enough.
 
The 5th core in Kal-El is for receiving push notifications and keeping the clock ticking while the device sleeps. This philosophy doesn't carry over to x86 because there are very few usage models shared by either machine.

Yeah, I don't think x64 would use such a feature for power management like Kal-El does. I was just giving an example of heterogeneous cores 'in the wild'.

Vectorizing a given job for ten tiny cores isn't always the best thing either

Of course it isn't ALWAYS the best thing, but SOMETIMES it is

hence heterogeneous designs 😉
 
My point is that the exceptions are far, far too few to implement in a conventional, mass appeal device. There are plenty of exceptions in HPC scenarios however and the market already has devices to satisfy the rare cases.

If, in 2012, Intel sectioned off 20% of a consumer CPU die for a Knights Corner device, it would be asleep 100% of the time because of the software we work with today. We still have a few years to wait for heterogeneous consumer applications to be written, and for these cases a 750 GFLOPS IGP is quite generous.
 
It's Knights Ferry, not Knights Corner. Knights Corner is the 22nm part that will be a real product. But your general assumptions are valid.

Knights Corner is very different from Knights Ferry. The graphics features are all cut out. It's now a many core processor with massive bandwidth graphics memory on PCB. It seems the change in strategy will work at least.

On Haswell, Intel has been aiming for accelerators. Whether those are FPGAs, graphics accelerators, or something entirely different, I don't know. But that seems to be the trend.
 
heh, sorry. It was all I could do without calling it Knights Turd or checking my facts. I was never interested in LRB. The heaviest job I do is x264 encoding but a few generations of QuickSync could change my view there. Really just wanting to see IVB.
 
That's sort of what Cell in the PS3 does, isn't it?

I imagine writing the software to schedule tasks is the difficult part of having different cores that specialize in different jobs. Figuring out the best assignment of tasks to cores on the fly is no small feat.
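
One naive way to picture the problem is a greedy scheduler that hands each task to whichever core would finish it soonest. This is only a sketch: the core names, relative speeds and task costs are hypothetical, and it assumes the cost of each task is known up front, which a real OS scheduler doesn't get to assume.

```python
def greedy_schedule(tasks, cores):
    """Assign each task to the core that would finish it earliest.

    tasks: list of (name, work) pairs; work is in arbitrary units
        (assumed known in advance).
    cores: dict mapping core name -> relative speed (hypothetical values).
    Returns (assignment dict, makespan).
    """
    busy_until = {core: 0.0 for core in cores}
    assignment = {}
    # Place the largest tasks first so they get first pick of the cores.
    for name, work in sorted(tasks, key=lambda t: t[1], reverse=True):
        best = min(cores, key=lambda c: busy_until[c] + work / cores[c])
        busy_until[best] += work / cores[best]
        assignment[name] = best
    return assignment, max(busy_until.values())


# Hypothetical machine: one big core, four times as fast as each small core.
cores = {"big0": 4.0, "small0": 1.0, "small1": 1.0}
tasks = [("encode", 8.0), ("ui", 1.0), ("sync", 1.0)]
assignment, makespan = greedy_schedule(tasks, cores)
print(assignment, makespan)
```

Here the heavy job lands on the big core and the two light ones go to the small cores. Even this toy version needs per-task cost estimates, and getting those on the fly is the hard part.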
 
While having low-power cores on the desktop would probably be a waste, how awesome would it be to have (with proper OS support, of course) a quad-core laptop with 2 low-power cores (think Atom or Bobcat), and 2 high-power cores + IGP?

On the road? 10 hours battery life. Plugged in, not painfully slow. Ah, I can dream...
 