
I did not realize we were so close to cheap, 32 core desktop computers

Some ARM cores are really tiny. For example, at 40 nm, an ARM Cortex A5 is only 0.53 mm^2, including caches, and would consume only 80 mW if you could run it at 1 GHz.

http://www.arm.com/products/processors/cortex-a/cortex-a5.php

In terms of die area, and ignoring on-chip networks, and uncore-type stuff, you could fit 554 such ARM cores in the area of a single quad-core Sandy Bridge CPU (and this is comparing TSMC 40 nm to Intel 32 nm, so on the same node, it would be even more). In terms of power budget, you could fit 1187 ARM cores in the power budget of a single quad-core Sandy Bridge CPU.

Let's say you want to build a ~300 mm^2 chip (like 4C Sandy Bridge) out of these ARM cores, and that you're going to have to dedicate 50% of your die area to things like uncore, memory controllers, and some last level cache. And then let's round down for some nice numbers, so we end up with a 256 core ARM chip. This is a totally feasible thing that could be built right now.
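The arithmetic above can be checked with a quick back-of-the-envelope script. The Sandy Bridge die size and TDP below are assumptions back-solved from the post's own figures (554 cores × 0.53 mm², 1187 cores × 80 mW), not vendor specs:

```python
# Figures from the post (illustrative, not official specs):
# Cortex-A5 at TSMC 40 nm: ~0.53 mm^2 per core, ~80 mW at 1 GHz.
A5_AREA_MM2 = 0.53
A5_POWER_W = 0.080

SNB_4C_DIE_MM2 = 294   # assumed quad-core Sandy Bridge die area
SNB_4C_TDP_W = 95      # assumed quad-core Sandy Bridge power budget

cores_by_area = int(SNB_4C_DIE_MM2 / A5_AREA_MM2)   # ignoring uncore/interconnect
cores_by_power = int(SNB_4C_TDP_W / A5_POWER_W)

# A ~300 mm^2 chip with 50% of the area reserved for uncore, memory
# controllers, and last-level cache:
usable_mm2 = 300 * 0.5
cores_300mm2 = int(usable_mm2 / A5_AREA_MM2)   # -> 283, rounded down to 256

print(cores_by_area, cores_by_power, cores_300mm2)
```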

FYI, Amdahl's law is not why these things aren't being made. If you can find just the right workload it might make sense to build such a chip. Namely, an embarrassingly parallel / data parallel workload that is bound neither by memory bandwidth/core nor memory capacity/core, and for some reason doesn't map well to a SIMD GPU, or needs more memory capacity than a GPU can offer. Gustafson's law trumps Amdahl's law in most practical situations, so people on this board shouldn't be worried about Amdahl's law, in my opinion.
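To make the Amdahl-vs-Gustafson contrast concrete, here is a minimal sketch of both speedup formulas evaluated for a hypothetical 256-core chip; the parallel fractions are illustrative, not measurements:

```python
def amdahl_speedup(p, n):
    """Fixed problem size: speedup is capped by the serial fraction (1 - p)."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(p, n):
    """Scaled problem size: the parallel portion grows with core count n."""
    return (1.0 - p) + p * n

n = 256
for p in (0.95, 0.99):
    print(p, amdahl_speedup(p, n), gustafson_speedup(p, n))
```

With a 95% parallel workload, Amdahl caps the 256-core speedup at about 18.6x, while Gustafson's scaled-workload view gives about 243x — which is why growing the problem with the machine makes many-core chips look much more attractive.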
 
Once upon a time I would have said that we won't need more than 8 cores on a system. However, as time has gone on, it looks like concurrent paradigms are really starting to take hold in the programming world (they are being integrated ever more tightly into languages).

I imagine that within the next 10 years, most new applications will have some pretty strong abilities to use most of the cores on your box.
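As a sketch of what that language-level integration already looks like, here is a Python example that farms a toy CPU-bound task out across all available cores; `count_primes` and the chunk sizes are made up purely for illustration:

```python
import os
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    # Deliberately naive and CPU-bound: trial division up to sqrt(n).
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # One chunk of work per core on the box.
    chunks = [20_000] * (os.cpu_count() or 4)
    with ProcessPoolExecutor() as pool:   # defaults to one worker per core
        results = list(pool.map(count_primes, chunks))
    print(sum(results))
```

The application code just maps work over a pool; the runtime decides how many cores to use, which is exactly the kind of abstraction that would let "most new applications" scale to however many cores the box has.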
 

There seem to be entirely new technologies starting to work out, which could turn into products (relying on HUGE computing power) that, while not necessarily available yet, could sell in huge volumes.

A good example is the vision recognition systems of quite amazing performance that have been talked about recently.

E.g. you show it a picture with a cat in it, and the robot realises it is a cat.

My understanding is that stuff like that needs HUGE computing power.
One potentially economical and power-efficient way of achieving it is with many-core ARM CPU devices, just as you have described.

There are robots that can look around and decide the path they want to take, purely from the image(s) they see.

Things like self-driving cars would probably not be possible without being able to give the car, reasonably economically, CPUs with enough processing power.
(Although other techniques, such as FPGAs/ASICs and the like, could probably also do it.)
I've not seen details of Google's self-driving car hardware, but I seriously doubt it relies on a single-core, 4-bit, 100 kHz CPU with 24 clock cycles per instruction, 7 bytes of RAM and a massive 512 bytes of ROM/flash.
 