Multicore and superscalar architecture

Ruptga

Lifer
Aug 3, 2006
Back in the day everyone got really excited about the Pentium because it was the first mainstream CPU with a superscalar architecture. Wikipedia's definition: "A superscalar architecture fetches, executes, and returns results from more than one (standard) instruction during a single pipeline stage (typically this means a single clock cycle)."
Multicore has the same basic idea: executing multiple things at the same time, i.e. parallelism.

So, for the old timers, is the adoption of multicore similar to superscalar's? The main difference I see is that superscalar uses several different units with different jobs, but multicore (so far) uses several different units that are exactly the same. Is that enough of a difference that the way they're developed/adopted will be too different to compare?

If you think they'll work out basically the same, how long do you think it will be before single-core is a joke like the 486s are now?
 

BrownTown

Diamond Member
Dec 1, 2005
It's all just a matter of where the parallelism is being extracted from. Data-level parallelism (DLP) is exploited by vector processing units, which in an x86 CPU means the SIMD units. Instruction-level parallelism (ILP) is exploited by a superscalar architecture, along with all the tricks associated with extracting more ILP, like loop unrolling, reorder buffers, branch prediction, etc. Multicore exploits thread-level parallelism (TLP), where you have multiple instruction streams from different programs or from different parts of the same program.

Basically, what you can do boils down to how many transistors you have. With a few little tricks you can increase IPC with only a relatively small addition of transistors, but after a while there is only so much ILP that can be extracted from most programs, so you reach a point of diminishing returns. At that point it becomes more advantageous to exploit TLP by adding cores. This certainly isn't a novel idea by any means; it just requires a large enough transistor budget.

However, as we all know, this only works for applications where the program can easily be divided into multiple threads. And right now, even a lot of workloads that might be amenable to multiple threads are still single-threaded, because multicore chips are still relatively new in the consumer market, especially for the average user.
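
To make the "easily divided into multiple threads" point concrete, here is a rough Python sketch, not anything from a real codebase. The names `chunked_sum` and `running_total` are made up for illustration. The first function splits cleanly into independent chunks, which is the kind of work TLP rewards. The second, as written, needs the previous iteration's result at every step, so you can't just chop the loop into pieces without restructuring it. One caveat: in standard CPython the global interpreter lock generally keeps threads from running Python bytecode on several cores at once, so this shows the shape of the decomposition rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked_sum(data, workers=4):
    """Sum `data` by handing independent chunks to a pool of threads."""
    if not data:
        return 0
    size = -(-len(data) // workers)  # ceiling division: items per chunk
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is summed without needing anything from the others.
        partials = list(pool.map(sum, chunks))
    # Combining the partial results is the only serial step.
    return sum(partials)


def running_total(data):
    """Naive running total: every step uses the previous step's result."""
    out, total = [], 0
    for x in data:
        total += x  # loop-carried dependency on `total`
        out.append(total)
    return out
```

Usage would look like `chunked_sum(list(range(1000)))`. The design choice worth noticing is that the threaded version only works because the chunks share no intermediate state.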