Why do modern processors have so many transistors?

Witling

Golden Member
Jul 30, 2003
1,448
0
0
Clearly a great deal of effort has been made to put more transistors onto silicon. But I don't understand why. Instructions can be presented to a processor only so quickly. Each transistor makes a very basic type of decision. So why are more transistors desirable? Are the extra transistors just there to hold information, e.g., the results of a previous calculation?
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
If I understand correctly, the 1 instruction = 1 operation relationship was pre-MMX. In today's world, you can't survive the speed demand without parallelism, so the more transistors you have, the more you can do at the same time.
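To make the idea concrete, here's a toy SWAR ("SIMD within a register") sketch in Python -- a software imitation of the packed arithmetic that MMX-style instructions perform in hardware, where one operation adds four byte lanes at once (the function name and lane layout are illustrative):

```python
def packed_add_u8(x, y):
    """Add four unsigned bytes packed into one 32-bit word.

    SWAR trick: mask off the high bit of each byte lane so carries
    can't ripple into the neighboring lane, then fold the high bits
    back in. One 'operation' = four independent 8-bit adds, each
    wrapping mod 256 -- a sketch of what MMX-type hardware does
    with dedicated parallel ALUs."""
    H = 0x80808080            # high bit of every byte lane
    L = 0x7F7F7F7F            # low 7 bits of every byte lane
    s = (x & L) + (y & L)     # add the low 7 bits of each lane
    return s ^ ((x ^ y) & H)  # restore high bits; no cross-lane carry
```

For example, `packed_add_u8(0x01020304, 0x05060708)` performs the four additions 1+5, 2+6, 3+7, 4+8 in one go, giving 0x06080A0C.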
 

itachi

Senior member
Aug 17, 2004
390
0
0
pre-pentium pro.. it was before then that processors executed a single op per instruction. with the pentium pro, instructions got decoded into micro-ops.. making the core a risc architecture internally. today, mmx-type instructions are still 1 op per instruction.

new architecture, expanded logic, and larger caches contribute to a larger number of transistors. as the size of the transistor shrinks, more can be fitted into the limited area. expanding logic can increase the number of transistors required for a certain function significantly.. but it allows the function to be performed at a much greater speed. for example, ripple carry adders.. when you add A and B, each pair of bits is combined to get the carry out and sum (which is dependent on the carry in).. if the bit-width of A and B is small, this method is sufficient.. however, if it's large.. it takes a long chain of gate delays before the 32nd bit can be computed. carry propagate-generate adders expand the carry-ins and carry-outs.. making it so each carry can be computed directly from the inputs instead of rippling through all the previous bits. this increases the number of transistors required, but also allows the addition to be performed at a significantly higher speed.
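the adder comparison above can be sketched in Python as a toy gate-level model (bit lists are LSB-first; this illustrates the logic, not real hardware):

```python
def ripple_carry_add(a, b):
    """Ripple-carry adder over LSB-first bit lists: each carry
    depends on the previous one, so the critical path (delay)
    grows linearly with the bit width."""
    carry, out = 0, []
    for ai, bi in zip(a, b):
        out.append(ai ^ bi ^ carry)              # sum bit
        carry = (ai & bi) | (carry & (ai ^ bi))  # carry out of this bit
    return out, carry

def lookahead_carries(a, b):
    """Carry-lookahead: compute generate (g = a&b) and propagate
    (p = a^b) for every bit position up front. In hardware the
    recurrence c[i+1] = g[i] | (p[i] & c[i]) is expanded into wide
    gates so every carry is a direct function of the inputs --
    far more transistors, but the carries stop waiting on each
    other."""
    g = [ai & bi for ai, bi in zip(a, b)]
    p = [ai ^ bi for ai, bi in zip(a, b)]
    c = [0]                                      # carry-in
    for i in range(len(a)):
        c.append(g[i] | (p[i] & c[i]))
    return c                                     # sum bit i = p[i] ^ c[i]
```

e.g. adding 5 (`[1,0,1,0]`) and 3 (`[1,1,0,0]`) ripples through four carry stages to produce 8 (`[0,0,0,1]`), while the lookahead version exposes all the carries at once.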

i'm tired.. and i got school tomorrow, so i'll finish this up later.
 

Calin

Diamond Member
Apr 9, 2001
3,112
0
0
The increase in transistors is a result of the increase in "basic" operational units. Unlike a 486, which had one floating point unit, the Athlon has no fewer than three floating point "units", each of them more or less capable. Also, presenting them as just one faster floating point unit (so that programs written under the assumption there is a single FPU still run correctly) consumes more binary logic, so more transistors.
The prefetch of instructions (introduced with the 486) uses algorithms to improve performance on conditional jumps (so that the processor takes the correct path). This also consumes transistors.
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Here is a labelled p4 die layout - you can see where transistors are going. Note that the actual "Integer Execution Core" is relatively small - a LOT of transistors go into finding ways to cram instructions into the execution units quickly.

1MB of cache takes ~50 million transistors (6 per SRAM bit), plus hundreds of thousands or maybe a million more for the "tags", decoders, sense amps, etc.
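A quick back-of-the-envelope in Python, assuming the standard 6-transistor SRAM cell per bit:

```python
def sram_transistors(cache_bytes, per_cell=6):
    """Data-array transistor count for a cache built from standard
    6T SRAM cells (six transistors per bit); tags, decoders and
    sense amps come on top of this."""
    return cache_bytes * 8 * per_cell

print(sram_transistors(1024 * 1024))  # 1MB -> 50,331,648, i.e. ~50 million
```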
 

cquark

Golden Member
Apr 4, 2004
1,741
0
0
Originally posted by: CTho9305
Here is a labelled p4 die layout - you can see where transistors are going. Note that the actual "Integer Execution Core" is relatively small - a LOT of transistors go into finding ways to cram instructions into the execution units quickly.

Yes, while cache occupies a huge amount of the total transistors, the biggest use of transistors is in attempting to extract more parallelism from the code. Branch prediction, schedulers, cache and memory access optimizations, and so forth are costing more and more transistors with each generation of microprocessors, yet the improvements are only single digit percentages.

That's the main reason Intel and AMD are moving to dual core chips. They think having two processors will give you a greater performance improvement than more complex logic on a single processor, and they're probably right. See The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software for a look at why we're moving to dual core chips and what that will mean for software development.
 

itachi

Senior member
Aug 17, 2004
390
0
0
Originally posted by: Calin
The increase in transistors is a result of the increase in "basic" operational units. Unlike a 486, which had one floating point unit, the Athlon has no fewer than three floating point "units", each of them more or less capable. Also, presenting them as just one faster floating point unit (so that programs written under the assumption there is a single FPU still run correctly) consumes more binary logic, so more transistors.
define faster.. and exactly how is it that a faster fpu would yield a larger number of transistors? combining all the units into one would yield fewer transistors.. not more. to have it perform at the same level would be next to impossible. you can't get a CPI less than 1 without either parallelism or pipelining.. the fpu is already pipelined, and if you extend the pipeline any deeper you may find yourself in a situation similar to intel's.. and parallelism is what's achieved by the separate fpu entities. so the only real option is to increase the clock rate significantly.
also, saying that the athlon has 3 fpus gives the wrong impression.. in completing all ops, the 3 units are interdependent.. what one unit lacks, another has.. the mmx alu is the only unit that's duplicated, in the adder and the multiplier.
The prefetch of instructions (introduced with the 486) uses algorithms to improve performance on conditional jumps (so that the processor takes the correct path). This also consumes transistors.
prefetching was introduced to improve cache performance.. more specifically, to reduce cache-miss penalties. the part about improving the performance of conditional jumps is the reason we have the branch prediction unit. if the prediction is 'taken', the instructions at the target address are prefetched, assuming they aren't already in the branch target buffer. if the prediction is 'not taken', the next sequential block is prefetched.
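One classic branch prediction scheme is the 2-bit saturating counter, sketched here as a toy model for a single branch (real branch units keep tables of these, plus target buffers and much more):

```python
class TwoBitPredictor:
    """Toy 2-bit saturating-counter branch predictor.
    States 0-1 predict 'not taken', states 2-3 predict 'taken'."""

    def __init__(self):
        self.state = 0  # start in 'strongly not taken'

    def predict(self):
        return self.state >= 2           # True = predict taken

    def update(self, taken):
        # Saturate at the ends: the counter never leaves [0, 3].
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)
```

The two-bit hysteresis is the point: a loop branch taken many times in a row survives the single "not taken" at loop exit without flipping the prediction, so the next loop entry still predicts correctly.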
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
combining all the units into one would yield fewer transistors.. not more.
I think what Calin meant was that it costs transistors to make many parallel functional units act as if they were not parallel - a large number of transistors are spent making the out-of-order execution that happens in modern processors undetectable to software. As far as software is concerned, programs are executed one instruction at a time, strictly in order. The hardware, to maximize performance, executes instructions out of order, but has to preserve the illusion of in-order execution.
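That illusion can be sketched with a toy reorder buffer (the names and structure here are illustrative, not a real design):

```python
from collections import deque

class ReorderBuffer:
    """Toy reorder buffer: instructions may *complete* in any
    order, but they *retire* (become visible to software)
    strictly in program order."""

    def __init__(self):
        self.entries = deque()           # held in program order

    def issue(self, tag):
        self.entries.append([tag, False])

    def complete(self, tag):             # out-of-order completion
        for e in self.entries:
            if e[0] == tag:
                e[1] = True

    def retire(self):                    # in-order retirement
        done = []
        while self.entries and self.entries[0][1]:
            done.append(self.entries.popleft()[0])
        return done
```

Even if instruction B finishes before A, `retire()` holds B back until A is done, so software only ever observes A, B, C in program order.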
 

kcthomas

Senior member
Aug 23, 2004
335
0
0
also, as processes scale down, so do the wires that connect everything. smaller wires mean higher resistance, so you need more repeaters to keep the signal strong (a repeater is 2 inverters back to back, outputting a fresh strong signal). i think current technology has somewhere around a million repeaters (a repeater is usually made from 4 transistors).
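A rough Python model of why repeaters pay off: an unrepeated RC wire's delay grows with the square of its length, while breaking it into repeater-driven segments makes the delay roughly linear (all numbers here are illustrative, not real process parameters):

```python
def wire_delay(length, r_per_mm, c_per_mm):
    """Distributed RC wire (Elmore model): resistance and
    capacitance both grow with length, so delay grows with
    length squared."""
    return 0.5 * (r_per_mm * length) * (c_per_mm * length)

def repeated_wire_delay(length, r_per_mm, c_per_mm, n, t_rep):
    """Split the wire into n segments, each driven by a repeater
    (two back-to-back inverters, ~4 transistors): total delay is
    n short-segment delays plus n repeater delays -- roughly
    linear in length, paid for in transistors."""
    seg = length / n
    return n * (wire_delay(seg, r_per_mm, c_per_mm) + t_rep)
```

With unit R and C per mm, a 10mm wire comes out at 50 delay units unrepeated, versus 12.5 with five repeaters at 0.5 units each.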
 


cquark

Golden Member
Apr 4, 2004
1,741
0
0
Ars Technica has an article on IBM's Cell processor, which has a good presentation of how much die space control logic uses these days and how the Cell design puts that die space to other uses (eight special-purpose DSPs and an interesting type of local memory).