Originally posted by: interchange
A 2.2 GHz processor won't handle 2.2 billion instructions in one second, as each clock does not handle the processing of a whole instruction. I could explain pipelining, out-of-order execution, superscalar design, etc. to you, but suffice it to say that it's very complicated to determine exactly how many clocks it typically takes to execute one instruction. Or, in the case of modern superscalar CPUs, how many instructions can execute per clock 🙂.
Pipelining allows the completion of an instruction each clock pulse.
A particular instruction will of course take a minimum number of clock cycles to complete, equal to the number of pipeline stages, but don't forget, there will be another instruction right behind it.
Let's look at an example:
On a simple 4-stage pipeline, a single instruction will take a minimum of 4 cycles/iterations to complete.
Stage 1: I1
Stage 2:
Stage 3:
Stage 4:
I1 represents an instruction in the first pipeline stage, at the first cycle.
Looked at from the perspective of a single instruction, completion takes 4 cycles.
So what if we introduce a second instruction?
Will two instructions take 2 X 4 cycles to complete?
No.
Stage 1: I2
Stage 2: I1
Stage 3:
Stage 4:
Instruction 1 will be complete in four cycles, but because instruction 2 is trailing I1 by one clock cycle/iteration, it will complete one cycle later.
So in five cycles, two instructions will be complete.
In both of the examples above, the pipeline has only been partially full, so the completion rate has been less than one instruction per cycle.
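To put numbers on that, here's a rough Python sketch. It assumes an ideal pipeline (no stalls or hazards, and one new instruction entering every cycle), and the function names are made up purely for illustration:

```python
def total_cycles(num_instructions, num_stages):
    """Cycles to finish all instructions on an ideal, stall-free pipeline."""
    if num_instructions == 0:
        return 0
    # The first instruction needs num_stages cycles; each later one
    # finishes one cycle after its predecessor.
    return num_stages + (num_instructions - 1)


def instructions_per_cycle(num_instructions, num_stages):
    """Average completion rate over the whole run."""
    return num_instructions / total_cycles(num_instructions, num_stages)


# 4-stage pipeline, as in the examples above.
for n in (1, 2, 6, 1000):
    print(n, total_cycles(n, 4), round(instructions_per_cycle(n, 4), 3))
```

One instruction gives 4 cycles and two give 5, matching the examples. As the instruction count grows, the average creeps toward one instruction per cycle without quite reaching it, because the first few cycles are spent filling the pipeline.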
Now, if the pipeline is full, we get this:
         C1  C2  C3  C4  C5  C6
Stage 1: I1  I2  I3  I4  I5  I6
Stage 2: --  I1  I2  I3  I4  I5
Stage 3: --  --  I1  I2  I3  I4
Stage 4: --  --  --  I1  I2  I3
Once we reach Cycle 4 (C4), we're getting an instruction completed each clock cycle.
Completion rate at this point is therefore one instruction per cycle.
This is pipelining.
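If you want to play with it, the full-pipeline case can be reproduced with a small simulation. This is only a sketch under the same idealised assumptions (every instruction advances exactly one stage per cycle, no stalls), and `pipeline_table` and `completed_per_cycle` are hypothetical helper names:

```python
def pipeline_table(num_instructions, num_stages, num_cycles):
    """Return rows[stage][cycle] holding 'I<n>' or '--' for an ideal pipeline."""
    rows = []
    for stage in range(num_stages):
        row = []
        for cycle in range(num_cycles):
            # Instruction i (0-based) enters stage 0 at cycle i and moves
            # one stage per cycle, so stage s at cycle c holds i = c - s.
            i = cycle - stage
            row.append(f"I{i + 1}" if 0 <= i < num_instructions else "--")
        rows.append(row)
    return rows


def completed_per_cycle(num_instructions, num_stages, num_cycles):
    """How many instructions occupy the last stage in each cycle (0 or 1)."""
    last = pipeline_table(num_instructions, num_stages, num_cycles)[-1]
    return [0 if cell == "--" else 1 for cell in last]


table = pipeline_table(6, 4, 6)
print("         " + "  ".join(f"C{c + 1}" for c in range(6)))
for s, row in enumerate(table, start=1):
    print(f"Stage {s}: " + "  ".join(row))
print(completed_per_cycle(6, 4, 6))
```

The last line prints a list with one entry per cycle: zeros while the pipeline is filling, then a one for every cycle from C4 onward.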