Why does pipelining allow for higher frequencies?

eLiu · Jun 5, 2012

I've often heard this statement made, but I honestly have no idea as to why it's true. Is it b/c with a deeper pipeline, each stage will be responsible for smaller chunks of work = simpler stages = run faster? I'm just guessing here but I'd love to know the real answer.

Raghu · Jun 6, 2012

You are correct.

esun · Jun 6, 2012

It takes time for a logic signal to propagate. Say for instance you have a sequence of inverters (NOT gates) and a single input. There is some delay associated with each gate, call it T_inv.

So if you have N inverters, after changing the input bit, it will take N * T_inv time for the output to change. If this is a clocked system and you need to correctly determine the output between changes of the input, you have to wait N * T_inv seconds, meaning your clock frequency is limited to 1 / (N * T_inv).

Now let's say you split the chain so there are N/2 inverters, followed by a register, followed by another N/2 inverters. Now, after the input changes, the register in the middle only has to wait half the time (N/2 * T_inv) for the signal in the middle to be correct so it can be sampled. Similarly, the second set of N/2 inverters only has to wait half the time from when the middle register last changed its value.

Of course this means that the output value is always delayed by 1 clock cycle from what it previously was (i.e., we've added 1 clock cycle of latency). However, we get to run the clock twice as fast. There is a direct trade-off here. There is a limit to how much you can pipeline, though: each register adds a fixed delay of its own (setup and clock-to-output time), so past some point adding more pipeline stages doesn't improve performance any more.
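To put rough numbers on the inverter-chain argument, here is a minimal Python sketch. The delay values `T_INV` and `T_REG` are made-up illustrative figures, not measurements, and the per-register overhead `T_REG` (setup plus clock-to-output time) is an assumption added here to show why the speedup eventually flattens out.

```python
import math

T_INV = 20e-12  # assumed delay of one inverter, in seconds (illustrative only)
T_REG = 50e-12  # assumed fixed overhead per pipeline register, in seconds (illustrative only)

def max_frequency(n_gates, stages, t_gate=T_INV, t_reg=0.0):
    """Highest clock frequency for n_gates split as evenly as possible into `stages` stages.

    The slowest stage sets the clock period: its gate delays plus the
    register overhead t_reg (0.0 reproduces the idealized 1 / (N * T_inv) case).
    """
    gates_per_stage = math.ceil(n_gates / stages)
    return 1.0 / (gates_per_stage * t_gate + t_reg)

if __name__ == "__main__":
    n = 32
    for stages in (1, 2, 4, 8, 16, 32):
        ideal = max_frequency(n, stages)
        real = max_frequency(n, stages, t_reg=T_REG)
        print(f"{stages:2d} stages: ideal {ideal / 1e9:6.2f} GHz, "
              f"with register overhead {real / 1e9:6.2f} GHz")
```

In the idealized column, doubling the number of stages doubles the frequency. With the assumed register overhead, the frequency can never exceed 1 / T_REG no matter how finely the logic is split.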

eLiu · Jun 7, 2012

Cool, thanks! That makes sense.

Born2bwire · Jun 7, 2012

Dammit, for a second I thought that eLiu had split personality and was asking and answering between himselves.

WhoBeDaPlaya · Jun 7, 2012

There's another trade-off - stalls. If your branch prediction isn't up to par, you'll have to spend a couple of clock cycles doing diddly squat (no-ops).

The more stages a pipeline has, the more cycles stalls cost.
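A back-of-the-envelope sketch of that cost in Python. The model is a deliberate simplification and an assumption, not how any particular CPU behaves: every mispredicted branch is taken to waste `depth - 1` cycles, and the branch fraction and mispredict rate are illustrative numbers.

```python
def effective_cpi(depth, branch_fraction, mispredict_rate, base_cpi=1.0):
    """Average cycles per instruction under a simple flush model.

    Assumption: each mispredicted branch throws away (depth - 1) cycles
    of work that was already in the pipeline.
    """
    penalty = depth - 1
    return base_cpi + branch_fraction * mispredict_rate * penalty

def instructions_per_second(freq_hz, depth, branch_fraction=0.2, mispredict_rate=0.05):
    """Useful throughput: clock frequency divided by the effective CPI."""
    return freq_hz / effective_cpi(depth, branch_fraction, mispredict_rate)

if __name__ == "__main__":
    for depth in (5, 10, 20, 30):
        cpi = effective_cpi(depth, branch_fraction=0.2, mispredict_rate=0.05)
        print(f"depth {depth:2d}: effective CPI {cpi:.2f}")
```

Under these assumptions the CPI grows linearly with depth, so a deeper pipeline has to win back that loss through its higher clock frequency before it does more useful work per second.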

eLiu · Jun 7, 2012

WhoBeDaPlaya said:
There's another trade-off - stalls. If your branch prediction isn't up to par, you'll have to spend a couple of clock cycles doing diddly squat (no-ops).

The more stages a pipeline has, the more cycles stalls cost.

Yeah, I'm aware. When I said "= run faster", I meant just in terms of frequency, not in terms of ability to do useful work (Hi pentium 4, how are you? lol). More of a purely electronics question I guess.

WhoBeDaPlaya · Jun 7, 2012

eLiu said:
Yeah, I'm aware. When I said "= run faster", I meant just in terms of frequency, not in terms of ability to do useful work (Hi pentium 4, how are you? lol). More of a purely electronics question I guess.

Ah gotcha. Then yeah, basically you're limited by the immutable laws of EM.
It gets wackier at advanced (~ <= 40nm) nodes, where the corners you sign off a design on get flipped on their heads.
