with proper cooling, why can't chips be overclocked forever?

draggoon01 · Sep 21, 2003

Suppose you were able to keep the temperature of a chip constant with fancy cooling. Then would you be able to clock it infinitely high? What's the limiting factor?

Colt45 · Sep 21, 2003

capacitance i guess?

/tired

PrincessGuard · Sep 21, 2003

Originally posted by: Colt45
capacitance i guess?

/tired

Transistors have parasitic capacitances, which limit the speed at which logic circuits can switch between states.

I'm sure there's other reasons too.

silverpig · Sep 21, 2003

The speed of the electrons going through the chip is finite.

Furthermore, the smallest meaningful physical time (the Planck time) is ~10^-43 seconds (granted, insanely small, but finite).

Evadman · Sep 21, 2003

If the speed was upped too far beyond the limits of the chip, electromigration would accelerate, and the chip would kill itself. I do not know the specifics on the rate at which this happens, but it happens on all chips, overclocked or not. Overclocking just accelerates it.

calbear2000 · Sep 21, 2003

Besides the physical reasons people are stating, the chip will fail much much sooner due to timing issues rather than electromigration, self-heat, noisy Vil/Vih, etc etc.

Setup timings would break easily. Internal flop to flop timings would be hosed... Memory writes wouldn't be read across the FSB with flight times exceeding clock periods... any timing speedpaths within the circuitry would be exposed.

It is more of a circuit design issue limiting clock speeds than any physical parameters.

CTho9305 · Sep 22, 2003

Look at this. That should explain it pretty well... since there is a limited amount of current through the transistors, that limits the charge time of the load capacitances. No matter how cold you get, you can't get around that.

calbear2000 · Sep 22, 2003

Originally posted by: CTho9305
Look at this. That should explain it pretty well... since there is a limited amount of current through the transistors, that limits the charge time of the load capacitances. No matter how cold you get, you can't get around that.

But given current architecture and bus protocol in today's cpu's, speedpaths will break the chip well before any physical limitations.

0roo0roo · Sep 22, 2003

http://www.eas.asu.edu/~schroder/Electromigration.pdf
http://www.cadence.com/whitepapers/electromigration.html

sgtroyer · Sep 22, 2003

A few points:

The speed of any digital system is limited by the current the transistor can supply and the capacitance being charged or discharged. More current or less capacitance = faster operation. The reason colder chips will run faster is that a transistor produces more current at lower temperatures. This is because electron (and hole) mobility increases with decreasing temperature. So, if this were the only factor, then draggoon01's hypothesis would be correct, and there would be no upper limit to the chip's clock frequency. However, there are some complicating factors.

1. Beyond current and capacitance, there is another factor that is increasingly important on modern chips: wiring resistance. With a long enough wire, there is an appreciable resistance, and the switching time can only go so fast. Even with an unlimited amount of current from the transistor, the resistance limits the speed. This doesn't improve with temperature.

2. Below some temperature, mobility actually starts to increase again, so that also caps the current. I'm not sure what that temperature is, and I think it depends on factors like doping concentrations.

3. Regardless of how fast the logic on the chip could operate, there might be an upper limit to the clock generation. Processors, I think, generate their clocks on board (synced up to an external clock.) Depending on how this is done, it will have an upper limit frequency that doesn't scale with temperature.
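To put rough numbers on the current/capacitance point and the wiring-resistance point above, here is a minimal sketch. It assumes a constant drive current (so the charge time is simply C*V/I) and a single lumped RC for the wire. All component values are made up for illustration and do not come from any real process.

```python
import math


def gate_delay(c_load, v_swing, i_drive):
    """Time for a constant current i_drive to move c_load through v_swing: t = C*V/I."""
    return c_load * v_swing / i_drive


def wire_delay(r_wire, c_load):
    """Time for a lumped RC node to reach 50% of its final value: t = ln(2)*R*C."""
    return math.log(2) * r_wire * c_load


# Hypothetical values, chosen only for illustration.
C_LOAD = 10e-15   # 10 fF load capacitance
V_SWING = 1.5     # 1.5 V logic swing
R_WIRE = 500.0    # 500 ohm wire resistance

# The last drive current is absurdly large on purpose, to mimic "unlimited" current.
for i_drive in (0.5e-3, 1e-3, 1e9):
    t_gate = gate_delay(C_LOAD, V_SWING, i_drive)
    t_wire = wire_delay(R_WIRE, C_LOAD)
    print(f"I = {i_drive:.1e} A: gate {t_gate * 1e12:.3f} ps, wire floor {t_wire * 1e12:.3f} ps")
```

With these made-up values, doubling the current halves the gate term, but the wire term stays at roughly 3.5 ps no matter how much current the transistor can source.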

Regarding electromigration. Electromigration really only becomes a problem at high temperatures: it scales with the exponential of temperature. So I'm going to guess that wouldn't be a problem if your chip were sufficiently cold.
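As a hedged illustration of that exponential dependence, the sketch below uses only the exp(Ea/kT) term of Black's equation for electromigration lifetime, with current density held equal. The activation energy of 0.7 eV is an assumed, illustrative value, not a figure from this thread or from any specific process.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def em_lifetime_ratio(t_cold_c, t_hot_c, ea_ev=0.7):
    """Ratio MTTF(cold) / MTTF(hot) from the exp(Ea/kT) term of Black's equation.

    Assumes the same current density at both temperatures.
    ea_ev is an assumed activation energy, not a measured one.
    """
    t_cold_k = t_cold_c + 273.15
    t_hot_k = t_hot_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_cold_k - 1.0 / t_hot_k))


for cold_c in (50.0, 0.0, -50.0):
    ratio = em_lifetime_ratio(cold_c, 100.0)
    print(f"{cold_c:6.1f} C vs 100 C: lifetime ratio ~ {ratio:.3g}")
```

Under these assumptions the predicted lifetime grows by orders of magnitude as the chip gets colder, which is consistent with the guess that electromigration stops being the limiting concern on a sufficiently cold chip.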

CTho9305 · Sep 22, 2003

Originally posted by: calbear2000

Originally posted by: CTho9305
Look at this. That should explain it pretty well... since there is a limited amount of current through the transistors, that limits the charge time of the load capacitances. No matter how cold you get, you can't get around that.

But given current architecture and bus protocol in today's cpu's, speedpaths will break the chip well before any physical limitations.

huh? all critical paths are CAUSED by the physical limitations. If all the transistors could drive infinite current, you could clock as fast as you want, until you hit speed-of-light issues. Internal timing and setup/hold times seemed to be fine for Intel as they took the P3 from 500MHz to 1GHz, and AMD as they took the tbird from 700MHz to 1.4GHz without major revisions. As long as you speed up everything inside the chip by about the same amount (or clock it to the new slowest thing... as you improve further, you'll still be increasing the clock speed), then it should still work. Manufacturing process improvements are the same as a global speedup (not going .18 to .13, but variations within a given size).

draggoon01 isn't (or doesn't seem to be) asking about the whole motherboard... but the same reason applies for why you can't overclock the motherboard endlessly (plus some speed-of-light issues at sizes that big... light goes 1 foot in 1 billionth of a second). I don't know what you're saying about writing to RAM... CPUs already are faster than the FSB, hence the need for a divider.... problem solved. If you built a 50GHz P4 (ignoring all other issues with this), it would be starved for instructions and data, but the fact that it is so much faster than the motherboard doesn't mean they can't reliably exchange data.
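The "1 foot per billionth of a second" figure is easy to check. This small sketch computes how far light travels in vacuum during one clock period. Real signals on a board or die travel slower than this, so treat the numbers as an upper bound.

```python
C_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum
METERS_PER_FOOT = 0.3048


def light_travel_per_cycle_m(freq_hz):
    """Distance light covers in vacuum during one clock period, in meters."""
    return C_LIGHT_M_PER_S / freq_hz


for freq_hz in (1e9, 3e9, 50e9):
    d_m = light_travel_per_cycle_m(freq_hz)
    print(f"{freq_hz / 1e9:5.1f} GHz: {d_m * 100:7.2f} cm ({d_m / METERS_PER_FOOT:.3f} ft) per cycle")
```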

pm · Sep 22, 2003

Below some temperature, mobility actually starts to increase again, so that also caps the current. I'm not sure what that temperature is, and I think it depends on factors like doping concentrations.

That must be very low then. Well below LN temperatures. I hadn't heard about this effect.

sgtroyer · Sep 22, 2003

What I should've said is mobility starts to decrease again, but I assume you figured this out from the context. I don't claim to be any expert in device physics, but that's what it said in my book. "Solid State Electronic Devices", Streetman and Banerjee, 5th ed. pg. 97-98

I'll try to summarize. At high temperatures, mobility is dominated by lattice or phonon scattering. This increases with increasing temperature. At low temperatures, impurity scattering dominates. This is "scattering from defects such as ionized impurities." As temperature decreases, the thermal motion of the carriers becomes slower, and the effect of a collision with an impurity is greater.

Do I understand it? Not totally. Is it quantified? Not at all. But that's what the book says. I'll try to see if I can find further information.
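For what it's worth, the two scattering regimes can be sketched numerically. The model below assumes the textbook approximations that lattice-limited mobility goes as T^-3/2 and impurity-limited mobility as T^3/2, combined by adding their reciprocals (Matthiessen's rule). The coefficients are arbitrary and the units are meaningless. Only the shape of the curve matters here.

```python
def mobility(temp_k, a_lattice=1.0e7, b_impurity=1.0):
    """Combined mobility (arbitrary units) from lattice and impurity scattering.

    a_lattice and b_impurity are arbitrary illustrative coefficients,
    not values for any real material or doping level.
    """
    mu_lattice = a_lattice * temp_k ** -1.5    # falls as temperature rises
    mu_impurity = b_impurity * temp_k ** 1.5   # falls as temperature drops
    return 1.0 / (1.0 / mu_lattice + 1.0 / mu_impurity)


for temp_k in (50, 100, 150, 215, 300, 400):
    print(f"T = {temp_k:3d} K: mobility = {mobility(temp_k):8.1f} (arbitrary units)")
```

With these arbitrary coefficients the mobility peaks a little above 200 K and falls off on both sides. Where the real peak sits would depend on doping, as noted above.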

pm · Sep 22, 2003

Originally posted by: sgtroyer
What I should've said is mobility starts to decrease again, but I assume you figured this out from the context. At low temperatures, impurity scattering dominates. This is "scattering from defects such as ionized impurities." As temperature decreases, the thermal motion of the carriers becomes slower, and the effect of a collision with an impurity is greater.

Very interesting. Thanks for explaining it. The explanation makes sense.

calbear2000 · Sep 24, 2003

Originally posted by: CTho9305

Originally posted by: calbear2000

Originally posted by: CTho9305
Look at this. That should explain it pretty well... since there is a limited amount of current through the transistors, that limits the charge time of the load capacitances. No matter how cold you get, you can't get around that.

But given current architecture and bus protocol in today's cpu's, speedpaths will break the chip well before any physical limitations.

huh? all critical paths are CAUSED by the physical limitations. If all the transistors could drive infinite current, you could clock as fast as you want, until you hit speed-of-light issues. Internal timing and setup/hold times seemed to be fine for Intel as they took the P3 from 500MHz to 1GHz, and AMD as they took the tbird from 700MHz to 1.4GHz without major revisions. As long as you speed up everything inside the chip by about the same amount (or clock it to the new slowest thing... as you improve further, you'll still be increasing the clock speed), then it should still work. Manufacturing process improvements are the same as a global speedup (not going .18 to .13, but variations within a given size).

Perhaps I should explain my background as a circuit designer at Intel... there is a realistic limitation on the clock speeds of our chip due to speedpaths and setup times breaking well before we hit any kind of physical barriers such as excessive capacitance, electromigration, speed of light, etc etc.

Think of a simple circuit of 2 flops separated by some logic of ~200ps, and clocked by the same clock. Assume hold of the 2nd flop is met. If your clock is running at 3GHz, you basically have a margin of 333ps - 200ps - setup_time_of_2nd_flop. Increase your clock speed, then you'll decrease the period and the setup window. Eventually you'll break.
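The margin arithmetic above can be sketched as follows. The 200 ps logic delay comes from the example. The 50 ps setup requirement is an assumed placeholder, since no value is given, and clock-to-output delay and skew are ignored, as in the example.

```python
def setup_slack_ps(freq_ghz, logic_delay_ps=200.0, setup_ps=50.0):
    """Setup margin for a flop-to-flop path: clock period - logic delay - setup requirement.

    setup_ps is a hypothetical value used only for illustration.
    """
    period_ps = 1000.0 / freq_ghz
    return period_ps - logic_delay_ps - setup_ps


for freq_ghz in (3.0, 3.5, 4.0, 4.5):
    slack = setup_slack_ps(freq_ghz)
    status = "ok" if slack >= 0 else "FAILS setup"
    print(f"{freq_ghz:.1f} GHz: slack = {slack:6.1f} ps ({status})")
```

With the assumed 50 ps setup requirement, the margin runs out at exactly 4 GHz (250 ps period) and the path fails beyond that.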

CTho9305 · Sep 24, 2003

Originally posted by: calbear2000

Originally posted by: CTho9305

Originally posted by: calbear2000

Originally posted by: CTho9305
Look at this. That should explain it pretty well... since there is a limited amount of current through the transistors, that limits the charge time of the load capacitances. No matter how cold you get, you can't get around that.

But given current architecture and bus protocol in today's cpu's, speedpaths will break the chip well before any physical limitations.

huh? all critical paths are CAUSED by the physical limitations. If all the transistors could drive infinite current, you could clock as fast as you want, until you hit speed-of-light issues. Internal timing and setup/hold times seemed to be fine for Intel as they took the P3 from 500MHz to 1GHz, and AMD as they took the tbird from 700MHz to 1.4GHz without major revisions. As long as you speed up everything inside the chip by about the same amount (or clock it to the new slowest thing... as you improve further, you'll still be increasing the clock speed), then it should still work. Manufacturing process improvements are the same as a global speedup (not going .18 to .13, but variations within a given size).

Perhaps I should explain my background as a circuit designer at Intel... there is a realistic limitation on the clock speeds of our chip due to speedpaths and setup times breaking well before we hit any kind of physical barriers such as excessive capacitance, electromigration, speed of light, etc etc.

Think of a simple circuit of 2 flops separated by some logic of ~200ps, and clocked by the same clock. Assume hold of the 2nd flop is met. If your clock is running at 3GHz, you basically have a margin of 333ps - 200ps - setup_time_of_2nd_flop. Increase your clock speed, then you'll decrease the period and the setup window. Eventually you'll break.

The logic only takes 200ps because of the internal capacitances and resistances, no?

How come you don't have a "* Not speaking for Intel Corp.*" in your sig like the other 3 Intel people here who do?

KY2000 · Sep 24, 2003

Huh?

Err... what I've read here is interesting, but you be the judge.

If you notice, there is usually a limit to what you can overclock, and that is usually upper-bounded by the top-of-the-line CPU from the batch of CPUs that launched together, plus maybe 10%-20% over the best in class.

Simply put, the reason you can overclock is that testing is usually done at normal and high temperature. When a part fails at a higher frequency, it is down-binned to a lower clock speed and tested again. I.e., a single design, e.g. the current Athlon XP 3200+ down to 2800+, is just the same silicon design, with the higher-clocked parts being the ones that passed hot test at a higher frequency.
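A toy sketch of that down-binning idea, assuming each part is simply sold at the highest speed grade it passed at hot test. The speed grades and measured frequencies are hypothetical, and real test flows are certainly more involved than this.

```python
def assign_bin(max_passing_mhz, bins_mhz):
    """Return the highest speed grade not exceeding the part's passing frequency, or None."""
    eligible = [b for b in bins_mhz if b <= max_passing_mhz]
    return max(eligible) if eligible else None


BINS_MHZ = (2600, 2800, 3000, 3200)  # hypothetical speed grades

# Hypothetical highest frequencies at which four parts passed hot test.
for measured_mhz in (3350, 3100, 2790, 2500):
    print(f"passes up to {measured_mhz} MHz -> sold as {assign_bin(measured_mhz, BINS_MHZ)}")
```

In this toy model, a part that passes at 3100 MHz but ships as a 3000 MHz part has 100 MHz of headroom left, which is the kind of margin an overclocker ends up using.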

In order to ensure that it works at the designated speed, the designers will give it a maximum headroom slightly above the desired frequency. 10%?

It is unlikely to work beyond the theoretical frequency due to design issues.

Beyond the design issues, manufacturing issues (e.g. defectivity) cause differing tolerances to frequency and heat, which is why by cooling it aggressively you can overclock the chip to its optimal speed, which is the original design speed.

So what I am saying is, the risk you take in overclocking, say, a Pentium 4 2.6G to 3.0G (which incidentally is the best overclock scheme due to the bus multiplier effect) is way lower than when you overclock a P4 3.2 to 3.5.

Hope it helps

sgtroyer · Sep 24, 2003

calbear2000,

the setup time requirement is a direct result of the capacitance of the internal nodes of the flop. the propagation time is a direct result of the capacitance of the logic and routing wires. The other factor is the current sourcing ability of your fets. When temp goes down, mobility increases, so there is more current. This means that at low temp your setup time requirement will be less, and the propagation time through the logic will be faster. You can clock the chip faster.

I agree with you that you will never run into the speed of light, and electromigration doesn't limit speed, but to say that excessive capacitance isn't a factor is just wrong. The reason "speedpaths and setup times" break is because of capacitance.

calbear2000 · Sep 24, 2003

Originally posted by: sgtroyer
calbear2000,

the setup time requirement is a direct result of the capacitance of the internal nodes of the flop. the propagation time is a direct result of the capacitance of the logic and routing wires. The other factor is the current sourcing ability of your fets. When temp goes down, mobility increases, so there is more current. This means that at low temp your setup time requirement will be less, and the propagation time through the logic will be faster. You can clock the chip faster.

I agree with you that you will never run into the speed of light, and electromigration doesn't limit speed, but to say that excessive capacitance isn't a factor is just wrong. The reason "speedpaths and setup times" break is because of capacitance.

Yet you can lower your setup time by increasing your clock path delay (and thus increasing hold). But you can't keep trading off setup and hold at higher clock speeds. Without design/architecture changes, the clock is limited purely because of design issues.

Technically, yes everything comes down to device physics. But it is a misleading answer as to why there is a limit on clock speeds on a given chip. It is the design, not device characteristics.

sgtroyer · Sep 24, 2003

You misunderstand me. I'm not talking about adding delay to the clock path to trade setup for hold time. I'm referring to the fact that as the logic gets faster, because of temperature or any other cause, setup times and propagation times decrease. The chip will run faster.

Design depends on device characteristics. There's no way around it. Setup time isn't a constant number with no basis in physical reality: it depends on the devices and the physics. To think of it as an unchanging constant is to oversimplify and mislead.

Someone want to back me up here?

calbear2000 · Sep 24, 2003

Originally posted by: sgtroyer
You misunderstand me. I'm not talking about adding delay to the clock path to trade setup for hold time. I'm referring to the fact that as the logic gets faster, because of temperature or any other cause, setup times and propagation times decrease. The chip will run faster.

Design depends on device characteristics. There's no way around it. Setup time isn't a constant number with no basis in physical reality: it depends on the devices and the physics. To think of it as an unchanging constant is to oversimplify and mislead.

Someone want to back me up here?

Where did I say setup time and prop delay are constant?? Of course they change with temperature.

Setup time depends on the relative delay of data w.r.t. clock. It will decrease only if your data delay is larger than clock delay. Otherwise it increases at colder temp. Regardless, changes in setup time (if it does decrease for a given path) will not scale with the setup window when the clock period decreases, i.e. setup times across the chip will not keep up with a shrinking time window when you pump up the clock by even a few percent.

We design and validate across PVT (from -10 to 110 degrees Celsius). Sometimes setup decreases at faster corners, sometimes it increases. If you want data on this, pm me.

CTho9305 · Sep 24, 2003

While I tend to intuitively expect sgtroyer is correct, I can see where calbear2000 is coming from. If the clock propagation circuitry speeds up more than setup and hold times, then you'd need your logic to have the data ready earlier.

sgtroyer · Sep 25, 2003

Yeah, I think we're all right, but terminology is just getting in the way.

When I said setup times decrease, I should've said the setup requirement decreases. Setup time depends on the clock period and logic propagation time, the setup requirement is a property of the flop. calbear caught me on that. So yes, if your setup time decreases (because of shorter clock periods), and there is no corresponding decrease in the setup requirement, then the path is going to break.

Chris, in general the propagation time through the clock drivers doesn't matter. Since the same (approx) clock is feeding all of the flops, everything will speed up and slow down in concert. If the clock arrives sooner, the signal is launched sooner and clocked in sooner. Note that a clock that arrives sooner is different than a clock with a shorter period.

One problem with the above simplification, however, is clock skew, where because of different RC delay in clock lines, one clock will arrive before another, decreasing your timing window. This is a big problem on large chips.
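To see how skew eats into the timing window, here is a minimal sketch. It assumes the worst case, where the capturing flop's clock arrives early by the full skew, so the skew adds directly to the minimum period. The 200 ps logic delay echoes the earlier example in this thread. The setup and skew values are hypothetical.

```python
def max_clock_ghz(logic_delay_ps, setup_ps, skew_ps=0.0):
    """Highest clock for which period >= logic delay + setup requirement + worst-case skew."""
    min_period_ps = logic_delay_ps + setup_ps + skew_ps
    return 1000.0 / min_period_ps


# Hypothetical setup requirement (50 ps) and skew values.
for skew_ps in (0.0, 15.0, 30.0):
    f_max = max_clock_ghz(200.0, 50.0, skew_ps)
    print(f"skew = {skew_ps:4.1f} ps -> max clock ~ {f_max:.3f} GHz")
```

With these numbers, 30 ps of skew costs a bit over 10% of the achievable clock. The same function also shows the earlier point: scaling both the logic delay and the setup requirement by the same factor scales the maximum clock by the inverse of that factor.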

Mday · Sep 25, 2003

Originally posted by: draggoon01
Suppose you were able to keep the temperature of a chip constant with fancy cooling. Then would you be able to clock it infinitely high? What's the limiting factor?

materials. it's the material that is limiting it, as well as the use of the transistor. assuming you can keep the temperature constant with some fancy cooling, electron tunneling and other detrimental effects that occur at high frequencies will cause loss and corruption of data. not to mention chips with transistors not designed with high frequencies in mind just won't work at high frequencies.

now back to reality... proper cooling is impossible. hot spots will occur. chips get hot from both sides, the area between the mobo and cpu (where it is mounted) needs to be cooled too.

uart · Sep 27, 2003

Originally posted by: sgtroyer
What I should've said is mobility starts to decrease again, but I assume you figured this out from the context. I don't claim to be any expert in device physics, but that's what it said in my book. "Solid State Electronic Devices", Streetman and Banerjee, 5th ed. pg. 97-98

I'll try to summarize. At high temperatures, mobility is dominated by lattice or phonon scattering. This increases with increasing temperature. At low temperatures, impurity scattering dominates. This is "scattering from defects such as ionized impurities." As temperature decreases, the thermal motion of the carriers becomes slower, and the effect of a collision with an impurity is greater.

Do I understand it? Not totally. Is it quantified? Not at all. But that's what the book says. I'll try to see if I can find further information.

In a nutshell, what that means is that at higher temperatures the mobility is predominantly limited by collisions with the vibrations in the crystal lattice of the intrinsic semiconductor material, thermal vibrations that clearly increase as temperature increases.

At lower temperatures lattice vibrations decrease to the point where collisions with the actual dopant impurities dominate. To understand why collisions with these dopant ions have a worse impact on mobility at low temperatures, you have to remember that almost all of the carriers are originally derived from these dopant ions, and the dopant ion can swallow the carrier back up again if they meet and the carrier does not have sufficient energy. In the limit, if you keep reducing the temperature, you can actually get to the point where the carriers "freeze", meaning almost all of them get trapped by their original dopant ions and have insufficient thermal energy to re-ionize.
