Increasing CPU Core Voltage... What effects are caused?

Stimmer

Junior Member
Dec 22, 2007
I realize raising the voltage increases heat and lets you raise your clock speed. However, what other effects go with it? Does it lower the life of your CPU, or is that a direct result of +Voltage = +Temps = -Life?

In other words, if you OC and need to increase your voltage, but you keep your temps below 40C, do you avoid shortening the life of the CPU?

Still learning as I go, so asking questions that come to mind.
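
For the voltage-to-heat link in the question, here is a rough sketch using the common textbook approximation that dynamic CMOS power scales as P ~ C * V^2 * f. The approximation is not from this thread, it ignores leakage power, and the voltages and clock speeds below are hypothetical numbers chosen only for illustration.

```python
def relative_dynamic_power(v_old, v_new, f_old, f_new):
    """Return P_new / P_old under the approximation P ~ C * V^2 * f.

    The switched capacitance C is assumed unchanged, so it cancels.
    Leakage power is ignored (an assumption of this sketch).
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Hypothetical overclock: 1.25 V -> 1.40 V and 2.4 GHz -> 3.0 GHz.
ratio = relative_dynamic_power(1.25, 1.40, 2.4, 3.0)
print(f"Dynamic power rises by a factor of about {ratio:.2f}")
```

Because voltage enters squared, a modest voltage bump can add more heat than the clock increase it enables, which is why the cooling has to improve just to hold the same temperature.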
 

pm

Elite Member Mobile Devices
Jan 25, 2000
It's a good question, one that comes up here a lot, and one for which there's generally no good consensus answer.

There are plenty of low-level reasons why a chip will stop working, and the equations that describe the statistical probability of failure tell you which key variables govern each failure. In some failure mechanisms, temperature and voltage play an equal role... in others the voltage dominates. In one failure mechanism, the cooler the CPU, the worse it is. But it's hard to know which specific mechanism will govern the reliability of a microprocessor on a given process technology unless you happen to be involved in reliability studies for that particular product. And even if you did know, it's still a statistical thing.

For example, for a long time the wiring inside CPUs was made of aluminum with some copper and silicon atoms thrown in. Aluminum atoms are fairly lightweight, so when a huge number of electrons are flowing down a wire, they can actually push the aluminum atoms around. It's kind of like trying to move a car by firing ping-pong balls at it. It's a crazy idea unless you have a ridiculous amount of them and they are moving very fast. If you move enough metal atoms down the wire, the wire will start to have holes in it. The remaining metal then has to carry the same current, which increases the temperature of the wire, which makes things even worse, and eventually the wire will fail. This is called electromigration, and it was the dominant failure mechanism on CPUs up until about the 180nm process technology, when copper started to be used to make the wires. Copper atoms are much heavier, and they pretty much stopped electromigration from being a problem for a while... although it's starting to become an issue again.

Electromigration is described by "Black's Equation", which is: MTTF = (A / J^n) * e^(Ea/(kT)). MTTF is "mean time to failure", the statistical time it takes before the wire fails; A is a constant that depends on the cross-sectional area of the wire; J is the current density; n is a scaling exponent; Ea is the activation energy, which depends on the metal; T is the absolute temperature; and k is the Boltzmann constant. For a given wire, the area, the constants, and the activation energy are all fixed, so the two variables you can mess with are the current density (which rises when you raise the voltage) and the temperature. With n = 2, MTTF is proportional to 1/J^2, while its temperature dependence goes as e^(Ea/(kT)) - in this case voltage is much worse than temperature. See also: http://en.wikipedia.org/wiki/Electromigration .
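
To make Black's Equation concrete, here is a minimal sketch that computes only the ratio of two MTTF values, so the prefactor A cancels. The activation energy of 0.7 eV, the exponent n = 2, the temperatures, and the idea that current density rises in step with voltage are all assumptions for illustration, not measured values for any real CPU.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def relative_mttf(j_ratio, t_old_k, t_new_k, n=2.0, ea_ev=0.7):
    """Return MTTF_new / MTTF_old from Black's Equation.

    j_ratio  : J_new / J_old (current density ratio)
    t_old_k  : original wire temperature in kelvin
    t_new_k  : new wire temperature in kelvin
    n, ea_ev : assumed exponent and activation energy (illustrative only)
    """
    current_term = j_ratio ** (-n)
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_new_k - 1.0 / t_old_k)
    )
    return current_term * thermal_term


# Hypothetical cases: 10% more current density at constant temperature,
# and a 20 K hotter wire at constant current density.
print("10% more J:", relative_mttf(1.10, 330.0, 330.0))
print("20 K hotter:", relative_mttf(1.00, 330.0, 350.0))
```

Both changes shorten the predicted lifetime, and in practice they arrive together, since raising the voltage also raises the temperature. How the two terms compare depends heavily on the real n and Ea for the process, which are not known here.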

But electromigration has ceased to be much of an issue in the industry over roughly the last 7-8 years, and now there are plenty of new ways for CPUs to die. Other things that will kill a CPU nowadays are: gate-oxide wear, hot-electron gate ionization, oxide defect generation, PMOS negative bias temperature instability, and several others. In the case of gate-oxide wear, there's a temperature acceleration factor that gives a stronger dependence on elevated temperature than on increased voltage, which has a linear dependence. While Googling for this post, I came across this paper ( http://www.ece.rutgers.edu/~kp...apers/NanoSymp2001.doc ) which discusses a variety of oxide failure models and reaches the conclusion: "The understanding of thin gate-oxide wear out and breakdown is far from complete. None of the model in the literature is without shortcomings."

So in a nutshell, even the experts in the industry don't know. If you want my advice, though, be very careful with the voltage and don't press your luck too far. I wouldn't recommend listening to anyone who says "voltage doesn't really matter - it's the temperature that will kill your CPU", because the academic literature doesn't support that at all.
 

Borealis7

Platinum Member
Oct 19, 2006
heat - bad.
beer - good.


What do we want!?
BRAINS!

When do we want it!?
BRAINS!
 

The-Noid

Diamond Member
Nov 16, 2005
PM, that was probably the most informative post I have seen in a long time.

Thank you very much.
 

MrSpadge

Member
Sep 29, 2003
PM,

thanks for taking the time to write this up. It's always nice to see someone elevating discussions with real knowledge.

I read the 2001 paper you linked. Given the rather crude assumptions being made, it is no wonder that neither model fits a real thin oxide. From what I've seen so far, I'd guess the oxide breakdown is at least a superposition of two different mechanisms. Do you know if any substantial progress has been made in this field since 2001?

Regards,
Stephan
 

Martimus

Diamond Member
Apr 24, 2007
That was a very nice post, PM! You explained electromigration much better than I ever could have. Even so, I would like to chime in, as I have experience testing electronic components for voltage and current limitations. In my experience, diodes degrade much faster the more current you push through them. A zener diode will actually change its breakdown voltage over time, and will do so at an exponentially increasing rate the more current you put into it. The part will eventually fail, and the silicon will sometimes crack under the heat (although cracked silicon does not always make the part fail). Either way, you cannot undo the damage that is done when you push too much current through the component, even if it hasn't failed yet: the breakdown voltage has permanently changed. How this relates to microprocessors is that the saturation region of the transistors will change over time, and will do so faster the more current you use (more voltage = greater current). This means that the switching voltage will change over time, so that what was once adequate voltage to switch a transistor "on" will no longer be adequate after the saturation region has shifted. This will likely make processors require higher voltages as they age to work properly. Forcing additional current through the processor will only quicken this change.
 

VirtualLarry

No Lifer
Aug 25, 2001
Originally posted by: Martimus
How this relates to microprocessors is that the saturation region of the transistors will change over time, and will do so faster the more current you use (more voltage = greater current). This means that the switching voltage will change over time, so that what was once adequate voltage to switch a transistor "on" will no longer be adequate after the saturation region has shifted. This will likely make processors require higher voltages as they age to work properly. Forcing additional current through the processor will only quicken this change.

That seems to nicely describe the mechanism behind OC degradation.

I'm running my E2140 @ 3.2 GHz, 1.425 V, 84C core temps. 24/7 Seventeenorbust loading, which is much the same as Prime95.

I'm expecting that I'm shortening the lifespan of the chip, but hopefully not enough to actually matter (i.e., failure before an upgrade).
 

lopri

Elite Member
Jul 27, 2002
Originally posted by: Stimmer
I realize raising the voltage increases heat and lets you raise your clock speed. However, what other effects go with it? Does it lower the life of your CPU, or is that a direct result of +Voltage = +Temps = -Life?
Imagine you do lots of coke. It allows you to dance harder due to the increased energy. However, what other effects go with it? Does it lower the life of your.. whoops. What were we talking about?
 

Martimus

Diamond Member
Apr 24, 2007
Originally posted by: VirtualLarry
Originally posted by: Martimus
How this relates to microprocessors is that the saturation region of the transistors will change over time, and will do so faster the more current you use (more voltage = greater current). This means that the switching voltage will change over time, so that what was once adequate voltage to switch a transistor "on" will no longer be adequate after the saturation region has shifted. This will likely make processors require higher voltages as they age to work properly. Forcing additional current through the processor will only quicken this change.

That seems to nicely describe the mechanism behind OC degradation.

I'm running my E2140 @ 3.2 GHz, 1.425 V, 84C core temps. 24/7 Seventeenorbust loading, which is much the same as Prime95.

I'm expecting that I'm shortening the lifespan of the chip, but hopefully not enough to actually matter (i.e., failure before an upgrade).

Wow, that is pretty hot for 24/7 operation. Hopefully you are planning on upgrading soon :p