Will smaller CPUs be less reliable?

blyndy

Member
Nov 11, 2008
33
0
0
I was thinking that as IC features shrink down to tens of atoms, will there be a noticeable increase in failures from things such as cosmic rays and electromigration?
 

Swivelguy2

Member
Sep 9, 2009
116
0
0
A. Cosmic Rays: If a cosmic ray were capable of interacting with the tiny traces in a CPU, how would it make it through the computer case? Or your house?

B. Electromigration: They won't sell you the CPU unless it can be made reliable.
 

Cogman

Lifer
Sep 19, 2000
10,283
135
106
If the tech doesn't change, yes, there will be significant failures.

However, most die shrinks are accompanied by a lot of new tech that allows the shrink to happen. Think high-k gate dielectrics.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
A. Cosmic Rays: If a cosmic ray were capable of interacting with the tiny traces in a CPU, how would it make it through the computer case? Or your house?

B. Electromigration: They won't sell you the CPU unless it can be made reliable.

A: You'd be surprised how often we have to consider options for dealing with "cosmic ray" events in the design. So yes, they do interact with the CPU, and we either just blue screen or use some type of parity/ECC/etc. to handle it in places that have a very low error tolerance.
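
To make the parity/ECC option a bit more concrete, here is a rough sketch of a Hamming(7,4) code in Python, which can correct any single flipped bit in a 7-bit word. The function names are made up for illustration, and real hardware typically uses wider codes (often with an extra parity bit for double-error detection) built in logic rather than software, so treat this as a toy model and not as what any particular CPU does.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (an int 0..15) as a 7-bit codeword.

    The codeword is a list of bits for positions 1..7, laid out as
    p1, p2, d0, p4, d1, d2, d3 (parity bits sit at positions 1, 2, 4).
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(codeword):
    """Return (nibble, error_position); error_position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of a single flipped bit.
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip it back
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    word = hamming74_encode(0b1011)
    word[4] ^= 1  # simulate a single-event upset on one stored bit
    print(hamming74_decode(word))
```

A single flipped bit anywhere in the word is located by the syndrome and corrected. Two flips in the same word would be miscorrected by this minimal version, which is one reason practical ECC adds more check bits.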
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
My buddy just bought a netbook and didn't like the os load times, so he left it running 24/7 .... the book was dead within two weeks of him making that decision.

I'd say the new tech has a ways to go when it comes to reliability, the ones I've worked with all run hot.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
My buddy just bought a netbook and didn't like the os load times, so he left it running 24/7 .... the book was dead within two weeks of him making that decision.

I'd say the new tech has a ways to go when it comes to reliability, the ones I've worked with all run hot.

Sounds like an exception, but I haven't tried running my netbook 24/7. I would be surprised if that caused it to die within 2 weeks.
 

esun

Platinum Member
Nov 12, 2001
2,214
0
0
http://en.wikipedia.org/wiki/Soft_error
http://en.wikipedia.org/wiki/Electromigration

Yes, cosmic rays are an issue. So is electromigration. Yes, both become bigger problems as we shrink device dimensions. I think those wiki articles explain the issues quite well.

I'd bet there already has been a noticeable increase, which is why designers specifically design to allow for robustness to soft errors and layout methods are used to minimize electromigration.
 

CanOWorms

Lifer
Jul 3, 2001
12,404
2
0
Devices are usually designed for a certain lifetime. Thus, electromigration issues are usually sorted out early, and a customer would usually not encounter them unless they run something out of specification, e.g. voltage.
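
The usual first-order model for electromigration lifetime is Black's equation, MTTF = A * J^(-n) * exp(Ea / (k * T)), where J is current density and T is absolute temperature. Below is a small Python sketch of how lifetime scales when a part is pushed out of spec. The current-density exponent n = 2 and the activation energy of 0.7 eV are assumed, illustrative values only; real numbers depend on the metal and the process, and the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_ratio(current_density_ratio, base_temp_c, new_temp_c,
               n=2.0, activation_energy_ev=0.7):
    """Relative electromigration lifetime from Black's equation.

    Returns MTTF_new / MTTF_base. The prefactor A cancels in the ratio.
    n and activation_energy_ev are assumed example values, not process data.
    """
    t_base = base_temp_c + 273.15
    t_new = new_temp_c + 273.15
    thermal = math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_new - 1.0 / t_base)
    )
    return current_density_ratio ** (-n) * thermal


if __name__ == "__main__":
    # Same temperature, 20% more current density.
    print(mttf_ratio(1.2, 85.0, 85.0))
    # 20% more current density and the interconnect running 20 C hotter.
    print(mttf_ratio(1.2, 85.0, 105.0))
```

Because temperature sits inside the exponential, a modest overvolt that raises both current density and temperature can cut the modeled lifetime by a large factor, which fits the point about staying within specification.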

Radiation is a huge issue, especially in the aerospace area. Single Event Upsets can change data, Single Event Latchups can destroy a device, etc. You can design around some of these issues, too.
 
May 11, 2008
21,407
1,249
126
On this website you will find some information about CPUs used in space.


http://www.cpushack.net/space-craft-cpu.html


The RCA 1802 from Voyager 1 and 2: [image]

I do not know how they do it, but BAE Systems hardens its processors against radiation. Does anybody have more detailed information on that?
I found this.
http://en.wikipedia.org/wiki/IBM_RAD6000


I found something, but more information is always welcome...

http://en.wikipedia.org/wiki/Radiation_hardening

Found some more detailed information.

http://www.mse.vt.edu/faculty/hendricks/mse4206/projects97/group02/space.htm

http://www.mse.vt.edu/faculty/hendricks/mse4206/projects97/group02/hardening.htm#soi
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
In the mid-1990s at Sandia we used lots of gold to prevent damage to parts. The gold on the chips in those old pictures wasn't just for looks; it was used as a shielding material. It was easy to work with, and protecting electronics is still one of the major uses for gold. Another method was suspending the parts in Fluorinert-based fluids. Liquids are great at stopping all forms of radiation.
 

CountZero

Golden Member
Jul 10, 2001
1,796
36
86

yottabit

Golden Member
Jun 5, 2008
1,580
668
146
I have the feeling that we are going to hit the "brick wall" Moore predicted sooner than everyone realizes when it comes to die shrinks. I don't mean that it will be impossible to produce smaller transistors, but that it will not be cost effective compared to other methods of gaining performance.

If we are smart and want to continue the trend of roughly doubling processing power every 18 months, we need to start exploring more options than the essentially free gains we've been getting from die shrinks over the past decades. I know chip makers love die shrinks because it lets them get more yield, but that's only if they can manufacture effectively at a certain size. I'm sure there are lots of architecture changes and software changes that can be made to enhance performance. Another note: with the predicted trend towards general-purpose programmable units, we may not need such a high proportion of CPU power anyway.

Even if processing power does hit a brick wall, it will just force software designers to get off their butts and make more efficient software. Has anyone noticed (at least on Windows) how bloated everything has become? I forget where, but I saw a test where someone ran a circa 1999 computer with Office against a circa 2008 computer with Office 2007, and the older computer could get the tasks done in half the time. With all this abundance of processing power lately, it seems like some programmers have forgotten the word optimization.

Anyways that's my prediction... based on very little sound research of course

To the OP: one thing I found interesting is that (according to Wikipedia, the principal source of knowledge on the internet) the principal cause of errors in DRAM is cosmic radiation. Apparently DRAM is most affected by it because each bit is essentially a tiny capacitor, so intercepting some cosmic energy can cause it to flip state.
 

esun

Platinum Member
Nov 12, 2001
2,214
0
0
I know chip makers love die shrinks because it lets them get more yield

Just wanted to point out that yield in the fabrication business is a specific term referring to the proportion of "good" devices on a wafer (as in, devices with no defects). Hence, yield tends to go down when moving to a new process, then gradually climbs as the process is refined. The device density increases with process shrinks, so you get more devices per wafer, but yield doesn't go up.
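
To put rough numbers on the difference between yield and devices per wafer, here is a small Python sketch using the simple Poisson yield model, yield = exp(-die_area * defect_density). The wafer size, die areas, and defect densities are invented for illustration (an assumed mature process versus an assumed early-stage shrink), and the dies-per-wafer count ignores edge loss.

```python
import math

# Area of a 300 mm wafer in cm^2 (edge loss ignored for simplicity).
WAFER_AREA_CM2 = math.pi * 15.0 ** 2


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def wafer_summary(die_area_cm2, defects_per_cm2):
    """Gross dies, yield fraction, and expected good dies for one wafer."""
    gross = int(WAFER_AREA_CM2 // die_area_cm2)
    fraction_good = poisson_yield(die_area_cm2, defects_per_cm2)
    return {"gross_dies": gross,
            "yield": fraction_good,
            "good_dies": gross * fraction_good}


if __name__ == "__main__":
    # Assumed example values: mature process vs. a freshly introduced shrink.
    print("mature:", wafer_summary(die_area_cm2=1.0, defects_per_cm2=0.2))
    print("shrink:", wafer_summary(die_area_cm2=0.5, defects_per_cm2=1.0))
```

With these made-up inputs the shrink has the lower yield fraction but still produces more good dies per wafer, because the die count doubles. That is the distinction being drawn here: density goes up with the shrink, while yield starts lower and climbs as the process matures.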

Apparently DRAM is most affected by it because each bit is essentially a tiny capacitor, so intercepting some cosmic energy can cause it to flip state.

The soft errors article I linked above goes into detail on this.

This has its own problems. What if it is the voting mechanism that fails?

Fair point, but it's easier to make a very robust majority circuit than to make the entire system robust.
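
As a sketch of why the voter is the easy part: a 2-of-3 majority over three redundant copies of a word is only a handful of AND/OR gates per bit. The Python below models that bitwise; the function names are hypothetical and this is a software illustration of the idea, not a description of any specific hardened design.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word, bit):
    """Simulate a single-event upset by toggling one bit of a word."""
    return word ^ (1 << bit)


if __name__ == "__main__":
    original = 0b10100101
    copies = [original, original, original]
    copies[1] = flip_bit(copies[1], 3)  # one copy takes a hit
    voted = majority_vote(*copies)
    print(bin(voted), voted == original)
```

The vote masks an upset in any single copy, and even upsets in two copies as long as they land on different bit positions. It fails only when the same bit is wrong in two copies, or when the voter itself is hit, which is why the voter is the piece worth making physically robust.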
 

yottabit

Golden Member
Jun 5, 2008
1,580
668
146
Just wanted to point out that yield in the fabrication business is a specific term referring to the proportion of "good" devices on a wafer (as in, devices with no defects). Hence, yield tends to go down when moving to a new process, then gradually climbs as the process is refined. The device density increases with process shrinks, so you get more devices per wafer, but yield doesn't go up.

Thanks for clearing that up... that's what I was trying to express, but couldn't find the right words.
 

RadiclDreamer

Diamond Member
Aug 8, 2004
8,622
40
91
My buddy just bought a netbook and didn't like the os load times, so he left it running 24/7 .... the book was dead within two weeks of him making that decision.

I'd say the new tech has a ways to go when it comes to reliability, the ones I've worked with all run hot.

One laptop and one experience are not sufficient to base a reliability judgement on.
 

jimhsu

Senior member
Mar 22, 2009
705
0
76
Below 45 nm, chipmakers need double patterning to get features that are small enough, though that has been done with considerable success. Below 22 nm is another problem entirely, because quantum tunneling starts to matter (the tunneling current falls off exponentially with distance, says my phys chem class, so it grows rapidly as barriers get thinner): http://en.wikipedia.org/wiki/16_nanometer . That in essence makes ordinary silicon lithography all but impossible, even with electron beam lithography (the limit is theoretical, not engineering). So yes, there is a definite limit that CPU manufacturers will need to find some way to work around.
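
To get a feel for that exponential dependence, here is a back-of-the-envelope Python sketch using the textbook approximation for a wide rectangular barrier, T ≈ exp(-2 * kappa * d) with kappa = sqrt(2 * m * phi) / hbar. The 3.1 eV barrier height (roughly a silicon/oxide barrier) and the use of the free-electron mass are assumptions chosen for illustration, so the absolute numbers should not be taken as device data.

```python
import math

HBAR = 1.054571817e-34           # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31 # free-electron mass, kg
EV_TO_JOULE = 1.602176634e-19    # J per eV


def decay_constant(barrier_ev, mass_kg=ELECTRON_MASS):
    """kappa in 1/m for a rectangular barrier of the given height."""
    return math.sqrt(2.0 * mass_kg * barrier_ev * EV_TO_JOULE) / HBAR


def tunneling_probability(width_nm, barrier_ev=3.1):
    """Approximate transmission through a barrier width_nm thick.

    barrier_ev defaults to an assumed, illustrative barrier height.
    """
    kappa = decay_constant(barrier_ev)
    return math.exp(-2.0 * kappa * width_nm * 1e-9)


if __name__ == "__main__":
    for width in (3.0, 2.0, 1.0, 0.5):
        print(f"{width:>4} nm barrier -> T ~ {tunneling_probability(width):.3e}")
```

Halving the barrier thickness takes the square root of the transmission probability in this model, so leakage climbs by many orders of magnitude over a few shrinks.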
 