I particularly enjoyed this tidbit about creating an OS that is hardware-fault-aware. Pretty neat ideas!
"For example, a hardware-fault-tolerant (HWFT) OS might map out faulty memory locations, just as disks map out bad sectors. In a multi-core system, the HWFT OS could map out intermittently bad cores. More interestingly, the HWFT OS might respond to an MCE by migrating to a properly functioning core, or it might minimize susceptibility to MCEs by executing redundantly on multiple cores. A HWFT OS might be structured such that after boot, no disk read is so critical as to warrant a crash on failure. Kernel data structures could be designed to be robust against bit errors. Dynamic frequency scaling, currently used for power and energy management, could be used to improve reliability, running at rated speed only for performance-critical operations. We expect that many other ideas will occur to operating-system researchers who begin to think of hardware failures as commonplace even on single machines."