Originally posted by: Blain
Don't worry, you memory hounds, I'm not trying to decide between ECC and non-ECC memory for any of my PCs. Of course, like everyone else, I've heard how ECC is for mission-critical systems. But I'm curious to know whether anyone has actually done any side-by-side testing that demonstrates the ECC advantage when errors are encountered:
1. Non-ECC tested with simulated errors injected.
2. ECC tested with simulated errors injected.
3. Test results compared and conclusions drawn.
>> Can anyone point me to some tests? <<
It's been done. I don't have links handy, sorry, but I'm pretty sure you'll find hard data if you check for white papers / reports on the websites of:
Intel developer / server site areas.
IBM server / blade / mainframe products.
IBM research labs.
Sun server products.
NASA / JPL / ESA (European Space Agency) re: the space environment, radiation-hardened computer hardware, cosmic-ray-induced errors, et al.
I heard a figure recently on the Beyond3D forum: IBM's estimate was one single-bit error per gigabyte of memory per month (I presume attributable mostly to cosmic rays and alpha-radiation "single event upsets". I also presume that's on the *low* side, i.e. it assumes a low rate of electronic noise/glitches from EMI, logic metastability, chip malfunction, and so on; clearly such problems could push the error rate arbitrarily high).
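Just to put that figure in perspective, here's a quick back-of-envelope scaling of it (my own arithmetic; the only input taken from above is the ~1 error per GB per month rate, and the memory size and uptime are made-up example numbers):

```python
# Back-of-envelope only: scale the quoted ~1 single-bit error per GB per
# month to a hypothetical desktop. Memory size and uptime are invented
# example values, not measurements.
rate_per_gb_month = 1.0   # errors / GB / month (figure quoted above)
memory_gb = 4             # example system
months = 12               # a year of uptime

expected_errors = rate_per_gb_month * memory_gb * months
print(f"Expected single-bit errors per year: {expected_errors:.0f}")
# -> about 48 flips a year that non-ECC memory would neither detect nor correct
```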
SRAM is more susceptible per kilobyte than DRAM, though a typical system has orders of magnitude more DRAM than SRAM installed.
CPUs and GPUs are also susceptible; anecdotally (the same B3D poster's assertion), somewhere below 65 nm (or was it 55 nm?) designers had to start using ECC and other error detection/correction techniques inside mainstream CPUs just to keep the error rate acceptable. So your modern CPU, and possibly (I don't know) newer GPUs, *already* use ECC internally to a degree.
Google for "single event upset" or SEU and ECC and I bet you'll find a
slew of reports.
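If you just want to see the principle behind the OP's three-step comparison in software rather than hardware, here's a minimal sketch using a Hamming(7,4) code (my own toy illustration, not taken from any of the sources above; real ECC DIMMs typically use a wider SECDED code over 64-bit words, but the idea is the same):

```python
import random

def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.
    Positions (1-indexed): p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Return (corrected data bits, 1-based error position or 0)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 0 = clean, otherwise the flipped position
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the offending bit back
    return [c[2], c[4], c[5], c[6]], syndrome

data = [1, 0, 1, 1]

# 1. Non-ECC: store the raw bits, inject a single-bit error, read back.
plain = data[:]
plain[random.randrange(4)] ^= 1       # simulated single event upset
print("non-ECC read:", plain, "- silently wrong, nothing flags the error")

# 2. ECC: store a Hamming(7,4) codeword, inject a single-bit error, read back.
codeword = hamming74_encode(data)
codeword[random.randrange(7)] ^= 1    # simulated single event upset
decoded, pos = hamming74_decode(codeword)
print("ECC read:    ", decoded, f"- corrected (syndrome pointed at bit {pos})")

# 3. Compare: the ECC path always recovers the original data.
assert decoded == data
```

The encode/decode pair is just the textbook single-error-correcting Hamming construction; the point is step 1 vs step 2 - the plain copy has no way even to notice the flip, while the coded copy both detects and repairs it.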