
CPU vs server CPU?

You still can't separate bit flip (which all ram is subject to, and what ECC addresses) and faulty hardware (which ECC won't do %$#@ for).

There is little point in continuing to discuss this until you can separate the two. ECC ram fails exactly like non ECC ram.

How is that a negative for ECC RAM? Sounds like a huge plus to me.

"Faulty" non-ECC- you don't know if your RAM is really faulty; it could just be a random single bit flip causing a problem.

Faulty ECC- since a single bit flip would be corrected, you know it's really messed up and needs to be replaced.

How can you possibly twist this around to make it a negative for ECC?
 
Don't distort my meaning.

There is nothing negative about ECC RAM (beyond being a feature that's not worth the cost to use in every single system in the world, and a small latency increase), but people touting it as a panacea for all memory issues don't understand the problem. Random bit flips are so rare, that if they even manifested as something you noticed on a home system, it wouldn't recur. It might happen once or twice a year on average. What this guy seems to think it will make a difference with is truly broken hardware that is continually causing a system to crash. ECC will not help you in this case. If your memory is broken, it's broken. It's not the low probability chance of a 0 turning into a 1 or vice versa.
 
Random bit flips are so rare, that if they even manifested as something you noticed on a home system, it wouldn't recur.

The studies say you're wrong.

Memory errors are HIGHLY correlated. You get one, you will get more.


What this guy seems to think it will make a difference with is truly broken hardware that is continually causing a system to crash. ECC will not help you in this case.

WTF are you talking about?

If your memory is broken, it's broken.

Please explain precisely what you mean by 'broken'. The same bit always reading the same value (stuck)? Memory not being recognized by the system and refusing to boot? Memory with much greater than normal error rates?
 
You still can't separate bit flip (which all ram is subject to, and what ECC addresses) and faulty hardware (which ECC won't do %$#@ for).
Wrong. See the rather famous Google study. ECC support was doing quite a bit of %$#@ to successfully correct errors on failing (faulty) hardware. In the process, this also indicated that the modules were in need of replacement. What matters is how many and which bits were not sent/received correctly, not whether the source of the problem is RAM, PSU, mobo VR, etc.

There is little point in continuing to discuss this until you can separate the two. ECC ram fails exactly like non ECC ram.
The chips may, but the whole system does not. With 64-bit modules, you cannot reliably know whether or not an error occurred, and I'm sure we've all had bad RAM experiences where Memtest wouldn't catch it. With ECC, you can be alerted to a failure, much like practically every other storage technology out there. As a bonus, sufficiently minor errors can be both detected and corrected. If a pattern presents itself, you know there is a HW issue to work on (likely RAM, but not necessarily).
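
To make the "detected and corrected" part concrete, here is a rough sketch of the idea behind SECDED (single error correct, double error detect). It is a toy Hamming(8,4) code protecting 4 data bits, not what a memory controller actually implements: ECC DIMMs protect each 64-bit word with 8 extra check bits. The function names `encode` and `decode` are made up for illustration.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 holds the overall parity bit
    # Classic Hamming layout: data at positions 3, 5, 6, 7; parity at 1, 2, 4.
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2                # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the error cannot be corrected."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                    # XOR of set positions; 0 for a valid word
    overall = sum(code) % 2                    # 1 means an odd number of bits flipped
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Treated as a single flip: the syndrome is the position of the bad bit
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                               # one flipped bit
    print(decode(word))                        # ('corrected', 11)
    word[2] ^= 1                               # a second flipped bit
    print(decode(word))                        # ('uncorrectable', None)
```

Without the check bits, both of those flips would simply hand back wrong data with no warning. With them, the first is fixed and can be logged, and the second at least gets reported. Three or more flips in one word can still slip past or be miscorrected by a code like this, so a steady stream of corrected errors is the cue to go hunting for the hardware problem.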
 