But I've never had a system that was unstable due to non-ECC RAM. Regular old RAM stays error-free for as long as I have the patience to test it. If someone's system has a problem with BSODs etc., you can bet it is not due to properly functioning non-ECC RAM, and users deserve to know that before committing if they don't have a high enough awareness of what's involved.
That's the problem: most people don't understand what ECC is supposed to do, and dismiss it as unneeded. Read here.
Basically, there is some chance that a bit spontaneously flips from 0 to 1 or vice versa, due to cosmic rays, background radiation, bla bla bla. Cosmic rays are also one of the possible causes of silent data corruption on a hard disk, which is what ZFS is meant to protect against, although that kind of corruption has other causes too. This data corruption can happen on perfectly functional hardware, not just on defective parts, which you would need to replace anyway.
There are statistics that give an "incidence rate": for every 1 GB of RAM, there is a certain percentage chance of a bit changing its value over a given amount of time. The reported rates vary very widely: it could happen often enough to be an issue, or rarely enough that it will never, ever happen in your lifetime. Still, it CAN happen to you.
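To put rough numbers on that, here is a small Python sketch that turns an incidence rate into a "chance of at least one flip" figure. It assumes bit flips are independent and arrive at a constant rate (a Poisson model), which is my simplification, and the two rates below are made-up placeholders rather than measured values. The function name `p_at_least_one_flip` is just something I picked for illustration.

```python
import math

def p_at_least_one_flip(rate_per_gb_hour, gigabytes, hours):
    """Probability of one or more bit flips, assuming a Poisson process."""
    expected_flips = rate_per_gb_hour * gigabytes * hours
    # 1 - exp(-x), computed with expm1 so it stays accurate for tiny x
    return -math.expm1(-expected_flips)

HOURS_PER_YEAR = 24 * 365

# Two hypothetical rates, far apart, to show how much the answer
# depends on which statistic you believe. Neither is a real measurement.
for label, rate in [("pessimistic", 1e-4), ("optimistic", 1e-9)]:
    p = p_at_least_one_flip(rate, gigabytes=4, hours=HOURS_PER_YEAR)
    print(f"{label}: {p:.6%} chance of at least one flip per year")
```

With the same 4 GB and the same year of uptime, the answer swings from "almost certain" to "basically never" depending only on the rate you plug in, which is why the statistics are so hard to act on.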
What happens when that 0 turns to 1? Very probably, nothing. Worst-case scenarios include data corruption in a file you're working with that you then save back to the hard disk, an application crash, or a BSOD. How do you know if a specific issue was attributable to this? Simple, you don't. But you can assume that if, at least once in your life, you had a random weird BSOD, crash, or whatever that you were never able to reproduce, it COULD have been caused by this sort of data corruption, and that ECC could have corrected it and your system would have kept working as if nothing had happened.
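For anyone wondering how a flipped bit can be "corrected" at all, here is a toy Python sketch using a Hamming(7,4) code: 4 data bits are stored with 3 extra check bits, and the check bits point at the position of any single flipped bit. This is only an illustration of the principle. As far as I know, real ECC modules use a wider code of the same family (extra check bits stored next to each 64-bit word), and the function names here are my own.

```python
def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def correct(word):
    """Return (fixed_word, position); position 0 means no error was seen."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    pos = s1 + 2 * s2 + 4 * s3  # syndrome = 1-based index of the bad bit
    if pos:
        w[pos - 1] ^= 1  # flip the bad bit back
    return w, pos

def decode(word):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [word[2], word[4], word[5], word[6]]

data = [1, 0, 1, 1]
stored = encode(data)
stored[4] ^= 1  # simulate a single spontaneous bit flip in memory
fixed, pos = correct(stored)
print("flip found at position", pos, "-> data recovered:", decode(fixed) == data)
```

Note that this toy version only handles one flipped bit per word; two flips in the same word would be "corrected" to the wrong value.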
My machine is on 24/7, and my record uptime has been 50 days, with 4 GB of common non-ECC RAM. Usually, the causes of my downtime are buggy drivers due to a non-standard GPU configuration (Radeon 4200 + 2x 5770), memory leaks in some games and such, and worst of all, brownouts or blackouts from my electric utility. Do I need ECC? Chances are, no. But it is still a "just in case" measure. In the same way that some people keep a fire extinguisher in the house in case of fire, or a gun in case intruders ever break into their home, for me ECC is a "just in case" measure that may or may not be needed. After all, it's merely preventive.
The chances that ECC corrects an error and saves the day from a BSOD or crash while I'm doing something mission-critical are virtually 0, but still, as a "just in case" measure, I could consider ECC as an extra feature if it is at the right price. But why people consider it fine to pay tons of money to overclock for extra performance deep into ever-diminishing returns, yet disregard a stability-improving feature as "unneeded", is something I still can't understand.