ECC or no ECC for home server?

fffblackmage

Platinum Member
Dec 28, 2007
2,548
0
76
I'm considering building a home server for various media and perhaps for backup purposes as well. If I do build one, I'm pretty certain I'll be using ZFS, though I have yet to determine which OS.

My question is: Should I do a server type build running ECC memory? I'm wondering what kind of problems I could run into if, for some random reason, I get a memory error? ZFS can't fix something like that for me, right? I suppose a single flipped bit here and there doesn't matter, but I just find it annoying knowing a file has failed a hash check and nothing can be done to fix it (has happened more than once already).

Also, any other benefits ECC might provide that I don't know of? Downsides other than cost?

edit: Oh great, I just realized this is a memory question and probably belongs to the memory subforum... oh well.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
You want this thread moved to Memory and Storage? I can do it.

I know of no downsides to ECC besides cost and that it's a bit slower than non-ECC.

One advantage on servers is for troubleshooting. There's usually no need to run long memory tests to troubleshoot a problem. If there's a failing memory chip, you'll usually be told.

It's not clear how often memory bits "flip". Last I read, Intel had lowered its estimate of the frequency of such events. For me, I use ECC on my and my clients' servers. I don't on my own desktop PCs, especially since no $50 motherboards support it.
 

fffblackmage

After hours of googling... I think I managed to answer my own questions.

I was wondering if ECC was a good idea, considering the significant cost of a server cpu and server mobo, but apparently, the Athlon II supports ECC and so do consumer-level ASUS mobos, so that leaves me with the extra cost of ECC ram only. Huzzah!
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
This is a bit off topic from hardware, but if you're going ZFS, use Solaris. I've had more than a couple of people tell me that ZFS on *BSD is still going through teething issues right now.
 

Absolution75

Senior member
Dec 3, 2007
983
3
81
I believe I've read somewhere on this forum that the number of flipped bits is very low even in consumer hardware - somewhere along the lines of 1 flipped bit per 6 months (that figure is likely to be off, but the point is that it's really low).

The question you have to ask yourself: is the extra cost of the RAM worth it?

For me, it's no. I've been running normal DDR2 in a production server for over a year and haven't noticed anything that would point to memory corruption (granted, I had an Intel CPU, and no consumer boards for it support ECC).
 

JackMDS

Elite Member
Super Moderator
Oct 25, 1999
29,563
432
126
I do not see this so much as a technology problem.

Technology-wise, there is not such a big difference between the two these days.

For a business, a server failure can cost very big $$$ (or worse), so it makes no sense to save a small one-time amount of money by going non-ECC.

At home, there is no real need for unprecedented long-term durability. Technology changes every few years, and I still have empty new VHS cartridges that I bought years ago and never used.

A very durable server will go the way of VHS tapes after a while too.


:cool:
 

RebateMonger

I've long wondered how much error detection/correction capability is included in applications. In 25+ years of working with PCs and applications, I've never seen a data error that was verifiably caused by a bit flip. In the first couple of decades of PC use, there was usually no way to know if a computer or application crash was caused by memory errors or was simply a programming (OS or application) error.

MS' Exchange Server, since Exchange 2003 SP1, has had built-in error detection and some correction capability for its mail database:

"Each Exchange page now has two checksums, one right after the other, at the beginning of each page. The first checksum (the data integrity checksum) determines whether the page has been damaged; the second checksum (the ECC checksum) can be used to automatically correct some kinds of random corruption. Before Exchange Server 2003 SP1, Exchange could reliably detect damage, but could not do anything about it.
By surveying many 1018 cases, Microsoft discovered that approximately 40 percent of 1018 errors are caused by a bit flip. A bit flip occurs when a single bit on a page has the wrong value—a bit that should be a 1 flips to 0, or vice versa. This is a common error with computer disks and memory.
The ECC checksum can correct a bit flip. This means that approximately 40 percent of 1018 errors are self-correcting if you are using Exchange Server 2003 SP1 or later."
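To make the "second checksum can correct a bit flip" idea concrete, here is a minimal Python sketch of one way a small checksum can locate and repair a single flipped bit. This is NOT Exchange's actual algorithm (the quote doesn't describe it); it's a Hamming-style illustration, and the names `position_checksum` and `correct_single_flip` are made up for this example. It assumes the stored checksum itself is undamaged.

```python
def position_checksum(data: bytes) -> int:
    """XOR together the 1-based positions of every set bit in data."""
    checksum = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if byte & (1 << bit):
                checksum ^= byte_index * 8 + bit + 1
    return checksum


def correct_single_flip(data: bytes, stored_checksum: int) -> bytes:
    """Repair at most one flipped bit (assumes stored_checksum is intact)."""
    # A single flip at position p changes the checksum by exactly p,
    # so the XOR of old and new checksums points straight at the bad bit.
    syndrome = position_checksum(data) ^ stored_checksum
    if syndrome == 0:
        return data  # checksum matches, nothing to repair
    position = syndrome - 1
    if position >= len(data) * 8:
        raise ValueError("damage is not a single bit flip")
    repaired = bytearray(data)
    repaired[position // 8] ^= 1 << (position % 8)
    return bytes(repaired)


# Hypothetical usage: flip one bit in a "page" and repair it.
page = b"mail database page"
saved = position_checksum(page)
damaged = bytearray(page)
damaged[5] ^= 0x04
assert correct_single_flip(bytes(damaged), saved) == page
```

A scheme like this will happily "repair" the wrong bit if two or more bits flipped, which is presumably why the quoted design keeps a separate data integrity checksum alongside the ECC one.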


Anybody know if MS' Excel or Word or Intuit's Quickbooks, for instance, contain ECC protection for the data files? There must be something. There are SO MANY computers out there, many with defective memory or defective hard drives. But you seldom/never hear of the kinds of numerical or text errors in the data files that, presumably, would be generated by such errors. In 25 years, you'd think that topic would be a major discussion-maker.
 

fffblackmage

Oh wow, responses are much more insightful than I had expected. I'm not familiar with anything ECC, so this is quite helpful.

I know bit flips are quite rare, especially with ram perhaps. The few instances where I got failed hash checks were from a HDD (which I didn't make clear earlier, sorry). The files were just sitting in an external HDD for over a year, and I did find a few files with hashes that didn't check out. Ever since I first found issues, I've done hash checks on files immediately after transferring them and found, on occasion, that files are corrupt. I'm not certain as to what the cause is. Either there was an error introduced during the file transfer over USB or the error was caused by something random while sitting on the disk for so long or my external HDD just sucks.
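For anyone wanting to do the same kind of check, here's a rough Python sketch of hashing a file before and after a transfer. The function names (`file_sha256`, `verify_copy`) and the choice of SHA-256 are just assumptions for the example; any hash tool does the same job.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source, destination):
    """True if the source and the copy hash to the same value."""
    return file_sha256(source) == file_sha256(destination)
```

One caveat: a check run right after the copy may be served from the OS cache rather than read back from the disk, so a mismatch found a year later doesn't by itself say whether the transfer or the storage was at fault. Keeping the recorded hashes next to the files at least lets you re-check them later.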

So I guess ECC doesn't really matter, at least in home use. Bleh, I'm just thinking too hard like usual.
 

fffblackmage

ViRGE said:
"This is a bit off topic from hardware, but if you're going ZFS, use Solaris. I've had more than a couple of people tell me that ZFS on *BSD is still going through teething issues right now."

I was actually leaning towards Solaris over BSD, though I suppose things may change by the time I decide to get around to building this server. From what I hear, Solaris is currently the one with the most features available for ZFS, while BSD is lagging behind a bit (but they're working on an update?).
 

mfenn

Elite Member
Jan 17, 2010
22,400
5
71
www.mfenn.com
ECC seems overkill for a home server. I wouldn't bother with it personally.

I probably wouldn't run OpenSolaris since it's no longer being supported by Sun (Oracle) and I dunno how strong the community will be without that support.