Better NAS testing methodology

klassobanieras

Junior Member
Jun 27, 2013
3
0
0
With respect, it seems really weird that data-integrity is barely mentioned in NAS reviews.

How does a device deal with the RAID write-hole? What happens when you yank the power-cord during a metadata-heavy write?
What happens when a disk silently corrupts a few bits? Is it detected? Fixed? When? During a scrub, or only when I try to read the file?
Say a disk silently corrupts some bits but the problem goes unnoticed, and later a different disk fails outright. What happens when I try rebuilding the array?
What happens when the box dies? Can I get my data off the disks using (eg) a Linux box, or do I need to buy another identical NAS? What if it's 5 years later and I can't find one?

The statistics of drive-failure plus the long expected working life of these devices mean that some kind of failure is basically inevitable, and IMHO what makes a NAS good or bad is what happens after that.
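
To be concrete about what "detected" even means on a box whose filesystem doesn't checksum data: about the only way to catch silent corruption yourself is to keep your own checksums and re-verify them later. Something roughly like the sketch below would do it; it's just an illustration, nothing specific to any NAS, and the hash choice and the build/verify split are arbitrary.

import hashlib, json, os, sys

def hash_file(path, chunk=1 << 20):
    # SHA-256 of one file, read in 1 MiB chunks so big files don't eat RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()

def build_manifest(root):
    # Walk the tree and record a checksum for every file.
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = hash_file(path)
    return manifest

def verify(root, manifest):
    # Re-hash everything and report files that changed or went missing.
    bad = []
    for rel, digest in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.exists(path) or hash_file(path) != digest:
            bad.append(rel)
    return bad

if __name__ == "__main__":
    # python integrity.py build /mnt/nas  > manifest.json
    # python integrity.py verify /mnt/nas < manifest.json
    mode, root = sys.argv[1], sys.argv[2]
    if mode == "build":
        json.dump(build_manifest(root), sys.stdout, indent=2)
    else:
        bad = verify(root, json.load(sys.stdin))
        print("\n".join(bad) if bad else "all files match the manifest")

A review could do exactly this: build the manifest, deliberately flip some bits on one disk, and see whether the box notices before the verify pass does.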
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
I had a generic "Gigabit NAS" with an IDE drive I added. One day, some of my pictures started showing up corrupted. I determined that it must be the RAM in the NAS going bad. I saved the drive and junked the NAS. Thankfully, the NAS was running Linux and using the ReiserFS filesystem, so I was able to easily recover my files.
 

klassobanieras

Junior Member
Jun 27, 2013
3
0
0
I had a ReadyNAS NV running for a *long* time. I loved that little ah heck, until the PSU popped. Suddenly my 'redundant' data is trapped in a totally opaque disk-format, and if I want it back I have to spring $200 for a proprietary PSU that the manufacturer doesn't deign to sell anymore, for an ageing box that I would probably rather replace at this point.

This is really, really backwards: When something fails, that's when you're meant to be *happy* you sprung for a NAS. That's the acid test, and that's my eternal gripe about NAS reviews - they totally ignore the inevitable failure. Just enumerating the supported RAID levels is *nowhere near* good enough.

It's uglier and messier, but nowadays everything's on a FreeNAS box. I'm not dependent on one hardware vendor to continue making spare parts, and I know my data's in a filesystem that cares about data-integrity. I expect there are NASes out there that fit this bill, but how would I know about them?
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I had a generic "Gigabit NAS" with an IDE drive I added. One day, some of my pictures started showing up corrupted. I determined that it must be the RAM in the NAS going bad. I saved the drive and junked the NAS. Thankfully, the NAS was running Linux and using the ReiserFS filesystem, so I was able to easily recover my files.
I wonder if an fsck wasn't the actual problem... RFS's fsck was known for doing exactly that when it tried to recover files. RFS was fast, but it was a terrible file system for file storage.

Ultimately, this is why you'd spend a little more time on a DIY file server. A NAS offers plenty of convenience, and often point-and-click management features, but when it comes to failures it's less robust than an Atom box running FreeNAS, and behaves much more like a sealed-case external HDD (you might want to run something faster than an Atom, though).

Synology has been using/supporting ext4, but I'm not sure what things look like if you use their custom RAID. OTOH, it won't check the FS by itself--you've got to log in, shut everything down, and run fsck, so :|.

Testing failure probably isn't done because it's hard to test for, especially in a way that's relevant to realistic failures.
 

Red Squirrel

No Lifer
May 24, 2003
70,667
13,835
126
www.anyf.ca
It's one of the reasons I like to build my own and use mdraid or something else that is open source. If the hardware fails I can still recover the data with other hardware.

mdraid can handle power plug pulls fairly well from what I've tested, though it's still not something you want to subject any data storage system to, so I have a good UPS.
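
For anyone who hasn't had to do it, pulling the data off on a different Linux box really is mundane. Roughly along these lines; this is only a sketch, the device names, array node and mount point are just examples, and it all needs to run as root on whatever machine you move the disks to:

import subprocess

def run(*cmd):
    # Echo each command and stop if anything fails.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Look at the md superblocks on the member disks (example device names).
for dev in ("/dev/sdb1", "/dev/sdc1"):
    run("mdadm", "--examine", dev)

# Let mdadm find and assemble whatever arrays it recognizes.
run("mdadm", "--assemble", "--scan")

# Mount the assembled array read-only (example mount point) and copy data off.
run("mount", "-o", "ro", "/dev/md0", "/mnt/recovery")

Nothing about the on-disk format is tied to the original hardware, which is the whole point.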
 

klassobanieras

Junior Member
Jun 27, 2013
3
0
0
Testing failure probably isn't done because it's hard to test for, especially in a way that's relevant to realistic failures.

I can definitely see that it would be time-consuming and boring, yep. But testing some decent subset of likely failure modes does seem doable:

1. Pretend the box has died. Try getting your data off the disks.
2. Kick off a metadata-heavy write (say, copying a bazillion tiny files) and yank the power-plug. Repeat a few times.
3. Fill the filesystem to capacity; power off and pull a disk; plug the disk into a PC and overwrite a few blocks with garbage; put the disk back in the NAS and power back on; read back every file on disk.

It would be totally normal for all of this to happen to a single unit over its lifespan. This wouldn't even be testing the worst case, but rather what a user is likely to run into.
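
To make scenario 2 a bit more concrete, the write/verify half doesn't need to be anything fancier than the sketch below. The file count, file size and paths are arbitrary, and the power-yanking in the middle is obviously manual:

import hashlib
import os
import sys

FILE_COUNT = 100_000   # "a bazillion tiny files", more or less
FILE_SIZE = 512        # bytes per file, kept a multiple of 32 for simplicity

def expected_bytes(i):
    # Deterministic content for file i, so it can be re-derived at verify time.
    return hashlib.sha256(str(i).encode()).digest() * (FILE_SIZE // 32)

def path_for(root, i):
    # Spread the files over lots of directories to keep things metadata-heavy.
    return os.path.join(root, "d%03d" % (i % 1000), "f%d.bin" % i)

def write_files(root):
    # Phase 1: run this against the NAS share, then yank the power cord
    # partway through.
    for i in range(FILE_COUNT):
        path = path_for(root, i)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(expected_bytes(i))

def verify_files(root):
    # Phase 2: after the box comes back up, see what survived. Missing files
    # are expected (the copy was interrupted); files that exist but hold the
    # wrong bytes are the interesting failures.
    missing = corrupt = 0
    for i in range(FILE_COUNT):
        try:
            with open(path_for(root, i), "rb") as f:
                if f.read() != expected_bytes(i):
                    corrupt += 1
        except FileNotFoundError:
            missing += 1
    print("missing: %d  corrupt: %d" % (missing, corrupt))

if __name__ == "__main__":
    # e.g.  python pulltest.py write  /mnt/nas/pulltest
    #       python pulltest.py verify /mnt/nas/pulltest
    {"write": write_files, "verify": verify_files}[sys.argv[1]](sys.argv[2])

Scenario 3 is the same verify pass, just with the extra step of scribbling over a few sectors of one member disk before putting it back.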
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
1. Pretend the box has died. Try getting your data off the disks.
2. Kick off a metadata-heavy write (say, copying a bazillion tiny files) and yank the power-plug. Repeat a few times.
3. Fill the filesystem to capacity; power off and pull a disk; plug the disk into a PC and overwrite a few blocks with garbage; put the disk back in the NAS and power back on; read back every file on disk.

These scenarios are something I would be very interested to see when reading NAS reviews. We are in effect simulating failure and corruption, so we can see how the NAS product copes with it. Good idea.