I don’t understand unRAID

VladTepes

Junior Member
Jul 2, 2010
4
0
0
Hello wise users of this fine forum!

To make a long introduction short I’m a RAID and hardware newbie. I have a low understanding of the mechanics and inner workings of a hard drive but struggle to at least maintain a basic level of understanding.

Now I’ve just started reading up on unRAID and I’m just puzzled as to how it actually works because it did not make any intuitive sense to me. Considering I am fairly dense in general I need help visualizing how unRAID works.

So I’ll just construct a fictional scenario that involves a home server running unRAID with 10 physically identical hard drives, each being 100GB in size (I know they don't have to be, but it's easier to visualize).

Now the unRAID is configured with 1 disk for parity (assume I don’t understand what the hell this means, even though I’ve tried to read up on it (http://searchstorage.techtarget.com/sDefinition/0,,sid5_gci212748,00.html)) and 9 disks for storage. I have a hypothetical TV-show which consists of 9 seasons; each season is 90GB worth of data.

I label and map through my unRAID the 9 different disks to store the 9 separate seasons, 1 through 9.

Now for the part that feels unintuitive: I’m reading that should any of the 9 disks used for storage fail (let's say disk #7 that contains season 7) I could just pop in a new one and voila, the unRAID will reconstruct the data lost from disk #7?

How on earth does this work? How does the parity disk keep track of the data on all the 9 disks? If the 9 storage disks work completely in isolation I just don’t understand how this works. I mean if it can reconstruct the data on disk #7, doesn’t that mean the data has to be ON the parity disk? How can that be if there’s only room for 100GB on the parity disk? It’s not like it knows what disk will fail in advance and thus it can prepare (hehe)…

Likewise, if two disks fail at the same time, then you’re going to lose the data on both those drives as no reconstruction will be possible (?). If so, why? Why can’t it reconstruct the data for at least one of those two disks in that scenario? Something basic probably eludes me here!

Is there anyone who can explain this in a way that is easy or moderately easy to understand? I can accept (but I’d rather not :p) that I simply won’t understand it unless I learn more about the underlying mechanics if that’s the only answer.

Thanks for reading!
 

veri745

Golden Member
Oct 11, 2007
1,163
4
81
Let's reduce your example to 3 disks. For each bit on the two storage disks, there is a parity bit that is calculated and stored on the parity drive. Let's assume that the first bit of data on drive A is a '0' and the first bit of data on drive B is a '1'.

Parity can be "even" or "odd". Let's assume we're using "even" parity, which means that the number of '1's across all drives, including the parity drive, will be even for a given bit. In this example, bit 0 of the parity drive will be a '1'.

This is a table of all the possibilities of bit combinations for two drives, and what the parity bit would be for that combination:
Code:
driveA  driveB  parity
0       0       0
0       1       1
1       0       1
1       1       0
If you lose drive A, and a given bit is '1' on drive B and '1' on the parity drive, then you know that the data you lost on drive A is a '0', since the number of '1's across all the drives must be even.

Is that helpful? I would think the wikipedia article on parity would be just as helpful, if not more so.
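
For anyone who prefers code to tables, here is a minimal Python sketch of the same idea. It assumes even parity, which for single bits is the same as XOR. The function names `parity_bit` and `recover_bit` are made up for illustration and have nothing to do with unRAID's actual code.

```python
def parity_bit(a: int, b: int) -> int:
    """Even parity for two data bits: XOR makes the total count of 1s even."""
    return a ^ b


def recover_bit(surviving: int, parity: int) -> int:
    """Rebuild the lost data bit from the surviving data bit and the parity bit."""
    return surviving ^ parity


# Walk through every row of the table above.
for a in (0, 1):
    for b in (0, 1):
        p = parity_bit(a, b)
        assert (a + b + p) % 2 == 0      # the number of 1s is always even
        assert recover_bit(b, p) == a    # drive A lost: rebuilt from B + parity
        assert recover_bit(a, p) == b    # drive B lost: rebuilt from A + parity
        print(a, b, p)
```

The same XOR that computes the parity bit also rebuilds the missing bit, which is why it doesn't matter in advance which drive is the one that fails.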
http://en.wikipedia.org/wiki/Parity_bit
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Something many people do not know is that you can do parity on single files without RAID if you want to make sure they can be recovered. Good for when you want to transfer files to other media or make backups.

You can use the program QuickPar, which generates parity files much smaller than the total file size; if parts of the original are corrupt, it can restore them for you.
http://www.par2.net/index.php
 

FishAk

Senior member
Jun 13, 2010
987
0
0
So can Unraid really keep track of data for 9 drives on just 1 parity drive? That sounds like a good trick. Too good.
 

poofyhairguy

Lifer
Nov 20, 2005
14,612
318
126
FishAk, Unraid can keep track of 20 drives with one parity drive. That is part of the reason (along with the ability to mix and match drives of any size and speed) that it is so worth the money.

This thread is awesome too, I have often wondered how parity worked...
 

veri745

Golden Member
Oct 11, 2007
1,163
4
81
Simple parity can be used to preserve data from a SINGLE drive failure for any number of drives (although scaling the parity calculation to that many drives becomes prohibitively time intensive).

The problem is that it is slow (data must be read from all N drives when writing to one of them to calculate parity) and if you lose more than one drive, you are screwed. If you put 20 drives in your system, it becomes much more likely that more than one will fail at the same time.
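
To make the single-failure limit concrete, here is a rough Python sketch of the original poster's scenario: nine data drives plus one parity drive. The 4-byte "drives" and the helper `xor_blocks` are purely hypothetical stand-ins, and this only illustrates the parity math, not how unRAID is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Nine hypothetical data drives, each holding a tiny 4-byte "disk image".
drives = [bytes([i, i * 2, i * 3, 255 - i]) for i in range(1, 10)]

# The parity drive is the XOR of all data drives, so it is the size of ONE drive.
parity = xor_blocks(drives)
assert len(parity) == len(drives[0])

# Drive #7 (index 6) fails: XOR the eight survivors with the parity drive.
survivors = drives[:6] + drives[7:]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == drives[6]

# Drives #7 and #8 both fail: all that can be recovered is the XOR of the two
# lost drives, which is not enough to tell what each one held on its own.
survivors = drives[:6] + drives[8:]
combined = xor_blocks(survivors + [parity])
assert combined == xor_blocks([drives[6], drives[7]])
```

The parity drive never holds a copy of anyone's data. It holds one equation per bit position, and one equation can only be solved for one unknown. With two drives missing there are two unknowns, so neither can be pinned down.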
 

pjkenned

Senior member
Jan 14, 2008
630
0
71
www.servethehome.com
veri745 said:
If you put 20 drives in your system, it becomes much more likely that more than one will fail at the same time.

Like 15% chance of surviving <365 days IIRC. 3yr is over 30% chance.

And those numbers assumed either hot swaps or <4hr failed drive replacement time. That's with 1TB drives too. Hence why single parity is not really an option for large arrays and even at 10 drives it gets scary. 3-4 drive single parity arrays are not too bad though.
 

veri745

Golden Member
Oct 11, 2007
1,163
4
81
pjkenned said:
Like 15% chance of surviving <365 days IIRC. 3yr is over 30% chance.

And those numbers assumed either hot swaps or <4hr failed drive replacement time. That's with 1TB drives too. Hence why single parity is not really an option for large arrays and even at 10 drives it gets scary. 3-4 drive single parity arrays are not too bad though.

At least for unRaid you only lose data on the drives that fail in the case of a multiple-drive failure. If a RAID5 array fails you would lose the entire array. But still, there are definite pros and cons of unRAID, just like any other redundancy methodology.
 

Barnaby W. Füi

Elite Member
Aug 14, 2001
12,343
0
0
veri745 said:
The problem is that it is slow (data must be read from all N drives when writing to one of them to calculate parity) and if you lose more than one drive, you are screwed.

That really sucks. If you write to a drive, all of them must spin up. Not good for power, cooling, or noise.

It seems like every variation on RAID has pretty serious pitfalls these days. Maybe it's better to just figure out how to divide up your data, and manually distribute it among the different drives. And use backups for redundancy. I get the feeling a lot of people here are mostly interested in storing lots of media on home servers, so high availability is really not needed.
 

veri745

Golden Member
Oct 11, 2007
1,163
4
81
Barnaby W. Füi said:
That really sucks. If you write to a drive, all of them must spin up. Not good for power, cooling, or noise.

It seems like every variation on RAID has pretty serious pitfalls these days. Maybe it's better to just figure out how to divide up your data, and manually distribute it among the different drives. And use backups for redundancy. I get the feeling a lot of people here are mostly interested in storing lots of media on home servers, so high availability is really not needed.

If you're just storing media, you're probably not going to be re-writing a whole lot of data, and speed isn't particularly important.

I don't have an unRAID box yet, but my plan is to create one as soon as I get the motivation and the right CPU for it.
 

pjkenned

Senior member
Jan 14, 2008
630
0
71
www.servethehome.com
veri745 said:
If you're just storing media, you're probably not going to be re-writing a whole lot of data, and speed isn't particularly important.

I don't have an unRAID box yet, but my plan is to create one as soon as I get the motivation and the right CPU for it.

unRaid read speeds are not fantastic either. Again, there is a reason the rest of the storage industry moved away from RAID 4: at this point it is supported by only one major storage vendor, as backwards legacy support, and by a tiny software company peddling it to consumers.