could someone explain raid 5 / parity to me?

Maezr · Aug 8, 2006

my understanding is that you can have three identical drives with raid 5. the data is striped across two disks and the third disk stores parity info.

in the event one of the first two disks fails, the lost data can be rebuilt via the parity.

how is this possible? say I have data on my array that is already as compressed as it can possibly be. how can ALL of that info/data be recovered when one of the disks fails? doesn't this imply that everything can be compressed to half the size it currently is? obviously that can't be right. what am I missing?

DaveSimmons · Aug 8, 2006

wiki, google, AT FAQ, ...

Think of this:
(10 + 1 + 2 + 3) = 16

Given the 16 (parity) if you remove _any_ one of the other numbers I can tell you what it was, by comparing the remaining numbers to the parity number. Try it!

The trick is the parity is not all that you have, you still have the other drives too. (10 + x + 2 + 3) = 16, solve for x.
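Here is a minimal Python sketch of that arithmetic, treating the sum as the "parity". The function names `make_parity` and `recover_missing` are made up for illustration, and a plain sum is only a stand-in for what a real controller computes.

```python
def make_parity(values):
    # The "parity" in this toy example is just the sum of all data values.
    return sum(values)


def recover_missing(survivors, parity):
    # Whatever the surviving values do not account for must be the lost one.
    return parity - sum(survivors)


data = [10, 1, 2, 3]
parity = make_parity(data)   # 16, as in the example above

survivors = [10, 2, 3]       # the value 1 has been "lost"
missing = recover_missing(survivors, parity)
print(missing)
```

Note that the recovery needs every surviving number as well as the parity. Lose two numbers and there is no longer a single answer.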

Arcanedeath · Aug 8, 2006

the parity data is stored on all the drives, not just one, so if one drive fails the other 2+ have enough info to rebuild the missing data. think of it like this: in a 3 drive array each drive holds 2/3 data and 1/3 parity. when a drive dies, the other 2 drives use their data plus their 1/3 parity to rebuild everything that was on the dead drive, and then the array generates new parity info across all the drives. Check out Storagereview.com for clear and concise info on raid of all types and how it works.
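A rough Python sketch of how parity can rotate across the drives from one stripe to the next. The rotation order used here and the helper name `stripe_layout` are assumptions for illustration; real controllers use various layouts.

```python
def stripe_layout(num_disks, stripe):
    # Assumed rotation: parity starts on the last disk and moves one disk
    # to the left with each stripe. Real controllers may rotate differently.
    parity_disk = (num_disks - 1 - stripe) % num_disks
    layout = []
    block = 0
    for disk in range(num_disks):
        if disk == parity_disk:
            layout.append("P")           # this disk holds parity for the stripe
        else:
            layout.append(f"D{block}")   # this disk holds a data block
            block += 1
    return layout


for stripe in range(3):
    print(stripe, stripe_layout(3, stripe))
```

Over any three consecutive stripes of a 3 drive array, each drive ends up with one parity block and two data blocks, which is where the "2/3 data and 1/3 parity" figure comes from.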

Maezr · Aug 8, 2006

Originally posted by: DaveSimmons
wiki, google, AT FAQ, ...

Think of this:
(10 + 1 + 2 + 3) = 16

Given the 16 (parity) if you remove _any_ one of the other numbers I can tell you what it was, by comparing the remaining numbers to the parity number. Try it!

The trick is the parity is not all that you have, you still have the other drives too. (10 + x + 2 + 3) = 16, solve for x.

I understand what you're saying.

what I don't understand is why this same concept can't be used to achieve better compression via, say, zip files online; if you don't need all of the data to rebuild it, why not just store the data you need to rebuild it?

Arcanedeath · Aug 8, 2006

because you actually lose space when you use raid to store the parity data. it's for data protection, not compression. a raid 5 array has a capacity of n-1 drives, where n is the number of hard drives, i.e. 3 drives only give you 2 drives' worth of capacity and the remaining drive's worth is used for parity.
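The n-1 rule as a quick Python sketch. The function name `raid5_usable_capacity` and the 500 GB drive size are made-up examples, and it assumes all drives are the same size.

```python
def raid5_usable_capacity(num_disks, disk_size_gb):
    # RAID 5 needs at least three drives; one drive's worth of space
    # (spread across the array) goes to parity.
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (num_disks - 1) * disk_size_gb


for n in (3, 4, 8):
    usable = raid5_usable_capacity(n, 500)
    print(n, "drives:", usable, "GB usable of", n * 500, "GB raw")
```

The usable space is always less than the raw space, so nothing is being compressed.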

DaveSimmons · Aug 8, 2006

^ Exactly, the parity isn't compressing the data.

You start with (say) 10 integer numbers and 1 parity number. You've achieved a 10% increase in size.

That 11th parity number doesn't decompress back to the original 10 numbers, it only lets you take 9 of those numbers plus the 11th number to get back the missing 10th number. By itself the parity number is worthless. With 8 (out of 10) numbers instead of 9 the parity number is also useless.

It's error correction, not compression.

Maezr · Aug 8, 2006

Originally posted by: DaveSimmons
^ Exactly, the parity isn't compressing the data.

You start with (say) 10 integer numbers and 1 parity number. You've achieved a 10% increase in size.

That 11th parity number doesn't decompress back to the original 10 numbers, it only lets you take 9 of those numbers plus the 11th number to get back the missing 10th number. By itself the parity number is worthless. With 8 (out of 10) numbers instead of 9 the parity number is also useless.

It's error correction, not compression.

but when a disk fails, you'd lose 5 out of 10, not 1 out of 10, no?

if 5 numbers + 1 can be used to rebuild it back to 10, isn't that taking up less space than it was originally? now do you see what I mean?

DaveSimmons · Aug 8, 2006

No, each hard drive stores only 1 of the numbers, not 5 of them.

This is an over-simplification though, since like Arcanedeath said the parity number and other numbers are mixed up across all the drives.

The key point is that if the data is split into say 5 chunks plus parity, you still need 4 of the 5 chunks plus the parity chunk to recover your data. That's 4 + 1 = 5 chunks, which is not any smaller than the size of your data before you added error correction.

Maezr · Aug 8, 2006

how is it that you'd still have four out of five chunks after one of the drives (essentially 50% of the data in a three disc raid 5 array) is lost though?

Aluvus · Aug 8, 2006

Originally posted by: Maezr
how is it that you'd still have four out of five chunks after one of the drives (essentially 50% of the data in a three disc raid 5 array) is lost though?

They are describing systems with more than 3 disks.

Maezr · Aug 8, 2006

but the data on a three disk system is still recoverable. I'm just not understanding how.

if you started with 10 pieces of information and one drive fails, you'd now have 5. how can 5 + 1 piece of parity info be used to rebuild the initial data that totaled 10 pieces of information?

that's my question, really, and I haven't seen anything here (or elsewhere) that can explain that.

Bobthelost · Aug 8, 2006

Originally posted by: Maezr
but the data on a three disk system is still recoverable. I'm just not understanding how.

if you started with 10 pieces of information and one drive fails, you'd now have 5. how can 5 + 1 piece of parity info be used to rebuild the initial data that totaled 10 pieces of information?

that's my question, really, and I haven't seen anything here (or elsewhere) that can explain that.

You're getting confused. Very confused.

With a 3 disc array you take each file coming in and break it in half, part A goes to disk A, part B goes to disc B and the parity is calculated and sent to disc C.

The reason they were talking about 10 pieces of info is to make it clearer with larger scale examples (like a 10 disc array, where the first 1/10th goes to A, the second 1/10th goes to B etc.). If you wanted to store 10 files with your array then each file would be ripped in half and stored with half on A, half on B and the parity on C.

Each file is stored with 50% on each drive and the parity information on the third. It's done on a much lower level than what you're thinking of. We're talking about kilobytes here when we talk about information being split up.

Maezr · Aug 8, 2006

I'm most curious about a 3 disk setup.

in such a setup, as you said, half of the data is stored on disk A, and half on disk B, and a parity file is placed on disk C.

let's say disk A dies. how can disk B + disk C be used to restore 100% of the information stored on disk A?

Bobthelost · Aug 8, 2006

Originally posted by: Maezr
I'm most curious about a 3 disk setup.

in such a setup, as you said, half of the data is stored on disk A, and half on disk B, and a parity file is placed on disk C.

let's say disk A dies. how can disk B + disk C be used to restore 100% of the information stored on disk A?

Ok, the information stored on:
A = 5
B = 6
C(Parity) = 11

A dies. The controller notices this, checks the parity number (11) and subtracts the value from the surviving disc B (6) which results in the value A would have held.
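To tie this back to the "10 pieces of information" question, here is a small Python sketch using the same add-and-subtract parity. The values on each disk are made up. The point it illustrates is that the parity disk holds one parity value per pair of data values, so it is as big as each data disk.

```python
# 10 pieces of data, split 5 and 5 across the two data disks (made-up values).
disk_a = [5, 1, 4, 2, 8]
disk_b = [6, 3, 7, 9, 0]

# One parity value per position, so disk C holds 5 values too, not 1.
disk_c = [a + b for a, b in zip(disk_a, disk_b)]

stored_before = len(disk_a) + len(disk_b) + len(disk_c)   # 15 values for 10 of data

# Disk A dies. 10 values survive (5 data on B + 5 parity on C),
# and those 10 are what rebuild the 5 lost ones.
stored_after = len(disk_b) + len(disk_c)
rebuilt_a = [p - b for p, b in zip(disk_c, disk_b)]

print(stored_before, stored_after, rebuilt_a)
```

So it is never "5 + 1 rebuilds 10". It is 5 + 5 rebuilding the missing 5.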

This was already explained above by DaveSimmons.

DaveSimmons · Aug 8, 2006

The parity chunk is _the same size_ as the A and B chunks.

think of it letter by letter:

Disk 1 H L O
Disk 2 E L !
Disk 3 p p p
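The same letter-by-letter picture as a Python sketch, this time using XOR, which is the operation RAID 5 parity is normally built on instead of addition. The helper name `xor_bytes` is made up, and the three-byte "disks" are just the letters from the table above.

```python
def xor_bytes(x, y):
    # XOR two equal-length byte strings position by position.
    return bytes(a ^ b for a, b in zip(x, y))


disk1 = b"HLO"
disk2 = b"EL!"
disk3 = xor_bytes(disk1, disk2)      # one parity byte per letter pair

# Disk 1 dies: XOR the two survivors to get its contents back.
rebuilt_disk1 = xor_bytes(disk2, disk3)

# The same trick works if Disk 2 is the one that dies.
rebuilt_disk2 = xor_bytes(disk1, disk3)

print(len(disk3), rebuilt_disk1, rebuilt_disk2)
```

XOR has the handy property that applying it twice with the same value gets you back where you started, so the rebuild step is the same operation as the parity calculation.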

Arcanedeath · Aug 8, 2006

check the raid faq at storagereview it will answer all questions http://faq.storagereview.com/

GreenGhost · Aug 8, 2006

Originally posted by: Arcanedeath
check the raid faq at storagereview it will answer all questions http://faq.storagereview.com/

Great link, more specifically: http://www.storagereview.com/guide2000/ref/hdd/perf/raid/concepts/genParity.html

- copying data to my new raid 5 as I type -
