Raid 5 vs Raid 10

ncalipari

Senior member
I often read that RAID 10 is preferred over RAID 5 for reliability and performance.

I would like to share my thoughts with you for criticism and comments.



1) Small writes are surely slower, and this might be a problem for some. However, if you use Linux and set up software RAID 5, you can get performance equal to or greater than RAID 10 by using an aggressive write cache (RAM is so cheap that for $60 you can have an 8 GB write cache), a modern filesystem (ZFS or ext4) and lazy flushing (240 seconds or more). Obviously a UPS is mandatory.

2) RAID 10's reliability is actually worse. I built a statistical model that shows this.

Hypotheses:

1) Each disk failure is independent of the others. This means that disk failures depend on internal causes rather than external ones, such as a buggy controller, misplaced disks, improper mechanical stress, and so on.

2) The declared MTBF is a lie. Take for example a Caviar Green 2 TB: it has a declared MTBF of 1.2 million hours, or about 136 years, which I consider an outright lie, at least for continuous use. A more reasonable MTBF of 10 years is assumed instead.

Some may view this as controversial, also because if we take the original MTBF as true, the conclusions I reach no longer hold. However, the idea that a disk, ANY disk, can work for 136 years, even at idle, is simply unrealistic.

3) Critical failure is defined as:

RAID 5 = 2 disks are broken at a given time, thus data is lost.

RAID 10 = 3 disks, or the two disks of the same mirrored pair, are broken at a given time, thus data is lost.

Some readers may see this as another critical hypothesis, because it does not take into account that when a disk fails in RAID 10 it can be replaced without data loss.

Anyhow, I did not consider this, for several reasons:
- Over very large numbers (e.g. a datacenter), the two models behave the same.

- Replacing a disk is a cost and a burden, so my model leans toward minimal maintenance.

- With RAID 5, too, you can replace a disk after a failure without data loss.

- Calculations would have been much more complex 😛



Conclusions


After one year, a critical failure has a probability of 2.8% on RAID 5 and 8.6% on RAID 10. After two years it is 10.4% for RAID 5 and 28.8% for RAID 10.

This shows us that the benefit of having two disks' worth of redundancy is outweighed by the cumulative risk of running 4 disks rather than 3.

The critical point is that if the wrong two disks fail (i.e. two disks that belong to the same mirrored pair), then the array is compromised.

Obviously RAID 6 is safer than RAID 5 and RAID 10, because any two disks can fail without a problem.

Thus RAID 5 is the best in reliability per dollar, performance per dollar and features over complexity in the RAID segment (RAID 0, 1, 10, 01, 6).
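The hypotheses above can be turned into a short script. This is a minimal sketch under my own reading of the model: independent failures, no disk replacement, a simple linear per-disk failure probability p = t/MTBF (so p = 0.1 after one year with a 10-year MTBF), and RAID 10 counted as lost when both disks of some mirrored pair have failed. The function names are mine and the output need not match the spreadsheet.

```python
# Critical-failure probability under the stated hypotheses:
# independent failures, no replacement, p = t / MTBF per disk.

def disk_fail_prob(years, mtbf_years=10.0):
    """Per-disk failure probability after `years` (simple linear model)."""
    return min(years / mtbf_years, 1.0)

def raid5_loss(p, n=3):
    """RAID 5 with n disks is lost once 2 or more disks have failed."""
    survive = (1 - p) ** n                      # zero failures
    one_down = n * p * (1 - p) ** (n - 1)       # exactly one failure
    return 1 - survive - one_down

def raid10_loss(p, pairs=2):
    """RAID 10 is lost once both disks of some mirrored pair have failed."""
    return 1 - (1 - p * p) ** pairs

def raid6_loss(p, n=4):
    """RAID 6 with n disks survives any two failures."""
    survive = (1 - p) ** n
    one_down = n * p * (1 - p) ** (n - 1)
    two_down = n * (n - 1) / 2 * p * p * (1 - p) ** (n - 2)
    return 1 - survive - one_down - two_down

for years in (1, 2):
    p = disk_fail_prob(years)
    print(f"{years} yr: RAID5 {raid5_loss(p):.1%}  "
          f"RAID10 {raid10_loss(p):.1%}  RAID6 {raid6_loss(p):.1%}")
```

With this particular counting the 3-disk RAID 5 figures come out at 2.8% after one year and 10.4% after two; the RAID 10 figure is very sensitive to exactly how pair failures are modeled, so treat any single number with caution.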



Send me a PM with your email address if you want to receive an ODT spreadsheet that takes MTBF, years and hours of operation and returns the critical-failure probability for RAID 5 and 10.
 
i swear by raid5 for my servers. my exchange server ran for 4 years straight with occasional reboots for patches and updates, and i never once had a disk failure. thinking of doing raid6 for a new huge backup NAS: 2 arrays with 10x 2TB drives each. call me old-fashioned, but i feel safer with a hardware raid card.
 
Did your model account for rebuild times with large drives?

Well, it's almost zero compared with the lifetime of a disk.
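As a rough sanity check on that claim (the figures below are my own assumptions, not from the thread): a full rebuild of a 2 TB drive is a window of hours, against the 10-year MTBF assumed above, so the chance of a second failure landing inside that window is small.

```python
# Rough estimate: chance that another disk fails during a rebuild window.
# Assumed figures (not from the thread): 2 TB drive, ~50 MB/s sustained
# rebuild rate, 10-year MTBF, exponential failure model.
import math

DRIVE_BYTES = 2e12
REBUILD_RATE = 50e6            # bytes/second, deliberately pessimistic
MTBF_HOURS = 10 * 8760         # the 10-year MTBF assumed in the thread

rebuild_hours = DRIVE_BYTES / REBUILD_RATE / 3600

def second_failure_prob(survivors):
    """Probability that one of `survivors` remaining disks fails
    before the rebuild completes."""
    per_disk = 1 - math.exp(-rebuild_hours / MTBF_HOURS)
    return 1 - (1 - per_disk) ** survivors

print(f"rebuild ≈ {rebuild_hours:.1f} h")
print(f"P(2nd failure, 2 surviving disks) ≈ {second_failure_prob(2):.4%}")
```

This only counts whole-disk failures; it ignores unrecoverable read errors hit during the rebuild, which is the usual argument for RAID 6 with large drives.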

RAID 5 is essentially obsolete. RAID 6 is far more appropriate.

Well I wouldn't say that. Some data requires no parity, some data requires some parity, some requires high parity.

JBOD, RAID 5 and RAID 6 all have their place in the right context.

Here's a more thorough model (at least based on your description of what you accounted for):

http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl

This is a ZFS specific post but interpret it as:

RAID 10 = 2 way mirror
RAID 5 = RAIDZ
RAID 6 = RAIDZ2

Viper GTS

That model is very rough, as is somewhat acknowledged in its description.

If we take into consideration the MTBF declared by manufacturers, then it is true.

If we consider the real MTBF (on the order of tens of years rather than centuries), then RAIDZ is above 2-way mirroring in reliability.

Changing the MTBF from centuries to decades, as it really is, shakes that graph up quite a bit.
 
I often read that RAID 10 is preferred over RAID 5 for reliability and performance.

2) RAID 10's reliability is actually worse. I built a statistical model that shows this.

2) The declared MTBF is a lie. Take for example a Caviar Green 2 TB: it has a declared MTBF of 1.2 million hours, or about 136 years, which I consider an outright lie, at least for continuous use. A more reasonable MTBF of 10 years is assumed instead.

Some may view this as controversial, also because if we take the original MTBF as true, the conclusions I reach no longer hold. However, the idea that a disk, ANY disk, can work for 136 years, even at idle, is simply unrealistic.

You misunderstand what MTBF is. MTBF is not expected drive life. An MTBF of 136 years does not mean that a drive would be expected to run for 136 years.

MTBF is a statistical measure of reliability of drives during their design lifetime. An MTBF of 136 years means that if you have 136 drives (all within the warranty period, and if a drive goes out of warranty it is replaced with a new one which is within warranty) and you use them for 1 year, then you would expect, on average, 1 drive to fail during that year.
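The fleet-level reading of MTBF above can be checked in a couple of lines. This is just the standard MTBF-to-annualized-failure-rate conversion applied to the drive quoted earlier:

```python
# MTBF describes a fleet, not one drive's lifespan.
HOURS_PER_YEAR = 8760
mtbf_hours = 1.2e6                         # the Caviar Green figure quoted above

afr = HOURS_PER_YEAR / mtbf_hours          # annualized failure rate per drive
print(f"AFR ≈ {afr:.2%}")                  # well under 1% per drive-year

fleet = 136                                # drives kept within warranty
expected_failures = fleet * afr
print(f"expected failures in 1 year ≈ {expected_failures:.2f}")  # ≈ 0.99
```

So a 1.2-million-hour MTBF is a claim about roughly one failure per 136 drive-years of operation, not about any single drive surviving 136 years.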


I've put together a detailed statistical model of drive failure in RAID arrays, which allows you to investigate different drive reliability, error rate, RAID level and rebuild time parameters.

You can download the Excel file from:
MTTDLcalc.xlsx
 
hp servers put every other pair in a separate chassis with separate dual-ported access to the drives and a separate power domain, so you can lose an entire chassis in a dl380 and keep on rocking. when the power wire fails on half of the drives, you still got 4 or 8 drives kicking with raid-10. with raid-6 you are game over.

This is a very real scenario. the number 1 rule also applies to external storage: you always build for redundant chassis, because someone will miscable, yank, unplug, or otherwise fail an entire chassis. iirc one best practice was 8 drives per chassis, so you do 7x7 or 8x8 disks to ensure you have 1 spare per chassis and that you can survive both a horizontal and a vertical failure without having your hotspare move to another chassis, which would then put you at a SPOF.

raid-5/6/Z2 requires more ops than raid-10, which is simply a striped set of mirrors. the extra writes will add up eventually with SSDs 🙂
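The op-count difference can be made concrete. For a small random write that touches a single chunk, the classic read-modify-write accounting (standard RAID arithmetic, not something stated in this thread) looks like:

```python
# Backend I/Os per small random write (read-modify-write path):
# RAID 10: write the data to both mirrors.
# RAID 5:  read old data + old parity, write new data + new parity.
# RAID 6:  same, but with two parity blocks.
WRITE_COST = {
    "raid10": 2,          # 2 writes
    "raid5": 4,           # 2 reads + 2 writes
    "raid6": 6,           # 3 reads + 3 writes
}

for level, ios in WRITE_COST.items():
    print(f"{level}: {ios} backend I/Os per small write "
          f"({ios / WRITE_COST['raid10']:.0f}x raid10)")
```

Full-stripe writes avoid the extra reads entirely, which is one reason a large write cache (as discussed earlier in the thread) helps parity RAID so much.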

the more cache you throw at it, the more critical that battery becomes. can you imagine a hard system lock with 8gb of in-flight data? geez.

i have 4 HP LeftHand units (which are raid-5) that can then be network raid-5/0/10, so i get the cache part.
 