ALL ABOUT RAID!

islandtechengineers

Senior member
Feb 3, 2004
I know there's tons of information out there about RAID, and all it takes is for the seeker to spend some time researching. I'm posting this because when I searched, I went through hundreds of posts to learn about RAID. I hope this post will help others learn about it without having to spend a lot of time searching! Of course, it may take a while for them to stumble upon this particular post, so for the time being, here's some info that I've found useful about RAID!!

By the way, some of this is repetitive and improperly quoted, but the info's there! :)
I'm not a thief and I'm not going to take credit for others' work. After this sentence, most of the content is QUOTED!




From: Network + Certification Training Book, 2nd edition, Microsoft Press
Location for info in regards to this source: http://www.microsoft.com/mspress/support/

• Mirroring. Disk mirroring is an arrangement in which two identical hard disk drives connected to a single host adapter always contain identical data. The two drives appear to users as one logical drive, and whenever anyone saves data to the mirror set, the computer writes it to both drives simultaneously. If one hard drive unit should fail, the other can take over immediately until the malfunctioning drive is replaced. Many operating systems support disk mirroring. The two main drawbacks of this technique are that the server provides only half of its available disk space to users, and although mirroring protects against a drive failure, a failure of the host adapter or the computer can still render the data unavailable.

• Duplexing. Disk duplexing provides a higher degree of data availability by using duplicate host adapters as well as disk drives. Identical disk drives on separate host adapters maintain exact copies of the same data, creating a single logical drive, just as in disk mirroring, but with duplexing, the server can survive either a disk failure or a host adapter failure and still make its data available to users.

• Volumes. A volume is a fixed amount of data storage space on a hard disk or other storage device. On a typical computer, the hard disk drive may be broken up into multiple volumes to separate data into discrete storage units. You can create multiple volumes on a single drive or create a single volume out of multiple drives. This latter technique is called drive spanning. You can use drive spanning to make all the storage space on multiple drives in a server appear to users as a single entity. The drawback of this technique is that if one of the hard drives containing part of the volume fails, the whole volume is lost.

• Striping. Disk striping is a method by which you create a single volume by combining the storage on two or more drives and writing data alternately to each one. Normally, a spanned volume stores whole files on each disk. When you use disk striping, the computer splits each file into multiple segments and writes alternate segments to each disk. This speeds up data access by enabling one drive to read a segment while the other drive's heads are moving to the next segment. When you consider that network servers might need to process dozens of file access requests at once (from various users), the speed improvement provided by disk striping can be significant. However, striped volumes are subject to the same problem as volumes that are spanned. If one drive in the stripe set fails, the entire volume is lost.

• Redundant array of independent disks (RAID). This is a comprehensive data availability technology with various levels that provide all of the functions described in the technologies previously listed. Higher RAID levels store error correction information along with the data, so that even if a drive in a RAID array fails, its data still remains available from the other drives. Although RAID is available as a software product that works with standard disk drives, many high-end servers use dedicated RAID drive arrays, which consist of multiple hard drive units in a single housing, often with hot swap capability. Hot swapping is when you can remove and replace a malfunctioning drive without shutting off the other drives in the array. This enables the data to remain continuously available to network users, even when the support staff is dealing with a drive failure.
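To make the striping idea above a bit more concrete, here is a minimal Python sketch (not from the book) of splitting data into segments and writing alternate segments to each drive. The function names `stripe` and `unstripe`, the in-memory lists standing in for drives, and the tiny 2-byte segment size are all assumptions for illustration only.

```python
def stripe(data: bytes, num_drives: int, segment_size: int) -> list[list[bytes]]:
    """Split data into fixed-size segments and deal them round-robin onto drives."""
    drives = [[] for _ in range(num_drives)]
    for i in range(0, len(data), segment_size):
        segment = data[i:i + segment_size]
        # Segment 0 goes to drive 0, segment 1 to drive 1, and so on, wrapping around.
        drives[(i // segment_size) % num_drives].append(segment)
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one segment from each drive in turn."""
    out = bytearray()
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out += drive[row]
    return bytes(out)


drives = stripe(b"ABCDEFGHIJ", num_drives=2, segment_size=2)
print(drives)            # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(unstripe(drives))  # b'ABCDEFGHIJ'
```

Note that every drive holds pieces of the same file, which is why losing any one drive in a plain stripe set loses the whole volume.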
RAID Level === RAID Technology === Description

0 === Disk striping === Enhances performance by writing data to multiple disk drives, one block at a time; provides no fault tolerance.

1 === Disk mirroring and duplexing === Provides fault tolerance by maintaining duplicate copies of all data on two drives. Disk mirroring uses two drives connected to the same host adapter, and disk duplexing uses two drives connected to different host adapters.

2 === Hamming error-correcting code (ECC) === Ensures data integrity by writing error-correcting code to a separate disk drive; rarely implemented.

3 === Parallel transfer with shared parity === Provides fault tolerance by striping data at the byte level across a minimum of two drives and storing parity information on a third drive. If one of the data drives fails, its data can be restored using the parity information.

4 === Independent data disks with shared parity === Identical to RAID 3, except that the data is striped across the drives at the block level.

5 === Independent data disks with distributed parity === Provides fault tolerance by striping both data and parity across three or more drives, instead of using a dedicated parity drive, as in RAID 3 and RAID 4.

6 === Independent disks with two-dimensional parity === Provides additional fault tolerance by striping data and two independent sets of parity information across four or more drives.

7 === Asynchronous RAID === Proprietary hardware solution that consists of a striped data array and a separate parity drive, plus a dedicated operating system that coordinates the disk storage activities.

10 === Striping of mirrored disks === Combines RAID 0 and RAID 1 by striping data across mirrored pairs of disks, thus providing both fault tolerance and enhanced performance.

53 === Striped array of arrays === Stripes data across multiple RAID 5 arrays, providing the same fault tolerance as RAID 5 with additional performance enhancement.

0+1 === Mirroring of striped disks === Combines RAID 0 and RAID 1 in a different manner by mirroring the data stored on identical striped disk arrays.
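As a rough companion to the table, the sketch below estimates usable space for the most common levels, assuming identically sized drives. The function name `usable_capacity` is made up for this example, and the minimum drive counts are the commonly cited ones (including four for RAID 6), not something taken from the book.

```python
def usable_capacity(level: str, num_drives: int, drive_size: float) -> float:
    """Return usable capacity (same unit as drive_size) for identical drives."""
    # Commonly cited minimum drive counts; treated here as assumptions.
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == "0":
        return num_drives * drive_size          # striping only, no redundancy
    if level == "1":
        return drive_size                       # every drive holds the same data
    if level == "5":
        return (num_drives - 1) * drive_size    # one drive's worth of parity
    if level == "6":
        return (num_drives - 2) * drive_size    # two drives' worth of parity
    # RAID 10: mirrored pairs striped together, so half of the raw space.
    if num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    return num_drives // 2 * drive_size


for level, count in (("0", 4), ("1", 2), ("5", 3), ("6", 4), ("10", 4)):
    print(f"RAID {level} with {count} x 100 GB drives:", usable_capacity(level, count, 100), "GB")
```

The nested and less common levels (2, 3, 4, 7, 53, 0+1) are left out to keep the sketch short.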





Informative posts I've found by other users on AnandTech!...





Member : crazyeddie
Location: http://forums.anandtech.com/messageview...atid=29&threadid=1571985&enterthread=y
Post :

"RAID 3 and RAID 5 use a combination of striping and parity to improve data reliability, maximize storage, and improve drive array performance. These types of RAID arrays are usually found in SCSI-based server storage solutions and involve a minimum of three hard drives.

Essentially, the data is divided into multiple parts, and those parts are written across the hard drives along with parity information calculated from them. For example, with three hard drives, data part A might be written to drive 1, part B to drive 2, and the parity of A and B to drive 3.

If any one of the three hard drives fails, all segments of the data are still available, because the missing part can be recalculated from the two surviving drives. While running normally, different segments of the data can be pulled from multiple drives simultaneously. Storage is improved vs. RAID 1 (mirroring) as well. In RAID 1, two 100 MB hard drives provide 100 MB of storage. In RAID 3/5, three 100 MB drives provide 200 MB of storage. If two drives fail, you're screwed.

There is a technical difference between how RAID 3 accomplishes this versus RAID 5, but off the top of my head I don't remember.

There is also the ability to add a "hot spare" to an array, which is essentially a stand-by drive that waits for an active drive in the array to fail. When an array drive fails, the RAID controller rebuilds it onto the hot spare with data pulled from the surviving disk drives. The array can then continue running with redundancy and the failed drive can be removed at the convenience of the server administrator. A file server or storage system with "hot swappable" drives can have the bad drive removed without shutting down the file server for maintenance.

Rebuilding an array on the fly can impact overall system performance and may not be desirable on a server or storage system under heavy load. In high load environments, RAID 3 and 5 arrays can be clustered so a backup array can take over while a failed drive in an array is replaced and the array rebuilt. Huge databases can also be striped across multiple RAID arrays for enhanced performance. You may see references to RAID 30/50 (31/51) or RAID 3+1/5+1 (3+0/5+0).

These types of sophisticated storage systems generally involve multi-channel SCSI RAID controllers and very expensive hard drives."
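RAID 3 and RAID 5 get their redundancy from parity, which is typically a byte-wise XOR of the data blocks, and rebuilding onto a hot spare uses the same calculation. Here is a minimal Python sketch of that idea. The helper name `xor_blocks` and the sample byte values are invented for illustration, and a real controller works on whole stripes rather than four-byte blocks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data drives plus one parity drive (three drives, two drives' worth of space).
data_a = b"\x10\x20\x30\x40"
data_b = b"\x0f\x0e\x0d\x0c"
parity = xor_blocks([data_a, data_b])

# Pretend the drive holding data_b died: rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)  # True
```

With only one parity block per stripe, there is nothing left to XOR against if a second drive fails, which matches the "if two drives fail" warning in the post.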



Member : NotquiteanooB
Location : http://forums.anandtech.com/messageview...atid=29&threadid=1571985&enterthread=y
Posts:

Read this article :

http://pcpitstop.com/tinylink.asp?MaxDisk




Member : erwos
Location: http://forums.anandtech.com/messageview...atid=29&threadid=1571985&enterthread=y
Posts:

RAID 0 is block-level striping. Two drives, minimum.

RAID 1 is mirroring. Two drives, minimum.

RAID 2 is bit-level striping with ECC. It's not used anymore. It required _14_ disks, and spindle-locked moves. See why no one uses it?

RAID 3 is an array of disks with a single parity disk, striped by byte. Three drives, minimum.

RAID 4 is RAID 3 with block-level striping.

RAID 5 is RAID 4 with distributed parity.

RAID 6 is double parity RAID 5. (allows you to lose two drives)

RAID 7 is a proprietary standard similar to RAID 3-6. Only SCC uses it.

There's no such thing as RAID 8-9.

RAID 10 is striped mirrors (i.e., you have two RAID 1s and then stripe them into a single RAID 0). RAID 0+1 is the opposite (you take striped arrays and mirror them). RAID 10 is faster than RAID 0+1. Neither is really a proper RAID level, since they're hybrid arrays. 4 drives, minimum.

RAID 50 is striped RAID 5 arrays. Again, not a real RAID level, just a hybrid. You could theoretically have RAID 0+5, too, except that there's no point (lower performance). 6 drives, minimum.

You can actually build interesting hybrid arrays on a standard PC now, due to the influx of cheap 4-bay Firewire enclosures onto the Internet. If you're not worried about bandwidth, you could conceivably chain 64+ hard drives together, allowing for such oddities as RAID 51 (mirrored parity - you can lose up to half plus one of your disks!).
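One practical difference between the two four-drive hybrids is which double failures they survive. The Python sketch below enumerates every two-drive failure, using the definitions from the table earlier in this post (RAID 10 = stripe of mirrored pairs, RAID 0+1 = mirror of striped sets). The function names and the drive numbering 0 to 3 are assumptions for illustration.

```python
from itertools import combinations


def raid10_survives(failed: set[int], mirror_pairs: list[tuple[int, int]]) -> bool:
    """A stripe of mirrors survives while every mirrored pair keeps at least one drive."""
    return all(not (a in failed and b in failed) for a, b in mirror_pairs)


def raid01_survives(failed: set[int], stripe_sets: list[tuple[int, ...]]) -> bool:
    """A mirror of stripes survives while at least one striped set has no failed drive."""
    return any(all(d not in failed for d in stripe) for stripe in stripe_sets)


# Drives 0-1 form one group and drives 2-3 the other, in both layouts.
groups = [(0, 1), (2, 3)]
two_drive_failures = [set(c) for c in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in two_drive_failures)
raid01_ok = sum(raid01_survives(f, groups) for f in two_drive_failures)
print(raid10_ok, raid01_ok, len(two_drive_failures))  # 4 2 6
```

Under these assumptions the stripe of mirrors survives four of the six possible double failures, while the mirror of stripes survives two.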






Member: airfoil
Location: http://forums.anandtech.com/messageview...atid=27&threadid=1572871&enterthread=y
Posts:
Take a look at this website to understand what RAID means and figure out which type you would like to go with.
http://www.acnc.com/04_00.html





Member: Matthias99
Location:
http://forums.anandtech.com/messageview...atid=50&threadid=1543869&enterthread=y
Posts:
A RAID1 array of N disks allows you to always run up to N parallel reads -- if you find RAID tests of something that does a lot of small, random parallel reads (for instance, a webserver), you'll find that RAID1 blows everything else out of the water. Its write performance, however, is terrible (as it doesn't scale up at all with the number of disks).

It also doesn't help a whole lot with STR (Sustained Transfer Rate). For a fairly small read (less than N stripes for an N-disk array), you can sort of 'pseudo-stripe' the read by having each disk read one of the N stripes you need. But beyond that, the disks need to reseek all the time to find the next stripe you haven't read yet, and that takes enough time to basically kill the benefit you get from doing the reads in parallel. Drives are MUCH faster when doing long linear reads than they are doing "Read 4K, seek, Read 4K, seek, Read 4K, seek, ...". With a striped RAID level (such as RAID0, RAID1+0, or RAID5), when you do one big read, each drive just reads linearly, so you get maximum STR. See crappy diagram below:

RAID1:
DISK1.....A.....B.....C.....D.....E.....F.....
DISK2.....A.....B.....C.....D.....E.....F.....
DISK3.....A.....B.....C.....D.....E.....F.....
DISK4.....A.....B.....C.....D.....E.....F.....

RAID0:
DISK1.....A.....E.....I.....
DISK2.....B.....F.....J.....
DISK3.....C.....G.....K.....
DISK4.....D.....H.....L.....

If you want to read blocks A-D, you can see that the RAID1 and RAID0 can both split the read up into four parallel operations. But if you wanted to read blocks A-L, you would get the following usage patterns:

RAID1:
DISK1: Seek to A, Read A, seek to E, read E, seek to I, read I
DISK2: Seek to B, Read B, seek to F, read F, seek to J, read J
DISK3: Seek to C, Read C, seek to G, read G, seek to K, read K
DISK4: Seek to D, Read D, seek to H, read H, seek to L, read L

RAID0:
DISK1: Seek to A, Read A, Read E, Read I
DISK2: Seek to B, Read B, Read F, Read J
DISK3: Seek to C, Read C, Read G, Read K
DISK4: Seek to D, Read D, Read H, Read L

The RAID0 doesn't have to seek when doing linear reads. That's why it's faster doing those, but slower at small random reads (since the RAID1 can do them all in parallel, whereas the RAID0 can only do them in parallel if they do not need to access the same disks).
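The seek patterns above can be reproduced with a small Python model. It assumes a very simplified disk where reading the physically next block costs nothing extra and anything else costs one seek. The function names and the round-robin way of splitting a RAID1 read are assumptions made for this sketch, not how any particular controller schedules reads.

```python
def count_seeks(offsets: list[int]) -> int:
    """Count head movements: one initial seek, plus one per non-adjacent jump."""
    seeks = 0
    previous = None
    for offset in offsets:
        if previous is None or offset != previous + 1:
            seeks += 1
        previous = offset
    return seeks


def raid0_offsets(num_blocks: int, num_disks: int) -> list[list[int]]:
    """RAID0: logical block b sits on disk b % n at physical offset b // n."""
    return [[b // num_disks for b in range(d, num_blocks, num_disks)]
            for d in range(num_disks)]


def raid1_offsets(num_blocks: int, num_disks: int) -> list[list[int]]:
    """RAID1: every disk holds block b at offset b; the read is dealt out round-robin."""
    return [list(range(d, num_blocks, num_disks)) for d in range(num_disks)]


# Reading blocks A-L (12 blocks) from 4 disks, as in the example above.
for name, layout in (("RAID1", raid1_offsets), ("RAID0", raid0_offsets)):
    print(name, [count_seeks(o) for o in layout(12, 4)])
# RAID1 [3, 3, 3, 3]
# RAID0 [1, 1, 1, 1]
```

Each RAID1 disk ends up seeking before every block it reads, while each RAID0 disk seeks once and then reads linearly.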





Member: ImpactDNI
Location:
http://forums.anandtech.com/messageview...atid=27&threadid=1581661&enterthread=y
Posts:
http://www.tweakers.net/reviews/515/1






Like I said, there's a lot more information out there, and I hope the above will help others learn about RAID! By all means, add on, comment, critique and shut me down :D !!