Originally posted by: Matthias99
snip -- exceedingly long discussion about RAID0 and why it is actually good because of I/O queueing
No, not I/O queueing; you did not read my post correctly. It is about parallelisation. With a single drive, all I/O requests are serviced serially: A, then B, then C. With a 2-disk RAID0 array it is possible to do two I/Os at once: A+B, then C+D, then E+F. That gives a theoretical performance increase of 100% for random I/O.
1) With RAID0, each particular block is only hosted on one drive. On average, each drive in a 2-way RAID0 can only service 50% of the random requests (25% each in a 4-way array), cutting into the efficiency.
If the I/Os land equally divided across all available disks, a linear performance increase is to be expected. That is the ideal situation, of course, and real-life workloads will be less evenly spread. But on average, a significant performance increase can be expected for random I/O. The major problem in practice is the quality of the RAID0 implementation and the filesystem logic, and Windows does badly on both, unfortunately.
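A quick way to see the "equally divided" point is to simulate it. The sketch below is a toy model of my own (not from any benchmark): it throws random block requests at an N-disk RAID0 array and takes the busiest disk as the completion time, which corresponds to the ideal case where plenty of independent requests are available:

```python
import random

def simulated_speedup(num_disks, num_requests=100_000, seed=42):
    """Distribute random block requests over the disks of a RAID0 array
    and report the speedup vs. a single disk. Completion time is set by
    the busiest disk, since the disks work in parallel."""
    rng = random.Random(seed)
    load = [0] * num_disks
    for _ in range(num_requests):
        # Random I/O: each request lands on a uniformly random stripe,
        # hence on a uniformly random disk.
        load[rng.randrange(num_disks)] += 1
    return num_requests / max(load)

for n in (1, 2, 4):
    print(f"{n} disk(s): ~{simulated_speedup(n):.2f}x")
```

With two disks the speedup comes out just under 2x, with four just under 4x: linear, minus the small imbalance you get from randomness.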
2) Most desktop programs don't queue I/Os deeply (if at all). Yeah, if you're running a webserver or something that might queue 100+ totally independent I/Os in the background, this factor can come into play. But it won't do much for a lot of desktop work.
It is true that desktop systems, due to their single-user application pattern, do not benefit as much from parallelisation as busy multi-user server systems do. But even at a queue depth of just 1, the worst possible situation, my numbers indicate a 63% performance increase. At a queue depth of 4, the increase is already 255%. These numbers are for strictly random I/O with transfer sizes ranging from 16KB to 128KB - very much non-sequential.
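To give the queue-depth effect some intuition, here is a toy probability model of my own (it is *not* the benchmark the 63%/255% numbers come from): with QD independent random requests outstanding on n disks, the expected number of disks kept busy in parallel is n*(1 - (1 - 1/n)^QD):

```python
def expected_parallelism(num_disks, queue_depth):
    """Expected number of distinct disks kept busy by `queue_depth`
    outstanding requests, each landing on a uniformly random disk."""
    n = num_disks
    return n * (1 - (1 - 1 / n) ** queue_depth)

for qd in (1, 2, 4, 8):
    print(f"2 disks, QD {qd}: {expected_parallelism(2, qd):.3f} disks busy")
# 2 disks: 1.000 at QD 1, 1.500 at QD 2, 1.875 at QD 4, 1.992 at QD 8
```

Note that in this simple model a queue depth of 1 gives no gain at all, so it clearly does not capture everything behind the measured numbers; it only shows how fast the parallelism ramps up once a few requests are outstanding.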
The one important assumption is that there is no misalignment between the stripe blocks and the filesystem clusters. On Windows this is a real problem, since Windows partitioning does nothing to prevent misalignment. It can be fixed manually, though. Unfortunately not many people know about it and the FAQs do not mention it - a shame!
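To make the misalignment concrete, here is a small sketch. The stripe size (64KB), cluster size (4KB) and partition offsets are just common example values I picked; it counts how many clusters end up straddling a stripe boundary, and each such cluster turns one logical I/O into two disk I/Os:

```python
STRIPE = 64 * 1024   # example stripe size
CLUSTER = 4 * 1024   # example filesystem cluster size

def disks_touched(offset_bytes, cluster_index):
    """Return how many stripe-sized chunks (hence disks) a single
    filesystem cluster touches, given the partition's byte offset."""
    start = offset_bytes + cluster_index * CLUSTER
    end = start + CLUSTER - 1
    return (end // STRIPE) - (start // STRIPE) + 1

# Legacy Windows partitions start at sector 63 (63 * 512 = 32256 bytes),
# which is not a multiple of the stripe size:
misaligned = sum(disks_touched(63 * 512, i) > 1 for i in range(16384))
# Manually aligned to a 1 MiB boundary, no cluster straddles a stripe:
aligned = sum(disks_touched(1024 * 1024, i) > 1 for i in range(16384))
print(misaligned, aligned)
```

With the sector-63 offset, one in every sixteen clusters straddles a stripe boundary and costs a second disk access; with the aligned offset, none do.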
"real" hardware controllers also usually provide a substantial amount of cache onboard, and will frequently do prefetching as well. Also, most RAID controller benchmarks involve server programs (databases, webservers, email servers, etc.) that deeply queue I/Os. IMO, this is not really representative of a typical desktop workload.
Agreed, but my point was that access time is useless when comparing a single disk to a multi-disk RAID system. While the latter might not have a lower access time, it can still process a stream of I/Os more quickly. Access time measures only ONE I/O request. RAID0 might not process that SINGLE request any faster, but it will process a *bunch* of I/O requests faster.
So someone cannot say "hey, this Raptor here has a lower access time than my 24-disk hardware RAID0 array and thus is faster". No way. Access time is hyped and widely misunderstood.
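To put some numbers on that (made up, purely illustrative): say the Raptor seeks in 8ms and each disk in the array in a sluggish 12ms. One request at a time, the Raptor wins; with a deep queue of independent requests, the array walks all over it:

```python
RAPTOR_MS = 8.0    # hypothetical access time of the single Raptor
ARRAY_MS = 12.0    # hypothetical (worse) access time per array disk
DISKS = 24

raptor_iops = 1000 / RAPTOR_MS
# With enough independent requests queued, all 24 spindles seek at once:
array_iops = DISKS * (1000 / ARRAY_MS)
print(f"Raptor: {raptor_iops:.0f} IOPS, array: {array_iops:.0f} IOPS")
```

The single disk answers any one request half again as fast, yet the array sustains an order of magnitude more random I/Os per second - which is exactly why quoting access time alone settles nothing.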