Does Raid 0 actually improve performance?


pelikan

Diamond Member
Dec 28, 2002
Will raid 0 load a game faster than a single drive?
The only time I ever wait for anything on my system is when starting a game or a new level. I have a gig of ram.
 

Matthias99

Diamond Member
Oct 7, 2003
Matthias, we'd gone so long without butting heads over RAID 0 opinions!

Like GOSHARKS said, it can help improve rotational latency a touch, but only if both drives happen to line up perfectly over the data. My personal experience with RAID 0 vs. a single drive (same model, same identical hardware, just one drive vs. two) was that there was no tangible difference. "Sampson Simpson, I stick by my story!"

I'm not following you with the 'lining up perfectly over the data' bit. It has nothing to do with the position of the data relative to the other disks; the point is that the data is striped. See below.

Originally posted by: Pariah
Originally posted by: Matthias99
I don't care how many times people say this:

Seek times will still be the same (I doubt any significant boost) for RAID 0 as they will be for a single drive.

it's still not true. The MAIN benefit of a RAID controller (other than essentially doubling your read/write transfer rates when dealing with large files) is the reduction in effective seek time for heavy file I/O. If your data is split across N disks, it takes on average only 1/N times as long for a read head to get into the right position to read it. The effect is more dramatic with a RAID1 or 0+1 (where there are many disks that can fetch a particular piece of information), but even on a RAID0 it's there. While it's true that RAID cannot make your physical drives any faster (i.e., reducing the minimum seek times for the disks), it can and does improve average seek times in real-world situations.

That's only true if you increase the array capacity by the capacity of a single drive every time you add a drive, which would be extremely wasteful for most people. If you limit the array capacity to that of the single drive, no matter how many drives you add, you aren't going to gain any positional seek advantage.

For example, with 40GB/platter drives: one 5-platter 200GB drive has to seek across exactly the same distance on its platters as 5 single-platter 40GB drives in RAID 0, meaning you gain nothing. In order for your calculations to work, you would have to add an additional 200GB drive to the array every time. So to get 1/5 the seek distance you would have to stripe 5 drives for a 1TB array capacity, which would be a complete waste if all you really need is 200GB.

Okay. We're talking about different setups.

Let's take a hypothetical drive with 20 pieces of data on it:

[1---2---3---4---5---6---7---8---9---10---11---12---13---14---15---16---17---18---19---20]

And let's compare that to a striped RAID0 of four smaller disks:

[1---5---9--13--17]

[2---6--10--14--18]

[3---7--11--15--19]

[4---8--12--16--20]

For a random sequence of reads and writes, the disk array can offer a noticeably lower average seek time, simply because the array can seek for multiple pieces of data simultaneously as long as they're not on the same disk. Even if the single large drive has multiple platters (and so has exactly the same seek time as each small drive alone), together the RAID group can give better average-case performance.

This effect, I will admit, is much more dramatic with a RAID1, where *any* sequence of reads and writes has a lower average seek time (1/N as long, in effect, when amortized over many operations). I was unclear above; an N-disk RAID0 does *not* give average seek times 1/N times as long (that's RAID1; I should have said "If your data is mirrored across N disks" rather than 'split' above), but it does make a noticeable improvement in a lot of cases if the controller is smart enough (prefetch, etc.) and the OS and/or the programs being run queue reads and writes. This is not often the case for everyday desktop use, though in some situations it does help.
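As a concrete sketch of the striped layout above (the function name is mine, not from any real controller), a RAID 0 with one block per stripe unit maps a logical block to a disk and an offset like this:

```python
def raid0_map(block: int, n_disks: int) -> tuple[int, int]:
    """Map a 0-based logical block number to (disk index, offset on
    that disk) for a RAID 0 stripe with one block per stripe unit."""
    return block % n_disks, block // n_disks

# Matching the 4-disk diagram (piece 1 = block 0, piece 5 = block 4):
# block 0 lands on disk 0 at offset 0, block 4 on disk 0 at offset 1,
# and consecutive blocks land on different disks, which is why a large
# sequential read keeps all four spindles busy at once.
```

Two requests for blocks that map to different disk indices are the case where the heads can, in principle, seek independently; two blocks on the same disk still serialize.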
 

dejacky

Banned
Dec 17, 2000
Basically,
raid 0 will increase throughput (read/write performance) during large file transfers/copies/etc. However, the same single hard drive will have lower response latency if it's not set up as raid 0. Tomshardware proved this in their raid 0 article. www.tomshardware.com. So if you're not constantly working with large files (800MB+), a normal two-drive setup without RAID 0 is better.

-dejacky
 

Amused

Elite Member
Apr 14, 2001
Originally posted by: dejacky
Basically,
raid 0 will increase throughput (read/write performance) during large file transfers/copies/etc. However, the same single hard drive will have lower response latency if it's not set up as raid 0. Tomshardware proved this in their raid 0 article. www.tomshardware.com. So if you're not constantly working with large files (800MB+), a normal two-drive setup without RAID 0 is better.

-dejacky

Which article?

 

Pariah

Elite Member
Apr 16, 2000
Originally posted by: KristopherKubicki
My gigabit ethernet wasnt the bottleneck.

Kristopher

If you were transferring large files as your original post states, why were you only getting 25MB/s? 25MB/s is roughly what I found to be the performance of older Gb ethernet adapters on standard PCs. If you were copying large files and you didn't see a significant improvement in transfer rates, then there is a bottleneck somewhere else, since that is one scenario where you actually should see big gains. Were you copying to a single drive? If so, that would explain why there was no improvement.

Matthias99, as far as I know, and someone can correct me if I am wrong here, drives do not independently search within an array. Since files are striped among the drives, they all have to be looking for the same data at the same time. If there are 5 files queued and 5 drives in the array, only one file at a time is requested from the array, not all 5 at once.

I don't understand why you think there is an improvement with RAID 1 writes. If both drives are writing the exact same data at the same time, how is that faster than just one drive writing the data? Also most RAID controllers (almost all ATA RAID controllers) don't load balance reads, so there won't be any benefit on read performance either.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: Pariah
Originally posted by: KristopherKubicki
My gigabit ethernet wasnt the bottleneck.

Kristopher

If you were transferring large files as your original post states, why were you only getting 25MB/s? 25MB/s is roughly what I found to be the performance of older Gb ethernet adapters on standard PCs. If you were copying large files and you didn't see a significant improvement in transfer rates, then there is a bottleneck somewhere else, since that is one scenario where you actually should see big gains. Were you copying to a single drive? If so, that would explain why there was no improvement.

Matthias99, as far as I know, and someone can correct me if I am wrong here, drives do not independently search within an array. Since files are striped among the drives, they all have to be looking for the same data at the same time. If there are 5 files queued and 5 drives in the array, only one file at a time is requested from the array, not all 5 at once.

I don't understand why you think there is an improvement with RAID 1 writes. If both drives are writing the exact same data at the same time, how is that faster than just one drive writing the data? Also most RAID controllers (almost all ATA RAID controllers) don't load balance reads, so there won't be any benefit on read performance either.
Yes, you are wrong. This is exactly why raid0 or 5 is faster in seek times. When the requested sector happens to be in sector 5 of cylinder 2 on physical drive 2, that drive seeks to it on its own. Then the next sector is probably on a different drive, etc. That is why raid is faster: on average, close to x times faster seek, where x is the number of drives in the array. I am getting tired of explaining this to you, so please read more using google if you don't understand.

 

Pariah

Elite Member
Apr 16, 2000
Well, if that is what you've been trying to say all along, then I can tell you once and for all that you are wrong, and why. Modern file systems like NTFS have almost eliminated file fragmentation on all but large files, where access time isn't particularly vital anyway. This means that individual files are all stored in contiguous blocks of sectors. Because of this there is only ONE access per file. Once the initial sector is found, there are no further seek time penalties associated with reading consecutive sectors. So whether you have one drive, 2 in a RAID array, or 50 in a RAID, they all have the same one seek and access per file. Again, I don't believe RAID arrays can access multiple files at once, and if someone can dig up info to the contrary I would like to see it. But, for one file, I am positive that there is no access time benefit for RAID arrays, because they perform the same number of head seeks as a single drive.
 

Monoman

Platinum Member
Mar 4, 2001
Originally posted by: Pariah
Well, if that is what you've been trying to say all along, then I can tell you once and for all that you are wrong, and why. Modern file systems like NTFS have almost eliminated file fragmentation on all but large files, where access time isn't particularly vital anyway. This means that individual files are all stored in contiguous blocks of sectors. Because of this there is only ONE access per file. Once the initial sector is found, there are no further seek time penalties associated with reading consecutive sectors. So whether you have one drive, 2 in a RAID array, or 50 in a RAID, they all have the same one seek and access per file. Again, I don't believe RAID arrays can access multiple files at once, and if someone can dig up info to the contrary I would like to see it. But, for one file, I am positive that there is no access time benefit for RAID arrays, because they perform the same number of head seeks as a single drive.


you are describing IDE RAID, not SCSI. They are a different beast altogether.
 

Pariah

Elite Member
Apr 16, 2000
OK, now are you going to tell us what these differences are, or are you going to make the rest of us guess? Besides the inherent benefits of moving to SCSI for features like command queuing and reordering, what unique abilities does SCSI RAID have, pertaining to the discussion here, that high-end ATA RAID like 3Ware doesn't?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
It has nothing to do with fragmentation either! Scenario: you are loading a game level that requires 300 files to be loaded, and the system is totally defragged. File 1 starts on physical disk 2, continues on 3, then 4, etc. The next file starts on physical disk 5, and even though in the logical map it is far away from file 1, it happens to be where the head currently resides on physical disk 5: NO seek required! It's simple: with 5 heads independently seeking requested sectors, your odds of one of the heads being located where you want are 5 times higher. Thus loading multiple files is much faster due to lower seek times. And NTFS still has fragmentation issues. I run Win2k most of the time to be compatible with work networks, and I still have to defrag once in a while.
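The "odds of a head already being nearby" intuition can be sanity-checked with a toy Monte Carlo. Note this models the best case for the argument: a request that any of the N independent heads is allowed to serve (the mirrored situation Matthias99 describes), not a stripe where the block lives on one specific disk. Head positions and the target track are normalized to a 0-1 stroke, and the numbers are illustrative only:

```python
import random

def avg_seek_distance(n_heads: int, trials: int = 50_000) -> float:
    """Average distance from the nearest of n_heads (at uniform random
    positions) to a uniform random target track, on a 0-1 stroke."""
    total = 0.0
    for _ in range(trials):
        target = random.random()
        total += min(abs(random.random() - target) for _ in range(n_heads))
    return total / trials

# A single head averages roughly a third of the full stroke; with five
# independently positioned heads, the nearest-head distance is far
# smaller. If the block can only live on one disk, only that one head
# counts and the single-head average applies.
```

Whether real requests look like the first case or the second is exactly what the thread is arguing about.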
 

Antisocial Virge

Diamond Member
Dec 13, 1999
I do not like doubling my odds that one of those drives will fail
Running raid does not double the odds that a drive will fail. I think the point you're making is that the chance the raid will fail due to a lost drive is doubled by using two drives. It's been a long time since I took a stats class, but I think the whole "doubling" the odds of a failure is wrong. I figure the odds of it dying are less than twice those of a single drive.

Look at it this way. You buy 3 harddrives to put in 2 computers. One is a dud and instead of lasting three years its gonna only last 2 1/2. If you put that dud in the single drive computer and it dies, your data is gone. If you had put that dud in the raid computer and it dies, your data is still gone.

EDIT: More on the subject. I ran raid a while back on several different configurations and stuff always seemed to load faster to me.
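The "less than twice" intuition has a simple closed form. If each drive independently fails with probability p over some period, a RAID 0 of n drives loses data when any of them fails; a rough sketch:

```python
def raid0_failure_prob(p_drive: float, n_drives: int) -> float:
    """Probability a RAID 0 array loses data, given an independent
    per-drive failure probability p_drive over the same period."""
    return 1.0 - (1.0 - p_drive) ** n_drives

# For small p this is just under n * p: with p = 0.05 and two drives,
# the array fails with probability 0.0975 rather than the naive 0.10,
# because both drives failing in the same period is counted only once.
```

So "doubled" is a good approximation for realistic failure rates, but strictly an overestimate, which matches the point above.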
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: Antisocial-Virge
I do not like doubling my odds that one of those drives will fail
Running raid does not double the odds that a drive will fail. I think the point you're making is that the chance the raid will fail due to a lost drive is doubled by using two drives. It's been a long time since I took a stats class, but I think the whole "doubling" the odds of a failure is wrong. I figure the odds of it dying are less than twice those of a single drive.

Look at it this way. You buy 3 harddrives to put in 2 computers. One is a dud and instead of lasting three years its gonna only last 2 1/2. If you put that dud in the single drive computer and it dies, your data is gone. If you had put that dud in the raid computer and it dies, your data is still gone.

And my raid5 doesn't care if one of the drives fails. My data is more secure than any one drive system.
 

GoSharks

Diamond Member
Nov 29, 1999
Originally posted by: Antisocial-Virge
Look at it this way. You buy 3 harddrives to put in 2 computers. One is a dud and instead of lasting three years its gonna only last 2 1/2. If you put that dud in the single drive computer and it dies, your data is gone. If you had put that dud in the raid computer and it dies, your data is still gone.

but if i am running raid 0, you lose the data on the dud drive that is in the raid array IN ADDITION to any other data that is stored on that ENTIRE raid array.

and can we get some solid documentation/benchmarks that actually show raid0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive raid array link to SR.com that i posted earlier today.
 

Matthias99

Diamond Member
Oct 7, 2003
Originally posted by: Pariah

Matthias99, as far as I know, and someone can correct me if I am wrong here, drives do not independently search within an array. Since files are striped among the drives, they all have to be looking for the same data at the same time. If there are 5 files queued and 5 drives in the array, only one file at a time is requested from the array, not all 5 at once.

That's up to the controller, not the drives -- any particular data block maps to a particular block on a specific, single drive, and the controller does the mapping. Maybe this functionality just isn't implemented in IDE RAID controllers to save money, but it is in a lot of SAN/NAS systems and high-end SCSI RAID (which is what I'm more familiar with). In theory, though, any RAID controller capable of queueing and reordering reads and writes should be able to do this. In practice, it is unlikely to happen all that often on a single-user system anyway.

I don't understand why you think there is an improvement with RAID 1 writes. If both drives are writing the exact same data at the same time, how is that faster than just one drive writing the data? Also most RAID controllers (almost all ATA RAID controllers) don't load balance reads, so there won't be any benefit on read performance either.

Okay, okay, yes, RAID1 doesn't help with writes (since both disks have to write the data, although not necessarily at exactly the same time depending on the controller), and I should have been clearer about that. It does cut your read access times down, theoretically to 1/N times for an N-way mirrored RAID1.

Happy?

 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: GOSHARKS
Originally posted by: Antisocial-Virge
Look at it this way. You buy 3 harddrives to put in 2 computers. One is a dud and instead of lasting three years its gonna only last 2 1/2. If you put that dud in the single drive computer and it dies, your data is gone. If you had put that dud in the raid computer and it dies, your data is still gone.

but if i am running raid 0, you lose the data on the dud drive that is in the raid array IN ADDITION to any other data that is stored on that ENTIRE raid array.

and can we get some solid documentation/benchmarks that actually show raid0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive raid array link to SR.com that i posted earlier today.

If you don't back up on a regular basis, raid0 is not for you... You should do that anyway.
And as for proof, I already told everyone that I get either 1ms seek (Sandra 2004) or .1ms seek (HD Tach) on my raid array. That's all I need; if it isn't enough for you, then don't use it, I don't care...
 

GoSharks

Diamond Member
Nov 29, 1999
Originally posted by: Markfw900
Originally posted by: GOSHARKS
but if i am running raid 0, you lose the data on the dud drive that is in the raid array IN ADDITION to any other data that is stored on that ENTIRE raid array.

and can we get some solid documentation/benchmarks that actually show raid0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive raid array link to SR.com that i posted earlier today.

If you don't back up on a regular basis, raid0 is not for you... You should do that anyway.
And as for proof, I already told everyone that I get either 1ms seek (Sandra 2004) or .1ms seek (HD Tach) on my raid array. That's all I need; if it isn't enough for you, then don't use it, I don't care...

well if you put it that way, any drive failures are inconsequential? if you lose a drive in the raid0, you lose quick access to at least 2x the data (assuming the drives are full in both the single and raid0 configs) compared to losing a drive that is just running by itself. my point is that Antisocial-Virge's "analogy" is incorrect.

you haven't told anybody of your access times in this thread. lucky for me i read the other thread and have a clue as to what you are talking about... and i thought that your access times were attributed to the caching, and not the physical abilities of the hard drive array by itself (sans cache)?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: GOSHARKS
Originally posted by: Markfw900
Originally posted by: GOSHARKS
but if i am running raid 0, you lose the data on the dud drive that is in the raid array IN ADDITION to any other data that is stored on that ENTIRE raid array.

and can we get some solid documentation/benchmarks that actually show raid0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive raid array link to SR.com that i posted earlier today.

If you don't back up on a regular basis, raid0 is not for you... You should do that anyway.
And as for proof, I already told everyone that I get either 1ms seek (Sandra 2004) or .1ms seek (HD Tach) on my raid array. That's all I need; if it isn't enough for you, then don't use it, I don't care...

well if you put it that way, any drive failures are inconsequential? if you lose a drive in the raid0, you lose quick access to at least 2x the data (assuming the drives are full in both the single and raid0 configs) compared to losing a drive that is just running by itself. my point is that Antisocial-Virge's "analogy" is incorrect.

you haven't told anybody of your access times in this thread. lucky for me i read the other thread and have a clue as to what you are talking about... and i thought that your access times were attributed to the caching, and not the physical abilities of the hard drive array by itself (sans cache)?

I am fairly certain, based on real usage and my experience with the array, that the physical average seek time (no cache) across the 5 disks is really 1ms. The .1ms, I am sure, is due to the cache.
 

Regs

Lifer
Aug 9, 2002
I have little doubt that the Raid array could make a system much snappier in terms of read/write burst speeds and large data transmissions, but a seek time of .1ms is just unheard of.

You would be able to load Windows in a matter of... (calculate minimal time here), and boot up a game in... (same as above).
 

Pariah

Elite Member
Apr 16, 2000
Originally posted by: Markfw900
It has nothing to do with fragmentation either ! Scenario : You are loading a game level that requires 300 files to be loaded and the system is totally defragged. File 1 starts on physical disk 2 then 3 then 4, etc.. Next file starts on physical disk 5, and even though in the logical map, it is way away from file1, it happens to be where the head currently resides on physical disk 1, NO seek required ! Its simple, with 5 heads independently seeking requested sectors, your odds of one of the heads being located where you want are 5 times higher. Thus loading multiple files is much faster due to lower seek times. And NTFS still has fragmentation issues. I run Win2k most of the time to be compatable with work networks, and I still have to defrag once in a while.

Even if your scenario is technically possible with today's mainstream hardware (not the stuff Matthias is talking about), which no one has provided evidence it is, the scenario is completely implausible from a practical standpoint. What do you think the odds are that 5 files, each smaller than your stripe size (probably in the 64KB range) and each on a different drive, all happen to be requested at the same time? It would happen so rarely that it wouldn't have any effect on everyday usage, and certainly would not show up in access time benchmarks. Also, I specifically said file fragmentation, not file system fragmentation. There is no magic file system that can avoid fragmentation without self-maintenance. File system fragmentation is irrelevant when reading a single file.

Also, if you want to ruin any credibility, continue to quote Sandra HD benchmarks, universally accepted as a worthless HD benchmark. Storage Review has a slew of RAID benchmarks for perusal. Among them is the following:

3Ware Escalade 6400 access time

1 drive: 14.04ms
2 drives RAID0: 15.2ms
3 drives RAID0: 15.18ms
4 drives RAID0: 15.4ms

What's wrong with SR's numbers? I'm not seeing the 75% improvement you say there should be. Other configurations show similar patterns, with some showing slight improvements (under 2%).
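The size of the gap is easy to make concrete. If an N-drive stripe really cut access time to 1/N, the single-drive 14.04ms figure quoted above would predict values nowhere near the measured ones; a quick check (the arithmetic is mine, not SR's):

```python
single_ms = 14.04                            # measured 1-drive access time
measured_ms = {2: 15.2, 3: 15.18, 4: 15.4}   # measured RAID 0 access times

for n, actual in measured_ms.items():
    naive = single_ms / n                    # what a strict 1/N claim predicts
    print(f"{n} drives: 1/N predicts {naive:.2f} ms, SR measured {actual} ms")
```

The measured arrays are slightly slower than the single drive, not 2-4x faster, which is the point being made.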
 

beatle

Diamond Member
Apr 2, 2001
In a RAID-0 setup, any one file is written across all disks in the array. All disks must load their part of the file for it to be of any use. Just because one drive is finished doesn't mean the others are as well. Seek times are about the same, which explains why, for most use, it will feel about the same as a single drive.

In regards to reliability, the likelihood of failure is roughly doubled. Matthias can explain this better than I can. :D
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: Pariah
Originally posted by: Markfw900
It has nothing to do with fragmentation either! Scenario: you are loading a game level that requires 300 files to be loaded, and the system is totally defragged. File 1 starts on physical disk 2, continues on 3, then 4, etc. The next file starts on physical disk 5, and even though in the logical map it is far away from file 1, it happens to be where the head currently resides on physical disk 5: NO seek required! It's simple: with 5 heads independently seeking requested sectors, your odds of one of the heads being located where you want are 5 times higher. Thus loading multiple files is much faster due to lower seek times. And NTFS still has fragmentation issues. I run Win2k most of the time to be compatible with work networks, and I still have to defrag once in a while.

Even if your scenario is technically possible with today's mainstream hardware (not the stuff Matthias is talking about), which no one has provided evidence it is, the scenario is completely implausible from a practical standpoint. What do you think the odds are that 5 files, each smaller than your stripe size (probably in the 64KB range) and each on a different drive, all happen to be requested at the same time? It would happen so rarely that it wouldn't have any effect on everyday usage, and certainly would not show up in access time benchmarks. Also, I specifically said file fragmentation, not file system fragmentation. There is no magic file system that can avoid fragmentation without self-maintenance. File system fragmentation is irrelevant when reading a single file.

Also, if you want to ruin any credibility, continue to quote Sandra HD benchmarks, universally accepted as a worthless HD benchmark. Storage Review has a slew of RAID benchmarks for perusal. Among them is the following:

3Ware Escalade 6400 access time

1 drive: 14.04ms
2 drives RAID0: 15.2ms
3 drives RAID0: 15.18ms
4 drives RAID0: 15.4ms

What's wrong with SR's numbers? I'm not seeing the 75% improvement you say there should be. Other configurations show similar patterns, with some showing slight improvements (under 2%).

Wrong! From your own source, StorageReview:
Link to article

Large files that are split into enough blocks to span every drive in the array require each drive to position to a particular spot, so positioning performance is not improved; once the heads are all in place however, data is read from all the drives at once, greatly improving transfer performance. On reads, small files that don't require reading from all the disks in the array can allow a smart controller to actually run two or more accesses in parallel (and if the files are in the same stripe block, then it will be even faster). This improves both positioning and transfer performance, though the increase in transfer performance is relatively small. Performance improvement in a striping environment also depends on stripe size and stripe width.

Also, transfer speed is limited by other factors, such as PCI bus limits (in my case with a U160 controller). But seek times CAN be much faster, depending on stripe size, file size, and controller characteristics. My transfer speeds are currently limited to 133,000 theoretical, and my best benches show 126,000 actual. Once you have used raid, especially SCSI raid, you can see it like I can. You should use it before you comment.
 

GoSharks

Diamond Member
Nov 29, 1999
so what's the definition of a "smart controller?" would the average home raid user have such a "smart controller," or a RAID 0 array with 3+ drives? we are talking about your average enthusiast here, not enterprise setups, correct?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
Originally posted by: GOSHARKS
so what's the definition of a "smart controller?" would the average home raid user have such a "smart controller," or a RAID 0 array with 3+ drives? we are talking about your average enthusiast here, not enterprise setups, correct?
OK, I have 3 raid setups here, two SCSI and one IDE. The IDE one feels "snappier," but I don't really have hard stats to back that up. One SCSI one feels much faster (a two-drive array), and the 5-disk SCSI one (which I have about $650 invested in) is a real screamer; anybody who has used it agrees that it blows away anything they have seen.

It's all about money. A regular raid controller, I think, is faster than a single drive. The expensive one is a screamer.

 

Regs

Lifer
Aug 9, 2002
Mark, what is the CPU utilization of your SCSI Raid setup?