Matthias, we'd gone so long without butting heads over RAID 0 opinions!
Like GOSHARKS said, it can help improve rotational latency a touch, but only if both drives happen to line up perfectly over the data. My personal experience with RAID 0 vs. a single drive (same model, identical hardware, just one drive vs. two) was that there was no tangible difference. "Sampson Simpson, I stick by my story!"
Originally posted by: Pariah
Originally posted by: Matthias99
I don't care how many times people say this:
Seek times will still be the same (I doubt any significant boost) for RAID 0 as for a single drive.
it's still not true. The MAIN benefit of a RAID controller (other than essentially doubling your read/write transfer rates when dealing with large files) is the reduction in effective seek time for heavy file I/O. If your data is split across N disks, it takes on average only 1/N times as long for a read head to get into the right position to read it. The effect is more dramatic with a RAID1 or 0+1 (where there are many disks that can fetch a particular piece of information), but even on a RAID0 it's there. While it's true that RAID cannot make your physical drives any faster (ie, reducing the minimum seek times for the disks), it can and does improve average seek times in real-world situations.
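Matthias99's 1/N intuition can be sanity-checked with a quick Monte Carlo sketch. This is a toy model, not real hardware: it assumes N independently positioned heads, any of which can serve the request, which is closer to the RAID 1 / 0+1 case he mentions than to plain RAID 0.

```python
import random

def avg_min_seek(n_heads, trials=100_000):
    """Estimate the average distance the *closest* of n_heads
    independently positioned heads must travel to reach a
    uniformly random target track (track range normalized to [0, 1))."""
    total = 0.0
    for _ in range(trials):
        target = random.random()
        total += min(abs(random.random() - target) for _ in range(n_heads))
    return total / trials

for n in (1, 2, 4):
    # one head averages about 1/3 of the full stroke; more heads shrink it
    print(n, round(avg_min_seek(n), 3))
```

In this model the average travel does drop as heads are added, though not by an exact 1/N factor; and, as the replies below argue, whether real striped arrays behave like independent heads at all is exactly the point in dispute.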
That's only true if you increase the array capacity by the capacity of a single drive every time you add a drive, which would be extremely wasteful for most people. If you limit the array capacity to that of the single drive, no matter how many drives you add, you aren't going to gain any positional seek advantage.
For example, if dealing with 40GB/platter drives: a single 5-platter 200GB drive has to seek across exactly the same distance on its platters as five single-platter 40GB drives in RAID 0, meaning you gain nothing. In order for your calculations to work, you would have to add an additional 200GB drive to the array every time. So to get 1/5 the seek distance you would have to stripe five 200GB drives for a 1TB array, which would be a complete waste if all you really need is 200GB.
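Pariah's capacity argument reduces to simple arithmetic. A small sketch (hypothetical helper, assuming data is striped evenly and seek span is proportional to the fraction of each drive the data occupies):

```python
def seek_span_fraction(data_gb, drive_gb, n_drives):
    """Fraction of each drive's track range the data covers
    when striped evenly across n_drives."""
    per_drive = data_gb / n_drives
    return per_drive / drive_gb

# 200GB of data on one 200GB drive:
print(seek_span_fraction(200, 200, 1))   # heads sweep the whole drive
# The same 200GB striped over five 40GB drives:
print(seek_span_fraction(200, 40, 5))    # each head still sweeps its whole drive
# Only by adding capacity (five 200GB drives, mostly empty) does the span shrink:
print(seek_span_fraction(200, 200, 5))
```

The span only shrinks when total array capacity grows past what the data needs, which is the "extremely wasteful" case he describes.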
Originally posted by: dejacky
Basically,
RAID 0 will increase throughput (read/write performance) during large file transfers/copies/etc. However, the same single hard drive will have lower response latency if it's not set up as RAID 0. Tomshardware proved this in their RAID 0 article: www.tomshardware.com. So if you're not working with large files constantly (800MB+), a normal two hard drive (non-RAID) setup is better.
-dejacky
Originally posted by: KristopherKubicki
My gigabit ethernet wasnt the bottleneck.
Kristopher
Yes, you are wrong. This is exactly why RAID 0 or 5 is faster in seek times. When the requested sector happens to be in sector 5 of cylinder 2 of physical drive 2, that drive fetches it on its own. Then the next sector is probably on a different drive, etc. That is why RAID is faster: on average close to x times faster seeks, where x is the number of drives in the array. I am getting tired of explaining this to you, so please read more using Google if you don't understand...

Originally posted by: Pariah
Originally posted by: KristopherKubicki
My gigabit ethernet wasnt the bottleneck.
Kristopher
If you were transferring large files as your original post states, why were you only getting 25MB/s? 25MB/s is roughly what I found to be the performance of older Gb ethernet adapters on standard PCs. If you were copying large files and you didn't see a significant improvement in transfer rates, then there is a bottleneck somewhere else, since that is one scenario where you actually should see big gains. Were you copying to a single drive? If so, then that would explain why there was no improvement.
Matthias99, as far as I know, and someone can correct me if I am wrong here, drives do not independently search within an array. Since files are striped among the drives, they all have to be looking for the same data at the same time. If there are 5 files queued and 5 drives in the array, only one file at a time is requested from the array, not all 5 at once.
I don't understand why you think there is an improvement with RAID 1 writes. If both drives are writing the exact same data at the same time, how is that faster than just one drive writing the data? Also most RAID controllers (almost all ATA RAID controllers) don't load balance reads, so there won't be any benefit on read performance either.
Originally posted by: Pariah
Well, if that is what you've been trying to say all along, then I can tell you once and for all that you are wrong, and why. Modern file systems like NTFS have almost eliminated file fragmentation on all but large files, where access time isn't particularly vital anyway. This means that individual files are all stored in contiguous blocks of sectors. Because of this there is only ONE access per file. Once the initial sector is found there are no further seek time penalties associated with reading consecutive sectors. So whether you have one drive, 2 in a RAID array, or 50 in a RAID array, they all have the same one seek and access per file. Again, I don't believe RAID arrays can access multiple files at once, and if someone can dig up info to the contrary I would like to see it. But for one file, I am positive that there is no access time benefit for RAID arrays, because they perform the same number of head seeks as a single drive.
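The one-seek-per-contiguous-file point can be illustrated with a toy seek counter (a hypothetical helper; it just counts non-sequential jumps in a list of requested block addresses):

```python
def count_seeks(lbas):
    """Count head repositionings: a 'seek' happens whenever the next
    requested block is not the one right after the current block."""
    seeks = 1  # initial positioning to the first block
    for prev, cur in zip(lbas, lbas[1:]):
        if cur != prev + 1:
            seeks += 1
    return seeks

print(count_seeks(list(range(100, 200))))       # contiguous file: 1 seek
print(count_seeks([100, 101, 500, 501, 900]))   # file in 3 extents: 3 seeks
```

For a single contiguous file the count is one regardless of how many drives share the work, which is the heart of Pariah's claim; the open question in the thread is whether arrays can overlap seeks for *multiple* files.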
I do not like doubling my odds that one of those drives will fail

Running RAID does not double the odds that a drive will fail. I think the point you're making is that the chance that the array will fail due to a lost drive is doubled by using two drives. It's been a long time since I took a stats class, but I think the whole "doubling the odds of a failure" is wrong. I figure the odds of it dying are far less than twice that of a single drive.
Originally posted by: Antisocial-Virge
I do not like doubling my odds that one of those drives will fail

Running RAID does not double the odds that a drive will fail. I think the point you're making is that the chance that the array will fail due to a lost drive is doubled by using two drives. It's been a long time since I took a stats class, but I think the whole "doubling the odds of a failure" is wrong. I figure the odds of it dying are far less than twice that of a single drive.
Look at it this way. You buy 3 hard drives to put in 2 computers. One is a dud, and instead of lasting three years it's only gonna last 2 1/2. If you put that dud in the single-drive computer and it dies, your data is gone. If you had put that dud in the RAID computer and it dies, your data is still gone.
Originally posted by: Antisocial-Virge
Look at it this way. You buy 3 hard drives to put in 2 computers. One is a dud, and instead of lasting three years it's only gonna last 2 1/2. If you put that dud in the single-drive computer and it dies, your data is gone. If you had put that dud in the RAID computer and it dies, your data is still gone.
but if I am running RAID 0, you lose the data on the dud drive that is in the RAID array IN ADDITION to any other data that is stored on that ENTIRE array.
and can we get some solid documentation/benchmarks that actually show RAID 0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive RAID array link to SR.com that I posted earlier today.
Originally posted by: Pariah
Matthias99, as far as I know, and someone can correct me if I am wrong here, drives do not independently search within an array. Since files are striped among the drives, they all have to be looking for the same data at the same time. If there are 5 files queued and 5 drives in the array, only one file at a time is requested from the array, not all 5 at once.
I don't understand why you think there is an improvement with RAID 1 writes. If both drives are writing the exact same data at the same time, how is that faster than just one drive writing the data? Also most RAID controllers (almost all ATA RAID controllers) don't load balance reads, so there won't be any benefit on read performance either.
Originally posted by: GOSHARKS
Originally posted by: Antisocial-Virge
Look at it this way. You buy 3 hard drives to put in 2 computers. One is a dud, and instead of lasting three years it's only gonna last 2 1/2. If you put that dud in the single-drive computer and it dies, your data is gone. If you had put that dud in the RAID computer and it dies, your data is still gone.
but if I am running RAID 0, you lose the data on the dud drive that is in the RAID array IN ADDITION to any other data that is stored on that ENTIRE array.
and can we get some solid documentation/benchmarks that actually show RAID 0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive RAID array link to SR.com that I posted earlier today.
If you don't back up on a regular basis, RAID 0 is not for you... you should do that anyway.
And as for proof, I already told everyone that I get either a 1ms seek (Sandra 2004) or a .1ms seek (HD Tach) on my RAID array. That's all I need; if it isn't enough for you then don't use it, I don't care...
Originally posted by: Markfw900
Originally posted by: GOSHARKS
but if I am running RAID 0, you lose the data on the dud drive that is in the RAID array IN ADDITION to any other data that is stored on that ENTIRE array.
and can we get some solid documentation/benchmarks that actually show RAID 0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive RAID array link to SR.com that I posted earlier today.
If you don't back up on a regular basis, RAID 0 is not for you... you should do that anyway.
And as for proof, I already told everyone that I get either a 1ms seek (Sandra 2004) or a .1ms seek (HD Tach) on my RAID array. That's all I need; if it isn't enough for you then don't use it, I don't care...
Originally posted by: GOSHARKS
Originally posted by: Markfw900
Originally posted by: GOSHARKS
but if I am running RAID 0, you lose the data on the dud drive that is in the RAID array IN ADDITION to any other data that is stored on that ENTIRE array.
and can we get some solid documentation/benchmarks that actually show RAID 0 (over many drives) improves access times? all we have in this thread is a bunch of theoretical stuff, with the sole exception of the 2-drive RAID array link to SR.com that I posted earlier today.
If you don't back up on a regular basis, RAID 0 is not for you... you should do that anyway.
And as for proof, I already told everyone that I get either a 1ms seek (Sandra 2004) or a .1ms seek (HD Tach) on my RAID array. That's all I need; if it isn't enough for you then don't use it, I don't care...
well, if you put it that way, any drive failure is inconsequential...?? if you lose a drive in the RAID 0, you lose quick access to at least 2x the data (assuming the drives are full in both the single and RAID 0 configs) compared to losing a drive that is just running by itself. my point is that Antisocial-Virge's "analogy" is incorrect.
you have not told anybody your access times in this thread. lucky for me I read the other thread and have a clue as to what you are talking about... and I thought that your access times were attributed to caching, and not the physical abilities of the hard drive array by itself (sans cache)?
Originally posted by: Markfw900
It has nothing to do with fragmentation either! Scenario: you are loading a game level that requires 300 files to be loaded and the system is totally defragged. File 1 starts on physical disk 2, then 3, then 4, etc. The next file starts on physical disk 5, and even though in the logical map it is way away from file 1, it happens to be right where the head on physical disk 5 currently resides, NO seek required! It's simple: with 5 heads independently seeking requested sectors, your odds of one of the heads being located where you want are 5 times higher. Thus loading multiple files is much faster due to lower seek times. And NTFS still has fragmentation issues. I run Win2k most of the time to be compatible with work networks, and I still have to defrag once in a while.
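Markfw900's 5-heads odds claim is easy to simulate with a toy model (hypothetical parameters: head and target positions uniform on a normalized track range, and a "free" read if some head is already within a small window of the target):

```python
import random

def hit_probability(n_heads, window=0.02, trials=100_000):
    """Chance that at least one of n_heads randomly positioned heads is
    already within `window` of the target track (i.e., no real seek needed)."""
    hits = 0
    for _ in range(trials):
        target = random.random()
        if any(abs(random.random() - target) <= window for _ in range(n_heads)):
            hits += 1
    return hits / trials
```

In this model the 5-head hit rate comes out a bit under 5x the single-head rate (the factor erodes as the window widens), so the "odds are 5 times higher" intuition roughly holds for small windows, granting the disputed assumption that the heads really do seek independently.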
Originally posted by: Pariah
Originally posted by: Markfw900
It has nothing to do with fragmentation either! Scenario: you are loading a game level that requires 300 files to be loaded and the system is totally defragged. File 1 starts on physical disk 2, then 3, then 4, etc. The next file starts on physical disk 5, and even though in the logical map it is way away from file 1, it happens to be right where the head on physical disk 5 currently resides, NO seek required! It's simple: with 5 heads independently seeking requested sectors, your odds of one of the heads being located where you want are 5 times higher. Thus loading multiple files is much faster due to lower seek times. And NTFS still has fragmentation issues. I run Win2k most of the time to be compatible with work networks, and I still have to defrag once in a while.
Even if your scenario is technically possible with today's mainstream hardware (not the stuff Matthias is talking about), which no one has provided evidence it is, the scenario is completely implausible from a practical standpoint. What do you think the odds are that 5 files that are smaller than your stripe size (probably in the 64KB range) and all happen to be on different drives just happen to be requested at the same time? It would happen so rarely that it wouldn't have any effect on everyday usage, and certainly would not show up in access time benchmarks. Also, I specifically said file fragmentation, not file system fragmentation. There is no magic file system that can avoid fragmentation without self-maintenance. File system fragmentation is irrelevant when reading a single file.
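One piece of Pariah's odds argument can actually be computed: even if 5 sub-stripe-size files were requested at once, the chance that they happen to sit on 5 different drives of a 5-drive array (assuming independent, uniform placement, which is itself generous) is small:

```python
from math import factorial

def all_distinct_prob(n):
    """Chance that n independently, uniformly placed small files (each
    smaller than one stripe block, so each lives on exactly one drive)
    land on n different drives of an n-drive array."""
    return factorial(n) / n**n

print(round(all_distinct_prob(5), 4))
```

That works out to under 4% for 5 drives, before even asking how often 5 such requests coincide in the first place.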
Also, if you want to ruin any credibility, continue to quote Sandra HD benchmarks, universally accepted as a worthless HD benchmark. Storage Review has a slew of RAID benchmarks for perusal. Among them is the following:
3Ware Escalade 6400 access time (ms):
1 drive: 14.04
2 drives RAID 0: 15.2
3 drives RAID 0: 15.18
4 drives RAID 0: 15.4
What's wrong with SR's numbers? I'm not seeing the 75% reduction you say there should be. Other configurations show similar patterns, with some showing slight improvements (under 2%).
Large files that are split into enough blocks to span every drive in the array require each drive to position to a particular spot, so positioning performance is not improved; once the heads are all in place however, data is read from all the drives at once, greatly improving transfer performance. On reads, small files that don't require reading from all the disks in the array can allow a smart controller to actually run two or more accesses in parallel (and if the files are in the same stripe block, then it will be even faster). This improves both positioning and transfer performance, though the increase in transfer performance is relatively small. Performance improvement in a striping environment also depends on stripe size and stripe width.
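The striping described in the excerpt maps byte offsets to drives round-robin per stripe block. A minimal sketch (hypothetical helper; 64KB stripes over 4 drives assumed for the example):

```python
def drive_for_offset(offset, stripe_size, n_drives):
    """Which drive in a RAID 0 stripe set holds a given byte offset:
    offsets advance through one stripe block per drive, round-robin."""
    return (offset // stripe_size) % n_drives

STRIPE = 64 * 1024
DRIVES = 4
print(drive_for_offset(0, STRIPE, DRIVES))            # first block, drive 0
print(drive_for_offset(3 * STRIPE, STRIPE, DRIVES))   # fourth block, drive 3
print(drive_for_offset(4 * STRIPE, STRIPE, DRIVES))   # wraps back to drive 0
```

Two small files whose offsets map to different drives are exactly the case where a "smart" controller could issue the reads in parallel; files in the same stripe block come back in one access.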
OK, I have 3 RAID setups here, two SCSI and one IDE. The IDE one feels "snappier", but I don't really have hard stats to back that up. One SCSI one feels much faster (a two drive array), and the 5-disk SCSI one (which I have about $650 invested in) is a real screamer that anybody who has used it agrees blows away anything they have seen.

Originally posted by: GOSHARKS
so what's the definition of a "smart controller"? would the average home RAID user have such a "smart controller", or a 0 array with 3+ drives? we are talking about your average enthusiast here, not enterprise setups, correct?