2TB HDDs and RAID

MalVeauX

Senior member
Dec 19, 2008
653
176
116
Hey guys,

I've been eyeing some 2TB drives, but have read some interesting commentary about them not playing nice with RAID devices, cards, etc. Not very comforting. I'd like to get some 2TB Greens and put them in RAID1 (multiple pairs), but if they don't play nice in RAID, that would be a major waste of money. I don't like the idea of 2TB of data in one place with no redundancy, so I would fork out for the second drive in RAID1. And while I could just duplicate the data manually, I'd rather it be automated via RAID1 anyway. But I've read about issues with the 2TB drives dropping from arrays because error recovery takes longer than 7 seconds, so the array reports the drive as failed, etc.

Can anyone separate the myth from what's accurate?

I'd love to grab 2x 2TB drives and RAID1 them without having to do crazy firmware updates, worry about controllers and all that stuff simply not working, or setting it up and then having the drives fall from the array all the time when they're fine.

Thoughts?

Very best,
 

MalVeauX

Senior member
Dec 19, 2008
653
176
116
Heya,

Ok, so does anyone know why this is? Seems silly to have these huge drives out and we can't make redundancy with the best known method (RAID). That's way too much data in one place to just crash and burn on someone.

Very best,
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
Seems silly to have these huge drives out and we can't make redundancy with the best known method (RAID).
I was typing a reply that reached about 5 very long paragraphs. I saw it was too long, deleted it and rewrote it as such:
1. RAID isn't the best method for redundancy, not as much as it used to be anyway: HDD density has gone up several times over while speed and throughput have progressed at a snail's pace.
2. This has less impact to you as a home user as opposed to an enterprise server admin, but if RAID doesn't want to play nice, then great. Stay away.
3. Your concern isn't striping but mirroring, so I suppose what you really want or need is a backup. Good for you, shows good sense. Use a real backup solution. RAID isn't a backup solution anyway. Something like rsync is available on most operating systems.
4. Backup solutions can be automated/scheduled, so you don't have to worry about "manually" doing it yourself.
5. A real backup solution, as opposed to misusing RAID as a backup device, has the added benefit of sparing your "spare"/2nd hard disk. In RAID1, it is put under the same stress as your primary hard disk. With a real backup solution, it will be spun down 99% of the time and active only while the backup runs. Imagine how much less stress that is, and how "new" that drive will be compared to your primary drive after years of use, making your "spare" statistically more reliable than if it were exposed to the same stress as the HDD you were backing up.
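
Points 3 and 4 above can be sketched with rsync itself. This is a minimal example, not a prescription: the source and backup paths and the schedule are hypothetical placeholders.

```shell
# One-shot mirror: copy only what changed, and delete files that were
# removed from the source (so the backup stays an exact mirror).
rsync -av --delete /srv/isos/ /mnt/backup/isos/

# Automate it: a nightly 03:00 run via cron (add with `crontab -e`):
# 0 3 * * * rsync -a --delete /srv/isos/ /mnt/backup/isos/
```

The trailing slashes matter to rsync: `/srv/isos/` means "the contents of isos", so they end up directly inside `/mnt/backup/isos/`.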

Hope this helps you. For what it's worth, part of my boring day job has me overseeing server backup solutions that also pretty much hit your size (for medium installations), while the latest server for another project takes about 7TB (for the "primary" storage, backup storage capacity not yet counted). If you count your data as important as an enterprise regards its data (and you should, since those are personal and you probably have much sentimental value stored there), then you will be far better served with a real backup solution.
 

MalVeauX

Senior member
Dec 19, 2008
653
176
116
Hope this helps you. For what it's worth, part of my boring day job has me overseeing server backup solutions that also pretty much hit your size (for medium installations), while the latest server for another project takes about 7TB (for the "primary" storage, backup storage capacity not yet counted). If you count your data as important as an enterprise regards its data (and you should, since those are personal and you probably have much sentimental value stored there), then you will be far better served with a real backup solution.

Heya,

This is fantastically helpful actually. To give you an idea, I keep all of my DVDs in ISO form on my server. Several terabytes and it's growing and growing. I use a lot of the 1TB WD Greens. But as space is running out, I'm considering 2TB drives to decrease the number of drives I'm using so that I can make room for more. But putting 170 DVDs onto a single 1TB drive, and then basically doubling that if I move to 2TB, while nice, poses a risk: that's like 20 minutes of redoing each DVD if the drive fails, times 170 for 1TB or times 340 on a 2TB drive. And I have several of those. That's just hours and hours that I'd like to avoid. So I don't need a real backup, more like just some redundancy to avoid a crash that would result in days of reloading from DVDs. I stream these DVDs from my server to my HTPC over my gigabit network, by the way.

So that's why RAID1 appealed. Simple way to duplicate without having to use software, automation, etc.

But, if you're saying there's easier/better ways to get the kind of redundancy I'm talking about, then I'm all ears. I don't have to use RAID. I'm fine not using it at all. I just wanted to avoid having to manually duplicate files to another drive all the time. So automation software could do that of course.

So by all means, recommend me something. I'd love some more options, especially free ones (RAID is essentially free since I have the drives, and I have RAID capability). So some free backup/automation software would be great. I'm totally open to it. I'm using Win 7 on my server (as it doubles as a HTPC when I take it places).

Thanks!

Very best, :)
 

Red Squirrel

No Lifer
May 24, 2003
70,347
13,673
126
www.anyf.ca
First time I've heard of this. I fell in love with Linux MD RAID (software RAID) and I can't see why there would be issues with that. I'd recommend it if hardware cards give you trouble. I run quite a few VMs on my RAID5 (3x 1TB drives) with no performance issues.
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
So by all means, recommend me something. I'd love some more options, especially free ones (RAID is essentially free since I have the drives, and I have RAID capability). So some free backup/automation software would be great. I'm totally open to it. I'm using Win 7 on my server (as it doubles as a HTPC when I take it places).

http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp
(Disclaimer: This is an rsync implementation for Windows. I have never used it myself. I use Linux, so I use the classic rsync)

Your server contains your ISOs. You have a separate machine containing your backups, yes? DeltaCopy seems to be rather straightforward in that case. Install and make a schedule, then forget about it (you might have to do that for each HD pair [server + backup] you add to your collection)

This has several good implications.
1) You don't have to do everything twice
2) You don't have to finish filling up a HD in one sitting; rsync "syncs" the backup with the master (your server) no matter what changes you made (new ISO, deleted some corrupted ISOs, etc.) on every scheduled backup (which you set)
3) Rsync is fast. It only sends "deltas" through the network, instead of copying everything.

If you don't have a separate backup machine (not ideal, but what are we going to do), you can use free backup software like Fbackup for mirroring from one drive to another in the same machine. There are a lot of others; I'm sorry I can't recommend a clear leader among free backup tools for your purposes. They are mostly all the same, and since I work in a server environment, the tools I know are a bit different. If a free one you find says "mirroring" and "scheduled backups", good. If it says "deltas" or "rsync", even better.
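
As an aside, one more free option for the same drive-to-drive mirror idea: robocopy, which ships with Windows 7. A sketch only; the drive letters, folder names, and schedule below are hypothetical.

```shell
:: Mirror D:\ISOs onto the backup drive. /MIR keeps an exact mirror
:: (copies changes, deletes files removed from the source);
:: /R:1 /W:1 keep a flaky file from stalling the run with long retries.
robocopy D:\ISOs E:\ISOs /MIR /R:1 /W:1

:: Schedule it nightly at 03:00 with Task Scheduler:
schtasks /Create /SC DAILY /ST 03:00 /TN "ISO mirror" /TR "robocopy D:\ISOs E:\ISOs /MIR /R:1 /W:1"
```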

Software like Fbackup has almost the same advantages as DeltaCopy above, but not quite:
1) You don't have to do everything twice
2) You don't have to finish filling up a HD in one sitting; in mirror mode, it will pretty much act like rsync (you probably have to set it up as such in the options - the instructions seem pretty clear and straightforward).
3) If it does not use "deltas" like rsync, it will be slow. But you won't care anyway, because you aren't watching it work during scheduled backups.

Now that I think of it, you might be better served by free backup/mirroring tools that are scheduled, instead of an rsync implementation like DeltaCopy; I am bewildered why it is a purely client-server tool, when rsync itself is easily used for local/spare disks.
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
First time I've heard of this. I fell in love with Linux MD RAID (software RAID) and I can't see why there would be issues with that. I'd recommend it if hardware cards give you trouble. I run quite a few VMs on my RAID5 (3x 1TB drives) with no performance issues.
Performance issues have mostly been solved. The trouble with RAID, whether hardware or software, is not really RAID itself but the fact that HDDs are so dense but very slow.

So what we have is this:
1.) 3 x 1TB (as per your example) in a RAID5
2.) Your total capacity is 2TB
3.) Upon loss of 1 drive, the array lives on (the point of RAID - uptime, not backup)
4.) Performance suffers until that drive is rebuilt, because the data lost in the drive needs to be calculated from the distributed parity upon each read to the array.
5.) Say the three drives are almost all full containing about 750GB, parity and data.
6.) Rebuilding 750GB is a lot of work. This takes a long time, several hours.
7.) Let's say you have a hot spare in your array to simplify rebuilding. Great. Once a failure is detected, the array drops the dead drive, "activates" the hot spare and starts the rebuild.
8.) During a rebuild, your HDDs are at almost 100% throughput for the entire time. And since your array is still up, that means whatever your servers are serving, they are still doing it as well.
9.) Considering that your HDDs are most likely the same age (you bought them all at the same time when you decided to build an array, common case), probably came from the same manufacturer, were bought at the same store, perhaps even came from the same lot at the factory, and have been exposed to the same levels of stress as each other (obvious, since they are in a RAID5 array), it is not statistically improbable that one of your other HDDs will fail as well.
10.) Being at sustained max or near-max throughput for several hours does not help you, given #9.
11.) If another drive does fail during rebuilding, goodbye to all data.

Lessons:
-RAID is still somewhat useful for uptime purposes, if you consider building a huge array.
-RAID is still useful for performance (RAID 1 mirroring for databases that like it, RAID 0 for the much-sought-for striping-induced performance benefit that gamers often desire with their raptor drives)
-RAID is still not a good backup solution.
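
The rebuild-window risk in points 5-11 can be put into rough numbers. A back-of-the-envelope sketch; every figure here (rebuild rate, failure rate) is an illustrative assumption, not a measurement:

```python
# Rebuild-window risk, per the RAID5 example above (3x 1TB, ~750GB used).
REBUILD_GB = 750        # data to reconstruct, per the example
REBUILD_MBPS = 60       # assumed sustained rebuild rate while still serving I/O
AFR = 0.05              # assumed 5% annual failure rate per surviving drive
SURVIVORS = 2           # drives that must stay alive for the rebuild to finish

rebuild_hours = REBUILD_GB * 1024 / REBUILD_MBPS / 3600

# Crude exposure estimate: chance that any survivor fails inside the window,
# treating failures as independent and uniform over the year (point 9 argues
# the real risk is higher, since the drives share age, lot, and stress).
p_one = AFR * rebuild_hours / (365 * 24)
p_any = 1 - (1 - p_one) ** SURVIVORS

print(f"rebuild takes ~{rebuild_hours:.1f} hours")
print(f"naive chance of a second failure during rebuild: {p_any:.4%}")
```

Even the naive estimate shows a multi-hour window of full-throttle disk activity; the correlated-failure argument in point 9 is why the real-world risk is worse than this arithmetic suggests.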
 

MalVeauX

Senior member
Dec 19, 2008
653
176
116
Heya,

Fbackup looks perfect for me. My ISO collection on my server feeds out the data to my HTPC and my gaming machine (mp3's too, stuff like that). I don't do backups on my other machines. I just wanted to make my ISO collection, since it grew so large, a little safer against hard drive fails so that I don't have to waste days reloading it all in that event.

So my data and the backup of that data (or just its redundancy) is in the same machine.

Fbackup can handle that. Mirroring is perfect. So I'll see how it works out.

Thanks for the suggestion, this will let me use 2TB drives mirrored without using RAID, so I don't have to deal with issues, drivers, etc. And it's free. Good deal.

Thanks again. :)

Very best,
 

Burner27

Diamond Member
Jul 18, 2001
4,452
50
101
I have 4 x 2TB Seagates in a RAID 5 array on a Silicon Image 3114 card and have had no issues.
 

Mark R

Diamond Member
Oct 9, 1999
8,513
16
81
Heya,

Ok, so does anyone know why this is? Seems silly to have these huge drives out and we can't make redundancy with the best known method (RAID). That's way too much data in one place to just crash and burn on someone.

Very best,

There have been a lot of problems reported, but the cause doesn't appear clear.

It's probably something related to weak-sector handling on the drives. When a desktop drive hits a weak or bad sector, it will do everything it can to get the data off, taking up to 1-2 minutes. Meanwhile the drive is frozen. A desktop PC will simply freeze and wait for the data to be read, or for the drive to report the sector as bad.

In a RAID system, when the drive goes unresponsive, the RAID card or driver often treats that as total drive failure, drops the drive from the RAID and switches to degraded mode.

To get around this, 'RAID edition' drives will give a weak or bad sector 7 seconds and then give up, reporting the sector as bad. When the RAID controller gets the 'bad sector' message, it recovers the data from the other drives, makes a note of the bad sector, and moves the data to a new sector.

This function has been called, variously, time-limited error recovery (TLER), error recovery control (ERC), and others. Some drives, like the older WD Green Power drives, could have this function enabled (even on non-RAID edition drives); however, the manufacturers have removed it from the most recent 1.5 and 2 TB non-RAID drives.
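
For what it's worth, on drives that still expose this function, smartctl (from smartmontools) can query or set the ERC timeout. A sketch only: the device path is a placeholder, many desktop drives will simply reject the set command, and on most drives the setting does not survive a power cycle.

```shell
# Query the drive's current error-recovery-control timeouts.
smartctl -l scterc /dev/sda

# Set read/write recovery limits to 7.0 seconds.
# The value is in units of 100 ms, so 70 = 7 seconds.
smartctl -l scterc,70,70 /dev/sda
```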

RAID edition drives also have rotational vibration compensation, which may be relevant. When a drive seeks, the vibration it causes can affect the seeks of a nearby drive in the same chassis. When you have many drives together, many of which are seeking in sync, the vibrations can seriously degrade seek accuracy. RAID edition drives include additional sensors and firmware to correct for external vibration, so, theoretically at least, they should be less prone to mis-read/mis-written sectors than basic drives. I suspect, though, that the main issue would be a slight drop in performance, and that the reliability effects would be negligible.

I will echo the statements about RAID not being backup. It's about keeping you running in the event of hardware failure, not about protecting data. RAID doesn't protect your data against faulty RAM, faulty SATA controllers, or PEBKAC errors like rm -rf /.
 

Red Squirrel

No Lifer
May 24, 2003
70,347
13,673
126
www.anyf.ca
Performance issues have mostly been solved. The trouble with RAID, whether hardware or software, is not really RAID itself but the fact that HDDs are so dense but very slow.

So what we have is this:
1.) 3 x 1TB (as per your example) in a RAID5
2.) Your total capacity is 2TB
3.) Upon loss of 1 drive, the array lives on (the point of RAID - uptime, not backup)
4.) Performance suffers until that drive is rebuilt, because the data lost in the drive needs to be calculated from the distributed parity upon each read to the array.
5.) Say the three drives are almost all full containing about 750GB, parity and data.
6.) Rebuilding 750GB is a lot of work. This takes a long time, several hours.
7.) Let's say you have a hot spare in your array to simplify rebuilding. Great. Once a failure is detected, the array drops the dead drive, "activates" the hot spare and starts the rebuild.
8.) During a rebuild, your HDDs are at almost 100% throughput for the entire time. And since your array is still up, that means whatever your servers are serving, they are still doing it as well.
9.) Considering that your HDDs are most likely the same age (you bought them all at the same time when you decided to build an array, common case), probably came from the same manufacturer, were bought at the same store, perhaps even came from the same lot at the factory, and have been exposed to the same levels of stress as each other (obvious, since they are in a RAID5 array), it is not statistically improbable that one of your other HDDs will fail as well.
10.) Being at sustained max or near-max throughput for several hours does not help you, given #9.
11.) If another drive does fail during rebuilding, goodbye to all data.

Lessons:
-RAID is still somewhat useful for uptime purposes, if you consider building a huge array.
-RAID is still useful for performance (RAID 1 mirroring for databases that like it, RAID 0 for the much-sought-for striping-induced performance benefit that gamers often desire with their raptor drives)
-RAID is still not a good backup solution.

Ah so it's only really a performance issue when in "fail" mode or rebuilding, yeah that makes sense, more data to deal with. For a healthy raid then it's no issue right?
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
Some drives, like the older WD Green Power drives, could have this function enabled (even on non-RAID edition drives); however, the manufacturers have removed it from the most recent 1.5 and 2 TB non-RAID drives.
Without TLER, such a drive could definitely drop from an array. I can't blame the manufacturers for doing this, though. They probably want to move more of their RAID/enterprise-class units, and so locked "desktop" drives out of this function.

Ah so it's only really a performance issue when in "fail" mode or rebuilding, yeah that makes sense, more data to deal with. For a healthy raid then it's no issue right?
I am not sure what performance issue you are after, but the short and sweet of it is that there are no "performance" issues at all, aside from any hit you willingly take when you choose a RAID level (for example, you know you are getting a sizeable performance bump when striping, but choosing a RAID level with double parity will decrease performance).

Particularly in software RAID, the sheer power of modern processors has made the overhead (computing parity bits) mostly a moot point, so the performance penalty associated with single- or dual-parity RAID levels has been greatly reduced over the past several years.
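
For the curious, the parity math in question is just XOR, which is why modern CPUs shrug it off. A minimal Python sketch of RAID5-style parity and rebuild (the data blocks are toy placeholders):

```python
# Parity is the XOR of the data blocks, so any single lost block
# can be recomputed from the surviving blocks plus the parity.
def xor_blocks(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

d0 = b"hello world 1234"   # toy 16-byte data block
d1 = b"dvd iso payload!"   # toy 16-byte data block
parity = xor_blocks(d0, d1)

# "Lose" d1: rebuild it from the surviving block and the parity.
rebuilt = xor_blocks(d0, parity)
assert rebuilt == d1
print("rebuilt block matches the lost one")
```

The computation is trivial per block; the expensive part of a real rebuild is reading every surviving block off slow, dense disks, which is the bottleneck discussed above.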
 

Emulex

Diamond Member
Jan 28, 2001
9,759
1
71
RAID5 requires a double read/write (RAID6 too); without a battery-backed (write-back) cache for the RAID you will see a lot of slowdown. RAID is definitely meant for 100% uptime (no resets, no crashes, no power outages).

Otherwise, use Windows Home Server; it just handles your storage in a nifty JBOD kind of way.
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
Yes, I should have clarified that. There's already a hit you accept when you choose a RAID level with parity, as I mentioned when I said "aside from any hit you willingly take when you choose a RAID level..."

Since he was using software RAID and was asking about performance issues, I assumed he was talking about another performance hit due to software RAID vs a hardware RAID. As far as I know, that performance hit has been gone for a long time, thanks to the vast improvement in modern processors.

But that in no way means the read-modify-write sequences for data+parity have been cured, so the hit you take there remains: the disks still do twice the work. It's only the computation of parity bits (which used to be much faster on RAID cards than in software) that has been "cured", thanks to the power of the processors we have today.
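
A small sketch of that read-modify-write sequence, using toy integer "blocks": updating one data block costs four disk operations, because the new parity is derived as old_parity XOR old_data XOR new_data.

```python
# RAID5 small-write penalty: one logical write becomes four disk I/Os.
def small_write(old_data: int, new_data: int, old_parity: int):
    ops = []
    ops.append("read old data")     # I/O 1
    ops.append("read old parity")   # I/O 2
    # XOR out the old data's contribution, XOR in the new data's.
    new_parity = old_parity ^ old_data ^ new_data
    ops.append("write new data")    # I/O 3
    ops.append("write new parity")  # I/O 4
    return ops, new_parity

ops, new_parity = small_write(old_data=0b1010, new_data=0b0110,
                              old_parity=0b1100)
print(len(ops), "disk I/Os per small write")   # 4
```

The XOR itself is the cheap part modern CPUs handle for free; the four disk I/Os are the part no processor can cure, which is why a battery-backed write-back cache helps so much.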
 

lothar

Diamond Member
Jan 5, 2000
6,674
7
76
Yeah, I'm not sure why people go through the process of RAIDing drives for backup when they can use a software solution like Acronis and others.
 

MalVeauX

Senior member
Dec 19, 2008
653
176
116
Why do you want to go with the green drives?

Heya,

I've listened to green/low-power drives next to 7200rpm drives of the same capacity to get an idea of how they sound (noise level). Also temperature, and 24/7 wattage. While the difference between 4~7 watts and 10~17 watts really is insignificant to a lot of people, when you consider 24/7 use all year long and multiply it by many drives, you do end up with a larger number. And while it hardly adds up to anything at all (less than $1 on my electric bill, actually more like $0.25 a month), if I don't need the extra power, why use it for no reason? Note that my gigabit network does not allow a 7200rpm drive to work any faster than a 5400 or 5900rpm drive. So why spend more on faster drives? Faster drives make more noise, make more heat, cost more, and don't do anything 'faster' over my network than a green drive does.

I've really been enjoying Fbackup. It works like a charm. I've certainly seen the light! Thanks again guys.

Very best, :)
 

Lead Butthead

Senior member
Oct 5, 2009
250
0
71
A lot of the problems with ATA RAID are actually caused by the error recovery policy of ATA in general. Most RAID stacks (fake or "hardware") have their roots in SCSI RAID.
With SCSI drives, the drive would typically have its mode page set to kick back a command with a check condition/media error whenever it hits one, and leave the error recovery to the RAID stack (which, depending on the RAID level, could be as simple as retrying the command or as involved as reconstructing the data; policies vary from vendor to vendor).
With ATA drives, the drive goes off in its corner and does "things" to attempt to recover the data, leaving the RAID stack hanging and wondering whether the drive is just taking its sweet time to recover from media error(s) or has actually choked and died a horrible death.
This behavior of ATA drives makes error recovery in the RAID stack problematic and very painful in general, and that usually translates to less-than-stellar reliability for ATA RAID.
 

Red Squirrel

No Lifer
May 24, 2003
70,347
13,673
126
www.anyf.ca
Heya,

I've listened to green/low-power drives next to 7200rpm drives of the same capacity to get an idea of how they sound (noise level). Also temperature, and 24/7 wattage. While the difference between 4~7 watts and 10~17 watts really is insignificant to a lot of people, when you consider 24/7 use all year long and multiply it by many drives, you do end up with a larger number. And while it hardly adds up to anything at all (less than $1 on my electric bill, actually more like $0.25 a month), if I don't need the extra power, why use it for no reason? Note that my gigabit network does not allow a 7200rpm drive to work any faster than a 5400 or 5900rpm drive. So why spend more on faster drives? Faster drives make more noise, make more heat, cost more, and don't do anything 'faster' over my network than a green drive does.

I've really been enjoying Fbackup. It works like a charm. I've certainly seen the light! Thanks again guys.

Very best, :)

I always thought HDD speed was the bottleneck of a gigabit network. Though depending on your application I see what you mean; in most cases the speed difference isn't really noticeable outside of benchmarks. I transfer a lot of big files, but I don't sit there with a stopwatch thinking "if I had an iSCSI SAN this would take less time".
 

MalVeauX

Senior member
Dec 19, 2008
653
176
116
I always thought HDD speed was the bottleneck of a gigabit network. Though depending on your application I see what you mean; in most cases the speed difference isn't really noticeable outside of benchmarks. I transfer a lot of big files, but I don't sit there with a stopwatch thinking "if I had an iSCSI SAN this would take less time".

Heya,

Gigabit 'theoretically' can do 1.0Gbps (basically 125 megabytes per second maximum rated throughput). Note that some setups can sustain more than that, from RAID0 arrays to SSDs on a single channel. But if you set one up and look at its real-world throughput, I'd be rather surprised if you saw greater than 60-ish megabytes per second sustained on gigabit networking, even over 3 feet of the appropriate Cat cable. Just because it's supposed to be able to do 1.0Gbps doesn't mean it does (just like your HDD at 3.0Gbps never sees that maximum potential number; not even our current SSDs saturate that channel yet, but they will of course, hence 6.0Gbps SATA on the horizon). More realistically, you're going to see an initial burst that is high, then the transfer rate will dwindle down through the 70s, 60s, and 50s; if you have good equipment and all is well, staying in the upper 40s and 50s means your gigabit network is functioning appropriately. This is still worlds faster than the previous 10/100 speeds.

The HDD becomes the bottleneck on Gigabit if you're transferring data to several clients. If it's one client, the HDD is way beyond what Gigabit can do, and so the network is the bottleneck. But if you were transferring several sets of data from that HDD to 10 different clients, you would quickly see the HDD become the lagging component there and the Gigabit network would be waiting on it.
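
For reference, the raw arithmetic behind the 1.0Gbps ceiling. The overhead figure is an assumption (Ethernet + IP + TCP framing, no jumbo frames); real sustained transfers land lower still once the disk, CPU, and file-sharing protocol get involved, which is where the 40-60 MB/s figures above come from.

```python
# Line-rate ceiling for gigabit Ethernet, before any real-world limits.
LINK_BPS = 1_000_000_000                    # 1.0 Gbps line rate

raw_mbytes = LINK_BPS / 8 / 1e6             # 125 MB/s before any overhead
OVERHEAD = 0.06                             # assumed ~6% protocol overhead
practical = raw_mbytes * (1 - OVERHEAD)     # best-case payload throughput

print(f"line rate:  {raw_mbytes:.0f} MB/s")
print(f"practical:  ~{practical:.0f} MB/s before disk/CPU/protocol limits")
```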

Very best, :)
 