RAID1 failure

Cat

Golden Member
Oct 10, 1999
1,059
0
0
I have two Seagate 160GB drives in RAID1 on the onboard SATA RAID controller of an Intel D875PBZ. Last night, one of the drives experienced a SMART event (I can't look up which one in particular: http://www.intel.com/support/chipsets/sb/cs-015002.htm ).

Rebooting multiple times resulted in CHKDSK running each time, and the file structure going from bad to worse. Now I can't even access the drive: Windows says the structure is corrupt, and I'm currently using a recovery utility to gather the bits and pieces that I can.

So, apparently mirroring didn't really help me here. Should I have just unplugged the bad drive immediately?
 

Tarrant64

Diamond Member
Sep 20, 2004
3,203
0
76
Good question. I've kinda wondered about it myself. What are you supposed to do when using RAID1 and one of the drives fails? Are you able to reboot and have it use just the one drive (since the other drive should be an exact mirror of the first)? Hmm...
 

DaFinn

Diamond Member
Jan 24, 2002
4,725
0
0
In RAID 1, both drives contain the exact same data. It should boot up just fine with the other drive corrupted. Just go to the menu, rebuild the RAID array, and you're good to go.
Or, swap the drive for a new one and rebuild.

In your case, it looks like your filesystem is corrupted or some such thing... in that case RAID1 won't do you any good.
 

aceO07

Diamond Member
Nov 6, 2000
4,491
0
76
Interesting. I was actually just trying to configure a Dell computer that has RAID1 and had some questions.

So, according to DaFinn's response, it seems like if the filesystem is bad then both drives will go bad. Is RAID1 just protection against physical errors? Also, is it possible to take out 1 of the 2 drives in RAID1 and use it in another system without RAID?
 

beatle

Diamond Member
Apr 2, 2001
5,661
5
81
I didn't know you were running RAID 1. Speak in IRC more often.

What happens if you pull the bad drive and boot up with the one good one?
 

Cat

Golden Member
Oct 10, 1999
1,059
0
0
I'm going to try that when I get home, but CHKDSK has already messed up the filesystem, I think. I probably should have immediately pulled the bad drive.
 

VictorLazlo

Senior member
Jul 23, 2003
996
0
0
RAID pisses me off. Yeah, mirroring makes you more fault tolerant "in theory", but it also introduces a whole host of new ways that faults can happen.
 

jonnyGURU

Moderator, Power Supplies
Moderator
Oct 30, 1999
11,815
104
106
You shouldn't have attempted to reboot the computer so many times.

I use SATA RAID1 in all of my Voicemail servers. If I EVER get a drive error at the BIOS level, I immediately disable the drive in question.

The bad thing about SATA RAID1 is that it's not true RAID1: the two drives aren't always written simultaneously. I've run into issues on a particular AOpen board, which I've vowed NEVER to use again because it doesn't have a SATA RAID BIOS and can only rebuild once it's in the OS. People do large installs, go to reboot, and then find their install in a partial state or with some sort of data corruption, because not all of the data had been copied over.

The only time SATA RAID1 has really helped me out is when a drive fails IN WINDOWS (which happens more often than any other failure because the Voicemail servers run 24/7 and are always in Windows) and I get a pop-up that tells me one of my drives is dead. When this happens, I shut down, double check the cables, and fire it up. If I get a similar error at the BIOS level (assuming the board has a SATA RAID BIOS, unlike the AOpen I had used), I shut her down and yank the drive.
 

uOpt

Golden Member
Oct 19, 2004
1,628
0
0
With data-preserving RAID (1 and 5 and a few others), what is supposed to happen is that the disk tells the controller it is failing, the controller tells the OS, and the OS reconfigures the RAID, marking it to run without this particular drive.

If you then reboot, the bad drive is already marked as out of the array and you run clean.
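
The sequence described above (the drive reports a failure, the report reaches the OS, the array is marked to run without that drive, and the mark survives a reboot) can be sketched as a toy model. This is only an illustration in Python; the names `Mirror`, `report_failure`, and `reboot` are made up for this sketch and do not correspond to any real RAID driver or controller API.

```python
class Mirror:
    """Toy model of a two-member RAID 1 array with persistent failure marks."""

    def __init__(self, members):
        self.members = list(members)
        # Stands in for the array metadata that survives a reboot.
        self.failed = set()

    def report_failure(self, member):
        # Drive -> controller -> OS: record that this member is out of the array.
        if member not in self.members:
            raise ValueError(f"unknown member: {member}")
        self.failed.add(member)

    def active_members(self):
        return [m for m in self.members if m not in self.failed]

    def reboot(self):
        # The failure mark is persistent, so the array comes up degraded
        # instead of trusting the bad drive again.
        return self.active_members()


array = Mirror(["disk0", "disk1"])
array.report_failure("disk1")
print(array.reboot())  # ['disk0']
```

The point of the sketch is the persistent `failed` set: if the mark is never written (for example because the system hangs first), a reboot would bring the bad drive back into the array.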

Now, Windoze sucks (it really does, sorry), obviously it didn't do anything well here and I won't speculate about the reasons.

However, I'm the one to talk since my Linux slightly screwed up, too, although in a way that didn't cause me to lose data, only wasted time for recovery.

One problem I had is that one of the disks in the array failed in a way that made the kernel hang. That means the kernel didn't have an opportunity to mark the RAID array to run without that drive, which in turn means the fact that this drive was failing never got recorded. When I then rebooted, the drive pretended to play nice during startup and caused a small mess with incorrect markings of the RAID, due to bugs in one set of Linux RAID utilities (raidstart and friends), and I had to use mdadm to sort it out.

Now, all that is for software RAID.

If you do hardware RAID, the controller is doing all this behind the back of the OS. You see, this is fairly complicated software and I would never trust a stupid firmware to do everything right. One of the major reasons why I don't use hardware RAID.

If the controller is relying on reporting to a human via the BIOS to disable the drive, well, that's not the right thing to do, IMHO.
 

Cat

Golden Member
Oct 10, 1999
1,059
0
0
Thanks for the posts, Jonny and Martin.

Where can I find more information on SATA RAID1 not being 'true?'
 

jonnyGURU

Moderator, Power Supplies
Moderator
Oct 30, 1999
11,815
104
106
I can give you links to sites that do nothing more than explain RAID. Like Martin said, on-board SATA hardware RAID is all done in firmware, which is to say software. Any controller card that has its own CPU and memory is a lot more reliable than what you and I are using.
 

uOpt

Golden Member
Oct 19, 2004
1,628
0
0
Originally posted by: Cat
Thanks for the posts, Jonny and Martin.

Where can I find more information on SATA RAID1 not being 'true?'

I don't know what this is supposed to mean. I think somebody with half-knowledge misled you.

RAID-5 is way more complicated than 0 and 1, but 0 and 1 are still RAID.

It is a misnomer to say "SATA-RAID". There is nothing special about SATA that would make it more or less suitable for RAID. It is just coincidence that many vendors started providing RAID at the same time they introduced SATA. The reason is probably that with SATA you need more channels anyway, and if you have more channels RAID makes more sense.

However, there are plenty of PATA, SCSI, HIPPI and whatnot RAID controllers, and of course you can do software RAID.
 

Cat

Golden Member
Oct 10, 1999
1,059
0
0
I wouldn't call it a misnomer, it's just a qualifier. I think what he meant was that there are no industrial strength SATA RAID controllers available. I don't know how true that is, however.
 

jonnyGURU

Moderator, Power Supplies
Moderator
Oct 30, 1999
11,815
104
106
Originally posted by: MartinCracauer
I don't know what this is supposed to mean. I think somebody with half-knowledge misled you.

Haha... I'm pretty sure he got that from taking what I said out of context.

I said that the SATA RAID wasn't "true" in the sense of actually writing to both drives simultaneously the way SCSI does, or even SATA controller cards that can perform mirror RAID at a "true" hardware level (constantly writing the data to both drives simultaneously even at the BIOS level.)

I had said that I've had many problems where the data from one SATA drive is not written to its mirror at the precise moment it was written to the source, especially on a particular AOpen board that I've used that performs NO MIRRORING until after the OS is booted and the RAID service is running. If changes are made during a reboot, the drive has to rebuild once it gets into Windows. If the mirroring is not done by the next reboot (which happens during Windows installs or during Windows Updates that require a lot of reboots, etc.), I've been left with a mirror that is not a true mirror of the source.

This is why I considered that RAID controller not a "true" RAID controller. ;) It's more like a SATA controller with a software program that copies the data over to a second drive once Windows boots up, and the copy may not even be bootable! :)

Now, it'd be pretty bad luck to have a drive fail during the install and update process leaving the PC high and dry with one dead drive and one non-functioning drive, but with this particular AOpen board, when Windows 2000 is installed, it installs onto SATA1, not SATA0, BY DEFAULT. Sort of b-ass-ackwards. Once the boot sector is copied over and the SATA controller sees that SATA0 is now bootable, it starts to boot off of it. If the mirror is not a complete image of the source, I end up with corrupt data. Can you say SUCK?!?
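
The stale-mirror problem described above (a copy that has not finished by the next reboot) can be illustrated with a small toy model. This is a sketch in Python under the assumption that writes reach the source immediately and reach the mirror only during a later sync pass; `LazyMirror` and its methods are hypothetical names, not how any particular controller or driver is implemented.

```python
class LazyMirror:
    """Toy model: writes hit the source first and reach the mirror only on sync."""

    def __init__(self):
        self.source = {}
        self.mirror = {}
        self.pending = []  # blocks written to source but not yet copied

    def write(self, block, data):
        self.source[block] = data
        self.pending.append(block)

    def sync(self, max_blocks):
        # Copy at most max_blocks pending blocks, like a rebuild that gets
        # cut short by the next reboot.
        for block in self.pending[:max_blocks]:
            self.mirror[block] = self.source[block]
        self.pending = self.pending[max_blocks:]

    def in_sync(self):
        return self.source == self.mirror


m = LazyMirror()
for i in range(4):
    m.write(i, f"data{i}")
m.sync(max_blocks=2)   # reboot arrives before the copy finishes
print(m.in_sync())     # False
m.sync(max_blocks=2)   # the copy is allowed to complete
print(m.in_sync())     # True
```

If the system starts booting from the mirror while `in_sync()` is still false, it sees only part of what was written, which matches the partial installs described in the post.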
 

uOpt

Golden Member
Oct 19, 2004
1,628
0
0
Originally posted by: jonnyGURU
I said that the SATA RAID wasn't "true" in the sense of actually writing to both drives simultaneously the way SCSI does, or even SATA controller cards that can perform mirror RAID at a "true" hardware level (constantly writing the data to both drives simultaneously even at the BIOS level.)

I'm sorry, that is not true as such (but, see below, I agree with your general point).

On my mainboard (see sig) when doing software-RAID I know for a fact that data is written to the three connected SATA disks simultaneously.

Now, technically I cannot rule out that it stops doing that if you switch to hardware RAID, but I wouldn't know why it would. The hardware is obviously capable of writing to all four SATA ports on that mainboard simultaneously.

But in general I certainly agree that much of the el cheapo SATA RAID stuff offered these days is more of a toy and shouldn't be relied on for data safety, in particular not in combination with Windows, which hides all the details and will end up making you run with ineffective RAID, making you lose data where you shouldn't.

The best part is that with ineffective RAID-1 you don't just run the same risk as with one disk; you roughly double it, because you now have two drives and either of them can take the array down :)
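
The "doubling" can be checked with a little arithmetic. Assuming the two drives fail independently, each with probability p over some period (an assumption made here for illustration, not a measured figure), the chance that at least one fails is 1 - (1 - p)^2 = 2p - p^2, which is just under 2p when p is small. A minimal Python sketch, with an invented value for p:

```python
def any_drive_fails(p, drives=2):
    """Probability that at least one of `drives` independent drives fails."""
    return 1 - (1 - p) ** drives


p = 0.05  # assumed per-drive failure probability, purely illustrative
print(round(any_drive_fails(p), 4))  # 0.0975, just under 2 * p
```

This only applies when either drive failing takes the array down; a mirror that works as intended needs both drives to fail, which under the same assumption happens with probability p^2.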
 

jonnyGURU

Moderator, Power Supplies
Moderator
Oct 30, 1999
11,815
104
106
Ah! You're too used to Linux. YOU'RE using LOGIC! :D

You're assuming that because the hardware is capable of writing to multiple drives simultaneously, it therefore must. I guarantee you in spades that this is not the case. Why? I don't know, because in theory it obviously can. But with certain (not all. I don't use blanket statements) motherboards with on-board RAID, the drives are NOT written to simultaneously. ;)

I HAVE used SATA RAID controllers like the Promise FastTrack SX150 that DO write to multiple drives simultaneously SO WELL that I can pull one drive out of the array while still hot, pop it back in, the OS doesn't miss a beat, and then pull the other drive out. And this is without ANY additional software at all! Having EXACT copies on source and mirror at all times is a beautiful thing.

This is why I used the term "true." That Promise card, I would say, now that's "true RAID." Sure, RAID just means Redundant Array of Inexpensive Disks, so by name that means ANY arrangement of redundant drives, no matter how the mirror is created, is a "true" RAID. ;)
 

bhanson

Golden Member
Jan 16, 2004
1,749
0
76
Originally posted by: aceO07
Interesting. I was actually just trying to configure a Dell computer that has RAID1 and had some questions.

So, according to DaFinn's response, it seems like if the filesystem is bad then both drives will go bad. Is RAID1 just protection against physical errors? Also, is it possible to take out 1 of the 2 drives in RAID1 and use it in another system without RAID?


RAID 1 only protects you, in theory, from a single drive failure. RAID is never a substitute for backups: if you screw something up, it'll be messed up on both drives.
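
A toy model makes the difference between a mirror and a backup concrete. This is a minimal Python sketch, assuming a mirror that applies every operation to both drives; `MirroredVolume` is a hypothetical name and the dictionaries merely stand in for disks.

```python
import copy


class MirroredVolume:
    """Toy RAID 1 volume: every operation is applied to both drives."""

    def __init__(self):
        self.drive_a = {}
        self.drive_b = {}

    def write(self, name, data):
        self.drive_a[name] = data
        self.drive_b[name] = data

    def delete(self, name):
        # A mistake is mirrored just as faithfully as good data.
        self.drive_a.pop(name, None)
        self.drive_b.pop(name, None)


vol = MirroredVolume()
vol.write("report.doc", "important")
backup = copy.deepcopy(vol.drive_a)  # separate point-in-time copy
vol.delete("report.doc")             # the screw-up

print("report.doc" in vol.drive_a, "report.doc" in vol.drive_b)  # False False
print("report.doc" in backup)                                    # True
```

The mirror only helps when one drive dies while the other still holds good data; the separate copy is what survives a bad delete or a corrupted filesystem.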
 

uOpt

Golden Member
Oct 19, 2004
1,628
0
0
Originally posted by: jonnyGURU
Ah! You're too used to Linux. YOU'RE using LOGIC! :D

You're assuming that because the hardware is capable of writing to multiple drives simultaneously, it therefore must. I guarantee you in spades that this is not the case. Why? I don't know, because in theory it obviously can. But with certain (not all. I don't use blanket statements) motherboards with on-board RAID, the drives are NOT written to simultaneously. ;)

I HAVE used SATA RAID controllers like the Promise FastTrack SX150 that DO write to multiple drives simultaneously SO WELL that I can pull one drive out of the array while still hot, pop it back in, the OS doesn't miss a beat, and then pull the other drive out. And this is without ANY additional software at all! Having EXACT copies on source and mirror at all times is a beautiful thing.

This is why I used the term "true." That Promise card, I would say, now that's "true RAID." Sure, RAID just means Redundant Array of Inexpensive Disks, so by name that means ANY arrangement of redundant drives, no matter how the mirror is created, is a "true" RAID. ;)

Well, there's a reason why I do software RAID, and the more I talk to people using hardware RAID, the more convinced I am that I'm doing the right thing.

For completeness' sake, I have to insist that the 4 SATA channels on the P4C800-E Deluxe do actually work independently and, at least when not using hardware RAID, do support writing at the same time. The measured performance for reading and writing in software striping (RAID0) leaves no other conclusion.

Unfortunately (*&#^*&#&#&!!!) I had to discover that the same doesn't hold true for the PATA channels. Bollocks. There's always a catch.