RAID-5 Issues on Office Server

ankitaggarwal

Junior Member
Apr 25, 2009
My goal is to hopefully keep my RAID array intact since I'm pretty sure my hard disks did not die. Let me summarize what happened.

I have a Windows 2003 Standard Server running at the office that I've built myself. It has five 250GB hard disks. Disk 0 is the non-RAID OS drive, and disks 1 to 4 are part of a RAID array for all the data that is to be stored on and accessed from it by two other machines in the office.

Windows 2003 failed to get past the loading screen (Starting Windows 2003 Server with the loading bar) after a forced reboot two days ago due to the computer locking up. I left it up for two days and it still hadn't progressed.

I rebooted again and checked the Intel Matrix Storage Manager, which showed the RAID array with a status of "Initialize". I don't believe it should still have said that after two days at the Windows loading screen, assuming the RAID array could reinitialize during that time.

I rebooted again to check my BIOS settings and everything seemed to check out. However, after leaving the BIOS, the Matrix Storage Manager labeled the RAID array as "Degraded". I went into the utility again and one disk (disk 3) was causing a problem. I tried to select this disk when the "rebuild array" prompt popped up but it didn't work, and afterward I accidentally selected the single non-RAID disk, the one with the OS. This wouldn't be a problem except that the array rebuilding was stated to be done through the OS itself, and since I selected my OS disk I can't boot into it anymore.

All of the above being said, I now have disks 2 and 3 labeled as "Offline Member" after disconnecting and reconnecting disk 3's SATA cable in the hope of being able to rebuild onto that disk instead.

I know disks 1, 2, and 4 have their data intact, so if I can remove the flag of "Offline Member" from disk 2, I'd be able to rebuild the array and put all the missing information onto disk 3. Then I can do the necessary reinstalling of the OS and such onto disk 0 and get back up and running.
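For context on why three of the four member disks would be enough: RAID 5 stores an XOR parity block for every stripe, so any single missing block can be recomputed from the others. The snippet below is only a minimal sketch of that idea. The block contents and variable names are made up for illustration, real arrays rotate the parity block across the disks, and this is not how the Intel utility is implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a four-disk RAID 5 set: three data blocks plus one parity block.
# (Hypothetical contents; parity is fixed on one disk here for simplicity.)
block_a = b"ACCT"
block_b = b"2009"
block_c = b"DATA"
parity = xor_blocks([block_a, block_b, block_c])

# One member missing: XOR of the three survivors recovers its block.
rebuilt_c = xor_blocks([block_a, block_b, parity])
print(rebuilt_c == block_c)

# Two members missing: the two survivors only yield block_b XOR block_c,
# which matches neither lost block, so the stripe cannot be rebuilt.
leftover = xor_blocks([block_a, parity])
print(leftover == block_b, leftover == block_c)
```

This is also why the array needs disk 2 back online before anything can be rebuilt onto disk 3: with two members offline there is not enough information left in each stripe.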

Questions are:
1) How can I remove the "Offline Member" flags from disks 2 and 3 and then rebuild the array based off disks 1, 2, and 4 and put the missing data onto disk 3?

2) How can I recover the OS and all the data on disk 0 since it seems it was only subject to a quick, high level format? If it's not possible, how can I remove disk 0 from the RAID array without causing issues, assuming it isn't as easy as going back into the RAID utility and setting disk 0 as a "non-RAID" disk?

3) How do I know when my RAID array is reinitialized after a dirty shutdown/reboot or any other similar event? The utility didn't show a progress % and I don't know how I'd be able to view it in the future so I know not to keep rebooting the computer.
 

RebateMonger

Elite Member
Dec 24, 2005
I've never used the Intel RAID chips, so I can't offer any experience.

But this sounds REALLY messed up. Especially after telling it to rebuild with your OS disk instead of replacing Disk 3 with a brand-new drive.

My advice, assuming you have no backups and that the data has value (opposing concepts), is to seek professional data recovery help, from folks that have experience recovering RAID 5 arrays. This is not the time to LEARN how to do this.
 

ankitaggarwal

Junior Member
Apr 25, 2009
The data isn't very critical and can be reentered within an hour or two. It's funny, too, because I was just about to install backup software but the server got locked up. The Symantec Backup Exec didn't want to install for some complex reasons I don't want to get into now, so I had to seek alternatives to fulfill my needs.

Regardless, I would indeed like to fix this "Offline Member" nonsense and from there I should be good to go. I think replacing the SATA cables for the two drives in question is a good start because I am very confident my drives didn't fail since I've only recently purchased the parts and built this server. Of course, disk 3 might have been a bad drive, but if 1, 2, and 4 are online and running everything should be fine.
 

Laputa

Golden Member
Jan 18, 2000
It's not going to be a pretty scenario after all the improper rebuild attempts. Best is to redo everything for time's sake, since you said it's only going to take an hour or two to reenter the data.
 

RebateMonger

Elite Member
Dec 24, 2005
Originally posted by: ankitaggarwal
The Symantec Backup Exec didn't want to install for some complex reasons I don't want to get into now, so I had to seek alternatives to fulfill my needs.
You can always try the built-in NT Backup software. Recovery of the entire system can take a while, since you generally need to re-install basic Server 2003 and patch it to the same Service Pack level as your backed-up system. But it DOES work. I've restored quite a few servers with NTBackup.

You didn't state how much data you have, giving only the size of the array. 1 and 1.5 Terabyte external hard drives are dirt-cheap nowadays. I suggest you get a couple and implement automatic backups of your entire server.

As I noted earlier, I have zero experience with Intel-chipset RAID chips. Intel has User forums where such things are discussed. You might want to check those out.

Originally posted by: ankitaggarwal
I think replacing the SATA cables for the two drives in question is a good start because I am very confident my drives didn't fail since I've only recently purchased the parts and built this server. Of course, disk 3 might have been a bad drive, but if 1, 2, and 4 are online and running everything should be fine.
You should be able to carefully connect individual drives to another PC and boot to a disk diagnostics program and examine the status of the drives, including SMART status. Just be sure you don't try to "Repair" the drives or do anything that might write to them.
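As one possible way to do a read-only SMART check from another PC or a live CD, the sketch below wraps `smartctl` from the smartmontools package, which is not mentioned in this thread and is assumed to be installed. The `-H` option only reports the drive's overall health and writes nothing to the disk. The device path and the function names are hypothetical, and the marker string is what smartctl is understood to print for ATA drives, so treat the parsing as an assumption to verify against your own output.

```python
import subprocess


def smart_health_command(device):
    """Build a read-only smartctl call; -H only asks for the health verdict."""
    return ["smartctl", "-H", device]


def parse_health(output):
    """Return the overall-health verdict from smartctl -H text, or None if absent."""
    marker = "overall-health self-assessment test result:"
    for line in output.splitlines():
        if marker in line:
            return line.split(marker, 1)[1].strip()
    return None


def check_drive(device):
    """Run smartctl against one drive; the device path is a placeholder such as /dev/sdb."""
    result = subprocess.run(
        smart_health_command(device), capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

Calling `check_drive` once per member disk gives a quick pass/fail view without running any repair or surface-rewrite tool against the array members.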
 

Fallen Kell

Diamond Member
Oct 9, 1999
Originally posted by: RebateMonger
Originally posted by: ankitaggarwal
The Symantec Backup Exec didn't want to install for some complex reasons I don't want to get into now, so I had to seek alternatives to fulfill my needs.
You can always try the built-in NT Backup software. Recovery of the entire system can take a while, since you generally need to re-install basic Server 2003 and patch it to the same Service Pack level as your backed-up system. But it DOES work. I've restored quite a few servers with NTBackup.

You didn't state how much data you have, giving only the size of the array. 1 and 1.5 Terabyte external hard drives are dirt-cheap nowadays. I suggest you get a couple and implement automatic backups of your entire server.

I don't think that will help him in his current situation at all. As he stated, he mistakenly selected the OS disk as the recovery/replacement disk for the failed disk in his RAID 5 array. The chipset then started to rebuild the array onto it, which wipes his install and any data on that disk, including things like the built-in backups.

Maybe after he rebuilds the system that will be an option for the next time it might happen...
 

ankitaggarwal

Junior Member
Apr 25, 2009
Just to reiterate what I said in my OP: when prompted to select the disk onto which to rebuild the array, both disk 0 and disk 3 were selectable, but every time I tried to select disk 3 the prompt just popped up again.

What could this be a symptom of?