CentOS RAID5 rebuild

Xpage

Senior member
Jun 22, 2005
459
15
81
www.riseofkingdoms.com
Our CentOS Linux box at work died. Since it's a university and I visit these forums, I am the default IT guy. However, I know Windows better than Linux and am more of a hardware guy than a software guy. Our server has a RAID 5 array.

In theory, a RAID 5 rebuild should be simple: swap out the damaged hard drive and the RAID controller should rebuild the array.


Originally, when I looked at the computer, it said that the RAID was degraded. After rebooting, it now says that the RAID is rebuilding, even though I have not inserted a new drive (I still need to order it). Any thoughts on what happened?

pic below is the box




Sadly, to complicate things, somebody broke the Linux box so that it won't boot into the GUI. However, I can SSH into it and that works fine, as does logging in via a browser, which is how we use it to run our software.

So my question is: should I replace the hard drive, or will it fix itself, assuming the error was not catastrophic? I haven't done a RAID rescue before.

Also of note: we are getting a new Linux box to replace this 8-year-old machine, so I just need it alive long enough to back up the data. Help is appreciated.
 

manly

Lifer
Jan 25, 2000
13,094
3,861
136
You need to run tw_cli to identify the drive that's rebuilding. 1 TB takes a while to rebuild, but if it's not done by tomorrow, then that drive is probably on its way out. With RAID5, you can't really afford to be patient.

smartctl may be able to get some S.M.A.R.T. info off that drive; it does support 3ware controllers.
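For reference, a rough sketch of what those checks might look like from an SSH session. The controller ID (/c0), port number (1) and device node (/dev/twa0) are assumptions; the first command lists what the box actually has, and older 3ware cards use /dev/twe0 instead. Run it as root.

```shell
#!/bin/sh
# Assumed values; adjust after looking at the output of "tw_cli show".
CONTROLLER=/c0      # assumed controller ID
PORT=1              # assumed port of the suspect drive
DEVICE=/dev/twa0    # assumed device node for a 9000-series card

# Build the smartctl device-type argument for a drive behind a 3ware card.
smart_devtype() {
    printf '3ware,%s\n' "$1"
}

if command -v tw_cli >/dev/null 2>&1; then
    tw_cli show                            # list the controllers in the machine
    tw_cli "$CONTROLLER" show              # units and ports, with status per drive
    tw_cli "$CONTROLLER/p$PORT" show all   # details for one physical drive
else
    echo "tw_cli not installed; skipping controller queries"
fi

if command -v smartctl >/dev/null 2>&1; then
    # Full S.M.A.R.T. report for the drive on the chosen port.
    smartctl -a -d "$(smart_devtype "$PORT")" "$DEVICE"
else
    echo "smartctl not installed; skipping S.M.A.R.T. query"
fi
```

The unit listing from the second tw_cli command is where a degraded or rebuilding state should show up, along with which port is involved.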

Don't universities usually have sysadmins/IT departments? :p
 

Xpage

We are a new dept and also off campus (by only 200 m). I would have thought so too, but the UC system here changed policies and none of the 3 groups we fall under want to touch Linux. Campus also decided to stop handling in-person service requests from university staff (as opposed to students). Brilliant move...

I turned off the PC until I can get a new 10k drive; I read elsewhere that letting it rebuild onto a bad drive is no bueno. The system might be under warranty, so the replacement might be free, though again it's a small custom Linux shop far, far away, with no onsite support.

Thanks for the command info, I will try them out.
 

Red Squirrel

No Lifer
May 24, 2003
70,176
13,576
126
www.anyf.ca
Maybe one drive was configured as a hot spare. That's my guess anyway. Remove whatever drive is showing as failed and see what happens. To be safe, maybe wait until the rebuild is done, in case you pull the wrong drive.
 

Xpage

I was able to find the original receipt and called the company. They weren't much help and couldn't tell me the configuration. The math says it has a hot spare: (4) 1TB drives but only 2TB of space, so RAID 5 = 3 drives, 1 spare.
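The capacity arithmetic can be sketched in a few lines of shell. RAID 5 gives up one member's worth of space to parity, and a hot spare is not a member at all. The function name and its arguments are made up for illustration.

```shell
#!/bin/sh
# usable_tb DRIVES SIZE_TB SPARES
# Usable RAID 5 capacity in TB: (array members - 1) * drive size,
# where array members = total drives - hot spares.
usable_tb() {
    members=$(( $1 - $3 ))
    echo $(( (members - 1) * $2 ))
}

echo "4 x 1TB, no spare:  $(usable_tb 4 1 0) TB usable"
echo "4 x 1TB, one spare: $(usable_tb 4 1 1) TB usable"
```

So four 1TB drives come out to 3 TB with all four in the array, or 2 TB with one held back as a spare. Comparing that against the array size the controller reports is a quick way to tell the two layouts apart.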

The RAID was rebuilding onto the spare when I rebooted. I let it finish, then was able to SSH into the server, go into the RAID controller, and found that physical drive #1 was the problem.

The system was under warranty and I know which drive to pull and send back, so all is well. Thanks for the help.
 

manly

Actually, your original screenshot shows that the array is 3 TB. You definitely want to get the bad drive replaced ASAP.

I'm not sure what the rebuild was about, but I've seen 3ware controllers do weird stuff like that.