MD raid1 rebuild events

skyking · Feb 7, 2010

I leave my home server hooked up to a KVM port, logged in. Every month or so, I'll see some drive errors on the console. They trigger an array rebuild that I can find in daemon.log. I have not checked the disks yet, but the infrequency of the problem leads me to believe the hardware will check out OK. I'm thinking disk controller problem?

Nothinman · Feb 8, 2010

What are the drive errors that get logged?

skyking · Feb 8, 2010

They were up on the console, I'll look around for the errors.

Nothinman · Feb 8, 2010

Depending on the distro they should also be in /var/log/kern.log or /var/log/messages.
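
For pulling the relevant lines out of whichever of those logs exists, something like the following might help. It is only a rough sketch: the regular expression assumes the kernel prefixes these messages with the port name (ata2:, ata2.00: and so on), and the helper names find_ata_lines and scan_logs are made up for illustration.

```python
import re
from pathlib import Path

# Candidate log locations named above; which one exists depends on the distro.
LOG_PATHS = ["/var/log/kern.log", "/var/log/messages"]

# Assumption: ATA messages carry a port prefix such as "ata2:" or "ata2.00:".
ATA_LINE = re.compile(r"\bata(\d+)(?:\.\d+)?:")


def find_ata_lines(text):
    """Return (port_number, line) pairs for every line that mentions an ATA port."""
    hits = []
    for line in text.splitlines():
        match = ATA_LINE.search(line)
        if match:
            hits.append((int(match.group(1)), line.strip()))
    return hits


def scan_logs(paths=LOG_PATHS):
    """Scan whichever candidate logs exist and are readable."""
    results = {}
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            # Usually a permissions problem; reading these logs may need root.
            continue
        results[name] = find_ata_lines(text)
    return results


if __name__ == "__main__":
    for name, hits in scan_logs().items():
        print(f"{name}: {len(hits)} ATA-related lines")
        for port, line in hits[-10:]:
            print(f"  ata{port}: {line}")
```

Grouping by port number makes it easier to see whether the errors always come from the same port, which helps when deciding between a bad disk and a controller problem.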

skyking · Mar 9, 2010

ata2: Status: {DRDY ERR}
Err: {UNC}
Exception EMask 0x0 SAct 0x0 SErr 0x0 action 0x0
BDMA stat 0x24
CMD 25/[buttload of hex] tag 0 DMA524288 in

Lather rinse repeat. I guess I'll run a USB CD-rom diagnostic at it and get an advance RMA, looks like a funky block.

Crusty · Mar 9, 2010

I used to see similar errors with a pair of Samsung Spinpoint 1TB drives on an nForce 4 SATA chipset. It turned out that particular controller was incompatible with those drives. I moved them over to a different board and have had zero problems since, so don't rule that out before you RMA a drive.

Nothinman · Mar 9, 2010

Yea, buggy firmware on either side can cause weird issues.

skyking · Mar 9, 2010

Nothinman said:
Yea, buggy firmware on either side can cause weird issues.

I hope that isn't the case, as this is a really neat little fileserver on a VIA embedded chipset. That may be the whole problem. I know it is incompatible with the 3.0 Gb/s SATA spec, but these WD Green drives have a jumper to set them to 1.5 Gb/s, and that is the reason I went with the WDs. I really don't want to start over or abandon the project, as it only uses 28 watts with two drives running and serves up anything I need.

joetekubi · Mar 9, 2010

One time (in 5 or 6 years), I started getting some disk errors on a software RAID 1. I had to manually "break" the array and run fsck on each disk individually. One physical disk had some bad sectors, and when I restarted the software RAID, the other disk mirrored perfectly. Worked out well for me, but a bit nerve-racking until I saw the magic "UU" via cat /proc/mdstat

HTH,
joe
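
The "UU" check can also be scripted rather than eyeballed. This is a minimal sketch, assuming the usual /proc/mdstat layout in which each array's block-count line ends with a field like [2/2] [UU], where an underscore marks a missing or failed member. The function name degraded_arrays is made up for illustration.

```python
import re

# Status field such as "[2/2] [UU]" or "[2/1] [U_]":
# U = member up, _ = member missing or failed.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY = re.compile(r"^(md\S+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array_name: status_string} for arrays whose status contains '_'."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current is not None:
            if "_" in status.group(3):
                degraded[current] = status.group(3)
            current = None
    return degraded


if __name__ == "__main__":
    with open("/proc/mdstat") as handle:
        bad = degraded_arrays(handle.read())
    if bad:
        for name, status in bad.items():
            print(f"{name} is degraded: [{status}]")
    else:
        print("all arrays show every member up")
```

Run from cron, a check like this would flag a degraded mirror without anyone having to watch the console.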

skyking · Mar 10, 2010

I ran badblocks -sv /dev/sda and sda passed OK.
sdb is borking out with those errors, so I may really have a bad disk on my hands.

Crusty · Mar 12, 2010

Does smartctl show any errors for the drives?

skyking · Mar 12, 2010

I ran
smartctl -a /dev/sda
smartctl -a /dev/sdb

In both cases down in the middle I read

SMART Error Log Version: 1
No Errors Logged

I don't know if I have to run a specific test or if it is grabbing the standard logs from the drive. I'll read up a bit more about that.
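
For what it's worth, smartctl -a only prints what the drive has already recorded; it does not start a test. A self-test is started separately with smartctl -t short or smartctl -t long, and its results are read back with smartctl -l selftest. An empty error log also says nothing about sectors the drive has remapped or is waiting to remap, so the attribute table is worth a look too. Below is a rough sketch that pulls three commonly watched counters out of smartctl -A output. It assumes the standard ten-column ATA attribute table and that the drive reports these attribute names; parse_attributes and read_attributes are made-up helper names.

```python
import subprocess

# Attribute names as smartctl usually prints them for ATA disks.
# Assumption: the drive reports these; not every model does.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def parse_attributes(smartctl_output):
    """Extract the raw values of the watched attributes from 'smartctl -A' text.

    Expected columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
    UPDATED WHEN_FAILED RAW_VALUE
    """
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit():
                values[fields[1]] = int(raw)
    return values


def read_attributes(device):
    """Run 'smartctl -A' on a device (normally needs root) and parse the result."""
    # smartctl's exit status is a bit mask that can be non-zero even when it
    # printed usable output, so the return code is not treated as fatal here.
    proc = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return parse_attributes(proc.stdout)


if __name__ == "__main__":
    for dev in ("/dev/sda", "/dev/sdb"):
        print(dev, read_attributes(dev))
```

Non-zero counts on sdb but not sda would point at the disk rather than the controller, though that is a rule of thumb, not proof.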
