Can a SATA controller kill hard drives?

Red Squirrel

No Lifer
May 24, 2003
69,712
13,334
126
www.betteroff.ca
I just replaced a dead hard drive, and the replacement is already dying. Before I add any drive to my RAID I always do a dd write and read of the entire drive to make sure all the sectors are OK. If there's even one bad sector, I won't trust it and will send it back.
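For anyone unfamiliar with this kind of test, the idea can be sketched in Python. This is only an illustration, not the actual dd commands used here: the names `write_pattern`, `verify_pattern`, and `pattern_for` are hypothetical, the 4 KiB block size is an assumption, and the demo runs against a temporary file rather than a real device. On a real drive the read-back would also need to bypass the page cache, which this sketch does not do.

```python
import os
import tempfile
from typing import List

BLOCK_SIZE = 4096  # assumption: 4 KiB blocks


def pattern_for(block_index: int) -> bytes:
    """Build a block whose contents depend on its index, so misplaced data shows up too."""
    seed = block_index.to_bytes(8, "little")
    return seed * (BLOCK_SIZE // len(seed))


def write_pattern(path: str, num_blocks: int) -> None:
    """Write the known pattern to an existing file (or device node) and flush it out."""
    with open(path, "r+b") as f:
        for i in range(num_blocks):
            f.write(pattern_for(i))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path: str, num_blocks: int) -> List[int]:
    """Read the blocks back and return the indices of any that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(num_blocks):
            if f.read(BLOCK_SIZE) != pattern_for(i):
                bad.append(i)
    return bad


if __name__ == "__main__":
    # Demo on a scratch file only; pointing this at a disk would destroy its contents.
    fd, scratch = tempfile.mkstemp()
    os.close(fd)
    try:
        write_pattern(scratch, 256)
        print("bad blocks:", verify_pattern(scratch, 256))
    finally:
        os.remove(scratch)
```

Using an index-dependent pattern rather than all zeros means a block that comes back intact but from the wrong location is still flagged.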

Well, while performing this test, this is what I get in dmesg:

Code:
sd 2:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata3.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
ata3.00: irq_stat 0x08000000, interface fatal error
ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
ata3.00: cmd c8/00:08:78:cf:14/00:00:00:00:00/e0 tag 0 dma 4096 in
         res 50/00:00:77:cf:14/0b:00:00:00:00/e0 Emask 0x50 (ATA bus error)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: configured for UDMA/33
ata3: EH complete
sd 2:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
ata3.00: irq_stat 0x08000000, interface fatal error
ata3: SError: { UnrecovData 10B8B BadCRC }
ata3.00: cmd c8/00:08:10:d3:14/00:00:00:00:00/e0 tag 0 dma 4096 in
         res 50/00:00:0f:d3:14/0b:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: configured for UDMA/33
ata3: EH complete
sd 2:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata3.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
ata3.00: irq_stat 0x08000000, interface fatal error
ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
ata3.00: cmd c8/00:08:e0:d6:14/00:00:00:00:00/e0 tag 0 dma 4096 in
         res 50/00:00:df:d6:14/0b:00:00:00:00/e0 Emask 0x50 (ATA bus error)
ata3.00: status: { DRDY }
ata3: hard resetting link

There are a ton of these. The command eventually fails, and the CRC error count on the drive is already over 500! D: So I'm definitely not going to add this drive to the RAID. Thing is, it's brand new! It was put in the same bay as the one that failed, so could there be something wrong with that card's port, or the bay itself? It seems like a weird coincidence that the replacement drive would be dead too. I don't have another to test and I'm kind of reluctant to RMA it again.
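As a reading aid for the log, the SErr values can be decoded bit by bit. The sketch below is a rough illustration in Python: `decode_serr` and `SERR_FLAGS` are hypothetical names, the bit positions are assumed to follow the SError layout used by the Linux libata driver, and only the four flags that appear in this log are mapped.

```python
from typing import List

# Assumption: bit positions as used by Linux libata for the SATA SError register.
# Only the flags seen in the dmesg output above are listed here.
SERR_FLAGS = {
    8: "UnrecovData",
    11: "HostInt",
    19: "10B8B",
    21: "BadCRC",
}


def decode_serr(value: int) -> List[str]:
    """Return the flag names for every bit set in an SErr value, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits not in the table are reported by position instead of guessed at.
            names.append(SERR_FLAGS.get(bit, "bit%d" % bit))
    return names


if __name__ == "__main__":
    for serr in (0x280900, 0x280100):
        print(hex(serr), decode_serr(serr))
```

Both values decode to the same flag sets the kernel printed, the only difference being HostInt. 10B8B and BadCRC are link-level errors, meaning they are raised for data in transit between the controller and the drive and not for sectors on the platters.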
 

C1

Platinum Member
Feb 21, 2008
2,375
111
106
Good question.

I see these multiple drive failure reports so often on AT that it makes me wonder too. In the zillions of years and hours I've put on bushel baskets of HDDs of all kinds (2.5s and 3.5s), I can barely recall more than two failures (and it may actually be only one).

My guess would be to check the voltage levels being supplied to the drive.

As a final comment, try to buy quality drives up front (e.g., boxed HDDs with five-star ratings based on a significant user sample size).
 

Elixer

Lifer
May 7, 2002
10,371
762
126
Well, step one to rule things out is to try it in another machine.
If you see the same issues there, it is the HD again.
QC on HDs these days is piss poor; back-to-back RMAs aren't uncommon.

If the drive works in the other machine, then check the PSU and look for heat issues. If those are OK, then it could be the chipset.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
A bad CRC could be caused by anything from a bad cable to heat. I wouldn't consider the drive bad until that had been verified in another machine.