Troubleshooting advice

Have an HD (WD Blue WD10EXEZ 1TB) I bought last week. Dropped it into a pre-existing NAS. NAS seemed fine.

Few days later, drive drops out of RAID.

No SMART errors. Reboot, it's fine.

Yesterday, drive drops out of RAID. Then the RAID drops. Then the other RAID drops. NAS seizes up. Console throws a bunch of errors saying the drive is timing out and that the port multiplier is dead. Errors are looping looping looping looping. NAS wouldn't reboot.

So unplug everything, plug everything back in, boot, it's fine. Run a ZFS scrub, it seizes up. The first error message is specific to that drive, so I pull it, reboot, and everything's fine again. Scrub completes, etc. NAS is fat and happy, even if one of my zpools is degraded.

Plugged the drive into another computer and it's behaving itself. Surface scan was good, etc. SMART status good. Can't find anything wrong.

So should I just call it cradle death, and say whatever I need to say to exchange the HDD (presumably for a different brand, since if the drive's not actually defective, it could be a firmware problem or something?) or should I pursue/troubleshoot this more?
 
Don't WD drives that are non-RAID edition have a really high TLER?
The 'timing out' part makes it seem like that's what is going on.
 

Not exactly. RAID drives are the ones with TLER, which puts a cap on how long the drive keeps retrying a bad read or write. TLER isn't a feature as much as it's actually disabling the "try really hard" error recovery mode, because RAID controllers don't like waiting that long.
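For anyone who wants to check what a given drive actually does, smartmontools exposes this setting as SCT Error Recovery Control through `smartctl -l scterc`. Here's a rough sketch that just builds the command lines. The helper name `scterc_command` and the device node `/dev/sdX` are made up for illustration, and I'm only assuming the drive supports SCT ERC at all (plenty of desktop drives don't).

```python
import shlex


def scterc_command(device, read_ds=None, write_ds=None):
    """Build a smartctl argument list for SCT Error Recovery Control.

    With no timeouts given, the command only reports the current setting.
    Timeouts are in deciseconds, so 70 means 7.0 seconds.
    """
    if (read_ds is None) != (write_ds is None):
        raise ValueError("give both read and write timeouts, or neither")
    if read_ds is None:
        option = "scterc"
    else:
        option = f"scterc,{read_ds},{write_ds}"
    return ["smartctl", "-l", option, device]


# Hypothetical device node; substitute the real one for your system.
print(shlex.join(scterc_command("/dev/sdX")))
# Cap read and write recovery at 7.0 seconds each.
print(shlex.join(scterc_command("/dev/sdX", 70, 70)))
```

This only prints the commands, so nothing touches the drive until you run them yourself (as root). If the drive reports the feature as unsupported, there is nothing to tune there.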

ZFS has its own error control, which, according to what I've read, should be "smart" enough to deal with a drive whether it has TLER or not.

The reason WD Greens are a no-no for home software RAID use is their very aggressive and not-turn-off-able spin down. (The RAID software thinks the disk has gone south and marks it as bad.)
 