About to RMA drive - but TrueCrypt cleared errors - still return it??

vss1980

Platinum Member
Feb 29, 2000
2,944
0
76
Hi all,

Title may seem bizarre but is a pretty good summary of the dilemma.

Essentially, about a week ago, during a large file copy, the hard drive in question stopped transferring data and disappeared from the drive list. After rebooting it would be back, but SpeedFan was reporting 0% drive fitness.

Firing up Seatools confirmed the problem with an error code (6C9AC2A4). To be sure, I left the DOS version of Seatools running a scan to see if it could recover any bad sectors (it failed miserably); it also reported bad media and said to replace / RMA the drive.

Still being in warranty I obtained an RMA from Seagate, however prior to shipping it out I wanted to properly clear the data on the drive. The Seatools 'Full Erase' option would fail instantly and the DBAN program also would quit as soon as asked to clear the drive.
With limited options available, I used TrueCrypt to write an encrypted partition to the drive.

Job done, but now the drive fitness is back up to nearly 100% in SpeedFan, and Seatools reports that the drive is OK (would fail a DST and full scan previously, but now passes DST - have not had time to run full scan).

Apart from thinking WTF, do I send it back? Will they test it and then give me crap because they can't find an error?

Edit: Oh and now it will happily perform a full erase..... :twisted:
 

C1

Platinum Member
Feb 21, 2008
2,403
117
106
True, once in a blue moon one of my Maxtors might become hosed. After data recovery and then wiping the drive, it is okay (i.e., it tests great thereafter). A tentative conclusion is that drives simply can become corrupted somehow (it seems like something that the HDD controller is at least partly responsible for).

If you really want to verify the drive, back up an image on it using something like Drive Copy or Drive Image. If that works, you'll be fine. Those image programs are real fussy about verifying drive quality during image creation.
 

vss1980

Platinum Member
Feb 29, 2000
2,944
0
76
There is a definite issue - the SMART data shows the drive has dumped a load of reallocated sectors and still reports 1 uncorrectable sector.... :|

Just waiting for the Full erase to finish and then see what status it comes back with....
 

Auric

Diamond Member
Oct 11, 1999
9,591
2
71
I would carry through with the exchange. It's not worth saving a few quid in shipping for the uncertainty. The claim remains valid because SeaTools did report a failure test code. I had one report the same code intermittently.
 

Elixer

Lifer
May 7, 2002
10,371
762
126
That is what usually happens when you have reallocated sector errors. Once the drive is formatted again, those get remapped to good sectors. However, once the drive has started getting reallocated sector errors, then you can be sure they will be back sometime in the future again.

Best to RMA it.
 

Binky

Diamond Member
Oct 9, 1999
4,046
4
81
It sounds like you had a partition go bad and now the (physical) drive is being blamed. This isn't always the case. With only one bad sector, I would not RMA the drive until it was more certain that the drive is unreliable.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
It sounds like you had a partition go bad and now the (physical) drive is being blamed. This isn't always the case. With only one bad sector, I would not RMA the drive until it was more certain that the drive is unreliable.

He said one uncorrectable sector. It sounds like there were a number of bad sectors that were re-mapped, by writing over them, triggering the re-map.

I would def. RMA if the number of remapped sectors is over 5-10.

Edit: Or if the count seems to be increasing.
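
A rough way to write that rule of thumb down, as a minimal sketch only: the function name `should_rma` and the default `limit` of 10 are illustrative assumptions (taken from the upper end of the "5-10" figure above), not any manufacturer's policy. The input is a hypothetical chronological list of raw Reallocated Sector Count readings that you have noted down yourself.

```python
from typing import Sequence


def should_rma(reallocated_counts: Sequence[int], limit: int = 10) -> bool:
    """Return True if the latest count exceeds `limit` or the count has risen.

    `reallocated_counts` is a chronological series of raw readings of the
    Reallocated Sector Count attribute (hypothetical input for illustration).
    """
    if not reallocated_counts:
        return False  # no readings, nothing to judge
    latest = reallocated_counts[-1]
    # "Increasing" here simply means the latest reading is above the first one.
    growing = len(reallocated_counts) >= 2 and latest > reallocated_counts[0]
    return latest > limit or growing
```

A drive sitting at a steady 3 remapped sectors would pass this check; one that went from 3 to 4 between readings, or that already shows a few hundred, would not.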
 

lakedude

Platinum Member
Mar 14, 2009
2,778
529
126
I'm wondering if this could be caused by a data cable. SATA cables are not the best in my experience. Wiggle 'em and they work fine for a while, but eventually they seem to have strange issues.

The way the drive came back to life after being so supposedly degraded does not seem to me to be a disk surface error, but I could be wrong...
 

JonBlack

Member
Apr 11, 2012
89
0
0
Personally, once a drive starts acting up, even if corrected, I will never fully trust it again, unless you can narrow it down to an anomaly, a cable issue, etc. If you got an RMA number, then I'd send it back, IMHO. Of course, you'll probably end up with a refurb drive.
 

vss1980

Platinum Member
Feb 29, 2000
2,944
0
76
Bit disappointed if Seatools can't distinguish between a partition data error and an actual physical error - after all at the hardware level surely it should be irrelevant what data is stored on the disk.

Did check cables but didn't make any difference, I have locking SATA cables on my drives (which is a real pain should you want to swap order on the motherboard during assembly and the connectors are next to each other).

To be honest it seems almost as if that area of the drive just demagnetized through lack of use, but that is hardly confidence inspiring.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
That is what usually happens when you have reallocated sector errors. Once the drive is formatted again, those get remapped to good sectors.

Formatting doesn't do this. Zeroing, or ATA Secure Erase will. When formatting a disk as NTFS there is a legacy option for full format that reads a disk, and any reported sector errors cause a bad sector table to be created by the NTFS file system. For modern disks this is a bad idea. The disk firmware should manage bad sectors, not the file system. Zero the disk or Secure Erase it, then use NTFS quick format.


However, once the drive has started getting reallocated sector errors, then you can be sure they will be back sometime in the future again.

This is true.

Best to RMA it.

From a consumer perspective, I agree. The disk manufacturer might disagree because some bad sectors are considered normal. So long as ECC corrects transient errors, and zeroing the disk removes persistently bad blocks from use, the drive is functioning correctly.
 

KyrosKrane

Member
Feb 25, 2010
26
0
61
From a consumer perspective, I agree. The disk manufacturer might disagree because some bad sectors are considered normal. So long as ECC corrects transient errors, and zeroing the disk removes persistently bad blocks from use, the drive is functioning correctly.
The primary purpose of a hard disk is to hold data (files, executables, whatever). If it encounters an error and recovers invisibly to the user, great. If it encounters an error that requires wiping the whole hard drive with Secure Erase to fix it, then the hard drive has failed in its primary purpose. Wiping a drive to restore its basic functionality isn't a fix, it's an abrogation of the primary duty of the hard drive. I would have great qualms about dealing with a drive manufacturer who tells me that wiping out all my data is "functioning correctly."

In more general terms, my experience has been that once a drive starts getting physical errors, there's not much point in continuing to use it. Physical errors don't just happen to spoil your otherwise-happy day; something within the drive itself is starting to go bad and cause those errors. More likely than not, even if you do fix the errors temporarily with a full-disk wipe, whatever it is that caused the errors is still there and will cause more errors in the future. I'd rather replace the dud drive with an error-free one and get my data backed up before the drive decides that some critical sector of my hard disk is unreadable and I can no longer get at any data on the drive.
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,739
156
106
All that matters is if your smartctl output says the drive is failing.
Do a long self test.
Bad sectors are not ok.
All smartctl errors will be logged on the drive for the life of the drive regardless of filesystem or if the errors stopped.

The RMA process for WD/Seagate isn't all that bad.
You're likely to only lose 1 way shipping cost and get a replacement in a week or so. I don't think they'll give you any hassle with the process.
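
For anyone who wants to pull the relevant numbers out of smartctl rather than eyeball them, here is a minimal sketch. It assumes the usual ten-column attribute table that `smartctl -A /dev/sdX` prints for ATA drives (the long self-test itself is started with `smartctl -t long /dev/sdX`). The helper name `parse_watched`, the choice of watched attributes, and the sample text are all illustrative; the sample is made up and is not output from the drive in this thread.

```python
from typing import Dict

# Attribute IDs commonly associated with sector trouble.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    187: "Reported_Uncorrect",       # 0xBB in hex
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def parse_watched(smartctl_output: str) -> Dict[int, int]:
    """Extract raw values of the watched attributes from `smartctl -A` text."""
    found: Dict[int, int] = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in WATCHED and fields[9].isdigit():
            found[attr_id] = int(fields[9])
    return found


# Made-up sample in the usual layout, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for attr_id, raw in parse_watched(SAMPLE).items():
        print(attr_id, WATCHED[attr_id], raw)
```

Non-zero raw values for any of these are the ones worth tracking over time.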
 

_Rick_

Diamond Member
Apr 20, 2012
3,985
74
91
If it encounters an error that requires wiping the whole hard drive with Secure Erase to fix it, then the hard drive has failed in its primary purpose.

You only have to overwrite the bad sector(s). Not the entire hard drive.
You lose either 512 or 4096 bytes. This has to be expected over the lifetime of a drive. It's why you have backups or RAID.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
If it encounters an error and recovers invisibly to the user, great. If it encounters an error that requires wiping the whole hard drive with Secure Erase to fix it, then the hard drive has failed in its primary purpose.

Flawed logic. For one, Secure Erase is only one means of fixing a sector read error. Overwriting that single sector will also cause reallocation by the disk and fix the problem. Second, it's part of the drive specification that it will lose a certain amount of data (statistically). A disk being unable to recover data from one sector is not a failed disk.

A persistent write error, on the other hand, is a drive that has failed. Always.

I'd rather replace the dud drive with an error-free one and get my data backed up before the drive decides that some critical sector of my hard disk is unreadable and I can no longer get at any data on the drive.

I think your expectation of how drives work is idealized, and not realistic. The ECC of a drive can return bogus information to the file system, and the file system won't know it's bogus. This is also normal operation for consumer disks. i.e. the data is corrupt, but ECC doesn't detect it; or ECC detects it and corrects it incorrectly even though it thinks it fixed it correctly. And that too is normal and there really isn't anything you can do about it, it's not a warranty-able offense. It's how drives work. To get better ECC involves paying more for enterprise disks.
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
I'm wondering if this could be caused by a data cable. SATA cables are not the best in my experience. Wiggle 'em and they work fine for a while, but eventually they seem to have strange issues.

The way the drive came back to life after being so supposedly degraded does not seem to me to be a disk surface error, but I could be wrong...

SMART data is logged internally by the drive's direct media-to-microcontroller interface. The SATA cable has nothing to do with it; or rather, cable issues have their own attribute (UDMA CRC Error Count, if I remember right). Sector issues are genuine.
 

KyrosKrane

Member
Feb 25, 2010
26
0
61
Murphyc, I think we're agreeing. The point I was making is that if the drive can recover from the error and restore my data, then it's working fine from the user perspective. If the error causes some data loss, but the drive works normally otherwise, that's an iffy situation for me. If that drive was storing critical data, I'm replacing it or moving it to a non-critical role. If it holds non-essential stuff, I may tolerate the error. But if the nature of the error is such that only a total wipe can fix it, then that's classified as a failed drive in my book. There's a real chance that the error may reoccur, and I have neither the time nor the patience to keep wiping a drive to fix repeated errors.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
Murphyc, I think we're agreeing. The point I was making is that if the drive can recover from the error and restore my data, then it's working fine from the user perspective.

There can be such a thing as a bad sector that disk ECC can detect and correct. So you have a bad sector, but the data is (at least for now) recoverable. If you don't force an overwrite to that sector, it will remain bad and eventually the data on it will be lost. The easiest way to deal with such bad sectors is to periodically wipe the disk with zeros or Secure Erase, because most people don't know how to use the smartmontools extended offline test to find sketchy LBAs, then find out which file occupies each LBA and overwrite only that file's LBAs.
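
To give a feel for the targeted approach, here is a small sketch of the arithmetic involved in locating one LBA. It only computes positions and does not touch any disk. The function names are made up for illustration, and it assumes the reported LBA is counted in logical sectors of `sector_size` bytes (512 by default here; check what your drive actually reports).

```python
from typing import Tuple


def lba_to_byte_range(lba: int, sector_size: int = 512) -> Tuple[int, int]:
    """Return the (start, end) byte offsets on the whole disk for one LBA.

    `end` is exclusive, so end - start == sector_size.
    """
    start = lba * sector_size
    return start, start + sector_size


def lba_within_partition(lba: int, partition_start_lba: int) -> int:
    """Translate a whole-disk LBA into a sector index inside a partition."""
    if lba < partition_start_lba:
        raise ValueError("LBA lies before the start of the partition")
    return lba - partition_start_lba
```

The partition-relative sector is what file system tools generally need in order to tell you which file sits on that spot.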

If the error causes some data loss, but the drive works normally otherwise, that's an iffy situation for me.

That is the reason why modern file systems (ZFS, ReFS, btrfs) are checksumming data. Drives, especially consumer drives, are "working normally" and routinely passing bogus data. Obviously the overwhelming majority of the time it's good data. But enough of the time, errors that are undetected, detected but not correctable, or detected but incorrectly corrected get passed on to the file system, and HFS+, NTFS, FAT32 and extX have no way of knowing the data is wrong.
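
The idea behind checksumming can be shown in a few lines. This is only a toy sketch using CRC32 from Python's standard library; real file systems use their own checksum algorithms and keep the checksums in their metadata, and the `store` / `verify` names are invented for the example.

```python
import zlib
from typing import Tuple


def store(block: bytes) -> Tuple[bytes, int]:
    """Pretend to write a block: keep the data together with its checksum."""
    return block, zlib.crc32(block)


def verify(block: bytes, checksum: int) -> bool:
    """On read, recompute the checksum and compare with the stored one."""
    return zlib.crc32(block) == checksum


if __name__ == "__main__":
    data, csum = store(b"important data")
    # Simulate the drive silently handing back one flipped bit.
    corrupted = bytes([data[0] ^ 0x01]) + data[1:]
    print(verify(data, csum))       # good read
    print(verify(corrupted, csum))  # silent corruption gets caught
```

A file system without checksums has nothing to compare against, so the corrupted block would be passed up to the application as if it were fine.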

If that drive was storing critical data, I'm replacing it or moving it to a non-critical role.

If it's critical data, use a better drive. Have backups. Use a better file system. Consumer drives and file systems increasingly have a problem storing critical data reliably because the error rate is the same, yet the amount we are storing is much greater, with much larger drives.
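
A back-of-the-envelope sketch of why capacity matters when the error rate stays put. The rate of one unrecoverable read error per 1e14 bits is an assumption here (a figure often quoted on consumer drive datasheets; check your own drive's spec), and the Poisson approximation treats errors as independent, which real drives won't strictly obey. It also only covers errors the drive reports, not the undetected kind.

```python
import math


def expected_unrecoverable_reads(capacity_bytes: float,
                                 bits_per_error: float = 1e14) -> float:
    """Expected number of unrecoverable read errors when reading the whole drive once."""
    return capacity_bytes * 8 / bits_per_error


def chance_of_at_least_one(capacity_bytes: float,
                           bits_per_error: float = 1e14) -> float:
    """Poisson approximation of hitting at least one such error in a full read."""
    return 1 - math.exp(-expected_unrecoverable_reads(capacity_bytes, bits_per_error))


if __name__ == "__main__":
    for label, size in [("250 GB", 250e9), ("4 TB", 4e12)]:
        print(label, round(chance_of_at_least_one(size), 3))
```

Same per-bit rate, sixteen times the bits read, so the expected number of errors per full pass scales up by the same factor.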

But if the nature of the error is such that only a total wipe can fix it, then that's classified as a failed drive in my book.

Overwriting a persistent bad sector is sometimes the only way to cause it to be removed from use. That's part of the inherent design of hard drives. It's normal operation.

There's a real chance that the error may reoccur, and I have neither the time nor the patience to keep wiping a drive to fix repeated errors.

Data does indicate that when sectors go bad, it's more likely additional sectors will go bad, but it's not assured.

When you overwrite a bad sector and it's removed from use it is always removed from use, it doesn't even have an LBA. So it can't be used again. But new bad sectors MIGHT occur. A few bad sectors is not inherently grounds for having a disk replaced under warranty. It's not how the warranty reads, explicit or implicit. I don't know any manufacturer with a particularly clear policy on drive replacement in the face of bad sectors.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
They won't check it, so RMA it to be safe. Otherwise it will come back to bite you ;)

It might come back to bite you, it's not a sure thing. They're going to want a reason, they won't just issue an RMA on request. They usually will do that for enterprise drives.

Even if the disk comes up with all good sectors once before being put into use, the only way you're going to know if there are problems is if you're doing regular SMART testing, which most people don't do. If the data is critical, you SMART test regularly, and you back up. If you don't, it's not that critical.
 

vss1980

Platinum Member
Feb 29, 2000
2,944
0
76
Sorry, as an update, the drive finally ended up with (based on the SMART data from HDD Sentinel and CrystalDiskInfo) a few hundred reallocated sectors, the 1 unrecoverable sector (which I assume is a statistical bit of data and was also reallocated), and 14053 Reported Uncorrectable Errors (SMART attribute 0xBB, i.e. 187).