Hard drive Pending Sector errors: how severe?

ahallanandtech

Junior Member
Mar 7, 2010
8
0
0
I monitor my hard drives' S.M.A.R.T. status with Crystal Disk Info. I've encountered a number of hard drives with "Pending Sector" errors that seemed also to crash the hard drive. In one case, after finding twelve pending sector errors, I ran chkdsk in Windows and although it "recovered" five or six files, it also corrupted them and made them useless (I had backups). In another case, I had a laptop drive that would blue screen however you tried to boot it (even in safe mode); I cloned this drive to another and that one booted right up. The original drive had one pending sector error.

However, after running DBAN on both failed drives to blank them out, all of the pending sector errors disappeared. I might have expected that they would have been replaced by spare sectors, but according to CDI, neither drive has any reallocated sectors or any other S.M.A.R.T. errors anymore.
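For anyone who wants to check the same two counters from a script rather than from CDI, here is a minimal sketch. It assumes the attribute table printed by smartmontools (`smartctl -A`) as its input format. The sample text is illustrative only (it is not output from either drive in this thread), and `parse_smart_attributes` is a hypothetical helper name.

```python
import re

# Illustrative sample in the layout of a `smartctl -A` attribute table.
# These values are made up for the example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       12
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+"
    r"\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

def parse_smart_attributes(text):
    """Map attribute ID -> (name, raw value) for each table row found."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attrs[int(match.group(1))] = (match.group(2), int(match.group(3)))
    return attrs

attrs = parse_smart_attributes(SAMPLE)
pending = attrs[197][1]      # Current Pending Sector count
reallocated = attrs[5][1]    # Reallocated Sector count
print(f"pending={pending} reallocated={reallocated}")
```

On a real system the text would come from running `smartctl -A` against the drive (usually with administrator rights). Raw value formats vary by vendor, so treat the parsing as a starting point.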

So how should I view these drives for future use? One was a few years old but the other is only about 6 months old - still under warranty but I can't RMA it I imagine with no errors anymore (it also passes the Western Digital diagnostic with no errors).

Pending Sectors are sectors on the drive that cannot be read, right? What causes them? If blanking these sectors does not cause them to be deactivated by the drive firmware and replaced (as seems to have happened with these two drives), should I assume those sectors are really OK? I had assumed that such drives were basically dying and I should probably recycle them - but could such errors just be glitches that could appear from time to time and not indicate pending hardware failure of the drives, making them no more likely than any other drive to fail in the future?

Thanks for your insight!
 

Russwinters

Senior member
Jul 31, 2009
409
0
0
Pending sectors are sectors that the drive has flagged as "probably bad".


Most likely the drive "tested" them and found them not to be bad. They were probably UNC (uncorrectable) errors caused by failed or incomplete writes, which left the ECC not matching the actual data.


Unfortunately the way CHKDSK tests these is probably the Read/Write/Read method.

So all the data in that sector for your file was wiped out.


Hence why CHKDSK is bad to run for recovery purposes, but ok if you want to "bandaid" a drive.
 

ahallanandtech

Junior Member
Mar 7, 2010
8
0
0
Thanks for your reply, Russ. What I'm really trying to figure out is whether getting pending sector errors like this is somewhat "normal" (if potentially disastrous since it causes lost data) and may happen without hardware issues - so once I clear the pending sectors by blanking the disk can I safely re-use the drives? I used to believe that a drive that showed pending sector errors must be on its way to failing and I should get rid of it or stop using it as soon as I got my data off and zero'ed it. But perhaps these problems can occur as an unfortunate fact of life once in a great while on any properly-functioning drive and one must simply be extremely cautious?
 

Russwinters

Senior member
Jul 31, 2009
409
0
0
Well, I always recommend to have a backup no matter what. Even if your drive is 1 day old.


So if you follow this rule, then as long as the drive is performing well you can use it with no worries. From time to time some sequence of events can take place causing the drive to incorrectly write data/not finish writing to a sector, causing it to be initially misinterpreted as a bad sector, when actually the sector is still magnetically stable, but the checksum with ECC simply doesn't match because the data that was supposed to be written did not get written properly somehow.

Once you perform a Read/Write/Read test the drive will see this and unmark it as pending, but not reallocate it. (it would only reallocate if it failed such test)

Sometimes the drive can reallocate these sectors when it should not; this is probably because the drive performed a self-test and failed (maybe the self-test is not a very thorough one).



Basically, if you have a backup, yes use it.

If you do not keep a backup...then you are not much safer replacing the drive, all drives fail, even new ones. Never trust a hard drive. Ever.


Regards,
 

ahallanandtech

Junior Member
Mar 7, 2010
8
0
0
Right, I keep backups and well understand that drives can fail even on Day 1. I also now understand how important it is to do real-time S.M.A.R.T. monitoring as I was doing in this case. I had six files get corrupted and would never have even noticed (they weren't files I used regularly, a few were Windows DLL files) until Crystal Disk Info popped up an info message. I had all the files backed up.

But, unless you are doing a RAID (not really likely in a laptop), you are probably not backing up every hour. You're taking your chances til the next backup. And while you know there's always a chance your drive could fail with little or no warning, would you want to take a chance on a drive that has a higher risk of failure?

That's again what I am getting at: does the fact that I got these pending sector errors - now clear after blanking - mean that this drive has a higher likelihood of failure than any other normally functioning drive? Yeah, I understand it could just be ECC errors on 12 sectors - but why did those happen in the first place? Was that due to some pre-failure of the drive, or was it one of those once-a-year Windows events you just have to tolerate? Is this "normal" behavior for hard drives even if it seldom occurs? And what might actually cause it? I hate the idea that there's nothing I can do to prevent it from recurring and just have to take my chances between backups.

And depending on the type of backups you make - what if you back up a drive that had pending sector errors - like mine did - and clobber your good backup files with corrupted files? I had six corrupted but - thanks to chkdsk - perfectly readable files. Had I not been paying attention, I could easily have synced up those corrupted files over my backup. Do I need to start doing incremental or differential backups now just to be safe?
 

Russwinters

Senior member
Jul 31, 2009
409
0
0
I have a feeling it's pretty safe.


What is the brand/model of the drive?



To be sure, run some tests with HDDScan (www.hddscan.com).


It is a nice piece of freeware for testing your disk.


Regards,
 

sheh

Senior member
Jul 25, 2005
245
7
81
I don't know why they disappeared afterwards, maybe they were just suspect and the drive decided it's okay. Are you sure what CHKDSK found was related to the SMART bad sectors?

I have 22 reallocated sectors and 1 reallocation event on a 4-year-old WD 80GB drive and haven't encountered issues or bad files yet. I too would be suspicious of pending sectors that disappear, but without a clear SMART error that could get the drive replaced, I'm not sure what to do.

I don't think bad data could be read without triggering an error (from the drive). That's my impression at least.
 

ahallanandtech

Junior Member
Mar 7, 2010
8
0
0
I was originally talking about a 500GB WD Scorpio Blue laptop drive (12 pending sector errors, six corrupt files, all errors clear after I blanked the drive). However, I experienced something similar with a second drive, an older Hitachi laptop drive that had one pending sector error and would not boot XP - it blue screened no matter how I tried to boot it, but when I cloned it to a replacement drive, the replacement booted up just fine. Meanwhile, I have also blanked this Hitachi drive and it, too, now no longer shows any pending sector errors or S.M.A.R.T. errors. I have since played with this Hitachi drive by installing XP on it and then Windows 7 on top of that - and both installed fine - that drive seems to be working perfectly.

If it matters, both of these drives had been operating in different Dell Inspiron laptops (not the same model either - one is Intel based, one is AMD based).

So that's why I am starting to wonder if these pending sector errors aren't somehow just flukes that can happen once in a while - who knows why? - and do not indicate any hardware problem at all with the drives. As I said, I formerly just assumed that if a drive had pending sector errors in S.M.A.R.T. and some other issue (corrupted files, won't boot) that the drive must be dying - but maybe in fact these are just flukes you can fix by blanking the drive and if all the pending sector errors go away, re-use the drive with caution.
 

Russwinters

Senior member
Jul 31, 2009
409
0
0
That's exactly what they are, flukes.

Of course they can be caused by an array of different things, and certain things can lead to a dead hard drive.


You experienced random write errors; they are resulting in a "software bad sector", not a "physical bad sector".


All it means is that to the OS (and to the hard drive initially) the sector returns a UNC error.

But in fact the sector is fully capable magnetically; the data just doesn't match the ECC, which causes the reported bad sector. And because the data didn't get written properly, and because the drive will refuse to return a sector whose ECC doesn't match (without some special tricks), your OS will not boot properly in some cases.


Once the drive does the final self test of the sector to determine if it needs to reallocate it, it realizes that it was only a software bad sector, so it will reuse it, but the data in that sector will be lost. (once the sector gets reused)
 

sheh

Senior member
Jul 25, 2005
245
7
81
Software errors should not manifest at hardware level, which these sectors are. I don't know what was the cause but I don't think "fluke" is a good explanation. HDDs should be and are reliable, at least in properly flagging as bad when needed and not doing so when unneeded. Maybe it is related to the fact these were notebook computers; perhaps an error due to moving the computer while the HDDs were active.

ahall: Try asking the drive manufacturer(s) what they think. Perhaps not highly likely, but you might get a helpful reply. :)
 

RebateMonger

Elite Member
Dec 24, 2005
11,588
0
0
Purely "logical" corruption of hard drives isn't very common these days. In the days of FAT32 Windows 98, it was much more common.

If you are seeing corruption of hard drives, it's likely it's a hardware problem. Chkdsk will do whatever's necessary to restore a "logically correct" partition, which may mean it'll delete files it can't correct.

If you are concerned about long-term loss of files, the best solution (not perfect) is maintaining archival backups. Windows Home Server, for instance, will keep a daily version of every file on your PC for months at a time. If files get deleted, overwritten, or corrupted, you can go back to the last backup of the correct file and restore it. You can do the same thing with other backup software, although you may have to do some manual backup file management.

This is one big reason why keeping only a single backup is dangerous. A deleted or corrupted file or a bad backup will overwrite the "good" backup and things are lost.
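As a rough illustration of the "don't overwrite the only good copy" idea, here is a minimal sketch of a file backup that sets the previous version aside whenever the content has changed. It is not how Windows Home Server or any particular backup product works; `backup_with_history` and the archive naming scheme are hypothetical choices made for the example.

```python
import hashlib
import shutil
import time
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_with_history(source, backup_dir):
    """Copy source into backup_dir without destroying a differing older copy.

    Returns the path of the archived previous version, or None if there was
    no previous copy or the content was unchanged.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    archived = None
    if target.exists():
        if file_digest(target) == file_digest(source):
            return None  # identical content, nothing to copy
        # Content differs: keep the old backup under a timestamped name.
        archived = backup_dir / f"{source.name}.{time.time_ns()}.bak"
        target.rename(archived)
    shutil.copy2(source, target)
    return archived
```

A file that was silently corrupted on the source drive would still be copied, but the earlier good version stays in the backup folder and can be restored by hand.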
 

ahallanandtech

Junior Member
Mar 7, 2010
8
0
0
sheh said: "Software errors should not manifest at hardware level, which these sectors are. I don't know what was the cause but I don't think "fluke" is a good explanation. HDDs should be and are reliable, at least in properly flagging as bad when needed and not doing so when unneeded. Maybe it is related to the fact these were notebook computers; perhaps an error due to moving the computer while the HDDs were active.

ahall: Try asking the drive manufacturer(s) what they think. Perhaps not highly likely, but you might get a helpful reply. :)"

One of these is a WD drive, the other a Hitachi. I did check with WD (in part because that drive is still under warranty) and they basically said what I thought they'd say: sure, no errors, the drive is fine, go ahead and keep using it.

I guess I'll go ahead and keep using the drives instead of tossing them - and WD isn't going to let me RMA the drive anyway. I'm not sure it gains me anything. I still wish I knew what caused these original pending sector errors. One of them was my laptop and I am careful about not moving it if possible - like, when it's hibernating, I don't move it til it is finished dumping the memory file and then shuts off, because that's sustained hard drive activity.
 

RebateMonger

Elite Member
Dec 24, 2005
11,588
0
0
ahallanandtech said: "- and WD isn't going to let me RMA the drive anyway. I'm not sure it gains me anything."
WD will let you RMA if you want. You can even get an Advanced Replacement, where they'll send you a refurbished drive in a shipping crate and you can return your old drive in the next 30 days.

I don't know of any disk makers that require you prove that a disk is failing before you can RMA it.
 

Mark R

Diamond Member
Oct 9, 1999
8,513
14
81
A 'pending' sector is one where the drive has detected corrupted or possibly corrupted data, or has detected that the recording in that sector is 'weak' and is at risk of getting corrupted.

Modern drives will try to move data to another area of the disk, a process called reallocation, when it occupies a defective sector. Depending on the fault, the sector may not be redirected immediately - and the drive may decide to retest the sector, next time the sector is written to. When a sector is waiting for retesting, it is called 'pending'.

If a sector goes weak, but is still readable - the drive may automatically reallocate it, next time the sector is read. Or maybe it will wait a bit, and next time it does an idle scan, re-test it, and if it fails again, reallocate it.

However, if the sector is unreadable, next time the PC tries to read it, the drive will say 'unreadable sector' to the PC - but won't reallocate immediately - it can't copy the data to a new sector, because it doesn't know what the data should be. However, when the PC wants to write to the sector, and trash the data, then the drive can try writing and, if necessary, reallocate the new data.
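The read and write behaviour described above can be sketched as a toy model. This is a deliberate simplification for illustration, not real firmware logic: the `Sector` class, its fields, and the `read`/`write` helpers are all hypothetical, and actual drives differ in when and how they retest.

```python
from dataclasses import dataclass

@dataclass
class Sector:
    media_ok: bool             # True if the platter surface can still hold data
    readable: bool = True      # False if the stored data fails its ECC check
    pending: bool = False      # counted as "pending" in S.M.A.R.T.
    reallocated: bool = False  # counted as "reallocated" in S.M.A.R.T.

def read(sector):
    """Return True if the read succeeds; an unreadable sector becomes pending."""
    if sector.reallocated:
        return True  # served from the spare sector
    if not sector.readable:
        sector.pending = True  # the drive cannot copy data it cannot read
        return False
    return True

def write(sector):
    """Write new data; the old contents no longer matter, so the drive can retest."""
    if sector.reallocated:
        return
    if sector.media_ok:
        # Soft error: the rewrite verifies, the pending flag is simply cleared.
        sector.readable = True
        sector.pending = False
    else:
        # Hard error: the rewrite fails, the new data goes to a spare sector.
        sector.reallocated = True
        sector.pending = False
        sector.readable = True

soft = Sector(media_ok=True, readable=False)   # bad write on good media
hard = Sector(media_ok=False, readable=False)  # damaged surface
for sector in (soft, hard):
    read(sector)   # both reads fail, both sectors become pending
    write(sector)  # e.g. zeroing the whole drive
print(soft.pending, soft.reallocated)  # False False
print(hard.pending, hard.reallocated)  # False True
```

In this model, a full-drive wipe that leaves zero pending and zero reallocated sectors corresponds to the `soft` case, which matches what the OP saw after running DBAN.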

It's important to recognise that there are many reasons why data on a drive may be corrupted internally.
Soft errors: These are fluke occurrences that cause incorrect or weak data to be recorded to the disk. The data bits on modern drives are so small that the signal is incredibly weak, and is very sensitive to any defect and to outside interference. Indeed, the signal is so weak that the heads don't get a perfectly clear digital signal - and a complex 'maximum likelihood' statistical analysis is required to find the most likely digital signal to explain the readings - as a result, there is a small chance that the data just can't be read, because there's too much interference. Interference may include power glitches, vibration (causing the heads to move too far out of range while writing), temperature fluctuations, cosmic rays affecting the operation of the drive's CPU, etc.
These 'soft' errors, while they may mean data loss or cause the drive to report a bad sector - do not mean that the disk platter is faulty. If the drive rewrites new data to that sector, the chances are that the new data will be fine.
This is the point of 'pending' sectors for reallocating - the drive manufacturers know that 'soft' errors are common, and there's no point wasting spare space on them, as long as you can recognise that's what they are.
Most drive manufacturers quote a soft error rate, in the region of 1 corrupted sector per 10 TB for consumer level drives. Enterprise level drives are usually quoted at 1 corrupted sector per 100 TB. 4k 'advanced format' drives are quoted higher still. High-end SSDs are often quoted at up to 1 corrupted sector per 100 PB.
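To put those figures in perspective, here is a small worked calculation. It assumes the rate is expressed as one unreadable sector per N bits read, with N = 10^14 for the consumer figure and 10^15 for the enterprise figure; those exponents are an assumption chosen to line up with the rounded "10 TB" and "100 TB" numbers above, so check the datasheet for any specific drive.

```python
def terabytes_per_error(bits_per_error):
    """Convert 'one error per N bits read' into terabytes read per error."""
    return bits_per_error / 8 / 1e12

def expected_errors(bytes_read, bits_per_error):
    """Expected number of unreadable sectors after reading bytes_read bytes."""
    return bytes_read * 8 / bits_per_error

CONSUMER = 10**14    # assumed consumer-class rate, in bits per error
ENTERPRISE = 10**15  # assumed enterprise-class rate, in bits per error

print(terabytes_per_error(CONSUMER))    # 12.5
print(terabytes_per_error(ENTERPRISE))  # 125.0

# Reading a 500 GB drive (the size of the OP's Scorpio Blue) end to end once:
print(expected_errors(500 * 10**9, CONSUMER))
```

Under that assumed consumer rate, one full pass over a 500 GB drive works out to roughly a 4% chance of hitting a single soft error, so twelve at once points to a specific event rather than ordinary background noise.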

Hard errors: These represent data lost because the disk platter has degraded. That particular part of the surface can't hold a magnetic pattern reliably - e.g. due to age, damage due to contact with the head, contamination, manufacturing fault, etc.

Drives are expected to get hard errors over the course of their life time, and they will copy the data (if the sector is still readable) to spare sectors when a hard error is confirmed.

So, what does this mean for the OP:
A sudden flurry of 'pending' sectors and corrupted data suggests that there was some sort of event that caused the drive to lose data. Possibly a power glitch (e.g. a loose power cable), or vibration when those sectors were being written (e.g. if it's a laptop, it could have been put down harder than normal or dropped a short distance).

However, if the pending sectors have disappeared and the drive wipes fine, then it seems likely that whatever caused the data to get corrupted has gone - and we'll never know what it was.

It could conceivably be a problem with the drive mechanism, but such problems are usually catastrophic.
 

mips

Junior Member
Nov 15, 2005
10
0
0
Russwinters said: "But in fact the sector is fully capable magnetically, the data just doesn't match the ECC which causes the reported bad sector, and of course because the data didn't get written properly, along with the fact that if the ECC doesn't match the drive will refuse to return that sector (without some special tricks) your OS will not boot properly in some cases."

What "special tricks" are you referring to?

I have a 2.5" Hitachi here with lots of UNC errors / bad blocks. I would like to try and read those bad blocks with the drive ignoring ECC.

Any advice?
 