CHKDSK finds and corrects errors, now what?

lakedude

Platinum Member
Mar 14, 2009
2,549
265
126
We have a production machine at work running XP that refused to boot, citing missing or corrupted DLL files. After several attempts to boot, we pulled the drive out and stuck it in a working XP machine booting from a good copy of Windows XP. The system immediately noticed issues with the problematic disk and automatically ran a 3 stage CHKDSK. The scan found lots of problems at stage 2 but claimed to have corrected them. They wanted the machine running ASAP, so we installed the problematic drive back in the machine after the CHKDSK scan, and the machine booted and ran.

Needless to say I'm a bit worried about the long term health of the drive.

What do y'all think the issue was?

Software corruption, perhaps from recent power outages caused by stormy weather?

The problematic drive is having hardware problems and is failing?

The computer is having some issues, perhaps with RAM and it is corrupting the software?

We have another hard disk on order. We can load software from scratch, but cloning is much easier. Would you chance a clone (we can always load software by hand later if need be)?

Would CHKDSK be able to find and correct errors if the drive itself had hardware issues?
 

mikeymikec

Lifer
May 19, 2011
17,700
9,555
136
The '3 stage chkdsk' actually checks for very little: Filesystem inconsistencies, that's about it, which any healthy drive could end up having and is completely fixable without any long-term worries.

A full chkdsk can be run with this command: chkdsk driveletter: /v /r

It'll take somewhat longer to run, and it actually checks that every bit of data is accessible, as well as the free sectors. The quick chkdsk that you refer to is like asking a librarian to read out all the book titles in their library and tell you whether their index of those books seems to be in order; the librarian doesn't even need to leave their desk or check a single book. A full chkdsk is more like asking for a complete audit of all the books to check whether they're all in usable condition, as well as all of the shelves, used or otherwise.
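For anyone who wants to script the two kinds of check, here is a minimal Python sketch that only builds the command lines. The helper name `build_chkdsk_command` and its `full` parameter are made up for illustration; only `chkdsk`, `/v` and `/r` come from the command above. It prints the commands instead of running them, since a real run needs Windows and typically an administrator prompt.

```python
import re


def build_chkdsk_command(drive_letter, full=False):
    """Return the chkdsk argument list for one drive letter.

    Hypothetical helper for illustration: it builds the command
    but does not execute anything.
    """
    # Accept "d", "D" or "D:" and normalise to "D:".
    match = re.fullmatch(r"([A-Za-z]):?", drive_letter.strip())
    if match is None:
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    command = ["chkdsk", match.group(1).upper() + ":"]
    if full:
        # /v = more detailed messages, /r = locate bad sectors and
        # recover whatever is still readable
        command += ["/v", "/r"]
    return command


if __name__ == "__main__":
    # Print the quick and the full variant rather than running them.
    print(" ".join(build_chkdsk_command("d")))
    print(" ".join(build_chkdsk_command("d", full=True)))
```

The argument list could be handed to `subprocess.run` on a Windows machine, one drive letter at a time.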

As for chkdsk and hardware issues: In theory, a hard drive should spot sectors that aren't properly readable, try to recover the data, and then rule those sectors out from future use, all without downtime or the user noticing anything awry. This does not "fix" those bad sectors; the drive simply starts using sectors from a reserve it keeps for situations like this instead of the bad ones. In practice, though, it is common for the hard drive not to spot this kind of problem in the first place. That's where a full chkdsk comes in: if there is a bad sector, Windows basically does the same thing the hard drive should have done for itself, except that chkdsk checks all sectors rather than acting on a report that one iffy sector is causing problems. Again, it does not "fix" bad sectors.
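To make the "retire it, don't fix it" idea concrete, here is a toy Python model of remapping bad sectors to a reserve pool. Everything in it (the `ToyDrive` class, its method names, the sector counts) is invented for illustration; real drive firmware and NTFS bookkeeping are far more involved.

```python
class ToyDrive:
    """Toy model of spare-sector remapping; not how any real firmware works."""

    def __init__(self, sectors, spares):
        self.sectors = sectors
        self.remap = {}  # bad logical sector -> spare sector now used instead
        # Spares are numbered just past the normal sectors in this model.
        self.free_spares = list(range(sectors, sectors + spares))

    def mark_bad(self, sector):
        """Retire a bad sector by pointing it at a spare; nothing is repaired."""
        if not 0 <= sector < self.sectors:
            raise IndexError(sector)
        if sector in self.remap:
            return self.remap[sector]  # already retired earlier
        if not self.free_spares:
            raise RuntimeError("spare pool exhausted")
        self.remap[sector] = self.free_spares.pop(0)
        return self.remap[sector]

    def physical(self, sector):
        """Where a read or write of this logical sector actually lands."""
        return self.remap.get(sector, sector)


if __name__ == "__main__":
    drive = ToyDrive(sectors=100, spares=2)
    drive.mark_bad(7)
    print(drive.physical(7), drive.physical(8))
```

The reserve is finite in this model too: once it runs out, further bad sectors cannot be hidden, which fits the point about bad sectors tending to be a degenerative problem.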

So, let's say a hard drive develops a bad sector. Whether the drive catches it itself or whether chkdsk does, what does this mean in the long term? In my experience HDDs developing bad sectors tends to be a chronic/degenerative issue, i.e. once a drive starts having a problem like this, it will likely continue to develop more bad sectors over time, and it's just a matter of time before a bad sector forms right where a critical bit of OS data is stored, and then Windows starts crashing. IMO if unexpected downtime is something that the user wants to throw money at in order to avoid (i.e. if unexpected downtime means severe inconvenience), then replacing the drive sooner rather than later is probably a wise precaution.

A utility like CrystalDiskInfo (I recommend downloading the zip file rather than the setup exe) will report whether the drive's own diagnostic system reports anything iffy about the drive (though I've seen "healthy" drives according to SMART data that were totally up the creek, and chkdsk locating bad sectors won't get reported by this utility).

The full chkdsk normally takes an hour or more to run, depending on drive capacity and performance as well as system performance. Interrupting it can be very unhealthy for the drive.
 

Ketchup

Elite Member
Sep 1, 2002
14,545
236
106
Good post by mikeymikec. If the problem continues, the drive is going to need to be replaced. And whatever the scenario, you need to have another drive on hand that is an exact duplicate (data-wise) of this drive. Good to know you have one on order. Clone the drive, see if it runs after the clone, but continue to run on the original drive so that your backup doesn't turn out to be a failing drive.
 

Malogeek

Golden Member
Mar 5, 2017
1,390
778
136
yaktribe.org
The takeaway from this, which hopefully the company will address, is that if the data/applications on the system are critical to operation, then some kind of redundancy and/or disaster recovery should be put in place for this obviously aged PC.
 

lakedude

Platinum Member
Mar 14, 2009
2,549
265
126
Thanks guys. I think when the new part comes (we would ordinarily already have at least a blank spare) I'll clone the drive and then run a full chkdsk on the clone. This will reduce downtime to the time it takes to make a clone.
 

lakedude

Platinum Member
Mar 14, 2009
2,549
265
126
BTW we have 8 similar machines. 5 are exactly the same. We can make a clone from any one of the 5 to use in any other of the 5; however, the machine in question is one of the unique ones, so we were a little unprepared (panicked) when it didn't boot.
 

lakedude

Platinum Member
Mar 14, 2009
2,549
265
126
Update: We stole a 74GB Raptor drive from an old scrapped machine and cloned the problematic 36GB Raptor drive onto it.

I had planned to do a CHKDSK on the "new" drive but ran into some confusion. Micro$oft CHKDSK instructions say to use an elevated (administrator) level I don't have access to (at work). So we decided to put the new clone in the machine to test it rather than run with the original possibly failing disk.

The clone seems to work fine in the machine and it is much faster as well...

I then ran a 5 stage CHKDSK /v /r (without worrying about the admin privilege level) and scanned the 4 partitions of the original disk (each separately by drive letter). 3 out of 4 partitions checked out with only minor errors and no bad sectors, but the main system partition reported 36KB in bad sectors out of 36GB total on the WD Raptor 36 drive.
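To put the 36KB figure in perspective, here is a rough back-of-the-envelope calculation in Python. The 512-byte sector size and 4KB cluster size are assumptions (common values for drives and NTFS volumes of this era, not something checked on this disk), as is treating KB and GB as binary units; the function name `bad_sector_summary` is made up for the example.

```python
KIB = 1024
GIB = 1024 ** 3


def bad_sector_summary(bad_bytes, total_bytes, sector_size=512, cluster_size=4096):
    """Express a 'KB in bad sectors' figure as sector/cluster counts and a fraction.

    sector_size and cluster_size are assumed defaults, not measured values.
    """
    return {
        "sectors": bad_bytes // sector_size,
        "clusters": bad_bytes // cluster_size,
        "fraction": bad_bytes / total_bytes,
    }


if __name__ == "__main__":
    summary = bad_sector_summary(36 * KIB, 36 * GIB)
    print(summary["sectors"], "sectors,", summary["clusters"], "clusters")
    print(f"{summary['fraction']:.2e} of the drive")
```

Under those assumptions it works out to 72 sectors (9 clusters), or roughly one millionth of the drive. The amount is tiny; the concern is whether that number keeps growing on later scans.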