Discussion I completely mis-read these SMART readings the first time...

mikeymikec

Lifer
May 19, 2011
17,714
9,598
136
I'm posting partly as I've learnt something (though I'm not sure how I would have learnt it from CrystalDiskInfo), and partly because maybe someone can point me in the right direction with CDI.

I had a situation whereby a customer brought a laptop in with a suspect HDD (wouldn't boot, Startup Repair took many hours and still failed). After doing a quick data backup, I ran chkdsk with the HDD back in the laptop and sure enough it crashed and burned during a full chkdsk with 'unspecified error <~12 digit code here>'. The disk didn't even show up in diskpart afterwards, then came back on the reboot.

Where things went a little crazy was that a) there were no disk errors/warnings in the event log, b) SMART data in CDI (at a glance) said everything was fine, and then c) the first and second time I tried to install Windows 10 on the now-SSD'd laptop, it either got stuck (no freeze) or threw IIRC 80070057 at the disk partitioning stage, which got me worried that maybe there's an issue in the laptop that affects SATA. I was changing things in the BIOS, as I had enabled secure boot before starting the install, but when it finally allowed me through setup, secure boot was once again disabled. The only 'permanent' difference between the BIOS settings before and after I got the laptop is that CSM is now disabled. That doesn't really inspire confidence: the old HDD has a GUID partition table, so it had to have been booting in UEFI mode (right?), and I don't see how a CSM that isn't in play could affect UEFI stability.

Here's the 'at a glance' seemingly OK CDI screenshot:

smart hill.png

and here's what the Disks utility in Linux showed (note uncorrectable sectors and UDMA CRC!):

smart hill.png

Needless to say, I haven't got to grips with raw values in CDI :) Normally if I do any checking of raw values, I'm looking for complete zeroes in the sector entries, but I didn't check this time, partly because I'm used to an older version of CDI that flagged any iffy sectors as 'caution', which I definitely agree with.

One question - do UDMA CRC errors and bad sectors normally go hand-in-hand? I'll try googling for it as well.
- edit - UDMA CRCs - people are suggesting iffy cable. Tricky in a laptop! I'll check the SSD's SMART readings once it's finished with memtest86.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,348
10,048
126
Yeah, in a desktop, that would make me suspect a SATA cable issue.

If there's a signal / noise interference issue with 6Gb/sec SATA, you could always enable a SATA II (3.0Gb/sec) persistent DCO on the drive. Maybe that would help.

Edit: Maybe one of the chips on the drive's PCB is faulty and overheating, driving signals out of spec.

On the whole, can't you just swap for an SSD?
 

mikeymikec

Lifer
May 19, 2011
17,714
9,598
136
I did say I've replaced the HDD with an SSD :) For added fun and games, the SSD is also reporting CRC errors. I've disabled NCQ in the AMD SATA driver, but it doesn't appear to have helped (it didn't seem likely to, but it was suggested on the interwebs).
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,408
2,440
146
Likely just an issue with the laptop's SATA controller. Also, I would leave secure boot off; it generally just gets in the way.
 

fzabkar

Member
Jun 14, 2013
141
32
101
Raw SMART values usually make more sense in hexadecimal.

For example CDI is reporting 0x15150015 for Airflow Temperature. That is actually 3 hex numbers, 0x15 / 0x15 / 0x0015. These are the maximum, minimum and current temperatures for the current power cycle (0x15 = 21 decimal).

Notice also the Command Timeout attribute:

  • 0x0004 / 0x0004 / 0xFFFF

I don't know what to make of the last figure (0xFFFF = 65535), but it doesn't look good.
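
To make the splitting above concrete, here is a minimal Python sketch. The function names are made up for illustration, and the packing of the Command Timeout raw value (three 16-bit words in the order listed above) is an assumption, since the full raw value isn't quoted here. The field meanings are vendor-specific, so treat this as a way of reading the digits, not as a definitive decoder.

```python
def split_airflow_temp(raw):
    """Split a raw value such as 0x15150015 into byte / byte / 16-bit word.

    Mirrors the 0x15 / 0x15 / 0x0015 reading of the Airflow Temperature
    raw value. Which field is max, min or current is vendor-specific.
    """
    high = (raw >> 24) & 0xFF   # top byte
    mid = (raw >> 16) & 0xFF    # next byte
    low = raw & 0xFFFF          # low 16-bit word
    return high, mid, low


def split_words(raw, count=3):
    """Split a raw value into `count` 16-bit words, most significant first."""
    return [(raw >> (16 * i)) & 0xFFFF for i in reversed(range(count))]


print(split_airflow_temp(0x15150015))   # (21, 21, 21)
# Assumed packing: 0x0004 / 0x0004 / 0xFFFF read left to right.
print(split_words(0x00040004FFFF))      # [4, 4, 65535]
```

The same shifts and masks work for any attribute whose raw value packs several counters, as long as you know (or guess, and say so) the field widths.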
 

mikeymikec

Lifer
May 19, 2011
17,714
9,598
136
Update:

As the customer was continually hassling me to get their laptop back, and as I had no solid reason to say no, I informed them of the non-ideal situation (at the time the system was stable with the SSD in, but the CRC SMART attribute regularly incremented). Just shy of a couple of weeks later, Windows refused to boot.

I checked things like whether the original SATA connector was properly seated, then bought a new one. I checked the SSD and its SMART readings while I had the laptop open. The cable was a bit of a pain to fit (board removal to get to the dark side of the board), but fingers crossed and a fresh Windows install later, the CRC attribute hasn't budged yet. I've installed Win10 1803 and am currently updating to 20H2; I'll do 21H1 after that. I suppose another thing I could try is throwing a tonne of data at the SSD to see if the CRC attribute changes, and also checking for disk errors/warnings in the event log.
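
One way to do that "before and after" check is to read attribute 199 (UDMA CRC error count) either side of a big copy. This is a rough sketch that assumes smartmontools is installed and that `smartctl -A` prints its usual ten-column attribute table; the device path is a placeholder, and the sample line in the demo is made up for illustration, not taken from this laptop.

```python
import subprocess


def crc_error_count(smartctl_output):
    """Return the raw value of attribute 199 from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10+ columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def read_crc_count(device):
    """Run smartctl on `device` (placeholder path, usually needs admin rights)."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return crc_error_count(result.stdout)


# Illustrative input only; the count here is invented.
sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37\n"
)
print(crc_error_count(sample))
```

Call `read_crc_count()` with the drive's device name before the copy and again afterwards. If the number goes up, the link is still dropping frames somewhere between the controller and the drive.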
 
Jul 27, 2020
16,329
10,343
106
I rely on Speedfan's HDDstatus info. It is very helpful in figuring out if a disk should be replaced. This happened yesterday:

1637507044252.png

Needless to say, this HDD is gonna end up in the trash pretty soon.