Why do dying HDDs overheat? What is creating the heat?

GoodEnough

Golden Member
Apr 24, 2011
1,546
19
81
Always meant to ask this.
Why do they get so crazy hot?
What is creating the heat?
 

mikeymikec

Lifer
May 19, 2011
17,767
9,727
136
TBH it's been a long time since I last encountered a particularly hot and faulty HDD. In desktops it used to be more common for cases to severely restrict airflow, with chassis materials that acted as heat insulators and components that generated a lot of heat, so the HDD would generally cook in the corner of a metal box. These days PSU design has changed so that at least some air is drawn from inside the case, and cases generally have more vents, so the air gets stirred a bit more.

Laptops have always been bad from a ventilation perspective (well, counting from the era when laptops became mainstream, so 2005-ish).

Another factor might be drive activity: if a drive is failing, it may be doing a lot of sector recovery work on top of its normal workload, so it's a busier drive that's hastening its own demise?

I can't remember the last time I was at risk of a burn from picking up an HDD; it used to be a lot more common IME.
 

corkyg

Elite Member | Peripherals
Super Moderator
Mar 4, 2000
27,370
238
106
Possibly friction is involved due to bearings getting worn. I have never experienced it.
 

sdifox

No Lifer
Sep 30, 2005
95,190
15,227
126
When a sector goes marginal or bad, the HDD moves the block somewhere else. When the whole HDD is full of bad sectors, it just thrashes without end. Thus the heat.
 

corkyg

When a sector goes marginal or bad, the HDD moves the block somewhere else. When the whole HDD is full of bad sectors, it just thrashes without end. Thus the heat.

Agree! And thrashing produces friction.
 

ch33zw1z

Lifer
Nov 4, 2004
37,791
18,089
146
Possibly friction is involved due to bearings getting worn. I have never experienced it.

Yup, mechanical failure is the first suspect; after that, possibly an electrical failure like a short or a failed motor, but mechanical would be my first guess.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Is the dying causing the overheating? Or is the overheating causing the dying?

Well, if the sector dying causes it to ramp up activity, that will make it heat up, and maybe too much.

So it's the former. It starts to die, it gets in a frenzy, and it overheats.
 

mikeymikec

However, as most of us with experience of drives reallocating sectors will attest to, it doesn't often happen immediately. I suspect the drive will schedule the reallocation op for whenever the drive detects an idle period. I often see ailing drives at the 'pending' stage.

Kinda blows a hole in the vicious circle theory IMO.

Unfortunately I haven't been able to find a good article that describes how the process of reallocation tends to be scheduled, though. I can't imagine there's much variation between each manufacturer's implementation or that they'd be tremendously secretive about it, but hey ho.

https://kb.acronis.com/content/9105
This article talks about a sudden decrease of 10% or more in drive performance when reallocation occurs, but I assume that's down to sequential performance tanking because the reserved sector is a continent away from the rest of the data.
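
For anyone who wants to watch the 'pending' stage on their own drive: smartmontools usually reports the two relevant counters as Reallocated_Sector_Ct (ID 5) and Current_Pending_Sector (ID 197). Below is a minimal Python sketch that pulls their raw values out of `smartctl -A` style text. The sample rows are made up for illustration, the helper name `raw_counts` is hypothetical, and real output varies by drive and vendor.

```python
# Hypothetical sample of `smartctl -A` attribute rows; real values vary by drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
"""

# The two attributes discussed in this thread.
WANTED = {"Reallocated_Sector_Ct", "Current_Pending_Sector"}


def raw_counts(text):
    """Return {attribute_name: raw_value} for the attributes in WANTED."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WANTED:
            counts[fields[1]] = int(fields[9])
    return counts


if __name__ == "__main__":
    print(raw_counts(SAMPLE))
```

A non-zero pending count with a zero reallocated count, as in the sample, is the "ailing but not yet remapped" state described above.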
 

corkyg

When a drive gets so full that reallocation becomes difficult, bad things happen. That's why I always leave 30% free.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,376
10,068
126
When a drive gets so full that reallocation becomes difficult, bad things happen. That's why I always leave 30% free.
That may be well and good for the drive and for performance as far as OS / filesystem-level fragmentation goes, but it shouldn't have any bearing on re-allocation of sectors on a HDD. HDDs actually come with a certain percentage of "spare area" that is NOT host-accessible, which they use for sector-sparing. When that area is used up, the drive DOES NOT start using host-accessible sectors as spares; it simply starts to error.

So, Corky, your practice is well-intentioned, but not actually helpful for that particular purpose. Like I said, though, it's good to reduce fragmentation at the file-system level.
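
To make the point concrete, here is a toy Python sketch of a drive whose spare pool is separate from the host-visible sectors. It is an illustration under simplifying assumptions, not how any particular firmware works; the class name `SpareAreaModel` and its methods are hypothetical.

```python
class SpareAreaModel:
    """Toy model: host-visible sectors plus a separate, hidden spare pool."""

    def __init__(self, host_sectors, spare_sectors):
        self.host_sectors = host_sectors   # capacity the OS can see
        self.spares_left = spare_sectors   # hidden pool, never host-accessible
        self.remapped = set()              # LBAs already redirected to a spare

    def spare_out(self, lba):
        """Redirect one bad LBA to a spare sector, or fail if none remain."""
        if not 0 <= lba < self.host_sectors:
            raise ValueError("LBA out of range")
        if lba in self.remapped:
            return  # already spared, nothing to do
        if self.spares_left == 0:
            # The model never borrows host-visible sectors; it just errors.
            raise OSError("spare area exhausted")
        self.spares_left -= 1
        self.remapped.add(lba)


if __name__ == "__main__":
    drive = SpareAreaModel(host_sectors=1000, spare_sectors=2)
    drive.spare_out(10)
    drive.spare_out(20)
    print(drive.host_sectors, drive.spares_left)
```

Note that nothing in `spare_out` looks at how full the filesystem is: in this model, free space on the host side has no effect on whether a bad sector can be spared.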
 

VirtualLarry

Unfortunately I haven't been able to find a good article that describes how the process of reallocation tends to be scheduled though.
It's not "scheduled" at all. It simply waits until there is a host write to that particular logical sector, then it writes to the "spare sector", and updates the re-mapping table, stored in the firmware / SMART / remapping table sectors in a hidden area (not normally host-accessible, without diagnostic-level commands) at the beginning of the drive.
 

corkyg

Yep, there are hidden reserves, but I still like a healthy freeboard. :)
 

mikeymikec

It's not "scheduled" at all. It simply waits until there is a host write to that particular logical sector, then it writes to the "spare sector", and updates the re-mapping table, stored in the firmware / SMART / remapping table sectors in a hidden area (not normally host-accessible, without diagnostic-level commands) at the beginning of the drive.

Citation needed. Surely if it was done pretty much immediately then a 'pending sectors' attribute wouldn't be required?
 

VirtualLarry

Surely if it was done pretty much immediately then a 'pending sectors' attribute wouldn't be required?
Sure it is. The error is detected on a read, but it's not safe to replace that host-accessible sector until the host writes to it. Pretty straightforward.
 

mikeymikec

Sure it is. The error is detected on a read, but it's not safe to replace that host-accessible sector until the host writes to it. Pretty straightforward.

Isn't there a massive flaw in that implementation though, given that most files in a given filesystem do not get written to with any kind of regularity?

(NB: I'm not arguing at this point along the lines of "therefore you must be wrong"... although this disagrees with you:
"Any requests to read or write to that damaged sector will transparently be redirected to a spare sector.")
 

VirtualLarry

Did you read the winning answer? It's completely correct, and thorough.

My hard drive has 2 sectors that the drive recognizes as bad, but that cannot be reallocated yet. If you were to attempt to read one of these ‘Pending sectors’, the drive would likely retry (and retry, and retry), and eventually return a read error to the host operating system as shown below:-

It doesn't remap on reads, normally. It usually just errors, and hangs the drive for a while (unless it's a TLER drive). It remaps on writes.
 

mikeymikec

Did you read the winning answer? It's completely correct, and thorough.

Of course I did, I posted it here for a reason.

It doesn't remap on reads, normally. It usually just errors, and hangs the drive for a while (unless it's a TLER drive). It remaps on writes.

Two conflicting theories, no citations.
 

VirtualLarry

I think that you're misunderstanding here.

That statement is talking about AFTER a sector has been "spared" by the drive. At that point, any attempt to read or write the logical host address (LBA) of the original sector will be transparently re-directed to the spare sector.

I was talking about BEFORE the sector is spared. The data is generally damaged and cannot be read, so further attempts to read that sector will more or less hang the drive until it times out (the first time that happens, the sector gets added to the "pending" list in SMART). Only when that LBA is written can the drive transparently re-map it: it re-directs the written host sector to the spare sector and updates the internal remap table. From then on, reads of that host LBA will be transparently re-directed to the spare sector.
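
The before/after sequence described here can be sketched as a small state machine in Python. This is a toy model of the behavior as described in this post, not a claim about any vendor's firmware; `ToyDrive` and all of its fields are hypothetical names, and timeouts and retries are left out.

```python
class ToyDrive:
    """Toy model of the pending -> remapped life cycle of a bad sector."""

    def __init__(self, spare_count):
        self.data = {}                      # LBA -> payload on the original sector
        self.damaged = set()                # LBAs whose original sector is unreadable
        self.pending = set()                # bad on read, not yet remapped
        self.remap = {}                     # LBA -> index into self.spares
        self.spares = [None] * spare_count  # hidden spare sectors
        self.next_spare = 0

    def read(self, lba):
        if lba in self.remap:
            # AFTER sparing: reads are redirected transparently.
            return self.spares[self.remap[lba]]
        if lba in self.damaged:
            # BEFORE sparing: the read fails and the sector is marked pending.
            self.pending.add(lba)
            raise OSError(f"read error at LBA {lba}")
        return self.data.get(lba, b"")

    def write(self, lba, payload):
        if lba in self.remap:
            self.spares[self.remap[lba]] = payload
        elif lba in self.pending:
            # A host write supplies fresh data, so remapping is now safe.
            if self.next_spare >= len(self.spares):
                raise OSError("spare area exhausted")
            index = self.next_spare
            self.next_spare += 1
            self.spares[index] = payload
            self.remap[lba] = index
            self.pending.discard(lba)
        else:
            self.data[lba] = payload


if __name__ == "__main__":
    drive = ToyDrive(spare_count=4)
    drive.write(7, b"old")
    drive.damaged.add(7)          # simulate the sector going bad
    try:
        drive.read(7)
    except OSError as err:
        print(err, "pending:", drive.pending)
    drive.write(7, b"new")        # the host write triggers the remap
    print(drive.read(7), "pending:", drive.pending)
```

In this model the old contents of LBA 7 are never recovered; the pending entry only clears once the host overwrites the sector, which is why a rarely-written file can leave a sector sitting at 'pending' for a long time.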