Is bit rot a concern?

It's Not Lupus

Senior member
Aug 19, 2012
838
3
76
I have about 3 TB of data with one back-up copy. The file system used is NTFS.

Anyways, I was wondering about bit rot recently, if I should be concerned, and possible ways to protect against it (without building a ZFS machine).
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Maybe, maybe not. It's hard to tell. One HDD may never lose a sector, while one might have a platter with thousands of marginal ones; or maybe some writes will be only barely passable, and take little in the way of weakening to become unreadable.

You'd generally want PAR, or some kind of SFV system, to keep track of checksums. Windows being my primary desktop OS, I lazily use 7z--CRC32 might not be the best CRC for arbitrary data sizes, but the chances of just the right bits flipping so that the CRC doesn't change are still astronomically small (one of those, "it's basically good enough, but you'd be stupid to design something today with it," kind of things).
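For anyone who wants to see what an SFV-style checksum list boils down to, here is a minimal Python sketch. The function names `crc32_of_file` and `write_sfv` are made up for illustration, and the one-line-per-file layout is only loosely modelled on SFV, not a faithful copy of what 7z or any SFV tool writes.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as an 8-digit uppercase hex string."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in RAM.
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def write_sfv(folder, sfv_path):
    """Write one 'relative/path CRC32' line per file found under folder.

    Hypothetical helper: keep sfv_path outside of folder, otherwise an
    older checksum list would be picked up as just another file.
    """
    folder = Path(folder)
    lines = []
    for p in sorted(folder.rglob("*")):
        if p.is_file():
            rel = p.relative_to(folder).as_posix()
            lines.append(f"{rel} {crc32_of_file(p)}")
    Path(sfv_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Re-running `crc32_of_file` later and comparing against the stored line is the whole check. Note that this only detects damage; unlike PAR, it cannot repair anything.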
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Good idea, Cerb. Although I have multiple off-site backups, running PAR on my user folders is a good idea.
 

stlc8tr

Golden Member
Jan 5, 2011
1,106
4
76
I've been using checksum to generate an MD5 hash for my backups. You can also use SHA1 if so inclined.

There are a few similar utilities like ExactCopy but I like checksum because it's lightweight and well integrated into Windows. Supposedly it's also faster but I haven't done any benchmarks.
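The same idea can be sketched in a few lines of Python with the standard `hashlib` module. This is not how the checksum utility works internally or what its hash files look like; `hash_file`, `build_manifest` and `verify` are names invented for this sketch, and the manifest is just an in-memory dict.

```python
import hashlib
from pathlib import Path


def hash_file(path, algo="md5", chunk_size=1 << 20):
    """Hash a file in chunks; algo can be 'md5', 'sha1', 'sha256', ..."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(folder, algo="md5"):
    """Map each file's relative path to its digest at backup time."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): hash_file(p, algo)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify(folder, manifest, algo="md5"):
    """Return the relative paths that are missing or no longer match."""
    folder = Path(folder)
    bad = []
    for rel, want in manifest.items():
        p = folder / rel
        if not p.is_file() or hash_file(p, algo) != want:
            bad.append(rel)
    return bad
```

You would build the manifest when the backup is made, store it somewhere safe, and run `verify` whenever you want to know whether the copy is still intact.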
 

ashetos

Senior member
Jul 23, 2013
254
14
76
For NTFS I run synctoy with the "check file contents" option enabled. I have done targeted experiments and it indeed detects data differences even if size, date and everything else is identical, especially in "synchronize" mode.

I would guess it is friendly enough for most users, and its protection level is pretty good in my opinion.
 

corkyg

Elite Member | Peripherals
Super Moderator
Mar 4, 2000
27,370
238
106
"Bitrot" has, in the past, been associated with magnetic media. As Cerb said, a lot depends on how it is stored, etc. This past week I was pleasantly surprised to open some files I created back in 1990 stored on 3.5-in floppies using old WordPerfect 5.1. They opened perfectly in WordPerfect X5. I have since re-copied them to other more contemporary media.

These diskettes have been stored in a drawer in a climate controlled room for 24 years, and they are in good shape. No evidence of bitrot.

As for "automated" checking, I have a personal distrust of anything automated, especially when on-the-spot decisions may be needed. :)
 

It's Not Lupus

Senior member
Aug 19, 2012
838
3
76
For NTFS I run synctoy with the "check file contents" option enabled. I have done targeted experiments and it indeed detects data differences even if size, date and everything else is identical, especially in "synchronize" mode.

I would guess it is friendly enough for most users, and its protection level is pretty good in my opinion.
I've been using SyncToy too, but never bothered to check the additional options. Thanks, I'll consider this.
 

Turbonium

Platinum Member
Mar 15, 2003
2,109
48
91
Bit rot confuses me. Has it only become an issue with the advent of larger capacity HDDs? I never really heard about it until a year ago.

I'm a bit paranoid of it, despite my backups, and backups of the backups.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
It's nothing new, though the physical size and SNR of your data keep getting smaller.

No need for paranoia. Just have a way to verify your backups. There's a good chance that your specific set of storage device samples will never suffer from it.
 

KingFatty

Diamond Member
Dec 29, 2010
3,034
1
81
For NTFS I run synctoy with the "check file contents" option enabled. I have done targeted experiments and it indeed detects data differences even if size, date and everything else is identical, especially in "synchronize" mode.

I would guess it is friendly enough for most users, and its protection level is pretty good in my opinion.

But would this catch bit rot? I'm not certain, but synctoy seems to do some kind of checksum at first, and then I'm not sure how it decides to ever recalculate that checksum if bit rot occurs. I would guess that synctoy instead would not bother checking the file contents unless it thought something was changed by the windows operating system. But bit rot is like coming in through the back door where it's not a change caused by the OS or anything, so I'm not sure synctoy would even know there was a trigger to cause it to recalculate the checksum or whatever to try to catch bitrot?
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Higher platter density makes this worse though, right?
That basically is higher platter density. Other advancements help offset it, like more robust ECC, better heads, and more precise control of the arm.

The thing is, if you do what's needed to help protect against bit rot, or any apparent URE problem that appears the same, you will have also protected against several avenues of software-level corruption, and loss of the whole HDD.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
But would this catch bit rot? I'm not certain, but synctoy seems to do some kind of checksum at first, and then I'm not sure how it decides to ever recalculate that checksum if bit rot occurs. I would guess that synctoy instead would not bother checking the file contents unless it thought something was changed by the windows operating system. But bit rot is like coming in through the back door where it's not a change caused by the OS or anything, so I'm not sure synctoy would even know there was a trigger to cause it to recalculate the checksum or whatever to try to catch bitrot?
Not sure what synctoy would do or not do, but you'd encounter a CRC error, and it would be in your system event log.
 

Turbonium

Platinum Member
Mar 15, 2003
2,109
48
91
That basically is higher platter density. Other advancements help offset it, like more robust ECC, better heads, and more precise control of the arm.

The thing is, if you do what's needed to help protect against bit rot, or any apparent URE problem that appears the same, you will have also protected against several avenues of software-level corruption, and loss of the whole HDD.
What is needed? Sorry if this is a silly question, but I'm a bit confused.

All I can see so far is rewriting data regularly, and I don't even know what "regularly" is.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
What is needed? Sorry if this is a silly question, but I'm a bit confused.

All I can see so far is rewriting data regularly, and I don't even know what "regularly" is.
Against a failed read, of any kind that is detected by the drive or controller, another copy. Against more than that, the ability to compare the CRCs from when the backup was made, at the least. Most commercial backup software does this stuff, in one way or another.

What regularly is depends on how important it is that you know all your copies of all your data are good right now. There's no real hard rule, if you can't quantify or qualify lost work due to the loss, so whatever works for you is what is good.
 

ashetos

Senior member
Jul 23, 2013
254
14
76
But would this catch bit rot? I'm not certain, but synctoy seems to do some kind of checksum at first, and then I'm not sure how it decides to ever recalculate that checksum if bit rot occurs. I would guess that synctoy instead would not bother checking the file contents unless it thought something was changed by the windows operating system. But bit rot is like coming in through the back door where it's not a change caused by the OS or anything, so I'm not sure synctoy would even know there was a trigger to cause it to recalculate the checksum or whatever to try to catch bitrot?

I understand your scepticism, let me analyze my experiments with synctoy.

Let's say we have a 2GB video file, that is backed up on a 2GB USB flash drive. I run synctoy to create the backup.

Then I boot into Linux (no NTFS mounted) and I corrupt the file in a raw device way: I zero out some blocks in the middle of the USB flash drive, which are bound to belong to the file contents.

Then I reboot into Windows, which has no idea about the bit changes, and neither does NTFS, since I never mounted NTFS even in Linux. I check that the metadata are identical, such as file size, modified date, creation date etc.

I have made this change in both the left synctoy copy and the right synctoy copy.

In echo mode I remember that not all changes are detected; I think it just checks one direction.

In synchronize mode all differences are captured successfully.

My workflow with synctoy is:
Echo->Run (updates backup)
Change Echo to Synchronize->Preview (does not actually synchronize but checks differences)
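To illustrate why a content check catches what a metadata check misses, here is a small Python sketch of the two kinds of comparison. It is not SyncToy's actual algorithm (I don't know its internals); `same_metadata` and `same_contents` are hypothetical helpers.

```python
import os


def same_metadata(a, b):
    """Cheap check: only size and modification time are compared."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and sa.st_mtime_ns == sb.st_mtime_ns


def same_contents(a, b, chunk_size=1 << 20):
    """Expensive check: read both files fully and compare every byte."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk_size), fb.read(chunk_size)
            if ca != cb:
                return False  # differing bytes, or one file ended early
            if not ca:
                return True   # both files ended at the same point
```

A file with zeroed blocks and untouched timestamps passes `same_metadata` but fails `same_contents`. Like the Preview step, this only tells you the two copies differ, not which one is the good one.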
 

stlc8tr

Golden Member
Jan 5, 2011
1,106
4
76
I understand your scepticism, let me analyze my experiments with synctoy.

I think his question has more to do with how you can tell which version of file is "true". Does synctoy save a hash file for each directory?

With checksum (or any of the hashing utilities), a hash file is created so you can tell if a file has suffered from bitrot.

(Of course, the hash file itself can get corrupted but if you have two copies of the data, you can hash the data again to check the hash file.)
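Here is a rough Python sketch of that three-way reasoning: two copies of the data plus one recorded hash, with the odd one out being the suspect. The function names and the choice of SHA-1 are mine, purely for illustration.

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def diagnose(copy_a, copy_b, recorded):
    """Compare two copies against the hash recorded at backup time."""
    a, b = sha1_of(copy_a), sha1_of(copy_b)
    if a == recorded and b == recorded:
        return "all good"
    if a == recorded:
        return "copy B is damaged"
    if b == recorded:
        return "copy A is damaged"
    if a == b:
        # Both copies still agree with each other, so the stored
        # hash is the more likely victim.
        return "recorded hash is suspect"
    return "no two agree"
```

With only the two copies and no recorded hash, a mismatch tells you something is wrong but not which side to trust, which is the gap the hash file fills.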
 

code65536

Golden Member
Mar 7, 2006
1,006
0
76
(Of course, the hash file itself can get corrupted but if you have two copies of the data, you can hash the data again to check the hash file.)

Hash file corruption is easy to detect, though, if you can see the hashes. Hashes are designed such that even a tiny 1-bit change in the data results in a radically different hash, so the likelihood that a random corruption of the data would produce a hash almost identical to the original is virtually as low as the likelihood that it would produce the same hash.

So if there's a very small, localized corruption of a hash file, say, a single bit flip, and you see an expected hash of 0x0323456789abcdef for a file that hashes into 0x0123456789abcdef, you'll know that it's the hash that was corrupted, not the file.

For larger corruptions of a hash file, you'll just see a corrupted hash file that doesn't parse correctly or is garbled.
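That "almost identical hash" test can be made mechanical by counting how many bits differ between the recorded and the recomputed digest. A minimal Python sketch follows; the names and the cut-off of 4 bits are arbitrary assumptions of mine, not an established rule.

```python
def bit_distance(hex_a, hex_b):
    """Count differing bits between two equal-length hex digests."""
    if len(hex_a) != len(hex_b):
        raise ValueError("digests must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def likely_culprit(recorded_hex, computed_hex, threshold=4):
    """Guess what was damaged from how far apart the digests are."""
    d = bit_distance(recorded_hex, computed_hex)
    if d == 0:
        return "match"
    if d <= threshold:
        # A real data change would scramble about half of the bits,
        # so a near-miss points at the stored hash itself.
        return "hash record"
    return "file data"
```

For a decent hash, two unrelated digests differ in roughly half their bits, so a distance of one or two bits is a strong hint that the stored hash took the hit.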
 

CuriousMike

Diamond Member
Feb 22, 2001
3,044
543
136
Is bit rot more a function of when the file is written, or something that occurs over time?

The gist I'm getting from this thread is that it happens over time.
 

C1

Platinum Member
Feb 21, 2008
2,316
77
91
My experience has been that problems seem most likely to occur when writing large files (e.g., video), particularly when attempting concurrent transfers or writes of large files to the same partition while doing some CPU/HDD-intensive activity (like complex photo editing/saving, or video editing where lots of disk caching may be needed).
 

Mark R

Diamond Member
Oct 9, 1999
8,513
14
81
Is bit rot more a function of when the file is written, or something that occurs over time?

The gist I'm getting from this thread is that it happens over time.

There are many causes:

1. Corruption on hard drives because a sector has become corrupted and unreadable (tends to result in loss of a whole sector, 512 bytes, or more commonly these days 4096 bytes).
2. Random bit flips (overclocking, flaky hardware, cosmic rays)
3. Random corruptions (software bugs, loose or damaged cables, RAID card malfunction)

Point 1 happens over time when the data is simply being stored. Due to some sort of hard drive malfunction, a sector may not be readable (perhaps, there was a power glitch or vibration during writing data, and the drive's self diagnostics didn't notice the glitch and retry; or perhaps that bit of platter is a bit weak; or perhaps there was a glitch during reading).

This is more of an issue these days as we now store more data, but the overall reliability of storage hasn't increased. Traditionally hard drives were specified for approximately 1 unreadable sector in every 10 TB read - not much of an issue when we had 10 GB drives. With modern 4 TB drives, it's potentially 50:50 that if you fill a drive completely, you'll get it all back (OK, that's a worst-case specification; in practice, performance is better than that, but you get the point).
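The worst-case figure can be turned into a rough probability. The sketch below assumes unreadable sectors turn up independently at a constant rate of one per 10 TB (a Poisson model, which is my simplification and not something the drive specs promise); the function name is made up.

```python
import math


def p_at_least_one_error(bytes_read, bytes_per_error=10e12):
    """Chance of hitting one or more unreadable sectors in a full read.

    Assumes errors are independent with a constant average rate of
    one per `bytes_per_error` bytes (Poisson model, illustrative only).
    """
    expected_errors = bytes_read / bytes_per_error
    # 1 - exp(-x), written with expm1 for accuracy at small x.
    return -math.expm1(-expected_errors)


print(f"10 GB drive: {p_at_least_one_error(10e9):.4f}")
print(f" 4 TB drive: {p_at_least_one_error(4e12):.4f}")
```

Under those assumptions a full read of a 10 GB drive hits an error about 0.1% of the time, while a 4 TB drive comes out at about 33%. That is somewhat short of 50:50, but in the same ballpark, and it shows how the risk scales with capacity.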

If you've got RAID, then this type of corruption can be detected as it happens. Most high-end RAID cards will automatically read all the data on all the connected hard drives, and correlate the data to make sure it is consistent. They will generate an alert if the data doesn't match. Modern file systems like ReFS and ZFS have checksums. If you have redundancy as well, they can work out which hard drive had the correct data during a background scan and correct it automatically.
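As a toy model of that checksum-plus-redundancy scrub, here is a Python sketch with two mirrored lists of blocks and a checksum recorded for each block at write time. It is only meant to show the logic; real file systems such as ZFS and ReFS use their own on-disk structures and checksum algorithms, and `scrub` is a name I made up.

```python
import zlib


def scrub(mirror_a, mirror_b, checksums):
    """Check every block on both mirrors and repair from the good copy.

    mirror_a, mirror_b: lists of bytes objects (modified in place).
    checksums: CRC32 of each block as recorded when it was written.
    Returns a list of (block_index, action) for blocks with problems.
    """
    report = []
    for i, want in enumerate(checksums):
        ok_a = zlib.crc32(mirror_a[i]) == want
        ok_b = zlib.crc32(mirror_b[i]) == want
        if ok_a and not ok_b:
            mirror_b[i] = mirror_a[i]
            report.append((i, "repaired B"))
        elif ok_b and not ok_a:
            mirror_a[i] = mirror_b[i]
            report.append((i, "repaired A"))
        elif not ok_a and not ok_b:
            # Neither copy matches: nothing trustworthy to copy from.
            report.append((i, "unrecoverable"))
    return report
```

The checksum is what breaks the tie. A plain mirror without checksums can notice that the two sides disagree, but has no way to know which side is right.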

2. These sorts of corruptions happen when data is in transit. Data might be copied to RAM, and the RAM might suffer random corruption for some reason, before the data gets copied back to HD. Cosmic rays are really rare (about 1 bit flip per GB per year), but if you're doing a lot of data moving, it can happen. More likely is that you have some borderline RAM or a borderline CPU/mobo combo, which can cause random RAM corruption more frequently. This is especially true if overclocked.

3. Any kind of malfunctioning hardware or software can cause random data corruption. E.g. an OS crash might cause the file system to go crazy and save data in the wrong place on a hard drive, corrupting a random file. Bad cables can cause strange errors - I once had a USB hard drive that I used to archive photos, it corrupted a couple of dozen, by just missing out a sector when writing the files, every now and again. Similarly, a friend had a hardware RAID card on a server go crazy, and it just spewed out random crap onto all the connected hard drives.
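One cheap defence against write-time corruption like the USB drive case is to read the copy back and compare hashes before trusting it. A minimal Python sketch, with `verified_copy` being a hypothetical helper name:

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def verified_copy(src, dst):
    """Copy src to dst, then re-read both and compare their hashes."""
    shutil.copyfile(src, dst)
    src_hash, dst_hash = sha256_of(src), sha256_of(dst)
    if src_hash != dst_hash:
        raise OSError(f"copy of {src} does not match the original")
    return dst_hash
```

One caveat: right after a copy, the operating system may serve the read-back from its RAM cache rather than from the disk, so a pass here is weaker evidence than re-checking after the drive has been unplugged and reconnected.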