Hard drive maintenance on home server?

nobb

Senior member
May 22, 2005
237
0
0
I built a home server running Windows 2000 on an old 233 MHz Pentium II machine with a 640 GB WD SE16 hard drive. I use it for backups, for serving files over my home network, and for downloading torrents. The machine runs 24/7.

I am just wondering whether any regular maintenance is needed to keep the server in top shape. I am not too worried about hard drive failure, since I back up everything on the server to a separate external hard drive. My worry is data corruption (bit rot?) that would then be duplicated onto the external drive. Is this a valid concern?

Maybe you guys have recommendations on some good disk maintenance tools. I do periodic scandisk and defrag operations, but would prefer it if this could be scheduled automatically instead. I am currently using Microsoft SyncToy to synchronize all my files.

 

BlueAcolyte

Platinum Member
Nov 19, 2007
2,793
2
0
Originally posted by: Blain

SpinRite

So pretty much: You don't have to do anything until it fails because you are backing up.

About the disk-corruption: Keep more than one backup (older ones, for example.)
 

nobb

Senior member
May 22, 2005
237
0
0
Yeah, I did look into SpinRite, but I would prefer something more automated that runs within Windows. The difficulty with keeping older files as backups is that individual file corruption could go unnoticed: I have hundreds of thousands of files, and most of them I may never open often enough to find out whether they are corrupted.

Do you guys think bit rot is even a valid concern these days, especially with modern Error Correction Code (ECC)?
 

Blain

Lifer
Oct 9, 1999
23,643
3
81
SpinRite FAQ...
How often should SpinRite be run for preventive maintenance?

This is mostly a matter of personal taste. For example, how often should you back up your data? However, a general rule of thumb would be that SpinRite should be run every two or three months. Running it more often provides greater safety at the expense of the time consumed. Running it less often provides more opportunity for new problems to go undetected until they become severe. Once every few months should be often enough to catch and detect any early trouble.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Screw spending money on spinrite. Look for HDTach 2.61 (version is important). That version gave you very detailed graphs (better than more modern versions), and can be used for diagnosing a HD that's going to fail soon. You have to learn how to read the graphs.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,646
2,030
126
Yo, there, VirtualLarry and BlueAcolyte!!

I was going to post a new thread, but this looks like a good place to seek comments if Nobb doesn't mind.

I've got to maintain five household machines on a gigabit LAN, and I, too, have a server such as Nobb's --- but saving my pennies for the Home Server to replace the Win 2K Server OS. In fact, I'm going to just rebuild it with a spare LGA-775 mobo and a spare C2D processor.

But my immediate concern is slightly different.

I built a machine for a not-very-tech-savvy family member last year, using an SATA-150 drive from a P4 system (built 2004) that I passed on to my brother.

Night before last, we discovered the system locking up, and exhibiting periodic sluggishness. Finally, I thought I had it licked -- ran some registry-cleanup SW, defragged, etc. Made (almost a) mistake of installing XP's SP3. System hung at the end where it displays " . . . cleaning up . . . " Couldn't get to boot in Safe mode (w/wo networking) or in Normal mode.

All the important files had been backed up, so we decided to install a spare SATA2 drive and reload the OS and all software. Before going forward, I ran Memtest86+ -- ruling out a problem with memory or CPU. I started shopping for a mobo replacement (just in case), but first ran Hitachi's diagnostic disk-testing and repair software.

After Hitachi DFT determined that there was a drive problem, I ran the "Advanced" tests, and then decided to use DFT's "repair" feature to repair damaged blocks and sectors. That software gave it a clean bill of health. I was surprised to find that XP with SP3 booted right up. I'm posting this message on the machine as I continue to troubleshoot, but at this point, it seems normal.

I'm wondering if I should postpone adding the replacement disk. Of all the hard disks I've had in 20 years, only about three or four have "gone south." But some people discover that disks that've been "repaired" tend to spawn more errors.

The other possibility is to clone the existing drive with Acronis or Partition Commander. I just downloaded HD Tach (only v. 3.0.4.0 is available.) Any short insights, comments or suggestions?

 

Elixer

Lifer
May 7, 2002
10,371
762
126
Originally posted by: nobb
I built a home server running Windows 2000 on an old 233 MHz Pentium II machine with a 640 GB WD SE16 hard drive. I use it for backups, for serving files over my home network, and for downloading torrents. The machine runs 24/7.

I am just wondering whether any regular maintenance is needed to keep the server in top shape. I am not too worried about hard drive failure, since I back up everything on the server to a separate external hard drive. My worry is data corruption (bit rot?) that would then be duplicated onto the external drive. Is this a valid concern?

Maybe you guys have recommendations on some good disk maintenance tools. I do periodic scandisk and defrag operations, but would prefer it if this could be scheduled automatically instead. I am currently using Microsoft SyncToy to synchronize all my files.

If you want a free solution, there are a few ways you can handle this. One is to compute an MD5 checksum for every file. Then, whenever you back up or copy files, recompute the checksum on the copy and compare it with the original; a match tells you the copy is intact.
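For what it's worth, the checksum idea can be sketched in a few lines of Python. This is only an illustration, not any particular utility: the function names (`md5_of_file`, `build_manifest`, `compare_trees`) are made up for this example, and it assumes the backup mirrors the source folder layout.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def compare_trees(source_root, backup_root):
    """Return relative paths that are missing from the backup or differ from the source."""
    source = build_manifest(source_root)
    backup = build_manifest(backup_root)
    return sorted(path for path, digest in source.items() if backup.get(path) != digest)
```

Calling `compare_trees` on the server folder and the external drive after each sync returns an empty list when every file made it across unchanged. Note that it only tells you the two copies differ, not which one is the good one.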

If you are a bit more paranoid about this, then you make par2 recovery files. This allows you to 'recreate' the original file in case of corruption. It does eat up more file space though.
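To give a feel for how recovery data can 'recreate' a damaged piece, here is a toy Python sketch using a single XOR parity block. To be clear, this is not how par2 works internally (par2 uses Reed-Solomon codes and can repair several damaged blocks); it is a simplified stand-in, and all the function names are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_into_blocks(data, block_size):
    """Split data into block_size pieces, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def rebuild_block(blocks, parity, damaged_index):
    """Recover one damaged block from the surviving blocks plus the parity block."""
    survivors = [block for i, block in enumerate(blocks) if i != damaged_index]
    return xor_blocks(survivors + [parity])
```

The parity block is computed once with `xor_blocks(split_into_blocks(data, 8))` and stored alongside the file. If exactly one block later goes bad, and you know which one (for example from a per-block checksum), `rebuild_block` returns its original contents. The extra parity block is the "more file space" cost.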

You can find free utilities that do the above pretty easily.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
MS' Windows Home Server monitors disk integrity by running automatic Chkdsk scans every twelve hours. That tells WHS if the files are readable or not. If a drive fails, you get a "health notification" warning in the WHS icon in the task bar.

It seems like a reasonable approach and it's free.
 

nobb

Senior member
May 22, 2005
237
0
0
^That would be exactly what I am looking for. Unfortunately, I don't think WHS would run too well on my 233 MHz P2 system with 128 MB of RAM.
 

BlueAcolyte

Platinum Member
Nov 19, 2007
2,793
2
0
Originally posted by: BonzaiDuck
Yo, there, VirtualLarry and BlueAcolyte!!

I was going to post a new thread, but this looks like a good place to seek comments if Nobb doesn't mind.

I've got to maintain five household machines on a gigabit LAN, and I, too, have a server such as Nobb's --- but saving my pennies for the Home Server to replace the Win 2K Server OS. In fact, I'm going to just rebuild it with a spare LGA-775 mobo and a spare C2D processor.

But my immediate concern is slightly different.

I built a machine for a not-very-tech-savvy family member last year, using an SATA-150 drive from a P4 system (built 2004) that I passed on to my brother.

Night before last, we discovered the system locking up, and exhibiting periodic sluggishness. Finally, I thought I had it licked -- ran some registry-cleanup SW, defragged, etc. Made (almost a) mistake of installing XP's SP3. System hung at the end where it displays " . . . cleaning up . . . " Couldn't get to boot in Safe mode (w/wo networking) or in Normal mode.

All the important files had been backed up, so we decided to install a spare SATA2 drive and reload the OS and all software. Before going forward, I ran Memtest86+ -- ruling out a problem with memory or CPU. I started shopping for a mobo replacement (just in case), but first ran Hitachi's diagnostic disk-testing and repair software.

After Hitachi DFT determined that there was a drive problem, I ran the "Advanced" tests, and then decided to use DFT's "repair" feature to repair damaged blocks and sectors. That software gave it a clean bill of health. I was surprised to find that XP with SP3 booted right up. I'm posting this message on the machine as I continue to troubleshoot, but at this point, it seems normal.

I'm wondering if I should postpone adding the replacement disk. Of all the hard disks I've had in 20 years, only about three or four have "gone south." But some people discover that disks that've been "repaired" tend to spawn more errors.

The other possibility is to clone the existing drive with Acronis or Partition Commander. I just downloaded HD Tach (only v. 3.0.4.0 is available.) Any short insights, comments or suggestions?

Unfortunately, bad sectors are usually followed by... More bad sectors!!!

Sorry BonzaiDuck, it's a good thing you have a replacement.

 

coolVariable

Diamond Member
May 18, 2001
3,724
0
76
Originally posted by: RebateMonger
MS' Windows Home Server monitors disk integrity by running automatic Chkdsk scans every twelve hours. That tells WHS if the files are readable or not. If a drive fails, you get a "health notification" warning in the WHS icon in the task bar.

It seems like a reasonable approach and it's free.

I don't think he is talking about disk integrity but rather data corruption ... e.g. from solar storms when bits are erroneously flipped.
I had a couple of digital photos destroyed that way. HDD was fine ... nothing indicated an error. But the JPEGs were grey starting about 25-40% from the top. Was totally random and had nothing to do with disk integrity.
ECC or checksum would be the only way to catch it but I am not aware of any software checking for this as a background service.
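A rough Python sketch of what such a check could look like, if someone wanted to script it and run it on a schedule. The assumption here is that a file whose contents changed while its size and modification time stayed the same was probably not edited on purpose. The function names and the JSON file layout are made up for illustration.

```python
import hashlib
import json
import os


def snapshot(root, chunk_size=1 << 20):
    """Record size, modification time, and MD5 digest for every file under root."""
    records = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            digest = hashlib.md5()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    digest.update(chunk)
            records[os.path.relpath(path, root)] = {
                "size": info.st_size,
                "mtime_ns": info.st_mtime_ns,
                "md5": digest.hexdigest(),
            }
    return records


def save_snapshot(records, manifest_path):
    """Write a snapshot to disk as JSON so a later run can compare against it."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)


def load_snapshot(manifest_path):
    """Read back a snapshot written by save_snapshot."""
    with open(manifest_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def suspicious_files(old, new):
    """Return files whose digest changed while size and mtime stayed the same."""
    flagged = []
    for path, before in old.items():
        after = new.get(path)
        if after is None:
            continue  # deleted or moved files are a separate question
        same_metadata = (before["size"] == after["size"]
                         and before["mtime_ns"] == after["mtime_ns"])
        if same_metadata and before["md5"] != after["md5"]:
            flagged.append(path)
    return sorted(flagged)
```

The idea would be to save a snapshot, run `snapshot` again some weeks later, and look at whatever `suspicious_files` returns before the next sync overwrites the good copy on the backup drive. Keep the manifest file outside the folder being scanned.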
 

nobb

Senior member
May 22, 2005
237
0
0
Yeah, overall bit degradation as mentioned by coolVariable is what I am asking about. The hardware I am running is very old, and I doubt my 128 MB of SDRAM has any sort of ECC. I had never actually thought solar flares could corrupt data like this. I know space radiation can affect flash memory, but I never thought about disk storage. It does make sense though, and it is interesting that you brought it up.
 

Blain

Lifer
Oct 9, 1999
23,643
3
81
Does your memory have an odd or even number of chips?
Odd = ECC
Even = Non-ECC
 

coolVariable

Diamond Member
May 18, 2001
3,724
0
76
Originally posted by: Blain
Does your memory have an odd or even number of chips?
Odd = ECC
Even = Non-ECC

We are talking HDD not RAM.
The bits on the HDD can get corrupted/flipped by "radiation" or simply deterioration over time.

I believe there is a file system out there that has an algorithm to check for this. Somebody on AT posted about it not too long ago, though you would probably be stuck using Linux/Unix.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,646
2,030
126
Originally posted by: coolVariable
Originally posted by: Blain
Does your memory have an odd or even number of chips?
Odd = ECC
Even = Non-ECC

We are talking HDD not RAM.
The bits on the HDD can get corrupted/flipped by "radiation" or simply deterioration over time.

I believe there is a file system out there that has an algorithm to check for this. Somebody on AT posted about it not too long ago, though you would probably be stuck using Linux/Unix.

I believe the algorithm is similar to the notion of hash codes used in ECC. You'll find the explanation in a computer architecture textbook, even if printed 15 years ago.

That is -- I would "expect" that the concept is the same. It probably explains why ECC modules are more expensive on the one hand, and slower -- on the other.
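For anyone who wants to see the textbook idea in action, here is a small Python sketch of a Hamming(7,4) code, the classic example from architecture texts: four data bits get three parity bits, and any single flipped bit can be located and corrected. This is only an illustration of the concept; I am not claiming it is the exact scheme used by ECC memory modules or by any file system, and the function names are my own.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data_bits, error_position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_position
```

The three extra bits per four data bits are the overhead. Real ECC memory uses wider codes with a much better ratio, but the parity-check principle is the same.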

I wasn't aware that stray cosmic rays could cause hard disk corruption: I thought the problem arose through applications of magnetic fields and EMP. I can understand why cosmic rays -- alpha particles and such -- would affect memory. If you're sure about this -- the effects of gamma-rays on hard-disk data -- I'd be interested in hearing further, even if briefly.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,646
2,030
126
Originally posted by: BlueAcolyte
Originally posted by: BonzaiDuck
Yo, there, VirtualLarry and BlueAcolyte!!

I was going to post a new thread, but this looks like a good place to seek comments if Nobb doesn't mind.

I've got to maintain five household machines on a gigabit LAN, [yada, yada, etc. etc.] . . .

Unfortunately, bad sectors are usually followed by... More bad sectors!!!

Sorry BonzaiDuck, it's a good thing you have a replacement.

Well, BlueA___ , you were right. It happened again after I'd cloned the drive to an SATA2 320GB. And only time will tell if there was file-corruption that got transferred.

Everyone is touting Acronis now, but I'm glad I had a license to Partition-Commander v.10. I had cloned to a target through USB connection, but couldn't get the latter to boot. I made the Part-Com_v.10 bootable-CD, so when I hooked up the new drive directly to the mobo SATA plugs, the Part-Com software seemed to show that it was recognized with the C: label, and I was pleased to find that it booted right up.

If we have more troubles, I'll have to explore two avenues (hardware and OS), but may likely reinstall the OS and software from scratch on the new HD.

Also -- the mobo is a barebones mATX from Gigabyte, and the BIOS lacks certain features. But it's advertised to accommodate 800, 1066 and 1333(OC) FSB processors. I worried at first that you couldn't lock the bus-speed for the hard drives, but since the E2140 CPU (800 MHz FSB) is OC'd to (266 x 8) = 2.13 GHz, the motherboard features are all running at stock speed. So I'm more confident that the old SATA-150 hard disk went bad without any help.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
To detect a failure on the bit level, you'd have to run CRC checks constantly on a drive. Or, at least often enough to ensure that you have backups of the "error-free" version of the disk.

This is one place where making frequent backups helps. Simply READING the disk while making the backup requires that it pass the drive's CRC test. If the read doesn't pass the CRC test, the drive will retry until it's able to pass the CRC test, or until it gives up and reports an error to the OS.

I don't know if hard drives have any error-correcting capability, but even a warning of a corrupted file is useful.

Frankly, there's enough drive failures and data corruption due to head crashes, power outages, failing controllers, and the other "normal" stuff, that it probably makes the more esoteric stuff pale by comparison.
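If you want to force that kind of full read without making a backup, something like the following Python sketch would do it: it reads every file end to end and collects the ones where the OS reports an error. The name `read_scan` is made up for this example, and the sketch assumes that an unrecoverable sector surfaces as an `OSError` on read. Permission problems will show up in the same list.

```python
import os


def read_scan(root, chunk_size=1 << 20):
    """Read every file under root in full; return (bytes_read, unreadable files)."""
    unreadable = []
    bytes_read = 0
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError as error:
                # The drive or the OS gave up on this file; record it and move on.
                unreadable.append((path, str(error)))
    return bytes_read, unreadable
```

One caveat: recently used files may be served from the OS file cache instead of the platters, so a scan right after a big copy may not exercise the disk as much as you would expect.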
 

Keriokie2000

Junior Member
Dec 30, 2008
17
0
0
I don't do anything special to WHS drives aside from copying them off to an external drive. I do have a .RAR file where I automatically copy my most important files. That way, WHS keeps about 6 months' worth of multiple versions, which would require a lot of random bit flipping to corrupt them all.
 

coolVariable

Diamond Member
May 18, 2001
3,724
0
76
Originally posted by: RebateMonger
To detect a failure on the bit level, you'd have to run CRC checks constantly on a drive. Or, at least often enough to ensure that you have backups of the "error-free" version of the disk.

This is one place where making frequent backups helps. Simply READING the disk while making the backup requires that it pass the drive's CRC test. If the read doesn't pass the CRC test, the drive will retry until it's able to pass the CRC test, or until it gives up and reports an error to the OS.

I don't know if hard drives have any error-correcting capability, but even a warning of a corrupted file is useful.

Frankly, there's enough drive failures and data corruption due to head crashes, power outages, failing controllers, and the other "normal" stuff, that it probably makes the more esoteric stuff pale by comparison.

Making a backup checks the CRC?

 

Rubycon

Madame President
Aug 10, 2005
17,768
485
126
Originally posted by: RebateMonger
To detect a failure on the bit level, you'd have to run CRC checks constantly on a drive. Or, at least often enough to ensure that you have backups of the "error-free" version of the disk.

This is one place where making frequent backups helps. Simply READING the disk while making the backup requires that it pass the drive's CRC test. If the read doesn't pass the CRC test, the drive will retry until it's able to pass the CRC test, or until it gives up and reports an error to the OS.

I don't know if hard drives have any error-correcting capability, but even a warning of a corrupted file is useful.

Frankly, there's enough drive failures and data corruption due to head crashes, power outages, failing controllers, and the other "normal" stuff, that it probably makes the more esoteric stuff pale by comparison.

Drive patrolling on hosts that support this feature will do this.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
Originally posted by: coolVariable
Making a backup checks the CRC?
Hard drives have their own error detection (and, to some extent, error correction) system that they use when reading and writing data.
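As a rough picture of the detection half, here is a Python sketch that uses CRC-32 from the standard `zlib` module as a stand-in for whatever code the drive firmware really uses (actual drives use their own, stronger ECC schemes, which I am not trying to reproduce). The helper names are invented for the example.

```python
import zlib


def flip_bit(data, bit_index):
    """Return a copy of data with a single bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def sector_is_intact(payload, stored_crc):
    """Recompute the CRC of the payload and compare it with the stored value."""
    return zlib.crc32(payload) == stored_crc


# Simulate writing a 512-byte sector together with its check value.
sector = bytes(range(256)) * 2
stored_crc = zlib.crc32(sector)
```

With these, `sector_is_intact(sector, stored_crc)` holds for the untouched data, while `sector_is_intact(flip_bit(sector, 1234), stored_crc)` does not, which is the point at which a drive would retry the read or report an error to the OS.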