My weeklong task: Data recovery and archival

Evadman

Administrator Emeritus | Elite Member
Feb 18, 2001
30,990
5
81
So I decided that this week I would help my dad with putting in a new pool during the day, and archive all my data when it was dark.

Last weekend I started copying all the data I had ever written to disk (that I still had, or could read anyway) from old backup media (DVDs, CDs, floppies, heck, even 1.2 MB floppies) from school, home, work, etc. to get it all onto my file server for archival and such. I had to use some data recovery tools on some of my old CDs from college ('98/'99). Archival grade my ass. I had to toss some because I couldn't get the data off 'em with the tools I had.

Anyway, I got the last CD copied onto my server at about noon and started running a dupe check on it so I could kill the duplicate files (I back stuff up more than once. Probably more like 10 times), then I'll organize them later.

Well, the dupe checker (Dupe File Finder, a shareware program btw, and it rocks) has been running since noon on my file server and is only 77% complete with its CRC checks. Ick. Figuring out which is an actual dupe and which isn't is going to suck.
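For anyone curious how a CRC-based dupe check can work, here is a minimal Python sketch. It is not how the shareware tool does it (the post doesn't say); `find_duplicates` and `file_crc32` are hypothetical names made up for illustration. The idea: only files with the same size can be dupes, only same-size files get checksummed, and a CRC match is treated as a hint that gets confirmed with a byte-for-byte compare.

```python
import filecmp
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=1 << 20):
    """CRC32 of a file, read in chunks so big files don't fill memory."""
    crc = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    # Pass 1: a file with a unique size cannot have a duplicate.
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: checksum only the files that share a size.
    by_crc = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) > 1:
            for path in paths:
                by_crc[(size, file_crc32(path))].append(path)

    # Pass 3: a CRC match is only a hint, so confirm byte for byte.
    groups = []
    for paths in by_crc.values():
        remaining = sorted(paths)
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if same:
                groups.append([first] + same)
            remaining = [p for p in rest if p not in same]
    return groups
```

The size pass is what keeps this tolerable on millions of files, since most files never get read at all. The final compare is there because two different files can share a CRC32.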

If you wanted to know: 12.3 million files. Yeah. That's a lot.
 

amdskip

Lifer
Jan 6, 2001
22,530
13
81
So is a raid array the best method for backing up data? I do have some old cds that should be backed up.
 

Evadman

Administrator Emeritus | Elite Member
Feb 18, 2001
30,990
5
81
Originally posted by: amdskip
So is a raid array the best method for backing up data? I do have some old cds that should be backed up.

For me it is. This project uses a 1.75 TB array built from eight 250 GB SATA I and II drives, which now costs about $900. (8 * $80 for drives + $250 for the controller + old spare parts for everything else = ~$900) Probably the absolute best would be tape, but we have problems with tape that is more than 10 years old at work, so I don't trust it.

Plus, tape is expensive: ~$400 for a tape drive that gets 100 GB of compressed space per tape (that means about 40 GB native). The media for 1.75 TB of compressed space would be another $700 ($40-45 a tape). But hell, instead of using tape (which is useless for everything but archival really) you could just go with a 4-way RAID 5 or 4-way RAID 10, end up with 750 or 500 GB of space (roughly the same as the tapes in uncompressed size) and compress the drive. Then you could cut the cost roughly in half to about $450 or so.
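The capacity and tape-count numbers above can be checked with a quick Python sketch. `usable_gb` is a hypothetical helper, not from any tool, and it assumes the big array is RAID 5 (the post doesn't name the level, but seven drives' worth of space out of eight matches the 1.75 TB figure).

```python
import math


def usable_gb(drives, drive_gb, level):
    """Usable capacity in GB for RAID 5 or RAID 10 built from equal drives."""
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb  # one drive's worth goes to parity
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * drive_gb  # every drive is mirrored
    raise ValueError("only RAID 5 and RAID 10 are handled in this sketch")


big_array = usable_gb(8, 250, 5)     # 1750 GB, the 1.75 TB array
small_raid5 = usable_gb(4, 250, 5)   # 750 GB
small_raid10 = usable_gb(4, 250, 10) # 500 GB

# Tapes needed to hold 1.75 TB at 100 GB (compressed) per tape.
tapes = math.ceil(big_array / 100)   # 18 tapes
tape_cost_low = tapes * 40           # $720 at $40 a tape
tape_cost_high = tapes * 45          # $810 at $45 a tape
```

So the "another $700" for media is on the low side of the $40-45 range; 18 tapes at 40 GB native also comes to 720 GB, which is where the "roughly the same as the tapes in uncompressed size" comparison comes from.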

I thought CDs were the answer because of all that '100 year archival' stuff that was always spouted, but out of the roughly 300 CDs I copied over, I needed to use some sort of recovery software on about 40. Of those, I was able to get most of the data off all but 6.

The bad thing is that I have a single point of failure. If the controller card fails and starts writing junk to the drives, I have no backups. Any corrupted data will be gone forever.
 

neutralizer

Lifer
Oct 4, 2001
11,552
1
0
I've given up backing things up to DVD and CD. I'm going for internal HDDs and swapping them between external enclosures.
 

GeekDrew

Diamond Member
Jun 7, 2000
9,099
19
81
I switched to duplicate external hard-drive backup about two years ago. That's the only solution I've come up with that will let me keep a reasonably current backup of the majority of my data.
 

WhoBeDaPlaya

Diamond Member
Sep 15, 2000
7,415
404
126
Originally posted by: GeekDrew
I switched to duplicate external hard-drive backup about two years ago. That's the only solution I've come up with that will let me keep a reasonably current backup of the majority of my data.
QFT. Been doing this for the last 7 years. USB 1.1 was a PITA though :Q
 

GeekDrew

Diamond Member
Jun 7, 2000
9,099
19
81
Originally posted by: WhoBeDaPlaya
Originally posted by: GeekDrew
I switched to duplicate external hard-drive backup about two years ago. That's the only solution I've come up with that will let me keep a reasonably current backup of the majority of my data.
QFT. Been doing this for the last 7 years. USB 1.1 was a PITA though :Q

USB1.1 is the main reason I didn't implement it sooner than I did. ;)
 

drinkmorejava

Diamond Member
Jun 24, 2004
3,567
7
81
I need to get around to putting my two 80 GB drives in a RAID 1. It'll be the only decent backup solution that I have other than occasional dumps to some crappy DVDs.
 

Evadman

Administrator Emeritus | Elite Member
Feb 18, 2001
30,990
5
81
118.7 GB of duplicate data found and removed if anyone cares :p
 

Fullmetal Chocobo

Moderator | Distributed Computing
Moderator
May 13, 2003
13,704
7
81
After my RAID 5 array went down because two drives hiccuped, I went to my backup DVDz to start recovering the data. Out of ~40 DVDz, I pulled about 13 gigs of data off of them. I will never do data backup to optical media again.
 

aidanjm

Lifer
Aug 9, 2004
12,411
2
0
Originally posted by: amdskip
So is a raid array the best method for backing up data? I do have some old cds that should be backed up.

RAID is not designed for backup. It's just a way of reducing server downtime due to disk failure. I don't think it should be relied on for backup over the long term. Frankly, I don't know what the solution is for desktop users, considering that optical media don't seem to be all that reliable. I back up to a separate hard disk in an external enclosure.
 

Goosemaster

Lifer
Apr 10, 2001
48,775
3
81
Originally posted by: aidanjm
Originally posted by: amdskip
So is a raid array the best method for backing up data? I do have some old cds that should be backed up.

RAID is not designed for backup. It's just a way of reducing server downtime due to disk failure. I don't think it should be used for backup over the long term. Frankly, I don't know what the solution is for desktop users, considering that optical media don't seem to be all that reliable. I back up to a separate hard disk in an external enclosure.

agreed.
 

aidanjm

Lifer
Aug 9, 2004
12,411
2
0
Originally posted by: Evadman
The bad thing is that I have a single point of failure. If the controller card fails and starts writing junk to the drives, I have no backups. Any corrupted data will be gone forever.

What is all of this data? Is it mostly media files (MP3s, movies)?
 

Goosemaster

Lifer
Apr 10, 2001
48,775
3
81
Originally posted by: GeekDrew
I switched to duplicate external hard-drive backup about two years ago. That's the only solution I've come up with that will let me keep a reasonably current backup of the majority of my data.

I use the same method. Really, truly, it is the best solution out there.


Until HD DVD or Blu-ray drives.. :D
 

Eli

Super Moderator | Elite Member
Oct 9, 1999
50,419
8
81
This makes me really want to back some sh!t up, but I don't have that kinda money to blow on backup equipment. :(

I'd cry if I lost all my pictures and MP3s.
 

Zim Hosein

Super Moderator | Elite Member
Super Moderator
Nov 27, 1999
65,420
408
126
Originally posted by: Evadman
118.7 GB of duplicate data found and removed if anyone cares :p

How much of it was p0rn Evadman? :p
 

eelw

Lifer
Dec 4, 1999
10,351
5,499
136
Any of my important documents are stored on multiple different media formats. Don't rely on a single backup. And even more important, don't store all of those backups in the same location.
 

dartworth

Lifer
Jul 29, 2001
15,200
10
81
Originally posted by: aidanjm
Originally posted by: amdskip
So is a raid array the best method for backing up data? I do have some old cds that should be backed up.

RAID is not designed for backup. It's just a way of reducing server downtime due to disk failure. I don't think it should be relied on for backup over the long term. Frankly, I don't know what the solution is for desktop users, considering that optical media don't seem to be all that reliable. I back up to a separate hard disk in an external enclosure.

:thumbsup:
 

mchammer

Diamond Member
Dec 7, 2000
3,152
0
76
I would burn at least two copies of each disc for backup purposes in case one went bad. There are tools that will make a manifest file with the MD5 of each file on the disc. Then take the MD5 of that manifest file and write it on the disc with a Sharpie.

Next, every year or so, check all of the files against the MD5s and fix if necessary (re-burn the bad disc from the good copy). This way both copies would have to fail for data to be lost. For the most important data, make even more duplicate discs, and store them in various locations, with encryption if necessary.
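A rough Python sketch of that manifest-and-recheck routine, in case it helps. The function names and the manifest layout (`<md5>`, two spaces, relative path) are just assumptions for illustration, not any particular tool's format. Build the manifest before burning, keep it somewhere you can read later, and write the value `write_manifest` returns on the disc.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Record the MD5 of every file under root; return the manifest's own MD5."""
    entries = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # Skip an old manifest if it happens to live inside root.
            if os.path.abspath(path) == os.path.abspath(manifest_path):
                continue
            entries.append((os.path.relpath(path, root), file_md5(path)))
    with open(manifest_path, "w", encoding="utf-8") as fh:
        for rel, md5 in sorted(entries):
            fh.write(f"{md5}  {rel}\n")
    return file_md5(manifest_path)  # this is the value for the Sharpie


def verify_manifest(root, manifest_path):
    """Return relative paths that are missing or no longer match their MD5."""
    bad = []
    with open(manifest_path, encoding="utf-8") as fh:
        for line in fh:
            expected, rel = line.rstrip("\n").split("  ", 1)
            path = os.path.join(root, rel)
            if not os.path.isfile(path) or file_md5(path) != expected:
                bad.append(rel)
    return bad
```

For the yearly check, point `verify_manifest` at the mounted disc; anything it returns is what needs to be pulled from the second copy. MD5 is fine for catching bit rot like this, though it should not be relied on against deliberate tampering.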

Also I have heard good things about the DVD-RAM format for backup purposes, because when it senses errors on the disk, it will move the data to a good area on the disk, the same way hard drives do.