backups: automatic backups for certain modified files

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
Well, I've decided I need to learn how to use rsync to make regular backups to my server, for a number of things.

For the most part a nightly sync will suffice, but I'd like to have all the contents/subdirectories of one or two directories backed up more often.

Can I have rsync (or something else) set up to sync things whenever the content changes, or should I just set it up to run every 15 minutes or so for the stuff I want done more often than nightly?

I haven't ever used rsync, and I'm not looking for an intro to it; I'm just wondering what the best way to handle what I want is before I get started :)

thanks
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
AFAIK rsync won't monitor stuff for you but you could probably whip something up with a tool like inotifywatch if you really need real-time updates.
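Something along these lines, for example; just a rough sketch using inotifywait from the inotify-tools package, and the paths and hostname here are made up:

#!/bin/sh
# re-run rsync whenever anything under the watched directory changes
WATCH_DIR=/home/me/important
DEST=backupbox:/backups/important/

while inotifywait -r -e modify,create,delete,move "$WATCH_DIR"; do
    rsync -az --delete "$WATCH_DIR"/ "$DEST"
done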
 

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
Originally posted by: Nothinman
AFAIK rsync won't monitor stuff for you but you could probably whip something up with a tool like inotifywatch if you really need real-time updates.

I don't *really* need them; I'll probably be happy if I can get rsync set up and just make a few things update quite often.

I have a bad memory, or I get in a hurry sometimes and forget to sync things, and then it bites me in the ass :)
 

Red Squirrel

No Lifer
May 24, 2003
70,166
13,573
126
www.anyf.ca
The easiest way would be to have an rsync job that runs every 5 minutes or something. Keep in mind, though, that it has to check whether stuff changed, so that may introduce some overhead. It really depends on how many files will be in that folder.
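Something like this in the crontab would do it (just a sketch, with made-up paths and hostname):

# sync the important directory every 5 minutes
*/5 * * * * rsync -az --delete /home/me/important/ backupbox:/backups/important/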
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Originally posted by: xSauronx
Originally posted by: Nothinman
AFAIK rsync won't monitor stuff for you but you could probably whip something up with a tool like inotifywatch if you really need real-time updates.

I don't *really* need them; I'll probably be happy if I can get rsync set up and just make a few things update quite often.

I have a bad memory, or I get in a hurry sometimes and forget to sync things, and then it bites me in the ass :)

You could just set up a cron job to run it every 5 or 10 minutes. Or change the nice level (google "man nice") and run it in a script with a constant loop.
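e.g. something roughly like this, started at boot or from a screen session (a sketch with made-up paths and hostname):

#!/bin/sh
# loop forever, syncing every 10 minutes at the lowest priority
while true; do
    nice -n 19 rsync -az --delete /home/me/important/ backupbox:/backups/important/
    sleep 600
done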
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
As an exercise in learning about these tools, I've been using git with a custom Makefile to make backups easy for myself.

I have some information I want to keep replicated across a bunch of machines. I may use them at random, but I want to keep a more-or-less consistent view of this data across all of them. They are mobile devices, so having a central server for backups or a file server isn't going to work.

Advantages of git..
* fast
* keeps full revision history in a very compact format. It uses a logging system and keeps branch data, so you can easily roll back changes. It stores deltas between changes, which can be compressed, to save space. This is in contrast with traditional multi-version backups that end up using massive amounts of storage to do the same thing.
* syncing data between machines is faster than rsync
* uses running checksums to help detect data corruption. It uses real cryptographically secure hashes (SHA-1) rather than CRC-style checks. If the top-level checksum checks out, then you know your data is free from corruption.
* fully distributed: each copy is a complete copy of the repository and can perform all operations on any data contained in it in total isolation
* can be configured in a number of ways, but I use it over ssh; no 'git server' is needed. All data is pulled by copying it over ssh. Since I use ssh keypairs, I can replicate and sync data securely with no passwords (there's a rough sketch of the ssh setup after this list).
* on text files (it's designed for source code, after all) it keeps track of _content_, not files and directories. This means it will track lines of data being moved from one file to another with full revision history, AND moving files around, renaming them, combining, splitting, and adding and deleting directories is handled gracefully without user interaction. This is in huge contrast to other revision control systems that track things on a file-centric basis, where moving lots of files around will cause havoc.
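The ssh setup mentioned above is basically just a remote pointing at a path on the other machine, roughly like this (the hostnames and paths here are made up):

# one-time setup: add the other machine as a remote over ssh
git remote add laptop ssh://me@laptop.example/home/me/stuff.git

# after that, syncing is just pull/push over ssh, with keypair auth so no passwords
git pull laptop master
git push laptop master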

downsides:
* designed for code, bad performance with large binary files (very big downside)
* needs maintenance and occasional cleaning to reduce repository size or get rid of unnecessary data
* not particularly user friendly.
* does not preserve metadata.

If you're working with some data formats, like OpenDocument or whatever, those are XML files stored with zip compression. You could probably use those with git with an extra decompression step.

So I keep the repository synced across 3 machines.
Then I use the Makefile to keep track of scripts.
If I just type:
make
it echoes out a list of commands it can run.
$ make commit
will add any new files and then commit them to the repository. Then a text editor opens up so I can add notes and uncomment file change notations.

$ make backup
will create a tarball of my data, compress it, encrypt it, then send it out to my PC at home, my file server, and an online server. That way, no matter which copy I end up salvaging after the initial nuclear attack, I'll still have full data preservation with checksums and full revision history.
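The Makefile itself is nothing fancy, just shorthand for a handful of commands, roughly along these lines (a trimmed-down sketch; the filenames, hosts, and key are made up):

default:
	@echo "targets: commit backup"

commit:
	# stage everything and commit; git opens the editor for notes
	git add .
	git commit -a

backup:
	# tarball the repository, encrypt it, and push copies out
	tar -czf backup.tar.gz .git
	gpg -e -r me@example.com backup.tar.gz
	scp backup.tar.gz.gpg homepc:backups/
	scp backup.tar.gz.gpg fileserver:backups/
	scp backup.tar.gz.gpg remotehost:backups/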

Then it does some other stuff for me with other commands. I started using a similar system at work, and that one backs up to three servers and then a compact flash drive. No data is going off-site, of course.

Except for the issue with storing binary files, it works out pretty well. There are some projects trying to make a general-purpose application that turns git into something more useful for backups. Two that I know of are GHH and Gibak. http://eigenclass.org/hiki/gib...up-system-introduction