
best way to keep two remote directories in sync

Red Squirrel

No Lifer
What is the best way to keep two remote directories in sync over a network?

Right now I just have an rsync script, but I was wondering if there is a way that is more real-time and less resource-consuming. I *could* run rsync in a loop, but the "building file list" phase would consume a lot of resources for nothing.
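One way to get closer to real-time without a polling loop is to let inotify trigger the sync, so the "building file list" pass only ever runs after something actually changes. A rough sketch, assuming `inotify-tools` is installed and passwordless SSH to the peer; the paths and host name are placeholders:

```shell
#!/bin/sh
# Event-driven one-way sync sketch: rsync fires only when inotify
# reports a change, instead of polling on a timer.
watch_and_sync() {
    src=$1    # local directory, e.g. /srv/data/
    dest=$2   # remote target, e.g. otherbox:/srv/data/
    inotifywait -m -r -e close_write,create,delete,move "$src" |
    while read -r _event; do
        # Something under $src changed; push the tree across.
        rsync -az --delete "$src" "$dest"
    done
}

# Usage (runs until killed):
# watch_and_sync /srv/data/ otherbox:/srv/data/
```

Note this is still asynchronous one-way replication, not the two-way sync Unison does or the block-level mirroring drbd does.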
 
Originally posted by: n0cmonkey
Shared storage.

That works for most common cases, but I actually want them synced so if one server goes down the data is still on the other. Kind of like Microsoft's Distributed File System.
 
There is also Unison, but I think you may have to initiate the syncs on it, too. I'm not sure.

The only thing I can think of for certain that would be running full time and syncing files as they are changed would be drbd.

drbd operates at the block level and is synchronous replication (as opposed to rsync or unison, which are asynchronous replication). This means that all hard drive activity will have to wait on the remote drive as things are being written. So when you are writing files to a drbd partition, it will be just as slow as if you were writing those files directly to a network share.
 
Also come to think of it, with drbd, if you want more than one computer to access the synced files, you would need to format it with a shareable file system such as gfs or ocfs. Otherwise, you wouldn't even be able to have the remote block device mounted on the remote computer at the same time that it is mounted and being synced from the local computer. However, if the local computer dies, then you could still go ahead and mount the remote block device and access it on the remote computer.
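For a feel of what setting that up involves, a minimal drbd resource definition looks roughly like this (host names, IPs, and devices here are made-up placeholders, and the exact syntax varies by drbd version):

```
# Sketch of an /etc/drbd.conf resource (hypothetical hosts "alpha"/"beta")
resource r0 {
  protocol C;               # fully synchronous: a write completes only
                            # after it reaches the peer's disk
  on alpha {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.1.10:7788;
    meta-disk internal;
  }
  on beta {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.1.11:7788;
    meta-disk internal;
  }
}
```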
 
Originally posted by: Brazen
There is also Unison, but I think you may have to initiate the syncs on it, too. I'm not sure.

The only thing I can think of for certain that would be running full time and syncing files as they are changed would be drbd.

drbd operates at the block level and is synchronous replication (as opposed to rsync or unison, which are asynchronous replication). This means that all hard drive activity will have to wait on the remote drive as things are being written. So when you are writing files to a drbd partition, it will be just as slow as if you were writing those files directly to a network share.

Hmm, drbd sounds interesting. So it's sorta like RAID 1 over a network, I guess?

Wonder what the performance would be with two machines connected with a crossover cable and gigabit.

I have no intention of doing this any time soon, but I was just curious what the options were in Linux, as I think it would be cool to build a Linux-based SAN with multiple boxes and add full-blown redundancy.

Now say a box dies and is then brought back; would drbd actually "rebuild" as it notices changes?
 
I've battled with these issues for a while now and I've settled on sticking with rsync to replicate directories to geographically different clusters. For example, for hosting simple HTML/PHP sites that can be handled entirely with Apache, it's easy to set up rsync to keep the web directories in sync with a cron job that runs, say, every 5 minutes. If you use some sort of geoDNS solution you won't have clients bouncing between different versions of the site during the sync lag, but worst case you just use round-robin DNS, and depending on the clients that might even be okay.

That's how I handle web logs as well: an rsync script that runs every hour, dumping the logs to the backup server.
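The cron side of that is just a couple of crontab entries, something along these lines (paths and host names are placeholders):

```
# m  h  dom mon dow  command
# every 5 minutes, mirror the web root to the peer
*/5  *  *   *   *    rsync -az --delete /var/www/ peer:/var/www/
# hourly, push the apache logs to the backup server
0    *  *   *   *    rsync -az /var/log/apache2/ backup:/var/log/apache2/
```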

On the other hand, when you are dealing with servers local to each other where you have control over the network, a solution using drbd is a good choice. With the proper precautions and planning it's a really solid setup that can be used in a completely fault-tolerant fashion from the network access down to the disk I/O. I've been looking at this for some time now and am going to start playing around with it in the near future. The question is how much space I want/need... I'm thinking 6-10TB at first would be nice :Q
 
Originally posted by: Crusty
I've battled with these issues for a while now and I've settled on sticking with rsync to replicate directories to geographically different clusters. For example, for hosting simple HTML/PHP sites that can be handled entirely with Apache, it's easy to set up rsync to keep the web directories in sync with a cron job that runs, say, every 5 minutes. If you use some sort of geoDNS solution you won't have clients bouncing between different versions of the site during the sync lag, but worst case you just use round-robin DNS, and depending on the clients that might even be okay.

That's how I handle web logs as well: an rsync script that runs every hour, dumping the logs to the backup server.

On the other hand, when you are dealing with servers local to each other where you have control over the network, a solution using drbd is a good choice. With the proper precautions and planning it's a really solid setup that can be used in a completely fault-tolerant fashion from the network access down to the disk I/O. I've been looking at this for some time now and am going to start playing around with it in the near future. The question is how much space I want/need... I'm thinking 6-10TB at first would be nice :Q

We just bought a 6TB SAN (not counting what we'll lose to RAID) and it cost 30k PER ENCLOSURE. There are two, and one is fiber, so it would not surprise me if the fiber one is actually 100k.

I know for a fact that with the right setup I could build a bigger SAN for cheaper. I so want to take this on later on, when I have too much money to know what to do with. 😛 Could probably build something redundant for like 300 bucks per TB or so.
 
Originally posted by: RedSquirrel
We just bought a 6TB SAN (not counting what we'll lose to RAID) and it cost 30k PER ENCLOSURE. There are two, and one is fiber, so it would not surprise me if the fiber one is actually 100k.

I know for a fact that with the right setup I could build a bigger SAN for cheaper. I so want to take this on later on, when I have too much money to know what to do with. 😛 Could probably build something redundant for like 300 bucks per TB or so.

This made me giggle.
 
Originally posted by: n0cmonkey
Originally posted by: RedSquirrel
We just bought a 6TB SAN (not counting what we'll lose to RAID) and it cost 30k PER ENCLOSURE. There are two, and one is fiber, so it would not surprise me if the fiber one is actually 100k.

I know for a fact that with the right setup I could build a bigger SAN for cheaper. I so want to take this on later on, when I have too much money to know what to do with. 😛 Could probably build something redundant for like 300 bucks per TB or so.

This made me giggle.

FWIW, I priced out the setup and it came close to $30k for 2x 8TB iSCSI machines with failover switches hosting the SAN.

You can do it cheaper than premade, but nowhere near the price of $300/TB if you want the same reliability.

More than half of your budget would be dedicated to just HDDs....
 
Yeah, if you go SCSI, but SATA drives are still decent performance, and dirt cheap. I can get a 1TB drive for just a bit over 100 bucks. I'd wait till the 2.0s come down in price and get those.

Now if you go hardware RAID, that's where it would get more expensive. I'd go software.
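For what it's worth, the software-RAID side is only a couple of commands with mdadm. A hypothetical sketch, wrapped in a function since `mdadm --create` destroys whatever is on the listed devices; the device names are placeholders:

```shell
#!/bin/sh
# Hypothetical sketch: build a 6-drive software RAID 6 array with mdadm.
# DESTRUCTIVE to the listed devices -- device names are placeholders.
make_raid6() {
    mdadm --create /dev/md0 --level=6 --raid-devices=6 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
    mkfs.ext3 /dev/md0                          # put a filesystem on it
    mdadm --detail --scan >> /etc/mdadm.conf    # persist across reboots
}

# Usage: triple-check the device list first, then run: make_raid6
```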
 
Originally posted by: RedSquirrel
Yeah, if you go SCSI, but SATA drives are still decent performance, and dirt cheap. I can get a 1TB drive for just a bit over 100 bucks. I'd wait till the 2.0s come down in price and get those.

Now if you go hardware RAID, that's where it would get more expensive. I'd go software.

The goal is 100% uptime right?

If you're doing this you don't want to use consumer drives; go for something like the WD RE2 line. Just under $200 for the 1TB models. Besides, 6TB of SAS storage is a LOT of money. That's 40 300GB drives in RAID 10, but with that number of drives you'll want to be running RAID 6 anyways. Besides, there's no need for that many spindles unless you're running a powerful database system off of them or massive amounts of VEs.

Even if you DO stick with consumer drives, you still have to deal with network reliability and design. A 1Gbps link won't be enough to serve more than a couple of clients to the SAN, so you're going to want to use link aggregation to get more throughput. Then you'll also want a secondary network to fail over to in case of an outage on the primary SAN network.
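Link aggregation on Linux is the bonding driver; a Debian-style /etc/network/interfaces sketch (interface names and addresses are placeholders, and 802.3ad needs a cooperating switch):

```
auto bond0
iface bond0 inet static
    address 10.0.0.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode 802.3ad      # LACP aggregation across both NICs
    bond-miimon 100        # link-check interval in ms
```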

Don't forget you'll need 2x the host nodes for the SAN so you can continue to use it while you rebuild one of the broken arrays (that's why you use drbd to sync the two host nodes) or when you suffer a network outage.

Finally don't forget your support contracts to replace broken hardware.

Sure, you can throw some half-assed system together, but it's rather short-sighted to think it's going to be anywhere near as reliable as a properly designed system.

 
Originally posted by: Crusty
Sure, you can throw some half-assed system together, but it's rather short-sighted to think it's going to be anywhere near as reliable as a properly designed system.

And that's why I giggle. 🙂
 
Originally posted by: Crusty
Finally don't forget your support contracts to replace broken hardware.

Sure, you can throw some half-assed system together, but it's rather short-sighted to think it's going to be anywhere near as reliable as a properly designed system.


That's actually where you get gouged on commercial-grade SANs. They use all proprietary parts that you can't just go and order off TigerDirect, so when the contract expires they'll charge you a fortune.

We have a bad ESM card on our IBM SAN at work and it's 26k to replace! With consumer-grade stuff, the most expensive part to replace on failure would be maybe a motherboard, or a RAID controller if I go hardware RAID. Consumer parts are easier and faster to get. For example, when I blew the PSU in my home server I was able to just run to the store, get one, and be back up and running within an hour. Had it been a Dell or IBM that was out of warranty, it would be tough luck. It's just too bad they don't make universal redundant PSU setups. But if two boxes are mirrored then you can have a whole box go down and the SAN stays up. drbd sounds like it would get this accomplished.
 
Originally posted by: RedSquirrel
That's actually where you get gouged on commercial-grade SANs. They use all proprietary parts that you can't just go and order off TigerDirect, so when the contract expires they'll charge you a fortune.

We have a bad ESM card on our IBM SAN at work and it's 26k to replace! With consumer-grade stuff, the most expensive part to replace on failure would be maybe a motherboard, or a RAID controller if I go hardware RAID. Consumer parts are easier and faster to get. For example, when I blew the PSU in my home server I was able to just run to the store, get one, and be back up and running within an hour. Had it been a Dell or IBM that was out of warranty, it would be tough luck. It's just too bad they don't make universal redundant PSU setups. But if two boxes are mirrored then you can have a whole box go down and the SAN stays up. drbd sounds like it would get this accomplished.

They usually don't use the cheap crap people buy at TigerDirect either. These are embedded systems, and developing your own that can live up to the expectations of someone buying a SAN, for cheaper than it costs to buy a SAN, is probably pretty difficult. Don't forget to include the time it takes to design, build, and maintain the system in the final price.
 
Originally posted by: n0cmonkey
Originally posted by: RedSquirrel
That's actually where you get gouged on commercial-grade SANs. They use all proprietary parts that you can't just go and order off TigerDirect, so when the contract expires they'll charge you a fortune.

We have a bad ESM card on our IBM SAN at work and it's 26k to replace! With consumer-grade stuff, the most expensive part to replace on failure would be maybe a motherboard, or a RAID controller if I go hardware RAID. Consumer parts are easier and faster to get. For example, when I blew the PSU in my home server I was able to just run to the store, get one, and be back up and running within an hour. Had it been a Dell or IBM that was out of warranty, it would be tough luck. It's just too bad they don't make universal redundant PSU setups. But if two boxes are mirrored then you can have a whole box go down and the SAN stays up. drbd sounds like it would get this accomplished.

They usually don't use the cheap crap people buy at TigerDirect either. These are embedded systems, and developing your own that can live up to the expectations of someone buying a SAN, for cheaper than it costs to buy a SAN, is probably pretty difficult. Don't forget to include the time it takes to design, build, and maintain the system in the final price.

Yeah, I probably would not build such a system for another company, but for home/small business use it would be more than good enough, IMO.
 
Originally posted by: RedSquirrel
Originally posted by: n0cmonkey
Originally posted by: RedSquirrel
That's actually where you get gouged on commercial-grade SANs. They use all proprietary parts that you can't just go and order off TigerDirect, so when the contract expires they'll charge you a fortune.

We have a bad ESM card on our IBM SAN at work and it's 26k to replace! With consumer-grade stuff, the most expensive part to replace on failure would be maybe a motherboard, or a RAID controller if I go hardware RAID. Consumer parts are easier and faster to get. For example, when I blew the PSU in my home server I was able to just run to the store, get one, and be back up and running within an hour. Had it been a Dell or IBM that was out of warranty, it would be tough luck. It's just too bad they don't make universal redundant PSU setups. But if two boxes are mirrored then you can have a whole box go down and the SAN stays up. drbd sounds like it would get this accomplished.

They usually don't use the cheap crap people buy at TigerDirect either. These are embedded systems, and developing your own that can live up to the expectations of someone buying a SAN, for cheaper than it costs to buy a SAN, is probably pretty difficult. Don't forget to include the time it takes to design, build, and maintain the system in the final price.

Yeah, I probably would not build such a system for another company, but for home/small business use it would be more than good enough, IMO.

For home use you can do whatever you please, but with the scale of what you are talking about I assume you mean enterprise-level stuff.

Knowing you're covered by a support contract if something breaks is a good feeling, and worth what it costs. The last thing you want is a catastrophic failure that gets blamed on you for poor design or for using cheap commodity parts.

 