• We should now be fully online following an overnight outage. Apologies for any inconvenience, we do not expect there to be any further issues.

why is rsync sometimes so slow?

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
I use rsync to sync changes between dev test and prod environments for code and I find that every now and then it will get hung up on a file for no reason. These are all very small files. It will sit there for like 5 minutes, then finally resume again. Is there a way to stop this from happening?

The rsync string being used is this:

Code:
rsync -rv --checksum --delete-after --exclude-from="${maindir}${envfr}/${proj}/syncexcludes_src.txt" "${maindir}${envfr}/${proj}/src/" "${maindir}${env}/${proj}/src/"
 

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
Yeah, checksum crushes performance.

If you want to see what's hanging you up, try adding "-iP --stats" to give you some additional diagnostics. 'i' will add some explanation to the verbose output of what needs to be changed/updated about every file, 'P' will give you a progress indicator, and --stats will give you some summary statistics about what was actually transferred/deleted on each end.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
Without checksum I find it just copies everything over. I guess it goes by the file time stamps, and if the clocks are not 100% to the microsecond then it assumes the file changed. Is there a way to fix that? I already have a NTP server and clients setup.

I'll try the other options though at least I'll be able to get a better idea of what's going on.

Here's sample output:

Code:
building file list ... 
5893 files to consider
>f+++++++++ runuo.exe.BAK
      696832 100%    2.18MB/s    0:00:00 (xfer#1, to-check=5885/5893)
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/Accounts/
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/Guilds/
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/Items/
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/Mobiles/
cd+++++++++ Backups/Crashed/28-2-2015-2-9-42/Regions/
>fcsT...... core/RunUO.pdb
     2264576 100%    2.33MB/s    0:00:00 (xfer#2, to-check=5850/5893)
>fcsT...... core/RunuoAoV/Item.cs
      110076 100%   62.64kB/s    0:00:01 (xfer#3, to-check=5823/5893)
>fcsT...... core/RunuoAoV/Items/Container.cs
       42156 100%   72.73kB/s    0:00:00 (xfer#4, to-check=5744/5893)
>fcsT...... scripts/Items/Misc/SA/Underworld/Experimental Gem Puzzle/ExperimentalGemPuzzleBoard.cs
       18202 100%    0.00kB/s    0:00:00 (xfer#5, to-check=2903/5893)
>fcsT...... scripts/Items/Misc/SA/Underworld/Experimental Gem Puzzle/PartialPuzzleKey.cs
        1374 100%    1.31MB/s    0:00:00 (xfer#6, to-check=2900/5893)
>fcsT...... scripts/Items/Misc/SA/Underworld/Experimental Gem Puzzle/PuzzleKeyPuzzleBoard.cs
       13366 100%   12.75MB/s    0:00:00 (xfer#7, to-check=2899/5893)

Just sat there forever after that. (not always same spot) Eventually finished with this:

Code:
Number of files: 5893
Number of files transferred: 7
Total file size: 1530335219 bytes
Total transferred file size: 3146582 bytes
Literal data: 3146582 bytes
Matched data: 0 bytes
File list size: 223990
File list generation time: 125.785 seconds
File list transfer time: 0.002 seconds
Total bytes sent: 3371276
Total bytes received: 165

sent 3371276 bytes  received 165 bytes  13299.57 bytes/sec
total size is 1530335219  speedup is 453.91
 
Last edited:

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
Without checksum I find it just copies everything over. I guess it goes by the file time stamps, and if the clocks are not 100% to the microsecond then it assumes the file changed. Is there a way to fix that? I already have a NTP server and clients setup.
The fact that the stats report is showing a huge speed-up factor, and the actual data transfer is small compared to the final directory size indicates to me that rsync definitely is not just copying everything over.

If you don't use the checksum option, rsync checks the mtime and file size. If they don't BOTH match, it reads the files, computes the diffs, and transmits only the diffs to make the receiving file equal to the sending file. What it sounds like is happening is that you've got a lot of files with size or mtime differences and the slowdown you're seeing is rsync reading and comparing the files to make sure it only sends diffs.

There is an option to mess with the mtimes. You can add the "--modify-window=N" flag, or the "--ignore-times" flags, which sort of do what they sound. The latter makes rsync compare only sizes, and the former ignores mtime mismatches if they're within N seconds of each other. This is a really useful option if one of your filesystems is FAT formatted. I'm not sure if that option would help you. You can certainly try running with the "-n" flag to do a dry-run and see if things look any different.

Edit: One more note, I find if I'm digging around with the stats/verbose options, tossing in the "-h" flag to get human readable units for everything is quite handy.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
The reason my speedup is good is because I AM using checksum. I meant if I DON'T it starts to copy stuff that did not change. Also I think accessing a file technically modifies it, because it changes the last access date and think the file system counts this as a modify. I'll play around with --modify-window maybe that is the key. Is the number I put in seconds? Tried 10 and it just went haywire wanting to copy everything.
 
Last edited:

Fallen Kell

Diamond Member
Oct 9, 1999
6,212
537
126
File access does not modify the file itself, it only modifies the file's inode (which is the meta-data about the file and updates the "access time" value). For one thing, I would fix your clock issues. Get NTP setup on each system right away.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
NTP *IS* setup. But due to network latency, speed of light between server connections, etc it's just impossible to keep the clocks at 100% sync. There's always going to be a few nano or microseconds difference. Even more true for offsite servers. When looking at two clocks with my eyes they look in sync to me but in reality probably arn't 100%.
 

mv2devnull

Golden Member
Apr 13, 2010
1,526
160
106
NTP *IS* setup. But due to network latency, speed of light between server connections, etc it's just impossible to keep the clocks at 100% sync. There's always going to be a few nano or microseconds difference. Even more true for offsite servers. When looking at two clocks with my eyes they look in sync to me but in reality probably arn't 100%.
I fail to see how that relates to the issue.

The file in source has a timestamp. The file in destination has a timestamp. The successful rsync on the previous time has set the destination timestamp to be identical to the timestamp of the source. This is not about clock. This is about a number stored in the file metadata.

Where that can fail is either you don't force the rsync to sync the time-metadata or the destination filesystem does not store the time-metadata.
 

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
The reason my speedup is good is because I AM using checksum. I meant if I DON'T it starts to copy stuff that did not change. Also I think accessing a file technically modifies it, because it changes the last access date and think the file system counts this as a modify. I'll play around with --modify-window maybe that is the key. Is the number I put in seconds? Tried 10 and it just went haywire wanting to copy everything.

My mistake, I thought you were posting shots from the flags I had suggested. How do you know it's actually copying things that didn't change? Are you watching the itemized changes w/ the progress flags enabled? Especially if you're working with local directories and are not syncing to/from remote cifs/smb shares to FAT/NTFS filesystems, this seems pretty unlikely.

Accessing a file does not necessarily change its modification time. For example, touch a file into existence, then cat it to the terminal. You'll have accessed it, but not modified it, and if you stat the file, you'll see the atime is not the same as the mtime and ctime (which are themselves identical). If you also stat'd the file before you accessed it, you'd see all three were identical.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
I know since the folders are already more or less synced. If I remove --checksum it starts copying everything. This is through NFS, if it matters. Could the file/time latency be too high that by the time the file gets copied it's a few seconds newer than the source so it thinks the file changed?

Is there a way to tell it to only copy older files, maybe that will solve the issue? NTP is setup on the NFS server too as well as the client.

Files are going to the same NFS share as well, basically there's a /dev /test and /prod folder and the script just syncs a specific sub folder from dev to test and so on.
 

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
If I remove --checksum it starts copying everything

To be very clear, how do you know it's copying everything? Are you watching the verbose output with itemize and progress flags? If you're just looking at hte verbose output without those two options, it won't be clear if rsync is actually copying everything, or if it is checking to see where/if the files differ due to timestamp/size mismatches.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
To be very clear, how do you know it's copying everything? Are you watching the verbose output with itemize and progress flags? If you're just looking at hte verbose output without those two options, it won't be clear if rsync is actually copying everything, or if it is checking to see where/if the files differ due to timestamp/size mismatches.

I know because both locations were previously synced, and I only changed one file. I only expect that 1 file to copy over, not everything. If I start seeing the names of files that I have not modified in ages then I know it's not suppose to be copying it because it has not changed in the source location.

This is an active project I'm working on so I have a general idea of what files changed and which didn't after I do a sync.
 

Fallen Kell

Diamond Member
Oct 9, 1999
6,212
537
126
I found your problem. You are not using the "-a" option on the rsync command. I didn't really look that closely at it before as I figured you had come up with the command from one of the billion examples out there on the web on how to rsync something. Without the "-a" option, you are not preserving the file's creation and modification times, and thus, it uses the time of rsync's copy to the new location as the creation and modification time and not the one from the source making all the files in your new location have different creation/modification times. Only a checksum would determine that the files are not actually different (just like you are reporting).
 
Last edited:

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
Ohhhh I see. Good to know, I will try with -a. Should I delete everything in the destination then and recopy it from scratch or will it fix itself?
 

Fallen Kell

Diamond Member
Oct 9, 1999
6,212
537
126
Well, you can leave everything, and it will simply copy everything again which fixes it (just don't use the checksum option)...
 

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
Well, you can leave everything, and it will simply copy everything again which fixes it (just don't use the checksum option)...

The '-a' flag shouldn't copy everything, should it? The rsync algorithm would be to read the files at both ends since the mtimes will differ looking for diffs, but when it actually computes the diffs, the only diff will be in the metadata (the mtime) so the only actual writes that will happen are the metadata updates. But yeah, I do think there is no need to delete a perfectly good backup, one use of the rsync with the '-a' should fix this.

At the risk of belaboring my previous point, OP, if you had used the itemize/progress flags like I suggested, I think you would have caught this problem. The itemized output would have indicated that the long list of files you were seeing were all simply having their mtime updated, not being copied completely.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
I had those flags setup, but did not really know what any of the output meant. It was pretty cryptic, probably more something that the developers would use.

Just got around to modifying the script, so basically I just removed --checksum and replaced with -a in my original posted command. It's running now, copying everything but I'll see how it goes after that. The actual copy process seems to be way faster though. Like it's not getting hung up on random small files. At least so far.
 
Last edited:

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
Bahhh still not working. It's not as bad as before, but it's still copying lot of crap that did not change. It probably changed in the sense that it was accessed by the program some internal data on it changed like mtime, but it did not actually change. (it's just map files for a game, the game reads them but does not change them). It's copying them all anyway, and those are the biggest files to boot... It's still slightly faster than with checksum though so guess it will have to do. So far it seems to be the only files being copied though but maybe I need to wait longer to see if it will start doing it to others.

Worse case guess I can just add an exception for those files, but it's a pain because the whole point of this sync script is so I don't need to do anything special if I do want to commit new files to test or prod.
 

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
I had those flags setup, but did not really know what any of the output meant. It was pretty cryptic, probably more something that the developers would use.
I mean, it's terse and ascii, like most unix-type terminal outputs are, but it's not a developer tool like say, triple '-v' output is. Here is a good summary of the results:

https://stackoverflow.com/questions/4493525/rsync-what-means-the-f-on-rsync-logs
The "%i" escape has a cryptic output that is 11 letters long. The general format is like the string YXcstpoguax, where Y is replaced by the type of update being done, X is replaced by the file-type, and the other letters represent attributes that may be output if they are being modified.
Explanation of attributes:
  • c -> checksum differs
  • s -> size differs
  • t -> mtime differs
  • p -> permissions differ
  • o -> owners
  • g -> group
  • u -> reserved for future use
  • a -> ACLs
  • x -> extended attributes
  • + -> is new (all +++++'s indicates a new file)
  • . -> is unchanged
so:
>f+++++++++ runuo.exe.BAK
means runuo.exe.BAK is a file (f) that doesn't exist at the target (+++++++++) and is being transmitted (>)

while:
>fcsT...... core/RunuoAoV/Items/Container.cs
means that Container.cs is a file (f) that exists at the target, but it has a checksum (c), size (s), and mtime (t,T), that differ at the target from the source. The diffs are being transmitted (>) and metadata are being updated with (T) indicating that you didn't use the "-t, --times" flags (or -a, --archive which imply -t, --times) so the new mtime will be set to the transfer time. If you had used '-t' or '-a' the mtime at the target would instead be set to mtime at the source.

Edit:

So with regards to your 2nd post, did you run the script once, or twice? I'd expect it to take a long time the first time since the '-a' flag will have to touch almost every file to get the mtimes and since the mtimes differe but the sizes are the same it will probably have to read every file also to look for diffs. I'd expect the next time you run the script to be very fast if not much has changed. If you're making small changes to huge files though, it will still probably take a while to compute diffs to transmit.

I think you're incorrect about your map files. Simply reading a file will not change the file's mtime, even opening a file for read/write will not change the mtime if no data is written to the file, or if the file isn't truncated in a way that shrinks it. Reading any non-zero amount of data from a file, or executing a file, will change the file's atime, but rsync doesn't care one bit about atimes.
 
Last edited:

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
Thought I had it, but still not working right. I used it about 20 times after modifying only a few files, no issue. Go to use it again, it randomly decides to copy EVERYTHING. WTF? Why is it doing this? Is there a way to set it so the time is not as sensitive? Like maybe ignore anything that's within 10 seconds? It's the only thing I can think of, the times might not be DEAD on. I don't think computer clock crystals are temperature compensated so a slight change in temp could potentially slightly alter the time or something.

This is the command I'm using:

Code:
rsync -rv -h -a --delete-after --exclude-from="${maindir}${envfr}/${proj}/syncexcludes_src.txt" "${maindir}${envfr}/${proj}/src/" "${maindir}${env}/${proj}/src/"
 
Last edited:

Essence_of_War

Platinum Member
Feb 21, 2013
2,650
4
81
It's the only thing I can think of, the times might not be DEAD on

You don't have to guess, rsync will tell you precisely why it thinks files need to be copied if you look at the itemized output.

Also -a implies -r, so you can drop the -r
Try this:

Code:
rsync -avhin --delete-after --exclude-from="${maindir}${envfr}/${proj}/syncexcludes_src.txt" "${maindir}${envfr}/${proj}/src/" "${maindir}${env}/${proj}/src/"

The -n will just generate the file list and -i will tell you why it thinks the files need to be transferred/deleted.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
That *seems* to have fixed it... at least for now. I copied the list above in a comment in the file so I can refer to the flags next time it stops working.
 

Red Squirrel

No Lifer
May 24, 2003
70,606
13,811
126
www.anyf.ca
Did it again. The reasoning it's giving is these two:

t -> mtime differs
p -> permissions differ

Not sure how permissions are changing when I did not change anything on the destination.