File Synchronization with Rsync over SSH

Posted by admin on April 6, 2009 under Tech Tips

To quickly synchronize files between two systems, rsync is an excellent tool. Its delta-transfer algorithm cuts transfer time by sending only the parts of files that differ between source and destination, and it can run transparently over SSH. The beauty of running rsync over SSH is that no rsync daemon needs to be running on the remote host beforehand, and the connection is both authenticated and encrypted. All that is required is that the remote host is running an SSH server such as OpenSSH and, of course, has rsync installed.

I use rsync the most for synchronizing my “Music” and “Documents” folders between a number of my systems at home and at work. All of these systems have these folders in the root of my home directory.

ls -ld ~/Music ~/Documents
drwxr-xr-x 16 gmendoza gmendoza 4096 2009-04-06 23:23 /home/gmendoza/Documents
drwxr-xr-x  9 gmendoza gmendoza 4096 2009-04-06 23:23 /home/gmendoza/Music

To push my recent changes from my local system (host1) to my remote system (host2), I use the following commands:

rsync -avPe ssh ~/Music host2:~/
rsync -avPe ssh ~/Documents host2:~/

Notice that “Music” and “Documents” are specified without a trailing “/” (that is, not as “Music/” or “Documents/”). This is important: with a trailing slash, rsync would copy only the contents of the folder into the remote home directory, not the folder itself. The rsync man page describes this in more detail.
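To see the trailing-slash behavior without touching a real host, here is a minimal sketch that runs rsync between two local scratch directories. The directory names (no_slash, with_slash) and the file track.txt are made up for illustration, and it assumes rsync and mktemp are available locally.

```shell
#!/bin/sh
# Demo of trailing-slash semantics using local scratch directories
# (hypothetical paths; no SSH or remote host is involved).
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed; skipping demo"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/Music" "$work/no_slash" "$work/with_slash"
echo "demo" > "$work/src/Music/track.txt"

# No trailing slash: the Music directory itself is created in the destination.
rsync -a "$work/src/Music" "$work/no_slash/"

# Trailing slash: only the contents of Music land in the destination.
rsync -a "$work/src/Music/" "$work/with_slash/"

ls "$work/no_slash"     # lists the directory "Music"
ls "$work/with_slash"   # lists the file "track.txt"
```

The same rule applies when the destination is host2:~/ instead of a local path.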

Instead of running rsync twice, you can specify multiple sources on a single line.

rsync -avPe ssh ~/Music ~/Documents host2:~/

To pull changes made on the remote system down to my local system, just reverse the source and destination. Notice the period at the end of the line, which specifies the current working directory as the destination. Also, instead of typing the host twice, you can use the shell's brace expansion (available in bash and similar shells) to name a set of paths. In this case, curly brackets specify that the two directories on the remote host, which share the same parent directory, should be copied to my local working directory. The local shell expands the braces into two separate host2: arguments before rsync ever runs.

rsync -avPe ssh host2:~/{Music,Documents} .
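If you are unsure what the curly brackets turn into, you can ask the shell to print the expanded arguments instead of passing them to rsync. This is only a sketch: it assumes bash is installed, and it calls bash explicitly because plain POSIX sh does not perform brace expansion.

```shell
#!/bin/sh
# Show the arguments rsync would actually receive. Brace expansion is a bash
# feature, so the expansion is run through bash explicitly.
expanded=$(bash -c 'printf "%s\n" host2:~/{Music,Documents}')
printf '%s\n' "$expanded"
# host2:~/Music
# host2:~/Documents
```

The tilde is left alone locally because it does not start the word, so it is the remote side that resolves it to the home directory on host2.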

I’ll also use the “--delete” option to remove any files and folders from the destination that have been removed from the source system.

rsync --delete -avPe ssh ~/Music host2:~/
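Because “--delete” is destructive, it can be worth previewing the result first with rsync's -n (--dry-run) option. The following is a minimal local sketch rather than my actual workflow: the scratch directories and the files keep.txt and stale.txt are hypothetical stand-ins for the local and remote Music folders.

```shell
#!/bin/sh
# Sketch: preview what --delete would remove before running it for real.
# Uses local scratch directories (hypothetical paths) instead of host2.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed; skipping demo"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/Music" "$work/dst/Music"
echo "keep"  > "$work/src/Music/keep.txt"
echo "keep"  > "$work/dst/Music/keep.txt"
echo "stale" > "$work/dst/Music/stale.txt"   # exists only on the destination

# -n (--dry-run) reports the planned deletions without touching anything.
preview=$(rsync -avn --delete "$work/src/Music" "$work/dst/")
printf '%s\n' "$preview"
ls "$work/dst/Music"    # stale.txt is still there after the dry run

# The same command without -n actually removes the stale file.
rsync -a --delete "$work/src/Music" "$work/dst/"
ls "$work/dst/Music"    # only keep.txt remains
```

Over SSH the preview would look like rsync --delete -avnPe ssh ~/Music host2:~/, with the -n dropped once the list of deletions looks right.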

By default, rsync decides very quickly which files to transfer, using a “quick check” algorithm that looks for changes in file size or last-modified time (per the rsync man page). While I was updating my Music collection, I noticed that rsync was not detecting my ID3 tag modifications, presumably because the edits left both the size and the timestamp of the files unchanged. With the “-c” option, rsync instead compares files using a 128-bit checksum (MD4 in older releases, MD5 since rsync 3.0.0) as a more definitive change detection method. While this slows the process down significantly, because each candidate file has to be read in full on both sides, there are obvious accuracy benefits to the checksum method.

rsync -acvPe ssh Music host2:~/
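The difference between the quick check and “-c” can be reproduced locally. This sketch assumes the situation described above, namely two files with identical size and modification time but different contents; the file name song.mp3 and its four-byte contents are placeholders, not real audio data.

```shell
#!/bin/sh
# Sketch: a change that the quick check misses but -c (checksum) catches.
# Local scratch directories stand in for the local and remote systems.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed; skipping demo"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src" "$work/dst"
printf 'AAAA' > "$work/src/song.mp3"   # "edited" copy
printf 'BBBB' > "$work/dst/song.mp3"   # older copy with the same size

# Force identical modification times on both copies.
touch -t 200904062323 "$work/src/song.mp3" "$work/dst/song.mp3"

# Quick check: same size and mtime, so the file is skipped.
rsync -a "$work/src/" "$work/dst/"
after_quick=$(cat "$work/dst/song.mp3")
echo "after quick check: $after_quick"   # after quick check: BBBB

# Checksum mode: the contents differ, so the file is transferred.
rsync -ac "$work/src/" "$work/dst/"
after_checksum=$(cat "$work/dst/song.mp3")
echo "after -c: $after_checksum"         # after -c: AAAA
```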

Also, as you may have noticed, rsync is strictly a unidirectional utility. It sends or receives data in a single direction only, and it will overwrite any file with the same name on the side you are sending to (and, with “--delete”, remove files that are missing from the source). For a great bidirectional utility, check out unison, which I will cover in an upcoming article.
