File Synchronization with Unison over SSH

Posted by admin on April 7, 2009 under Tech Tips

Previously, I posted on using rsync over SSH for file synchronization. While this works very well when pushing data in one direction, it’s not well suited for synchronizing modifications that are made on both sides. An excellent bidirectional utility for that type of job is Unison, which sports many of the same benefits as rsync, but has some distinct advantages for more complex synchronization scenarios.

A basic example would be to synchronize a local directory called “MyDocs” with a remote SSH server. From the following output, you can see that this directory contains four text files.

ls -ld ~/MyDocs
drwxr-xr-x 2 gmendoza gmendoza 4096 2009-04-09 16:05 /home/gmendoza/MyDocs

ls -l ~/MyDocs
total 12
-rw-r--r-- 1 gmendoza gmendoza 31 2009-04-09 16:09 file1.txt
-rw-r--r-- 1 gmendoza gmendoza 31 2009-04-09 16:09 file2.txt
-rw-r--r-- 1 gmendoza gmendoza 31 2009-04-09 16:09 file3.txt
-rw-r--r-- 1 gmendoza gmendoza 31 2009-04-09 16:09 file4.txt

The first time you run Unison on a particular pair of directories, each side creates its own archive file (the local index and hash table that Unison uses to detect changes). You’ll get a warning that no archive files were found, and you will be asked to press Return to continue. If the root directory on the remote side does not exist yet, you’ll also be prompted to accept the changes.

unison MyDocs ssh://host2/MyDocs
Contacting server...
Connected [//host1//home/gmendoza/MyDocs -> //host2//home/gmendoza/MyDocs]
Looking for changes
Warning: No archive files were found for these roots, whose canonical names are:
/home/gmendoza/MyDocs
//host2//home/gmendoza/MyDocs
(snipped for brevity...)
Press return to continue.[] Waiting for changes from server
Reconciling changes

local host2
dir ----> / [f]

Proceed with propagating updates? [] y
Propagating updates

UNISON 2.27.57 started propagating changes at 16:14:30 on 09 Apr 2009
[BGN] Copying from /home/gmendoza/MyDocs to //host2//home/gmendoza/MyDocs
[END] Copying
UNISON 2.27.57 finished propagating changes at 16:14:30 on 09 Apr 2009

Saving synchronizer state
Synchronization complete (1 item transferred, 0 skipped, 0 failures)

Subsequent synchronizations, when nothing has changed on either side, look like the following.

unison MyDocs ssh://host2/MyDocs
Contacting server...
Connected [//host1//home/gmendoza/MyDocs -> //host2//home/gmendoza/MyDocs]
Looking for changes
Waiting for changes from server
Reconciling changes
Nothing to do: replicas have not changed since last sync.

For the following example, I modified file1.txt on host1 and file2.txt on host2, while file3.txt and file4.txt were each modified on both sides. The great thing about Unison is that when there is a conflict, you can view the differences and choose the direction in which to synchronize. Pressing the “x” key displays some basic information about the files that differ. In this case, I chose the file with the most recent timestamp each time. You choose the direction by pressing the greater-than and less-than symbols, “>” and “<”.

unison MyDocs ssh://host2/MyDocs
(snipped)
local host2
changed <-?-> changed file3.txt [] x
local : changed file modified on 2009-04-09 at 16:16:29 size 50
host2 : changed file modified on 2009-04-09 at 16:16:43 size 55
changed <==== changed file3.txt [] <
changed <-?-> changed file4.txt [] x
local : changed file modified on 2009-04-09 at 16:17:20 size 56
host2 : changed file modified on 2009-04-09 at 16:16:59 size 41
changed ====> changed file4.txt [] >
changed ----> file1.txt [f]
<---- changed file2.txt [f]

Proceed with propagating updates? [] y
Propagating updates

UNISON 2.27.57 started propagating changes at 16:18:27 on 09 Apr 2009
[BGN] Updating file file3.txt from //host2//home/gmendoza/MyDocs to /home/gmendoza/MyDocs
[BGN] Updating file file4.txt from /home/gmendoza/MyDocs to //host2//home/gmendoza/MyDocs
[BGN] Updating file file1.txt from /home/gmendoza/MyDocs to //host2//home/gmendoza/MyDocs
[BGN] Updating file file2.txt from //host2//home/gmendoza/MyDocs to /home/gmendoza/MyDocs
[END] Updating file file3.txt
[END] Updating file file2.txt
[END] Updating file file4.txt
[END] Updating file file1.txt
UNISON 2.27.57 finished propagating changes at 16:18:27 on 09 Apr 2009

Saving synchronizer state
Synchronization complete (4 items transferred, 0 skipped, 0 failures)
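
If you always resolve conflicts in favor of the most recent file, as in the session above, a Unison profile can pre-select that direction for you. The following is only a sketch under a few assumptions: the profile name "mydocs" is made up, the roots are the ones from this example, and the "prefer = newer" and "times = true" preferences are taken from the Unison manual, so check that your version supports them. Note that it overwrites any existing profile with the same name.

```shell
# Unison reads profiles from ~/.unison unless the UNISON
# environment variable points to a different directory.
profile_dir="${UNISON:-$HOME/.unison}"
mkdir -p "$profile_dir"

# "mydocs" is a hypothetical profile name chosen for this example.
cat > "$profile_dir/mydocs.prf" <<EOF
# The two roots used in the example above
root = $HOME/MyDocs
root = ssh://host2/MyDocs
# On a conflict, default to the most recently modified copy
prefer = newer
# Propagate modification times so that "newer" is meaningful
times = true
EOF

# Afterwards, run the profile by name:
#   unison mydocs
```

Unison should still show each change and ask before propagating, so you can override the suggested direction with “>” or “<” as before.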

Unison also has a GTK front end for the graphically inclined. Be sure to check out all the documentation for a full understanding of syntax.

File Synchronization with Rsync over SSH

Posted by admin on April 6, 2009 under Tech Tips

To quickly synchronize files between two systems, rsync is an excellent tool. Its delta-transfer algorithm cuts transfer time by sending only the parts of a file that differ from the copy already on the other side, and it can run transparently over SSH. The beauty of running rsync over SSH is that no rsync daemon (rsyncd) needs to be running ahead of time, and the connection is both authenticated and encrypted. All that is required is that the remote host runs the OpenSSH server component and, of course, has rsync installed.

I use rsync mostly for synchronizing my “Music” and “Documents” folders between a number of my systems at home and at work. All of these systems have these folders in the root of my home directory.

ls -ld ~/Music ~/Documents
drwxr-xr-x 16 gmendoza gmendoza 4096 2009-04-06 23:23 /home/gmendoza/Documents
drwxr-xr-x  9 gmendoza gmendoza 4096 2009-04-06 23:23 /home/gmendoza/Music

To push recent changes from my local system (host1) to my remote system (host2), I use the following commands.

rsync -avPe ssh ~/Music host2:~/
rsync -avPe ssh ~/Documents host2:~/

Notice that “Music” and “Documents” are specified without a trailing “/”, i.e. not as “Music/” or “Documents/”. This is important: with a trailing slash, rsync would copy only the contents of the folder into the remote home directory, not the folder itself. The rsync man page describes this behavior in more detail.
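
The trailing-slash behavior is easy to check locally before touching a remote host. Here is a minimal sketch, assuming rsync is installed and using a throwaway temporary directory. The names with_folder, contents_only and track.txt are made up for illustration.

```shell
# Scratch area with a small source tree
demo=$(mktemp -d)
mkdir -p "$demo/src/Music" "$demo/with_folder" "$demo/contents_only"
echo "sample" > "$demo/src/Music/track.txt"

# No trailing slash on the source: the Music folder itself is copied.
rsync -a "$demo/src/Music" "$demo/with_folder/"

# Trailing slash on the source: only the contents of Music are copied.
rsync -a "$demo/src/Music/" "$demo/contents_only/"

# List the resulting files relative to the scratch area
(cd "$demo" && find with_folder contents_only -type f | sort)

# Remove the scratch area when you are done: rm -rf "$demo"
```

The listing should show track.txt under with_folder/Music/ in the first case and directly under contents_only/ in the second.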

Instead of running two separate commands, you can also specify multiple sources on a single line.

rsync -avPe ssh ~/Music ~/Documents host2:~/

To synchronize changes made on the remote system to my local system, just reverse the source and destination. Notice the period at the end of the line, which specifies the local working directory as the destination. Also, instead of wasting space by entering the host twice, you can use standard shell syntax to specify sets of files. In this case, I use curly brackets to specify that the two directories on the remote host, which share the same parent directory, should be copied to my local working directory.

rsync -avPe ssh host2:~/{Music,Documents} .
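
To see what the curly brackets turn into, you can put echo in front of the command. This sketch assumes your local shell is bash, which performs the brace expansion itself before rsync ever starts; the variable name "expanded" is only there for illustration.

```shell
#!/usr/bin/env bash
# Brace expansion is done by the local shell, so echo shows
# exactly which arguments rsync would receive.
expanded=$(echo rsync -avPe ssh host2:~/{Music,Documents} .)
echo "$expanded"
# prints: rsync -avPe ssh host2:~/Music host2:~/Documents .
```

In other words, rsync still receives two separate remote sources. The brackets only save typing.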

I’ll also use the “--delete” option to remove from the destination any files and folders that have been removed from the source system.

rsync --delete -avPe ssh ~/Music host2:~/
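
Because “--delete” removes data on the receiving side, it can be worth previewing the result first with rsync’s “-n” (dry run) option. The sketch below assumes rsync is installed and works on a local scratch directory rather than a real host; keep.txt and stale.txt are hypothetical file names.

```shell
# Scratch area: the destination holds a file the source no longer has
demo=$(mktemp -d)
mkdir -p "$demo/src/Music" "$demo/dst/Music"
echo "keep" > "$demo/src/Music/keep.txt"
echo "stale" > "$demo/dst/Music/stale.txt"

# Dry run: -n reports what would be copied or deleted, but changes nothing.
rsync -avn --delete "$demo/src/Music" "$demo/dst/"

# Real run: stale.txt is removed from the destination, keep.txt is copied.
rsync -a --delete "$demo/src/Music" "$demo/dst/"
ls "$demo/dst/Music"

# Remove the scratch area when you are done: rm -rf "$demo"
```

The same “-n” can be added to the SSH form of the command above to preview deletions on host2 before committing to them.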

By default, rsync compares files extremely fast using a “quick check” algorithm based on file size and last-modified time (per the rsync man page). While I was updating my Music collection, I noticed that rsync was not detecting my ID3 tag modifications. With the “-c” option, rsync instead compares files using a 128-bit checksum (MD4 in older releases; rsync 3.0.0 switched to MD5) as a more definitive change detection method. While this slows the process down significantly, there are obvious accuracy benefits to the checksum method.

rsync -acvPe ssh Music host2:~/
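
The effect can be reproduced locally. This is a rough sketch, assuming rsync is installed: it changes a file’s content while keeping its size and modification time the same, which is the situation where the quick check is fooled. The file tag.txt and its contents are invented stand-ins for a retagged music file.

```shell
# Scratch area with one file, synchronized once
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
printf 'artist=AAAA\n' > "$demo/src/tag.txt"
rsync -a "$demo/src/" "$demo/dst/"

# Change the content, but keep the same size and the same mtime
printf 'artist=BBBB\n' > "$demo/src/tag.txt"
touch -r "$demo/dst/tag.txt" "$demo/src/tag.txt"

# Quick check: size and mtime match, so the file is skipped.
rsync -a "$demo/src/" "$demo/dst/"
after_quick=$(cat "$demo/dst/tag.txt")

# Checksum comparison: the content difference is detected and copied.
rsync -ac "$demo/src/" "$demo/dst/"
after_checksum=$(cat "$demo/dst/tag.txt")

echo "after quick check: $after_quick"
echo "after -c:          $after_checksum"

# Remove the scratch area when you are done: rm -rf "$demo"
```

The first line of output should still show the old value and the second the new one.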

Also, as you may have noticed, rsync is strictly a unidirectional utility. This means that it only sends or receives data in a single direction, and it will clobber or delete any file or folder with the same name in the direction you are sending the data. For a great bidirectional utility, check out Unison, which I will cover in an upcoming article.