Rsync at your company
I assume that rsync is installed on all three servers, though not necessarily in the same location: the rsync binary needs to exist on both the local and the remote machine of each pair that talk to each other. My test case consisted of two Solaris SPARC servers (which had rsync pre-installed) and one Solaris x86 server where I had to add it manually. I got it from a sunfreeware site and put it into
/tools/rsync/bin/rsync
but I leave the location to the user.
Rsync mirror design
To keep proper control of what is synced, the main rsync process should run on one machine only and perform the following steps. Assume you have three machines A, B and C.
- A->B: update files on machine B which are newer on machine A.
- B->A: update files on machine A which are newer on machine B.
- A->C: update files on machine C which are newer on machine A.
- C->A: update files on machine A which are newer on machine C.
- A->B: update files on machine B which are now newer on A (and came from C)
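The five passes above could be driven from machine A by a small wrapper along the following lines. This is only a sketch and not the actual script used here: the host names machineB and machineC, the variable RUN and the function names are placeholders chosen for illustration. By default it only prints the rsync commands it would run.

```shell
#!/bin/sh
# Sketch of the five-pass sequence, run from machine A.
# machineB and machineC are placeholder host names.
CONF_DIR=/app/foo/conf

# By default each command is only printed. Run with RUN= (empty) to execute.
RUN=${RUN-echo}

# sync_pass SRC DST: update files in DST which are newer in SRC.
sync_pass() {
    $RUN rsync --archive --update --rsh=ssh "$1/" "$2"
}

run_all_passes() {
    sync_pass "$CONF_DIR"          "machineB:$CONF_DIR"   # A->B
    sync_pass "machineB:$CONF_DIR" "$CONF_DIR"            # B->A
    sync_pass "$CONF_DIR"          "machineC:$CONF_DIR"   # A->C
    sync_pass "machineC:$CONF_DIR" "$CONF_DIR"            # C->A
    sync_pass "$CONF_DIR"          "machineB:$CONF_DIR"   # A->B again
}

run_all_passes
```

The last pass repeats the first one so that anything which arrived on A from C is forwarded to B within the same run.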
Here is the basic rsync command:
rsync --archive --update --verbose --stats --rsh=ssh --cvs-exclude --files-from=somefile /app/foo/conf/ server2:/app/foo/conf
The --archive option is a summary option (recursion and preserving everything like timestamps, symbolic links etc. except hard links).
The --update option skips the transfer of files which are newer on the target system.
The --verbose and --stats options are for reporting only.
The --rsh=ssh option means to use ssh as the login mechanism to the remote system.
The --cvs-exclude option excludes all CVS-related files from being checked. It is basically a filter for certain file names and file extensions.
The --files-from=somefile option names a file which lists all the filenames to be checked (nothing else will be). The paths in that list are relative to the source directory.
The first argument is the source directory to be checked and the second argument is the target machine (server2) and the directory to be checked there. The two directories may differ: rsync does not require /app/foo/conf to be in the same place on both machines.
There are a couple of noteworthy additional options:
--rsync-path=/tools/rsync/bin/rsync
tells rsync where to find rsync on the remote system.
--dry-run
tells rsync to do a check only but not do a real file transfer.
Note also that if two files are equal but have different time stamps, rsync will update the time stamps so that they are in sync.
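To get a feel for --update, --files-from and --dry-run without touching a remote machine, the options can be tried between two scratch directories on one host. The following is a minimal sketch under that assumption: the directory layout and the file names app.conf, other.conf and rsync.files are made up for the demonstration, and the rsync calls are skipped if rsync is not on the PATH.

```shell
#!/bin/sh
# Local demonstration only: two scratch directories stand in for two machines.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"

echo "new" > "$demo/src/app.conf"       # newer copy on the source side
echo "old" > "$demo/dst/app.conf"       # older copy on the target side
touch -t 202001010000 "$demo/dst/app.conf"
echo "ignored" > "$demo/src/other.conf" # not listed, so never checked

# The list of files to check, relative to the source directory.
printf 'app.conf\n' > "$demo/rsync.files"

if command -v rsync >/dev/null 2>&1; then
    # Check only: reports what would be transferred, changes nothing.
    rsync --archive --update --verbose --dry-run \
        --files-from="$demo/rsync.files" "$demo/src/" "$demo/dst"
    after_dry_run=$(cat "$demo/dst/app.conf")

    # Real transfer: the target copy is replaced because the source is newer.
    rsync --archive --update \
        --files-from="$demo/rsync.files" "$demo/src/" "$demo/dst"
    after_real_run=$(cat "$demo/dst/app.conf")
fi
```

Running the dry run first against the real target is a cheap way to check a new file list before letting rsync transfer anything.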
Solution
The above design has been implemented as follows:
- the script is /tools/rsync/scripts/rsync.sh on server A
- the list of files to be checked is in /tools/rsync/scripts/rsync.files (a list of config files)
- password free ssh access from machine A to the remote machines has been enabled by adding the public key (.ssh/id_rsa.pub) to each remote machine's .ssh/authorized_keys file
Cron (UNIX tool to run regular jobs)
There is a simple cron job on machine A:
7,17,27,37,47,57 * * * * cd /tools/rsync/scripts; ./rsync.sh | /usr/ucb/mail -s "rsync `date`" foo@Bar.COM
i.e. it runs every 10 minutes, and notification is via email to 'foo' (this could be improved).
Password free ssh
For the rsync hosts to communicate via ssh without a password, one needs to generate a public/private key pair on the central machine and add the public key to the remote machines.
- On machine A: generate key files .ssh/id_rsa and .ssh/id_rsa.pub
ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/app/foo/.ssh/id_rsa): [Enter return]
Enter passphrase (empty for no passphrase): [Enter return]
Enter same passphrase again: [Enter return]
Your identification has been saved in /app/foo/.ssh/id_rsa.
Your public key has been saved in /app/foo/.ssh/id_rsa.pub.
The key fingerprint is:
24:ab:31:1e:f1:74:16:4d:0f:8e:70:19:1b:31:2e:db foo@machineA
- Check the public key (this is one line which I wrapped for readability)
cat $HOME/.ssh/id_rsa.pub
ssh-rsa AABAB3NzaC2yc2EAAAABIwCAAIEAvpzxLumVmSRPKmgwSk9NGPUDcxfFpypUAdi3UGpZ2QSqoak
QaDQyp4RPVoLA2gADjW3Y132TJZLEBCmBaX7A588XGg/svXuCnXXXuRYL0wwO8iRCleCO50mzNfY4XcOxM
P62JIVdlDOMsnY/eSYpK+ex/9RomVRa/bMw9b/D/e0= foo@machineA
- Enter this line into .ssh/authorized_keys on remote machines (B and C)
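Entering the key by hand works, but the step can also be scripted. The helper below is a sketch only: the function name install_key is made up, and it is meant to be run on the remote machine against a copy of the public key. It appends the key only if the exact line is not already present and tightens the permissions, since sshd is usually configured to ignore an authorized_keys file that others can write to.

```shell
#!/bin/sh
# install_key PUBKEY_FILE AUTH_FILE
# Append the one-line public key to AUTH_FILE unless it is already there.
install_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    key=$(cat "$pubkey_file")
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Hypothetical usage on machine B, after copying id_rsa.pub over from A:
#   install_key /tmp/id_rsa.pub "$HOME/.ssh/authorized_keys"
#
# A common one-step alternative, run from machine A (host name is a placeholder):
#   ssh foo@machineB 'cat >> .ssh/authorized_keys' < "$HOME/.ssh/id_rsa.pub"
```

Afterwards, running ssh to machine B from machine A should log in without prompting for a password.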
This scheme works nicely for systems where there aren't that many changes happening at the same time.
One can easily envision that if two users change the same file at the same time on machines B and C, the subsequent rsync will first copy B's version to A, then C's version to A, and then forward A's version (which now equals C's) to B, thus wiping out the original change on B. In other words, this scheme does not guarantee data consistency. I am using it to maintain certain config files where there are changes only once or twice per week and only a handful of users have access to the files.
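The lost update can be reproduced on a single host. The sketch below is a simulation, not rsync itself: three scratch directories stand in for machines A, B and C, and a "copy if newer" helper (the made-up function copy_if_newer) stands in for rsync --update. It assumes the edit on C carries the slightly later timestamp. If B's edit were the later one, the same passes would discard C's change instead.

```shell
#!/bin/sh
# Simulation: directories play the machines, cp -p plays "rsync --update".
sim=$(mktemp -d)
mkdir "$sim/A" "$sim/B" "$sim/C"

# copy_if_newer SRC_DIR DST_DIR: copy app.conf only if the source is newer.
copy_if_newer() {
    if [ "$1/app.conf" -nt "$2/app.conf" ]; then
        cp -p "$1/app.conf" "$2/app.conf"   # -p keeps the modification time
    fi
}

# All three machines start with the same file.
for m in A B C; do
    echo "original" > "$sim/$m/app.conf"
    touch -t 202001010900 "$sim/$m/app.conf"
done

# Two users edit the file at almost the same time on B and C.
echo "edit made on B" > "$sim/B/app.conf"
touch -t 202001011000 "$sim/B/app.conf"
echo "edit made on C" > "$sim/C/app.conf"
touch -t 202001011001 "$sim/C/app.conf"

# The five passes, in the order used by the design.
copy_if_newer "$sim/A" "$sim/B"   # A->B: nothing to do
copy_if_newer "$sim/B" "$sim/A"   # B->A: A receives B's edit
copy_if_newer "$sim/A" "$sim/C"   # A->C: skipped, C's copy is newer
copy_if_newer "$sim/C" "$sim/A"   # C->A: A receives C's edit, B's edit is gone from A
copy_if_newer "$sim/A" "$sim/B"   # A->B: B's own edit is overwritten

for m in A B C; do
    printf '%s: %s\n' "$m" "$(cat "$sim/$m/app.conf")"
done
# Prints "edit made on C" for all three machines.
```

All three copies end up identical, which is what the mirror is designed to achieve, but the edit made on B is silently discarded.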
There is also the issue of files being synced while updated by users at the same time. One would need much more clever file-locking-across-multiple-systems approaches to tackle this.
See also Rsync man page (lots of options)