Using OpenSSH with keys can facilitate secure automated backups. rsync(1), tar(1), and dump(8) are the foundation for most backup methods. It's a myth that remote root access must be allowed. If root access is needed, sudo(8) works just fine. Remember that until the backup data has been tested and shown to restore reliably it does not count as a backup copy.
Backup with rsync(1)
rsync(1) is often used to back up both locally and remotely. It is fast and flexible and copies incrementally so only the changes are transferred, thus avoiding wasting time re-copying what is already at the destination. It does that through use of its now famous algorithm. When working remotely, it needs a little help with the encryption and the usual practice is to tunnel it over SSH.
$ rsync -a email@example.com:./archive/ \
    /home/fred/archive/.
rsync(1) has used SSH as its default remote shell since version 2.6.0, but SSH can still be specified explicitly with -e if additional options must be passed to the SSH client:
$ rsync -a -e 'ssh -v' \
    firstname.lastname@example.org:./archive/ \
    /home/fred/archive/.
For some types of data, transfer can sometimes be expedited greatly by using rsync(1) with compression, -z, if the CPUs on both ends can handle the extra work. However, it can also slow things down. So compression is something which must be tested in place to find out one way or the other whether adding it helps or hinders.
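One way to run that test is to time the same transfer with and without -z. The following is a minimal sketch of a timing helper, not a benchmark tool. The function name elapsed_seconds is made up for this example, and it assumes a date(1) that supports +%s, so the resolution is whole seconds only.

```shell
#!/bin/sh
# elapsed_seconds: hypothetical helper, not part of rsync(1) or OpenSSH.
# Runs the given command, discards its output, and prints how many
# whole seconds it took. Assumes date(1) supports the +%s format.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Usage (host and paths are placeholders). Use an empty destination
# for each run so that both transfers move the same data:
#   elapsed_seconds rsync -a  fred@server.example.org:./archive/ /tmp/plain/
#   elapsed_seconds rsync -az fred@server.example.org:./archive/ /tmp/compressed/
```

Several runs at different times of day give a fairer picture than a single pair of numbers, since network load varies.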
Rsync with Keys
Since rsync(1) uses SSH by default it can even authenticate using SSH keys by using the -e option to specify additional options. In that way it is possible to point to a specific SSH key file for the SSH client to use when establishing the connection.
$ rsync --exclude '*~' -avv \
    -e 'ssh -i ~/.ssh/key_bkup_rsa' \
    email@example.com:./archive/ \
    /home/fred/archive/.
Other configuration options can also be sent to the SSH client in the same way if needed, or via the SSH client's configuration file. Furthermore, if the key is first added to an agent, then the key's passphrase only needs to be entered once. This is easy to do in an interactive session within a modern desktop environment. In an automated script, the agent will have to be set up with explicit socket names passed along to the script and accessed via the SSH_AUTH_SOCK variable.
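In an unattended script it helps to fail early if the agent socket is missing rather than let the SSH client fall back to prompting. The following is a small sketch of such a check. The function name agent_available and the socket path in the comments are placeholders, not anything defined by OpenSSH.

```shell
#!/bin/sh
# agent_available: hypothetical helper for an unattended script.
# Succeeds only if SSH_AUTH_SOCK is set and names an existing socket.
agent_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# Intended use near the top of a backup script:
#   SSH_AUTH_SOCK=/path/to/agent.sock    # placeholder path
#   export SSH_AUTH_SOCK
#   agent_available || { echo "no agent socket" >&2; exit 1; }
```

The socket path can be fixed in advance by starting the agent with ssh-agent -a followed by the chosen path, then loading the key once with ssh-add(1).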
Root Level Access for rsync(1) with sudo(8)
Sometimes the backup process needs access to an account other than the one which can log in. That other account is often root, which for reasons of least privilege is usually denied direct access via SSH. rsync(1) can invoke sudo(8) on the remote machine, if needed.
Say you're backing up from the server to the client. rsync(1) on the client uses ssh(1) to make the connection to rsync(1) on the server. rsync(1) is invoked from the client with -v passed to the SSH client to see exactly what parameters are being passed to the server. Those details will be needed in order to incorporate them into the server's configuration for sudo(8). Here the SSH client is run with a single level of increased verbosity in order to show which options must be used:
$ rsync \
    -e 'ssh -v -i ~/.ssh/key_bkup_rsa -t -l bkupacct' \
    --rsync-path='sudo rsync' \
    --delete \
    --archive \
    --compress \
    --verbose \
    bkupacct@server:/var/www/ \
    /media/backups/server/backup/
There the argument --rsync-path tells the server what to run in place of rsync(1). In this case it runs sudo rsync. The argument -e says which remote shell tool to use. In this case it is ssh(1). For the SSH client being called by the rsync(1) client, -i says specifically which key to use. That is independent of whether or not an authentication agent is used for SSH keys. Having more than one key is a possibility, since it is possible to have different keys for different tasks.
You can find the exact setting(s) to use in /etc/sudoers by running the SSH client in verbose mode (-v) on the client. Be careful when working with patterns not to match more than is safe.
Adjusting these settings will most likely be an iterative process. Keep making changes to /etc/sudoers on the server while watching the verbose output until it works as it should. Ultimately /etc/sudoers will end up with a line allowing rsync(1) to run with a minimum of options.
Steps for rsync(1) with Remote Use of sudo(8) Over SSH
Preparation: create an account to use for the backup, create a pair of keys to use only for backup, then make sure you can log in to that account with ssh(1) with and without those keys.
$ ssh -i ~/.ssh/key_bkup_rsa firstname.lastname@example.org
The account on the server is named 'bkupacct' and the private RSA key is ~/.ssh/key_bkup_rsa on the client. On the server, the account 'bkupacct' is a member of the group 'backups'.
The public key, ~/.ssh/key_bkup_rsa.pub, must be copied to the account 'bkupacct' on server and placed in ~/.ssh/authorized_keys there.
It is essential that the following directories on the server are owned by root, belong to a group that the backup account is in, and are group readable but not group writable, and definitely not world readable: ~ and ~/.ssh/. The same goes for the file ~/.ssh/authorized_keys there. (This assumes you are not also using ACLs.) This is one way of many to set permissions on the server:
$ sudo chown root:bkupacct ~
$ sudo chown root:bkupacct ~/.ssh/
$ sudo chown root:bkupacct ~/.ssh/authorized_keys
$ sudo chmod u=rwx,g=rx,o= ~
$ sudo chmod u=rwx,g=rx,o= ~/.ssh/
$ sudo chmod u=rwx,g=r,o= ~/.ssh/authorized_keys
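With its default StrictModes setting, sshd(8) refuses key-based logins when the home directory, ~/.ssh/, or authorized_keys is writable by group or others, so it can save time to check for that directly. The following is a minimal sketch using find(1). The function name loosely_writable is made up for this example.

```shell
#!/bin/sh
# loosely_writable: hypothetical helper. Prints the path if it is
# writable by group or by others, and prints nothing otherwise.
# -prune keeps find(1) from descending into directories.
loosely_writable() {
    find "$1" -prune \( -perm -020 -o -perm -002 \) -print
}

# Usage on the server (the account name is the one from the text):
#   for p in ~bkupacct ~bkupacct/.ssh ~bkupacct/.ssh/authorized_keys; do
#       loosely_writable "$p"
#   done
```

No output means none of the three paths is group or world writable. The check says nothing about ownership or readability, which still need to be confirmed separately.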
Step 1: Configure sudoers(5) and test rsync(1) with sudo(8) on the remote host. In this case data is staying on the remote machine. The group 'backups' will temporarily need full access in order to find and set the specific options used later in locking this down:
%backups ALL=(root:root) NOPASSWD: /usr/bin/rsync
For emphasis, that is a transitory step and that line should not be left in place for any length of time.
$ ssh -l bkupacct www.example.org sudo rsync -av /var/www/ /tmp/
Step 2: Test rsync(1) with sudo(8) over ssh(1). It will be necessary to tune /etc/sudoers a little at this stage. More refinements may come later. Note that there is an rsync(1) user and an ssh(1) user. The data in this case gets copied from the remote machine to the local /tmp directory.
$ rsync -e 'ssh -t -l bkupacct' --rsync-path='sudo rsync' \
    -av email@example.com:/var/www/ /tmp/
Step 3: Do the same transfer again but using the key for authentication to make sure that the key works.
$ rsync -e 'ssh -i ~/.ssh/key_bkup_rsa -t -l bkupacct' --rsync-path='sudo rsync' \
    -av firstname.lastname@example.org:/var/www/ /tmp/
Step 4: Adjust /etc/sudoers so that the backup account has just enough access to run rsync(1), but only in the directories it is supposed to run in and without free rein on the system. Use the first debugging level to see the actual parameters getting passed to the remote host.
$ rsync -e 'ssh -t -v' --rsync-path='sudo rsync' \
    -av email@example.com:/var/www/ /tmp/
...
debug1: Sending command: sudo rsync --server --sender -e.iLs . /var/www
...
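That provides the basis of what /etc/sudoers will need configured. Here are the settings matching the formula above, assuming the account is in the group backups. The option string varies with the version of rsync(1) and the options used, so copy it from your own verbose output: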
%backups ALL=(ALL) NOPASSWD: /usr/bin/rsync --server \
    --sender -vlogDtpre.if . /var/www/
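Copying the option string by hand is error-prone, so the candidate rule can be generated from the verbose output instead. The following is a rough sketch under several assumptions: the function name sudoers_rule_from_debug is made up, rsync(1) is assumed to live at /usr/bin/rsync on the server, and the rule format simply mirrors the line above.

```shell
#!/bin/sh
# sudoers_rule_from_debug: hypothetical helper. Reads "ssh -v" output on
# stdin, finds the "Sending command: sudo rsync ..." line, and prints a
# candidate sudoers rule for the group given as $1 (default: backups).
# Assumes rsync(1) is installed as /usr/bin/rsync on the server.
sudoers_rule_from_debug() {
    group=${1:-backups}
    sed -n 's|^debug1: Sending command: sudo rsync \(.*\)$|%'"$group"' ALL=(ALL) NOPASSWD: /usr/bin/rsync \1|p'
}

# Usage: the debug lines arrive on stderr, so merge the streams first.
#   rsync -e 'ssh -t -v' --rsync-path='sudo rsync' -av \
#       HOST:/var/www/ /tmp/ 2>&1 | sudoers_rule_from_debug
```

The result is only a starting point and should be read over before use. Saving it to a scratch file and running visudo -c -f on that file checks the syntax without touching the live configuration.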
At this point you are almost done, although the process can be automated much further. Be sure that the backed up data is not accessible to others once stored locally.
$ rsync -e 'ssh -t' --rsync-path='sudo rsync' \
    -av firstname.lastname@example.org:/var/www/ /tmp/
Then once the settings are correct it is possible to designate a custom key for authentication:
$ rsync -e 'ssh -t -i ~/.ssh/mybkupkey' --rsync-path='sudo rsync' \
    -av email@example.com:/var/www/ /tmp/
And then you can lock that key into just the one task by further adding restrictions in the authorized_keys file.
command="/usr/bin/rsync --server --sender -vlogDtpre.iLsfxC . ./var/www" ssh-rsa AAAAB3N...Pk=
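A key that is locked to one command can usually also have forwarding switched off. The following is a sketch of a helper that assembles such a line. The function name authorized_keys_entry is made up, and the three no-... options are additions that are not part of the example above, so treat them as a suggestion to test rather than a requirement.

```shell
#!/bin/sh
# authorized_keys_entry: hypothetical helper. Prints one authorized_keys
# line that forces the command in $1 for the public key in $2 and also
# turns off port, X11, and agent forwarding for that key.
authorized_keys_entry() {
    # Double quotes inside the command must be backslash-escaped.
    cmd=$(printf '%s' "$1" | sed 's/"/\\"/g')
    printf 'command="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding %s\n' "$cmd" "$2"
}

# Usage: pass the exact server-side command seen in the verbose output
# and the contents of the public key file.
#   authorized_keys_entry "$server_command" "$(cat ~/.ssh/key_bkup_rsa.pub)"
```

The forced command has to match what the client actually sends, so a change of rsync(1) options on the client means regenerating the line.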
Thus you are able to do automated remote backup using rsync(1) with root level access while avoiding remote root login. The key functions only for the backup.
Still keep close tabs on the private key since it can be used to fetch the remote backup and that may still contain sensitive information.
The process requires a lot of attention to detail, but is quite doable if taken one step at a time.
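Once every step works by hand, the finished command can be wrapped in a small script for cron(8) or a similar scheduler. The following is a sketch only. The function name run_backup and the RSYNC override are made up for this example, the default values are the placeholders used earlier in this section, and -t is left out on the assumption that sudo(8) never prompts because the NOPASSWD rule is in place.

```shell
#!/bin/sh
# All values below are placeholders taken from the examples above.
# Each can be overridden from the environment.
KEY=${KEY:-$HOME/.ssh/key_bkup_rsa}
ACCT=${ACCT:-bkupacct}
HOST=${HOST:-server}
SRC=${SRC:-/var/www/}
DEST=${DEST:-/media/backups/server/backup/}
RSYNC=${RSYNC:-rsync}    # swap in another command name for a dry run

# run_backup: hypothetical wrapper around the finished rsync(1) command.
# The key path must not contain spaces, since it is embedded in -e.
run_backup() {
    "$RSYNC" \
        -e "ssh -i $KEY -l $ACCT" \
        --rsync-path='sudo rsync' \
        --delete --archive --compress \
        "$HOST:$SRC" "$DEST"
}

# Call run_backup at the end of the script, or from a scheduler wrapper.
```

The set of rsync(1) options here has to stay identical to the set the sudoers rule and the forced command were derived from. Changing one of them changes the server-side command and the transfer will then be refused.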
Other Implementations of the Rsync Protocol
openrsync(1) is a clean room reimplementation of version 27 of the Rsync protocol as supported by the samba.org implementation of rsync(1). It has been in OpenBSD's base system since OpenBSD version 6.5. It is invoked with a different name, so if it is on a remote system and samba.org's rsync(1) is on the local system, the --rsync-path option must point to it by name:
$ rsync -a -v -e 'ssh -i key_rsa' \
    --rsync-path=/usr/bin/openrsync \
    firstname.lastname@example.org:/var/www/ \
    /home/fred/www/
Backup Using tar(1)
The following will make a tarball of the directory /var/www/ and send it via stdout on the local machine into stdin on the remote machine via a pipe into ssh(1), where it is then directed into the file called backup.tar. Here tar(1) runs on a local machine and stores the tarball remotely:
$ tar cf - /var/www/ | ssh -l fred server.example.org 'cat > backup.tar'
There are almost limitless variations on that recipe:
$ tar zcf - /var/www/ /home/*/www/ \
    | ssh -l fred server.example.org 'cat > $(date +"%Y-%m-%d").tar.gz'
That example does the same, but also gets user WWW directories, compresses the tarball using gzip(1), and labels the resulting file according to the current date. It can be done with keys, too:
$ tar zcf - /var/www/ /home/*/www/ \
    | ssh -i key_rsa -l fred server.example.org 'cat > $(date +"%Y-%m-%d").tgz'
Going the other direction is just as easy: tar(1) runs on the remote machine and the tarball is stored locally.
$ ssh email@example.com 'tar zcf - /var/www/' > backup.tgz
Or here is a fancier example of running tar(1) on the remote machine but storing the tarball locally.
$ ssh -i key_rsa -l fred server.example.org 'tar jcf - /var/www/ /home/*/www/' \
    > $(date +"%Y-%m-%d").tar.bz2
So in summary, the secret to using tar(1) for backup is the use of stdout and stdin to effect the transfer through pipes and redirects.
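As noted at the start, a tarball only counts as a backup once a restore from it has been tested. The following is a minimal sketch of such a test for a gzip-compressed tarball that is available locally. The function name verify_tarball is made up. It assumes that the directory was archived by its absolute path and that tar(1) stripped the leading slash from member names, and it compares file contents only, not ownership or permissions.

```shell
#!/bin/sh
# verify_tarball: hypothetical helper. Extracts the gzip-compressed
# tarball $1 into a scratch directory and compares the result with the
# original directory $2. Returns 0 only if the two trees match.
# Assumes $2 is the same absolute path that was given to tar(1) and
# that tar(1) stripped the leading "/" from member names.
verify_tarball() {
    tarball=$1
    src=${2%/}
    scratch=$(mktemp -d) || return 1
    tar zxf "$tarball" -C "$scratch" &&
        diff -r "$src" "$scratch/${src#/}" >/dev/null
    rc=$?
    rm -rf "$scratch"
    return $rc
}

# Usage:
#   verify_tarball backup.tgz /var/www/ && echo "restore test passed"
```

On a busy site the live directory may change between the backup and the check, so a mismatch is a prompt to look closer rather than proof of a bad tarball.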
Backup of Files With tar(1) But Without Making A Tarball
Sometimes it is necessary to just transfer the files and directories without making a tarball at the destination. In addition to writing to stdout on the source machine, tar(1) can read from stdin on the destination machine to transfer whole directory hierarchies at once.
$ tar zcf - /var/www/ | ssh -l fred server.example.org "cd /some/path/; tar zxf -"
Or going the opposite direction, it would be the following.
$ ssh -l fred server.example.org 'tar zcf - /var/www/' | (cd /some/path/; tar zxf - )
However, these still copy everything each time they are run. So rsync(1), described in the previous section, might be a better choice in many situations, since on subsequent runs it only copies the changes. Also, depending on the type of data, network conditions, and CPUs available, compression might be a good idea either with tar(1) or ssh(1) itself.
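The tar-to-tar pipe can be tried out with both ends on the local machine before SSH is added to the mix. The following is a small sketch of that. The function name copy_tree is made up, and it uses the -C option of tar(1) to change directory on the sending side so that the member names are relative.

```shell
#!/bin/sh
# copy_tree: hypothetical helper showing the tar-to-tar pipe with both
# ends on the local machine. Copies the contents of directory $1 into
# the existing directory $2.
copy_tree() {
    tar cf - -C "$1" . | (cd "$2" && tar xf -)
}

# Usage:
#   mkdir -p /tmp/www-copy && copy_tree /var/www /tmp/www-copy
```

For a remote destination the right-hand side of the pipe becomes an ssh(1) command running the same cd and tar there. Compression in ssh(1) itself is requested with its own -C option, which is unrelated to the -C option of tar(1) used here.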
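Backup Using dump(8)
dump(8) can be used over SSH in much the same way as tar(1). Here it runs on the remote machine with sudo(8) and the compressed dump is stored locally: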
$ ssh -t source.example.org 'sudo dump -0an -f - /var/www/ | gzip -c9' > backup.dump.gz
Note that the password prompt for sudo(8) might not be visible and it must be typed blindly.
Or one can go the other direction, copying from the local server to the remote:
$ sudo dump -0an -f - /var/www/ | gzip -c9 | ssh target.example.org 'cat > backup.dump.gz'
Note again that the password prompt might get hidden in the initial output from dump(8). However, it's still there, even if not visible.
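Since prompts and archive data can end up sharing a stream, a quick check that the stored file really is an intact gzip stream is worthwhile. The following is a minimal sketch. The function name dump_archive_intact is made up, and passing the check only shows that the compression layer is sound, not that the dump inside can be restored.

```shell
#!/bin/sh
# dump_archive_intact: hypothetical helper. Succeeds only if the file $1
# exists, is not empty, and passes gzip's own integrity test.
dump_archive_intact() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Usage:
#   dump_archive_intact backup.dump.gz || echo "archive is damaged" >&2
```

A fuller test is to decompress the file into restore(8) and list its contents, for example by piping gzip -dc backup.dump.gz into restore -tf -, and then to restore a sample into a scratch directory.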
- "How Rsync Works". Samba. http://www.samba.org/rsync/how-rsync-works.html.
- "NEWS for rsync 2.6.0 (1 Jan 2004)". Samba. 2004-01-01. https://download.samba.org/pub/rsync/src/rsync-2.6.0-NEWS. Retrieved 2020-05-02.
- "openrsync imported into the tree". Undeadly. 2019-02-11. https://undeadly.org/cgi?action=article;sid=20190211081518. Retrieved 2020-05-10.