Here we back up a Linux folder, /data, with all of its data to a remote system. 7zip has some of the best compression available, so we want to use it. However, 7zip was originally made for Windows, so it doesn't preserve Linux attributes such as permissions. So first we tar up the data and then 7zip it (this can be done with one command). After that I show you how to send that 7zip file to the remote side using rsync over ssh (without setting up an rsync server or anything else).
Requirements:
You need rsync, 7zip and ssh (openssh-server and openssh-client) on both sides.
# apt-get update
# apt-get install openssh-server
(I assume this is already set up, since that's how you are accessing both the remote and local systems.)
# apt-get install openssh-client
# apt-get install p7zip-full
# apt-get install rsync
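Before starting, it can help to confirm that the tools are actually on the PATH on each side. The sketch below is one way to do that; `missing_tools` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is NOT found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (commented out so that sourcing this file has no side effects):
# missing_tools tar 7z rsync ssh
```

If the example prints nothing, all four tools were found; any name it prints still needs to be installed.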
Article:
7zip does not preserve Linux permissions and attributes, but tar does.
So it's best to tar the data, pipe the output to 7z, and create a tar.7z file.
NOTE: If you're working with sparse files, use bsdtar instead of tar. A 100 GB sparse file with only 1 MB of data will take a while to tar, but bsdtar will be quicker. The resulting tar file can be extracted with either regular tar or bsdtar.
# tar -cvf - /data/ | 7z a -mx3 -si /mnt/data.tar.7z
# tar -cvf - /compressed/ | 7z a -mx0 -si /mnt/compressed.tar.7z
NOTE: -mx9 is the best compression, -mx3 is okay compression, and -mx0 is copy mode (it doesn't compress, so it's best for data that is already compressed or doesn't compress well, such as pictures, videos and other archives).
If you want to extract it, it looks like this:
First you un7zip the archive, which gives you one big tar file. Then you extract that tar file.
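If you run this pipeline often, a small wrapper can catch a mistyped level or source path before tar starts. This is only a sketch: `backup_to_7z` is a hypothetical name, and the list of accepted levels is the set documented for 7z's -mx switch.

```shell
#!/bin/sh
# Hypothetical wrapper around the tar | 7z pipeline shown above.
# Usage: backup_to_7z SOURCE_DIR ARCHIVE [LEVEL]   (LEVEL defaults to 3)
backup_to_7z() {
    src=$1
    archive=$2
    level=${3:-3}
    # Accept only the levels documented for -mx; reject anything else early.
    case "$level" in
        0|1|3|5|7|9) ;;
        *) echo "level must be one of 0 1 3 5 7 9" >&2; return 2 ;;
    esac
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    # tar writes the archive to stdout; 7z reads it from stdin (-si).
    tar -cf - "$src" | 7z a -mx"$level" -si "$archive"
}

# Examples (commented out):
# backup_to_7z /data/ /mnt/data.tar.7z 3
# backup_to_7z /compressed/ /mnt/compressed.tar.7z 0
```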
# cd /mnt
Let's extract data.tar.7z first (for no particular reason).
# 7z x data.tar.7z
# mkdir /mnt/data
# tar xvf data.tar -C /mnt/data/
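This will probably create /mnt/data/data, because the tar file stores its paths with the leading data/ directory. To avoid that, skip the mkdir and the -C /mnt/data option, so that tar extracts into the current directory (/mnt).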
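To see in advance which directory a tar file will create, you can list its top-level names before extracting. A minimal sketch, assuming a plain tar file on disk; `tar_top_level` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct top-level names stored in a tar file.
tar_top_level() {
    # -t lists member paths; strip a leading "./" if present, then keep
    # only the first path component of each member.
    tar -tf "$1" | sed 's|^\./||' | cut -d/ -f1 | sort -u
}

# Example (commented out):
# tar_top_level /mnt/data.tar
```

If the only name printed is data, extracting with -C /mnt/data ends up in /mnt/data/data, while extracting from /mnt gives /mnt/data.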
# 7z x compressed.tar.7z
# mkdir /mnt/compressed
# tar xvf compressed.tar -C /mnt/compressed
NOTE: There is no need to specify which compression method was used when extracting; 7z reads it from the archive (which is handy, since you might not know the method if someone else made the archive).
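The two extraction steps can also be combined so that the intermediate tar file never lands on disk. The sketch below assumes 7z's -so switch (write extracted data to stdout); `extract_tar_7z` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: unpack ARCHIVE (a .tar.7z) into DEST_DIR in one pipeline.
# Usage: extract_tar_7z ARCHIVE DEST_DIR
extract_tar_7z() {
    archive=$1
    dest=$2
    if [ ! -f "$archive" ]; then
        echo "no such archive: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # -so makes 7z write the extracted tar stream to stdout instead of a file;
    # tar reads that stream from stdin (-f -) and unpacks it under DEST_DIR.
    7z x -so "$archive" | tar -xf - -C "$dest"
}

# Example (commented out):
# extract_tar_7z /mnt/data.tar.7z /mnt
```

This mirrors the way the archive was created: tar piped into 7z going in, 7z piped into tar coming out.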
Send your files to a server via rsync (over ssh)
All you need on the remote side is an ssh server that you can connect to, with rsync installed there as well.
Check this out first: https://www.digitalocean.com/community/tutorials/how-to-copy-files-with-rsync-over-ssh
We will use rsync to send these giant files over because it's convenient and transfers can be resumed.
Use -a to send over attributes and recurse through folders (we are just copying one file, so the recursive part doesn't matter). Look at the man page or rsync --help to see all of the options that -a implies.
Use -v to get extra information and output.
Use -P, which gives --progress and --partial. Progress shows progress per file. It cannot show progress of the whole transfer (for a transfer of many files or folders); it's only a progress bar per file. We could use -h for human-readable speeds as well.
Partial concerns the file that rsync writes on the receiving side. While a file is being copied, rsync writes to a temporary file (with random characters appended to the right end, and hidden with a dot prepended to the left end). When the copy is done, the file is complete and rsync removes the random characters and the dot, so that it's not hidden anymore.
Partial files can be resumed. Without the partial option, if rsync fails or is cancelled, the transfer stops and whatever was copied to the remote end gets deleted. With the partial option enabled, if rsync fails or is cancelled, the partially transferred file remains on the remote end, so the next run can continue from it.
Rsync is nice because if you change a few bits, megabytes or gigabytes at random spots in a huge file, rsync is smart enough to locate the changes and copy only those parts. It's pretty quick, although it does take time to work out what changed and send those parts over.
Likewise, rsync can find out where it left off and resume.
Use -e to specify that we will connect via ssh; rsync then copies the file from the local side to the remote location. Note that you don't need an rsync server set up on the remote end. We are not using an rsync server (instead, one is started for us automatically by the local rsync over the ssh connection). We are using the ssh server. All that is required on the remote end is an ssh server and an installation of rsync.
Upon connection you will be asked for a password. You can avoid that if you have keys in place.
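Because an interrupted transfer can be picked up again, an unattended backup can simply rerun rsync until it succeeds. The following is a rough sketch of such a loop, not something rsync provides itself; `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Usage: retry MAX DELAY COMMAND [ARGS...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (commented out): up to 5 attempts, 30 seconds apart.
# retry 5 30 rsync -avzP -e "ssh -p 2222" /mnt/data.tar.7z user1@remote.server.com:/data/
```

For this to run without someone typing a password on each attempt, key-based login needs to be in place.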
# rsync -avzP -e "ssh -p 2222" /mnt/data.tar.7z user1@remote.server.com:/data/data.tar.7z
# rsync -avzP -e "ssh -p 2222" /mnt/compressed.tar.7z user1@remote.server.com:/data/compressed.tar.7z
NOTE: Because of the [host]:/location format (a single colon), this uses rsync over ssh instead of a regular rsync server.
You don't need to specify the remote file name; we can just ask rsync to put the file in the /data folder.
# rsync -avzP -e "ssh -p 2222" /mnt/data.tar.7z user1@remote.server.com:/data/
# rsync -avzP -e "ssh -p 2222" /mnt/compressed.tar.7z user1@remote.server.com:/data/
If you're using the regular port 22 for ssh, you can omit the "-p 2222".
# rsync -avzP -e "ssh" /mnt/data.tar.7z user1@forge.infotinks.com:/data/
# rsync -avzP -e "ssh" /mnt/compressed.tar.7z user1@forge.infotinks.com:/data/
If you're going to use this type of transfer often, I recommend setting the following options as well, and also getting your public ssh key added as a new line in /home/user1/.ssh/authorized_keys on the remote side.
SIDENOTE: Basically, the local side needs to have your private key (that's your id_rsa; don't send it anywhere), and you copy your id_rsa.pub (it's one line of text) from the local side to the remote side's /home/user1/.ssh/authorized_keys file (append that one line to it).
SIDENOTE: Have a read on copying ssh keys: http://askubuntu.com/questions/4830/easiest-way-to-copy-ssh-keys-to-another-machine
SIDENOTE: Here are ways to copy ssh keys without the ssh-copy-id command: http://www.commandlinefu.com/commands/view/188/copy-your-ssh-public-key-to-a-server-from-a-machine-that-doesnt-have-ssh-copy-id
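If you copy id_rsa.pub to the remote machine by hand, the append step can be done so that running it twice doesn't add the same key twice. A minimal sketch, meant to be run on the remote side; `install_pubkey` is a hypothetical helper name and the default path assumes the usual ~/.ssh layout.

```shell
#!/bin/sh
# Hypothetical helper, meant to run on the remote side: append the public key
# in PUBKEY_FILE to AUTH_FILE unless that exact line is already there.
# Usage: install_pubkey PUBKEY_FILE [AUTH_FILE]
install_pubkey() {
    pubkey_file=$1
    auth_file=${2:-$HOME/.ssh/authorized_keys}
    if [ ! -s "$pubkey_file" ]; then
        echo "empty or missing key file: $pubkey_file" >&2
        return 1
    fi
    mkdir -p "$(dirname "$auth_file")" || return 1
    touch "$auth_file" || return 1
    # Keep the file private; sshd's default StrictModes setting rejects
    # key files that other users can write to.
    chmod 600 "$auth_file"
    # -x whole line, -F fixed string, -f read the pattern from the key file.
    grep -qxFf "$pubkey_file" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
}

# Example (commented out), after copying id_rsa.pub to the remote /tmp:
# install_pubkey /tmp/id_rsa.pub /home/user1/.ssh/authorized_keys
```

Where ssh-copy-id is available on the local side, it does the copy and the append in one step.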
# rsync -avzP -e "ssh -p 2222 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" /mnt/data.tar.7z user1@remote.server.com:/data/
# rsync -avzP -e "ssh -p 2222 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" /mnt/compressed.tar.7z user1@remote.server.com:/data/
These two options ensure that the pesky question about known_hosts entries being added or being different is skipped. Instead of asking, ssh will just add the entry to the known hosts file. But we don't want to mess up our ~/.ssh/known_hosts for this backup, so we point it at /dev/null instead; that way no harm is done to your actual known_hosts file.
So it basically avoids this message from coming up:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
59:77:e7:46:9d:83:de:b9:7d:0a:0f:e0:04:b0:fc:9a.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /root/.ssh/known_hosts:9
ECDSA host key for [forge.infotinks.com]:2022 has changed and you have requested strict checking.
Host key verification failed.
NOTE: Here is an article about ignoring the authenticity of a host (I go over it in the notes below either way, so this article is optional reading): http://linuxcommando.blogspot.com/2008/10/how-to-disable-ssh-host-key-checking.html. I also have my own article on this: http://www.infotinks.com/ignoring-ssh-authenticity-of-host/. Essentially, you just add -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no to your ssh options, so it looks like this: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no [rest of your ssh options] [user]@[server] [optionally, any commands that you want to run]
NOTE: Of course, the above message isn't that bad. You can fix it by clearing the known_hosts file, or by removing the line it's complaining about (in the above case, line number 9). For scripts that you want to depend on, it's best to just use the two -o options, StrictHostKeyChecking and UserKnownHostsFile, to avoid it.
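Removing just the offending line can be scripted. The sketch below deletes one line by number from a known_hosts file; `drop_known_hosts_line` is a hypothetical helper name, and the default path is an assumption about where your known_hosts file lives.

```shell
#!/bin/sh
# Hypothetical helper: delete line LINE_NO from a known_hosts file.
# Usage: drop_known_hosts_line LINE_NO [FILE]
drop_known_hosts_line() {
    line_no=$1
    file=${2:-$HOME/.ssh/known_hosts}
    # Accept only positive integers.
    case "$line_no" in
        ''|0|*[!0-9]*) echo "line number must be a positive integer" >&2; return 2 ;;
    esac
    tmp=$(mktemp) || return 1
    # Write everything except the offending line to a temp file, then copy it
    # back over the original (cat > keeps the original file's permissions).
    sed "${line_no}d" "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example (commented out), for the warning above:
# drop_known_hosts_line 9 /root/.ssh/known_hosts
```

OpenSSH can also remove entries by host name rather than by line number, with ssh-keygen -R '[forge.infotinks.com]:2022'.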
Straight from the SSH manuals:
https://www.freebsd.org/cgi/man.cgi?query=ssh_config&sektion=5&apropos=0&manpath=FreeBSD+10.2-RELEASE
Another worthy read is https://www.freebsd.org/cgi/man.cgi?query=ssh&sektion=1
UserKnownHostsFile
Specifies one or more files to use for the user host key database, separated by whitespace. The default is ~/.ssh/known_hosts, ~/.ssh/known_hosts2.
StrictHostKeyChecking
If this flag is set to "yes", ssh(1) will never automatically add host keys to the ~/.ssh/known_hosts file, and refuses to connect to hosts whose host key has changed. This provides maximum protection against trojan horse attacks, though it can be annoying when the /etc/ssh/ssh_known_hosts file is poorly maintained or when connections to new hosts are frequently made. This option forces the user to manually add all new hosts. If this flag is set to "no", ssh will automatically add new host keys to the user known hosts files. If this flag is set to "ask", new host keys will be added to the user known host files only after the user has confirmed that is what they really want to do, and ssh will refuse to connect to hosts whose host key has changed. The host keys of known hosts will be verified automatically in all cases. The argument must be "yes", "no", or "ask". The default is "ask".
Another option besides getting ssh keys over is to use sshpass.
# apt-get update
# apt-get install sshpass
This is how it works:
# sshpass -p 'user1spassword' ssh -p 22 user1@remote.server.com
That will connect you as user1 to the remote.server.com ssh server without asking for a password.
To avoid the host key question, you can always do this:
# sshpass -p 'user1spassword' ssh -p 2222 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null user1@remote.server.com
sshpass can also be used with rsync.
# sshpass -p 'user1password' rsync -avzP -e "ssh -p 2222 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" /mnt/data.tar.7z user1@remote.server.com:/data/
# sshpass -p 'user1password' rsync -avzP -e "ssh -p 2222 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" /mnt/compressed.tar.7z user1@remote.server.com:/data/
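sshpass can also take the password from the SSHPASS environment variable (its -e option) instead of from -p, which keeps the password out of the typed command itself. A small sketch of that variant; `rsync_with_sshpass` is a hypothetical wrapper name, and port 2222 is carried over from the examples above.

```shell
#!/bin/sh
# Hypothetical wrapper: run rsync through sshpass, reading the password
# from the SSHPASS environment variable (sshpass -e) instead of -p.
# Usage: rsync_with_sshpass SOURCE DESTINATION   (with SSHPASS already set)
rsync_with_sshpass() {
    if [ -z "${SSHPASS:-}" ]; then
        echo "SSHPASS is not set" >&2
        return 2
    fi
    # Make sure the variable is visible to the sshpass child process.
    export SSHPASS
    sshpass -e rsync -avzP -e "ssh -p 2222" "$1" "$2"
}

# Example (commented out), after setting SSHPASS in the environment:
# rsync_with_sshpass /mnt/data.tar.7z user1@remote.server.com:/data/
```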
NOTE: So what happens on the remote side?
Well, rsync connects from the source side via ssh and asks the rsync installed on the destination side (on the remote.server.com ssh server) to start a server process to catch the files. It looks like this with ps:
# ps ax
2882 ? Ss 0:00 rsync --server -vlogDtprze.iLsfx --partial . /data/Main/Phee/PheePicsOrganized.7z
2885 ? S 3:32 rsync --server -vlogDtprze.iLsfx --partial . /data/Main/Phee/PheePicsOrganized.7z
NOTE: There is another type of connection that rsync can make: you manually set up an rsync server on the remote side using the rsyncd.conf configuration file, and rsync then listens on some port waiting for connections. That doesn't use ssh; however, you can still encrypt that data. This method is not covered here.