[ale] best way to copy 3Tb of data
Todor Fassl
fassl.tod at gmail.com
Tue Oct 27 09:33:37 EDT 2015
One of the researchers I support wants to back up 3T of data to his space
on our NAS. The data is on an HPC cluster on another network. It's not
an on-going backup. He just needs to save it to our NAS while the HPC
cluster is rebuilt. Then he'll need to copy it right back.
There is a very stable 1G connection between the 2 networks. We have
plenty of space on our NAS. What is the best way to do the copy?
Ideally, it seems we'd want both the ability to restart the copy
if it fails part way through and to end up with a compressed archive
like a tarball. Googling around tends to suggest that it's either rsync
or tar. But with rsync, you wouldn't end up with a tarball. And with
tar, you can't restart it in the middle. Any other ideas?
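One possible compromise, assuming there is enough scratch space on the HPC
side to hold the archive (roughly another 3T, less whatever compression
buys): write the tarball locally first, then let rsync move that single
file, since an interrupted rsync can be re-run and pick up where it left
off. A rough sketch, with made-up paths:

# write the archive to local scratch first (path is hypothetical)
tar zcf /scratch/backupfile.tgz /datadirectory
# --partial keeps a partially transferred file on failure, and
# --append-verify lets a re-run resume it after checking what's already there
rsync -av --partial --append-verify /scratch/backupfile.tgz user@backup.server:/nas/backups/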
Since the network connection is very stable, I am thinking of suggesting
tar.
tar zcvf - /datadirectory | ssh user@backup.server "cat > backupfile.tgz"
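If we go that route, the pipe itself can't be restarted, so it's probably
worth grabbing a checksum while the data streams past and comparing it on
the NAS afterwards, rather than re-reading 3T later. Untested sketch,
assuming bash (for the process substitution) on the sending side:

tar zcf - /datadirectory | tee >(sha256sum > backupfile.tgz.sha256) \
    | ssh user@backup.server "cat > backupfile.tgz"
# then compare against the local .sha256
ssh user@backup.server "sha256sum backupfile.tgz"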
If the researcher would prefer his data to be copied to our NAS as
regular files, just use rsync with compression. We don't have an rsync
server that is accessible to the outside world. He could use rsync over
ssh, but I could set up an rsync server if it would be worthwhile.
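For the regular-files route, something like this would do it and can simply
be re-run if the link drops (the directory names are just placeholders):

# -a preserves permissions/times, -z compresses over the wire,
# --partial keeps partially transferred files so a re-run resumes faster
rsync -avz --partial /datadirectory/ user@backup.server:/nas/researcher_backup/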
Ideas? Suggestions?
He is going to need to copy the data back in a few weeks. It might even
be worthwhile to send it via tar without uncompressing/unarchiving it on
the receiving end.
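Copying it back would then just be the same pipe in reverse, assuming the
archive was left as a single .tgz on the NAS (the restore path here is
hypothetical):

ssh user@backup.server "cat backupfile.tgz" | tar zxf - -C /restore/target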