[ale] best way to copy 3Tb of data
DJ-Pfulio
djpfulio at jdpfu.com
Wed Oct 28 14:43:07 EDT 2015
It is pretty old.
I've just taken to using arcfour encryption for local xfers rather than trying
to swap out one-off code.
Call me lazy. I'm willing to manually maintain 1-2 programs per box, but only
if those are THE reason for the box to exist. Everything else **must** come
from the distro.
Call me lazy, again.
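For reference, the cipher swap described above is just a per-host setting on OpenSSH builds that still ship arcfour (it was removed from later releases); a sketch of the config, with a hypothetical host alias:

```
# ~/.ssh/config -- arcfour only exists on older OpenSSH builds
Host nas-local              # hypothetical LAN host alias
    Ciphers arcfour         # fast RC4 stream cipher; weak, LAN-only
    Compression no          # compression rarely helps on a 1G LAN
```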
On 10/28/2015 01:24 PM, Jim Kinney wrote:
> Holy Crap!! That's a huge performance hole! Many, many thanks for the
> info on this. Now to push upstream builders of ssh on CentOS to make
> sure that patch is included.
> On Wed, 2015-10-28 at 13:15 -0400, Brian Mathis wrote:
>> The default SSH implementation does have intrinsic performance
>> limits, as outlined here:
>> http://www.psc.edu/index.php/hpn-ssh
>> so not using SSH might help with the speed, as long as the loss of
>> security can be tolerated.
>>
>>
>> ~ Brian Mathis
>> @orev
>>
>>
>> On Tue, Oct 27, 2015 at 11:06 AM, Jim Kinney <jim.kinney at gmail.com>
>> wrote:
>>> The slowdown is the encrypt/decrypt. The non-server method relies on
>>> ssh for transport. You can also use rsh for no security at all and
>>> it will be faster. By using the rsync server, you drop the ssh
>>> security, so if the user must enter credentials to access the NAS,
>>> you might want to double-check whether the rsync credentials are sent
>>> as plain text in the same way as ftp.
>>> On Oct 27, 2015 10:49 AM, "Todor Fassl" <fassl.tod at gmail.com>
>>> wrote:
>>>> I know you don't need to have an rsync server to copy files via
>>>> rsync but from what I've read, rsync protocol is way fater than
>>>> ssh. And you have to have an rsync server to use the rsync
>>>> protocol, right?
>>>>
>>>> On 10/27/2015 08:46 AM, Jim Kinney wrote:
>>>>> Rsync doesn't require an rsync server. It provides a solid
>>>>> backup. Rsync it back and it's all golden.
>>>>>
>>>>> Tarball will need enough space to be built or will need to be
>>>>> built 'over the wire' using a tar||tar process. Second optional.
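The tar||tar "over the wire" pipe mentioned above is usually spelled as a tar-create piped into a tar-extract on the far side (remote host and paths below are hypothetical); the same pipe run locally shows the mechanics without ssh:

```shell
# Over the wire (hypothetical host and paths):
#   tar cf - -C /datadirectory . | ssh user@nas "tar xf - -C /backup"

# Same mechanics, locally: create on stdin, extract from stdout.
mkdir -p /tmp/tardemo/src /tmp/tardemo/dst
echo "payload" > /tmp/tardemo/src/file.txt
tar cf - -C /tmp/tardemo/src . | tar xf - -C /tmp/tardemo/dst
cat /tmp/tardemo/dst/file.txt
```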
>>>>>
>>>>> Tar is faster but rsync is easier.
>>>>>
>>>>> A 4TB external hard drive and sneaker net also works and provides
>>>>> verifiable copies. Rsync to a local drive is fast, especially with
>>>>> an external SATA port.
>>>>>
>>>>> On Oct 27, 2015 9:37 AM, "Todor Fassl" <fassl.tod at gmail.com>
>>>>> wrote:
>>>>>
>>>>> One of the researchers I support wants to backup 3T of data to his
>>>>> space on our NAS. The data is on an HPC cluster on another network.
>>>>> It's not an on-going backup. He just needs to save it to our NAS
>>>>> while the HPC cluster is rebuilt. Then he'll need to copy it right
>>>>> back.
>>>>>
>>>>> There is a very stable 1G connection between the 2 networks. We
>>>>> have plenty of space on our NAS. What is the best way to do the
>>>>> copy? Ideally, it seems we'd want both the ability to restart the
>>>>> copy if it fails part way through and to end up with a compressed
>>>>> archive like a tarball. Googling around tends to suggest that it's
>>>>> either rsync or tar. But with rsync, you wouldn't end up with a
>>>>> tarball. And with tar, you can't restart it in the middle. Any
>>>>> other ideas? Since the network connection is very stable, I am
>>>>> thinking of suggesting tar.
>>>>>
>>>>> tar zcvf - /datadirectory | ssh user at backup.server "cat >
>>>>> backupfile.tgz"
>>>>>
>>>>> If the researcher would prefer his data to be copied to our NAS as
>>>>> regular files, just use rsync with compression. We don't have an
>>>>> rsync server that is accessible to the outside world. He could use
>>>>> ssh with rsync but I could set up rsync if it would be worthwhile.
>>>>>
>>>>> Ideas? Suggestions?
>>>>>
>>>>>