[ale] best way to copy 3Tb of data

DJ-Pfulio djpfulio at jdpfu.com
Wed Oct 28 14:43:07 EDT 2015


It is pretty old.

I've just taken to using arcfour encryption for local xfers rather than trying
to swap out one-off code.

Call me lazy.  I'm willing to manually maintain 1-2 programs per box, but only
if those are THE reason for the box to exist.  Everything else **must** come
from the distro.

Call me lazy, again.

On 10/28/2015 01:24 PM, Jim Kinney wrote:
> Holy Crap!! That's a huge performance hole! Many, many thanks for the
> info on this. Now to push upstream builders of ssh on CentOS to make
> sure that patch is included.
> On Wed, 2015-10-28 at 13:15 -0400, Brian Mathis wrote:
>> The default SSH implementation does have intrinsic performance
>> limits, as outlined here:
>>     http://www.psc.edu/index.php/hpn-ssh
>> so not using SSH might help with the speed, as long as the loss of
>> security can be tolerated.
>>
>>
>> ~ Brian Mathis
>> @orev
>>
>>
>> On Tue, Oct 27, 2015 at 11:06 AM, Jim Kinney <jim.kinney at gmail.com>
>> wrote:
>>> The slowdown is the encrypt/decrypt. The non-server method relies on
>>> ssh for transport. You can also use rsh for no security at all, and
>>> it will be faster. By using the rsync server, you drop the ssh
>>> security, so if the user must enter credentials to access the NAS,
>>> you might want to double-check whether the rsync credentials are sent
>>> as plain text in the same way as ftp.
>>> On Oct 27, 2015 10:49 AM, "Todor Fassl" <fassl.tod at gmail.com>
>>> wrote:
>>>> I know you don't need to have an rsync server to copy files via
>>>> rsync, but from what I've read, the rsync protocol is way faster
>>>> than ssh. And you have to have an rsync server to use the rsync
>>>> protocol, right?
>>>>
>>>> On 10/27/2015 08:46 AM, Jim Kinney wrote:
>>>>> Rsync doesn't require an rsync server. It provides a solid backup.
>>>>> Rsync it back and it's all golden.
>>>>>
>>>>> Tarball will need enough space to be built or will need to be built
>>>>> 'over the wire' using a tar||tar process. Second optional.
>>>>>
>>>>> Tar is faster but rsync is easier.
>>>>>
>>>>> A 4TB external hard drive and sneaker net also works and provides
>>>>> verifiable copies. Rsync to a local drive is fast, especially with
>>>>> an external SATA port.
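The tar||tar process mentioned above is just two tars joined by a pipe, so no
intermediate tarball is ever staged on either side; host and paths here are
placeholders:

```shell
# Pack on the source, unpack on the destination, ssh in the middle.
# Nothing is written to disk until the far end extracts it.
tar cf - -C /datadirectory . | ssh user@nas.example.com "tar xf - -C /backup"
```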
>>>>>
>>>>> On Oct 27, 2015 9:37 AM, "Todor Fassl" <fassl.tod at gmail.com> wrote:
>>>>>
>>>>>     One of the researchers I support wants to back up 3T of data to
>>>>>     his space on our NAS. The data is on an HPC cluster on another
>>>>>     network. It's not an on-going backup. He just needs to save it
>>>>>     to our NAS while the HPC cluster is rebuilt. Then he'll need to
>>>>>     copy it right back.
>>>>>
>>>>>     There is a very stable 1G connection between the 2 networks. We
>>>>>     have plenty of space on our NAS. What is the best way to do the
>>>>>     copy? Ideally, it seems we'd want both the ability to restart
>>>>>     the copy if it fails part way through and to end up with a
>>>>>     compressed archive like a tarball. Googling around tends to
>>>>>     suggest that it's either rsync or tar. But with rsync, you
>>>>>     wouldn't end up with a tarball. And with tar, you can't restart
>>>>>     it in the middle. Any other ideas?
>>>>>     Since the network connection is very stable, I am thinking of
>>>>>     suggesting tar.
>>>>>
>>>>>     tar zcvf - /datadirectory | ssh user at backup.server "cat >
>>>>>     backupfile.tgz"
>>>>>
>>>>>     If the researcher would prefer his data to be copied to our NAS
>>>>>     as regular files, just use rsync with compression. We don't
>>>>>     have an rsync server that is accessible to the outside world.
>>>>>     He could use ssh with rsync, but I could set up rsync if it
>>>>>     would be worthwhile.
>>>>>
>>>>>     Ideas? Suggestions?
>>>>>
>>>>>


