[ale] best way to copy 3Tb of data

Todor Fassl fassl.tod at gmail.com
Tue Oct 27 11:47:11 EDT 2015


Man, I wish I had your "hand" (Seinfeld reference). I'd get fired if I 
tried that.

We had another guy who kept running his code on like 5 or 6 different 
machines at a time.  I kept trying to steer him toward condor.  He 
insisted condor wouldn't work for him. How can it not?
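
For what it's worth, the kind of thing I keep pointing him at is a small
submit file per batch of runs. A minimal sketch, assuming his code is
wrapped in a script (run_model.sh is a made-up name here) and he wants 6
instances going at once:

cat > sixruns.sub <<'EOF'
# vanilla universe, one job slot per run; $(Process) counts 0 through 5
universe     = vanilla
executable   = run_model.sh
arguments    = $(Process)
output       = run.$(Process).out
error        = run.$(Process).err
log          = runs.log
request_cpus = 1
queue 6
EOF
condor_submit sixruns.sub

Then condor_q shows what is running where, instead of him ssh-ing into 5
or 6 boxes by hand.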

On 10/27/2015 10:38 AM, Jim Kinney wrote:
> I implemented a cron job to delete scratch data created over 30 days
> ago. That didn't go well with the people who were eating up all space
> and not paying for hard drives. So I gave them a way to extend
> particular areas up to 90 days. Day 91 it was deleted. So they wrote a
> script to copy their internet archive around every 2 weeks to keep the
> creation date below the 30 day cut off. So I shrunk the partition of
> /scratch to about 10G larger than was currently in use. He couldn't do
> his runs to graduate in time without cleaning up his mess. It also
> pissed off other people and they yelled at him when I gave my report of
> who the storage hog was.
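>
> Roughly, that kind of cleanup is just a couple of find commands run
> daily from cron. A sketch, assuming the scratch area is mounted at
> /scratch and going by modification time (the path and the 30-day
> cutoff are only the numbers from this thread):
>
> # delete scratch files untouched for 30+ days, then prune now-empty dirs
> find /scratch -xdev -type f -mtime +30 -delete
> find /scratch -xdev -mindepth 1 -type d -empty -delete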
>
> On October 27, 2015 11:24:48 AM EDT, Todor Fassl <fassl.tod at gmail.com>
> wrote:
>
>     I dunno.  First of all, I don't have any details on what's going on on
>     the HPC cluster. All I know is the researcher says he needs to back up
>     his  3T of scratch data because they are telling him it will be erased
>     when they upgrade something or other. Also, I don't know how you can
>     have 3T of scratch data or why, if it's scratch data, it can't just be
>     deleted. I come across this all the time though. Researchers pretty
>     regularly generate 1T+ of what they insist is scratch data.
>
>     In fact, I've had this discussion with this very same researcher. He's
>     not the only one who does this, but he happens to be the guy I last
>     questioned about it. You know this "scratch" space isn't backed up or
>     anything. If the NAS burns up or if you type in the wrong rm command,
>     it's gone. No problem, it's just scratch data. Well, then how come I
>     can't just delete it when I want to re-do the network storage
>     device?
>
>     They get mad if you push them too hard.
>
>
>
>
>
>     On 10/27/2015 09:45 AM, Jim Kinney wrote:
>
>         Dumb question: Why is data _stored_ on an HPC cluster? The
>         storage for an HPC should be a separate entity entirely. It's a
>         High Performance cluster, not a Large Storage cluster. Ideally,
>         a complete teardown and rebuild of an HPC should have exactly
>         zero impact on the HPC users' data. Any data kept on the local
>         space of an HPC is purely scratch/temp data and is disposable,
>         with the possible exception of checkpoint data, and even that
>         should be written back to the main storage and deleted once the
>         full run is completed.
>
>         On Tue, 2015-10-27 at 08:33 -0500, Todor Fassl wrote:
>
>             One of the researchers I support wants to back up 3T of
>             data to his space on our NAS. The data is on an HPC cluster
>             on another network. It's not an ongoing backup. He just
>             needs to save it to our NAS while the HPC cluster is
>             rebuilt. Then he'll need to copy it right back.
>
>             There is a very stable 1G connection between the 2
>             networks. We have plenty of space on our NAS. What is the
>             best way to do the copy? Ideally, it seems we'd want both
>             the ability to restart the copy if it fails partway through
>             and to end up with a compressed archive like a tarball.
>             Googling around tends to suggest that it's either rsync or
>             tar. But with rsync, you wouldn't end up with a tarball.
>             And with tar, you can't restart it in the middle. Any other
>             ideas? Since the network connection is very stable, I am
>             thinking of suggesting tar.
>
>             tar zcvf - /datadirectory | ssh user@backup.server "cat > backupfile.tgz"
>
>             If the researcher would prefer his data to be copied to our
>             NAS as regular files, just use rsync with compression. We
>             don't have an rsync server that is accessible to the
>             outside world. He could use rsync over ssh, but I could set
>             up an rsync daemon if it would be worthwhile.
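>
>             A sketch of the rsync-over-ssh route, assuming his data
>             sits in /datadirectory on the cluster and we give him a
>             directory on the NAS (the destination path here is made
>             up):
>
>             rsync -avz --partial --progress /datadirectory/ user@backup.server:/backups/hpc-scratch/
>
>             Re-running the exact same command after a failure resumes
>             the job: files already transferred are skipped, and
>             --partial keeps any half-copied file so the bytes that
>             already made it across don't have to be sent again.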
>
>             Ideas? Suggestions?
>
>             He is going to need to copy the data back in a few weeks.
>             It might even be worthwhile to send it via tar without
>             uncompressing/unarchiving it on the receiving end.
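>
>             For the trip back, the tarball could just be streamed back
>             and unpacked in one pass. A sketch, reusing the
>             backup.server and backupfile.tgz names from above (tar
>             stripped the leading / when the archive was made, so
>             extracting under / lands everything back in
>             /datadirectory):
>
>             ssh user@backup.server "cat backupfile.tgz" | tar zxvf - -C /
>
>             If he would rather keep it as a tarball on the cluster too,
>             a plain scp of backupfile.tgz does the job.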
>
>
>
>
>         --
>         James P. Kinney III
>
>         Every time you stop a school, you will have to build a jail.
>         What you
>         gain at one end you lose at the other. It's like feeding a dog
>         on his
>         own tail. It won't fatten the dog.
>         - Speech 11/23/1900 Mark Twain
>
>         http://heretothereideas.blogspot.com/
>
>
>
>
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
Todd

