[ale] parallel processing

hirsch at zapmedia.com
Mon Jan 7 13:01:27 EST 2002


Andy,

I think your best bet is the solution using tar that I sent earlier.
You use one tar to read the files and send the output to stdout, which
is piped to another tar that reads from stdin and writes them out.
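
A minimal sketch of that tar-to-tar pipe, for reference.  The earlier
message is not reproduced here, so the exact command it gave is not
known; the directory names and sample files below are hypothetical and
are created only so the snippet runs on its own.

```shell
#!/bin/sh
# Hypothetical source and destination directories (stand-ins for dirA/dirB).
src=$(mktemp -d)
dst=$(mktemp -d)
echo "sample data" > "$src/file1"
mkdir "$src/sub"
echo "more data" > "$src/sub/file2"

# The first tar reads everything under $src and writes the archive to
# stdout ("-f -").  The second tar reads that archive from stdin and
# unpacks it under $dst.  Each side of the pipe is its own process.
(cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
```

The subshells keep each "cd" local to its own side of the pipe, so the
paths stored in the archive are relative to the source directory.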

This lets you use two processes.  Maybe not ideal, but it does speed
things up if your two directories are on different hard drives.  If
your dirs are all on the same HD I doubt if you can get any speedup no
matter what you do.

In fact, I take back what I wrote above.  I think that two processes
is ideal.  Unless you have unusual hardware, a copy is I/O bound.  You
can't copy faster than your disk can read.  So having one tar
dedicated to reading from the first disk, and one tar dedicated to
writing to the second disk is the best you can hope for.
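
On the "control point" concern in the quoted message below: background
jobs and a checkpoint are not mutually exclusive, because the shell's
"wait" builtin blocks until a given background job has exited and
returns its exit status.  A rough sketch, using hypothetical temporary
directories and file names; it only illustrates the checkpoint, and
does not change the point above that extra cp processes are unlikely to
beat the disk.

```shell
#!/bin/sh
# Hypothetical directories and files, created so the sketch is runnable.
src=$(mktemp -d)
dst=$(mktemp -d)
for n in 1 2 3; do
    echo "data $n" > "$src/file$n"
done

# Start one background cp per file and remember each process ID.
pids=""
for f in "$src"/*; do
    cp "$f" "$dst/" &
    pids="$pids $!"
done

# Control point: wait for every copy and note whether any of them failed.
failed=0
for pid in $pids; do
    wait "$pid" || failed=1
done

if [ "$failed" -eq 0 ]; then
    echo "all copies finished"
else
    echo "at least one copy failed" >&2
fi
```

Nothing after the wait loop runs until every copy has exited, so the
script can safely proceed from there.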

--Michael

Zyman, Andy writes:
 > Jeff,
 > Thank you for the reply.
 > Yes, if I specify "&" the job will go in the background. So having a couple
 > of background jobs is the answer.
 >  But the reason I was asking this is:
 > 1. I don't know how many files the dir has, so I can't (?) specify 
 > "filesA .... fileB should be copied by this job "
 > "filesC .... fileD should be copied by this job "
 > ....
 > 
 > I was thinking about this situation - cp  ./dirA/*   ./dirB 
 > 
 > The files are big enough to drive me nuts waiting for cp to finish (
 > each about 5GB-50GB x 10-15 files in different dirs)
 > 2. To copy files, I'm creating a file with the locations of these files and
 > running a
 > "while read"
 > loop to copy them one at a time ( which is not efficient :< )
 > I can't really apply "&" here because I need to check that all files are
 > copied before proceeding any further - this means a "control point" in this
 > operation...
 > So I was thinking about something else, but not background....
 > 
 > Thank You
 >  Andy
 > office: 212 849 3543
 > 
 > > -----Original Message-----
 > > From: jeff hubbs [mailto:hbbs at mediaone.net]
 > > Sent: Friday, January 04, 2002 3:45 PM
 > > To: Zyman, Andy
 > > Cc: 'ale at ale.org'
 > > Subject: Re: [ale] parallel processing
 > > 
 > > 
 > > Andy -
 > > 
 > > You can execute each cp command with an ampersand ("&") at the end 
 > > (that's sometimes called "amping off") but unless the files/dirs are 
 > > really big, the cps will finish before you can type the next one - is 
 > > that what you're talking about?
 > > 
 > > - Jeff
 > > 
 > > Zyman, Andy wrote:
 > > 
 > > > Hello,
 > > > I just wonder :
 > > > Let's say in Oracle we have parallel SQL processing - I can specify that
 > > > certain SQL should be processed in parallel.
 > > > What about Unix? Let's say I have a directory on a Sun server with 10 files.
 > > > I want to cp them to a different one. I'm doing cp xxx yyy. Now how will it be
 > > > processed, and how can I see that it will be done in parallel?
 > > > 
 > > > Thank You
 > > >  Andy
 > > > 
 > > > ---
 > > > This message has been sent through the ALE general discussion list.
 > > > See http://www.ale.org/mailing-lists.shtml for more info. 
 > > Problems should be 
 > > > sent to listmaster at ale dot org.



