[ale] wget

Benjamin Dixon beatle at arches.uga.edu
Mon Jan 7 21:00:44 EST 2002



Not really. Recursive wget works by searching a page for links,
following each link, and repeating the process on every page it finds, ad
infinitum. Web servers aren't set up so you can grab just any old data on
them (and rightly so), so a directory that nothing links to is one wget has
no way to find. You might, however, be able to accomplish something similar
using FTP, scp, perl (or any combination of these ;)).
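
To illustrate the point about links, here is a minimal sketch in shell. It
builds a throwaway directory (the names dir0, dir1, dir2 and the contents of
index.html are made up for illustration, not taken from the site in
question) and compares what a link-following crawler could discover from the
page with what actually exists on disk.

```sh
#!/bin/sh
# Hypothetical layout: index.html links to dir0/ only;
# dir1/ and dir2/ exist but nothing links to them.
site=$(mktemp -d)
mkdir "$site/dir0" "$site/dir1" "$site/dir2"
printf '<html><body><a href="dir0/">dir0</a></body></html>\n' > "$site/index.html"

# What a link-following crawler can learn from the page: its href targets.
found=$(grep -o 'href="[^"]*"' "$site/index.html" | sed 's/^href="//; s/"$//')

# What is really there, visible only with filesystem-style access
# (the kind FTP or scp would give you).
actual=$(cd "$site" && echo */)

echo "discoverable via links: $found"
echo "present on disk:        $actual"

rm -rf "$site"
```

Only dir0/ shows up in the first line of output, which matches the symptom
of one directory being retrieved and the other two being skipped. If the same
tree happens to be reachable over FTP, a recursive wget against an ftp:// URL
works from directory listings rather than links; whether that is available
here is an assumption.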

Ben

On Mon, 7 Jan 2002, Geoffrey wrote:

> Benjamin Dixon wrote:
> > 
> > Are the other two directories linked from the page you're initially
> > grabbing?
> 
> No.  So that is the key I guess.  Is there a way to retrieve such a file
> tree?
> 
> > 
> > Ben
> > 
> > On Mon, 7 Jan 2002, Geoffrey wrote:
> > 
> > > Christopher Bergeron wrote:
> > > >
> > > > wget -r -l0 http://url.com/
> > >
> > > I've tried the following:
> > >
> > > wget -r -l0 http://url.com/user/
> > >
> > > under the user directory are three directories and an index.html file.
> > >
> > > one of the directories is retrieved and the index.html file but not the
> > > other two directories.  What gives????
> > >
> > > >
> > > > be careful though, this will traverse _every_ link.  Including ads and all
> > > > their links.  You've been warned, but hopefully this will help.  I'm certain
> > > > the -l flag is what you want, and you can put different levels on its
> > > > "recursivity" (if that's a word).  Check the man page for details, but look
> > > > for " -l "
> > > >
> > > > -CB
> > > >
> > > > -----Original Message-----
> > > > From: esoteric at 3times25.net [mailto:esoteric at 3times25.net]
> > > > Sent: Monday, January 07, 2002 7:58 PM
> > > > To: ALE
> > > > Subject: [ale] wget
> > > >
> > > > I'm attempting to recursively retrieve a web site using wget.  My
> > > > expectation is that it would travel down all the directory trees and
> > > > get every file.  I've tried various options, but none seem to do what I
> > > > would like.  Clues would be helpful.
> > > >
> > > > For example if I execute the following:
> > > > wget -m www.foo.net/user-dir/
> > > >
> > > > I was expecting everything under user-dir (directories and their
> > > > contents) would be retrieved.  As the -m option includes '-l inf' my
> > > > expectation was that it would traverse
> > > > www.foo.net/user-dir/dir0/...../dirN/
> > > >
> > > > Suggestions?
> > > >
> > > > --
> > > > Until later: Geoffrey           esoteric at 3times25.net
> > > >
> > > > "...the system (Microsoft passport) carries significant risks to users
> > > > that are not made adequately clear in the technical documentation
> > > > available."
> > > > - David P. Kormann and Aviel D. Rubin, AT&T Labs - Research
> > > > - http://www.avirubin.com/passport.html
> > > >
> > > > ---
> > > > This message has been sent through the ALE general discussion list.
> > > > See http://www.ale.org/mailing-lists.shtml for more info. Problems should be
> > > > sent to listmaster at ale dot org.
> > >
> > 
> > Today's Random Quote--------------------------------------
> > 
> >  43rd Law of Computing: Anything that can go wr .signature:
> > Segmentation violation -- Core dumped
> > 
> > -----------------------------------------------------------
> 








