[ale] wget

Geoffrey esoteric at 3times25.net
Mon Jan 7 21:04:06 EST 2002


Benjamin Dixon wrote:
> 
> Not really. The idea behind wget is to search a page for links,
> follow each link, back to step 1, ad infinitum. Web servers aren't set up
> so you can grab just any old data on them (and rightly so). You might,
> however, be able to accomplish something similar using FTP, scp, or perl
> (or any combination of the three ;)).

Duh, I dare say I've done it before.  Okay, so I was braindead for a
while there...  ncftp's 'get -R' works very nicely and has in the
past.

Trying to use a new tool; problem is, I needed a wrench and was using a
hammer...
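
For anyone who hits this thread later: wget can only follow what the
server actually links to, so even the depth-limited crawl Christopher
described, something like

wget -r -l 2 -np http://www.foo.net/user-dir/

(-np / --no-parent keeps it from climbing out of user-dir and off into
ad land), still never sees the two directories that index.html doesn't
link to.  ncftp doesn't have that problem, since FTP lets you list the
directory itself.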

Thanks.

> 
> Ben
> 
> On Mon, 7 Jan 2002, Geoffrey wrote:
> 
> > Benjamin Dixon wrote:
> > >
> > > Are the other two directories linked from the page you're initially
> > > grabbing?
> >
> > No.  So that is the key I guess.  Is there a way to retrieve such a file
> > tree?
> >
> > >
> > > Ben
> > >
> > > On Mon, 7 Jan 2002, Geoffrey wrote:
> > >
> > > > Christopher Bergeron wrote:
> > > > >
> > > > > wget -r -l0 http://url.com/
> > > >
> > > > I've tried the following:
> > > >
> > > > wget -r -l0 http://url.com/user/
> > > >
> > > > Under the user directory are three directories and an index.html file.
> > > >
> > > > One of the directories and the index.html file are retrieved, but not
> > > > the other two directories.  What gives?
> > > >
> > > > >
> > > > > Be careful though: this will traverse _every_ link, including ads and
> > > > > all their links.  You've been warned, but hopefully this will help.  I'm
> > > > > certain the -l flag is what you want, and you can set different levels
> > > > > for its "recursivity" (if that's a word).  Check the man page for
> > > > > details; look for " -l ".
> > > > >
> > > > > -CB
> > > > >
> > > > > -----Original Message-----
> > > > > From: esoteric at 3times25.net [mailto:esoteric at 3times25.net]
> > > > > Sent: Monday, January 07, 2002 7:58 PM
> > > > > To: ALE
> > > > > Subject: [ale] wget
> > > > >
> > > > > I'm attempting to recursively retrieve a web site using wget.  My
> > > > > expectation is that it would travel down all the directory trees and
> > > > > get every file.  I've tried various options, but none seem to do what I
> > > > > would like.  Clues would be helpful.
> > > > >
> > > > > For example if I execute the following:
> > > > > wget -m www.foo.net/user-dir/
> > > > >
> > > > > I was expecting everything under user-dir (directories and their
> > > > > contents) would be retrieved.  As the -m option includes '-l inf', my
> > > > > expectation was that it would traverse
> > > > > www.foo.net/user-dir/dir0/...../dirN/
> > > > >
> > > > > Suggestions?
> 
> Today's Random Quote--------------------------------------
> 
>  43rd Law of Computing: Anything that can go wr .signature:
> Segmentation violation -- Core dumped
> 
> -----------------------------------------------------------
> 

--
Until later: Geoffrey		esoteric at 3times25.net

"...the system (Microsoft passport) carries significant risks to users
that
are not made adequately clear in the technical documentation available."
- David P. Kormann and Aviel D. Rubin, AT&T Labs - Research
- http://www.avirubin.com/passport.html

---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be 
sent to listmaster at ale dot org.