[ale] More perl than linux related question
James CE Johnson
jcej at tragus.org
Mon Jun 18 17:42:30 EDT 2001
Mel Burslan wrote:
>
>Well, I should have used "web application programming" instead of perl
>but I think this does suffice as well.
>
>My question is :
>
>where can I find a perl (or any other reasonably easy to implement web
>programming language) script which can send an http://www.blah-blah.com
>query to a site and store the incoming response page to a file for
>further processing ? Once upon a time I asked the same question on a
>perl message board and someone sent me a link to a perl library site but
>I was not able to figure out the answer.
>
The answer to your specific question is taken care of in three lines:
local $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new('GET',"somePage");
my $response = $ua->request($request,"localCopyOfSomePage");
Below is a quick script that you I use to suck images off of a
web-gallery. Given a list of URLs it will attempt to save any images by
searching for <a href="http://something/foo.(gif|jpg)">
YMMV
#!/usr/bin/perl
use File::Basename;
use File::Path;
use LWP::UserAgent;
use HTML::Parser;
use HTTP::Request;
use HTTP::Response;
select(STDOUT); $| = 1;
local $ua = LWP::UserAgent->new;
for my $page (@ARGV) {
(local $dir = $page) =~ s,/[^/]+$,,;
print STDERR "$page\n";
# Fetch $page and store into the file 'tmp'
my $request = HTTP::Request->new('GET',$page);
my $response = $ua->request($request,'tmp');
my $parser = HTML::Parser->new(
api_version => 3,
start_h => [\&start, "tagname, attr"],
);
$parser->parse_file('tmp');
print "\n";
}
unlink('tmp');
sub start
{
my $tag = shift;
my $attr = shift;
return if( $tag ne 'a' );
my $href = $attr->{href};
# print ".";
return if( $href !~ m/jpg$/i && $href !~ m/gif$/i );
return if( -s $href );
my $p = $href =~ m,^http://, ? $href : "$dir/$href";
my $f = basename($p);
return if( -f $f );
print STDERR " $p\n";
my $request = HTTP::Request->new('GET',$p);
my $response = $ua->request($request,"$f");
sleep(1);
}
--
To unsubscribe: mail majordomo at ale.org with "unsubscribe ale" in message body.
More information about the Ale
mailing list