[ale] Extraction of address and pages
Dylan Northrup
docx at io.com
Thu Nov 4 18:15:58 EST 2004
A long time ago, (04.11.04), in a galaxy far, far away, Christopher Fowler...:
:=I'm trying to get http://addr:port/page
:=
:=from:
:=
:=GET http://www.google.com/ HTTP/1.1
:=
:=this sucks as it is too greedy. Anyone have a suggestion?
:=$m =~ m/http:\/\/(.+)\/\s+/;
docx> cat foo.pl
#!/usr/bin/perl
use strict;
use warnings;

# Sample inputs, with and without a port, a page, and a protocol.
my @urls = (
    'http://www.google.com:80/gmail HTTP/1.1',
    'http://www.google.com/gmail HTTP/1.1',
    'http://www.google.com/ HTTP/1.1',
    'http://www.google.com/',
    'http://www.google.com:80/',
);

foreach my $url (@urls) {
    # The non-greedy (.*?) stops at the first slash after host[:port].
    my ($host_port, $page, $protocol) = $url =~ m#http://(.*?)/(\S*)\s*(.*)#;
    # Split off the port if there is one; otherwise leave it empty.
    my ($host, $port) = split /:/, $host_port;
    $port = '' unless defined $port;
    print "host: $host\nport: $port\npage: $page\nprotocol: $protocol\n--\n";
}
docx> ./foo.pl
host: www.google.com
port: 80
page: gmail
protocol: HTTP/1.1
--
host: www.google.com
port:
page: gmail
protocol: HTTP/1.1
--
host: www.google.com
port:
page:
protocol: HTTP/1.1
--
host: www.google.com
port:
page:
protocol:
--
host: www.google.com
port: 80
page:
protocol:
--
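If you want to feed it the whole request line from the original question (method included), something along these lines might do. This is only a sketch: parse_request_line is a name I made up for illustration, and it assumes plain http:// URLs with a numeric port and no userinfo, query-string quirks, or IPv6 literals. It returns nothing when the line doesn't match, so a bad line is easy to tell apart from an empty page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_request_line is a hypothetical helper, not part of the script above.
# It expects "METHOD http://host[:port]/page [PROTOCOL]" and returns a hash
# reference, or undef (in scalar context) when the line does not match.
sub parse_request_line {
    my ($line) = @_;
    my ($method, $host, $port, $page, $protocol) = $line =~ m{
        ^ (\S+) \s+          # method, e.g. GET
        http:// ([^/:\s]+)   # host: stops at a colon, slash, or whitespace
        (?: : (\d+) )?       # optional numeric port
        / (\S*)              # page: may be empty
        (?: \s+ (\S+) )?     # optional protocol, e.g. HTTP/1.1
        \s* \z
    }x or return;
    return {
        method   => $method,
        host     => $host,
        port     => defined $port     ? $port     : '',
        page     => $page,
        protocol => defined $protocol ? $protocol : '',
    };
}

my $r = parse_request_line('GET http://www.google.com:80/gmail HTTP/1.1');
print "host: $r->{host}\nport: $r->{port}\npage: $r->{page}\n";
```

Using a negated character class for the host, rather than a non-greedy .*?, means the host can never run past the first colon or slash, so the port falls out of the same match without a separate split.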
--
Dylan Northrup - docx at io.com - http://www.io.com/~docx/
"Harder to work, harder to strive, hard to be glad to be alive, but it's
really worth it if you give it a try." -- Cowboy Mouth, 'Easy'