[ale] Parsing non-XML web pages?
Fletch
fletch at phydeaux.org
Thu Jun 13 11:48:51 EDT 2002
>>>>> "Brian" == Brian J Dowd <bdowd at dentfirst.com> writes:
Brian> Anyone know of any good tools which would help simplify the
Brian> parsing of specific data out of a downloaded html page? I'm
Brian> looking to pull out specific numbers and data out of pages
Brian> which have been wget'ted. This particular wheel *must* have
Brian> been previously invented...
Perl and HTML::TreeBuilder.
http://www.samag.com/documents/s=1272/sam05030008/
The author of that article also has a book on the same topic
coming out RSN (it supposedly has just gone to the printers, so
. . . ). Amazon already has it up for preorder, ISBN 0596001789.
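
For a rough idea of what that looks like, here is a minimal sketch. The
sample HTML, the "price" class name, and the variable names are all made
up for illustration; a real page will need its own look_down criteria.
For a page already saved by wget you would call $tree->parse_file($path)
instead of parse/eof.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample page standing in for a wget'ted file.
my $html = <<'HTML';
<html><body>
<table>
<tr><td class="label">Widgets</td><td class="price">12.50</td></tr>
<tr><td class="label">Gadgets</td><td class="price">7.25</td></tr>
</table>
</body></html>
HTML

# Build the parse tree from the string.
my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;

# In list context, look_down returns every element matching all criteria:
# here, <td> elements whose class attribute is exactly "price".
my @prices = map { $_->as_text }
             $tree->look_down(_tag => 'td', class => 'price');

print join(", ", @prices), "\n";

# Trees contain circular references, so free them explicitly.
$tree->delete;
```

look_down also takes regexes and code refs as criteria, which helps when
the page has no convenient class or id attributes to key on.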
--
Fletch | "If you find my answers frightening, __`'/|
fletch at phydeaux.org | Vincent, you should cease askin' \ o.O'
770 933-0600 x211(w) | scary questions." -- Jules =(___)=
770 294-0820 (m) | U