[ale] Parsing non-XML web pages?

Fletch fletch at phydeaux.org
Thu Jun 13 11:48:51 EDT 2002


>>>>> "Brian" == Brian J Dowd <bdowd at dentfirst.com> writes:

    Brian> Anyone know of any good tools which would help simplify the
    Brian> parsing of specific data out of a downloaded html page? I'm
    Brian> looking to pull out specific numbers and data out of pages
    Brian> which have been wget'ted. This particular wheel *must* have
    Brian> been previously invented...

        Perl and HTML::TreeBuilder.

http://www.samag.com/documents/s=1272/sam05030008/
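
        For a feel of what that looks like, here is a minimal sketch,
not taken from the article. The sample HTML, the "price" class name,
and the variable names are all made up for illustration; a real page
will need its own look_down criteria.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample page; in practice this would be a file saved by wget.
my $html = <<'HTML';
<html><body>
<table id="prices">
  <tr><td class="item">Widget</td><td class="price">12.50</td></tr>
  <tr><td class="item">Gadget</td><td class="price">7.25</td></tr>
</table>
</body></html>
HTML

# Build the parse tree from the string in one step.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every element that matches
# all of the given criteria (here: <td> elements with class "price").
my @prices = map { $_->as_trimmed_text }
             $tree->look_down(_tag => 'td', class => 'price');

print "$_\n" for @prices;

# The tree holds circular references, so free it explicitly when done.
$tree->delete;
```

        To read a page already saved to disk, swapping the constructor
for HTML::TreeBuilder->new_from_file with the saved file's name should
be all it takes; the look_down call stays the same.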


        The author of that article also has a book on the same topic
coming out RSN (it supposedly has just gone to the printers, so
. . . ).  Amazon already has it up for preorder, ISBN 0596001789.


-- 
Fletch                | "If you find my answers frightening,       __`'/|
fletch at phydeaux.org   |  Vincent, you should cease askin'          \ o.O'
770 933-0600 x211(w)  |  scary questions." -- Jules                =(___)=
770 294-0820 (m)      |                                               U

---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be 
sent to listmaster at ale dot org.





