[ale] Grabbing a dynamic website automatically?

johncole at mindspring.com
Fri Aug 23 00:15:48 EDT 2002


Howdy!

Yes, but the problem is that the website changes every day, and I have to
log into an HTTPS site. Then I have to go through a couple of clicks/menus
in order to get to the page I need.
Otherwise, this would work.

I did look over what someone else did for cookie-based wget/curl over
HTTPS, but I don't see anything there about timed access, logging in, and
going through a few pages before I get to the content I need. Something
along the lines of the sketch below is what I have in mind.
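
In case it helps anyone searching the archives later, here's roughly what
I'm picturing with curl's cookie jar. All of the URLs, form field names,
and paths below are made up; the real ones would have to come out of the
HTML source of the site's actual login form:

#!/bin/sh
# Rough sketch of a scripted HTTPS login using curl's cookie jar.
# Every site-specific detail here (URLs, form field names, file
# paths) is a placeholder.

JAR="$HOME/.sitecookies"

# Step 1: post the login form over HTTPS, saving the session
# cookie the server hands back (-c writes the cookie jar).
curl -s -c "$JAR" \
     -d "username=jdoe&password=secret" \
     https://www.example.com/login.cgi > /dev/null

# Step 2: walk through the intermediate menu pages, sending the
# cookie back each time (-b) and saving any new ones (-c).
curl -s -b "$JAR" -c "$JAR" https://www.example.com/menu1.cgi > /dev/null
curl -s -b "$JAR" -c "$JAR" https://www.example.com/menu2.cgi > /dev/null

# Step 3: fetch the one page I actually want.
curl -s -b "$JAR" https://www.example.com/report.cgi > "$HOME/daily.html"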

Thanks anyway for the ideas, everyone!

Thanks,
John


>At 08:50 AM 08/22/2002 -0400, you wrote:
>>Run a cronjob with Links outputting the page to a text file?
>>
>>Something like: "links -dump https://www.foo.bar/page.pl > ~/daily" done
>>at 0200, perhaps?
>>
>>-- 
>>Christopher R. Curzio     |  Quantum materiae materietur marmota monax
>>http://www.accipiter.org  |  si marmota monax materiam possit materiari?
>>:wq!
>>
>>Thus Spake <johncole at mindspring.com>:
>>Thu, 22 Aug 2002 08:31:36 -0400
>>
>>
>>> Howdy all!
>>> 
>>> What would be the best way to grab the data off of a website that is
>>> dynamic, HTTPS, and has cookies enabled?  I'm trying to capture a
>>> single page every day from a particular website automatically.
>>> 
>>> (in particular I'm using Redhat 7.2)
>>> 
>>> I need the page back in text format, preferably (or I can convert it
>>> to text later as needed for insertion into a database).
>>> 
>>> Thanks,
>>> John
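
P.S. Putting the script above together with Christopher's cron idea, a
crontab entry along these lines ought to handle the nightly fetch and the
conversion to text (assuming the script is saved as ~/bin/grab-page.sh;
links will render a saved local file just like a URL):

# fetch at 0200, then dump the saved page to plain text
0 2 * * * $HOME/bin/grab-page.sh && links -dump $HOME/daily.html > $HOME/daily.txt
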
Paypal membership: free 

Donation to Freenet: $20 

Never having to answer the question "Daddy, where were you when they took
freedom of the press away from the Internet?": Priceless. 

http://www.freenetproject.org/index.php?page=donations
