Grabbing Info off a webpage with PERL?

d4mo · Aug 2, 2008

I want to write a Perl script that grabs the current price of oil from this site:
http://www.bloomberg.com/energy. What would be the best way to do that in Perl?

Crusty · Aug 2, 2008

Use libcurl to make your HTTP request; from there it's just a simple text search with regular expressions.
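
A minimal sketch of that fetch-then-regex approach, under some loud assumptions: it uses HTTP::Tiny (a core module since Perl 5.14) in place of libcurl bindings, the sample markup and the 'Nymex Crude Future' label are made up for illustration, and extract_price and fetch_page are hypothetical helper names. The real page will almost certainly need a different pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;    # core since Perl 5.14; stands in for libcurl here

# Hypothetical helper: return the first number that follows a label in a
# chunk of HTML, skipping over any tags and whitespace in between.
sub extract_price {
    my ($html, $label) = @_;
    if ($html =~ /\Q$label\E(?:\s|<[^>]*>)*\$?([0-9]+(?:\.[0-9]+)?)/) {
        return $1;
    }
    return;    # undef in scalar context when nothing matched
}

# Fetch a URL and return its body, or die with the HTTP status.
sub fetch_page {
    my ($url) = @_;
    my $response = HTTP::Tiny->new(timeout => 10)->get($url);
    die "Fetch failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return $response->{content};
}

# Made-up sample markup so the pattern can be tried without a network.
my $sample = '<tr><td>Nymex Crude Future</td><td class="px">125.10</td></tr>';
my $sample_price = extract_price($sample, 'Nymex Crude Future');
print "Sample price: $sample_price\n";

# Only hit the network when a URL is given on the command line.
if (@ARGV) {
    my $price = extract_price(fetch_page($ARGV[0]), 'Nymex Crude Future');
    print defined $price ? "Live price: $price\n" : "Pattern did not match\n";
}
```

Keeping the pattern in its own sub means it can be tested against saved copies of the page, which helps when the site's layout changes and the regex silently stops matching.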

QuixoticOne · Aug 3, 2008

http://search.cpan.org/dist/WWW-Mechanize/

d4mo · Aug 4, 2008

Originally posted by: Crusty
Use libcurl to make your HTTP request; from there it's just a simple text search with regular expressions.

That's what I thought of, but it seems really messy.

QuixoticOne, I'm not very into Perl (or any programming, for that matter). Could you explain what that Perl module you linked to does, exactly?

QuixoticOne · Aug 4, 2008

Sorry for being unclear. It is a Perl-based library that automates client-side access to WWW/HTTP servers,
so that you can fetch and process page content automatically. It easily handles most additional requirements,
like submitting authentication / login information or navigating a series of dynamic or sequential pages to get to the content.
The links below illustrate its use better than I could easily summarize.

Using something less complex like wget / curl / libcurl is probably better if you have a simple transaction, i.e.
just retrieving a static page from a known URL. But if you have to do conditional processing, deal with multiple
forms to enter data, or pull elements that are nested deeply in the page, Mechanize is often a more
general / powerful solution.

If your needs are really simple, look at the various Firefox browser extensions for web automation or content scraping;
if they work well for you, they involve no programming / scripting at all.

http://search.cpan.org/~petdan...4/lib/WWW/Mechanize.pm
http://www.perl.com/pub/a/2003/01/22/mechanize.html
http://www.perl.com/lpt/a/705
http://www.developer.com/lang/other/article.php/3454041

WWW::Mechanize, or Mech for short, helps you automate interaction with a website. It supports performing a sequence of page fetches including following links and submitting forms. Each fetched page is parsed and its links and forms are extracted. A link or a form can be selected, form fields can be filled and the next page can be fetched. Mech also stores a history of the URLs you've visited, which can be queried and revisited.
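
As a rough sketch of the fetch-and-follow sequence described above, assuming WWW::Mechanize is installed from CPAN: the script below loads a start page, follows the first link whose text matches a pattern, and looks for a number after a label. The sub names fetch_via_link and first_number_after, the command-line arguments, and the idea that the price sits one link away from the start page are all assumptions for illustration, not anything the site is known to require.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fetch a start page, follow the first link whose text matches a pattern,
# and return the HTML of the page that link leads to.
sub fetch_via_link {
    my ($start_url, $link_pattern) = @_;
    require WWW::Mechanize;    # CPAN module, loaded only when actually used
    my $mech = WWW::Mechanize->new(autocheck => 1);    # die on HTTP errors
    $mech->get($start_url);
    # find_link returns a link object, or undef when nothing matches
    my $link = $mech->find_link(text_regex => $link_pattern)
        or die "No link text matching $link_pattern at $start_url\n";
    $mech->get($link->url_abs);
    return $mech->content;
}

# Hypothetical helper: first decimal number appearing after a label.
sub first_number_after {
    my ($html, $label) = @_;
    return $html =~ /\Q$label\E\D*?([0-9]+\.[0-9]+)/s ? $1 : undef;
}

if (@ARGV == 3) {
    my ($url, $link_text, $label) = @ARGV;
    my $html  = fetch_via_link($url, qr/\Q$link_text\E/i);
    my $price = first_number_after($html, $label);
    print defined $price
        ? "$label: $price\n"
        : "No number found after '$label'\n";
}
else {
    print "usage: $0 START_URL LINK_TEXT LABEL\n";
}
```

If the content sat behind a login or search form instead of a plain link, Mech's submit_form method would take the place of the find_link step; for a single static page, this is more machinery than the job needs.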
