Grabbing Info off a webpage with PERL?

Use libcurl to make your http request, and then from there it's just a simple text search with regular expressions.
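For what it's worth, here is a minimal sketch of the fetch-then-regex approach in Perl. It swaps in the core HTTP::Tiny module for libcurl (an assumption, purely to keep the example dependency-free), and the names fetch_page, extract_title, and extract_hrefs are made up for illustration. Regexes like these only cope with simple, well-formed markup.

```perl
use strict;
use warnings;
use HTTP::Tiny;    # core since Perl 5.14; stands in for libcurl here (assumption)

# Fetch a page and return its body, or die with the HTTP status.
sub fetch_page {
    my ($url) = @_;
    my $res = HTTP::Tiny->new( timeout => 10 )->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n" unless $res->{success};
    return $res->{content};
}

# Return the text of the <title> element, or undef if there is none.
sub extract_title {
    my ($html) = @_;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Return every href value found in an <a> tag (quoted attributes only).
# Call this in list context to get all matches.
sub extract_hrefs {
    my ($html) = @_;
    return $html =~ m{<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']}gi;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $html = fetch_page( $ARGV[0] );
    print "Title: ", extract_title($html) // '(none)', "\n";
    print "Link:  $_\n" for extract_hrefs($html);
}
```

Saved as, say, grab.pl (hypothetical filename), it would be run as `perl grab.pl <URL>`. Anything beyond flat pages like this is where the regex approach starts to get messy.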
 
Originally posted by: Crusty
Use libcurl to make your http request, and then from there it's just a simple text search with regular expressions.

That's what I thought of but that seems really messy.

QuixoticOne, I'm not very into Perl (or any programming, for that matter). Could you explain what that Perl module you linked to does, exactly?
 
Sorry for being unclear; it is a Perl-based library that automates client-side access to WWW/HTTP servers,
so that you can fetch and process content from pages automatically. It easily handles most additional requirements,
like submitting authentication / login information or navigating a series of dynamic or sequential pages to get to the content.
The links below illustrate its use better than I could easily summarize.
Using something less complex like wget / curl / libcurl is probably better if you have a simple transaction, i.e.
just retrieving a static page from a known URL. But if you have to do conditional processing, deal with multiple
forms to enter data, or pull elements that are nested deeply in the page, Mechanize is often the more
general / powerful solution.

If your needs are really simple, look at the various Firefox browser extensions for web automation or content scraping;
if they work well for you, they involve no programming / scripting at all.

http://search.cpan.org/~petdan...4/lib/WWW/Mechanize.pm
http://www.perl.com/pub/a/2003/01/22/mechanize.html
http://www.perl.com/lpt/a/705
http://www.developer.com/lang/other/article.php/3454041

WWW::Mechanize, or Mech for short, helps you automate interaction with a website. It supports performing a sequence of page fetches including following links and submitting forms. Each fetched page is parsed and its links and forms are extracted. A link or a form can be selected, form fields can be filled and the next page can be fetched. Mech also stores a history of the URLs you've visited, which can be queried and revisited.
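As a rough illustration of that description, the sketch below fetches a page, picks out links by their text, and submits a form. It assumes WWW::Mechanize is installed from CPAN; the helper names matching_links and submit_and_title, and whatever URL, link text, and form field you pass in, are hypothetical and not taken from any particular site.

```perl
use strict;
use warnings;
use WWW::Mechanize;    # from CPAN, not part of core Perl

# Fetch $url and return [link text, absolute URL] pairs for every link
# whose text matches $pattern.
sub matching_links {
    my ( $url, $pattern ) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );    # die on HTTP errors
    $mech->get($url);
    return map { [ $_->text, $_->url_abs->as_string ] }
        $mech->find_all_links( text_regex => $pattern );
}

# Fill in and submit the first form that has a field named $field
# (a hypothetical field name), then return the resulting page's title.
sub submit_and_title {
    my ( $url, $field, $value ) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    $mech->submit_form( with_fields => { $field => $value } );
    return $mech->title;
}

# Usage: perl script.pl URL WORD  (lists links whose text contains WORD)
if ( @ARGV == 2 ) {
    my ( $url, $word ) = @ARGV;
    printf "%s -> %s\n", @$_ for matching_links( $url, qr/\Q$word\E/i );
}
```

Because Mech parses each fetched page for you, there is no hand-written regex over raw HTML here, which is the main thing it buys you over the plain fetch-and-search approach.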

 