
Webscraping question...best programs/languages?

scootermaster

Platinum Member
So there's a site I'd like to scrape, so I can put the information in my own database to get to it in different ways.

It should be relatively simple to do, given that the data is all nicely lined up in <table>/<tr>/<td> tags (although some of the columns have variable widths).

Just wondering if any of you out there in AT land have had any luck with specific programs for this task. Just so you know, I wrote a scraper myself in Perl, and I could probably do that again, but it was a while ago and I don't remember where I put the code for it (i.e., I'm not a newb; I can program in just about any language. Except maybe ML. I forget all my ML).

The upshot is...if I can get this data I can launch a site that I think a lot of people will enjoy. Or at least a lot of sports fans.

Thanks!
 
You could use ColdFusion to scrape the data. It's very easy to learn, free for development use (you probably won't be able to host the site with it, though), and there are tons of great screen scraping methods.
 
Originally posted by: Furor
You could use ColdFusion to scrape the data. It's very easy to learn, free for development use (you probably won't be able to host the site with it, though), and there are tons of great screen scraping methods.

That's probably not any easier than using Perl or even PHP. What I'm doing is literally the simplest form of scraping: there are a number of pages (which I could probably even download first, further simplifying the process) that are not much more than HTML tables. I want to convert that data to a MySQL database. Honestly, it could be done using shell scripts, grep, and dumping to .sql files. I just figured that if anyone had experience with any specific programs (MANY are made to do just this), I might pick y'all's brains.

Does that clarify things?
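
For reference, here is a rough sketch of the "HTML table to .sql file" idea in Python, using only the standard library. It is just one way to do it, not a recommendation of a specific program. The names `TableRows`, `table_to_sql`, and the default table name `scraped` are made up for illustration, and it assumes the MySQL table already exists with one column per <td>.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _end_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        # Opening a new row or cell also closes the previous one, so pages
        # that omit </td> or </tr> are still handled.
        if tag == "tr":
            self._end_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._end_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag == "tr":
            self._end_row()


def sql_quote(value):
    # Escape backslashes and single quotes for a MySQL string literal
    # (assumes the server's default SQL mode).
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def table_to_sql(html, table="scraped"):
    """Return one INSERT statement per table row found in the HTML."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [
        f"INSERT INTO {table} VALUES ({', '.join(sql_quote(c) for c in row)});"
        for row in parser.rows
    ]
```

Writing the returned lines to a file gives something that can be fed to the mysql client. Note that a header row made of <th> cells comes out as an INSERT too, so it would need to be dropped by hand, and for anything beyond a one-off dump a database driver with parameterized queries is the safer route.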
 