Webscraping question...best programs/languages?

scootermaster

Platinum Member
Nov 29, 2005
2,411
0
0
There's a site I'd like to scrape so I can put the information in my own database and query it in different ways.

It should be relatively simple to do, given that the data is all nicely lined up in <table>/<tr>/<td> tags (although some of the columns are variable widths).

Just wondering if any of you out there in AT land have had any luck with specific programs for this task. Just so you know, I wrote a scraper myself in Perl once, and I could probably do that again, but it's been a while and I don't remember where I put the code for it. (I.e., I'm not a newb; I can program in just about any language. Except maybe ML. I forget all my ML.)

The upshot is...if I can get this data I can launch a site that I think a lot of people will enjoy. Or at least a lot of sports fans.

Thanks!
 

Furor

Golden Member
Mar 31, 2001
1,895
0
0
You could use ColdFusion to scrape the data. It's very easy to learn, free for development use (though you probably won't be able to host the site with it), and there are tons of great screen-scraping methods.
 

scootermaster


That's probably not any easier than using Perl or even PHP. What I'm doing is literally the simplest form of scraping: there are a number of pages (which I could probably even download, further simplifying the process) that are not much more than HTML tables. I want to convert that data into a MySQL database. Honestly, it could be done using shell scripts, grep, and dumping to .sql files. I just figured that if anyone had experience with any specific programs (MANY are made to do just this), I might pick y'all's brains.

Does that clarify things?
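
To make it concrete, here's a rough sketch of the "HTML tables in, .sql file out" idea, using nothing but Python's standard library (html.parser). It's only an illustration under some assumptions: the table name `standings` and the columns `team` and `wins` are made up, the first row is assumed to be a header, and rows that come up short are padded with NULL as one way of coping with uneven columns. The quoting is good enough for a one-off dump with MySQL's default settings; if you load through a database driver instead, parameterized queries are the safer route.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the <tr> currently open, if any
        self._cell = None   # text fragments of the cell currently open

    def _flush_cell(self):
        # Close the open cell (if any), collapsing runs of whitespace.
        if self._cell is not None:
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _flush_row(self):
        # Close the open row (if any); rows with no cells are dropped.
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr":
            self._flush_row()


def sql_literal(value):
    """Quote a string for a MySQL dump: escape backslashes, double quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def scrape(html, table, columns, skip_header=True):
    """Return one INSERT statement per data row found in the HTML."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows[1:] if skip_header else parser.rows
    statements = []
    for row in rows:
        values = [sql_literal(cell) for cell in row[:len(columns)]]
        values += ["NULL"] * (len(columns) - len(values))  # pad short rows
        statements.append("INSERT INTO {} ({}) VALUES ({});".format(
            table, ", ".join(columns), ", ".join(values)))
    return statements


if __name__ == "__main__":
    # Hypothetical sample page; the table and column names are placeholders.
    page = ("<table><tr><th>Team</th><th>Wins</th></tr>"
            "<tr><td>O'Neil FC</td><td>12</td></tr></table>")
    print("\n".join(scrape(page, "standings", ["team", "wins"])))
```

If the pages are already downloaded, you could read each file, pass its contents to `scrape`, and redirect the output to a .sql file to load with the mysql client.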