Website scraping question

Sureshot324

Diamond Member
Feb 4, 2003
3,370
0
71
I want to write a scraper for the Battlefield: Bad Company 2 stats page, www.badcompany2.ea.com/leaderboards. Basically I want to make my own personal stats page. The problem is that I can't figure out how to see the correct HTML source.

When you first load the page, it shows the top 20 PS3 users. From there you can switch to PC or Xbox 360, or search for a specific user, which loads the same page plus that user at the top of the list. The problem is that if I do this and then use 'View Source' in Firefox or IE, it just shows the source for the original page with the top 20 PS3 users.

I can't figure out how it's doing this. I can be looking at a page with the top 20 PC or Xbox 360 users, but if I view the source it just shows the PS3 users. Shouldn't it show me the HTML for the page I'm looking at right now?

Any web programming experts out there want to look at this page and figure out what's going on?
 

Sureshot324

I did some reading, and I think it does that because it's an AJAX site: the list is loaded by JavaScript after the original page, so it never shows up in 'View Source'. Still trying to figure out how to parse it.
 

Markbnj

Elite Member, Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
You have to find out which web service it's calling and what arguments it passes, then parse the results.
 

EricMartello

Senior member
Apr 17, 2003
910
0
0
The sample parameters I provided are JSON. (But that is largely irrelevant, considering everything basically gets passed as GET parameters.)

It's not clear what point you're trying to make.

No, it's not irrelevant. If the site returns JSON then you don't need to scrape anything. You just need to parse the JSON which is returned.
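
A minimal sketch of that point, assuming the endpoint really does return JSON. The response shape used here (a `players` array with `name`, `persona`, and `score` fields) and the sample values are invented for illustration; the real field names would have to be read off an actual response.

```javascript
// Hypothetical response body; the real endpoint's field names may differ.
const sampleBody =
  '{"players":[{"name":"ExamplePlayer","persona":"1234","score":100}]}';

// Parse the JSON text and return the entry whose name matches
// (case-insensitively), or null if there is no such entry.
function findPlayer(bodyText, username) {
  const data = JSON.parse(bodyText);
  const players = Array.isArray(data.players) ? data.players : [];
  const wanted = username.toLowerCase();
  return players.find((p) => String(p.name).toLowerCase() === wanted) || null;
}

const player = findPlayer(sampleBody, "exampleplayer");
console.log(player ? player.persona : "not found");
```

No HTML parsing is involved: once the text is through `JSON.parse`, the stats are ordinary object properties.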
 

bhanson

Golden Member
Jan 16, 2004
1,749
0
76
Oh, I see what happened. You quoted me but you meant to quote someone else. Nowhere in my post did I mention scraping or AJAX being needed.
 

Sureshot324

OK, so I am able to get the persona (user ID) from the username by requesting this URL:

"http://www.badcompany2.ea.com/leaderboards/ajax?platform=" . $platform . "&sort=score&start=1&search=" . $username

Once I have the persona I can get a player's stats page from this link:

http://www.badcompany2.ea.com/stats?persona=234354084&platform=pc
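
For reference, a rough sketch of building both URLs in JavaScript rather than by string concatenation. The function names are made up for this example, and it assumes the site accepts standard form encoding of the username (spaces become `+`), which has not been checked against the site.

```javascript
const BASE = "http://www.badcompany2.ea.com";

// Leaderboard search URL. URLSearchParams encodes the username, so
// spaces and symbols do not break the query string.
function leaderboardSearchUrl(platform, username) {
  const params = new URLSearchParams({
    platform: platform,
    sort: "score",
    start: "1",
    search: username,
  });
  return BASE + "/leaderboards/ajax?" + params.toString();
}

// Stats page URL for a persona ID obtained from the search above.
function statsUrl(persona, platform) {
  const params = new URLSearchParams({
    persona: String(persona),
    platform: platform,
  });
  return BASE + "/stats?" + params.toString();
}

console.log(leaderboardSearchUrl("pc", "some user"));
console.log(statsUrl(234354084, "pc"));
```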

This page has most of the stats, but to get your stats for each individual gun you have to click the gun, and the page loads them dynamically. More AJAX, I'm guessing, and I'm stumped on how to get that data. What I really don't understand is that when I look at the source, the link for each gun is <a href="#".... How can every link point to just # yet load the stats for a different gun?
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,836
4,816
75
Notice the other link properties?

<a href="#" id="ul_m416" class="weapon">

..., for example. They probably added JavaScript that runs on the click event of every link with class "weapon".

Also notice the stuff inside the <span>s? It looks like it parses to something; probably the HTML on the right side, but I'm not entirely sure.
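
For a scraper, those same attributes are a handy way to find the weapon links. Here is a rough sketch that pulls the `id` of every link with class "weapon" out of the page source. The sample markup only imitates the tag quoted above: `ul_other` and the `nav` link are invented, and the span contents are elided. A regular expression over HTML is fragile, so treat this as a quick illustration rather than a robust parser.

```javascript
// Sample markup in the shape quoted above (partly invented, see lead-in).
const sampleHtml =
  '<a href="#" id="ul_m416" class="weapon"><span>...</span></a>' +
  '<a href="#" id="ul_other" class="weapon"><span>...</span></a>' +
  '<a href="/help" id="nav_help" class="nav">Help</a>';

// Return the id attribute of every <a> tag whose class list includes "weapon".
function weaponLinkIds(html) {
  const ids = [];
  const tagPattern = /<a\b[^>]*>/gi; // opening <a ...> tags only
  for (const match of html.matchAll(tagPattern)) {
    const tag = match[0];
    const cls = /\sclass="([^"]*)"/i.exec(tag);
    const id = /\sid="([^"]*)"/i.exec(tag);
    if (cls && id && cls[1].split(/\s+/).includes("weapon")) {
      ids.push(id[1]);
    }
  }
  return ids;
}

console.log(weaponLinkIds(sampleHtml));
```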
 

Sureshot324

You are correct, all the stats are in that span tag. That's gonna be annoying to parse. I didn't know you could set JavaScript to run on everything with a certain class. Would this be in the CSS file?
 

Ken g6

Um, no, it's in JavaScript, not CSS. ^_^

I've seen jQuery commands that could do it. At a basic JavaScript level, I imagine it's just a case of iterating over all objects with a certain class and adding that event handler to each object.
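
Something like the following is what I have in mind for the plain JavaScript version. This is only a guess at the approach, not the site's actual code; `onClickForClass` and `loadWeaponStats` are names made up for this sketch. In a browser, `root` would be `document`.

```javascript
// Attach one click handler to every element of the given class under `root`.
// Returns the number of elements that received a handler.
function onClickForClass(root, className, handler) {
  const elements = Array.from(root.getElementsByClassName(className));
  for (const el of elements) {
    el.addEventListener("click", function (event) {
      event.preventDefault(); // stop href="#" from jumping to the page top
      handler(el.id, event);  // the id (e.g. "ul_m416") identifies the gun
    });
  }
  return elements.length;
}

// Browser usage (loadWeaponStats is a hypothetical function):
// onClickForClass(document, "weapon", (id) => loadWeaponStats(id));
```

That would also explain how every link can point to just #: the href is never really followed, and the handler uses the link's id to decide which gun's stats to show. With jQuery the same idea is roughly `$(".weapon").on("click", handler)`.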