Python - Handling a javascript URL?

Question

I am trying to download the HTML of a page that is normally loaded by clicking a JavaScript-driven link in the browser. I can download the first page because it has an ordinary URL:

http://www.locationary.com/stats/hotzone.jsp?hz=1

But along the bottom of the page there are numbered links (1 to 10). Clicking one takes you to, for example, page 2:

http://www.locationary.com/stats/hotzone.jsp?ACTION_TOKEN=hotzone_jsp$JspView$NumericAction&inPageNumber=2

When I put that URL into my program and try to download the HTML, I get the HTML of a different page on the website, which I think is the home page.

How can I get the HTML of a page like this, which is reached through JavaScript and has no specific URL of its own?

Thanks.

P.S. I am using urllib/urllib2 and cookielib.

Also, I just found something called PyQuery. Could I use that, and if so, how?

Code:

import urllib
import urllib2
import cookielib
import re

URL = ''  # URL of the page to download (left empty in this post)

def load(url):
    # First login: POST the credentials so the server sets its cookies.
    data = urllib.urlencode({"inUserName":"email", "inUserPass":"password"})
    jar = cookielib.FileCookieJar("cookies")
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    opener.addheaders.append(('User-agent', 'Mozilla/5.0 (Windows NT 6.1; rv:13.0) Gecko/20100101 Firefox/13.0.1'))
    opener.addheaders.append(('Referer', 'http://www.locationary.com/'))
    opener.addheaders.append(('Cookie','site_version=REGULAR'))
    request = urllib2.Request("https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction", data)
    response = opener.open(request)
    page = opener.open("https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction").read()

    # Extract the cookie from the sixth raw header line (assumed to be Set-Cookie).
    h = response.info().headers
    jsid = re.findall(r'Set-Cookie: (.*);', str(h[5]))

    # Second login: repeat the POST with a new opener, sending the extracted cookie by hand.
    data = urllib.urlencode({"inUserName":"email", "inUserPass":"password"})
    jar = cookielib.FileCookieJar("cookies")
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    opener.addheaders.append(('User-agent', 'Mozilla/5.0 (Windows NT 6.1; rv:13.0) Gecko/20100101 Firefox/13.0.1'))
    opener.addheaders.append(('Referer', 'http://www.locationary.com/'))
    opener.addheaders.append(('Cookie','site_version=REGULAR; ' + str(jsid[0])))
    request = urllib2.Request("https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction", data)
    response = opener.open(request)

    # Finally, request the target page with the logged-in opener.
    page = opener.open(url).read()
    print page

load(URL)

Could you show some code you are using?

– tijko, Commented Aug 15, 2012 at 1:41

verbsintransit · Accepted Answer · 2012-08-15 02:41:35Z

1

I haven't used it myself, but I have heard good things about Requests.
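A minimal sketch of how this might look with Requests, offered as an untested guess rather than a confirmed fix. It assumes the numbered links only work inside a session that has already logged in and visited the first hot zone page, which I have not verified for this site. The URLs and form field names are copied from the question; the helper names (`page_params`, `login`, `fetch_hotzone_page`) are made up for illustration.

```python
import requests

# URLs and tokens copied from the question; everything else is an assumption.
LOGIN_URL = "https://www.locationary.com/index.jsp"
HOTZONE_URL = "http://www.locationary.com/stats/hotzone.jsp"
LOGIN_TOKEN = "tile_loginBar_jsp$JspView$LoginAction"
PAGE_TOKEN = "hotzone_jsp$JspView$NumericAction"


def page_params(page_number):
    """Query parameters for one numbered page link (hypothetical helper)."""
    return {"ACTION_TOKEN": PAGE_TOKEN, "inPageNumber": page_number}


def login(session, username, password):
    """POST the login form; the session object stores any cookies it gets back."""
    response = session.post(
        LOGIN_URL,
        params={"ACTION_TOKEN": LOGIN_TOKEN},
        data={"inUserName": username, "inUserPass": password},
    )
    response.raise_for_status()


def fetch_hotzone_page(session, page_number):
    """Load the first hot zone page, then the numbered page, in the same session."""
    # Assumption: the server needs to see page 1 before it honours the page link.
    session.get(HOTZONE_URL, params={"hz": 1}).raise_for_status()
    response = session.get(HOTZONE_URL, params=page_params(page_number))
    response.raise_for_status()
    return response.text
```

To use it, open one session with `with requests.Session() as session:`, call `login(session, "email", "password")`, and then `fetch_hotzone_page(session, 2)`. The point of the single `Session` is that cookies are carried across all three requests automatically, so there is no need to parse `Set-Cookie` headers by hand.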

fabiocerqueira · Accepted Answer · 2012-08-15 01:45:22Z

0

Take a look at the Splinter project; it may be useful: http://splinter.cobrateam.info/
