How to grab a value from a JSON response in Selenium (Python)

Question

My page returns a JSON HTTP response that contains id: 14.

Is there a way to grab this value in Selenium with Python? I searched the web and could not find any solutions, and now I wonder whether it is simply not possible. I could get this id from the database, but I am trying to avoid that. Please tell me if there is any way around this. Thank you.

You can see the source of the page using driver.page_source. But if the response is plain JSON, is Selenium necessary? Could you use something lighter-weight instead (e.g. requests or urllib2)?
– Alex Woolford, Commented Oct 30, 2014 at 19:49
Selenium is necessary here because I am running a Selenium-based test that needs that variable.
– Nro, Commented Oct 31, 2014 at 15:36

Brandon Rhodes · Accepted Answer · 2016-01-27 21:01:14Z

24

The source of your difficulty is that when a browser receives raw JSON data, it wraps the data in a tiny bit of HTML to make it visible to the user on the screen.

When I visit https://httpbin.org/user-agent in Firefox, for example, the following raw JSON appears in my browser window:

{"user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0"
}

But in fact Firefox (and Chrome) has wrapped the JSON in a bit of extra HTML in order to create a document it can actually display. Here is the HTML that Firefox wraps it in, which I can see right in the JavaScript console by evaluating the expression document.documentElement.innerHTML:

<head><link rel="alternate stylesheet" type="text/css"
 href="resource://gre-resources/plaintext.css" title="Wrap Long Lines"></head>
 <body><pre>{"user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:42.0)
 Gecko/20100101 Firefox/42.0"
}
</pre></body>
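The same wrapper can be inspected from a Selenium session instead of the JavaScript console. This is a minimal sketch, assuming `driver` is an already-started WebDriver that has loaded the JSON URL; the helper name `wrapper_html` is hypothetical:

```python
def wrapper_html(driver):
    """Return the HTML the browser generated around the raw JSON response."""
    # Same expression as in the JavaScript console, run through WebDriver.
    return driver.execute_script("return document.documentElement.innerHTML")
```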

Using BeautifulSoup to parse the HTML, as suggested in another answer, has two serious disadvantages: it introduces a new dependency to your project, and will also be quite slow compared to taking advantage of the fact that the browser will already have parsed the HTML for you and have the resulting DOM ready for your use.

To have the browser extract the JSON for you, simply ask it for the text inside the <body> element. All of the extra structure that the browser has added will be excluded, and the pure JSON will be returned:

driver.find_element_by_tag_name('body').text

Or, if you want it parsed into a Python data structure:

import json
json.loads(driver.find_element_by_tag_name('body').text)
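On current Selenium releases the find_element_by_* helpers are gone (they were removed in Selenium 4.3), so the same idea is written with find_element and a locator strategy. This is a minimal sketch, assuming `driver` is a WebDriver that has already loaded the JSON URL and that the payload holds the id from the question. The name `json_from_body` is a hypothetical helper, and the string "tag name" is the value of By.TAG_NAME from selenium.webdriver.common.by, spelled out here so the sketch has no imports beyond the standard library:

```python
import json

# Locator strategy; the same string as selenium.webdriver.common.by.By.TAG_NAME.
TAG_NAME = "tag name"


def json_from_body(driver):
    """Parse the text of the page's <body> element as JSON."""
    body_text = driver.find_element(TAG_NAME, "body").text
    return json.loads(body_text)


# Usage, assuming the page returned {"id": 14}:
# payload = json_from_body(driver)
# record_id = payload["id"]
```

In real test code you would probably import By and pass By.TAG_NAME directly.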

edited Jan 27, 2016 at 21:01

answered Jan 27, 2016 at 20:32


2 Comments

RobinL Over a year ago

This is clearly a much better solution! p.s. love your PyCon videos Brandon

mirek Over a year ago

The same works with selenium + splinter: br.find_by_tag('body').text (instead of br.html)

RobinL · Accepted Answer · 2014-11-29 15:56:48Z

6

You can use BeautifulSoup to parse the page and extract the JSON. The code you need should look something like this. You may need to change the soup.find call if the JSON isn't directly in the body of the response.

from bs4 import BeautifulSoup
import json

# Name the parser explicitly; omitting it makes bs4 guess one and emit a warning
soup = BeautifulSoup(driver.page_source, "html.parser")
dict_from_json = json.loads(soup.find("body").text)

answered Nov 29, 2014 at 15:56


1 Comment

Brandon Rhodes Over a year ago

Asking Python to parse the raw HTML not only requires an extra third-party library, but will be rather slow compared to just letting the browser do the parsing.

William Baker Morrison · Accepted Answer · 2021-03-06 19:26:35Z

0

The other solutions didn't work for me. I found this solution using requests to be fast and simple:

import requests
requests.get(browser.current_url).json()
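Note that requests issues a second, separate HTTP request and does not share the browser's session, so this may fail on pages that require a login. Below is a hedged sketch of one way around that, copying the browser's cookies into the request. The names `cookies_for_requests` and `fetch_json` are hypothetical helpers, and `browser` is assumed to be a WebDriver already on the JSON URL:

```python
def cookies_for_requests(browser):
    """Convert Selenium's list of cookie dicts to the {name: value} dict requests accepts."""
    return {cookie["name"]: cookie["value"] for cookie in browser.get_cookies()}


def fetch_json(browser, timeout=10):
    """Re-request the browser's current URL with its cookies and decode the JSON body."""
    import requests  # third-party; imported here so the helper above has no dependencies

    response = requests.get(
        browser.current_url,
        cookies=cookies_for_requests(browser),
        timeout=timeout,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()
```

Because this repeats the request, it suits endpoints that are safe to call twice. If the JSON was produced by a form submission, reading the body text from the browser avoids a second call.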

answered Mar 6, 2021 at 19:26
