I am scraping a webpage using the requests and BeautifulSoup libraries in Python.

I found the element I want. A simplified version of it looks like this:

<script>
data = {'user':{'id':1,'name':'joe','age':18,'email':'[email protected]'}}
</script>

I want to get the email value into a variable, but the whole element comes back as a list, and when I take the text of that tag I can't parse it as JSON: it gives me errors about columns. Any ideas? I'd appreciate any help.

1 Answer

Something simple that may help you:

import json
from bs4 import BeautifulSoup

html = """
<script>
data = {'user':{'id':1,'name':'joe','age':18,'email':'[email protected]'}}
</script>
"""

soup = BeautifulSoup(html, 'html.parser')
# The slice [7:] skips the leading `data = ` (7 characters),
# and the single quotes are replaced with double quotes so json.loads() accepts the text.
json_data = json.loads(soup.find('script').text.strip()[7:].replace("'", '"'))
print(json_data)
print(type(json_data))

Output

{'user': {'id': 1, 'name': 'joe', 'age': 18, 'email': '[email protected]'}}
<class 'dict'>
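
The question asked for the email value itself, which the output above does not isolate. The following is a minimal sketch, not the answerer's code. It assumes `script_text` already holds the stripped text of the script tag (for example from `soup.find('script').text.strip()`), and the address `joe@example.com` is a placeholder, since the real value is obscured in the page. It also swaps in `ast.literal_eval` for the quote replacement, which only works when the content happens to be a valid Python literal (it would fail on JavaScript `true`, `false`, or `null`).

```python
import ast

# Assumed to be the stripped text of the <script> tag; the email is a placeholder.
script_text = "data = {'user':{'id':1,'name':'joe','age':18,'email':'joe@example.com'}}"

# Split on the first '=' instead of counting characters for a slice.
_, _, literal = script_text.partition('=')

# literal_eval accepts single-quoted strings, so no quote replacement is needed.
data = ast.literal_eval(literal.strip())

# The parsed result is a nested dict, so the email is a plain key lookup.
email = data['user']['email']
print(email)
```

Replacing every `'` with `"` breaks as soon as a value contains an apostrophe, which is the main reason to consider this variant.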
8 Comments

You're getting close to what I want. I tried the same thing on my page and it gives me an error about columns. Code: json_data = json.loads(soup.find_all('script')[3].text.strip()[21:])
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python36-32\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\TOSHIBA\AppData\Local\Programs\Python\Python36-32\lib\json\decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 3548 (char 3547)
Can you show the script tag that you want to scrape?
I'm not sure I can post it here because it's too long.
This may help you. I think you have multiple dict objects: stackoverflow.com/questions/21058935/…
That's probably because of the ; at the end. Use something like [7:-1] instead of [7:]. Also, slightly more reliable than using such magic numbers is to get everything between the first { and the last } in the script tag content and parse it as JSON.
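
The two ideas in this thread (taking everything between the first { and the last }, and coping with "Extra data" after the object) can be sketched as follows. This is only an illustration under assumptions: `script_text` is a made-up stand-in for the real script content, which was never posted, and it assumes the embedded object is already valid JSON with double quotes.

```python
import json

# Hypothetical script content: a JSON object followed by a semicolon and more code.
script_text = 'var data = {"user": {"id": 1, "email": "joe@example.com"}}; init(data);'

# Option 1: take everything from the first '{' to the last '}'.
start = script_text.index('{')
end = script_text.rindex('}')
first_to_last = json.loads(script_text[start:end + 1])

# Option 2: let the decoder stop at the end of the first JSON value.
# raw_decode returns the parsed object and the index where parsing stopped,
# so trailing text does not raise "Extra data".
obj, stop = json.JSONDecoder().raw_decode(script_text, start)

print(first_to_last['user']['email'])
print(obj['user']['email'])
```

Option 1 fails if the script contains several objects or other braces after the one you want. Option 2 only needs the position of the opening brace, so it is usually the safer choice in that case.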