I am trying to parse and save the contents of a JSON object that is embedded in a page's HTML. However, when I isolate the relevant string and try to load it with the json package, I get the error JSONDecodeError: Extra data, and I am unsure what is causing it.
It was suggested that the relevant string might actually contain multiple dictionaries and that this could be the problem, but I'm not clear on how to proceed if that is true. My code is provided below. Any suggestions are much appreciated!
from bs4 import BeautifulSoup
import urllib.request
from urllib.request import HTTPError
import csv
import json
import re
def left(s, amount):
    return s[:amount]
def right(s, amount):
    return s[-amount:]
def mid(s, offset, amount):
    return s[offset:offset+amount]
url= "url"
from urllib.request import Request, urlopen
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
try:
    s = urlopen(req,timeout=200).read()
except urllib.request.HTTPError as e:
    print(str(e))
soup = BeautifulSoup(s, "lxml")
tables=soup.find_all("script")
for i in range(0,len(tables)):
    if str(tables[i]).find("TimeLine.init")>-1:
        dat=str(tables[i]).splitlines()
        for tbl in dat:
            if str(tbl).find("TimeLine.init")>-1:
                s=str(tbl).strip()
                j=json.loads(s)
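To illustrate the "multiple dictionaries" idea, here is a minimal sketch of one way several JSON values sitting on a single line could be pulled apart. It assumes the line looks something like a JavaScript call with more than one JSON argument; the sample string and the helper name iter_json_values are hypothetical and not taken from the real page. It relies on json.JSONDecoder.raw_decode, which parses one value starting at a given index and reports where that value ended, instead of requiring the whole string to be a single JSON document as json.loads does.

```python
import json


def iter_json_values(text):
    """Yield each JSON object or array found in text, left to right.

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Look for the next character that could open an object or an array.
        candidates = [i for i in (text.find("{", pos), text.find("[", pos)) if i != -1]
        if not candidates:
            return
        start = min(candidates)
        try:
            # raw_decode returns the parsed value and the index just past it.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON at this position; move on by one character.
            pos = start + 1
            continue
        yield value
        pos = end


# Made-up sample line; the real page content is not known.
sample = 'TimeLine.init({"a": 1}, [{"b": 2}, {"c": 3}]);'
values = list(iter_json_values(sample))
print(values)
```

With json.loads, a string such as '{"a": 1} {"b": 2}' raises JSONDecodeError: Extra data, because a second value follows the first complete one. Whether this matches the actual page depends on what s contains at the point of the call.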
What does s look like before json.loads is called?