
I am a beginner with JSON and am trying to read the Automotive 5-core JSON file from http://jmcauley.ucsd.edu/data/amazon/ with the following code.

Python code:

import json

with open('Automotive_5.json') as f:
    data = json.load(f)

I keep getting JSONDecodeError: Extra data

Complete Traceback:

runfile('C:/Users/Paul/Google Drive/erg2.py', wdir='C:/Users/Paul/Google Drive')
Traceback (most recent call last):

  File "<ipython-input-122-72136ec568c5>", line 1, in <module>
    runfile('C:/Users/Paul/Google Drive/erg2.py', wdir='C:/Users/Paul/Google Drive')

  File "C:\Users\Paul\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
    execfile(filename, namespace)

  File "C:\Users\Paul\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/Paul/Google Drive/erg2.py", line 10, in <module>
    data = json.load(f)

  File "C:\Users\Paul\Anaconda3\lib\json\__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)

  File "C:\Users\Paul\Anaconda3\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)

  File "C:\Users\Paul\Anaconda3\lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)

JSONDecodeError: Extra data
  • On this page I see only gzip-compressed JSON: reviews_Automotive_5.json.gz. Did you uncompress this file? Commented Sep 10, 2019 at 3:17
  • Yes, I did uncompress it. OK, I edited the original post to include the whole error message. Commented Sep 10, 2019 at 3:27

2 Answers

I downloaded your file and tested it on my own system. I am not sure why, but you need to load each line separately. Hopefully someone else can explain the reason, but this code works for me. Maybe the file is just too large? My editor complained about its size.

import json

data = []
with open('Automotive_5.json') as f:
    for line in f:
        # Parse each line once, then store and print the result.
        record = json.loads(line)
        data.append(record)
        print(record)

Read each line in the file as JSON and append it to data rather than trying to load it all at once. Runs without error for me.
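A possible explanation for the "why": json.load expects the whole file to be a single JSON value, and this file appears to hold one object per line instead. The sketch below reproduces the error on a made-up two-line sample (the field names and values are illustrative, not rows from the dataset) and shows that parsing line by line succeeds. The helper name parse_json_lines is hypothetical.

```python
import json

# Hypothetical sample: two JSON objects, one per line (not real dataset rows).
sample = (
    '{"reviewerID": "A1", "overall": 5.0}\n'
    '{"reviewerID": "B2", "overall": 4.0}\n'
)


def parse_json_lines(text):
    """Parse text holding one JSON value per line; blank lines are skipped."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Parsing the whole string at once fails: after the first object the decoder
# finds more non-whitespace text and reports it as "Extra data".
try:
    json.loads(sample)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg

# Parsing each line separately works because every line is valid on its own.
records = parse_json_lines(sample)
```

If this is what is happening, file size is not the cause; a two-line file with this layout would fail in the same way.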


4 Comments

I also checked the file, but you were faster :) I have now looked at the web page, and it even says "Format is one-review-per-line in (loose) json." There is also code there that shows how to convert it to normal JSON; see "Convert to 'strict' json".
"Loose" JSON might do it. Maybe the thing isn't properly formatted?
The file is not correctly formatted JSON, but it seems they did that intentionally.
That would do it then.

It looks like the problem is with the JSON file itself: all the data in a JSON file should be inside one top-level value, but this file contains many separate objects, so you have to read the data line by line.

If you want to read the data in one go, put all the objects from the file into a single top-level value (for example an array), separated by commas.
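One way to do that wrapping, as a rough sketch: read the objects line by line once, then write them back out as a single JSON array that json.load can read in one go. The function name convert_to_json_array and the output filename in the usage note are hypothetical, and the sketch assumes every non-blank line of the input is valid JSON on its own.

```python
import json


def convert_to_json_array(src_path, dst_path):
    """Read one JSON object per line from src_path and write them to
    dst_path as a single JSON array. Returns the number of objects written.

    Assumes each non-blank line is a complete, valid JSON value.
    """
    with open(src_path) as src:
        records = [json.loads(line) for line in src if line.strip()]
    with open(dst_path, "w") as dst:
        json.dump(records, dst)
    return len(records)
```

For example, convert_to_json_array('Automotive_5.json', 'Automotive_5_strict.json') would produce a file that json.load can parse directly. Note that this holds all records in memory at once, just as json.load would.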
