I am trying to parse a JSON file with Python. Every time I run my script, the output is [] and I can't see why. Is this even a proper way to parse JSON in Python?

Here is my code:

import sys
import simplejson
import difflib

filename = sys.argv[1]

data = []

f = file('output.json', "r")
lines = f.readlines()
for line in lines:
        try:
            loadLines = simplejson.loads(line)

            data.append( loadLines['executionTime'])

        except ValueError:
            pass


print data  
  • @MattBall it has nothing to do with the size of the file. Commented Jun 26, 2013 at 4:07
  • any chance you guys can help? Commented Jun 26, 2013 at 4:09
  • @JBernardo indeed, though the title implied that the size was the problem. Commented Jun 26, 2013 at 12:44

1 Answer

My best guess is that no line on its own is valid JSON. That makes simplejson.loads raise ValueError on every line, so data.append(...) is never reached, and the except clause silently swallows the error. The result is an empty list.
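To see the effect in isolation, here is a minimal sketch. It assumes Python 3 and the standard json module rather than simplejson, and the sample text is made up. It parses a pretty-printed object line by line, the way the loop in the question does:

```python
import json

# Hypothetical sample: one JSON object spread over several lines.
pretty = '{\n    "direction": "left",\n    "time": 1\n}\n'

parsed = []
failures = 0
for line in pretty.splitlines():
    try:
        parsed.append(json.loads(line))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        failures += 1

print(parsed)    # stays empty: no line is a complete JSON value
print(failures)  # one failure per line
```

Logging or re-raising the exception instead of using a bare pass would have made the cause visible right away.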

If the entire file is a JSON array like this:

[
    {
        "direction": "left",
        "time": 1
    },
    {
        "direction": "right",
        "time": 2
    }
]

Then you can simply use something like:

import json

with open('output.json', 'r') as f:
    data = json.load(f)
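Once the array is loaded, it is an ordinary list of dicts, so collecting one field is a list comprehension. A small sketch, assuming Python 3; io.StringIO stands in for the open file, and "time" is the key from the example above (in the question it would be 'executionTime'):

```python
import io
import json

# Stand-in for open('output.json'): the array example from above.
f = io.StringIO(
    '[{"direction": "left", "time": 1}, {"direction": "right", "time": 2}]'
)

records = json.load(f)                # a list of dicts
times = [r["time"] for r in records]  # pull one field out of each item
print(times)
```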

If, however, it is a bunch of JSON items at the top level, not enclosed within a JSON object or array, like this:

{
    "direction": "left",
    "time": 1
}
{
    "direction": "right",
    "time": 2
}

then you'll have to go with a different approach: decoding items one-by-one. Unfortunately, the json module can't parse incrementally from a stream, so we'll first have to read all the data into memory at once:

with open('output.json', 'r') as f:
    json_data = f.read()

To parse a single item, we use raw_decode, which is a method of JSONDecoder. That means we need to make a JSONDecoder:

decoder = json.JSONDecoder()

Then we just go along, stripping any whitespace on the left side of the string, checking to make sure we still have items, and parsing an item:

data = []
while json_data.strip():  # while there's still non-whitespace...
    # strip off whitespace on the left side of the string
    json_data = json_data.lstrip()
    # parse one item; raw_decode returns the item and the index where it ended
    item, end = decoder.raw_decode(json_data)
    # keep only whatever is left after the parsed item
    json_data = json_data[end:]
    # ...and then append that item to our list
    data.append(item)
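The same idea can be wrapped in a reusable generator. This is only a sketch, assuming Python 3 and the standard json module; the name iter_json_items is made up for illustration. It relies on raw_decode accepting a start index, so the string does not have to be re-sliced after every item:

```python
import json


def iter_json_items(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        # returns the parsed value and the index just past it
        item, pos = decoder.raw_decode(text, pos)
        yield item


sample = '{"direction": "left", "time": 1}\n{"direction": "right", "time": 2}\n'
items = list(iter_json_items(sample))
print(items)
```

Malformed input still raises ValueError from raw_decode, so a bad file fails loudly instead of silently producing an empty list.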

If you're doing lots of data collection like this, it might be worthwhile to store it in a database. Something simple like SQLite will do just fine. A database will make it easier to do aggregate statistics in an efficient way. (That's what they're designed for!) It would probably also make it faster to access the data if you're doing it frequently rather than parsing JSON a lot.
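As a rough sketch of what that could look like with the standard sqlite3 module: the table name, column names, and sample items below are all made up for illustration, and an in-memory database is used so the snippet leaves nothing behind.

```python
import sqlite3

# Hypothetical items, shaped like the examples earlier in this answer.
items = [{"direction": "left", "time": 1}, {"direction": "right", "time": 2}]

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist
conn.execute("CREATE TABLE moves (direction TEXT, item_time INTEGER)")
conn.executemany(
    "INSERT INTO moves (direction, item_time) VALUES (:direction, :time)",
    items,
)
conn.commit()

# Aggregates are computed by the database, not by re-parsing JSON.
count, avg_time = conn.execute(
    "SELECT COUNT(*), AVG(item_time) FROM moves"
).fetchone()
print(count, avg_time)
conn.close()
```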

7 Comments

Could you explain what this means? "If, however, it is a bunch of JSON items at the top level, not enclosed within a JSON object or array, then you'll have to go with a different approach." I made the JSON file by saving the output of bikenyc.com/stations/json every minute with a Python script run from the terminal.
@user1887261: I've added an example to show the difference. Note that the first one is surrounded by [ and ] and has commas separating the individual items, whereas the latter has neither.
Ahh okay, the edit makes much more sense now! Thank you. So my JSON is formatted like the bottom bit of code. Where do I begin formulating a different approach based on this?
@user1887261: That's a little tricky. I think the simplest way would be to read the whole file's data into memory as a string and then use raw_decode repeatedly. Read the documentation for more information, but it will try to parse one item and will return to you the item and the data that's left. Simply repeat that process until you've read all the items and there's nothing left. The question that someone linked to as a “possible duplicate” may also yield some possible answers.
Awesome, thanks! So I'm assuming there would have been a better way to pull the data initially... This may sound stupid, but can I just add square brackets to the file manually?