Python JSON encoding error

Question

I have a Python script that reads the contents of a JSON file and imports them into MongoDB.

I am getting the following error from it:

Traceback (most recent call last):
  File "/home/luke/projects/vuln_backend/vuln_backend/mongodb.py", line 39, in process_files
    file_content = currentFile.read()
  File "/home/luke/envs/vuln_backend/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 14: invalid continuation byte

This is the code:

import json
import logging
import logging.handlers
import os
import glob
from logging.config import fileConfig
from zipfile import ZipFile
from pymongo import MongoClient


def process_files():
    try:
        client = MongoClient('5.57.62.97', 27017)
        db = client['vuln_sets']
        coll = db['vulnerabilities']
        basepath = os.path.dirname(__file__)
        filepath = os.path.abspath(os.path.join(basepath, ".."))
        archive_filepath = filepath + '/vuln_files/'
        archive_files = glob.glob(archive_filepath + "/*.zip")

        for file in archive_files:
            with open(file, "r") as currentFile:
                file_content = currentFile.read()
                vuln_content = json.loads(file_content)
            for item in vuln_content:
                coll.insert(item)
    except Exception as e:
        logging.exception(e)

I have tried setting the encoding to UTF8 and Windows-1252 but these do not seem to be able to read the JSON either.

How can I get it to determine which encoding is used in the JSON?

Well, you have to unzip your file before reading it... You even import the module but don't use it.
– cs95, Commented Oct 26, 2017 at 10:03
I've been staring at this for a couple of hours... I completely forgot to add that code in! I think I've gone code blind! Thank you for pointing this out!
– Luke, Commented Oct 26, 2017 at 10:04
Can you put that as an answer so I can accept it and give the reputation?
– Luke, Commented Oct 26, 2017 at 10:41
Done, thanks. I've also tried to substantiate my answer with other "best programming" tips. Hope they help.
– cs95, Commented Oct 26, 2017 at 10:48

cs95 · Accepted Answer · 2017-10-26 10:46:28Z

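Notice that you are trying to read and parse a zipped file as if it were JSON text. The archive is compressed binary data, not UTF-8 text, which is why the read fails with a UnicodeDecodeError. You'll have to unzip it first, which you can do with the zipfile module, like this: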

with ZipFile(file, 'r') as f:
    f.extractall(dest)

Here file is the loop variable and dest is the directory to extract into.
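A slightly fuller sketch of the extract-then-read step might look like the following. The function name extract_and_load is hypothetical, and it assumes the JSON files sit at the top level of the archive and are UTF-8 encoded; adjust both assumptions to match your data.

```python
import glob
import json
import os
import tempfile
from zipfile import ZipFile


def extract_and_load(zip_path):
    """Extract a zip archive to a temporary directory and parse its JSON files."""
    documents = []
    with tempfile.TemporaryDirectory() as dest:
        with ZipFile(zip_path, "r") as archive:
            archive.extractall(dest)
        # Assumption: JSON files are at the top level of the archive
        # and are UTF-8 encoded.
        for json_path in sorted(glob.glob(os.path.join(dest, "*.json"))):
            with open(json_path, "r", encoding="utf-8") as fileobj:
                documents.append(json.load(fileobj))
    return documents
```

In the loop from the question, you would call extract_and_load(file) for each archive and insert the returned items into the collection.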

Furthermore, when reading a JSON file, I'd recommend using json.load(fileobj) (1 step) over reading your file contents and calling json.loads(string_from_file) on the resulting string (2 steps).
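As a variation that is not part of the original answer, json.load can also be pointed at a member opened inside the archive, which avoids extracting to disk. This is a minimal sketch: load_json_from_zip is a hypothetical helper, and it relies on json.load accepting a binary file object, which works on Python 3.6 and later for UTF-8, UTF-16, or UTF-32 input.

```python
import json
from zipfile import ZipFile


def load_json_from_zip(zip_path):
    """Parse every .json member of a zip archive without extracting it."""
    documents = []
    with ZipFile(zip_path, "r") as archive:
        for member in archive.namelist():
            if not member.lower().endswith(".json"):
                continue  # skip directories and non-JSON members
            # archive.open returns a binary file-like object for the member
            with archive.open(member) as fileobj:
                documents.append(json.load(fileobj))
    return documents
```

This keeps the one-step json.load(fileobj) pattern and leaves no extracted files to clean up afterwards.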

