
Hi, I am trying to upload a CSV file to a PostgreSQL database using Python.

A table called "userspk" already exists in the database called "DVD".

Below is the code:

import pandas as pd
import psycopg2 as pg2

# Credentials redacted (xxx) in the original post.
conn = pg2.connect(database='DVD', user=xxx, password=xxx)
cur = conn.cursor()

def upload_data():
    with open('/Users/Downloads/DVDlist.csv', 'r') as f:
        next(f)  # skips the header row
        # copy_from must run while the file is still open
        cur.copy_from(f, 'userspk', sep=',')
    conn.commit()

upload_data()

I keep getting this error. I would have thought it would be fairly straightforward. Is something wrong with the code?

/Users/pk/.conda/envs/Pk/bin/python /Users/pk/PycharmProjects/Pk/SQL_upload_file.py
Traceback (most recent call last):
  File "/Users/pk/PycharmProjects/Pk/SQL_upload_file.py", line 44, in <module>
    upload_data()
  File "/Users/pk/PycharmProjects/Pk/SQL_upload_file.py", line 37, in upload_data
    next(f)  # Skip the header row.
  File "/Users/pk/.conda/envs/Pk/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 5718: invalid start byte

1 Answer

The error is raised by the next(f) call, so it has nothing to do with psycopg2 or PostgreSQL. Your file contains bytes that Python cannot decode as UTF-8.

The file is probably encoded in Latin-1, in which case the offending byte 0xa3 is the British pound sterling sign (£).

You might be able to fix it by specifying the encoding when you open the file.

open('/Users/Downloads/DVDlist.csv', 'r', encoding="latin-1")

But the rows after the header might also have some issues.
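
To see the effect of the encoding argument without touching the real file, here is a minimal sketch. The sample rows and the temporary file are made up for illustration; only the byte 0xa3 and the latin-1 guess come from the traceback above.

```python
import os
import tempfile

# Hypothetical sample data: a header row plus one row containing a pound sign.
sample = "title,price\nSome film,£5\n"

# Write the sample as Latin-1 bytes, so '£' is stored as the single byte 0xa3.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(sample.encode("latin-1"))
    path = tmp.name

# Reading the file as UTF-8 fails in the same way as in the traceback.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Reading it as Latin-1 succeeds, and the header can be skipped as before.
with open(path, "r", encoding="latin-1") as f:
    next(f)  # skip the header row
    rows = f.read()

os.remove(path)
print(utf8_error)
print(rows)
```

If the file turns out to use a different encoding (for example cp1252), the same pattern applies with that name passed to open().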


3 Comments

Thanks @jjanes, you were right: the issue was with the data. I had a few #N/A values in my file. Any clue how to fix that? Do I need to clean the data up before uploading, or can I add code to ignore #N/A? Also, I realised that if there are any blank values in the middle, it gives me this error: psycopg2.DataError: invalid input syntax for type double precision: "" CONTEXT: COPY userspk, line 4, column credit: "". Any idea how to fix that?
Why are there blank lines? Remove them, or fix whatever creates them. Perhaps you can create an iterator in Python that behaves like a file handle but doesn't pass along blank lines. That would be a pure Python question, suitable for a separate question.
Not blank lines, blank cells. I can't delete the entire row if there is one blank cell. But I figured out I can upload them as NULL values by adding null='' to the copy_from call.
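
One possible way to combine both points from the comments is to blank out the #N/A markers in memory and then let copy_from treat empty cells as NULL. This is only a sketch: clean_csv is a hypothetical helper, the sample rows are invented, and the copy_from call is left as a comment because it needs a live connection.

```python
import csv
import io

def clean_csv(src, missing=("#N/A",)):
    """Return a StringIO with the header dropped and missing markers blanked.

    Hypothetical helper for illustration; `missing` lists the cell values
    that should be treated as empty.
    """
    reader = csv.reader(src)
    next(reader)  # skip the header row
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(["" if cell.strip() in missing else cell for cell in row])
    out.seek(0)  # rewind so the buffer can be read from the start
    return out

# Invented sample data standing in for the real file.
sample = io.StringIO("name,credit\nalice,10.5\nbob,#N/A\ncarol,\n")
cleaned = clean_csv(sample)
cleaned_text = cleaned.getvalue()
print(cleaned_text)

# With an open psycopg2 cursor (not run here), the buffer could be loaded as:
# cur.copy_from(cleaned, 'userspk', sep=',', null='')
# conn.commit()
```

Two caveats: null='' turns every empty cell into NULL, including empty text columns, and copy_from uses PostgreSQL's text format rather than true CSV, so fields containing commas or quotes would need a different approach.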
