0

I have created a package allowing a user to write data to either a sqlite or Postgres db. I created a module for connecting to the db and a separate module that provides the writing functionality. In the latter module the write is a straightforward pandas internal function call:

indata.to_sql('pay_' + table, con, if_exists='append', index=False)

Writing to an sqlite db (with connection using 'sqlite3') is successful however when writing to a Postgres db I get the following error:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1778, in execute
    ps = cache['ps'][key]
KeyError: ("SELECT name FROM sqlite_master WHERE type='table' AND name=?;", ((705, 0, <function Connection.__init__.<locals>.text_out at 0x7fc3205fb510>),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
    self._c.execute(self, operation, args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
    self.handle_messages(cursor)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
    raise self.error
pg8000.core.ProgrammingError: {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1610, in execute
    raise_with_traceback(ex)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/compat/__init__.py", line 46, in raise_with_traceback
    raise exc.with_traceback(traceback)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    cur.execute(*args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 861, in execute
    self._c.execute(self, operation, args)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1837, in execute
    self.handle_messages(cursor)
  File "/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pg8000/core.py", line 1976, in handle_messages
    raise self.error
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': {'S': 'ERROR', 'V': 'ERROR', 'C': '42P01', 'M': 'relation "sqlite_master" does not exist', 'P': '18', 'F': 'parse_relation.c', 'L': '1180', 'R': 'parserOpenTable'}

I traced the error to the following file:

/anaconda3/envs/PCAN_v1/lib/python3.7/site-packages/pandas/io/sql.py

What seems to be happening is that the '.to_sql' function is configured to try to write to a db named 'sqlite_master' at this point in the 'sql.py' file:

    def has_table(self, name, schema=None):
    # TODO(wesm): unused?
    # escape = _get_valid_sqlite_name
    # esc_name = escape(name)

    wld = "?"
    query = (
        "SELECT name FROM sqlite_master " "WHERE type='table' AND name={wld};"
    ).format(wld=wld)

    return len(self.execute(query, [name]).fetchall()) > 0

Looking more closely at the errors you can see that the connection is correctly made to the db but that pandas is looking for an sqlite db:

enter image description here

I know that the db name was one I used several half a year ago when I first started working with sqlite so I'm thinking that somewhere I set a configuration value. So:

  1. is my reasoning correct?
  2. if so, how do I change the configuration?
  3. if not, what is possibly going on?

1 Answer 1

3

Per pandas.DataFrame.to_sql documentation:

con : sqlalchemy.engine.Engine or sqlite3.Connection

Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects.

This means only SQLite allows a raw connection for the to_sql method. All other RDBMs including Postgres must use an SQLAlchemy connection for this method to create structures and append data. Do note: read_sql does not require SQLAlchemy since it does not make persistent changes.

Therefore, this raw DB-API connection cannot work:

import psycopg2
con = psycopg2.connect(host="localhost", port=5432, dbname="mydb", user="myuser", password="mypwd")

indata.to_sql('pay_' + table, con, if_exists='append', index=False)

However, this SQLAlchemy connection can work:

from sqlalchemy import create_engine    

engine = create_engine('postgresql+psycopg2://myuser:mypwd@localhost:5432/mydb')

indata.to_sql('pay_' + table, engine, if_exists='append', index=False)

Better use SQLAlchemy for both databases, here for SQLite:

engine = create_engine("sqlite:///path/to/mydb.db")
Sign up to request clarification or add additional context in comments.

1 Comment

Ah I was just coming to that conclusion after finally finding and reading through this: gist.github.com/peteWT/3410263. Tnx!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.