8

I am trying to send monthly data to a MySQL database using Python's pandas to_sql command. My program runs one month of data at a time and I want to append the new data onto the existing database. However, Python gives me an error:

_mysql_exceptions.OperationalError: (1050, "Table 'cps_basic_tabulation' already exists")

Here is my code for connecting and exporting:

conn = MySQLdb.connect(host     = config.get('db', 'host'),
                       user     = config.get('db', 'user'),
                       passwd   = config.get('db', 'password'),
                       db       = 'cps_raw') 

combined.to_sql(name            = "cps_raw.cps_basic_tabulation",
               con              = conn,
               flavor           = 'mysql', 
               if_exists        = 'append')

I have also tried using:

from sqlalchemy import create_engine

Replacing conn = MySQLdb.connect... with:

engine = mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>

conn   = engine.connect().connection

Any ideas on why I cannot append to a database?

Thanks!

2
  • Which version of pandas are you using? Commented Nov 5, 2014 at 20:09
  • Have you tried passing con=engine instead of con=conn? Commented Nov 5, 2014 at 20:12

2 Answers 2

7

Starting from pandas 0.14, you have to provide directly the sqlalchemy engine, and not the connection object:

engine = create_engine("mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>")
combined.to_sql("cps_raw.cps_basic_tabulation", engine, if_exists='append')
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you. Also @joris thank you. I updated pandas 'sudo pip install --upgrade pandas', between both of these fixes, everything worked. However, I believe the main fix was updating pandas as I tried combinations of connecting to MySQL with no avail.
It is good possible that updating pandas fixed your issue, but I should note that it is however recommended to use the sqlalchemy in any case, as using MySQL connection object directly is deprecated and will be removed in a future pandas version.
1

Since I had the same error message and stumbled across this post I leave this here for others to find.

I found two ways to solve the duplicated table creation although I lack the insight as to why this solves it:

  1. Either pass the database name in the url when creating a connection
  2. or pass the database name as a schema in pd.to_sql.

Doing both does not hurt. Also, a few years later it is (again?) possible to pass the pure connection to pandas. My guess would be that in the previous answer by joris the first of my solution cases might have implicitly solved the problem.

```
#create connection to MySQL DB via sqlalchemy & pymysql
user = credentials['user']
password = credentials['password']
port = credentials['port']
host = credentials['hostname']
dialect = 'mysql'
driver = 'pymysql'
db_name = 'test_db'

# setup SQLAlchemy   
from sqlalchemy import create_engine 
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/' 
engine = create_engine(cnx) 

# create database
with engine.begin() as con:
    con.execute(f"CREATE DATABASE {db_name}")

############################################################
# either pass the db_name  vvvv - HERE- vvvv after creating a database
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/{db_name}'      
############################################################
engine = create_engine(cnx) 

table = 'test_table'
col = 'test_col'
with engine.begin() as con:
    # this would work here instead of creating a new engine with a new link
    # con.execute(f"USE {db_name}")
    con.execute(f"CREATE TABLE {table} ({col} CHAR(1));")

# insert into database
import pandas as pd
df = pd.DataFrame({col : ['a','b','c']})

with engine.begin() as con:
    # this has no effect here
    # con.execute(f"USE {db_name}")
    df.to_sql(
        name= table,
        if_exists='append',
# passing con = cnx here would equally work
        con=con, 
############################################################
# or pass it as a schema vvvv - HERE - vvvv
        #schema=db_name,
############################################################
        index=False
    )```

Tested with python version 3.8.13, sqlalchemy 1.4.32 and pandas 1.4.2. Same problem might have appeared here and here.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.