70

I can connect to my local MySQL database from Python, and I can create, select from, and insert individual rows.

My question is: can I directly instruct MySQLdb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?

In either case, what would the Python script look like for a very simple table with an ID and two data columns, and a matching dataframe?

1 Comment

Do you need to use mysqldb, or are you ok with another MySQL connector? Commented Jan 23, 2018 at 2:11

9 Answers

113

Update:

There is now a to_sql method, which is the preferred way to do this, rather than write_frame:

df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')

Also note: the syntax may change in pandas 0.14...

You can set up the connection with MySQLdb:

from pandas.io import sql
import MySQLdb

con = MySQLdb.connect()  # may need to add some other options to connect

Setting the flavor of write_frame to 'mysql' means you can write to mysql:

sql.write_frame(df, con=con, name='table_name_for_df', 
                if_exists='replace', flavor='mysql')

The argument if_exists tells pandas how to behave if the table already exists:

if_exists: {'fail', 'replace', 'append'}, default 'fail'
     fail: If table exists, do nothing.
     replace: If table exists, drop it, recreate it, and insert data.
     append: If table exists, insert data. Create if does not exist.

Although the write_frame docs currently suggest it only works on sqlite, mysql appears to be supported and in fact there is quite a bit of mysql testing in the codebase.
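
For newer pandas versions, where write_frame and the flavor argument have been removed, here is a minimal sketch of the equivalent call through a SQLAlchemy engine (the connection string and table name are placeholders, and the pymysql driver is assumed):

from sqlalchemy import create_engine

# Placeholder connection string -- substitute your own user, password, host, and database.
engine = create_engine('mysql+pymysql://user:password@localhost/db_name')

# Same idea as write_frame above, but via to_sql and a SQLAlchemy connectable;
# df is the dataframe you want to write.
df.to_sql(name='table_name_for_df', con=engine, if_exists='replace', index=False)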

6 Comments

Andy - any thoughts on how to handle this with pandas 0.20.2's deprecation of the 'mysql' flavor?
Excellent, not sure why I didn't figure that from my searches, but that did the trick, thank you!
Such approach (sql.write_frame) is deprecated - stackoverflow.com/questions/38487878/…
What is the default schema? From pandas docs: "schemastr, optional Specify the schema (if database flavor supports this). If None, use default schema."
30

Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but which should also work for Python 2.7 and other 3.x versions:

First, let's create the dataframe:

# Create dataframe
import pandas as pd
import numpy as np

np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class':    np.random.binomial(2, 0.1, size=number_of_samples),
    },columns=['feature1','feature2','class'])

print(frame)

Which gives:

   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0

To import this dataframe into a MySQL table:

# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip       = 'ENTER DATABASE IP'
database_name     = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
                                               format(database_username, database_password, 
                                                      database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')

One catch is that MySQLdb doesn't work with Python 3.x, so we use mysql-connector instead, which can be installed as follows:

pip install mysql-connector==2.1.4  # version avoids Protobuf error

Output:

[screenshot of the resulting table in MySQL]

Note that to_sql creates the table as well as the columns if they do not already exist in the database.
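
To verify the write, one option (a sketch reusing the database_connection engine and placeholder table name from above) is to read the table straight back into pandas:

# Read the table back to confirm the rows were written.
check = pd.read_sql('SELECT * FROM table_name_for_df', con=database_connection)
print(check.head())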

2 Comments

Getting ValueError: invalid literal for int() with base 10: '' from sqlalchemy/engine/url.py when it parses the port — problem with the port number; how do I specify the port?
@pyd the format of the string to include the port is as follows: 'mysql+mysqldb://{user}:{password}@{server}:{port}/{database}'.format(user='<user>', password='<password>', server='<server>', port='<port>', database='<database>')
5

You can do it by using pymysql:

For example, let's suppose you have a MySQL server with the following user, password, host, and port, and you want to write to the database 'data_2', whether or not it already exists.

import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host =  '172.17.0.2'
port = 3306
database = 'data_2'

If you already have the database created:

conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')

If you have NOT created the database yet (this also works when the database is already there):

conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)

conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')
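
Note that recent pandas versions won't accept this call as written: the flavor keyword has been removed, name should be the table name rather than the database name, and to_sql expects a SQLAlchemy connectable (or a sqlite3 connection) rather than a raw pymysql connection, as the comments below also point out. A minimal corrected sketch along the same lines, with 'data_table' as a placeholder table name:

# Sketch: same credentials as above, but through a SQLAlchemy engine, and with
# the table name (not the database name) passed to to_sql.
from sqlalchemy import create_engine

engine = create_engine(
    'mysql+pymysql://{u}:{p}@{h}:{prt}/{db}?charset=utf8'.format(
        u=user, p=passw, h=host, prt=port, db=database))

data.to_sql(name='data_table', con=engine, if_exists='replace', index=False)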

Similar threads:

  1. Writing to MySQL database with pandas using SQLAlchemy, to_sql
  2. Writing a Pandas Dataframe to MySQL

3 Comments

The name parameter should be the table name, not the database name.
The flavor kwarg for to_sql no longer is permitted.
Thank you! The index=False helped me solve the following error: (mysql.connector.errors.ProgrammingError) 1170 (42000): BLOB/TEXT column "index" used in key specification without a key length
3

This should do the trick:

import pandas as pd
import pymysql
pymysql.install_as_MySQLdb()
from sqlalchemy import create_engine

# Create engine
engine = create_engine('mysql://USER_NAME_HERE:PASS_HERE@HOST_ADRESS_HERE/DB_NAME_HERE')

# Create the connection and close it when done (whether it succeeds or fails)
with engine.begin() as connection:
    df.to_sql(name='INSERT_TABLE_NAME_HERE/INSERT_NEW_TABLE_NAME', con=connection, if_exists='append', index=False)

Comments

2

The to_sql method works for me.

However, keep in mind that it looks like using it with a raw DBAPI connection is going to be deprecated in favor of SQLAlchemy:

FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables.

Comments

2

Python 2 + 3

Prerequisites

  • Pandas
  • MySQL server
  • sqlalchemy
  • pymysql: pure-Python MySQL client

Code

from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root",
                               pw="your_password",
                               db="pandas"))
df.to_sql(con=engine, name='table_name', if_exists='replace')
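
For larger frames, to_sql also accepts a chunksize argument so the rows are inserted in batches rather than in a single statement; a small sketch reusing the engine above (the batch size of 1000 is arbitrary):

# Write in batches of 1000 rows per round-trip (sketch).
df.to_sql(con=engine, name='table_name', if_exists='replace',
          index=False, chunksize=1000)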

Comments

1

This has worked for me. I created only the database beforehand; there was no predefined table.

from platform import python_version
print(python_version())
3.7.3

import pandas as pd
path = 'glass.data'
df = pd.read_csv(path)
df.head()


!conda install sqlalchemy
!conda install pymysql

import sqlalchemy
pd.__version__
'0.24.2'

sqlalchemy.__version__
'1.3.20'

I restarted the kernel after the installation.

from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://USER:PASSWORD@HOST:PORT/DATABASE_NAME', echo=False)

try:
    df.to_sql(name='glasstable', con=engine, index=False, if_exists='replace')
    print('Successfully written to the database!')
except Exception as e:
    print(e)

Comments

0

You might write your DataFrame out as a CSV file and then use mysqlimport to load that CSV into MySQL.
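
A rough sketch of that route (all file, table, database, and credential names below are placeholders; note that mysqlimport derives the target table name from the CSV file name):

import subprocess

# Step 1: dump the DataFrame to CSV without the index.
df.to_csv('my_table.csv', index=False)

# Step 2: bulk-load the CSV with mysqlimport (placeholder credentials).
subprocess.run([
    'mysqlimport', '--local', '--ignore-lines=1',
    '--fields-terminated-by=,',
    '-u', 'user', '-pPASSWORD',
    'db_name', 'my_table.csv',
], check=True)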

EDIT

It seems pandas's built-in sql util provides a write_frame function, but it only works with sqlite.

I found something useful; you might try this.

1 Comment

Thanks, this is how I've been doing this so far. I'm looking for a way to directly insert into mysql without the csv detour.
-1

df.to_sql(name='owner', con=db_connection, schema='aws', if_exists='replace', index=True, index_label='id')

Comments
