
I'm writing a workflow that first converts a flat CSV into a .db file for downstream processing. The file seems to write OK, but when I try to read it back, it throws errors saying that the table doesn't exist. Here is my code:

import pandas as pd
import sqlite3

# simulate some data
df = pd.DataFrame({'DATE': ['01-01-2000', '01-02-2000', '01-03-2000', '01-04-2000', '01-05-2000'],
                   'A1': [1, 1, 1, 1, 1],
                   'A2': [1, 2, 3, 4, 5],
                   'B': [2, 2, 3, 3, 2]})
df.to_csv('my_test.csv', index=False)

# function to write csv to .db file
def pandas_csv_to_db_converter(csvfile, table_name):
    df = pd.read_csv(csvfile)
    conn = sqlite3.connect(table_name)
    df.to_sql(table_name, conn, if_exists='append', index=False)
    return

pandas_csv_to_db_converter('my_test.csv', 'my_test.db')

So far, so good: I have a CSV and a .db file of similar sizes. Now I want to read the table back in to make sure it's all OK.

def pandas_db_reader(db_file):
    conn = sqlite3.connect(db_file)
    c = conn.cursor()
    c.execute("SELECT name FROM sqlite_master WHERE type='table';")
    names = [tup[0] for tup in c.fetchall()]
    print(names)

    table = pd.read_sql_query("SELECT * from {}".format(db_file), conn)
    print(table)
    conn.close()
    return

out_db = 'my_test.db'
pandas_db_reader(out_db)

The string 'my_test.db' is printed in the list of table names on the console, suggesting that it is a legitimate table, but then the code throws an error:

    cur.execute(*args)
sqlite3.OperationalError: no such table: my_test.db

During handling of the above exception, another exception occurred:

    cur.execute(*args)
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT * from my_test.db': no such table: my_test.db

It's not very intuitive why this error is occurring. I've looked at quite a lot of tutorials and Stack Overflow questions on how to do this, but it still doesn't seem to run. I don't have much experience using SQL, so any help would be really appreciated! Thanks in advance!

EDIT: typo corrected:
pandas_csv_to_db_converter('my_test.csv', my_test_db.db) 

is now

pandas_csv_to_db_converter('my_test.csv', 'my_test.db')
  • why not just do pandas.DataFrame.to_sql? Commented Jun 24, 2019 at 14:42
  • basically since you already have the dataframe, you can push it into sqlite tables like this: engine = sqlalchemy.create_engine('sqlite:///my_db.sqlite') and then df.to_sql('Resolved', engine, if_exists='append') (a runnable sketch follows this comment list). Commented Jun 24, 2019 at 14:44
  • Thanks for your help; the CSV comes from elsewhere, I was just making some test data for a working example. Commented Jun 24, 2019 at 15:03
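
For reference, a runnable version of the SQLAlchemy approach suggested in the comment above, as a minimal sketch; the file name my_db.sqlite and table name 'Resolved' are just the placeholders from that comment:

import pandas as pd
import sqlalchemy

df = pd.DataFrame({'A': [1, 2, 3]})

# 'sqlite:///my_db.sqlite' creates/opens the file my_db.sqlite in the working directory
engine = sqlalchemy.create_engine('sqlite:///my_db.sqlite')

# write the frame to a table named 'Resolved', appending if it already exists
df.to_sql('Resolved', engine, if_exists='append', index=False)

# read it back to confirm the write
print(pd.read_sql_query('SELECT * FROM Resolved', engine))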

1 Answer


You're passing db_file as the table name in pandas_db_reader(). You need to pass the correct TABLE_NAME to the SQL query below:

table = pd.read_sql_query("SELECT * from {}".format(TABLE_NAME), conn)

my_test.db is not a valid unquoted table name: SQLite parses the dot as a database.table separator, so the query goes looking for a table db in an attached database called my_test. Change the table name value in pandas_csv_to_db_converter and pandas_db_reader, i.e.

...
df.to_sql("test", conn, if_exists='append', index=False)
...

and

...
table = pd.read_sql_query("SELECT * from {}".format("test"), conn)
...
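
Putting it together, a minimal end-to-end sketch of the corrected workflow, keeping the database file name and the table name as separate arguments (my_test.db and test here, following the question and the fix above):

import pandas as pd
import sqlite3

# write the csv into a named table inside the .db file
def pandas_csv_to_db_converter(csvfile, db_file, table_name):
    # db_file is the SQLite file on disk; table_name is the table inside it
    df = pd.read_csv(csvfile)
    conn = sqlite3.connect(db_file)
    df.to_sql(table_name, conn, if_exists='append', index=False)
    conn.close()

# read the named table back out of the .db file
def pandas_db_reader(db_file, table_name):
    conn = sqlite3.connect(db_file)
    table = pd.read_sql_query("SELECT * from {}".format(table_name), conn)
    print(table)
    conn.close()

# assumes my_test.csv from the question exists in the working directory
pandas_csv_to_db_converter('my_test.csv', 'my_test.db', 'test')
pandas_db_reader('my_test.db', 'test')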

6 Comments

How would I find out what the TABLE_NAME is? This is basically what I thought, which is why I put some lines into 'pandas_db_reader' to find out the names. I guess the list 'names' isn't giving me what I thought it was!
FYI I already tried: c.execute("SELECT * FROM INFORMATION_SCHEMA.TABLES ") and c.execute("SELECT * FROM sys.Tables ")
Your table name is defined here: pandas_csv_to_db_converter('my_test.csv', my_test_db.db). Where is the my_test_db object from? Note: my_test_db.db is not being passed as a string.
Ah, sorry, my_test_db.db is a typo on Stack Overflow; it should be 'my_test.db' everywhere. I'll edit the question for consistency. It's correct in my program and still fails to run.
Thanks for your help, this test example now works. I think the problem was starting in 'pandas_csv_to_db_converter': I was passing in full file names. If I move into the working directory and run the whole workflow from there, with just the filename and no full path, it works fine! It looks like it was using the full file path as the table name, which is why I couldn't recover the table in 'pandas_db_reader' (a short demonstration of this failure mode follows these comments); either that, or maybe it was saving to working memory rather than to a file, which gets deleted when the function has finished running. Anyway, many thanks!
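
For anyone hitting the same failure mode, a small sketch (using an in-memory database) of what seems to have happened: df.to_sql() quotes the table name when creating the table, so a dotted name like 'my_test.db' (or a full file path) is accepted on write, but the same name then fails in an unquoted SELECT:

import pandas as pd
import sqlite3

df = pd.DataFrame({'A': [1, 2, 3]})
conn = sqlite3.connect(':memory:')

# to_sql quotes the identifier, so even a dotted name is accepted as a table name
df.to_sql('my_test.db', conn, index=False)

# unquoted, SQLite parses the dot as database.table and raises
# sqlite3.OperationalError: no such table: my_test.db
# pd.read_sql_query('SELECT * FROM my_test.db', conn)

# quoting the identifier recovers the table
print(pd.read_sql_query('SELECT * FROM "my_test.db"', conn))
conn.close()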