How to create a SQLAlchemy connection for pandas read_sql with sqlalchemy+pyodbc and multiple databases in MS SQL Server?

Question

I am trying to use 'pandas.read_sql_query' to copy data from MS SQL Server into a pandas DataFrame. I need to do multiple joins in my SQL query. The tables being joined are on the same server but in different databases. The query I am passing to pandas works fine inside MS SQL Server Management Studio. In a Jupyter Notebook I tried to query data like so (to make things readable the query itself is simplified to just 2 joins and generic names are used):

import pandas as pd
import sqlalchemy as sql
import pyodbc

server = '100.10.10.10'
driver = 'SQL+Server+Native+Client+11.0'
myQuery = '''SELECT first.Field1, second.Field2
           FROM db1.schema.Table1 AS first
           JOIN db2.schema.Table2 AS second
           ON first.Id = second.FirstId
           '''
engine = sql.create_engine('mssql+pyodbc://{}?driver={}'.format(server, driver))
df = pd.read_sql_query(myQuery, engine)

This does not work and returns an error:

DBAPIError: (pyodbc.Error) ('IM010', '[IM010] [Microsoft][ODBC Driver Manager] Data source name too long (0) (SQLDriverConnect)')

It seems that the problem is the engine, which does not include a database name, because everything works with the following code, where I include the database in the engine:

myQuery = 'select Field1 from schema.Table1'
db = 'db1'
engine = sql.create_engine('mssql+pyodbc://{}/{}?driver={}'.format(server, db, driver))
df = pd.read_sql_query(myQuery, engine)

However, it breaks in the same way as the join query above if I leave the database out of the engine and put it in the query instead:

myQuery = 'select Field1 from db1.schema.Table1'
engine = sql.create_engine('mssql+pyodbc://{}?driver={}'.format(server, driver))
df = pd.read_sql_query(myQuery, engine)

So how should I specify the pandas.read_sql_query 'sql' and 'con' parameters in this case when I need to join tables from different databases but the same server?

P.S. I only have read access to the server I am connecting to. I cannot create new tables, views, or anything like that.

Update: The MS SQL Server version is 2008 R2.

Update 2: I am using Python 3.6 and Windows 10.

Try SQL+Server+Native+Client+10.0 as a driver... Related question
– MaxU - stand with Ukraine, Commented Apr 25, 2017 at 16:05
@MaxU I just tried what you suggested. The result is the same. Moreover, this change of driver breaks the working code, which has the database specified in the engine. It gives a similar error, except that 'IM010' in the error message changes to 'IM002'.
– Sergey Zakharov, Commented Apr 26, 2017 at 6:49
@MaxU I also tried just SQL+Server. As a result, the working code worked, but the breaking code returned an error with 'IM002'.
– Sergey Zakharov, Commented Apr 26, 2017 at 7:17

Sergey Zakharov · Accepted Answer · 2018-02-13 08:55:18Z

So I have found a workaround: use pymssql instead of pyodbc (both in the import statement and in the engine). It lets you build your joins using database names and without specifying them in the engine. And there is no need to specify a driver in this case.

There might be a problem if you are using Python 3.6, which is not officially supported by pymssql yet, but you can find unofficial wheels for Python 3.6 here. They work as expected with my queries.

Here is the original code with joins, rebuilt to work with pymssql:

import pandas as pd
import sqlalchemy as sql
import pymssql

server = '100.10.10.10'
myQuery = '''SELECT first.Field1, second.Field2
           FROM db1.schema.Table1 AS first
           JOIN db2.schema.Table2 AS second
           ON first.Id = second.FirstId'''
engine = sql.create_engine('mssql+pymssql://{}'.format(server))
df = pd.read_sql_query(myQuery, engine)

As for the unofficial wheels, you need to download the file for Python 3.6 from the link I gave above, then cd to the download folder and run pip install followed by the name of the downloaded wheel file.

UPDATE:

Actually, it is possible to use pyodbc too. I am not sure if this should work for any SQL Server setup, but everything worked for me after I had set 'master' as my database in the engine. The resulting code would look like this:

import pandas as pd
import sqlalchemy as sql
import pyodbc

server = '100.10.10.10'
driver = 'SQL+Server'
db = 'master'
myQuery = '''SELECT first.Field1, second.Field2
           FROM db1.schema.Table1 AS first
           JOIN db2.schema.Table2 AS second
           ON first.Id = second.FirstId'''
engine = sql.create_engine('mssql+pyodbc://{}/{}?driver={}'.format(server, db, driver))
df = pd.read_sql_query(myQuery, engine)
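
The '+' signs in the driver name are just URL-encoded spaces. As a minimal sketch (the helper name mssql_pyodbc_url is hypothetical, and defaulting to 'master' is only the assumption that worked for me above), the engine URL can be built from the plain driver name with the standard library:

```python
from urllib.parse import quote_plus


def mssql_pyodbc_url(server, driver, database="master"):
    """Build an mssql+pyodbc URL string (hypothetical helper).

    The driver name is given as it appears in the ODBC driver list,
    e.g. 'SQL Server'; quote_plus turns its spaces into '+'.
    The 'master' default is an assumption, not a general rule.
    """
    return "mssql+pyodbc://{}/{}?driver={}".format(
        server, database, quote_plus(driver)
    )


url = mssql_pyodbc_url("100.10.10.10", "SQL Server")
print(url)  # mssql+pyodbc://100.10.10.10/master?driver=SQL+Server
```

The resulting string can be passed to sql.create_engine() in place of the hand-formatted URL, and the query can still reference db1 and db2 by their full names.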

Sachin Nikumbh · Answer · 2022-09-16 10:20:13Z

The following code works for me. I am using SQL Server with SQLAlchemy.

import pyodbc
import pandas as pd
cnxn = pyodbc.connect('DRIVER=ODBC Driver 17 for SQL Server;SERVER=your_db_server_id,your_db_server_port;DATABASE=pangard;UID=your_db_username;PWD=your_db_password')
query = "SELECT * FROM database.tablename;"
df = pd.read_sql(query, cnxn)
print(df)
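
The snippet above hands pandas a raw pyodbc connection. If the connection should go through SQLAlchemy instead, one option is the odbc_connect URL parameter of the mssql+pyodbc dialect, which accepts a URL-encoded ODBC connection string. The following is a rough sketch, not tested against a real server: the helper names odbc_connect_url and read_frame are hypothetical, and the connection string values are placeholders.

```python
from urllib.parse import quote_plus


def odbc_connect_url(odbc_string):
    """Wrap a raw ODBC connection string in an mssql+pyodbc URL (hypothetical helper)."""
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_string)


def read_frame(query, odbc_string):
    """Run a query through a SQLAlchemy engine and return a DataFrame.

    Imports are local so the URL helper can be used without pandas,
    SQLAlchemy, or pyodbc installed. Needs a reachable server to run.
    """
    import pandas as pd
    import sqlalchemy as sql

    engine = sql.create_engine(odbc_connect_url(odbc_string))
    return pd.read_sql_query(query, engine)


# Placeholder connection string; replace the values with your own.
conn_str = "DRIVER={SQL Server};SERVER=100.10.10.10;DATABASE=master"
print(odbc_connect_url(conn_str))
```

Calling read_frame(query, conn_str) would then return the DataFrame, assuming the driver named in the string is installed on the machine.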

answered Jan 31, 2022 at 14:57, edited Sep 16, 2022 at 10:20

It doesn't look like you're using sqlalchemy to me.
– Hendy, Commented over a year ago
