
I have a Python script that returns a pandas DataFrame, and I want to run it in a Jupyter notebook and save the result to a variable.

The data are in a file called data.csv. A shortened version of dataframe.py, the script whose results I want to access in my Jupyter notebook, is:

# dataframe.py
import pandas as pd
import sys

def return_dataframe(file):
    df = pd.read_csv(file)
    return df

if __name__ == '__main__':
    return_dataframe(sys.argv[1])

I tried running:

data = !python dataframe.py data.csv

in my Jupyter notebook, but data does not contain the DataFrame that dataframe.py is supposed to return.

1 Answer

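This is how I did it: the `!` capture in Jupyter only sees what the script prints to stdout, not what its functions return, so the script has to write the DataFrame to stdout as CSV: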

# dataframe.py 
import pandas as pd
import sys

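def return_dataframe(f):  # renamed from `file`, which shadowed a built-in in Python 2
    df = pd.read_csv(f)
    return df

if __name__ == '__main__':
    # Write the DataFrame to stdout as CSV so the notebook can capture it
    return_dataframe(sys.argv[1]).to_csv(sys.stdout, index=False)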

Then, in the notebook, convert the captured output, an 'IPython.utils.text.SList' holding one string per line, into a DataFrame, as shown in the comments on the question "Convert SList to Dataframe":

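import pandas as pd
# `data` is an SList: one string per line of the script's stdout (header line first).
data = !python3 dataframe.py data.csv
df = pd.DataFrame(data=data)[0].str.split(',', expand=True)  # all cells are strings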
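
The split-based conversion above keeps every value as a string and treats the CSV header line as an ordinary row. A possible variation, sketched here under the assumption that the script prints well-formed CSV, is to join the captured lines back into one string and let pd.read_csv parse it, which restores the header and infers numeric dtypes. The `lines` list is a hypothetical stand-in for the captured SList so that the snippet runs outside a notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for the SList returned by
# `data = !python3 dataframe.py data.csv`: one string per line of stdout.
lines = ["id,score,label", "1,0.5,x", "2,1.5,y"]

# Re-join the lines and parse them as CSV; the first line becomes the
# header and numeric columns get numeric dtypes again.
df = pd.read_csv(io.StringIO("\n".join(lines)))

print(df.dtypes)
```

With a real SList, the `data.n` attribute should give the same newline-joined string, so `pd.read_csv(io.StringIO(data.n))` would be the notebook equivalent.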

If the data ends up in CSV format anyway, you can skip the script and read the file directly in the notebook:

df = pd.read_csv('data.csv')
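
Another option, which is not part of the original answer, is to import the function instead of running the script through the shell. Because dataframe.py guards its command-line entry point with `if __name__ == '__main__'`, importing it should not trigger the script body. The sketch below recreates small stand-ins for data.csv and dataframe.py in a scratch directory (the column names and values are made up) so that it runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the question's two files in a scratch directory so this
# sketch is self-contained; the CSV contents are invented.
workdir = Path(tempfile.mkdtemp())
(workdir / "data.csv").write_text("id,score\n1,0.5\n2,1.5\n")
(workdir / "dataframe.py").write_text(
    "import pandas as pd\n"
    "\n"
    "def return_dataframe(f):\n"
    "    return pd.read_csv(f)\n"
)

# The sys.path change is only needed because of the scratch directory.
sys.path.insert(0, str(workdir))
dataframe = importlib.import_module("dataframe")

# The function's return value is a real DataFrame, dtypes included.
df = dataframe.return_dataframe(workdir / "data.csv")
print(df.dtypes)
```

In a notebook that sits in the same directory as dataframe.py, `from dataframe import return_dataframe` followed by `df = return_dataframe('data.csv')` would be enough.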

4 Comments

The real script is a rather long data wrangling and cleaning process; I just left in the bottom line, which returns the dataframe.
Makes sense. I hope the answer helps your process.
Any advice on how to keep the original dtypes of the different columns?
Instead of saving and reading from CSV you can use pickle: stackoverflow.com/a/51177054/42346 which will preserve dtypes.
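
The pickle suggestion in the last comment can be sketched as a round trip: the script would call to_pickle instead of to_csv, and the notebook would call pd.read_pickle. The column names, values, and the result.pkl file name below are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# A small frame with dtypes that a CSV round trip would lose.
df = pd.DataFrame({
    "id": [1, 2],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "kind": pd.Categorical(["a", "b"]),
})

# Hypothetical output path; the script would write here instead of stdout.
path = os.path.join(tempfile.mkdtemp(), "result.pkl")
df.to_pickle(path)

# In the notebook: load the frame back with its dtypes intact.
restored = pd.read_pickle(path)
print(restored.dtypes)
```

Pickle files can execute arbitrary code when loaded, so this only makes sense for files you produced yourself.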
