Say I have two dataframes, df and df2, that share the same index labels. df is sorted in the order that I want df2 to be sorted.

import pandas as pd

df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado'], columns=['A', 'B', 'C'], data=[[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(df)

            A  B  C
Arizona     1  2  3
New Mexico  4  5  6
Colorado    7  8  9


df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'], columns=['D'], data=['Orange', 'Blue', 'Green'])
print(df2)
                 D
Arizona     Orange
Colorado      Blue
New Mexico   Green

What is the best / most efficient way of sorting the second dataframe by the index of the first?

One option is to join df2 onto df, which keeps df's row order, and then select only the columns that came from df2:

df.join(df2)[['D']]

                 D
Arizona     Orange
New Mexico   Green
Colorado      Blue

Is there a more elegant way of doing this?

Thanks!

1 Answer

reindex would work. Be aware that it will create missing values (NaN) for index values that are in df but not in df2, and that it will drop rows of df2 whose index values are not in df.

In [18]: df2.reindex(df.index)
Out[18]: 
                 D
Arizona     Orange
New Mexico   Green
Colorado      Blue
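
To make the caveat about missing values concrete, here is a minimal sketch. The extra 'Utah' row is a hypothetical addition that is not part of the original data, and the names `reordered`, `common`, and `strict` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    index=['Arizona', 'New Mexico', 'Colorado', 'Utah'],  # 'Utah' is hypothetical
    columns=['A', 'B', 'C'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)
df2 = pd.DataFrame(
    index=['Arizona', 'Colorado', 'New Mexico'],
    columns=['D'],
    data=['Orange', 'Blue', 'Green'],
)

# reindex follows df's order; 'Utah' is absent from df2, so its row is NaN
reordered = df2.reindex(df.index)
print(reordered)

# To avoid NaN rows, keep only the labels of df that also exist in df2
common = df.index[df.index.isin(df2.index)]
strict = df2.loc[common]
print(strict)
```

Filtering the target index first keeps df's ordering without introducing missing values. Whether that or the NaN rows is preferable depends on what the downstream code expects.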

2 Comments

Thank you, this is exactly what I was looking for.
I tried this on two dataframes that share the same MultiIndex, and it's not working.
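
The comment above does not include the failing code, so the cause is unknown. For comparison, here is a small sketch with hypothetical two-level data (the 'region' level and the names `left` and `right` are assumptions, not from the original post) in which reindex reorders a frame by another frame's MultiIndex, assuming a reasonably recent pandas version.

```python
import pandas as pd

# Hypothetical two-level index; 'region' is invented for illustration
left = pd.DataFrame(
    {'A': [1, 4, 7]},
    index=pd.MultiIndex.from_tuples(
        [('West', 'Arizona'), ('West', 'New Mexico'), ('West', 'Colorado')],
        names=['region', 'state'],
    ),
)
right = pd.DataFrame(
    {'D': ['Orange', 'Blue', 'Green']},
    index=pd.MultiIndex.from_tuples(
        [('West', 'Arizona'), ('West', 'Colorado'), ('West', 'New Mexico')],
        names=['region', 'state'],
    ),
)

# Reorder right so its rows follow the order of left's MultiIndex
reordered = right.reindex(left.index)
print(reordered)
```

If a similar call returns only NaN, one thing worth checking is whether the two indexes really hold identical labels, for example the same level order and the same dtypes in each level.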
