29

enter image description here

Say I have two dataframes, df1 and df2 in the picture above. I want to sort the df1 based on the Col2 of df2.

So the end result of the df1 should look like the third dataframe in the picture, where the Col2 is in the same values, order in df2.

2 Answers 2

48

You can use combination of set_index and reindex for this.

Try this code :

df1 = pd.DataFrame({'Col1': ['a','b','c','d','e'], 'Col2': 
['chocolate','chicken','pizza','icecream','cake'] })
Out :
  Col1       Col2
0    a  chocolate
1    b    chicken
2    c      pizza
3    d   icecream
4    e       cake
df2 = pd.DataFrame({'Col1': ['f','g','h','i','j'], 'Col2': ['chicken','cake','icecream','chocolate','pizza'] })
Out :
  Col1       Col2
0    f    chicken
1    g       cake
2    h   icecream
3    i  chocolate
4    j      pizza
df1 = df1.set_index('Col2')
df1 = df1.reindex(index=df2['Col2'])
df1 = df1.reset_index()
Out :
        Col2 Col1
0    chicken    b
1       cake    e
2   icecream    d
3  chocolate    a
4      pizza    c
Sign up to request clarification or add additional context in comments.

3 Comments

Your suggestion also works to sorting a Dataframe with an external list. It solved my problem! Thank you!
Note that this will note work if the values are not unique: "cannot reindex on an axis with duplicate labels"
@Brndn Please try the solution I posted.
0

Please try this:

import pandas as pd

# Assume we have the following two dataframes:
df1 = pd.DataFrame({'Col1': ['a','b','c','d','e', 'f'], 
                    'Col2': ['chocolate','chicken','pizza','icecream','cake', 'pizza'],
                    'Col3': [1, 2, 3, 4, 5, 6],
                    'Col4': ['x', 'y', 'z', 'xx', 'yy', 'zz']})
df2 = pd.DataFrame({'Col1': ['f','g','h','i','j'], 
                    'Col2': ['chicken','cake','icecream','chocolate','pizza']})

# Create a dictionary from df2 with 'Col2' as keys and 'Col1' as values
order_dict = df2.set_index('Col2')['Col1'].to_dict()

# Create a new column in df1 that maps the values in 'Col2' to the values in the dictionary
df1['sort_column'] = df1['Col2'].map(order_dict)

# Sort df1 based on the new column, then drop the new column
df1 = df1.sort_values('sort_column').drop(columns='sort_column')

print(df1)

Out:

  Col1       Col2  Col3 Col4
1    b    chicken     2    y
4    e       cake     5   yy
3    d   icecream     4   xx
0    a  chocolate     1    x
2    c      pizza     3    z
5    f      pizza     6   zz

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.