2

I have the following data frame:

df = pd.DataFrame([['1','aa', 'fff'], ['1', 'aa', 'ggg'], ['1', 'aa', 'eee'],
           ['2','aa', 'eee'], ['2', 'aa', 'ggg'], ['2', 'aa', 'fff'],
           ['3','bb', 'hhh'], ['3', 'bb', 'mmm'], ['3', 'bb', 'kkk'],
           ['3', 'bb', 'jjj'], ['4','bb', 'kkk'], ['4', 'bb', 'mmm'],
           ['4', 'bb', 'hhh'], ['4', 'bb', 'jjj'], ['5','aa', 'ggg'],
           ['5', 'aa', 'eee'], ['5', 'aa', 'fff']], columns=['foo', 'bar','name_input'])

Now I need to sort the values in column "name_input" within each "foo" group, in an order that depends on the value of "bar":

  • for bar == 'aa', the order is ['eee', 'fff', 'ggg']
  • for bar == 'bb', the order is ['hhh', 'jjj', 'kkk', 'mmm']

In the end, I want my output to look as follows:

df = pd.DataFrame([['1','aa', 'eee'], ['1', 'aa', 'fff'], ['1', 'aa', 'ggg'],
           ['2','aa', 'eee'], ['2', 'aa', 'fff'], ['2', 'aa', 'ggg'],
           ['3','bb', 'hhh'], ['3', 'bb', 'jjj'], ['3', 'bb', 'kkk'],
           ['3', 'bb', 'mmm'], ['4','bb', 'hhh'], ['4', 'bb', 'jjj'],
           ['4', 'bb', 'kkk'], ['4', 'bb', 'mmm'], ['5','aa', 'eee'],
           ['5', 'aa', 'fff'], ['5', 'aa', 'ggg']], columns=['foo', 'bar','name_input'])

I tried to reorder the index by rows, but it doesn't seem to work:

df = df.pivot(index="foo", columns="bar", values="name_input")

Any help is much appreciated!

2 Answers

4

As far as I understand, you want to group by foo and bar, convert the column to be sorted into an ordered categorical, and then call sort_values:

d = {'aa': ['eee', 'fff', 'ggg'], 'bb': ['hhh', 'jjj', 'kkk', 'mmm']}  # desired order for each 'bar' value

# Each group key is a (foo, bar) tuple, so the order is looked up by its 'bar' part.
final = pd.concat(g.reset_index().assign(name_input =
            pd.Categorical(g.reset_index()['name_input'], d[bar], ordered=True))
           .sort_values('name_input') for (foo, bar), g in
           df.set_index('name_input').groupby(['foo', 'bar'])).reindex(df.columns, axis=1)

  foo bar name_input
2   1  aa        eee
0   1  aa        fff
1   1  aa        ggg
0   2  aa        eee
2   2  aa        fff
1   2  aa        ggg
0   3  bb        hhh
3   3  bb        jjj
2   3  bb        kkk
1   3  bb        mmm
2   4  bb        hhh
3   4  bb        jjj
0   4  bb        kkk
1   4  bb        mmm
1   5  aa        eee
2   5  aa        fff
0   5  aa        ggg
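
For comparison, here is a minimal sketch of the same idea without a per-group loop: turn the per-bar orders into a position lookup and sort on that. The names order, rank and the helper column _rank are made up for illustration and are not part of the answer above; the sketch also assumes every (bar, name_input) pair in the frame appears in the lookup.

```python
import pandas as pd

df = pd.DataFrame([['1', 'aa', 'fff'], ['1', 'aa', 'ggg'], ['1', 'aa', 'eee'],
                   ['2', 'aa', 'eee'], ['2', 'aa', 'ggg'], ['2', 'aa', 'fff'],
                   ['3', 'bb', 'hhh'], ['3', 'bb', 'mmm'], ['3', 'bb', 'kkk'],
                   ['3', 'bb', 'jjj'], ['4', 'bb', 'kkk'], ['4', 'bb', 'mmm'],
                   ['4', 'bb', 'hhh'], ['4', 'bb', 'jjj'], ['5', 'aa', 'ggg'],
                   ['5', 'aa', 'eee'], ['5', 'aa', 'fff']],
                  columns=['foo', 'bar', 'name_input'])

# Desired order of name_input for each value of bar (taken from the question).
order = {'aa': ['eee', 'fff', 'ggg'], 'bb': ['hhh', 'jjj', 'kkk', 'mmm']}

# (bar, name_input) -> position of that name within its bar's order.
rank = {(bar, name): pos
        for bar, names in order.items()
        for pos, name in enumerate(names)}

# Hypothetical helper column holding each row's position; a missing pair raises KeyError.
ranked = df.assign(_rank=[rank[key] for key in zip(df['bar'], df['name_input'])])

# Sort by foo first, then by the custom position, and drop the helper column.
result = (ranked.sort_values(['foo', '_rank'], kind='stable')
                .drop(columns='_rank')
                .reset_index(drop=True))
print(result)
```

Unlike the output above, this resets the index to 0..n-1, which makes it directly comparable to the expected frame in the question.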

2 Comments

Nice, I was thinking of converting the df to a dict and then doing a custom sort with sorted(), but this is better.
@Datanovice this follows similar logic, just a little different. Thanks :)
0

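Why not simply sort by all three columns?

from pandas.testing import assert_frame_equal  # pandas.util.testing is deprecated

# Plain sort; it matches the requested order only because that order is alphabetical.
dt = df.sort_values(by=['foo', 'bar', 'name_input']).reset_index(drop=True)

try:
    assert_frame_equal(dt, df)
    print("True")
except AssertionError:
    print("False")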

>>>True
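
One caveat worth checking: a plain sort_values works here because both requested orders happen to be alphabetical. A small sketch of what changes otherwise, using a made-up three-row frame (small) and a made-up non-alphabetical order (custom_order), neither of which comes from the question:

```python
import pandas as pd

# Hypothetical data: one foo/bar group with three names.
small = pd.DataFrame({'foo': ['1', '1', '1'],
                      'bar': ['aa', 'aa', 'aa'],
                      'name_input': ['fff', 'ggg', 'eee']})

# Hypothetical desired order that is NOT alphabetical.
custom_order = ['ggg', 'eee', 'fff']

# Plain sort: strings are compared alphabetically.
alphabetical = small.sort_values(['foo', 'bar', 'name_input'])['name_input'].tolist()

# Ordered categorical: rows are compared by position in custom_order.
as_cat = small.assign(name_input=pd.Categorical(small['name_input'],
                                                categories=custom_order,
                                                ordered=True))
custom = as_cat.sort_values(['foo', 'bar', 'name_input'])['name_input'].tolist()

print(alphabetical)  # ['eee', 'fff', 'ggg']
print(custom)        # ['ggg', 'eee', 'fff']
```

So if the required order ever stops being alphabetical, the categorical (or an explicit rank) is needed.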
