5

How can I sort a multi-index dataframe while respecting the organization of levels?

E.g. given the following df, say we sort it according to C (e.g. in descending order):

                   C         D  E
A    B                           
bar  one   -0.346528  1.528538  1
     three -0.136710 -0.147842  1
flux six    0.795641 -1.610137  1
     three  1.051926 -1.316725  2
foo  five   0.906627  0.717922  0
     one   -0.152901 -0.043107  2
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1

We should get:

                   C         D  E
A    B                           
bar  three -0.136710 -0.147842  1
     one   -0.346528  1.528538  1
flux three  1.051926 -1.316725  2
     six    0.795641 -1.610137  1
foo  five   0.906627  0.717922  0
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1
     two   -0.152901 -0.043107  2

Note that what I mean by "respecting its index structure" is sorting the leafs of the dataframe without changing the ordering of higher-level indices. In other words, I want to sort the second level while keeping the ordering of the the first level untouched.

What about doing the same in ascending order?

I read these two threads (yes, with the same title):

but they sort the dataframes according to a different criteria (e.g. index names, or a specific column in a group).

1 Answer 1

5

.reset_index, then sort based on columns A and C and then set the index back; This will be more efficient than the earlier groupby solution:

>>> df.reset_index().sort(columns=['A', 'C'], ascending=[True, False]).set_index(['A', 'B'])
                C      D  E
A    B                     
bar  three -0.137 -0.148  1
     one   -0.347  1.529  1
flux three  1.052 -1.317  2
     six    0.796 -1.610  1
foo  five   0.907  0.718  0
     two    0.542 -0.373  2
     two    0.330  1.068  1
     one   -0.153 -0.043  2

earlier solution: .groupby(...).apply is relatively slow, and may not scale very well:

>>> df['arg-sort'] = df.groupby(level='A')['C'].apply(pd.Series.argsort)
>>> f = lambda obj: obj.iloc[obj.loc[::-1, 'arg-sort'], :]
>>> df.groupby(level='A', group_keys=False).apply(f)
                C      D  E  arg-sort
A    B                               
bar  three -0.137 -0.148  1         1
     one   -0.347  1.529  1         0
flux three  1.052 -1.317  2         1
     six    0.796 -1.610  1         0
foo  five   0.907  0.718  0         1
     two    0.542 -0.373  2         2
     two    0.330  1.068  1         0
     one   -0.153 -0.043  2         3
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks. In your first solution, why do I need to sort by A, as well as C?
@user815423426 otherwise it would loose the ordering of first level

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.