Sorting a multi-index while respecting its index structure

Question

How can I sort a multi-index dataframe while respecting the organization of levels?

E.g. given the following df, say we sort it according to C (e.g. in descending order):

                   C         D  E
A    B                           
bar  one   -0.346528  1.528538  1
     three -0.136710 -0.147842  1
flux six    0.795641 -1.610137  1
     three  1.051926 -1.316725  2
foo  five   0.906627  0.717922  0
     one   -0.152901 -0.043107  2
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1

We should get:

                   C         D  E
A    B                           
bar  three -0.136710 -0.147842  1
     one   -0.346528  1.528538  1
flux three  1.051926 -1.316725  2
     six    0.795641 -1.610137  1
foo  five   0.906627  0.717922  0
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1
     one   -0.152901 -0.043107  2

Note that what I mean by "respecting its index structure" is sorting the leaves of the dataframe without changing the ordering of the higher-level indices. In other words, I want to sort the rows within each first-level group while keeping the ordering of the first level untouched.

What about doing the same in ascending order?

I read these two threads (yes, with the same title):

but they sort the dataframes according to different criteria (e.g. index names, or a specific column in a group).

behzad.nouri · Accepted Answer · 2014-10-14 02:24:31Z

5

Use .reset_index, sort on columns A and C, and then set the index back. This will be more efficient than the earlier groupby solution:

>>> df.reset_index().sort(columns=['A', 'C'], ascending=[True, False]).set_index(['A', 'B'])
                C      D  E
A    B                     
bar  three -0.137 -0.148  1
     one   -0.347  1.529  1
flux three  1.052 -1.317  2
     six    0.796 -1.610  1
foo  five   0.907  0.718  0
     two    0.542 -0.373  2
     two    0.330  1.068  1
     one   -0.153 -0.043  2
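For readers on a recent pandas release, where `DataFrame.sort` is no longer available, a minimal sketch of the same idea with `sort_values` might look like the following. The frame is rebuilt from the question's example, and the helper name `sort_within_first_level` is made up for illustration. Passing `ascending=True` covers the ascending case asked about in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "three"), ("flux", "six"), ("flux", "three"),
     ("foo", "five"), ("foo", "one"), ("foo", "two"), ("foo", "two")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": [-0.346528, -0.136710, 0.795641, 1.051926,
              0.906627, -0.152901, 0.542137, 0.329831],
        "D": [1.528538, -0.147842, -1.610137, -1.316725,
              0.717922, -0.043107, -0.373016, 1.067820],
        "E": [1, 1, 1, 2, 0, 2, 2, 1],
    },
    index=index,
)


def sort_within_first_level(frame, column, ascending=False):
    """Hypothetical helper: sort rows by `column` inside each first-level group.

    The first level itself is sorted ascending, so this matches the original
    layout only when that level is already in sorted order.
    """
    levels = list(frame.index.names)
    return (
        frame.reset_index()
        .sort_values([levels[0], column], ascending=[True, ascending])
        .set_index(levels)
    )


desc = sort_within_first_level(df, "C", ascending=False)
asc = sort_within_first_level(df, "C", ascending=True)
print(desc)
print(asc)
```

Sorting on the first level as well as on C is what keeps the bar, flux and foo blocks together. Sorting on C alone would interleave rows from different groups.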

Earlier solution: .groupby(...).apply is relatively slow and may not scale very well:

>>> df['arg-sort'] = df.groupby(level='A')['C'].apply(pd.Series.argsort)
>>> f = lambda obj: obj.iloc[obj.loc[::-1, 'arg-sort'], :]
>>> df.groupby(level='A', group_keys=False).apply(f)
                C      D  E  arg-sort
A    B                               
bar  three -0.137 -0.148  1         1
     one   -0.347  1.529  1         0
flux three  1.052 -1.317  2         1
     six    0.796 -1.610  1         0
foo  five   0.907  0.718  0         1
     two    0.542 -0.373  2         2
     two    0.330  1.068  1         0
     one   -0.153 -0.043  2         3
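If the first level is not already in sorted order, sorting on A would rearrange it. A group-wise sketch that keeps the groups in order of first appearance, assuming a recent pandas release, is to sort each group and concatenate the pieces. It needs no helper `arg-sort` column. The name `sort_leaves` is hypothetical, and the rows of the example frame are deliberately rearranged so that foo comes first.

```python
import pandas as pd

# Same data as the question, restricted to columns C and E for brevity.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "three"), ("flux", "six"), ("flux", "three"),
     ("foo", "five"), ("foo", "one"), ("foo", "two"), ("foo", "two")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": [-0.346528, -0.136710, 0.795641, 1.051926,
              0.906627, -0.152901, 0.542137, 0.329831],
        "E": [1, 1, 1, 2, 0, 2, 2, 1],
    },
    index=index,
)


def sort_leaves(frame, column, level=0, ascending=False):
    """Hypothetical helper: sort each group of `level` by `column`.

    sort=False makes groupby yield the groups in order of first appearance,
    so the ordering of the outer level is left as it was.
    """
    pieces = [
        group.sort_values(column, ascending=ascending)
        for _, group in frame.groupby(level=level, sort=False)
    ]
    return pd.concat(pieces)


# Move the foo block to the front so the outer level is no longer sorted.
shuffled = df.iloc[[4, 5, 6, 7, 0, 1, 2, 3]]
result = sort_leaves(shuffled, "C")
print(result)
```

This is likely to be slower than the single `sort_values` call on large frames, since it sorts once per group, but it does not depend on the first level being sorted.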

edited Oct 14, 2014 at 2:24

answered Oct 14, 2014 at 2:16

behzad.nouri


2 Comments

Amelio Vazquez-Reina Over a year ago

Thanks. In your first solution, why do I need to sort by A, as well as C?

behzad.nouri Over a year ago

@user815423426 otherwise it would lose the ordering of the first level
