
I have a very large MultiIndex DataFrame with about 500 top-level columns, each of which has 2 sub-columns.

The DataFrame df looks like this:

                  B2                  B5             B3
bkt              A1      A2           A2      A1     Z2      C1
Date                                                                        
2019-06-11       0.8     0.2          -6.0    -0.8   -4.1    -0.6    
2019-06-12       0.8     0.2          -6.9    -1.6   -5.3    -1.2    

df.columns
MultiIndex(levels=[['B2', 'B5', 'B3', .....], ['A1', 'A2' ......]],
           labels=[[1, 1, ....], [1, 0, ....]],
           names=[None, 'bkt'])

I am trying to sort only the column names and keep the values as they are within each column, to get the following desired output:

                 B2                  B3             B5
bkt              A1      A2          C1      Z2     A1      A2
Date                                                                        
2019-06-11       ..
2019-06-12       ..

.. represents the values from the original dataframe. I just didn't retype them.

Setup

import pandas as pd

df = pd.DataFrame([
    [.8, .2, -6., -.8, -4.1, -.6],
    [.8, .2, -6.9, -1.6, -5.3, -1.2]
],
    pd.date_range('2019-06-11', periods=2, name='Date'),
    pd.MultiIndex.from_arrays([
        'B2 B2 B5 B5 B3 B3'.split(),
        'A1 A2 A2 A1 Z2 C1'.split()
    ], names=[None, 'bkt'])
)
    sort_index(level=0, axis=1)? Commented Jun 10, 2019 at 14:05

2 Answers


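Use sort_index to get the sorted column labels and assign them back to df.columns. This relabels the columns in place; the values do not move:

df.columns = df.sort_index(axis=1, level=[0, 1], ascending=[True, False]).columns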

As piRSquared points out, we do not need to create a copy of df; we can sort the columns directly:

df.columns = df.columns.sort_values(ascending=[True, False])

Comments

Do this to avoid creating a new dataframe and only focus on sorting the index itself. df.columns = df.columns.sort_values(ascending=[True, False])
@piRSquared that is true :-)
I am not sure @piRSquared's solution works, however. Your original solution does work.
Also, this only rearranges the column labels and does not move the column data along with them. To do that, it should be df = df.sort_index(axis=1, level=[0, 1], ascending=[True, False])
WARNING if you apply this: it moves the column headers but not the values inside! It only rearranges the headers without moving the values. That is usually not what you want.

This should be done using sort_index to move both the column names and data:

df.sort_index(axis=1, level=[0, 1], ascending=[True, False], inplace=True)
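
To make the difference between the two answers concrete, here is a small sketch that applies both approaches to the Setup data from the question. The names relabeled and moved are only illustrative, and the sketch uses a plain ascending sort on both levels (an assumption, chosen because it matches the header order in the desired output) rather than ascending=[True, False].

```python
import pandas as pd

# Same data as the Setup block in the question.
df = pd.DataFrame(
    [[.8, .2, -6., -.8, -4.1, -.6],
     [.8, .2, -6.9, -1.6, -5.3, -1.2]],
    index=pd.date_range('2019-06-11', periods=2, name='Date'),
    columns=pd.MultiIndex.from_arrays([
        'B2 B2 B5 B5 B3 B3'.split(),
        'A1 A2 A2 A1 Z2 C1'.split(),
    ], names=[None, 'bkt']),
)

# Approach 1: relabel only. The values stay where they are and
# simply receive the sorted headers.
relabeled = df.copy()
relabeled.columns = df.sort_index(axis=1).columns

# Approach 2: reorder. Each column keeps its own values and the
# whole column moves to its sorted position.
moved = df.sort_index(axis=1)

# Both frames end up with the same header order ...
print(list(relabeled.columns) == list(moved.columns))

# ... but the data under a given header differs.
print(relabeled[('B3', 'C1')].tolist())  # values originally under ('B5', 'A2')
print(moved[('B3', 'C1')].tolist())      # values originally under ('B3', 'C1')
```

If the goal is only to rename the headers, the first approach is enough; if each column's data should travel with its label, the second one is the one to use.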
