How to manually sort columns in multi index dataframe?

Question

I have a multi-index DataFrame and I want to sort its columns (axis=1) in a custom order rather than alphabetically. I used unstack to produce the multi-index columns and sort_index to sort them:

df = df.unstack().swaplevel(1,0, axis=1).sort_index(axis=1, level=0)

I want the metric columns to appear in an order I choose rather than alphabetically, for example metric2, metric3, metric1 under both chair and table (and under any further groups).

dim3          chair                     table
            metric1 metric2 metric3 metric1 metric2 metric3
dim1 dim2
a    day1       1.0    10.0   123.0     NaN     NaN     NaN
b    day2       NaN     NaN     NaN     2.0    20.0   456.0

Please ignore the nulls; this is only an example.

Quickbeam2k1 · Accepted Answer · 2019-09-01 20:01:21Z

Adapting from the pandas documentation

import pandas as pd
import numpy as np
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second']) 
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
df
first        bar                 baz                 foo                 qux  \
second       one       two       one       two       one       two       one   
A       0.033707  0.681401 -0.999368 -0.015942 -0.417583 -0.233212 -0.072706   
B       1.140347 -0.759089 -0.278175 -0.848010 -0.642824 -0.902858  0.117839   
C      -0.370039 -0.425074 -0.404409 -1.090386 -0.985019 -0.971178  0.924350   

first             
second       two  
A      -0.850698  
B       0.377443  
C      -1.129125

Now check the column tuples:

df.columns.tolist()
[('bar', 'one'),
 ('bar', 'two'),
 ('baz', 'one'),
 ('baz', 'two'),
 ('foo', 'one'),
 ('foo', 'two'),
 ('qux', 'one'),
 ('qux', 'two')]

Rearrange the tuples to your liking and pass them to .loc:

df.loc[:, [('bar', 'one'),
           ('baz', 'one'),
           ('bar', 'two'),
           ('foo', 'one'),
           ('foo', 'two'),
           ('qux', 'two'),
           ('baz', 'two'),
           ('qux', 'one')]]
first        bar       baz       bar       foo                 qux       baz  \
second       one       one       two       one       two       two       two   
A       0.033707 -0.999368  0.681401 -0.417583 -0.233212 -0.850698 -0.015942   
B       1.140347 -0.278175 -0.759089 -0.642824 -0.902858  0.377443 -0.848010   
C      -0.370039 -0.404409 -0.425074 -0.985019 -0.971178 -1.129125 -1.090386   

first        qux  
second       one  
A      -0.072706  
B       0.117839  
C       0.924350

This approach should give you the maximum amount of control.
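
If typing every tuple by hand gets tedious, the list passed to .loc can also be generated. The following is only a sketch under assumptions: the small frame and the `second_order` mapping are made up for illustration, and the sort key keeps the first level alphabetical while applying a custom rank to the second level.

```python
import numpy as np
import pandas as pd

# Small deterministic frame with two column levels (illustrative data).
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(8).reshape(2, 4), index=["A", "B"], columns=columns)

# Hypothetical custom rank for the second level: "two" before "one".
second_order = {"two": 0, "one": 1}

# Sort the existing column tuples: first level alphabetically,
# second level by the custom rank.
new_columns = sorted(
    df.columns.tolist(),
    key=lambda t: (t[0], second_order[t[1]]),
)

# Select the columns in the new order, exactly as in the manual version.
reordered = df.loc[:, new_columns]
print(reordered.columns.tolist())
```

Any ordering rule that can be expressed as a sort key can be plugged in here; the .loc call itself is unchanged.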

Adapting this approach to your data frame, it looks like this:

df = df.unstack().swaplevel(1,0, axis=1).loc[:, [('chair', 'metric2'),
        ('chair', 'metric3'), ('chair', 'metric1'),('table', 'metric2'),
        ('table', 'metric3'), ('table', 'metric1')]]
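
When there are more groups than chair and table, the target columns can be built instead of listed. The sketch below is an assumption about what the original long-form data looks like (the `long_df` frame is reconstructed from the example table in the question, not taken from it); it combines every group found in the first column level with the desired metric order via `pd.MultiIndex.from_product` and then uses `reindex`.

```python
import pandas as pd

# Assumed long-form input, reconstructed from the question's example table.
long_df = pd.DataFrame(
    {
        "dim1": ["a", "b"],
        "dim2": ["day1", "day2"],
        "dim3": ["chair", "table"],
        "metric1": [1.0, 2.0],
        "metric2": [10.0, 20.0],
        "metric3": [123.0, 456.0],
    }
).set_index(["dim1", "dim2", "dim3"])

# Same reshaping as in the question: dim3 moves to the columns,
# then the two column levels are swapped so dim3 is on top.
wide = long_df.unstack().swaplevel(1, 0, axis=1)

# Desired metric order within every group.
metric_order = ["metric2", "metric3", "metric1"]

# Groups in order of first appearance in the top column level.
groups = wide.columns.get_level_values(0).unique()

# Every (group, metric) pair in the desired order.
target = pd.MultiIndex.from_product(
    [groups, metric_order], names=wide.columns.names
)

# reindex inserts all-NaN columns for pairs that do not exist;
# .loc[:, target] would raise a KeyError in that case instead.
result = wide.reindex(columns=target)
print(result)
```

Whether `reindex` or `.loc` is the better fit depends on whether a missing (group, metric) pair should be filled with NaN or treated as an error.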
