I have a DataFrame with MultiIndex columns, and I want to sort the columns (axis=1) in a custom order rather than alphabetically. I used unstack to produce the MultiIndex columns and sort_index to sort them:

df = df.unstack().swaplevel(1,0, axis=1).sort_index(axis=1, level=0)

I want the metric columns sorted in an order I choose rather than alphabetically, for example metric2, metric3, metric1 under both chair and table (and under any other top-level groups).

dim3          chair                    table
              metric1 metric2 metric3  metric1 metric2 metric3
dim1 dim2
a    day1     1.0     10.0    123.0    NaN     NaN     NaN
b    day2     NaN     NaN     NaN      2.0     20.0    456.0

Please ignore the NaN values; this is only an example.

1 Answer

Adapting an example from the pandas documentation:

import pandas as pd
import numpy as np
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second']) 
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
df
first        bar                 baz                 foo                 qux  \
second       one       two       one       two       one       two       one   
A       0.033707  0.681401 -0.999368 -0.015942 -0.417583 -0.233212 -0.072706   
B       1.140347 -0.759089 -0.278175 -0.848010 -0.642824 -0.902858  0.117839   
C      -0.370039 -0.425074 -0.404409 -1.090386 -0.985019 -0.971178  0.924350   

first             
second       two  
A      -0.850698  
B       0.377443  
C      -1.129125  

Now check the current column order:

df.columns.tolist()
[('bar', 'one'),
 ('bar', 'two'),
 ('baz', 'one'),
 ('baz', 'two'),
 ('foo', 'one'),
 ('foo', 'two'),
 ('qux', 'one'),
 ('qux', 'two')]

Rearrange the tuples to your liking and pass the list to .loc:

df.loc[:, [('bar', 'one'),
           ('baz', 'one'),
           ('bar', 'two'),
           ('foo', 'one'),
           ('foo', 'two'),
           ('qux', 'two'),
           ('baz', 'two'),
           ('qux', 'one')]]
first        bar       baz       bar       foo                 qux       baz  \
second       one       one       two       one       two       two       two   
A       0.033707 -0.999368  0.681401 -0.417583 -0.233212 -0.850698 -0.015942   
B       1.140347 -0.278175 -0.759089 -0.642824 -0.902858  0.377443 -0.848010   
C      -0.370039 -0.404409 -0.425074 -0.985019 -0.971178 -1.129125 -1.090386   

first        qux  
second       one  
A      -0.072706  
B       0.117839  
C       0.924350

This approach should give you the maximum amount of control.
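
If typing every tuple by hand becomes tedious, the same list can be generated. The following is only a sketch: the small frame and the `second_rank` mapping are made up for illustration, and it assumes every inner label appears in that mapping.

```python
import numpy as np
import pandas as pd

# Small stand-in frame with two column levels (illustrative data only).
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12).reshape(3, 4), index=["A", "B", "C"], columns=columns)

# Hypothetical custom order for the inner level: "two" before "one".
second_rank = {"two": 0, "one": 1}

# Sort the existing column tuples: outer level alphabetically,
# inner level by the custom rank.
ordered = sorted(df.columns.tolist(), key=lambda t: (t[0], second_rank[t[1]]))

# Select the columns in the generated order, exactly as with a hand-written list.
reordered = df.loc[:, ordered]
print(reordered.columns.tolist())
```

Because the list is derived from df.columns.tolist(), it only ever contains columns that exist, so .loc will not raise on a missing key.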

Adapting this approach to your data frame, it looks like this:

df = df.unstack().swaplevel(1,0, axis=1).loc[:, [('chair', 'metric2'),
        ('chair', 'metric3'), ('chair', 'metric1'),('table', 'metric2'),
        ('table', 'metric3'), ('table', 'metric1')]]
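
When there are more top-level groups than chair and table, one possible variation is to build the target columns with pd.MultiIndex.from_product and pass them to reindex. This is a sketch under assumptions: `long_df` is a guess at what the frame looked like before unstack, and `metric_order` is a name introduced here for the desired order.

```python
import pandas as pd

# Assumed shape of the data before unstacking (reconstructed from the question's output).
long_df = pd.DataFrame({
    "dim1": ["a", "b"],
    "dim2": ["day1", "day2"],
    "dim3": ["chair", "table"],
    "metric1": [1.0, 2.0],
    "metric2": [10.0, 20.0],
    "metric3": [123.0, 456.0],
})

# Same pipeline as in the question, without the alphabetical sort_index.
wide = long_df.set_index(["dim1", "dim2", "dim3"]).unstack().swaplevel(1, 0, axis=1)

# Desired order of the inner (metric) level.
metric_order = ["metric2", "metric3", "metric1"]

# Every top-level group (chair, table, ...) paired with the metrics in the custom order.
outer = wide.columns.get_level_values(0).unique()
target = pd.MultiIndex.from_product(
    [outer, metric_order], names=list(wide.columns.names)
)

result = wide.reindex(columns=target)
print(result.columns.tolist())
```

Note that reindex does not raise for a combination that is absent from the frame; it adds that column filled with NaN, so .loc with explicit tuples is the stricter choice if a missing column should be an error.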