1

I have this multi level dataframe

import pandas as pd
sites = pd.Index(['CCC', 'RRR','TTT', 'SSS'], name='SITELIST')
vvv = pd.DataFrame({'KK':[1,2,3,4],'GG':[4,3,4,5], 'RR':[1,5,1,8]}, index = sites)
ttt = pd.DataFrame({'KK':[11,22,33,34],'GG':[23,33,34,35], 'RR':[1,2,2,1]}, index = sites)
df = pd.concat([vvv, ttt], axis=0, keys=["VVV", "TTT"], names=['Sensor', 'Site'])
df = df.swaplevel().sort_index(level=0).reindex(['VVV','TTT'],level=1)
print(df)

dataframe :

             KK  GG  RR
Site Sensor            
CCC  VVV      1   4   1
     TTT     11  23   1
RRR  VVV      2   3   5
     TTT     22  33   2
SSS  VVV      4   5   8
     TTT     34  35   1
TTT  VVV      3   4   1
     TTT     33  34   2

How do i sort the dataframe index level0 'Site' using each value on index level1 'VVV' at column RR The expected result as such:

             KK  GG  RR
Site Sensor            
CCC  VVV      1   4   1 <-
     TTT     11  23   1
TTT  VVV      3   4   1 <-
     TTT     33  34   2
RRR  VVV      2   3   5 <-
     TTT     22  33   2
SSS  VVV      4   5   8 <-
     TTT     34  35   1

I have tried to sort with

print(df.sort_values('RR'))

But get unexpected

             KK  GG  RR
Site Sensor            
CCC  VVV      1   4   1
     TTT     11  23   1
SSS  TTT     34  35   1
TTT  VVV      3   4   1
RRR  TTT     22  33   2
TTT  TTT     33  34   2
RRR  VVV      2   3   5
SSS  VVV      4   5   8

Thank you.

1
  • I need the index level0 which is 'Site' is sorted by the value of col RR but on;y the one in index level1 VVV (or each row VVV). If I just sort_value the col RR; then value on row TTT will also be sorted, thus create something such as: print(df.sort_values('RR')) Commented Sep 20, 2020 at 18:31

1 Answer 1

2

IIUC you can first reset_index to sort and get the correct order of site:

df = (df.reset_index().sort_values(["Sensor", "RR", "Site"], ascending=(False, True, True)))

   Site Sensor  KK  GG  RR
0  CCC    VVV   1   4   1
6  TTT    VVV   3   4   1
2  RRR    VVV   2   3   5
4  SSS    VVV   4   5   8
1  CCC    TTT  11  23   1
5  SSS    TTT  34  35   1
3  RRR    TTT  22  33   2
7  TTT    TTT  33  34   2

From here using the result to construct an ordered pd.Categorical so you can obtain the final result:

df["Site"] = pd.Categorical(df["Site"], categories=df["Site"].unique(), ordered=True)

print (df.sort_values("Site").set_index(["Site","Sensor"]))

             KK  GG  RR
Site Sensor            
CCC  VVV      1   4   1
     TTT     11  23   1
TTT  VVV      3   4   1
     TTT     33  34   2
RRR  VVV      2   3   5
     TTT     22  33   2
SSS  VVV      4   5   8
     TTT     34  35   1
Sign up to request clarification or add additional context in comments.

1 Comment

I was thinking it should just be as df.sort_index('Site', by=[index['Sensor'], col['RR']]) should be more intuitive :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.