I obtained a MultiIndex in pandas by running series.describe() on a grouped DataFrame. How can I sort these series by each modelName's mean and keep only specific fields? This

summary.sortlevel(1)['kappa']

sorts them but retains all the other fields, such as count. How can I keep only mean and std?

edit

This is a textual representation of the df:

                                             kappa
modelName                                         
biasTotal                          count  5.000000
                                   mean   0.526183
                                   std    0.013429
                                   min    0.507536
                                   25%    0.519706
                                   50%    0.525565
                                   75%    0.538931
                                   max    0.539175
biasTotalWithDistanceMetricAccount count  5.000000
                                   mean   0.527275
                                   std    0.014218
                                   min    0.506428
                                   25%    0.520438
                                   50%    0.529771
                                   75%    0.538475
                                   max    0.541262
lightGBMbiasTotal                  count  5.000000
                                   mean   0.531639
                                   std    0.013819
                                   min    0.513363

1 Answer

You can do it this way:

Data:

In [77]: df
Out[77]:
                        0
level_1 level_0
a       25%      2.000000
        50%      4.000000
        75%      7.000000
        count    5.000000
        max      7.000000
        mean     4.400000
        min      2.000000
        std      2.509980
b       25%      2.000000
        50%      6.000000
        75%      8.000000
        count    5.000000
        max      8.000000
        mean     5.000000
        min      1.000000
        std      3.316625
c       25%      3.000000
        50%      4.000000
        75%      5.000000
        count    5.000000
        max      8.000000
        mean     4.000000
        min      0.000000
        std      2.915476
d       25%      4.000000
        50%      8.000000
        75%      8.000000
        count    5.000000
        max      9.000000
        mean     6.000000
        min      1.000000
        std      3.391165

Solution:

In [78]: df.loc[pd.IndexSlice[:, ['mean','std']], :]
Out[78]:
                        0
level_1 level_0
a       mean     4.400000
        std      2.509980
b       mean     5.000000
        std      3.316625
c       mean     4.000000
        std      2.915476
d       mean     6.000000
        std      3.391165

Setup:

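import numpy as np
import pandas as pd

# Random 5x4 integer frame; describe() yields one row per statistic.
df = (pd.DataFrame(np.random.randint(0, 10, (5, 4)), columns=list('abcd'))
        .describe()
        .stack()
        .reset_index()  # columns: level_0 (statistic), level_1 (column name), 0 (value)
        .set_index(['level_1', 'level_0'])
        .sort_index()
)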
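
The same selection can also be written as a boolean mask on the second index level, which may read more plainly than pd.IndexSlice. This is only a sketch: the frame below is made up to resemble the question's summary, and the model names, the 'stat' level name and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data shaped like the question's summary:
# a (modelName, statistic) MultiIndex with a single 'kappa' column.
index = pd.MultiIndex.from_product(
    [["modelA", "modelB"], ["count", "mean", "std", "min"]],
    names=["modelName", "stat"],
)
summary = pd.DataFrame(
    {"kappa": [5.0, 0.53, 0.01, 0.51, 5.0, 0.52, 0.02, 0.50]},
    index=index,
)

# Keep only the rows whose second index level is 'mean' or 'std'.
mask = summary.index.get_level_values(1).isin(["mean", "std"])
selected = summary[mask]
print(selected)
```

The mask keeps the rows in their existing order, so each model's mean and std stay next to each other.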
8 Comments

When I add .sortlevel(1) to your df, the whole df is sorted, but what I would rather achieve is that only the mean is used for sorting.
@GeorgHeiler, can you post your DF in text form (for example, the output of print(summary)) so I can reproduce it?
@MaU sure, please see the edit. As you can see, my df's means are not ordered by default like the ones in your example. I would like to order by the mean but preserve the "stackedness", e.g. the std that goes with the respective mean.
@GeorgHeiler, I'm afraid you either have to sort your index (all levels) or use df.reset_index() and work with it as a normal (single-level indexed) DF.
I see. But a reset index produces 2 records per group, e.g. one for mean and one for std, labelled in a separate column called level_1. How can I sort only by the mean value but keep the relationship between these 2 rows, e.g. largest mean, accompanying variance, next mean with next variance, ...?
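
One possible way to get the ordering asked for in the last comment is to move the statistics into columns, sort the models by their mean, and then stack again. This is a sketch, not something proposed in the thread: the data, the model names and the 'stat' level name below are invented for illustration.

```python
import pandas as pd

# Hypothetical summary with a (modelName, statistic) MultiIndex.
index = pd.MultiIndex.from_product(
    [["modelA", "modelB", "modelC"], ["count", "mean", "std"]],
    names=["modelName", "stat"],
)
summary = pd.DataFrame(
    {"kappa": [5.0, 0.526, 0.013, 5.0, 0.532, 0.014, 5.0, 0.527, 0.014]},
    index=index,
)

# Pivot the statistic level into columns: one row per model.
wide = summary["kappa"].unstack()

# Order the models by mean (largest first) and keep only mean and std.
ordered = wide.sort_values("mean", ascending=False)[["mean", "std"]]

# Optional: return to the stacked layout; each std stays with its mean.
stacked = ordered.stack()
print(stacked)
```

The wide form (`ordered`) is often easier to work with directly; stacking again is only needed if the original long layout has to be kept.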