
I have a multi-indexed pandas dataframe that looks like this:

Antibody                 Time Repeats           
Akt                      0    1         1.988053
                              2         1.855905
                              3         1.416557
                         5    1         1.143599
                              2         1.151358
                              3         1.272172
                         10   1         1.765615
                              2         1.779330
                              3         1.752246
                         20   1         1.685807
                              2         1.688354
                              3         1.614013
                         .....        ....
                         0    4         2.111466
                              5         1.933589
                              6         1.336527
                         5    4         2.006936
                              5         2.040884
                              6         1.430818
                         10   4         1.398334
                              5         1.594028
                              6         1.684037
                         20   4         1.529750
                              5         1.721385
                              6         1.608393

(Note that I've only posted one antibody; there are many analogous entries under the Antibody index, and they all have the same format.) Although I've left out the entries in the middle to save space, you can see that I have 6 experimental repeats, but they are not ordered properly. My question is: how do I get the DataFrame to bring all the repeats together under each time point? The output would look something like this:

Antibody                 Time Repeats           
Akt                      0    1         1.988053
                              2         1.855905
                              3         1.416557
                              4         2.111466
                              5         1.933589
                              6         1.336527
                         5    1         1.143599
                              2         1.151358
                              3         1.272172
                              4         2.006936
                              5         2.040884
                              6         1.430818
                         10   1         1.765615
                              2         1.779330
                              3         1.752246
                              4         1.398334
                              5         1.594028
                              6         1.684037
                         20   1         1.685807
                              2         1.688354
                              3         1.614013
                              4         1.529750
                              5         1.721385
                              6         1.608393
                         .....        ....

Thanks in advance

  • Can you try df.sort_index(level=[0,1])? Commented Nov 28, 2016 at 11:55

1 Answer


I think you need sort_index:

df = df.sort_index(level=[0,1,2])
print (df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64

Or you can omit the level parameter, which sorts by all levels:

df = df.sort_index()
print (df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64
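
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds a small subset of the question's data in its original, unsorted order and then sorts it. The series name `col` is taken from the output above; the choice of rows and the variable names are only assumptions for illustration.

```python
import pandas as pd

# Subset of the question's data: repeats 1-2 come first for every time
# point, then repeats 4-5, so the index is not yet in sorted order.
rows = [
    ("Akt", 0, 1, 1.988053),
    ("Akt", 0, 2, 1.855905),
    ("Akt", 5, 1, 1.143599),
    ("Akt", 5, 2, 1.151358),
    ("Akt", 0, 4, 2.111466),
    ("Akt", 0, 5, 1.933589),
    ("Akt", 5, 4, 2.006936),
    ("Akt", 5, 5, 2.040884),
]
index = pd.MultiIndex.from_tuples(
    [r[:3] for r in rows], names=["Antibody", "Time", "Repeats"]
)
s = pd.Series([r[3] for r in rows], index=index, name="col")

# With no arguments, sort_index() sorts lexicographically by all levels,
# so the repeats end up grouped under each time point.
sorted_s = s.sort_index()
print(sorted_s)
```

On this sample, `s.sort_index(level=[0,1,2])` should give the same result, since it names every level explicitly.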

5 Comments

Hi jezrael, thanks for the response. This was my approach too, but I still get a split on level 0: the first three repeats are at the top of the df and the second three are in the middle.
How does it work if you sort by only the second and third levels? df = df.sort_index(level=[1,2])
I've just realized that my problem was with not having inplace=True so I was just returning the old frame. This is working now. Thanks for the help.
Yes, either assigning the result or using inplace=True is necessary.
Glad I could help!
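
The point resolved in the comments can be sketched as follows: `sort_index` returns a new, sorted object by default, so the result has to be assigned, or `inplace=True` has to be passed. The small two-level frame and the variable names here are hypothetical, chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical two-level frame with a deliberately unsorted index
df = pd.DataFrame(
    {"col": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_tuples(
        [(0, 1), (5, 1), (0, 4), (5, 4)], names=["Time", "Repeats"]
    ),
)

# Calling sort_index() without using the result leaves df unchanged
df.sort_index()
unchanged = df.index.tolist()

# Option 1: assign the returned, sorted copy
df_sorted = df.sort_index()

# Option 2: sort the existing object in place (the call returns None)
df_inplace = df.copy()
result = df_inplace.sort_index(inplace=True)

print(unchanged)
print(df_sorted.index.tolist())
print(df_inplace.index.tolist())
```

Assignment is usually the safer habit, because the in-place call returns `None` and cannot be chained.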
