I have a pandas DataFrame that looks similar to this:

          player     frameID    x          y
0         Tom        0          1          3
1         Tom        1          2          3
2         Tom        2          1          3
3         John       0          4          2
4         John       1          3          1
5         John       2          2          2
6         Greg       0          5          3
7         Greg       1          3          2
8         Greg       2          2          1
.         .          .          .          .
.         .          .          .          .
.         .          .          .          .

And I want to format it so that it looks like this:

          player  Tom           John          Greg
frameID    
                  x      y      x      y      x      y

0                 1      3      4      2      5      3
1                 2      3      3      1      3      2
2                 1      3      2      2      2      1
.                 .      .      .      .      .      .
.                 .      .      .      .      .      .
.                 .      .      .      .      .      .

However, I have no clue how to go about the multi-indexing. As you can see, I want to take two of the columns and place one as an index on the columns and one as an index on the rows. Any help would be greatly appreciated.

  • as an aside: why do you want a multiindex? I've found almost everything can be done with column values instead (unless you need better performance) Commented Nov 21, 2020 at 7:46
  • The data is sports movement data, and each line is a frame of a sports play. I wanted it in this format because I need all the players' positions at each frame for use in a clustering algorithm (i.e. in this format each row now has exactly the data I need). Commented Nov 21, 2020 at 9:53

1 Answer

Let's create a multilevel index then use stack + unstack to reshape the dataframe:

df.set_index(['frameID', 'player']).stack().unstack([1, 2])

player    Tom     John    Greg   
          x  y    x  y    x  y
frameID                       
0         1  3    4  2    5  3
1         2  3    3  1    3  2
2         1  3    2  2    2  1
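
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the nine rows shown in the question, and the variable name `wide` is only for illustration. Depending on the pandas version, the player columns may come out in a different order than printed above, and `stack()` may emit a FutureWarning, so the checks below look up values by label rather than by position.

```python
import pandas as pd

# Sample data copied from the question (only the nine rows shown there).
df = pd.DataFrame({
    "player": ["Tom"] * 3 + ["John"] * 3 + ["Greg"] * 3,
    "frameID": [0, 1, 2] * 3,
    "x": [1, 2, 1, 4, 3, 2, 5, 3, 2],
    "y": [3, 3, 3, 2, 1, 2, 3, 2, 1],
})

# 1. set_index: (frameID, player) becomes a two-level row index.
# 2. stack: the x/y column labels move into a third, innermost row level.
# 3. unstack([1, 2]): levels 1 (player) and 2 (x/y) move to the columns,
#    leaving frameID as the only row index.
wide = df.set_index(["frameID", "player"]).stack().unstack([1, 2])

print(wide)

# Look up a single cell by (player, coordinate) label.
print(wide.loc[0, ("Tom", "x")])
```

Each row of `wide` now holds every player's position for one frame.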

3 Comments

nice! When would one use a multilevel (column) index?
Thanks @anon01. According to the pandas documentation, a MultiIndex is generally used when the data has a logically related structure, since it allows you to do grouping, selection, and reshaping operations more concisely. I suggest checking the documentation; you can also refer to this answer, which nicely explains when to use a MultiIndex.
Man you sure are a lifesaver. Thank you so much.
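
To make the selection point from the comments concrete, here is a small sketch of what the column MultiIndex allows, using the same hand-built sample data as the question. The names `tom`, `all_x`, and `features` are hypothetical, and passing the wide frame to a clustering routine as a plain array is an assumption based on the asker's comment, not something shown in the answer.

```python
import pandas as pd

# Sample data copied from the question (only the nine rows shown there).
df = pd.DataFrame({
    "player": ["Tom"] * 3 + ["John"] * 3 + ["Greg"] * 3,
    "frameID": [0, 1, 2] * 3,
    "x": [1, 2, 1, 4, 3, 2, 5, 3, 2],
    "y": [3, 3, 3, 2, 1, 2, 3, 2, 1],
})
wide = df.set_index(["frameID", "player"]).stack().unstack([1, 2])

# Select one player's columns via the top level of the column index.
tom = wide["Tom"]                       # DataFrame with columns x and y

# Take a cross-section on the inner level: every player's x coordinate.
all_x = wide.xs("x", axis=1, level=1)   # one column per player

# One row per frame, one column per (player, coordinate) pair, as a
# plain NumPy array that could be handed to a clustering routine.
features = wide.to_numpy()

print(tom)
print(all_x)
print(features.shape)
```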
