I have a pandas DataFrame with a MultiIndex. Within each level-0 group of the index, I want to rank the values of a column in descending order: the largest value in the group should get id 1, the second largest id 2, and so on. The result should be a new column holding these ids.

For example:

import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'two', 'one', 'two', 'three', 'four', 'one', 'two']]
df = pd.DataFrame(np.random.randn(11), index=arrays, columns=['values'])
df

output:

            values
bar one     -1.098567
    two     -0.936011
    three   -0.654245
baz one     -0.637409
    two     -0.439939
foo one      0.238114
    two      1.146573
    three   -0.512294
    four    -0.611913
qux one     -0.481083
    two      0.515961

Finally, I want this:

            values      sort
bar one     -1.098567      3
    two     -0.936011      2
    three   -0.654245      1
baz one     -0.637409      2
    two     -0.439939      1
foo one      0.238114      2
    two      1.146573      1
    three   -0.512294      3
    four    -0.611913      4
qux one     -0.481083      2
    two      0.515961      1
  • did you mean like this: stackoverflow.com/questions/49264510/… Commented Aug 4, 2018 at 3:18
  • As a side note, you may want to avoid naming a column 'values': it is already an attribute that lets you access the underlying NumPy array. Commented Aug 4, 2018 at 3:37
  • You may also want to give the seed when using np.random so that others can easily recreate your dataframe values. Commented Aug 4, 2018 at 4:30
  • Thanks, @BradSolomon Commented Aug 4, 2018 at 4:45

1 Answer

Group on the first level (i.e. level 0), and then rank them in descending order.

>>> df.assign(sort=df.groupby(level=0).rank(ascending=False))
             values  sort
bar one   -1.098567     3
    two   -0.936011     2
    three -0.654245     1
baz one   -0.637409     2
    two   -0.439939     1
foo one    0.238113     2
    two    1.146573     1
    three -0.512295     3
    four  -0.611913     4
qux one   -0.481083     2
    two    0.515961     1
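
For anyone who wants to reproduce this, here is a minimal sketch of the same idea, not the exact code that produced the output above. It assumes a seeded generator (the seed 0 is arbitrary, so the numbers will differ from the question), selects the column explicitly, and uses method='first' so that the ranks can be cast to integers safely.

```python
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'two', 'one', 'two', 'three', 'four', 'one', 'two']]

# Assumption: a fixed seed so the frame can be recreated; values differ from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal(11), index=arrays, columns=['values'])

# Rank within each level-0 group, largest value first.
# method='first' never produces fractional ranks, so the cast to int is safe.
df['sort'] = (
    df.groupby(level=0)['values']
      .rank(ascending=False, method='first')
      .astype(int)
)
print(df)
```

Selecting ['values'] before calling rank returns a Series, which avoids assigning a whole DataFrame to the new column.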
2 Comments

Great, but one thing: when two values are the same, the rank ends in .5. Why?
rank takes a method argument that controls how ties are handled. The default, "average", gives tied values the mean of the ranks they would occupy, which is where the .5 comes from. The other methods are "min", "max", "first" and "dense". Have a look at the documentation (I provided the link in the question).
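
To make the difference between the tie-handling methods concrete, here is a small sketch on hypothetical toy data (not the frame from the question) with a deliberate tie inside the 'bar' group. It ranks the same column once per method and puts the results side by side.

```python
import pandas as pd

# Hypothetical data: the first two 'bar' rows are tied on purpose.
index = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'bar', 'baz', 'baz'],
     ['one', 'two', 'three', 'one', 'two']]
)
df = pd.DataFrame({'values': [1.0, 1.0, 0.5, 2.0, 3.0]}, index=index)

grouped = df.groupby(level=0)['values']

# One column of ranks per tie-handling method, all in descending order.
ranks = pd.DataFrame({
    method: grouped.rank(ascending=False, method=method)
    for method in ['average', 'min', 'max', 'first', 'dense']
})
print(ranks)
```

With "average" the two tied rows share ranks 1 and 2 and so both get 1.5; "min" gives them both 1, "max" gives them both 2, and "dense" gives them both 1 and continues with 2 instead of 3 for the next value.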
