I need to sort the rows of a dataframe by a custom aggregation function, for example the sum of each row's values, similar to what the built-in sorted function does with its key argument:

sorted([(1, 10), (1, 2), (2, 3)], key=sum)

which gives:

[(1, 2), (2, 3), (1, 10)]

I know that in pandas I could create a new aggregate column and sort by that column:

import pandas as pd

df = pd.DataFrame([(1, 10), (1, 2), (2, 3)])
df[2] = df.sum(axis=1)  # temporary column holding the row sums
df.sort_values(2).drop(2, axis=1)  # sort by it, then drop it again

But as you can see, this is far less elegant than the Python solution with sorted(). Since sort_values() does not take a key argument, how can I sort the rows of a dataframe by a key without creating a new column?

2 Answers

No need to add a dummy column; just use the argsort of df.sum(1) to index your dataframe:

df.loc[df.sum(1).argsort()]
# Use @jezrael's answer if the index is not range(len(df.index))

   0   1
1  1   2
2  2   3
0  1  10
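To illustrate the caveat in the comment above, here is a minimal sketch, assuming a dataframe whose integer index is not in range order (the labels 2, 0, 1 are made up for the example). argsort returns positions, so passing them to .loc, which looks up labels, silently gives the wrong order, while .iloc does not:

```python
import pandas as pd

# Same data as in the question, but with a shuffled integer index
# (the labels 2, 0, 1 are hypothetical, chosen for this illustration).
df2 = pd.DataFrame([(1, 10), (1, 2), (2, 3)], index=[2, 0, 1])

# argsort returns integer positions, not index labels.
order = df2.sum(axis=1).argsort().to_numpy()

by_label = df2.loc[order]      # interprets the positions as labels
by_position = df2.iloc[order]  # interprets them as positions

print(by_label.sum(axis=1).tolist())     # [5, 11, 3] -> not sorted
print(by_position.sum(axis=1).tolist())  # [3, 5, 11] -> sorted
```

With a non-integer index (strings, dates), the .loc version would typically fail with a KeyError instead of returning a misordered result.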
1 Comment

Nice and simple, thanks! Accepted @jezrael's answer as more general

Use Series.argsort with DataFrame.iloc for a general solution that works with any index values:

print(df.iloc[df.sum(axis=1).argsort()])
   0   1
1  1   2
2  2   3
0  1  10
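If you want something closer to sorted(..., key=...), the same idea can be wrapped in a small function. This is only a sketch: sort_rows_by is a hypothetical helper name, not a pandas API, and it assumes the key callable takes the whole dataframe and returns one value per row. It uses a stable sort so that rows with equal keys keep their original order, as sorted() does:

```python
import numpy as np
import pandas as pd


def sort_rows_by(df, key):
    """Return df with rows ordered by key(df), one key value per row.

    Hypothetical helper for illustration; not part of pandas.
    """
    keys = np.asarray(key(df))
    # A stable sort keeps tied rows in their original order, like sorted().
    order = np.argsort(keys, kind="stable")
    # iloc is positional, so this works for any index values.
    return df.iloc[order]


df = pd.DataFrame([(1, 10), (1, 2), (2, 3)])
result = sort_rows_by(df, key=lambda d: d.sum(axis=1))
print(result)
```

Newer pandas versions (1.1 and later, as far as I know) do give sort_values a key argument, but it is applied to each sorting column on its own, so it does not replace a row-wise key like this one.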

1 Comment

How would this be possible for operations where pandas does a sort internally, say during a groupby or pivot?
