
I am trying to find a generic way to sort a DataFrame on multiple columns, where each column is sorted by a different arbitrary sort function.

For example, for input I might have

import pandas as pd

df = pd.DataFrame([[2, "Basic", 6], [1, "Intermediate", 9], [2, "Intermediate", 6], [0, "Advanced", 6], [0, "Basic", 2], [1, "Advanced", 6], [0, "Basic", 3]], columns=["Hour", "Level", "Value"])

   Hour         Level  Value
0     2         Basic      6
1     1  Intermediate      9
2     2  Intermediate      6
3     0      Advanced      6
4     0         Basic      2
5     1      Advanced      6
6     0         Basic      3

and I want my output to be

   Hour         Level  Value
0     0      Advanced      6
1     0         Basic      3
2     0         Basic      2
3     1      Advanced      6
4     1  Intermediate      9
5     2  Intermediate      6
6     2         Basic      6

I might have a function map such as

lambdaMap = {
    "Hour": lambda x: x,
    "Level": lambda x: [['Advanced', 'Intermediate', 'Basic'].index(l) for l in x],
    "Value": lambda x: -x,
}

I can apply any one of the sorting functions individually:

sortValue="Hour"
df.sort_values(by=sortValue, key=lambdaMap[sortValue])

I could create a loop to apply each sort successively:

for column, func in lambdaMap.items():
    df = df.sort_values(by=column, key=func)

But neither approach produces the output I'm looking for. Is this even possible? There are plenty of examples showing how to do something similar for specific cases, but I'm curious whether it can be done generically, for use in an API and/or general support libraries.
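One reason the successive loop above may not give the expected order: each pass uses the default sort algorithm, which is not guaranteed to be stable, and the columns are sorted from most to least significant, so the last column ends up dominating. A minimal sketch of a variant that should work, assuming a stable sort and reversed column order. The names `sort_keys`, `LEVEL_ORDER` and `sort_by_keys_stable` are hypothetical and chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, "Basic", 6], [1, "Intermediate", 9], [2, "Intermediate", 6],
     [0, "Advanced", 6], [0, "Basic", 2], [1, "Advanced", 6], [0, "Basic", 3]],
    columns=["Hour", "Level", "Value"],
)

# Hypothetical key map: column name -> vectorized key function (Series in, Series out).
LEVEL_ORDER = ["Advanced", "Intermediate", "Basic"]
sort_keys = {
    "Hour": lambda s: s,
    "Level": lambda s: s.map(LEVEL_ORDER.index),
    "Value": lambda s: -s,
}


def sort_by_keys_stable(frame, key_map):
    """Sort by each column in turn, least significant first, using a stable sort."""
    out = frame
    # Dicts preserve insertion order, so the first entry is the primary sort column.
    for column in reversed(list(key_map)):
        # mergesort is stable: rows that tie on this column keep their earlier order.
        out = out.sort_values(by=column, key=key_map[column], kind="mergesort")
    return out


result = sort_by_keys_stable(df, sort_keys)
print(result)
```

This makes one full pass per column, so it trades some speed for not depending on how `key` behaves with several columns at once.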

1 Answer

You can convert the column to an ordered categorical and then sort:

df['Level'] = pd.Categorical(df['Level'],['Advanced', 'Intermediate', 'Basic'],
              ordered=True)
out = df.sort_values(['Hour','Level','Value'],ascending=[True,True,False])

print(out)

   Hour         Level  Value
3     0      Advanced      6
6     0         Basic      3
4     0         Basic      2
5     1      Advanced      6
1     1  Intermediate      9
2     2  Intermediate      6
0     2         Basic      6

2 Comments

This solves this specific instance, but I'm more interested in a general solution that takes an arbitrary mapping of fields to functions. For instance, what if I had a more complicated sort function that wasn't categorical or numerical? I am open to suggestions for an object structure that might help achieve this, e.g. one that includes information about what type of sort each function is. Still, I'm curious how this would work for any sort function, no matter how complex.
@crackernutter There is no built-in for this; it's very specific. From pandas 1.1 onward, though, sort_values accepts a key argument, much like Python's sorted()
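As a sketch of how the key argument could cover the generic case: when `by` is a list, pandas applies `key` to each column independently, and the Series it passes in appears to carry the column label in `.name`, so a single dispatching function can look up the per-column function. The reliance on `.name` is an assumption worth checking against the pandas version in use, and `key_map`, `LEVEL_ORDER` and `sort_by_key_map` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, "Basic", 6], [1, "Intermediate", 9], [2, "Intermediate", 6],
     [0, "Advanced", 6], [0, "Basic", 2], [1, "Advanced", 6], [0, "Basic", 3]],
    columns=["Hour", "Level", "Value"],
)

# Hypothetical key map: column name -> vectorized key function (Series in, Series out).
LEVEL_ORDER = ["Advanced", "Intermediate", "Basic"]
key_map = {
    "Hour": lambda s: s,
    "Level": lambda s: s.map(LEVEL_ORDER.index),
    "Value": lambda s: -s,
}


def sort_by_key_map(frame, key_map):
    """Sort on every column in key_map, in dict order, each with its own key function."""
    return frame.sort_values(
        by=list(key_map),
        # Assumption: each column arrives as a Series whose .name is the column label.
        key=lambda col: key_map[col.name](col),
    )


result = sort_by_key_map(df, key_map)
print(result)
```

Each key function has to be vectorized and return something the same length as its input. Arbitrary per-element logic can be wrapped with `Series.map`, as done for `Level` here.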
