import pandas as pd

df = pd.DataFrame([["Alpha", 3, 2, 4], ["Bravo", 2, 3, 1], ["Charlie", 4, 1, 3], ["Delta", 1, 4, 2]],
                  columns=["Company", "Running", "Combat", "Range"])
print(df)
   Company  Running  Combat  Range
0    Alpha        3       2      4
1    Bravo        2       3      1
2  Charlie        4       1      3
3    Delta        1       4      2

Hi, I am trying to sort the following DataFrame so that the rows are arranged with the best-performing company across the three columns at the top. In this case that would be Bravo company, as it ranks 2 in running, 3 in drills and 1 in range.

Would this approach work if the list has many more companies and it is hard to tell which company is the best performing?

I have tried:

df_sort = df.sort_values(['Running', 'Combat', 'Range'], ascending=[True, True, True])

current output:

   Company  Running  Combat  Range
3    Delta        1       4      2
1    Bravo        2       3      1
0    Alpha        3       2      4
2  Charlie        4       1      3

but this is not the order I wanted. Can this be done through pandas? I was expecting the output to be:

   Company  Running  Combat  Range
0    Bravo        2       3      1
1    Delta        1       4      2
2  Charlie        4       1      3
3    Alpha        3       2      4

Comments

  • Can you please share your expected output? Commented Jun 2, 2020 at 15:59
  • Is row 0 the list of companies? If so, how is Charlie 2 in running? Which column is drills? Commented Jun 2, 2020 at 16:04
  • Your DataFrame structure looks weird to me. Is that what it actually is? Can you correct the DataFrame structure if not? Commented Jun 2, 2020 at 16:05
  • Does this answer your question? How to sort a dataFrame in python pandas by two or more columns? Commented Jun 2, 2020 at 16:11
  • Yes, I would inspect your df with df.head(); I don't think the data are represented the way you think. Commented Jun 2, 2020 at 16:14

1 Answer

If you want to sort by the mean of each row, first compute the row means, then use Series.argsort to get the positions that would sort those means, and finally reorder the rows with DataFrame.iloc:

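# numeric_only=True skips the text column "Company"; newer pandas versions
# raise a TypeError on it otherwise
df1 = df.iloc[df.mean(axis=1, numeric_only=True).argsort()]
print(df1)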
   Company  Running  Combat  Range
1    Bravo        2       3      1
3    Delta        1       4      2
2  Charlie        4       1      3
0    Alpha        3       2      4
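
To see what each step of the one-liner contributes, the chain can be unpacked as below. This is only a sketch: the names `row_mean`, `order` and `ranked` are illustrative, and the final `reset_index(drop=True)` is an optional extra that renumbers the rows 0 to 3, as in the expected output from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [["Alpha", 3, 2, 4], ["Bravo", 2, 3, 1], ["Charlie", 4, 1, 3], ["Delta", 1, 4, 2]],
    columns=["Company", "Running", "Combat", "Range"],
)

# Step 1: mean rank per row; numeric_only=True ignores the text column "Company"
row_mean = df.mean(axis=1, numeric_only=True)

# Step 2: integer positions that would sort the means in ascending order
# (a stable sort keeps tied rows in their original order)
order = row_mean.to_numpy().argsort(kind="stable")

# Step 3: reorder the rows by position, then renumber the index from 0
ranked = df.iloc[order].reset_index(drop=True)
print(ranked)
```

Because the means are computed row by row, the same three steps apply unchanged to more ranking columns or more companies.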

EDIT: If some columns need to be excluded from the mean first, remove them with DataFrame.drop:

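# columns to leave out of the mean
cols = ['Overall', 'Subordination']
# numeric_only=True also skips the remaining text column "Company"
df2 = text_df.iloc[text_df.drop(cols, axis=1).mean(axis=1, numeric_only=True).argsort()]
print(df2)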
   Company  Running  Combat  Overall Subordination  Range
1    Bravo        2       3     0.70          Poor      1
3    Delta        1       4     0.83          Good      2
2  Charlie        4       1     0.81          Good      3
0    Alpha        3       2     0.91     Excellent      4
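
An alternative to dropping the unwanted columns is to list the ranking columns to average instead, which may be easier when many unrelated columns are mixed in. The sketch below is not the method from the answer: it rebuilds a frame with the values shown in the output above, and the names `rank_cols`, `mean_rank` and `df3` are illustrative. It sorts the row means with `Series.sort_values` and reorders by label with `DataFrame.loc`.

```python
import pandas as pd

# Frame rebuilt from the output shown above
text_df = pd.DataFrame({
    "Company": ["Alpha", "Bravo", "Charlie", "Delta"],
    "Running": [3, 2, 4, 1],
    "Combat": [2, 3, 1, 4],
    "Overall": [0.91, 0.70, 0.81, 0.83],
    "Subordination": ["Excellent", "Poor", "Good", "Good"],
    "Range": [4, 1, 3, 2],
})

# Only these columns take part in the mean (illustrative choice)
rank_cols = ["Running", "Combat", "Range"]
mean_rank = text_df[rank_cols].mean(axis=1)

# Sort the means (mergesort is stable) and reorder the rows by index label
df3 = text_df.loc[mean_rank.sort_values(kind="mergesort").index]
print(df3)
```

Selecting by name keeps `Overall` out of the mean even though it is numeric, which filtering by dtype alone would not do.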

9 Comments

Hi, I am trying to understand this line of code. Does df.mean(axis=1) calculate the mean across the columns of each row, with .argsort() then used to reorder the rows according to those mean values?
@MooseCakeRunner - Exactly, you are right. I understood the question as needing a sort by mean, or am I wrong?
Yes, that is correct; I think my initial question wasn't clear. Does this method of sorting work if there are more ranking columns and more rows of companies (i.e. hundreds or thousands more)? And I do not need to manually name the columns and rows to be sorted?
@MooseCakeRunner - Sure, if you need the mean per row, the number of columns is not important.
What if there are other columns containing text strings in between the ranking columns? How would I handle that situation?