I have a pandas DataFrame:

       A                 B     C
0    10006               3  9415640
1    10006               8       90
2    10010              10     8028
3    10010              12  1514942
4    10010              14  3098177
5    10010              15   271445
6    10010              16  1539139
7    10010              17    48939
8    10010              19     6220
9    10010               2     5710
10   21019               1      120     

For every unique value in column A, I would like to keep only the row with the largest value in column C.

For the example above the desired output is:

       A                 B     C
0    10006               3  9415640
1    10010              14  3098177
2    21019               1      120 

4 Answers

You can use boolean indexing, comparing column C against its maximum within each group of A:

df[df['C'] == df.groupby('A')['C'].transform('max')]

Output:

        A   B        C
0   10006   3  9415640
4   10010  14  3098177
10  21019   1      120
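
To reproduce this end to end, here is a minimal self-contained sketch that rebuilds the question's DataFrame and applies the same mask. The names mask and result are only illustrative. One caveat worth knowing: if two rows in the same group tie for the largest C, this approach keeps both of them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [10006, 10006, 10010, 10010, 10010, 10010, 10010, 10010, 10010, 10010, 21019],
    "B": [3, 8, 10, 12, 14, 15, 16, 17, 19, 2, 1],
    "C": [9415640, 90, 8028, 1514942, 3098177, 271445, 1539139, 48939, 6220, 5710, 120],
})

# transform('max') broadcasts each group's maximum back onto every row,
# so the comparison yields a boolean mask aligned with df.
mask = df["C"] == df.groupby("A")["C"].transform("max")
result = df[mask]
print(result)
```

The original index labels (0, 4, 10) are preserved; call result.reset_index(drop=True) if you want them renumbered from zero as in the question's desired output.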

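You can sort by the values in column C in descending order, then group by A and take the first occurrence in each group. Note that A becomes the index of the result; append .reset_index() if you need it back as a column.

df.sort_values('C', ascending=False).groupby('A').first()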
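
As a rough sketch of the sort-then-group idea, the snippet below runs it on a small subset of the question's rows and turns A back into a column with reset_index(). The sample frame and the name result are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative sample (a subset of the question's rows).
df = pd.DataFrame({
    "A": [10006, 10006, 10010, 10010, 21019],
    "B": [3, 8, 12, 14, 1],
    "C": [9415640, 90, 1514942, 3098177, 120],
})

# Sort so the largest C comes first, then take the first entry per group.
# groupby('A') makes A the index, so reset_index() restores it as a column.
result = (
    df.sort_values("C", ascending=False)
      .groupby("A")
      .first()
      .reset_index()
)
print(result)
```

Be aware that first() picks the first non-null value of each column separately, so if your data contains missing values, .groupby('A').head(1) may be the safer choice because it keeps whole rows.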

Assuming your DataFrame is named data, keep the rows where C equals the maximum of C within its group of A:

data[data.groupby('A')['C'].transform('max') == data['C']]

Output:

        A   B        C
0   10006   3  9415640
4   10010  14  3098177
10  21019   1      120

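Group by A, use idxmax to find the index label of the row with the largest C in each group, and select those rows with loc. Calling df.groupby('A').max() on its own is not enough, because it takes the maximum of every column independently, so B and C can come from different rows.

df.loc[df.groupby('A')['C'].idxmax()]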
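
To make the difference concrete, the following sketch contrasts a per-column groupby max with row selection via idxmax on three rows from the question's group 10010. The names per_column and whole_row are illustrative only.

```python
import pandas as pd

# Three rows from the question's group A == 10010.
df = pd.DataFrame({
    "A": [10010, 10010, 10010],
    "B": [14, 16, 19],
    "C": [3098177, 1539139, 6220],
})

# Column-wise maximum: B and C are maximised independently,
# so the result mixes values from different rows (B=19 with C=3098177).
per_column = df.groupby("A").max()

# idxmax returns the index label of the row holding the largest C in each
# group; df.loc then selects those whole rows (B=14 with C=3098177).
whole_row = df.loc[df.groupby("A")["C"].idxmax()]

print(per_column)
print(whole_row)
```

Unlike the boolean-mask approach, idxmax returns a single label per group, so when several rows tie for the largest C only the first of them is kept.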
