python dataframe: delete rows with condition

Question

I have a pandas DataFrame:

       A                 B     C
0    10006               3  9415640
1    10006               8       90
2    10010              10     8028
3    10010              12  1514942
4    10010              14  3098177
5    10010              15   271445
6    10010              16  1539139
7    10010              17    48939
8    10010              19     6220
9    10010               2     5710
10   21019               1      120

What I would like to do is, for every unique value in column A, keep only the row with the largest value in column C.

For the example above the desired output is:

       A                 B     C
0    10006               3  9415640
1    10010              14  3098177
2    21019               1      120

sophocles · Accepted Answer · 2021-06-09 11:24:13Z

1

You can use boolean indexing, comparing column C against its per-group maximum (grouped by A):

df[df['C'] == df.groupby('A')['C'].transform('max')]

Prints:

>>> df[df['C'] == df.groupby('A')['C'].transform('max')]

        A   B        C
0   10006   3  9415640
4   10010  14  3098177
10  21019   1      120
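
For a self-contained check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same mask. The helper name `keep_group_max` is made up for illustration and is not part of pandas. Note that if several rows tie for the maximum within a group, this approach keeps all of them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [10006, 10006, 10010, 10010, 10010, 10010, 10010, 10010, 10010, 10010, 21019],
    "B": [3, 8, 10, 12, 14, 15, 16, 17, 19, 2, 1],
    "C": [9415640, 90, 8028, 1514942, 3098177, 271445, 1539139, 48939, 6220, 5710, 120],
})


def keep_group_max(frame, key="A", value="C"):
    """Hypothetical helper: keep rows whose `value` equals the max of their `key` group."""
    # transform("max") returns a Series aligned with frame's index,
    # holding each row's group maximum.
    group_max = frame.groupby(key)[value].transform("max")
    return frame[frame[value] == group_max]


result = keep_group_max(df)
print(result)
```

The original row labels (0, 4, 10) are preserved. To get the 0, 1, 2 index shown in the question, chain `.reset_index(drop=True)`.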

answered Jun 9, 2021 at 11:24

Pypas · Accepted Answer · 2021-06-09 11:19:44Z

1

You can sort by the values in column C in descending order, then group by A and take the first occurrence in each group.

df.sort_values(['C'], ascending=False).groupby('A').first()
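
One thing to be aware of: `groupby('A').first()` moves A into the index and takes the first non-null value of each column separately, so with missing data a result row could combine values from different rows. Below is a sketch of a variant that keeps whole rows by using `drop_duplicates` instead. It assumes the question's column names, and the frame is a shortened subset of the question's data for brevity.

```python
import pandas as pd

# Subset of the rows from the question.
df = pd.DataFrame({
    "A": [10006, 10006, 10010, 10010, 21019],
    "B": [3, 8, 12, 14, 1],
    "C": [9415640, 90, 1514942, 3098177, 120],
})

# Sort so the largest C comes first, then keep the first row seen for each A.
top = (
    df.sort_values("C", ascending=False)
      .drop_duplicates("A")
      .sort_values("A")            # restore ordering by A
      .reset_index(drop=True)      # renumber rows 0, 1, 2
)
print(top)
```

Unlike the `transform('max')` mask, this keeps exactly one row per group even when several rows tie for the maximum.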

edited Jun 9, 2021 at 11:19

answered Jun 9, 2021 at 11:11

Subbu VidyaSekar · Accepted Answer · 2021-06-09 11:19:47Z

1

Use this code, where data is your DataFrame:

data[data.groupby('A')['C'].transform('max') == data['C']]

Output:

        A   B        C
    10006   3  9415640
    10010  14  3098177
    21019   1      120

answered Jun 9, 2021 at 11:19

Felipe Cavalcante da Rocha · Accepted Answer · 2021-06-09 11:34:31Z

1

Group by A and select the row with the maximum C in each group. You can try this code:

df.loc[df.groupby('A')['C'].idxmax()]
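
Note that a plain `groupby('A').max()` takes the maximum of each column independently, so the B it reports is the largest B in the group, not the B from the row with the largest C. The sketch below, on a subset of the question's rows, contrasts it with `idxmax`, which is one way to select whole rows.

```python
import pandas as pd

# Subset of the rows from the question.
df = pd.DataFrame({
    "A": [10006, 10006, 10010, 10010, 10010],
    "B": [3, 8, 14, 19, 2],
    "C": [9415640, 90, 3098177, 6220, 5710],
})

# Column-wise maximum: B and C can come from different rows.
per_column = df.groupby("A").max()

# idxmax returns the index label of the largest C in each group;
# .loc then selects those complete rows.
whole_rows = df.loc[df.groupby("A")["C"].idxmax()]

print(per_column)
print(whole_rows)
```

For group 10010, `per_column` reports B = 19 next to C = 3098177, although those values sit in different rows of the input; `whole_rows` returns B = 14, which matches the desired output in the question.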

answered Jun 9, 2021 at 11:34
