6

I have a pandas DataFrame with car data. I want to find the two best-selling Models for each Maker, and then rank the Makers in descending order.

Maker    Model         No Sold(,000s)
Ford     Kuga          35
Ford     Focus         47
Ford     Ka            31
Ford     Fiesta        68
Ford     Mondeo        55
Ford     S-Max         34
Ford     Galaxy        23
Nissan   Leaf          28
Nissan   Micra         31
Nissan   Note          43
Nissan   Pulsar        23
Nissan   Juke          57
Nissan   Qashqai       62
Nissan   X-Trail       38
Honda    Jazz          24
Honda    Civic         32
Honda    HRV           33
Honda    CRV           29
Honda    Accord        30
Honda    NSX           15
Toyota   Aygo          44
Toyota   Auris         45
Toyota   Avensis       35
Toyota   Prius         32
Toyota   Rav4          29
Toyota   Land Cruiser  14
Citroen  C1            40
Citroen  C3            25
Citroen  C4            46
Citroen  DS3           35
Citroen  DS4           31
Citroen  DS5           25
Audi     A1            23
Audi     A3            47
Audi     A4            30
Audi     A6            20
Audi     A8            18
BMW      1 Series      36
BMW      2 Series      20
BMW      3 Series      53
BMW      4 Series      21
BMW      5 Series      27
BMW      6 Series      24
BMW      7 Series      16

Sorry, I'm not sure how to put a DataFrame in here.
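
One way to share a small DataFrame in a post is as CSV text that readers can load directly. The following is only a sketch under my own assumptions: it uses a handful of rows copied from the table above rather than the full data, and the variable names `csv_text` and `df` are mine. Commas are used as the separator because some Model names ("Land Cruiser", "3 Series") contain spaces, and the header is quoted because the column name itself contains a comma.

```python
import io

import pandas as pd

# A few rows copied from the table in the question (not the full data set).
csv_text = '''Maker,Model,"No Sold(,000s)"
Ford,Fiesta,68
Ford,Mondeo,55
Toyota,Land Cruiser,14
BMW,3 Series,53
BMW,1 Series,36
'''

# Comma separation keeps "Land Cruiser" and "3 Series" in a single column;
# the quoted header stops the comma in the column name from splitting it.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

Anyone answering can then paste the snippet and get the same frame, with the sales column parsed as integers.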

3 Answers

5

Use groupby + nlargest:

df.set_index('Model').groupby('Maker')['No Sold(,000s)'].nlargest(2)

Maker    Model  
Audi     A3         47
         A4         30
Citroen  C4         46
         C1         40
Ford     Fiesta     68
         Mondeo     55
Honda    HRV        33
         Civic      32
Nissan   Qashqai    62
         Juke       57
Toyota   Auris      45
         Aygo       44
Name: No Sold(,000s), dtype: int64
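
The question also asks to rank the Makers afterwards, which the output above does not show. Here is a rough sketch of one possible reading, assuming "rank the Makers" means ordering them by the combined sales of their two best-selling Models. That interpretation, and the names `top2`, `maker_totals` and `maker_rank`, are my assumptions, and only the Ford, Nissan and Honda rows from the question are included to keep it short.

```python
import pandas as pd

# Ford, Nissan and Honda rows copied from the question (a subset of the data).
rows = [
    ("Ford", "Kuga", 35), ("Ford", "Focus", 47), ("Ford", "Ka", 31),
    ("Ford", "Fiesta", 68), ("Ford", "Mondeo", 55), ("Ford", "S-Max", 34),
    ("Ford", "Galaxy", 23),
    ("Nissan", "Leaf", 28), ("Nissan", "Micra", 31), ("Nissan", "Note", 43),
    ("Nissan", "Pulsar", 23), ("Nissan", "Juke", 57), ("Nissan", "Qashqai", 62),
    ("Nissan", "X-Trail", 38),
    ("Honda", "Jazz", 24), ("Honda", "Civic", 32), ("Honda", "HRV", 33),
    ("Honda", "CRV", 29), ("Honda", "Accord", 30), ("Honda", "NSX", 15),
]
df = pd.DataFrame(rows, columns=["Maker", "Model", "No Sold(,000s)"])

# Two best-selling Models per Maker, as in the answer above.
top2 = df.set_index("Model").groupby("Maker")["No Sold(,000s)"].nlargest(2)

# Assumed ranking criterion: sum of each Maker's top two, largest first.
maker_totals = top2.groupby(level="Maker").sum().sort_values(ascending=False)
maker_rank = maker_totals.rank(ascending=False, method="min").astype(int)

print(maker_totals)
print(maker_rank)
```

If the ranking should instead be based on each Maker's single best seller, replacing `.sum()` with `.max()` would be the only change needed.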

3

Alternative solution:

In [222]: df.sort_values(['Maker', 'No Sold(,000s)'], ascending=[True, False]) \
            .groupby('Maker', as_index=False).head(2)
Out[222]:
      Maker     Model  No Sold(,000s)
33     Audi        A3              47
34     Audi        A4              30
39      BMW  3 Series              53
37      BMW  1 Series              36
28  Citroen        C4              46
26  Citroen        C1              40
3      Ford    Fiesta              68
4      Ford    Mondeo              55
16    Honda       HRV              33
15    Honda     Civic              32
12   Nissan   Qashqai              62
11   Nissan      Juke              57
21   Toyota     Auris              45
20   Toyota      Aygo              44

PS: please be aware that @piRSquared's solution is more idiomatic and should be faster.

1

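I believe you could also use rank. Note that rank() ranks in ascending order by default, so pass ascending=False to keep the two largest values per Maker, and use the column names exactly as they appear in the DataFrame:

df[df.groupby('Maker')['No Sold(,000s)'].rank(ascending=False) <= 2]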
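
As a quick sanity check, the three approaches in this thread can be compared on a small sample. This is a sketch, not a benchmark: it uses only the Honda and Audi rows from the question, the helper `key` and the variable names are mine, and the rank variant is written with the question's column names, `ascending=False`, and `method="first"` (my choice, so that ties cannot push a group above or below two rows).

```python
import pandas as pd

# Honda and Audi rows copied from the question (a subset of the data).
rows = [
    ("Honda", "Jazz", 24), ("Honda", "Civic", 32), ("Honda", "HRV", 33),
    ("Honda", "CRV", 29), ("Honda", "Accord", 30), ("Honda", "NSX", 15),
    ("Audi", "A1", 23), ("Audi", "A3", 47), ("Audi", "A4", 30),
    ("Audi", "A6", 20), ("Audi", "A8", 18),
]
col = "No Sold(,000s)"
df = pd.DataFrame(rows, columns=["Maker", "Model", col])

# 1) groupby + nlargest, flattened back into columns.
a = df.set_index("Model").groupby("Maker")[col].nlargest(2).reset_index()

# 2) sort, then take the first two rows of each group.
b = (df.sort_values(["Maker", col], ascending=[True, False])
       .groupby("Maker", as_index=False).head(2))

# 3) boolean mask from a descending per-group rank.
c = df[df.groupby("Maker")[col].rank(ascending=False, method="first") <= 2]


def key(frame):
    """Return the selected (Maker, Model) pairs, ignoring row order."""
    return set(zip(frame["Maker"], frame["Model"]))


print(key(a) == key(b) == key(c))
```

The row order differs between the three results (the mask keeps the original order, the other two sort within each Maker), which is why the comparison is done on sets of pairs.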
