Pandas: Sorting rows based on a value of a column

Question

I have a dataframe df like this:

ID    NAME    AGE
-----------------
M43   ab      32
M32   df      12
M54   gh      34
M43   ab      98
M43   ab      36
M43   cd      32
M32   cd      39
M43   ab      67

I need to sort the rows based on the ID column, so that rows with the same ID end up next to each other.
The output df_grouped should look like:

ID    NAME    AGE
-----------------
M43   ab      32
M43   ab      98
M43   ab      36
M43   cd      32
M43   ab      67
M32   df      12
M32   cd      39
M54   gh      34

I tried something like:

df_grouped = df.groupby(df.ID)

grouped_df_list = []
for id in list(df.ID.unique()):
   grouped_df_list.append(df_grouped.get_group(id))

Is there a better way to do this?

That doesn't look like grouping, more like sorting... isn't df.sort_values('ID') what you're after?
– Jon Clements, Commented Feb 6, 2018 at 17:34
Unfortunately my example does look like that. The ID column has, say, 6 unique entries, and I need to group the rows into these six chunks.
– deadbug, Commented Feb 6, 2018 at 17:36
Add more data and show a sample output with grouping of six, please.
– Scott Boston, Commented Feb 6, 2018 at 17:39
You want rows with identical IDs adjacent to each other while retaining the order in which they originally appeared in the frame, right? If so, your code example makes more sense; it's just a fairly poor choice of sample data and lack of explanation :)
– Jon Clements, Commented Feb 6, 2018 at 17:51

jpp · Accepted Answer · 2018-02-06 17:42:01Z

7

You can sort by multiple columns using pd.DataFrame.sort_values:

df = df.sort_values(['ID', 'NAME'])

By default, the argument ascending is set to True.
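
As a rough sketch of what this does on the sample data from the question (the column names ID, NAME and AGE are taken from the question; the variable names by_id_name and by_id_stable are made up for illustration): note that sort_values orders the IDs alphabetically, so M32 comes first, which is not the first-appearance order shown in the desired output. Sorting on a single column with a stable algorithm such as kind="mergesort" should at least keep the original row order within each ID.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE": [32, 12, 34, 98, 36, 32, 39, 67],
})

# Sort by ID, then by NAME within each ID (both ascending by default).
by_id_name = df.sort_values(["ID", "NAME"])

# Sort by ID only; a stable sort keeps the original row order within each ID.
by_id_stable = df.sort_values("ID", kind="mergesort")

print(by_id_name)
print(by_id_stable)
```

If the groups have to appear in the order their IDs first occur in the frame, a plain sort on ID is not enough.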


Jon Clements · Accepted Answer · 2018-02-06 18:41:33Z

1

You can use pd.factorize to turn the key into a unique number which represents the order it appeared, then argsort that to get the positions to index into your frame, eg:

Given:

     0   1   2
0  M43  ab  32
1  M32  df  12
2  M54  gh  34
3  M43  ab  98
4  M43  ab  36
5  M43  cd  32
6  M32  cd  39
7  M43  ab  67

Then:

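# factorize numbers each key by first appearance; a stable argsort
# keeps the original row order within each key
new_df = df.iloc[pd.factorize(df[0])[0].argsort(kind='mergesort')]
# might want to consider df.reindex() instead depending...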

You get:

     0   1   2
0  M43  ab  32
3  M43  ab  98
4  M43  ab  36
5  M43  cd  32
7  M43  ab  67
1  M32  df  12
6  M32  cd  39
2  M54  gh  34
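
For reference, here is a self-contained sketch of the same idea using the named columns from the question rather than the integer column labels above. The intermediate names codes, uniques and order are only there for illustration, and kind="stable" is an assumption on my part to make sure rows keep their original order within each ID.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE": [32, 12, 34, 98, 36, 32, 39, 67],
})

# factorize assigns each distinct ID an integer code in order of first
# appearance: M43 -> 0, M32 -> 1, M54 -> 2.
codes, uniques = pd.factorize(df["ID"])

# A stable argsort of the codes gives row positions grouped by ID, with the
# original order preserved inside each group.
order = codes.argsort(kind="stable")

# The positions index into the frame with iloc (positional indexing).
new_df = df.iloc[order]
print(new_df)
```

Using iloc here means the result does not depend on the frame having a default 0..n-1 index.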
