I have a dataframe that looks like this:

class_id dims
94369_GA_30122 95
27369_GA_30122 14
78369_CA_30122 27
30472_MN_55121 16

The dataframe continues with more rows. I want to sort it by the class_id column in ascending numeric order, so that it looks like this:

class_id dims
27369_GA_30122 14
30472_MN_55121 16
78369_CA_30122 27
94369_GA_30122 95

Can anyone tell me how I can achieve this?

2 Answers

I believe this should do the trick:

import pandas as pd

data = {"class_id": ["94369_GA_30122", "27369_GA_30122", "78369_CA_30122", "30472_MN_55121"],
        "dims": [95, 14, 27, 16]}
df = pd.DataFrame(data)

df = df.sort_values("class_id")

Out:
         class_id  dims
1  27369_GA_30122    14
3  30472_MN_55121    16
2  78369_CA_30122    27
0  94369_GA_30122    95

Edit: You can also add these lines to sort only on the first set of numbers. The extracted text is converted to int so that the comparison is numeric rather than character by character.

df["sorting"] = df["class_id"].str.split("_", n=1).str[0].astype(int)    # Extract the first set of numbers as integers
df = df.sort_values("sorting")
df = df.drop("sorting", axis=1)    # Drop the helper column again

https://pandas.pydata.org/docs/reference/api/pandas.Series.str.split.html
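The same idea can be written without a helper column, as a minimal sketch assuming pandas 1.1 or newer, where sort_values accepts a key argument. The function name leading_number is hypothetical and only used for illustration; it assumes every class_id starts with digits followed by an underscore.

```python
import pandas as pd

df = pd.DataFrame({
    "class_id": ["94369_GA_30122", "27369_GA_30122", "78369_CA_30122", "30472_MN_55121"],
    "dims": [95, 14, 27, 16],
})

def leading_number(ids: pd.Series) -> pd.Series:
    # Hypothetical helper: take the text before the first "_" and convert it to int.
    return ids.str.split("_", n=1).str[0].astype(int)

# key is applied to the whole column before sorting, so rows are ordered by the integers.
sorted_df = df.sort_values("class_id", key=leading_number)
print(sorted_df)
```

The original column is left untouched, so there is nothing to drop afterwards.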

If you want to sort the values by class_id:

df.sort_values(by=['class_id'])

If you want to sort the values by dims:

df.sort_values(by=['dims'])

If you want to sort based on both, you can use:

df.sort_values(by=['class_id', 'dims'])

You can refer to the documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html

2 Comments

I tried using this method, df.sort_values(by=['class_id']), but it sorts class_id like 100000_WI_53929, 100004_IL_61364 and not like 1_NH_03275, 2_IL_61270.
Then try df.sort_values(by=['class_id', 'dims']); it will sort based on the values in both columns.
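For what it's worth, adding dims as a second key probably won't change the behaviour described in the first comment, because class_id is still compared as text. Here is a rough sketch using the four IDs from that comment; the dims values are made up, and the column name leading is a hypothetical helper.

```python
import pandas as pd

# IDs taken from the comment above; the dims values are invented for illustration.
ids_df = pd.DataFrame({
    "class_id": ["100000_WI_53929", "2_IL_61270", "100004_IL_61364", "1_NH_03275"],
    "dims": [10, 20, 30, 40],
})

# Sorting on the strings compares character by character, so "100000..." comes before "1_...".
text_order = ids_df.sort_values(by=["class_id", "dims"])["class_id"].tolist()

# Pull out the leading digits as integers and sort on those instead (dims breaks ties).
leading = ids_df["class_id"].str.extract(r"^(\d+)", expand=False).astype(int)
numeric_order = (
    ids_df.assign(leading=leading)
    .sort_values(by=["leading", "dims"])["class_id"]
    .tolist()
)

print(text_order)
print(numeric_order)
```

This assumes every class_id begins with at least one digit; rows that do not would yield NaN from str.extract and the int conversion would fail.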
