I have the DataFrame below:

col1

Numb10
Numb11
Numb12
Numb7
Numb8

How can I sort it in numeric order, like this:

col1

Numb7
Numb8
Numb10
Numb11
Numb12

I tried the following, but got the error TypeError: cannot convert the series to <class 'int'>.

df.sort_values(by = "col1", key = (lambda x: int(x[4:])))

Update with one missing in col1

2 Answers

The key argument of sort_values receives the whole Series as its parameter, not the individual elements. From the docs:

Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently.
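To make the quoted behaviour concrete, here is a small sketch that records what sort_values actually hands to key. The helper name record_type and the sample data are only illustrative assumptions. It also shows why x[4:] does not do what the question expects: on a Series, plain slicing selects rows, while .str[4:] slices the characters of each string.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["Numb10", "Numb11", "Numb12", "Numb7", "Numb8"]})

seen_types = []

def record_type(s):
    # Hypothetical helper: note the type of the argument, return it unchanged.
    seen_types.append(type(s).__name__)
    return s

df.sort_values(by="col1", key=record_type)
print(seen_types)  # each call receives a Series, not a single string

# Plain slicing on a Series selects rows (here: from position 4 onward).
row_slice = df["col1"][4:]

# The .str accessor slices the characters of every element instead.
char_slice = df["col1"].str[4:]

print(len(row_slice))
print(char_slice.tolist())
```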

In your case, you can use .str for slicing and astype for type conversion:

df.sort_values(by='col1', key=lambda s: s.str[4:].astype(int))
     col1
3   Numb7
4   Numb8
0  Numb10
1  Numb11
2  Numb12
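If col1 can contain a missing value, as the question's update suggests, astype(int) will likely fail on that row. A possible variant, sketched under the assumption that the number is always the trailing run of digits, extracts it with a regular expression and lets pd.to_numeric turn anything unparseable into NaN. The helper name trailing_number, the regex, and the sample data are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data with one missing entry (an assumption for illustration).
df = pd.DataFrame({"col1": ["Numb10", "Numb11", None, "Numb7", "Numb8"]})

def trailing_number(s):
    # Extract the trailing digits; rows without a match (including None) become NaN.
    digits = s.str.extract(r"(\d+)$", expand=False)
    # Convert to numbers; anything that still cannot be parsed becomes NaN.
    return pd.to_numeric(digits, errors="coerce")

# na_position="last" keeps the missing row at the end of the sorted frame.
result = df.sort_values(by="col1", key=trailing_number, na_position="last")
print(result)
```

Because the key returns a numeric Series of the same length as its input, it stays vectorized as the docs require, and the prefix no longer has to be exactly four characters long.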

The part after the first four characters might not always be an integer. You can verify this with:

# convert to numeric values; entries that cannot be parsed become NaN
# (if any NaN is present, the result is float rather than int)
extracted_nums = pd.to_numeric(df['col1'].str[4:], errors='coerce')

# check for invalid values:
# True means at least one entry is not numeric
print(extracted_nums.isna().any())

# sort the rows by the extracted values
df.loc[extracted_nums.sort_values().index]
