I am using Python (Jupyter notebook) for some analysis. I would like to sort the values in my pandas DataFrame with sort_values(). At first it looked like it was working correctly, but it only sorts numbers with two digits properly (see figure). What can I do to sort the values correctly for countries > 99?
1 Answer
The problem is that the values are strings, so they are sorted lexicographically (character by character, which puts '102' before '11').
So you first need to convert them to numeric:
df4 = df4.astype(int)
Sample:
df4 = pd.Series(['102','11','10','10', '119', '14'])
print (df4)
0 102
1 11
2 10
3 10
4 119
5 14
dtype: object
print (df4.sort_values())
2 10
3 10
0 102
1 11
4 119
5 14
dtype: object
df4 = df4.astype(int)
print (df4.sort_values())
2 10
3 10
1 11
5 14
0 102
4 119
dtype: int32
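If the data lives in a DataFrame column rather than a Series, the same idea can be sketched as below. This is only an illustration under assumptions: the column name "country" is hypothetical (the question does not show the real column names), and pd.to_numeric with errors="coerce" is used here as an alternative to astype(int) so that any non-numeric entries become NaN instead of raising. The second part uses the key argument of sort_values (available in pandas 1.1 and later) to sort numerically while leaving the stored values as strings.

```python
import pandas as pd

# Hypothetical DataFrame; the column name "country" is an assumption
df = pd.DataFrame({"country": ["102", "11", "10", "10", "119", "14"]})

# Option 1: convert the column to numbers, then sort
df["country"] = pd.to_numeric(df["country"], errors="coerce")
sorted_df = df.sort_values("country")
print(sorted_df["country"].tolist())  # [10, 10, 11, 14, 102, 119]

# Option 2: keep the strings, but sort by their integer value
s = pd.Series(["102", "11", "10", "10", "119", "14"])
by_key = s.sort_values(key=lambda col: col.astype(int))
print(by_key.tolist())  # ['10', '10', '11', '14', '102', '119']
```

Option 1 is usually preferable when the values are really numbers, since later arithmetic and comparisons then behave as expected; Option 2 only changes the sort order.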
