I have a pandas DataFrame df with a datetime column ('DateTime') and a column of numeric values ('load'). I want to sort the DataFrame by 'DateTime'.
So I used the following code:
df.sort_values('DateTime')
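However, the result is clearly not in chronological order. I have an entry for every hour of the year, yet the output jumps from 2017-01-04 to 2017-01-15, and 12:00 appears after 13:00 on 2017-12-28: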
DateTime load
0 2017-01-04 00:00:00 52223.4500
1 2017-01-04 01:00:00 51392.4225
2 2017-01-04 02:00:00 51523.6875
3 2017-01-04 03:00:00 52356.4525
4 2017-01-04 04:00:00 54685.1125
5 2017-01-04 05:00:00 60150.9925
6 2017-01-04 06:00:00 66820.7375
7 2017-01-04 07:00:00 70047.9175
8 2017-01-04 08:00:00 71457.6350
9 2017-01-04 09:00:00 72288.9975
10 2017-01-04 10:00:00 73059.6850
11 2017-01-04 11:00:00 72965.4000
12 2017-01-04 12:00:00 71860.8625
13 2017-01-04 13:00:00 70186.3825
14 2017-01-04 14:00:00 69362.5425
15 2017-01-04 15:00:00 70146.8800
16 2017-01-04 16:00:00 71641.2275
17 2017-01-04 17:00:00 70686.6700
18 2017-01-04 18:00:00 69214.0275
19 2017-01-04 19:00:00 65552.7600
20 2017-01-04 20:00:00 62177.0875
21 2017-01-04 21:00:00 60257.1750
22 2017-01-04 22:00:00 56170.3500
23 2017-01-04 23:00:00 52265.3050
24 2017-01-15 00:00:00 46725.7725
25 2017-01-15 01:00:00 45447.4650
26 2017-01-15 02:00:00 44887.1600
27 2017-01-15 03:00:00 44230.0025
28 2017-01-15 04:00:00 43838.2300
29 2017-01-15 05:00:00 42747.1475
... ... ...
8730 2017-12-28 02:00:00 40675.2025
8731 2017-12-28 03:00:00 42022.7050
8732 2017-12-28 04:00:00 44010.7025
8733 2017-12-28 05:00:00 46842.8875
8734 2017-12-28 06:00:00 51119.2625
8735 2017-12-28 07:00:00 55059.5600
8736 2017-12-28 08:00:00 58077.6375
8737 2017-12-28 09:00:00 59538.5075
8738 2017-12-28 10:00:00 60753.6975
8739 2017-12-28 11:00:00 60720.7275
8740 2017-12-28 13:00:00 58208.7925
8741 2017-12-28 12:00:00 59299.2325
8742 2017-12-28 15:00:00 58370.4075
8743 2017-12-28 16:00:00 61120.1675
8744 2017-12-28 17:00:00 61194.5025
8745 2017-12-28 18:00:00 59644.1900
8746 2017-12-28 19:00:00 56113.4500
8747 2017-12-28 20:00:00 53672.4725
8748 2017-12-28 21:00:00 52312.3350
8749 2017-12-28 22:00:00 48750.4325
8750 2017-12-28 23:00:00 45816.2225
8751 2017-12-29 00:00:00 43684.6650
8752 2017-12-29 01:00:00 42797.5800
8753 2017-12-29 02:00:00 42608.9925
8754 2017-12-29 03:00:00 43510.8925
8755 2017-12-29 04:00:00 44424.2175
8756 2017-12-29 05:00:00 46470.2750
8757 2017-12-29 06:00:00 50801.7100
8758 2017-12-29 07:00:00 54854.4375
8759 2017-12-29 08:00:00 56226.2575
I believe the columns have the correct data types:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 2 columns):
DateTime 8760 non-null datetime64[ns]
load 8760 non-null float64
dtypes: datetime64[ns](1), float64(1)
memory usage: 136.9 KB
If I look up the min or max value in my DateTime column, I find the correct entries, so only the sorting seems to fail. What can I try?
df.loc[df['DateTime'].idxmax()]
DateTime 2017-12-31 23:00:00
load 43802.8
Name: 8706, dtype: object
df.loc[df['DateTime'].idxmin()]
DateTime 2017-01-01 00:00:00
load 43202.4
Name: 48, dtype: object
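sort_values returns a new, sorted DataFrame and leaves df itself unchanged, so the result has to be assigned back:
df = df.sort_values('DateTime')
or the DataFrame has to be sorted in place:
df.sort_values('DateTime', inplace=True)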
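The difference between calling sort_values and keeping its result can be reproduced on a small frame. This is only a minimal sketch: the three-row frame is a hypothetical miniature built from timestamps and loads shown in the question, and the reset_index(drop=True) step is an optional extra (not part of the original code) that renumbers the rows after sorting.

```python
import pandas as pd

# Hypothetical miniature of the question's frame, stored out of order.
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2017-01-04 00:00:00",
        "2017-01-01 00:00:00",
        "2017-01-15 00:00:00",
    ]),
    "load": [52223.45, 43202.40, 46725.7725],
})

# sort_values returns a new DataFrame; the result is discarded here,
# so df keeps its original row order.
df.sort_values("DateTime")
unchanged = df["DateTime"].is_monotonic_increasing  # df is still unsorted

# Assign the result back to keep the sorted order.
# reset_index(drop=True) is optional: it renumbers the rows 0..n-1.
df = df.sort_values("DateTime").reset_index(drop=True)
is_sorted = df["DateTime"].is_monotonic_increasing

print(unchanged, is_sorted)
print(df)
```

Checking is_monotonic_increasing on the column is a quick way to confirm whether a frame is actually in chronological order, without having to inspect the printed rows.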