3

I have a dataframe where the column names are times (0:00, 0:10, 0:20, ..., 23:50). Right now they're sorted in string order (so 0:00 is first and 9:50 is last), but I want to sort them by time (so 0:00 is first and 23:50 is last).

If time is a column, you can use

df = df.sort(columns='Time',key=float)

But 1) that only works if time is itself a column rather than the column names, and 2) sort() is deprecated, so I try to avoid it.

I'm trying to use

df = df.sort_index(axis = 1)

but since the column names are in string format, they get sorted according to a string key. I've tried

df = df.sort_index(key=float, axis=1) 

but that gives an error message:

Traceback (most recent call last):
  File "<ipython-input-112-5663f277da66>", line 1, in <module>
      df.sort_index(key=float, axis=1)
TypeError: sort_index() got an unexpected keyword argument 'key'

Does anyone have ideas for how to fix this? So annoying that sort_index() - and sort_values() for that matter - don't have the key argument!!

4
  • Please show some example data. Commented May 4, 2017 at 15:32
  • df[sorted(df,key=pd.to_datetime)] should do. Commented May 4, 2017 at 15:39
  • Abdou: Your solution gave an error: OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00 Commented May 4, 2017 at 15:48
  • @J.Dahlgren, my example below works just fine with the provided data. Please share your actual headers or the first 5 or 10 rows of your data. Also, 1-01-01 00:00:00 is clearly not a date or time. Commented May 4, 2017 at 15:53

5 Answers

3

Try sorting the columns with the sorted builtin function and passing the output to the dataframe for indexing. The following should serve as a working example:

import pandas as pd


records = [(2, 33, 23, 45), (3, 4, 2, 4), (4, 5, 7, 19), (4, 6, 71, 2)]
df = pd.DataFrame.from_records(records, columns = ('0:00', '23:40', '12:30', '11:23'))
df
#    0:00  23:40  12:30  11:23
# 0     2     33     23     45
# 1     3      4      2      4
# 2     4      5      7     19
# 3     4      6     71      2

df[sorted(df,key=pd.to_datetime)]

#    0:00  11:23  12:30  23:40
# 0     2     45     23     33
# 1     3      4      2      4
# 2     4     19      7      5
# 3     4      2     71      6
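If the labels do not parse cleanly as datetimes (one comment above reports an OutOfBoundsDatetime error), a plain-Python key avoids date parsing altogether. This is only a minimal sketch, assuming every label has the form H:MM or HH:MM; the helper name clock_key and the sample data are made up for illustration:

```python
import pandas as pd


def clock_key(label):
    """Turn 'H:MM' or 'HH:MM' into an (hour, minute) tuple of ints (hypothetical helper)."""
    hours, minutes = label.split(':')
    return int(hours), int(minutes)


records = [(2, 33, 23, 45), (3, 4, 2, 4)]
df = pd.DataFrame.from_records(records, columns=('0:00', '23:40', '9:10', '11:23'))

# Iterating over a DataFrame yields its column labels, so sorted() orders the labels.
ordered = df[sorted(df, key=clock_key)]
print(list(ordered.columns))
# ['0:00', '9:10', '11:23', '23:40']
```

Tuples compare element by element, so the hour decides first and the minute breaks ties.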

I hope this helps


1 Comment

I wanted to make sure this worked with a time such as '9:10' too and it did. I did google a lot before posting this question so I'm surprised I didn't come across the sorted() function.
2

Just prepend a leading zero to one-digit hours. This should be the simplest solution as you can simply sort lexically then.

E.g. 5:30 -> 05:30.
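The padding does not have to touch the column names themselves. A small sketch, assuming all labels are strings of the form H:MM or HH:MM, that pads only inside the sort key (the sample frame is invented for the example):

```python
import pandas as pd

df = pd.DataFrame([[33, 2, 1, 12]], columns=['23:40', '0:00', '19:19', '9:00'])

# zfill(5) pads '9:00' to '09:00' for comparison only; the labels stay unchanged.
ordered = df[sorted(df.columns, key=lambda label: label.zfill(5))]
print(list(ordered.columns))
# ['0:00', '9:00', '19:19', '23:40']
```

Because the padded strings all have the same width, ordinary string comparison matches chronological order.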

2 Comments

Breaking out column names, extracting the one-digit hours, prepending a zero, and replacing the column names with the new prepended ones seems like not a very smooth solution.
You do not have to actually change the data. You can just use a sorting function that respects this, i.e. reindex the columns in pandas on sorted(df.columns, key=x), where x maps each label to a form that respects the implicit leading zero. This basically is just a check whether the third character is a colon or not. I doubt you can make it more efficient. Any conversion routines are bound to be more expensive.
2

Here is a working demo, which implements @MartinKrämer's idea:

import re

In [259]: df
Out[259]:
   23:40  0:00  19:19  12:30  09:00  11:23
0     33     2      1     23     12     45
1      4     3      1      2     13      4
2      5     4      1      7     14     19
3      6     4      1     71     14      2

In [260]: df.rename(columns=lambda x: re.sub(r'^(\d{1})\:', r'0\1:', x)).sort_index(axis=1)
Out[260]:
   00:00  09:00  11:23  12:30  19:19  23:40
0      2     12     45     23      1     33
1      3     13      4      2      1      4
2      4     14     19      7      1      5
3      4     14      2     71      1      6

1 Comment

This works too, but I think Abdou's solution is cleaner.
0

I know this question is a few years old, but since it's the top Google result for this question, I wanted to provide the root cause of the error.

The 'key' argument was added to sort_values (and to sort_index) in pandas version 1.1.0. See the note in the documentation linked below.

pandas.DataFrame.sort_values

This feature will very likely work as you intended if you upgrade to 1.1.0 or higher.

0

It seems sort_values() with key may not work here. However, sort_index() with key does the job, along the lines of Abdou's answer.
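As a rough illustration of sort_index() with key, assuming pandas 1.1.0 or later and labels of the form H:MM or HH:MM, the sketch below converts each label to minutes since midnight. The function name to_minutes and the sample frame are invented for the example:

```python
import pandas as pd

df = pd.DataFrame([[33, 2, 23, 12]], columns=['23:40', '0:00', '12:30', '9:00'])


def to_minutes(labels):
    # sort_index passes the whole Index to key and expects
    # an Index of the same length back.
    return labels.map(lambda s: int(s.split(':')[0]) * 60 + int(s.split(':')[1]))


# axis=1 sorts the column labels; key is applied to them before comparing.
result = df.sort_index(axis=1, key=to_minutes)
print(list(result.columns))
# ['0:00', '9:00', '12:30', '23:40']
```

Unlike the key of the builtin sorted(), this key is vectorized: it receives all labels at once rather than one label per call, which is why key=float in the question would not be a drop-in fit even on a recent pandas.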
