38

New to Pandas, so maybe I'm missing a big idea? I have a Pandas DataFrame of register transactions with shape like (500,4):

Time              datetime64[ns]
Net Total                float64
Tax                      float64
Total Due                float64

I'm working through my code in a Python3 Jupyter notebook. I can't get past sorting any column. Working through the different code examples for sort, I'm not seeing the output reorder when I inspect the df. So, I've reduced the problem to trying to order just one column:

df.sort_values(by='Time')
# OR
df.sort_values(['Total Due'])
# OR
df.sort_values(['Time'], ascending=True)

No matter which column title, or which boolean argument I use, the displayed results never change order.

Thinking it could be a Jupyter thing, I've previewed the results using print(df), df.head(), and HTML(df.to_html()) (the last example is for Jupyter notebooks). I've also rerun the whole notebook from import CSV to this code. And, I'm also new to Python3 (from 2.7), so I get stuck with that sometimes, but I don't see how that's relevant in this case.

Another post has a similar problem, Python pandas dataframe sort_values does not work. In that instance, the ordering was on a string-typed column. But as you can see, all of the columns here are unambiguously sortable.

Why does my Pandas DataFrame not display new order using sort_values?

3
  • 5
    IIUC try this: df = df.sort_values(['Total Due']) or df.sort_values(['Total Due'], inplace=True) Commented Mar 5, 2017 at 20:36
  • @MaxU That did it. I was making a new DataFrame with each transform, except this one. That was easy. Answer with this and I'll mark it answered. Commented Mar 5, 2017 at 20:40
  • This works. I had a big challenge with this too. Commented May 29 at 14:58

2 Answers

97

df.sort_values(['Total Due']) returns a new, sorted DataFrame; it does not modify df in place.

So assign the result explicitly:

df = df.sort_values(['Total Due'])

or

df.sort_values(['Total Due'], inplace=True)

NOTE: the pandas core team discourages the use of the inplace=True parameter, because it may be deprecated in a future version of pandas.
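
As a minimal sketch of the difference, the snippet below uses a small made-up frame (the values and the names unchanged_order and sorted_order are illustrative assumptions, not the asker's data) to show that a bare sort_values call leaves df as it was, while assigning the result keeps the sorted order:

```python
import pandas as pd

# Hypothetical sample data standing in for the register transactions.
df = pd.DataFrame({
    "Time": pd.to_datetime(
        ["2017-03-05 12:00", "2017-03-05 09:30", "2017-03-05 10:15"]
    ),
    "Total Due": [21.40, 5.35, 10.70],
})

# The sorted copy is returned and then discarded; df itself is not reordered.
df.sort_values(["Total Due"])
unchanged_order = df["Total Due"].tolist()

# Capturing the returned DataFrame keeps the sorted result.
sorted_df = df.sort_values(["Total Due"])
sorted_order = sorted_df["Total Due"].tolist()

print(unchanged_order)
print(sorted_order)
```

In a notebook, the first call does display a sorted table if it is the last expression in a cell, which can make it look as though df changed; inspecting df afterwards shows the original order.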


1 Comment

Regarding the note, there's nothing written in the documentation that discourages the use of inplace=True pandas.pydata.org/docs/reference/api/…
1

My problem, fyi, was that I wasn't returning the resulting dataframe, so PyCharm wasn't bothering to update said dataframe. Naming the dataframe after the return keyword fixed the issue.

Edit: I had return at the end of my method instead of return df, which the debugger must have noticed, because df wasn't being updated in spite of my explicit, in-place sort.
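
To illustrate the pitfall, here is a rough sketch. The original method is not shown in this answer, so the functions and sample data below are hypothetical, and they assume the sort rebinds a local name inside the method. Under that assumption, a bare return hands None back to the caller, while return df hands back the sorted frame:

```python
import pandas as pd


def sort_bare_return(df):
    # Rebinds the local name only; the caller's object is not reordered.
    df = df.sort_values("Total Due")
    return  # bare return: the caller receives None


def sort_return_df(df):
    df = df.sort_values("Total Due")
    return df  # the caller receives the sorted DataFrame


# Hypothetical sample data.
sales = pd.DataFrame({"Total Due": [21.40, 5.35, 10.70]})

lost = sort_bare_return(sales)
kept = sort_return_df(sales)

print(lost)
print(kept["Total Due"].tolist())
```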

5 Comments

I am using PyCharm and facing the same problem. I tried naming the dataframe at several different steps but cannot solve this issue. Could you please give more specifics about how you resolved your problem?
@r_hudson I added an Edit.
Thanks @gherson for the explanation. However, I wasted some 2 hours today and my issue is still not resolved. I am sorting along axis=1 and also specifying by = column_names, and for some reason the dataframe is not sorting. Surprisingly, it was working fine yesterday.
I tell my students to get a trivial instance working then divide and conquer: add ~half of the non-working code and, depending if it still works, add or subtract code, until the point of contention is isolated, and tweaked until working.
@r_hudson I was in the same boat as you. In my case the issue turned out to be that the values in my column looked numeric but were not. The following trick worked for me:

df["time"] = pd.to_numeric(df["time"])
df.sort_values(by=['time'], ascending=True, inplace=True)
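
A small sketch of why that conversion matters, assuming a made-up "time" column that holds digits stored as strings (the sample values and variable names are illustrative only): strings sort character by character, so the order looks wrong until the column is converted with pd.to_numeric.

```python
import pandas as pd

# Hypothetical column whose values look numeric but are stored as strings.
df = pd.DataFrame({"time": ["10", "9", "100"]})

# String comparison is lexicographic, so "100" sorts before "9".
string_sorted = df.sort_values(by=["time"])["time"].tolist()

# Convert to a numeric dtype, then sort again.
df["time"] = pd.to_numeric(df["time"])
numeric_sorted = df.sort_values(by=["time"])["time"].tolist()

print(string_sorted)
print(numeric_sorted)
```

Checking df.dtypes before sorting is a quick way to spot a column that was read in as object rather than as a number.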
