How to use pandas dataframe set_index()

Question

Let us create a pandas dataframe with two columns:

import pandas as pd

lendf = pd.read_csv('/git/opencv-related/experiments/audio_and_text_files_lens.csv',
        names=['path','duration'])

Here is the default numerically incrementing index:

Let's change the index to allow searching by the path attribute:

lendf.set_index(['path'])

But the index did not change.

How about invoking reindex()?

lendf.reindex()

Still no change!

Note that I had been referencing the Sphinx-generated documentation at https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html; here is an excerpt:

So then what am I misunderstanding about pandas indexing - and how should the search/indexing by path be set up?

Iain Shelvington · Accepted Answer · 2020-04-22 00:57:53Z


You need to pass inplace=True; otherwise set_index returns a new DataFrame and does not alter the existing one. Alternatively, assign the result back, as in lendf = lendf.set_index(['path']).

lendf.set_index(['path'], inplace=True)
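To make the difference concrete, here is a minimal sketch that assigns the returned DataFrame to a new name and then looks up a row by path. The in-memory CSV text and the file names in it (a.wav, b.wav, c.wav) are hypothetical stand-ins for the file in the question, used only so the snippet runs on its own. As a side note, reindex() conforms a frame to a given set of labels rather than promoting a column to the index, so it does not help here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV in the question: two columns, no header row.
csv_text = "a.wav,1.5\nb.wav,2.25\nc.wav,0.75\n"
lendf = pd.read_csv(io.StringIO(csv_text), names=["path", "duration"])

# set_index returns a new DataFrame; lendf itself keeps its default index.
indexed = lendf.set_index("path")
original_index_unchanged = list(lendf.index) == [0, 1, 2]

# Label-based lookup by path on the re-indexed frame.
duration_b = indexed.loc["b.wav", "duration"]

print(original_index_unchanged)
print(list(indexed.index))
print(duration_b)
```

Binding the result to a name (or back to lendf) behaves the same as inplace=True for the purpose of the lookup; which one to use is largely a matter of style.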


2 Comments

WestCoastProjects Over a year ago

OK, I had just realized that. I had specifically been looking to see whether this was in-place, but did not see df = df.set_index('month') in the Sphinx docs. Not the way I would write the docs, but I will be aware of this going forward.

Iain Shelvington Over a year ago

@javadba I agree that the docs leave something to be desired. Prefixing the method with "set_" even suggests that it would be in-place; why would setting something return a new instance?
