Let us create a pandas dataframe with two columns:

import pandas as pd

lendf = pd.read_csv('/git/opencv-related/experiments/audio_and_text_files_lens.csv',
        names=['path','duration'])

Here is the default numerically incrementing index:

[screenshot: the dataframe with the default integer index]

Let's change the index to allow searching by the path attribute:

lendf.set_index(['path'])

But the index did not change??

[screenshot: the dataframe, still showing the default integer index]

How about invoking reindex() ?

lendf.reindex()

[screenshot: the dataframe, still showing the default integer index]

Still no change!

Note that I had been referencing the Sphinx-generated documentation at https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html; here is an excerpt:

[screenshot: excerpt from the set_index documentation]

So what am I misunderstanding about pandas indexing, and how should searching/indexing by path be set up?

1 Answer

You need to pass inplace=True; otherwise set_index returns a new dataframe and does not alter the existing one:

lendf.set_index(['path'], inplace=True)
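
As a rough illustration of the difference, here is a minimal sketch. It assumes a small in-memory CSV with made-up file names and durations standing in for the file in the question, and it reassigns the result instead of passing inplace=True, which has the same effect on the variable:

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV in the question: two unnamed columns.
csv_text = "a.wav,1.5\nb.wav,2.25\nc.wav,0.75\n"
lendf = pd.read_csv(io.StringIO(csv_text), names=["path", "duration"])

# set_index returns a new DataFrame; the original keeps its integer index.
indexed = lendf.set_index("path")
original_index_unchanged = list(lendf.index) == [0, 1, 2]
new_index_name = indexed.index.name

# Reassigning the result is the alternative to inplace=True.
lendf = lendf.set_index("path")

# With path as the index, a row can be looked up by its label.
duration_b = lendf.loc["b.wav", "duration"]
print(duration_b)
```

After the reassignment, path is no longer a regular column; it lives in the index, so lookups go through .loc.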

2 Comments

OK, I had just realized that. I had specifically been looking to see whether this was in-place but did not notice df = df.set_index('month') in the Sphinx docs. Not the way I would write the docs, but I will be aware of this going forward.
@javadba I agree that the docs leave something to be desired. Prefixing the method with "set_" even suggests that it would be in-place; why would setting something return a new instance?
