
I have a pandas DataFrame with a MultiIndex. Unfortunately, one of the index levels stores years as strings,

e.g. '2010', '2011'.

How do I convert these to integers?

More concretely

MultiIndex(levels=[[u'2010', u'2011'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...]],
           names=[u'Year', u'Month'])


df_cbs_prelim_total.index.set_levels(df_cbs_prelim_total.index.get_level_values(0).astype('int'))

seems to do it, but not in place. Is there a proper way of changing them?

Cheers, Mike

    Can you just convert them before assigning them to your index? That would seem to be the least painful method. Commented Nov 24, 2014 at 17:18
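A minimal sketch of the suggestion above, converting the years while they are still an ordinary column. The frame and its `value` column are hypothetical; only the `Year` and `Month` names come from the question.

```python
import pandas as pd

# Hypothetical data: the column names mirror the index names in the question.
df = pd.DataFrame({
    "Year": ["2010", "2010", "2011", "2011"],
    "Month": [0, 1, 0, 1],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Convert the string years while they are still a regular column...
df["Year"] = df["Year"].astype(int)

# ...then build the MultiIndex from the already-numeric columns.
df = df.set_index(["Year", "Month"])
```

This avoids touching the index levels afterwards, but it only applies if the frame can still be rebuilt from its columns.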

1 Answer


It will probably be cleaner to do this before you assign it as the index (as @EdChum points out), but when it is already the index, you can indeed use set_levels to replace the values of one level of your MultiIndex. A slightly cleaner version of your code (you can use index.levels[..] and pass level= to target a single level):

In [165]: idx = pd.MultiIndex.from_product([[1,2,3], ['2011','2012','2013']])

In [166]: idx
Out[166]:
MultiIndex(levels=[[1, 2, 3], [u'2011', u'2012', u'2013']],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])

In [167]: idx.levels[1]
Out[167]: Index([u'2011', u'2012', u'2013'], dtype='object')    

In [168]: idx = idx.set_levels(idx.levels[1].astype(int), level=1)

In [169]: idx
Out[169]:
MultiIndex(levels=[[1, 2, 3], [2011, 2012, 2013]],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])

You have to reassign the result to save the changes (as is done above). In your case this would be df_cbs_prelim_total.index = df_cbs_prelim_total.index.set_levels(...).
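For illustration, the reassignment described above might look like this on a DataFrame. The frame here is a small hypothetical stand-in for df_cbs_prelim_total, and selecting the level by its name rather than by position is my own choice.

```python
import pandas as pd

# Hypothetical stand-in for df_cbs_prelim_total from the question.
index = pd.MultiIndex.from_product(
    [["2010", "2011"], range(3)], names=["Year", "Month"]
)
df = pd.DataFrame({"value": range(6)}, index=index)

# set_levels returns a new MultiIndex, so assign it back to df.index.
df.index = df.index.set_levels(df.index.levels[0].astype(int), level="Year")
```

After this, lookups such as df.loc[(2011, 2)] use integer years.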


1 Comment

Note that the levels attribute of a pandas MultiIndex stores each level's unique values in sorted order, which for strings means lexicographic order (e.g., labels '1' to '100' are stored as '1', '10', '100', '2', '20', '3', ...). After astype(int) the level values therefore sit in the order 1, 10, 100, 2, ..., although each row still maps to its original value because the integer codes are unchanged. Be careful with sorting the converted values inside set_levels(), as in idx = idx.set_levels(idx.levels[1].astype(int).sort_values(), level=1): that moves the values without updating the codes, so rows can end up with the wrong label. If numeric ordering is needed, sort the DataFrame after the conversion (e.g. with sort_index()).
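A small check of how level values and codes interact, worth running before relying on the sort_values() variant mentioned above. The labels are hypothetical, chosen so that lexicographic and numeric order differ. The behaviour described reflects my understanding of how pandas builds a MultiIndex with from_product.

```python
import pandas as pd

# Hypothetical string labels whose lexicographic order differs from numeric order.
idx = pd.MultiIndex.from_product([["a", "b"], ["1", "2", "10"]])

# Elementwise conversion keeps every level value at its original position,
# so the stored integer codes still point at the right value.
converted = idx.set_levels(idx.levels[1].astype(int), level=1)

# Sorting the converted values moves them, but the codes are not updated.
resorted = idx.set_levels(idx.levels[1].astype(int).sort_values(), level=1)

print(list(idx.levels[1]))   # stored order of the string level
print(converted.tolist())    # tuples after plain conversion
print(resorted.tolist())     # tuples after conversion plus sort_values()
```

Comparing the last two printouts with the original tuples shows whether each row kept its own label. In this sketch the plain conversion preserves the tuples, while the sorted variant swaps 2 and 10.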
