I want to select all rows with a particular index value. My DataFrame looks like this:

>>> df
                            Code
Patient Date                        
1       2003-01-12 00:00:00  a
        2003-02-13 00:00:00  b
        2003-02-14 00:00:00  ba
2       2001-1-17 22:00:00  z
        2002-1-21 00:00:00  d
        2003-1-21 00:00:00  a
        2005-12-1 00:00:00  ba

Selecting a single value from the first (Patient) index level works:

>>> df.loc[1]
                            Code
Patient Date                        
1       2003-01-12 00:00:00  a
        2003-02-13 00:00:00  b
        2003-02-14 00:00:00  ba

But selecting multiple values from the first (Patient) index level does not:

>>> df.loc[[1, 2]]
                            Code
Patient Date                        
1       2003-01-12 00:00:00  a
2       2001-1-17 22:00:00  z

However, I would like to get every row for both patients, i.e. the original DataFrame (the result I would get by repeating the labels, as in [1,1,1,2]).

With a single-level index it works fine. For example:

>>> df.reset_index().set_index("Patient").loc[[1, 2]]
                   Date     Code
Patient                          
1       2003-01-12 00:00:00  a
        2003-02-13 00:00:00  b
        2003-02-14 00:00:00  ba
2       2001-1-17 22:00:00  z
        2002-1-21 00:00:00  d
        2003-1-21 00:00:00  a
        2005-12-1 00:00:00  ba

TL;DR: Why do I have to repeat the labels when selecting from a MultiIndex, but not when selecting from a single-level index?

EDIT: Apparently it can be done with something like:

>>> df.loc[df.index.get_level_values("Patient").isin([1, 2])]

But this seems quite dirty to me. Is this the way to do it, or is there a better way?
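For illustration, here is a minimal sketch of the boolean-mask approach from the EDIT. The frame is rebuilt by hand from the listing above; that construction code and the names `mask` and `subset` are assumptions, not part of the original post. Because `isin` only tests membership, a label that is absent from the index (3 here) is simply ignored rather than raising a KeyError.

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question (assumed construction).
index = pd.MultiIndex.from_arrays(
    [
        [1, 1, 1, 2, 2, 2, 2],
        pd.to_datetime(
            [
                "2003-01-12 00:00:00",
                "2003-02-13 00:00:00",
                "2003-02-14 00:00:00",
                "2001-01-17 22:00:00",
                "2002-01-21 00:00:00",
                "2003-01-21 00:00:00",
                "2005-12-01 00:00:00",
            ]
        ),
    ],
    names=["Patient", "Date"],
)
df = pd.DataFrame({"Code": ["a", "b", "ba", "z", "d", "a", "ba"]}, index=index)

# Boolean mask over the "Patient" level: True where the label is in the list.
mask = df.index.get_level_values("Patient").isin([1, 3])

# Patient 3 does not exist, so only the rows for patient 1 are kept.
subset = df.loc[mask]
print(subset)
```

The mask is verbose, but it depends only on `get_level_values` and `isin`, so it should behave the same way regardless of how list-based `.loc` lookups treat a MultiIndex.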

  • df.loc[1][1:2] should select two lines Commented Jun 5, 2014 at 10:32
  • Which pandas version? Can you show df.info()? Commented Jun 5, 2014 at 10:54
  • @Jeff I also get this with 0.13 and 0.14. Only in 0.14 you can do df.loc[([1,2],),:] to get what you want. Commented Jun 5, 2014 at 11:00
  • By the way, is it possible to achieve the same without KeyErrors? I.e. if I do df.loc[([1,3],),:] when 3 is missing, only the result for 1 is returned. Commented Jun 5, 2014 at 12:44
  • this was a non-trivial bug, see here: github.com/pydata/pandas/pull/7350 Commented Jun 5, 2014 at 12:45

1 Answer

For pandas version 0.14, the recommended way, according to the comment above, is:

df.loc[([1,2],),:]
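
As a hedged sketch, the same slicer can be spelled more explicitly with `pd.IndexSlice`, the helper pandas provides for MultiIndex slicers (available from 0.14 onward). The frame below is rebuilt by hand from the question's listing; that construction and the names `idx`, `selected`, and `only_first` are assumptions for illustration.

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question (assumed construction).
index = pd.MultiIndex.from_arrays(
    [
        [1, 1, 1, 2, 2, 2, 2],
        pd.to_datetime(
            [
                "2003-01-12 00:00:00",
                "2003-02-13 00:00:00",
                "2003-02-14 00:00:00",
                "2001-01-17 22:00:00",
                "2002-01-21 00:00:00",
                "2003-01-21 00:00:00",
                "2005-12-01 00:00:00",
            ]
        ),
    ],
    names=["Patient", "Date"],
)
df = pd.DataFrame({"Code": ["a", "b", "ba", "z", "d", "a", "ba"]}, index=index)

idx = pd.IndexSlice

# Row indexer: a list of labels for "Patient", everything for "Date".
# Column indexer: ":" keeps all columns.
selected = df.loc[idx[[1, 2], :], :]

# The same form with a single-element list keeps both index levels.
only_first = df.loc[idx[[1], :], :]

print(selected)
print(only_first)
```

Spelling out the `Date` level as `:` makes it clear that the list applies to the first index level only, which is the intent behind the trailing comma in `([1,2],)`.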