
When I have a dataframe

import pandas as pd
df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
df

   A  B
0  5  1
1  6  2
2  3  3
3  4  5

I can use

df[df['A'].isin([3, 6])]

in order to select rows having the passed values.

Is there also a way to keep the order of the input list?

So that my output is not:

   A  B
1  6  2
2  3  3

but

   A  B
2  3  3
1  6  2
  • Doing df[...] with boolean indexing keeps the order of the DataFrame, regardless of whether the ... part involves isin or not. You would have to reorder your DataFrame separately, before or after applying isin. Commented May 1, 2014 at 18:38
  • ok, isn't there a way to reorder the output by using the input list as rule? Commented May 1, 2014 at 18:45
  • No, because isin is only for checking whether each item at a time "is in" the list at all, not where it is in the list. It doesn't pay attention to the list's order. Like I said, you would need to do the ordering in a separate step. Commented May 1, 2014 at 19:15

7 Answers

8

This question is a bit old, but I stumbled into having to do this. This is how I resolved the problem. I believe it's quite a generic and simple solution that hasn't been proposed yet here, and that actually doesn't use the isin() method:

df.set_index('A').loc[[3,6]].reset_index()

With the example provided:

>>> df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
>>> df.set_index('A').loc[[3,6]].reset_index()
   A  B
0  3  3
1  6  2

Of course, this has the disadvantage that it loses the original index. To preserve the index you could also:

>>> df.reset_index().set_index('A').loc[[3,6]].reset_index().set_index('index')
       A  B
index      
2      3  3
1      6  2
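
One thing to watch with the .loc approach: recent pandas versions raise a KeyError when any of the requested values is missing from column A. Below is a minimal sketch of one way around that, assuming you simply want to skip values that are absent. The helper name select_in_order is made up for illustration and is not part of pandas.

```python
import pandas as pd

def select_in_order(df, column, values):
    """Hypothetical helper: rows of df whose `column` is in `values`,
    returned in the order of `values`. Absent values are skipped."""
    present = set(df[column].tolist())
    wanted = [v for v in values if v in present]
    # reset_index() stores the original index in a column named 'index'
    # (this assumes df has a default, unnamed index) so it survives set_index()
    indexed = df.reset_index().set_index(column)
    return indexed.loc[wanted].reset_index().set_index('index')

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
result = select_in_order(df, 'A', [3, 99, 6])  # 99 is not in column A
print(result)
```

If column A contains duplicates, .loc returns every matching row for each requested value, grouped in list order.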

1 Comment

This answer was exactly what I was looking for! Perfect.
4

This is a bit long, but it works: filter with isin(), then sort_values() on a categorical column built from the list.

import pandas

df = pandas.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
mylist = [3, 6]
# .copy() avoids a SettingWithCopyWarning when the helper column is added
ndf = df[df['A'].isin(mylist)].copy()
# An ordered categorical sorts by position in mylist, not by value
ndf['sort_cat'] = pandas.Categorical(ndf['A'], categories=mylist, ordered=True)
ndf = ndf.sort_values('sort_cat')
print(ndf)
   A  B sort_cat
2  3  3        3
1  6  2        6

(I based this answer on sort pandas dataframe based on list)

4

Another option, which filters and sorts in one shot:

import pandas as pd

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the per-value slices instead
pd.concat([df[df.A == i] for i in [3, 6]])

1 Comment

This works, but takes a long time (7-10 seconds) on a list with 7000-ish indices
3

You can make the input list a dataframe and use the merge function. I've found this particularly useful for large input lists where order matters.

For example:

import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
# named 'wanted' rather than 'input' to avoid shadowing the Python builtin
wanted = pd.DataFrame({'input': [3, 6]})
output = wanted.merge(df, left_on='input', right_on='A').loc[:, ['A', 'B']]
print(output)

   A  B
0  3  3
1  6  2

There are two caveats. First, you have to tell merge which column of df to match against, using its right_on argument. Second, the resulting dataframe gets a fresh 0..n-1 index, so the original index labels are lost.
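
If the original index matters, one possible workaround is to turn it into a regular column before merging and restore it afterwards. This is only a sketch: the names wanted, merged and result are illustrative, and it assumes df has a default, unnamed index, so that reset_index() produces a column called 'index'.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
wanted = pd.DataFrame({'A': [3, 6]})  # the ordered input list as a one-column frame

# reset_index() moves df's index into a column so it survives the merge;
# an inner merge keeps the order of the left-hand keys
merged = wanted.merge(df.reset_index(), on='A', how='inner')
result = merged.set_index('index')[['A', 'B']]
print(result)
```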

3

This is the best solution I found:

df.iloc[pd.Index(df.A).get_indexer([3,6])]

Result:

>>> df.iloc[pd.Index(df.A).get_indexer([3,6])]
   A  B
2  3  3
1  6  2

Credit: @cs95
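
A caveat worth knowing: get_indexer returns -1 for values it cannot find, and iloc reads -1 as "the last row", so a missing value would silently pull in the wrong row. The sketch below filters those entries out first. It assumes the values in column A are unique, which get_indexer requires; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
wanted = [3, 99, 6]  # 99 is not in column A

# Positions of each wanted value in column A; -1 marks "not found"
positions = pd.Index(df['A']).get_indexer(wanted)
# Drop the -1 entries, otherwise iloc would return the last row for them
result = df.iloc[positions[positions >= 0]]
print(result)
```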

0

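isin is a set-membership test, and the boolean mask it returns is aligned with the frame, so the selected rows come back in the frame's order rather than the list's.

If you REALLY want to, you could do this (note that it sorts the matches by value, so it reproduces the list's order only when the list itself is sorted):

In [15]: df.loc[df['A'][df['A'].isin([3,6])].sort_values().index]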
Out[15]: 
   A  B
2  3  3
1  6  2

[2 rows x 2 columns]
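
A sketch of a variant that orders the rows by each value's position in the list instead of by the value itself, so it should also cope with unsorted lists, duplicates and longer frames. The names wanted, rank and order are illustrative, and the example frame has an extra row added to show how duplicates behave.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4, 6], 'B': [1, 2, 3, 5, 7]})
wanted = [6, 3]  # deliberately not in ascending order

# Map each wanted value to its position in the list
rank = {value: pos for pos, value in enumerate(wanted)}

subset = df[df['A'].isin(wanted)]
# A stable sort on list position keeps the frame's order among duplicates
order = subset['A'].map(rank).sort_values(kind='mergesort').index
result = df.loc[order]
print(result)
```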

2 Comments

I ended up using [x for (y,x) in sorted(zip(this,list(df.B[df.A.isin([3,6])])))] giving me the wanted result. Unfortunately I could not achieve the same by your solution.
Nikita's solution fails for me with NameError: name 'this' is not defined. I found that Jeff's solution works for the example df, but not for a longer df.
0

It is not quite the same problem, but in my case this solution returned the data frame in the same order as the list passed to isin, which is what I wanted. Take a look here:

How to maintain order when selecting rows in pandas dataframe?

Perhaps it could help you.
