I have a DataFrame like the following:

import numpy as np
import pandas as pd
import string
import random

random.seed(42)

df = pd.DataFrame({'col1': list(string.ascii_lowercase)[:11],
                   'col2':[random.randint(1,100) for x in range(11)]})

df

   col1  col2
0     a    64
1     b     3
2     c    28
3     d    23
4     e    74
5     f    68
6     g    90
7     h     9
8     i    43
9     j     3
10    k    22

I'm trying to create a new DataFrame by filtering the rows of the previous DataFrame whose col1 value matches one of a list of values. I have tried the following code:

df_filt = df[df['col1'] in ['a','c','h']]

But I get an error. I expect the following result:

df_filt

  col1  col2
0    a    64
1    c    28
2    h     9

I'm looking for a flexible solution that also works when the matching list contains more elements than the ones shown in this example.

Comment: use isin to solve this, df['col1'].isin(['a','c','h']) (Commented Oct 31, 2018 at 11:12)

2 Answers

You can use pandas.Series.isin for compound "in"-checks.

Input dataframe:

>>> df
   col1  col2
0     a    64
1     b     3
2     c    28
3     d    23
4     e    74
5     f    68
6     g    90
7     h     9
8     i    43
9     j     3
10    k    22

Output dataframe:

>>> df[df['col1'].isin(['a', 'c', 'h'])]
  col1  col2
0    a    64
2    c    28
7    h     9
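
Note that the filtered rows keep their original index labels (0, 2, 7), while the question expects 0, 1, 2. A minimal sketch of one way to get there, assuming a renumbered index is really what is wanted: chain reset_index(drop=True) after the filter. The name wanted is only illustrative, and the col2 values are copied from the question's printed output rather than regenerated, since the random sequence may differ between Python versions.

```python
import pandas as pd

# Data copied from the question's printed DataFrame (not regenerated with random).
df = pd.DataFrame({
    'col1': list('abcdefghijk'),
    'col2': [64, 3, 28, 23, 74, 68, 90, 9, 43, 3, 22],
})

# Hypothetical name; the list can hold any number of values.
wanted = ['a', 'c', 'h']

# isin returns a boolean Series with one True/False per row.
mask = df['col1'].isin(wanted)

# Keep the matching rows, then renumber the index from 0 and drop the old labels.
df_filt = df[mask].reset_index(drop=True)
print(df_filt)
```

Because the list is passed as an ordinary variable, adding more values to wanted needs no other change to the code.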

Use isin

df_filt = df[df.col1.isin(['a','c','h'])]
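
For context, here is a small sketch of why the original attempt fails and what isin does instead. As far as I can tell, Python's in operator compares the whole Series against each list element and then needs a single True/False, which pandas refuses to give, so a ValueError is raised. The sketch also shows ~ for the opposite filter. The data is copied from the question's output, and the names raised and df_rest are just illustrative.

```python
import pandas as pd

# Data copied from the question's printed DataFrame.
df = pd.DataFrame({
    'col1': list('abcdefghijk'),
    'col2': [64, 3, 28, 23, 74, 68, 90, 9, 43, 3, 22],
})

# The original attempt: `in` asks for one boolean for the whole Series,
# and pandas raises ValueError because that truth value is ambiguous.
try:
    df[df['col1'] in ['a', 'c', 'h']]
    raised = False
except ValueError:
    raised = True
print("original attempt raised ValueError:", raised)

# isin works element-wise and returns a boolean mask of the same length as df.
mask = df['col1'].isin(['a', 'c', 'h'])

# ~ inverts the mask, keeping the rows whose col1 is NOT in the list.
df_rest = df[~mask]
print(df_rest)
```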
