Approach #1
Here's a vectorized approach using NumPy's broadcasting -
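import numpy as np

def broadcasting_based(df, tuples):
    idx = np.array(tuples)
    # Compare every row's (B, C) against every tuple; mask shape is (n_tuples, n_rows)
    mask = (df.B.values == idx[:, None, 0]) & (df.C.values == idx[:, None, 1])
    # Keep only the rows that match none of the tuples
    return df[~mask.any(0)]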
Sample run -
In [224]: df
Out[224]:
   A  B  C
0  6  4  4
1  2  0  3
2  8  3  4
3  7  8  3
4  6  7  8
5  3  3  2
6  5  4  2
7  2  4  7
8  6  1  6
9  1  1  1
In [225]: tuples = [(3,4),(7,8),(1,6)]
In [226]: broadcasting_based(df,tuples)
Out[226]:
   A  B  C
0  6  4  4
1  2  0  3
3  7  8  3
5  3  3  2
6  5  4  2
7  2  4  7
9  1  1  1
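To see what the broadcasting step in Approach #1 produces, here is a minimal sketch on a three-row frame. The names `small` and `pairs` and their values are made-up illustration data, not taken from the sample run -

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
small = pd.DataFrame({'A': [6, 8, 3], 'B': [4, 3, 3], 'C': [4, 4, 2]})
pairs = [(3, 4), (7, 8)]

idx = np.array(pairs)  # shape (2, 2): one row per tuple

# idx[:, None, 0] has shape (2, 1). Comparing it against a column of
# shape (3,) broadcasts to a (2, 3) grid: one row per tuple, one column
# per dataframe row.
mask = (small.B.values == idx[:, None, 0]) & (small.C.values == idx[:, None, 1])

print(mask.shape)            # (2, 3)
print(mask.any(0))           # True only for the middle row, which equals (3, 4)
print(small[~mask.any(0)])   # rows 0 and 2 remain
```

The intermediate mask holds one entry per (tuple, row) pair, so its memory footprint grows with the product of the two lengths.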
Approach #2 : To cover a generic number of columns
For a case like this, one could collapse the information from the different columns into a single entry per row that uniquely represents the combination of values across those columns. This can be achieved by treating each row as an indexing tuple and converting it to a linear index, so that each row becomes one scalar. Similarly, the list of tuples to be matched can be reduced to a 1D array, with each tuple becoming one scalar. Finally, we use np.in1d to look for matches between the two, get the valid mask, and return the dataframe with the matching rows removed. Note that np.ravel_multi_index, which does the collapsing, expects non-negative integers, so this approach assumes the columns hold such values. Thus, the implementation would be -
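def linear_indexing_based(df, tuples):
    idx = np.array(tuples)
    BC_arr = df[['B','C']].values
    # Grid shape large enough to hold every value from both the frame and the tuples
    shp = np.maximum(BC_arr.max(0)+1, idx.max(0)+1)
    # Collapse each (B, C) pair, and each tuple, into a single linear index
    BC_IDs = np.ravel_multi_index(BC_arr.T, shp)
    idx_IDs = np.ravel_multi_index(idx.T, shp)
    # Keep rows whose ID is absent from the tuple IDs
    # (np.isin is the recommended replacement for np.in1d in newer NumPy)
    return df[~np.in1d(BC_IDs, idx_IDs)]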
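The function above hard-codes columns B and C. As a rough sketch of how the same idea could extend to any number of columns, the variant below takes a `cols` argument. The function name `drop_matching_rows`, the `cols` parameter, and the `demo` data are hypothetical and not part of the original approach. It also swaps in np.isin for np.in1d, which gives the same mask for 1D inputs, and it still assumes non-negative integer values -

```python
import numpy as np
import pandas as pd

def drop_matching_rows(df, tuples, cols):
    """Drop rows whose values in `cols` equal any tuple in `tuples`.

    Hypothetical generalization; assumes non-negative integer values.
    """
    idx = np.array(tuples)
    arr = df[list(cols)].values
    # Per-column extent large enough for both the frame and the tuples
    shp = np.maximum(arr.max(0) + 1, idx.max(0) + 1)
    # One scalar ID per dataframe row and per tuple
    row_ids = np.ravel_multi_index(arr.T, shp)
    tup_ids = np.ravel_multi_index(idx.T, shp)
    return df[~np.isin(row_ids, tup_ids)]

# Made-up data: rows 1 and 2 match the tuples and are dropped
demo = pd.DataFrame({'A': [1, 2, 3, 2], 'B': [0, 5, 1, 5], 'C': [2, 2, 0, 7]})
out = drop_matching_rows(demo, [(2, 5, 2), (3, 1, 0)], cols=['A', 'B', 'C'])
print(out)
```

Since the linear index is the product of the per-column extents, very wide value ranges across many columns could overflow the integer type, so this is best suited to columns with modest ranges.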