Take a sample dataset:

import numpy as np
import pandas as pd

df = pd.DataFrame([['Mexico', 'Chile'], ['Nicaragua', 'Nica'], ['Colombia', 'Mex']], columns=["col1", "col2"])

The dataframe looks like this:

        col1   col2
0     Mexico  Chile
1  Nicaragua   Nica
2   Colombia    Mex

I have two columns. For each value in col2, I want to check whether it exists anywhere in col1, either as a whole value or as a partial string (substring).

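The desired output is a new column that marks, for each row, whether the value in col2 appears anywhere in col1. (The original image of the output is not available.)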

I am able to compare the whole value of each row in column two, but this does not account for partial strings:

df['compare'] = np.where(df['col2'].isin(df['col1']), 'yes', 'no')

I am also able to check if a single value exists within a column, which checks for partial strings but does not include every row in the 'col2' column.

df['compare'] = df['col1'].str.contains('Mex')
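
For reference, here is a minimal sketch of what the two attempts above produce on the sample data. The variable names `exact` and `partial` are only illustrative and are not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Mexico", "Chile"], ["Nicaragua", "Nica"], ["Colombia", "Mex"]],
    columns=["col1", "col2"],
)

# Attempt 1: isin() only matches whole values, so 'Nica' and 'Mex'
# are not found even though they are substrings of values in col1.
exact = np.where(df["col2"].isin(df["col1"]), "yes", "no")
print(exact.tolist())  # ['no', 'no', 'no']

# Attempt 2: str.contains() matches substrings, but only for the one
# hard-coded value 'Mex', and it reports per row of col1, not col2.
partial = df["col1"].str.contains("Mex")
print(partial.tolist())  # [True, False, False]
```

Neither result answers the question on its own: the first misses partial matches, and the second covers a single search value.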

How can I do both at the same time?

1 Answer

This is likely to be an expensive operation, because every value in col2 has to be checked against every value in col1. You can try:

df['col2'].apply(lambda x: 'Yes' if df['col1'].str.contains(x).any() else 'No')

Output:

0     No
1    Yes
2    Yes
Name: col2, dtype: object
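
A variation on the same idea, as a sketch rather than a definitive implementation: `str.contains` treats its argument as a regular expression by default, so a value in col2 containing characters such as `(` or `.` could raise an error or match unexpectedly. Passing `regex=False` makes it a plain substring test. The helper name `appears_in` is hypothetical, and the lowercase 'yes'/'no' labels are an assumption taken from the question's own code.

```python
import pandas as pd

df = pd.DataFrame(
    [["Mexico", "Chile"], ["Nicaragua", "Nica"], ["Colombia", "Mex"]],
    columns=["col1", "col2"],
)


def appears_in(value: str, column: pd.Series) -> str:
    """Return 'yes' if value is a literal substring of any entry in column."""
    # regex=False treats value as plain text instead of a regex pattern.
    found = column.str.contains(value, regex=False).any()
    return "yes" if found else "no"


# Apply the check to every value in col2 and store it as a new column.
df["compare"] = df["col2"].apply(appears_in, column=df["col1"])
print(df)
```

This still scans col1 once per row of col2, so the cost grows with the product of the two column lengths.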