I am new to Python and have a question regarding matching strings in a list to a column in a df.
When I run the following commands, I would like a new column named "Match" to be created, and if there is a match between the string in the list and the string in the column the value in the "Match" column and corresponding row should be True, if no match, then False. The desired outcome would be False, False, True, False, False. Since the string "Honda" is not an exact match of "Honda Civic" it should not be True. Same with "Toy" is not the exact match of "Toyota Corolla".
Creating df:
Cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4', np.nan],
'Price': [22000,25000,27000,35000, 29000],
'Liscence Plate': ['ABC 123', 'XYZ 789', 'CBA 321', 'ZYX 987', 'DEF 456']}
df = DataFrame(Cars,columns= ['Brand', 'Price', 'Liscence Plate'])
I then create a list of the values I would like to search for, joined by |.
search_for_these_values = ['Honda', 'Toy', 'Ford Focus', 'Audi A4 2019']
pattern = '|'.join(search_for_these_values)
Here I have tried the str.match command and are given True, True, True, False, False.
df['Match'] = df["Brand"].str.match(pattern, na=False)
Here I have created a loop using the == operator and are given False, False, False, False, False.
for i in range(0,len(pattern)):
df['Match'] = df['Brand'] == pattern[i]
Thank you for the help!
df.Brand.isin(search_for_these_values)