I am trying to remove rows from a large data frame based on whether each row has certain values in either of two different columns.
I will have a Series called "finalists". It will be a Series of names that is imported from a different part of the code and will change each time it's run.
For example:
finalists = ["Company A", "Company F", "Product S"... etc]
The DataFrame will be about 1,000 rows long and 200 columns wide.
Simplifying it, the dataframe would look something like this:
| category | score | description | company_name | product_name | comments |
|---|---|---|---|---|---|
| "----" | 2.8 | "----" | Company A | Product A | "----" |
| "----" | 1.2 | "----" | Company B | Product B | "----" |
| "----" | 2.4 | "----" | Company C | Product C | "----" |
I need to keep the rows where either the company_name column or the product_name column contains one of the values in the finalists Series (or, equivalently, remove the rows where neither does).
I tried doing something like this:
results = finalists.isin(app_data["company_name"]) or finalists.isin(app_data["product_name"])
but got an error saying the truth value of a Series is ambiguous.
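For reference, here is a minimal sketch of the kind of filter I am after. It assumes the check should run the other way around (each column tested against finalists, rather than finalists tested against each column) and that the two boolean masks are combined with the element-wise `|` operator instead of Python's `or`, which is what triggers the ambiguity error on a Series. The sample rows and the finalists values below are made up for illustration; "Product C" is included only so that a product_name match shows up.

```python
import pandas as pd

# Hypothetical sample data mirroring the simplified table above
app_data = pd.DataFrame({
    "score": [2.8, 1.2, 2.4],
    "company_name": ["Company A", "Company B", "Company C"],
    "product_name": ["Product A", "Product B", "Product C"],
})

# Hypothetical finalists; in the real code this comes from elsewhere
finalists = pd.Series(["Company A", "Company F", "Product C"])

# One boolean mask per column: True where the cell value is a finalist
company_match = app_data["company_name"].isin(finalists)
product_match = app_data["product_name"].isin(finalists)

# Element-wise OR of the masks, then keep only the matching rows
results = app_data[company_match | product_match]
print(results)
```

With this sample data, the first row is kept because of its company_name and the third row because of its product_name; the second row matches neither and is dropped.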