Could someone please help me out? I'm trying to avoid iterating through the dataframe row by row, and I suspect the fix is easy for someone who knows pandas well.
Dataframe:
             id  racecourse     going  distance  runners  draw  draw_bias
0        253375         178  Standard       7.0       13     2       0.50
1        253375         178  Standard       7.0       13    11       0.25
2        253375         178  Standard       7.0       13    12       1.00
3        253376         178  Standard       6.0       12     2       1.00
4        253376         178  Standard       6.0       12     8       0.50
...         ...         ...       ...       ...      ...   ...        ...
378867  4802789         192  Standard       7.0       16    11       0.50
378868  4802789         192  Standard       7.0       16    16       0.10
378869  4802790         192  Standard       7.0       16     1       0.25
378870  4802790         192  Standard       7.0       16     3       0.50
378871  4802790         192  Standard       7.0       16     8       1.00
378872 rows × 7 columns
What I need is a new column holding, for each row, the number of unique races (id) that share that row's racecourse, going, distance and runners. The code below works as expected, but it is very slow because it rescans the whole dataframe for every row:
df['race_count'] = None
for i, row in df.iterrows():
    # Mask of rows sharing this row's racecourse, going, distance and runners
    same = (df.racecourse == row.racecourse) & (df.going == row.going) & (df.distance == row.distance) & (df.runners == row.runners)
    df.at[i, 'race_count'] = df.loc[same, 'id'].nunique()
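For comparison, the same per-row count can probably be computed without a loop by grouping on the four condition columns and broadcasting the number of unique ids back to every row. This is only a minimal sketch: the small dataframe is made-up sample data (not the real one), the `keys` name is introduced here for illustration, and it assumes none of the four key columns contain NaN, since groupby drops NaN keys by default.

```python
import pandas as pd

# Made-up sample data with the same column names as the question's dataframe
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3, 4],
    "racecourse": [178, 178, 178, 178, 192, 192],
    "going": ["Standard"] * 6,
    "distance": [7.0, 7.0, 7.0, 7.0, 7.0, 6.0],
    "runners": [13, 13, 13, 13, 16, 16],
})

# Columns that define "the same conditions" in the loop version
keys = ["racecourse", "going", "distance", "runners"]

# Count distinct ids per group and align the result back to every row
df["race_count"] = df.groupby(keys)["id"].transform("nunique")

print(df)
```

Each group is scanned once rather than once per row, which is where the speed-up over the iterrows loop would be expected to come from.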