For loop with f-string with pandas dataframe

Question

I need to create two loops (they must be separate):

LOOP 1) for each fruit:

keep rows if that fruit is True
remove rows with duplicate dates (either row can be deleted)
save the result of the above as a dataframe for each fruit

LOOP 2) for each dataframe created, plot that fruit's score (fruit_score) against date

Sample data:

    concat  apple_score  banana_score  date        apple  banana
1   apple   0.400        0.400         2010-02-12  True   False
2   banana  0.530        0.300         2010-01-12  False  True
3   kiwi    0.532        0.200         2010-03-03  False  False
4   bana    0.634        0.100         2010-03-03  False  True

I tried:

fruits = ['apple',  'banana',   'orange']
for fruit in fruits:
    selected_rows = df[df[ fruit ] == True ]
    df_f'{fruit}' = selected_rows.drop_duplicates(subset='date')

for fruit in fruits:
    df_f'{fruit}'.plot(x="date", y=(f'{fruit}_score'), kind="line")

Are you trying to programmatically define the name of a variable? Are you expecting to get a variable called df_apple, for example?
– Youyoun, Commented Jul 24, 2020 at 9:07
You could use a dict instead of building a variable name inside the for loop: stackoverflow.com/a/11553769/1735729
– Stergios, Commented Jul 24, 2020 at 9:09
Use a dict then: fruits_df = {}, and in your for loop use fruits_df[fruit] = ...
– Youyoun, Commented Jul 24, 2020 at 9:11
@Manakin I don't think that will work, because he has "bana" in concat while the banana column is set to True. Also, he wants to drop duplicate dates within the same fruit; the other approach would drop duplicates across all fruits that share a date. He's not looping over the dataframe, but over the fruits.
– Youyoun, Commented Jul 24, 2020 at 9:13
@Youyoun you can subset on more than one column: just add the fruit columns to .drop_duplicates. Nothing complex here, and no need to iterate over the list either.
– Umar.H, Commented Jul 24, 2020 at 9:17

Jack Fleeting · Accepted Answer · 2020-07-25 10:51:54Z

You should do something along the lines suggested by @youyoun:

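dfs = {}  # maps a name such as 'df_apple' to that fruit's dataframe
fruits = ['apple', 'banana']
for fruit in fruits:
    # keep rows where the fruit's boolean column is True, then drop repeated dates
    selected_rows = df[df[fruit]].drop_duplicates(subset='date')
    dfs[f'df_{fruit}'] = selected_rows

# the dict keys stand in for the variable names the question tried to build
for name, fruit_df in dfs.items():
    print(name)
    print(fruit_df)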

Output:

df_apple
  concat  apple_score  banana_score        date  apple  banana
1  apple          0.4           0.4  2010-02-12   True   False
df_banana
   concat  apple_score  banana_score        date  apple  banana
2  banana        0.530           0.3  2010-01-12  False    True
4    bana        0.634           0.1  2010-03-03  False    True
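
The answer above covers the first loop only. As a rough sketch of both loops together, the following rebuilds the question's sample data and then plots each fruit's score against date in a second, separate loop. It assumes the date column is parsed as datetimes and the fruit columns are real booleans. The dict is keyed by the plain fruit name rather than 'df_apple', which makes the matching score column easier to look up. The axes dict and the marker/title arguments are illustrative choices, not part of the original question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (dates parsed as datetimes: an assumption).
df = pd.DataFrame(
    {
        "concat": ["apple", "banana", "kiwi", "bana"],
        "apple_score": [0.400, 0.530, 0.532, 0.634],
        "banana_score": [0.400, 0.300, 0.200, 0.100],
        "date": pd.to_datetime(
            ["2010-02-12", "2010-01-12", "2010-03-03", "2010-03-03"]
        ),
        "apple": [True, False, False, False],
        "banana": [False, True, False, True],
    },
    index=[1, 2, 3, 4],
)

fruits = ["apple", "banana"]

# Loop 1: one filtered, de-duplicated dataframe per fruit, keyed by fruit name.
dfs = {}
for fruit in fruits:
    dfs[fruit] = df[df[fruit]].drop_duplicates(subset="date")

# Loop 2: plot each fruit's score against date, one figure per fruit.
axes = {}
for fruit, fruit_df in dfs.items():
    axes[fruit] = fruit_df.sort_values("date").plot(
        x="date", y=f"{fruit}_score", kind="line", marker="o", title=fruit
    )

# In an interactive session, call plt.show() here to display the figures.
```

A name like df_f'{fruit}' cannot appear on the left of an assignment, because an f-string is a value, not an identifier. Storing each result under a dict key gives the same per-fruit lookup without generating variable names.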


1 Comment

Umar.H Over a year ago

Even simpler, you could do dfs = {fruit: data.drop_duplicates(subset='date') for fruit, data in df.groupby('fruit')} or something along those lines.
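
The groupby idea in this comment presumes a single 'fruit' column, which the question's frame does not have (it has one boolean column per fruit). One possible way to get there, offered only as a sketch, is to melt the boolean columns into a long format first. The names long_df, 'fruit' and 'flag' are made up for this illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "concat": ["apple", "banana", "kiwi", "bana"],
        "apple_score": [0.400, 0.530, 0.532, 0.634],
        "banana_score": [0.400, 0.300, 0.200, 0.100],
        "date": ["2010-02-12", "2010-01-12", "2010-03-03", "2010-03-03"],
        "apple": [True, False, False, False],
        "banana": [False, True, False, True],
    },
    index=[1, 2, 3, 4],
)

fruits = ["apple", "banana"]

# Turn the per-fruit boolean columns into one 'fruit' column plus a 'flag' column.
long_df = df.melt(
    id_vars=["concat", "apple_score", "banana_score", "date"],
    value_vars=fruits,
    var_name="fruit",   # hypothetical column name
    value_name="flag",  # hypothetical column name
)

# Keep only the rows where the fruit was flagged True, then drop repeated
# dates within each fruit (not across fruits).
long_df = long_df[long_df["flag"]].drop(columns="flag")
long_df = long_df.drop_duplicates(subset=["fruit", "date"])

# One dataframe per fruit, without an explicit for loop over the fruit list.
dfs = {fruit: data for fruit, data in long_df.groupby("fruit")}
```

Note that melt discards the original row labels by default, so the per-fruit frames here are indexed differently from the ones in the accepted answer.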
