I have a DataFrame called df:

Category    block   array_size  num_node    num_task    time
 DATA         2     100             1          1        0.104
 DATA         2     100             1          2        0.348
 DATA         2     100             1          1        2.837 
 DATA         2     1000            1          1        29.188
 DATA         2     1000            1          1       284.087

With this DataFrame, I want to compute the mean time for each configuration.
The variables I want are named df_foo_{block}_{array_size}_{num_task}, for example:

df_foo_2_100_1 = df.loc[
    (df["num_task"] == 1) &
    (df["block"] == 2) &
    (df["array_size"] == 100)]["time"].mean()
df_foo_2_1000_1 = df.loc[
    (df["num_task"] == 1) &
    (df["block"] == 2) &
    (df["array_size"] == 1000)]["time"].mean()

How can I create these variables automatically with a loop?

Thanks!

1 Answer

You can use groupby. For a single num_task and block combination:

df.loc[(df["num_task"] == 1) & (df["block"] == 2)].groupby('array_size').time.mean()
Out[206]: 
array_size
100       1.4705
1000    156.6375
Name: time, dtype: float64

It seems like what you need is the mean for every combination:

df.groupby(['num_task','block','array_size']).time.mean()
Out[208]: 
num_task  block  array_size
1         2      100             1.4705
                 1000          156.6375
2         2      100             0.3480
Name: time, dtype: float64
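
If the goal is to look up each mean by name, a dictionary built from the groupby result is usually easier to work with than separately named variables. Here is a minimal sketch, assuming the sample data from the question; the dictionary name `foo` and the key format are placeholders chosen to mirror the df_foo_{block}_{array_size}_{num_task} naming, not something pandas requires:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Category": ["DATA"] * 5,
    "block": [2, 2, 2, 2, 2],
    "array_size": [100, 100, 100, 1000, 1000],
    "num_node": [1, 1, 1, 1, 1],
    "num_task": [1, 2, 1, 1, 1],
    "time": [0.104, 0.348, 2.837, 29.188, 284.087],
})

# One mean per (block, array_size, num_task) combination.
means = df.groupby(["block", "array_size", "num_task"])["time"].mean()

# Hypothetical naming scheme: one dictionary entry per configuration
# instead of one variable per configuration.
foo = {
    f"df_foo_{block}_{array_size}_{num_task}": value
    for (block, array_size, num_task), value in means.items()
}

print(foo["df_foo_2_100_1"])

# The grouped Series can also be indexed directly by a tuple of keys.
print(means.loc[(2, 1000, 1)])
```

In many cases the grouped Series alone is enough, since `means.loc[(block, array_size, num_task)]` already gives the value for any configuration without creating extra names.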

1 Comment

So do I have to create all the different variables first, and then assign them the values generated by groupby?
