I have a DataFrame called df_civic with the columns state, rank, make/model, Model Year and Thefts. I want to calculate the mean and standard deviation of Thefts for each model year.

I collect all the years present in the DataFrame with: years_civic = list(pd.unique(df_civic['Model Year']))

My loop looks like this:

for civic_year in years_civic:
    f = df_civic['Model Year'] == civic_year
    civic_avg = df_civic[f]['Thefts'].mean()
    civic_std = df_civic[f]['Thefts'].std()
    civic_std= np.round(car_std,2)
    civic_avg= np.round(car_avg,2)
    print(civic_avg, civic_std, np.sum(f))

However, the output is not what I need. The only value that is correct is the one from np.sum(f).

The output currently looks like this:

9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 13
9.0 20.51 15
9.0 20.51 3
9.0 20.51 2
  • Please include sample data and format your question according to tips provided in this post: stackoverflow.com/a/20159305 Commented Jan 6, 2021 at 17:22
  • @Aleksander, you can use triple ``` code ``` to mark a code block over multiple lines. It took a while to edit your hundreds of code blocks and <br>s :) Also, you can simply start a new line without using <br>; that is allowed in Markdown. Check how I edited your question so you can format it better next time. Cheers. Commented Jan 6, 2021 at 17:22
  • Hi, sorry, I'll use the correct ones next time! Commented Jan 6, 2021 at 17:26

1 Answer

Pandas provides many powerful functions for aggregating data. It's usually better to reach for these functions before writing a for loop.

For instance, you can use:

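import pandas as pd

# Mean and sample standard deviation of Thefts for each model year.
# The column is named "Thefts" in the question, so that is the key used here.
df_civic.groupby("Model Year").agg({"Thefts": ["mean", "std"]})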

More doc here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.agg.html
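Since no sample data was posted, here is a minimal sketch of the groupby approach on a small made-up frame. The values in df_civic below are hypothetical and only the column names 'Model Year' and 'Thefts' come from the question; the variable name stats is also just for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real df_civic was not posted in the question.
df_civic = pd.DataFrame({
    "Model Year": [1998, 1998, 2000, 2000, 2000],
    "Thefts": [10, 20, 5, 7, 9],
})

# One row per model year: mean, sample standard deviation, and row count.
stats = (
    df_civic.groupby("Model Year")["Thefts"]
    .agg(["mean", "std", "count"])
    .round(2)
)
print(stats)
```

The count column plays the same role as np.sum(f) in the loop, so it can be used to check the grouped result against the loop output.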

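Regarding your code, there is something odd: the loop computes civic_std and civic_avg, but then rounds car_std and car_avg, which are not defined anywhere in the snippet. If those names exist from earlier code, every iteration will print their old values, which would explain why the first two columns never change.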
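For comparison, this is a sketch of the same loop with the names made consistent, so that the rounded values come from the current iteration. The sample data is again hypothetical, and the results list is only there to make the output easy to inspect.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df_civic was not posted in the question.
df_civic = pd.DataFrame({
    "Model Year": [1995, 1998, 1998, 2000, 2000, 2000],
    "Thefts": [12, 10, 20, 5, 7, 9],
})

years_civic = list(pd.unique(df_civic["Model Year"]))

results = []
for civic_year in years_civic:
    # Boolean mask selecting the rows for this model year.
    f = df_civic["Model Year"] == civic_year
    # Round the values computed in this iteration (civic_*, not car_*).
    civic_avg = np.round(df_civic.loc[f, "Thefts"].mean(), 2)
    civic_std = np.round(df_civic.loc[f, "Thefts"].std(), 2)
    results.append((civic_year, civic_avg, civic_std, int(np.sum(f))))
    print(civic_avg, civic_std, np.sum(f))
```

Note that pandas computes the sample standard deviation by default, so a year with a single row gives NaN for civic_std rather than 0.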


4 Comments

Why hurry? You haven't told the OP where he might be going wrong, and while yours is likely a better solution, why offer an untested one?
@navneethc can you indicate what I'm doing wrong, I'd love to know that as well honestly.
@AleksanderKuś Can you post sample data?
I checked the code: the loop itself is correct and works. I hadn't defined car_std and car_avg correctly, so it took their values from another loop. Thanks, guys.
