I have a dataframe that looks like this:

DayOfWeek  Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
00            0.0     0.0      0.0       19.0       0.0     4.0       0.0
01            0.0     0.0      0.0        0.0       0.0     7.0       0.0
07            0.0     0.0      3.0        5.0       3.0     0.0       1.0
08            0.0    17.0     16.0        8.0      10.0     1.0       0.0
09           10.0    48.0     30.0       86.0      12.0     3.0       0.0
10           70.0    58.0      3.0       36.0      52.0    70.0       0.0
11           32.0    26.0      0.0       20.0      38.0    42.0       0.0
12           21.0     9.0     83.0       32.0     129.0    57.0       0.0
13           53.0    51.0     55.0       36.0      18.0    32.0       0.0
14           64.0    62.0     24.0       21.0      53.0    61.0       0.0
15           46.0   121.0     37.0       31.0      58.0    54.0       0.0
16           95.0   139.0     86.0       58.0      79.0    11.0       0.0
17          113.0    56.0     73.0      146.0      78.0    17.0       0.0

and I want to convert it to percentages: sum each column, then divide each cell by the sum of its column. I tried this code:

df_day = df_day.apply(lambda x: round(100 * x / df_day.groupby('DayOfWeek').size().sum()))

but it doesn't work...

Any ideas, please?

1 Answer

I think you need to divide each column by its sum with div, then multiply by 100 with mul and, if necessary, round:

print (df_day.sum())
Sunday       504.0
Monday       587.0
Tuesday      410.0
Wednesday    498.0
Thursday     530.0
Friday       359.0
Saturday       1.0
dtype: float64

print (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
           Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
DayOfWeek                                                                
0             0.0     0.0      0.0        4.0       0.0     1.0       0.0
1             0.0     0.0      0.0        0.0       0.0     2.0       0.0
7             0.0     0.0      1.0        1.0       1.0     0.0     100.0
8             0.0     3.0      4.0        2.0       2.0     0.0       0.0
9             2.0     8.0      7.0       17.0       2.0     1.0       0.0
10           14.0    10.0      1.0        7.0      10.0    19.0       0.0
11            6.0     4.0      0.0        4.0       7.0    12.0       0.0
12            4.0     2.0     20.0        6.0      24.0    16.0       0.0
13           11.0     9.0     13.0        7.0       3.0     9.0       0.0
14           13.0    11.0      6.0        4.0      10.0    17.0       0.0
15            9.0    21.0      9.0        6.0      11.0    15.0       0.0
16           19.0    24.0     21.0       12.0      15.0     3.0       0.0
17           22.0    10.0     18.0       29.0      15.0     5.0       0.0
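
For anyone who wants to try this without the full table, here is a minimal self-contained sketch. It uses only three rows and three columns copied from the question, so the percentages will not match the output above; the variable names `col_sums`, `pct` and `pct_rounded` are just illustrative.

```python
import pandas as pd

# A few rows copied from the question's table; the hour labels form the index.
df_day = pd.DataFrame(
    {
        "Sunday": [10.0, 70.0, 32.0],
        "Monday": [48.0, 58.0, 26.0],
        "Tuesday": [30.0, 3.0, 0.0],
    },
    index=pd.Index(["09", "10", "11"], name="DayOfWeek"),
)

# One total per column (a Series indexed by the column names).
col_sums = df_day.sum()

# axis=1 aligns the totals with the column labels, so every cell is
# divided by the total of its own column.
pct = df_day.div(col_sums, axis=1).mul(100)
pct_rounded = pct.round(0)

print(pct_rounded)
print(pct.sum())  # before rounding, each column adds up to 100 (within float error)
```

Note that a column whose total is 0 would produce NaN after the division, so such columns may need separate handling.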

Slower solution with apply:

print (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
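
As for why the expression in the question does not give percentages: assuming DayOfWeek is the name of the index (as the printed output above suggests), `df_day.groupby('DayOfWeek').size().sum()` counts rows instead of adding up values, so every cell is divided by the same row count. A small sketch with made-up numbers (not the question's data):

```python
import pandas as pd

# Made-up two-row frame, only to show what the denominator evaluates to.
df_day = pd.DataFrame(
    {"Sunday": [10.0, 30.0], "Monday": [5.0, 15.0]},
    index=pd.Index(["09", "10"], name="DayOfWeek"),
)

# size() counts the rows in each group; summing those counts
# gives the total number of rows, not a column total.
denominator = df_day.groupby("DayOfWeek").size().sum()
print(denominator)

# The per-column totals are what each cell should be divided by.
print(df_day.sum())
```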

Timings:

In [171]: %timeit (df_day.div(df_day.sum(), axis=1).mul(100).round(0))
1000 loops, best of 3: 1.89 ms per loop

In [172]: %timeit (df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1))
100 loops, best of 3: 5.18 ms per loop
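
The figures above come from IPython's %timeit on my machine. A rough way to repeat the comparison in a plain script is sketched below with the standard timeit module. The data is random stand-in data with the same shape as the question's table, and the helper names `with_div` and `with_apply` are hypothetical; absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Random stand-in data with the same shape as the question's table (13 rows, 7 columns).
rng = np.random.default_rng(0)
days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
df_day = pd.DataFrame(rng.integers(1, 150, size=(13, 7)).astype(float), columns=days)


def with_div():
    # Vectorized: one division and one multiplication over the whole frame.
    return df_day.div(df_day.sum(), axis=1).mul(100).round(0)


def with_apply():
    # Row by row: the lambda runs once per row and recomputes the sums each time.
    return df_day.apply(lambda x: round(100 * x / df_day.sum()), axis=1)


n = 100
t_div = timeit.timeit(with_div, number=n)
t_apply = timeit.timeit(with_apply, number=n)
print(f"div:   {t_div / n * 1e3:.3f} ms per call")
print(f"apply: {t_apply / n * 1e3:.3f} ms per call")
```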