I'm trying to parse a CSV file in Python and print the sum of order_total for each day. Below is a sample of the file:

order_total   created_datetime
24.99   2015-06-01 00:00:12                                                                                             
0   2015-06-01 00:03:15                                                                                             
164.45  2015-06-01 00:04:05                                                                                             
24.99   2015-06-01 00:08:01                                                                                             
0   2015-06-01 00:08:23                                                                                             
46.73   2015-06-01 00:08:51                                                                                             
0   2015-06-01 00:08:58                                                                                             
47.73   2015-06-02 00:00:25                                                                                             
101.74  2015-06-02 00:04:11                                                                                             
119.99  2015-06-02 00:04:35                                                                                             
38.59   2015-06-02 00:05:26                                                                                             
73.47   2015-06-02 00:06:50                                                                                             
34.24   2015-06-02 00:07:36                                                                                             
27.24   2015-06-03 00:01:40                                                                                             
82.2    2015-06-03 00:12:21                                                                                             
23.48   2015-06-03 00:12:35 

My objective is to print sum(order_total) for each day. For example, the result should be:

2015-06-01 -> 261.16
2015-06-02 -> 415.76
2015-06-03 -> 132.92

I have written the code below. It does not implement the logic yet; I'm only trying to check, by printing some sample statements, that it parses the file and loops as required.

def sum_orders_test(self,start_date,end_date):
        initial_date = datetime.date(int(start_date.split('-')[0]),int(start_date.split('-')[1]),int(start_date.split('-')[2]))
        final_date = datetime.date(int(end_date.split('-')[0]),int(end_date.split('-')[1]),int(end_date.split('-')[2]))
        day = datetime.timedelta(days=1)
        with open("file1.csv", 'r') as data_file:
            next(data_file)
            reader = csv.reader(data_file, delimiter=',')
            order_total=0
            if initial_date <= final_date:
                for row in reader:
                    if str(initial_date) in row[1]:
                        print 'initial_date : ' + str(initial_date)
                        print 'Date : ' + row[1]
                        order_total = order_total + row[0]
                    else:
                        print 'Else'
                        print 'Date ' + str(row[1]) + 'Total ' +str(order_total)
                        order_total=0
                        initial_date = initial_date + day                                                                                           

Based on my current logic, I'm running into this issue: it's not printing the correct sum for each date.

2015-06-01 : 261.16
2015-06-02 : 368.03 (should be 415.76)
2015-06-03 : Null

I'm calling the function with sum_orders_test('2015-06-01','2015-06-03').

I know there is some silly logical issue, but being new to programming and Python, I'm unable to figure it out.

4 Comments

  • You use delimiter=',' but in your .csv there are no commas. Commented Sep 3, 2017 at 11:00
  • Please edit the previous post or comment on existing answers. Don't repost. Commented Sep 3, 2017 at 11:23
  • @cricket_007 oh, omg, same question again... Flagging it, don't want to lose reputation by downvoting such guy's questions... Commented Sep 3, 2017 at 11:23
  • And it's not a possible duplicate, it is the same! Please flag! Commented Sep 3, 2017 at 11:23
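
The delimiter remark in the comments can be checked with a small sketch. It assumes the file really is whitespace-separated, as the sample suggests; the `sample` string is a two-line excerpt used for illustration, not the real file. With delimiter=',' each row comes back as a single field, so row[1] does not exist, whereas str.split() separates the amount from the timestamp.

```python
import csv
import io

# Two-line excerpt of the sample data (assumed to be whitespace-separated).
sample = "24.99   2015-06-01 00:00:12\n0   2015-06-01 00:03:15\n"

# With delimiter=',' csv.reader finds no commas, so each row is one field.
comma_rows = list(csv.reader(io.StringIO(sample), delimiter=','))
print(comma_rows[0])

# Splitting on whitespace yields amount, date and time as separate items.
split_rows = [line.split() for line in sample.splitlines()]
print(split_rows[0])
```

Note that splitting on whitespace also separates the date from the time, so the date alone ends up in the second field.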

2 Answers

Short solution using the pandas library:

import pandas as pd

df = pd.read_table('yourfile.csv', sep=r'\s{2,}', engine='python')
sums = df.groupby(df.created_datetime.str[:11]).sum()

print(sums)

The output:

                  order_total
created_datetime             
2015-06-01             261.16
2015-06-02             415.76
2015-06-03             132.92

  • df.created_datetime.str[:11] - takes the leading date portion of the created_datetime column (yyyy-mm-dd plus the space that follows it) as the grouping key

  • .sum() - sums the values within each group
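
The `.str[:11]` accessor applies ordinary Python slicing to each element, so what it keeps can be checked on a plain string. A quick illustration on one timestamp from the sample (the variable names are only for this sketch): 11 characters include the space after the date, while `[:10]` keeps yyyy-mm-dd alone. Either works as a grouping key here, because every row has the same layout.

```python
# One timestamp taken from the sample data.
stamp = '2015-06-01 00:00:12'

key_11 = stamp[:11]  # date plus the trailing space
key_10 = stamp[:10]  # date only

# repr() makes the trailing space visible.
print(repr(key_11))
print(repr(key_10))
```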

1 Comment

Btw, this is a duplicate post. stackoverflow.com/a/46021592/2308683

Solution using a dictionary:

data = [
(24.99   ,'2015-06-01 00:00:12'),
(0       ,'2015-06-01 00:03:15'),
(164.45  ,'2015-06-01 00:04:05'),
(24.99   ,'2015-06-01 00:08:01'),
(0       ,'2015-06-01 00:08:23'),
(46.73   ,'2015-06-01 00:08:51'),
(0       ,'2015-06-01 00:08:58'),
(47.73   ,'2015-06-02 00:00:25'),
(101.74  ,'2015-06-02 00:04:11'),
(119.99  ,'2015-06-02 00:04:35'),
(38.59   ,'2015-06-02 00:05:26'),
(73.47   ,'2015-06-02 00:06:50'),
(34.24   ,'2015-06-02 00:07:36'),
(27.24   ,'2015-06-03 00:01:40'),
(82.2    ,'2015-06-03 00:12:21'),
(23.48   ,'2015-06-03 00:12:35')
]


def sumByDay(data):
    sums = {}
    # loop through each entry and add the order value to its corresponding day entry in the dictionary
    for x in data:
        day = x[1].split()[0] # get the date portion from the string
        order = x[0]
        sums[day]= sums.get(day, 0) + order

    return sums

sums = sumByDay(data)

for key in sums:
    print(key, sums[key])

Output:

2015-06-01 261.16
2015-06-02 415.76
2015-06-03 132.92
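
The same idea can be sketched reading straight from the file instead of a hard-coded list. This is a minimal sketch under assumptions: the columns are whitespace-separated as in the sample, `sum_by_day` is a name made up for this example, and an in-memory `io.StringIO` stands in for `open("file1.csv")`. It swaps in `collections.defaultdict` for the `sums.get(day, 0)` pattern and `decimal.Decimal` for floats, so the totals are exact to the cent.

```python
from collections import defaultdict
from decimal import Decimal
import io


def sum_by_day(lines):
    """Sum order totals per calendar day from whitespace-separated lines."""
    sums = defaultdict(Decimal)  # missing days start at Decimal('0')
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip the header and blank or malformed lines
        amount, day, _time = parts
        sums[day] += Decimal(amount)
    return dict(sums)


# Hypothetical stand-in for open("file1.csv"); only a few sample rows.
sample = io.StringIO(
    "order_total   created_datetime\n"
    "24.99   2015-06-01 00:00:12\n"
    "164.45  2015-06-01 00:04:05\n"
    "47.73   2015-06-02 00:00:25\n"
    "101.74  2015-06-02 00:04:11\n"
)

for day, total in sorted(sum_by_day(sample).items()):
    print(day, '->', total)
```

Because every row is added to the bucket for its own date, no row is dropped when the date changes, which is where the loop in the question loses the first order of each new day.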

1 Comment

You've basically re-written what defaultdict gives you. For example, same solution here stackoverflow.com/a/46021621/2308683
