I have a list of year-day strings (a four-digit year followed by the day of the year) covering December through February, from 2003 to 2005. I want to split this list into a list of lists, one sublist per December-February season:

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']

Output should be like:

b = [['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]

I then want to loop over each sublist. I could split the list into equal-sized chunks, but some year days may be missing, so an even split is not reliable. Any suggestions?

  • why not use a dictionary? Commented Jul 8, 2015 at 13:09
  • Are the dates in a guaranteed to be in chronological order? Is it guaranteed that all days will be inside December-February or do we have to check for illegal values, i.e. '2004151'? Commented Jul 8, 2015 at 13:23
  • @Reti43: dates will be within December-February Commented Jul 8, 2015 at 13:25

2 Answers

Convert to datetime, then group by the year whose end is nearest.

import datetime
import itertools

#convert from a "year-day" string to a datetime object
def datetime_from_year_day(s):
    year = int(s[:4])
    days = int(s[4:])
    return datetime.datetime(year=year, month=1, day=1) + datetime.timedelta(days=days-1)

#returns the year whose end is closest to the date, whether in the past or future
def nearest_year_end(d):
    if d.month <= 6:
        return d.year-1
    else:
        return d.year

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']

result = [list(v) for k,v in itertools.groupby(a, lambda s: nearest_year_end(datetime_from_year_day(s)))]
print(result)

Result:

[['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]
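
One caveat: itertools.groupby only merges consecutive items that share a key, so this relies on the list being in chronological order. If that cannot be assumed, a dictionary keyed by the year in which each season starts (as a comment on the question suggests) is one option. This is only a sketch: `season_start_year` and `group_by_season` are hypothetical names, and the cutoff at day 182 is an assumption that roughly mirrors the `month <= 6` test.

```python
from collections import defaultdict

def season_start_year(s):
    # Hypothetical helper: a day in the first half of the year (Jan-Feb here)
    # belongs to the season that started the previous December.
    year, day = int(s[:4]), int(s[4:])
    return year - 1 if day <= 182 else year

def group_by_season(days):
    # Collect every year-day string under its season, regardless of input order.
    seasons = defaultdict(list)
    for s in days:
        seasons[season_start_year(s)].append(s)
    return dict(seasons)

# Deliberately shuffled sample, not the full list from the question.
sample = ['2004337', '2003345', '2004001', '2005009', '2003337']
seasons = group_by_season(sample)
for start_year in sorted(seasons):
    print(start_year, sorted(seasons[start_year]))
```

Sorting the keys and each sublist at the end restores chronological order, since the fixed-width year-day strings sort the same way as the dates they represent.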

You can also do it with two nested if-else statements inside a for loop, which is easy to follow:

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']
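temp = []
b = []
for day in a:
    if len(temp)==0:
        temp.append(day)
    else:
        # Going from a January/February day number back up to a December one
        # starts a new season. Day 60 is Feb 29 in a leap year and day 335 is
        # Dec 1 in a non-leap year, so both bounds are inclusive.
        if int(temp[-1][4:]) <= 60 and int(day[4:]) >= 335:
            b.append(temp)
            temp = []
            temp.append(day)
        else:
            temp.append(day)
# the last season is still in temp when the loop ends, so append it too
if temp:
    b.append(temp)
print(b)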

Result:

[['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]
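
A loop like this needs at least one January or February entry in a season to detect the next December. If a season could consist of December entries only, splitting on the time gap between consecutive dates may be more robust. The following is a rough sketch under that assumption: `split_on_gaps` is a hypothetical name, and the 120-day threshold is an arbitrary choice between the longest possible gap inside a season (about 90 days) and the shortest gap between seasons (about 9 months). It still expects chronological input.

```python
from datetime import datetime, timedelta

def split_on_gaps(days, max_gap=timedelta(days=120)):
    """Split sorted 'YYYYDDD' strings wherever consecutive dates are far apart."""
    groups = []
    previous = None
    for s in days:
        current = datetime.strptime(s, '%Y%j')  # %j parses the day of the year
        if previous is None or current - previous > max_gap:
            groups.append([])  # first entry, or a long gap: start a new season
        groups[-1].append(s)
        previous = current
    return groups

# Shortened sample in which the first season has December entries only.
sample = ['2003337', '2003361', '2004337', '2005001', '2005057']
for season in split_on_gaps(sample):
    print(season)
```

Each sublist can then be processed in turn, which is the loop over the list of lists asked for in the question.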
