I have an Azure Function with a blob trigger: any time somebody uploads a .csv file to an Azure Blob Storage container, I want to clean it up, process it, and insert it into a SQL database table.
When I tested it locally it worked perfectly fine, but after deployment I get errors like this: Exception while executing function: Functions.BlobTrigger1 python exited with code 137
I read that this exception is usually thrown when the function consumes too much memory, but the .csv I'm working with is only 26.16 MiB.
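Since the function runs fine locally, I can't easily reproduce the failure; one thing I'm considering is logging the worker process's memory use at a few checkpoints to see where it actually spikes. A minimal probe would be something like this (just a sketch; it assumes psutil is added to requirements.txt):

import logging
import os
import psutil

def log_rss(label):
    # Log the resident set size of the current worker process, in MiB
    rss_mib = psutil.Process(os.getpid()).memory_info().rss / 2**20
    logging.info('%s: RSS = %.1f MiB', label, rss_mib)

Here's my code: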
import os
import urllib.parse
from io import BytesIO

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# Read the whole blob into memory and split every line into fields
blobBinaryDataStream = BytesIO(myblob.read())
records = [r.decode('utf-8').split(',') for r in blobBinaryDataStream]
arr = np.array(records)

# The first row holds the header names; strip the trailing line break from them
df = pd.DataFrame(arr[1:], columns=[name.replace('\r\n', '') for name in records[0]])
# Strip the line break that split(',') leaves on the last field of each row
df = df.replace('\r\n', '', regex=True)
del arr

# Dimension table: pull out the location columns and de-duplicate them
locationDF = df.iloc[:, [0, 1, 2, 46, 47, 48, 49, 50, 51, 58, 59, 60]].copy()
locationDF.drop_duplicates(inplace=True)

# Fact table: keep only the key and measure columns, then aggregate
df.drop(df.columns.difference([df.columns[i] for i in [0, 3, 5, 8, 25, 35, 36]]), axis=1, inplace=True)
df['date'] = df['date'].map(SomeFunction)
df = df.replace('', 0)
factdf = df.groupby([df.columns[0], df.columns[1]])[df.columns[2:]].apply(lambda x: x.astype(np.longlong).sum()).reset_index()

# Build the SQLAlchemy engine from the connection string in app settings
quoted = urllib.parse.quote_plus(os.environ['ConnString'])
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))

# Write the dimension table with a 1-based index
locationDF.reset_index(drop=True, inplace=True)
locationDF.index = locationDF.index + 1
locationDF = locationDF.replace('', np.nan)
locationDF.to_sql('Table1', schema='dbo', con=engine, if_exists='replace', method='multi', chunksize=100)

# Write the fact table the same way; note that replace() is not in-place,
# so the result has to be assigned back
factdf.reset_index(drop=True, inplace=True)
factdf.index = factdf.index + 1
factdf = factdf.replace('', np.nan)
factdf.to_sql('Table2', schema='dbo', con=engine, if_exists='replace', method='multi', chunksize=100)
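One thing I'm wondering: is the intermediate records list, plus the object-dtype numpy copy of it, what pushes memory over the limit? A leaner load step would let pandas parse the stream directly; a minimal sketch (assuming the blob really is a plain comma-separated file with a header row):

from io import BytesIO
import pandas as pd

# Parse the blob straight into a DataFrame; dtype=str keeps every value
# as a string, matching what the manual split-and-decode produces
df = pd.read_csv(BytesIO(myblob.read()), dtype=str, keep_default_na=False)

Would switching to something like that plausibly get me under the memory limit, or is the problem likely elsewhere?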