I have several different 'columns' I need to save to a CSV. Currently I do this:
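import csv
import numpy as np

# Open the file in a with-block so it is flushed and closed after writing
with open(out_csv, 'w', newline='') as f:
    w = csv.writer(f, delimiter=",")
    w.writerow(['id_a', 'id_b',
                'lat_a', 'lon_a',
                'lat_b', 'lon_b',
                'proj_metres'])
    # column_stack builds a single (n, 7) array from the five inputs
    w.writerows(np.column_stack((
        id_labels[udist.row],
        id_labels[udist.col],
        points[udist.row],
        points[udist.col],
        udist.data)))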
Perhaps not important but for completeness:
tree_dist = tree.sparse_distance_matrix(tree)
udist = sparse.tril(tree_dist, k=-1)
The output is around 30 million rows by 7 columns (two of which are strings: id_labels), so this takes a while (around 8 minutes) and uses a lot of RAM. I think Python creates a new temporary object when I call np.column_stack, so at one point it holds double the data it needs.
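To illustrate what I suspect is happening, here is a toy sketch. The array names, sizes and values are made up for the example and are not my real data. It only checks whether np.column_stack returns a fresh array and what dtype it ends up with when string and float columns are mixed.

```python
import numpy as np

# Toy stand-ins (hypothetical sizes and values, not the real data)
n = 1000
rng = np.random.default_rng(0)
ids_a = np.array([f"id{i:06d}" for i in range(n)])  # string labels
ids_b = ids_a[::-1].copy()
pts_a = rng.random((n, 2))                          # lat/lon pairs
pts_b = rng.random((n, 2))
dist = rng.random(n)                                # distances

inputs = (ids_a, ids_b, pts_a, pts_b, dist)
stacked = np.column_stack(inputs)

# Compare the memory held by the inputs with the stacked temporary
input_bytes = sum(a.nbytes for a in inputs)
print("shape:", stacked.shape, "dtype:", stacked.dtype)
print("input bytes:", input_bytes, "stacked bytes:", stacked.nbytes)

# True would mean the result is a view; False means a separate allocation
print("shares memory with any input:",
      any(np.may_share_memory(stacked, a) for a in inputs))
```

If I read this correctly, the stacked result is a separate array with a string dtype, so the float columns are converted to strings as well. That would make the temporary larger than the inputs, not just a second copy of them.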
Is there a better way to create the CSV I need?