I'm having trouble changing the header row of an existing DataFrame using pandas in Python. After importing pandas and reading the CSV file, I pass header=None so that I can remove duplicate dates after transposing. However, this leaves me with a header row (and in fact an index column) that I do not want.
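import pandas as pd
# Read without a header so the row of dates survives as ordinary data
df = pd.read_csv(spreadfile, header=None)
# Transpose, then keep only the last row for each duplicated date in column 0
df2 = df.T.drop_duplicates(subset=[0], keep="last")
del df2[1]
# Try to use the date column as the new index (this step produces the NaNs)
indcol = df2.iloc[:, 0]
df3 = df2.reindex(indcol)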
The above unimaginative code, however, fails on two counts. The index column is now the required one, but all entries are now NaN, and the unwanted header row is still there. My understanding of Python is not yet good enough to recognise what pandas is doing here. The desired output is shown at the end; any help would be greatly appreciated!
df2 before reindexing:
           0             2             3             4             5
0        NaN  XS0089553282  XS0089773484  XS0092157600  XS0092541969
1  01-May-14         131.7         165.1         151.8          88.9
3  02-May-14           131         164.9         151.7          88.5
5  05-May-14         131.1           165         151.8          88.6
7  06-May-14         129.9         163.4         151.2          87.1
df2 after reindexing:
             0    2    3    4    5
0
NaN        NaN  NaN  NaN  NaN  NaN
01-May-14  NaN  NaN  NaN  NaN  NaN
02-May-14  NaN  NaN  NaN  NaN  NaN
05-May-14  NaN  NaN  NaN  NaN  NaN
06-May-14  NaN  NaN  NaN  NaN  NaN
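A likely explanation for the all-NaN result: reindex does not move a column into the index. It looks up the labels it is given among the existing row labels (here 0, 1, 3, 5, 7), and since none of the date strings is an existing label, every row comes back empty. The sketch below shows this on a small made-up frame (toy is a hypothetical stand-in for df2, not the real data) and contrasts it with set_index, which does move a column into the index.

```python
import pandas as pd

# Hypothetical stand-in for df2: integer row labels, dates stored in column 0.
toy = pd.DataFrame(
    {0: ["01-May-14", "02-May-14"], 2: [131.7, 131.0]},
    index=[1, 3],
)

# reindex searches the existing row labels (1 and 3) for the date strings.
# None of them match, so every row of the result is filled with NaN.
reindexed = toy.reindex(toy[0])

# set_index moves column 0 into the index and keeps the row data intact.
moved = toy.set_index(0)

print(reindexed)
print(moved)
```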
df2 desired:
           XS0089553282  XS0089773484  XS0092157600  XS0092541969
01-May-14         131.7         165.1         151.8          88.9
02-May-14           131         164.9         151.7          88.5
05-May-14         131.1           165         151.8          88.6
06-May-14         129.9         163.4         151.2          87.1
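One possible way to get from the transposed frame to the desired layout is to treat the first remaining row as the column labels and the first column as the row labels, then build a new frame from the rest. This is only a sketch under assumptions: csv_text is an invented two-instrument sample shaped like the output above, the "Name" row is a hypothetical placeholder for whatever the deleted column 1 held, and the values are converted to numbers because reading with header=None leaves them as strings.

```python
import io
import pandas as pd

# Invented sample: row 0 holds dates (with duplicates), row 1 is a
# hypothetical placeholder row, the remaining rows are one instrument each.
csv_text = """,01-May-14,02-May-14,02-May-14,05-May-14,05-May-14
Name,a,b,c,d,e
XS0089553282,131.7,130.0,131,131.0,131.1
XS0089773484,165.1,164.0,164.9,164.5,165
"""

df = pd.read_csv(io.StringIO(csv_text), header=None)

# Transpose and keep the last row for each duplicated date in column 0.
# keep="last" is the current spelling of the older take_last=True.
df2 = df.T.drop_duplicates(subset=[0], keep="last")
df2 = df2.drop(columns=[1])

# Row 0 holds the instrument codes, column 0 holds the dates.
header = df2.iloc[0, 1:].tolist()
dates = df2.iloc[1:, 0].tolist()

# Rebuild the frame from the remaining cells with the new labels,
# then convert the string cells to numbers.
df3 = pd.DataFrame(df2.iloc[1:, 1:].to_numpy(), index=dates, columns=header)
df3 = df3.apply(pd.to_numeric)

print(df3)
```

If the dates need to be real timestamps rather than strings, pd.to_datetime(df3.index, format="%d-%b-%y") could be assigned back to df3.index afterwards.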