How to change header row in a python dataframe

Question

I'm having trouble changing the header row in an existing DataFrame using pandas in python. After importing pandas and the csv file I set a header row as None in order to be able to remove duplicate dates after transposing. However this leaves me with a row header (and in fact an index column) that I do not want.

df = pd.read_csv(spreadfile, header=None)

df2 = df.T.drop_duplicates([0], take_last=True)
del df2[1]

indcol = df2.ix[:,0]
df3 = df2.reindex(indcol)

The above unimaginative code however fails on two counts. The index column is now the required one however all entries are now NaN. My understanding of python is not yet good enough to recognise what python is doing. The desired output below is what I need, any help would be greatly appreciated!

df2 before reindexing:

     0             2             3             4             5
0        NaN  XS0089553282  XS0089773484  XS0092157600  XS0092541969
1  01-May-14         131.7         165.1         151.8          88.9
3  02-May-14           131         164.9         151.7          88.5
5  05-May-14         131.1           165         151.8          88.6
7  06-May-14         129.9         163.4         151.2          87.1

df2 after reindexing:

             0    2    3    4    5
0                                 
NaN        NaN  NaN  NaN  NaN  NaN
01-May-14  NaN  NaN  NaN  NaN  NaN
02-May-14  NaN  NaN  NaN  NaN  NaN
05-May-14  NaN  NaN  NaN  NaN  NaN
06-May-14  NaN  NaN  NaN  NaN  NaN

df2 desired:

       XS0089553282  XS0089773484  XS0092157600  XS0092541969
01-May-14         131.7         165.1         151.8          88.9
02-May-14           131         164.9         151.7          88.5
05-May-14         131.1           165         151.8          88.6
06-May-14         129.9         163.4         151.2          87.1

Stephopolis · Accepted Answer · 2018-08-22 19:56:10Z

Assign the columns directly:

indcol = df2.ix[:,0]
df2.columns = indcol

The problem with reindex is it'll use the existing index and column values of your df, so your passed in new column values don't exist, hence why you get all NaNs

A simpler approach to what you're trying to do:

In [147]:
# take the cols and index values of interest
cols = df.loc[0, '2':]
idx = df['0'].iloc[1:]
print(cols)
print(idx)

2    XS0089553282
3    XS0089773484
4    XS0092157600
5    XS0092541969
Name: 0, dtype: object

1    01-May-14
3    02-May-14
5    05-May-14
7    06-May-14
Name: 0, dtype: object

In [157]:
# drop the first row and the first column
df2 = df.drop('0', axis=1).drop(0)
# overwrite the index values
df2.index = idx.values
df2

Out[157]:
               2      3      4     5
01-May-14  131.7  165.1  151.8  88.9
02-May-14    131  164.9  151.7  88.5
05-May-14  131.1    165  151.8  88.6
06-May-14  129.9  163.4  151.2  87.1

In [158]:
# now overwrite the column values    
df2.columns = cols.values
df2

Out[158]:
          XS0089553282 XS0089773484 XS0092157600 XS0092541969
01-May-14        131.7        165.1        151.8         88.9
02-May-14          131        164.9        151.7         88.5
05-May-14        131.1          165        151.8         88.6
06-May-14        129.9        163.4        151.2         87.1

Nader Hisham · Accepted Answer · 2015-11-05 09:57:44Z

1

In [310]:
cols = df.iloc[0 , 1:]
cols
Out[310]:
1    XS0089553282
2    XS0089773484
3    XS0092157600
4    XS0092541969
Name: 0, dtype: object

In [311]:
df.drop(0 , inplace=True)
df
Out[311]:
           0    1       2          3    4
1   01-May-14   131.7   165.1   151.8   88.9
2   02-May-14   131     164.9   151.7   88.5
3   05-May-14   131.1   165     151.8   88.6
4   06-May-14   129.9   163.4   151.2   87.1

In [312]:
df.set_index(0 , inplace=True)
df

Out[312]:
    0           1   2           3   4       
01-May-14   131.7   165.1   151.8   88.9
02-May-14   131     164.9   151.7   88.5
05-May-14   131.1   165     151.8   88.6
06-May-14   129.9   163.4   151.2   87.1

In [315]:

df
df.columns = cols
df
Out[315]:
            XS0089553282    XS0089773484    XS0092157600    XS0092541969                
01-May-14   131.7                  165.1    151.8           88.9
02-May-14   131                    164.9    151.7           88.5
05-May-14   131.1                    165    151.8           88.6
06-May-14   129.9                  163.4    151.2           87.1

answered Nov 5, 2015 at 9:57

Nader Hisham

5,4144 gold badges22 silver badges35 bronze badges

4 Comments

Oaka13 Over a year ago

inplace = True generates an error for me: TypeError: drop() got an unexpected keyword argument 'inplace'

Nader Hisham Over a year ago

this error indicates that there is no argument with name inplace for the method drop which of course is not the case , I'm not sure about this , are you sure you followed the same steps ?

Oaka13 Over a year ago

I've tried again and same thing, it's really strange because set_index has the inplace argument. The only arguments for df.drop are labels, axis and level.

Nader Hisham Over a year ago

ok , no problem just exclude this parameter for now , but you will need to assign the result of the function to the same df like this df = df.set_index(0)

Collectives™ on Stack Overflow

How to change header row in a python dataframe

2 Answers 2

Comments

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related