Python pandas convert multiple headers in excel file into columns

Question

I have been looking for a way to convert an Excel file with multiple header rows into column headings using the pandas library.

I have successfully imported the data into a DataFrame by reading the file and parsing it with ExcelFile. I have also been able to identify the headers using header=[0, 4]. Where I run into issues is reindexing and/or using the melt function to convert the headers into columns.

When I use the melt function, I am able to convert the columns into rows. However, I want each header level to become its own column rather than be stacked with the rest of the data.

Currently, this is how the data is structured:

Excel file displaying data with multiple headers

After the conversion, the data should look like this:

Data that is unpivot with headers converted into columns

I have been reading about indexing, but I am not sure I understand how it would apply here.

I'm new to Python, like really new, and any support or direction is greatly appreciated. I have been reading the following cheatsheets but haven't found the right way to do the conversion:

https://www.datacamp.com/community/data-science-cheatsheets

Here is some sample code:

import pandas as pd

xl = pd.ExcelFile('help.xlsx')
df1 = xl.parse('Sheet1')

df2 = pd.melt(df1,
          id_vars=['PW'],
          value_vars=['Fruit','Conventional'])

Also, here are the results after running the code:

df1 the data with multiple headers

The following shows the problem with the data (the headers are not converted into columns; they are stacked with the rest of the data):

after using pandas melt the headers are stacked with the data and not converted into their own column

This is how the final product should look:

Headers converted into columns

Do you have any code to show representing the state of your problem thus far?
– JacobIRR, Commented Jan 6, 2018 at 0:21

virtualdvid · Accepted Answer · 2018-01-07 00:47:26Z

1

Try this:

import pandas as pd

# Read the file: first column as the row index, first four rows as header levels
df = pd.read_excel('help.xlsx', index_col=0, header=[0, 1, 2, 3])
# Join the four header levels into one comma-separated name per column
df.columns = df.columns.map(','.join)
# Name the index 'PW' and turn it into a regular column
df = df.rename_axis('PW').reset_index()
# Unpivot: every column except 'PW' becomes a ('variable', 'Volume') row
df2 = pd.melt(df, id_vars='PW', value_name='Volume')
# Split 'variable' on ',' into one column per header level
df2[['Category', 'Field_Type', 'Growing_Method', 'Product']] = df2['variable'].str.split(',', expand=True)
# Drop the now-redundant 'variable' column
df2 = df2.drop(columns='variable')
# Reorder columns so that 'Volume' comes last
cols = df2.columns.tolist()
df2 = df2[[cols[0]] + cols[2:] + [cols[1]]]
df2
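
For anyone who wants to try the idea without the spreadsheet, here is a minimal sketch of the same join, melt, and split steps on a small in-memory frame. The header values and numbers are made up for illustration (they are not taken from help.xlsx), and the name long_df is just a placeholder.

```python
import pandas as pd

# Hypothetical stand-in for the four header rows of the spreadsheet
columns = pd.MultiIndex.from_tuples([
    ("Fruit", "Orchard", "Conventional", "Apples"),
    ("Fruit", "Orchard", "Organic", "Apples"),
    ("Vegetable", "Field", "Conventional", "Carrots"),
])
df = pd.DataFrame([[10, 20, 30], [40, 50, 60]], index=[1, 2], columns=columns)

# Flatten the four header levels into one comma-separated label per column
df.columns = df.columns.map(",".join)
df = df.rename_axis("PW").reset_index()

# Unpivot, then split the label back into one column per header level
long_df = df.melt(id_vars="PW", value_name="Volume")
levels = ["Category", "Field_Type", "Growing_Method", "Product"]
long_df[levels] = long_df["variable"].str.split(",", expand=True)
long_df = long_df[["PW"] + levels + ["Volume"]]
print(long_df)
```

One thing to watch: joining on ',' only round-trips cleanly if none of the header cells contain a comma themselves. If they might, a separator that cannot occur in the headers would be a safer choice.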

edited Jan 7, 2018 at 0:47

answered Jan 6, 2018 at 18:06

virtualdvid


2 Comments

Xukrao Over a year ago

When I run this code the 'PW' data values are absent from the result dataframe. I was under the impression that the question author wants to have this data included though.

virtualdvid Over a year ago

It was because the melt method reset the index. I fixed it!
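
To illustrate the point about melt and the index, here is a small sketch on made-up data. By default melt discards the row index, so the 'PW' values have to be moved into a regular column first (or, assuming pandas 1.1 or later, kept with ignore_index=False). The frame and variable names below are hypothetical.

```python
import pandas as pd

# Toy frame whose row index plays the role of 'PW'
df = pd.DataFrame(
    {"A": [1, 2], "B": [3, 4]},
    index=pd.Index([101, 102], name="PW"),
)

# Plain melt: the index is replaced by 0..n-1, so the PW values are lost
lost = df.melt()

# reset_index first, then keep 'PW' as an identifier column
kept = df.reset_index().melt(id_vars="PW")

# Alternative (assumes pandas >= 1.1): keep the original index on the result
kept_index = df.melt(ignore_index=False)

print(lost)
print(kept)
print(kept_index)
```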

Xukrao · Accepted Answer · 2018-01-06 02:49:16Z

0

One way to accomplish this type of reshaping is with the stack operation of pandas:

import pandas as pd

# Read excel file. Use first column as row index, and use first four rows as
# column index levels
df = pd.read_excel('test.xlsx', index_col=0, header=[0, 1, 2, 3])

# Assign names to row index and column index levels
df.index.name = 'PW'
df.columns.names = ['Category', 'Field_Type', 'Growing_Method', 'Product']

# Convert all column index levels into row index levels
s = df.stack([0, 1, 2, 3])

# Assign name to the single data values column
s.name = 'Volume'
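
The result of stack here is a Series with a five-level row index, not yet a flat table. If ordinary columns are wanted, calling reset_index on it should do the job. Below is a minimal sketch on made-up data rather than the real test.xlsx; the values and the name flat are placeholders.

```python
import pandas as pd

# Hypothetical two-column stand-in for the spreadsheet, with named levels
columns = pd.MultiIndex.from_tuples(
    [
        ("Fruit", "Orchard", "Conventional", "Apples"),
        ("Fruit", "Orchard", "Organic", "Apples"),
    ],
    names=["Category", "Field_Type", "Growing_Method", "Product"],
)
df = pd.DataFrame(
    [[10, 20], [30, 40]],
    index=pd.Index([1, 2], name="PW"),
    columns=columns,
)

# Move all four column levels into the row index, giving a Series
s = df.stack([0, 1, 2, 3])
s.name = "Volume"

# Turn every index level into an ordinary column
flat = s.reset_index()
print(flat)
```

Recent pandas versions may print a FutureWarning about the stack implementation when several levels are stacked at once; the reshaped values should be the same for data like this.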

answered Jan 6, 2018 at 2:49

Xukrao
