I'm working with census data (using the Census package ). When I select variables with the census API, they pass through in their raw format (e.g. B01001_007) and I'd like to replace the column name with the label (e.g. male 18 to 19 years).
I know this can be done through df.columns = ['male 18 to 19 years', 'male 20 years', 'male 21 years'] but this is tedious.
Is there a way to do some type of mapping that will auto-query into the header in my df below?
Sample data:
import pandas as pd
from pandas import DataFrame
variables_table = pd.DataFrame({'variable': ['B01001_007E','B01001_008E','B01001_009E'],
'label': ['male 18 to 19 years','male 20 years','male 21 years']
})
variables_table
label variable
male 18 to 19 years B01001_007E
male 20 years B01001_008E
male 21 years B01001_009E
Unclean output:
df = pd.DataFrame({'B01001_007E': ['100','200','300'],
'B01001_008E': ['300','200','100'],
'B01001_009E': ['500','100','200']})
df
B01001_007E B01001_008E B01001_009E
100 300 500
200 200 100
300 100 200