First, apologies if this has already been asked and answered. I have looked here and here, gone through the titles of the suggested duplicates, and tried search engines, but I cannot seem to come up with the right keywords.
problem
My problem is the following: given a dataframe with two "identifier" columns, I want to create an index that uniquely identifies each combination of values in the two columns.
For instance, if column 'a' has value 0 and column 'b' has value 0, that row should get index number 1. Identical combinations should map to the same number, and different combinations to different numbers.
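To make the desired result concrete, here is a small hand-built illustration. The data and the `wanted` lookup are made up for this example (they are not my real data), and the labels are written out by hand rather than computed, so this only shows the target output, not a solution:

```python
import pandas as pd

# Fixed toy data (hypothetical values, small enough to check by eye)
example = pd.DataFrame({
    'a': [0, 0, 1, 0, 1],
    'b': [0, 1, 0, 0, 0],
})

# Desired labels, written out by hand: one number per distinct (a, b) pair
wanted = {(0, 0): 1, (0, 1): 2, (1, 0): 3}

# Look up the label for each row's (a, b) pair
pairs = zip(example['a'].tolist(), example['b'].tolist())
example['new_index'] = [wanted[pair] for pair in pairs]

print(example)
```

Rows 0 and 3 share the pair (0, 0) and so both get 1; rows 2 and 4 share (1, 0) and both get 3.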
approach
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': np.random.randint(0, 3, 10),
    'b': np.random.randint(0, 3, 10),
    'c': np.random.randint(0, 10, 10)
})
# Number each distinct (a, b) pair, starting at 1
mapping = [(*key, i + 1) for i, key in enumerate(df.groupby(by=['a', 'b']).groups.keys())]
crutch = pd.DataFrame(mapping, columns=['a', 'b', 'new_index'])
# Attach the number to every row that has that pair
df = df.merge(crutch, on=['a', 'b'])
This works, but it seems like there should be something built into pandas that I am missing.
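One candidate I have since come across is `ngroup()` on a groupby object, which numbers the groups starting from 0. The following is only a sketch under my own assumptions: the seeded generator is there just to make it reproducible, and the `+ 1` is there because I want numbering to start at 1. I am not sure whether this is the intended or most idiomatic tool for the job.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame({
    'a': rng.integers(0, 3, 10),
    'b': rng.integers(0, 3, 10),
    'c': rng.integers(0, 10, 10),
})

# ngroup() assigns each row the 0-based number of its (a, b) group;
# adding 1 shifts the numbering to start at 1, as in the merge approach
df['new_index'] = df.groupby(['a', 'b']).ngroup() + 1

print(df)
```

This avoids the helper dataframe and the merge, and it leaves the row order of `df` untouched.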
question
So, is there a built-in, idiomatic way in pandas to label each unique combination of values in two columns?
thanks
Help is greatly appreciated.