I have a pandas DataFrame that looks like the one below.
The rows are already grouped by the three columns O, A, N but, as you can see, they are NOT sorted by the time column.
My goal is to sort by the time column within each (O, A, N) group, and then apply shift(-1) to the value column to create a value_next column.
The output should look like the one below (NaN is replaced with -1 for demonstration).
This is what I tried:
import pandas as pd

# Sample data as a list of dicts, one dict per row.
data = [{'time': 10, 'O': 1, 'A': 2, 'N': 3, 'value': 10},
        {'time': 7, 'O': 1, 'A': 2, 'N': 3, 'value': 11},
        {'time': 15, 'O': 1, 'A': 2, 'N': 3, 'value': 12},
        {'time': 11, 'O': 2, 'A': 2, 'N': 3, 'value': 20},
        {'time': 12, 'O': 2, 'A': 2, 'N': 3, 'value': 21},
        {'time': 1, 'O': 2, 'A': 2, 'N': 3, 'value': 25}]

# Create the DataFrame.
df = pd.DataFrame(data)

# Sort by group keys, then by time.
df.sort_values(by=['O', 'A', 'N', 'time'], ascending=[True, True, True, True])

# Shift value within each group to get the next observation.
df['value_next'] = df.groupby(['O', 'A', 'N'])['value'].shift(-1)
This produces the output below, which differs from what I expected. What am I missing?
Any suggestions would be appreciated.
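
One possible cause, offered as a sketch rather than a confirmed diagnosis: `sort_values` returns a new DataFrame by default and does not modify `df` in place, so in the code as written the grouped `shift(-1)` runs on the rows in their original order. The version below assigns the sorted result back before shifting. The `reset_index(drop=True)` call and the `fillna(-1)` step are assumptions added here for readability and to match the "-1 for demonstration" convention; neither is required for the shift itself.

```python
import pandas as pd

# Same sample data as in the question.
data = [{'time': 10, 'O': 1, 'A': 2, 'N': 3, 'value': 10},
        {'time': 7, 'O': 1, 'A': 2, 'N': 3, 'value': 11},
        {'time': 15, 'O': 1, 'A': 2, 'N': 3, 'value': 12},
        {'time': 11, 'O': 2, 'A': 2, 'N': 3, 'value': 20},
        {'time': 12, 'O': 2, 'A': 2, 'N': 3, 'value': 21},
        {'time': 1, 'O': 2, 'A': 2, 'N': 3, 'value': 25}]
df = pd.DataFrame(data)

# sort_values returns a new DataFrame by default, so assign the result back.
# reset_index(drop=True) is optional (assumption: a clean 0..n-1 index is wanted).
df = df.sort_values(by=['O', 'A', 'N', 'time']).reset_index(drop=True)

# Within each (O, A, N) group, take the value from the next row in time order.
df['value_next'] = df.groupby(['O', 'A', 'N'])['value'].shift(-1)

# Assumption: replace the trailing NaN of each group with -1, for demonstration only.
df['value_next'] = df['value_next'].fillna(-1).astype(int)

print(df)
```

With the sample data, each group's last row in time order has no following observation, so it is the row that ends up with -1.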


