1

I'm working with a dataframe using panadas in python. I have sorted the table and created some extra columns: https://i.sstatic.net/Y6lkN.png

{'Part Number': ['K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'K4SD',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2',
  'QOL2'],
 'Date': [Timestamp('2021-05-17 00:00:00'),
  Timestamp('2021-05-23 00:00:00'),
  Timestamp('2021-07-08 00:00:00'),
  Timestamp('2021-08-17 00:00:00'),
  Timestamp('2021-08-17 00:00:00'),
  Timestamp('2021-10-18 00:00:00'),
  Timestamp('2021-12-18 00:00:00'),
  Timestamp('2021-12-20 00:00:00'),
  Timestamp('2022-02-10 00:00:00'),
  Timestamp('2022-03-31 00:00:00'),
  Timestamp('2021-10-04 00:00:00'),
  Timestamp('2021-10-18 00:00:00'),
  Timestamp('2021-11-03 00:00:00'),
  Timestamp('2021-11-03 00:00:00'),
  Timestamp('2021-11-17 00:00:00'),
  Timestamp('2021-11-24 00:00:00'),
  Timestamp('2021-11-27 00:00:00'),
  Timestamp('2021-12-22 00:00:00'),
  Timestamp('2021-12-24 00:00:00'),
  Timestamp('2022-03-21 00:00:00')],
 'Code': ['SF22',
  'KFS3',
  '3FFS',
  'Replacement needed',
  'LA52',
  'K2KA',
  'Belt Broke',
  'QET6',
  'QET6',
  'P0SF',
  'Testing Broken',
  'DP2L',
  'SR2F',
  'JKO2',
  'DP2L',
  'A2BF',
  'KLL2',
  'Light Off',
  'A3SA',
  'LA52'],
 'Fix': ['na',
  'na',
  'na',
  'Custom Status',
  'na',
  'na',
  'Remade',
  'na',
  'na',
  'na',
  'Testing Procedure Fixed',
  'na',
  'na',
  'na',
  'na',
  'na',
  'na',
  'Light Repair',
  'na',
  'na'],
 'Fixed': ['No',
  'No',
  'No',
  'Yes',
  'No',
  'No',
  'Yes',
  'No',
  'No',
  'No',
  'Yes',
  'No',
  'No',
  'No',
  'No',
  'No',
  'No',
  'Yes',
  'No',
  'No'],
 'Combined': ['SF22',
  'KFS3',
  '3FFS',
  'Replacement needed',
  'LA52',
  'K2KA',
  'Belt Broke',
  'QET6',
  'QET6',
  'P0SF',
  'Testing Broken',
  'DP2L',
  'SR2F',
  'JKO2',
  'DP2L',
  'A2BF',
  'KLL2',
  'Light Off',
  'A3SA',
  'LA52']}

I sorted the dataframe by date and now I would like to create a loop that goes down the table in order row by row. In the loop, if the row in the "Fixed" column is "No", I want to append the value in that row in the "Code" column to a list (which I called list_test). Then, when the row in the "Fixed" column becomes a "Yes", I want to create a new list variable that is a copy of the list_test.

Then, I want to clear the list_test to be an empty list so that it can repeat the process down the column (clearing itself every time there is a "Fix").

In my example table above, I would want the output to be something along the lines of:

  • Fixed_Before_3 = ["SF22", "KFS3", "3FFS"]
  • Fixed_Before_6 = ["LA52", "K2KA"]
  • Fixed_Before_10 = ["QET6", "QET6", "P0SF"]
  • Fixed_Before_17 = ["DP2L", "SR2F", "JKO2", "DP2L", "A2BF", "KLL2"]

This is one way I tried to approach the problem:

list_test = []
var_test = {}

for index in df.index:
    var_test[index] = "Fixed_Before_" + str(index)
    
    if df['Fixed'][index] == 'No':
        list_test.append(df['Code'])
    
    if df['Fixed'][index] == 'Yes':
            var_test[index] = list_test
            list_test = []
list_test

Although, when I run the code, the output (https://i.sstatic.net/RM0Pj.png) is a very large column and it looks like it includes everything in my column more than a few times rather than the output I included above. I think my problems might be with:

  • The way I iterate throughout the dataframe
  • The conditional statements about dataframes
    • Maybe list_test.append(df['Code']) gives me the whole column instead of the value of the column in the row in my conditional statement?
  • Using a dictionary to create new variables in my loop.
0

1 Answer 1

1

The exact expected output is unclear, but here is a suggestion:

mask = df['Fixed'].eq('Yes')
out = (df
 .assign(index=pd.Series(df.index.where(mask), index=df.index).bfill())
 .loc[~mask]
 .groupby(['Part Number', mask.shift(fill_value=True).cumsum()])
 .agg(fix_before=('index', 'first'),
      list=('Code', list))
 .reset_index(drop=True)
 )

Output:

   fix_before                                  list
0         3.0                    [SF22, KFS3, 3FFS]
1         6.0                          [LA52, K2KA]
2        10.0                    [QET6, QET6, P0SF]
3        17.0  [DP2L, SR2F, JKO2, DP2L, A2BF, KLL2]
4         NaN                          [A3SA, LA52]
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.