Problems while creating a two column based index in a new pandas column?

Question

Given the following dataframe:

col_1   col_2
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   1
False   2
True    2
False   2
False   2
True    2
False   2
False   2
False   2
False   2
False   2
False   2
False   2
False   2
False   2
False   2
False   2

How can I create a new index that help to identify when a True value is present in col_1? That is, when in the first column a True value appears I would like to fill backward with a number starting from one the new column. For example, this is the expected output for the above dataframe:

   col_1  col_2 new_id
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   1   1
    False   2   1
    True    2   1   --------- ^ (fill with 1 and increase the counter)
    False   2   2
    False   2   2
    True    2   2   --------- ^ (fill with 2 and increase the counter)
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    False   2   3
    True    2   4   --------- ^ (fill with 3 and increase the counter)

The problem is that I do not know how to create the id although I know that pandas provide a bfill object that may help to achieve this purpose. So far I tried to iterate with a simple for loop:

count = 0
for index, row in df.iterrows():
    if row['col_1'] == False:
        print(count+1)
    else:
        print(row['col_2'] + 1)

However, I do not know how to increase the counter to the next number. Also I tried to create a function and then apply it to the dataframe:

def create_id(col_1, col_2):
    counter = 0
    if col_1 == True and col_2.bool() == True:
        return counter + 1
    else:
        pass

Nevertheless, i lose control of filling backward the column.

BENY · Accepted Answer · 2018-11-08 19:33:00Z

2

Just do with cumsum

df['new_id']=(df.col_1.cumsum().shift().fillna(0)+1).astype(int)
df
Out[210]: 
    col_1  col_2  new_id
0   False      1       1
1   False      1       1
2   False      1       1
3   False      1       1
4   False      1       1
5   False      1       1
6   False      1       1
7   False      1       1
8   False      1       1
9   False      1       1
10  False      1       1
11  False      1       1
12  False      1       1
13  False      1       1
14  False      2       1
15   True      2       1
16  False      2       2
17  False      2       2
18   True      2       2
19  False      2       3
20  False      2       3
21  False      2       3
22  False      2       3
23  False      2       3
24  False      2       3
25  False      2       3
26  False      2       3
27  False      2       3
28  False      2       3
29  False      2       3

edited Nov 8, 2018 at 19:33

answered Nov 8, 2018 at 19:17

BENY

324k22 gold badges176 silver badges250 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

tumbleweed Over a year ago

Just for curiosity... how would you back fill starting from the next position? That is if you check row 15 an id 1 is assigned how would you fill with a 2?

BENY Over a year ago

@tumbleweed you can remove the shift

tumbleweed Over a year ago

So shift allows you to do the trick of starting from the other position?

BENY Over a year ago

@tumbleweed what you mean position here ?

BENY Over a year ago

@tumbleweed yes shift control it

|

Simon · Accepted Answer · 2018-11-08 19:30:29Z

1

If you aim to append the new_id column to your dataframe:

new_id=[]
counter=1
for index, row in df.iterrows():
    new_id+= [counter]
    if row['col_1']==True:
        counter+=1   
df['new_id']=new_id

answered Nov 8, 2018 at 19:30

Simon

111 bronze badge

Collectives™ on Stack Overflow

Problems while creating a two column based index in a new pandas column?

2 Answers 2

6 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

6 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related