numpy: detect consecutive 1 in an array

Question

I want to detect consecutive spans of 1's in a numpy array. Indeed, I want to first identify whether the element in an array is in a span of a least three 1's. For example, we have the following array a:

    import numpy as np
    a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

Then the following 1's in bold are the elements satisfy the requirement.

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

Next, if two spans of 1's are separated by at most two 0's, then the two spans make up a longer span. So the above array is charaterized as

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

In other words, for the original array as input, I want the output as follows:

    [True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]

I have been thinking of an algorithm to implement this function, but all the one I come up with seems to complicated. So I would love to know better ways to implement this -- it would be greatly appreciated if someone can help me out.

Update:

I apologize that I did not make my question clear. I want to identify 3 or more consecutive 1's in the array as a span of 1's, and any two spans of 1's with only one or two 0's in between are identified, along with the separating 0's, as a single long span. My goal can be understood in the following way: if there are only one or two 0's between spans of 1's, I consider those 0's as errors and are supposed to be corrected as 1's.

@ritesht93 provided an answer that almost gives what I want. However, the current answer does not identify the case when there are three spans of 1's that are separated by 0's, which should be identified as one single span. For example, for the array

    a2 = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])

we should receive the output

    [False,  True,  True,  True,  True,  True,  True,  True,  True,
   True,  True,  True,  True,  True, False, False,  False, False,
   False,  True,  True,  True,  True,  True, False]

Update 2:

I was greatly inspired by and found the algorithm based on regular expression is easiest to implement and to understand -- though I am not sure about the efficient compared to other methods. Eventually I used the following method.

    lst = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
    lst1 = re.sub(r'1{3,}', lambda x:'c'*len(x.group()), ''.join(map(str, lst)))
    print lst1

which identified spans of 1's

    0ccc0ccc00cccc00100ccccc0

and then connect spans of 1's

    lst2 = re.sub(r'c{1}0{1,2}c{1}', lambda x:'c'*len(x.group()), ''.join(map(str, lst1)))
    print lst2

which gives

    0ccccccccccccc00100ccccc0

The final result is given by

    np.array(list(lst2)) == 'c'

    array([False,  True,  True,  True,  True,  True,  True,  True,  True,
    True,  True,  True,  True,  True, False, False, False, False,
   False,  True,  True,  True,  True,  True, False])

"Array characterized as:" and "want the output as follows" contradicts each other: Should positions 8 and 9 be part of a "span" ("True") or not (not bold)? — hvwaldow
– hvwaldow, Commented Oct 31, 2016 at 9:14
@hvwaldow Yes, you are correct. Thanks for pointing that out. Arealdy corrected. — user3821012
– user3821012, Commented Oct 31, 2016 at 9:39
"However, the current answer does not identify the case when there are three spans of 1's that are seperated by 0's,...." Hmm. I don't see this. My answer seems to produce the correct solution for your second testcase. — hvwaldow
– hvwaldow, Commented Oct 31, 2016 at 11:38
@hvwaldow Please refer to the updated description of the question. Sorry for not describing the question clearly and for the mistakes on the desired output... — user3821012
– user3821012, Commented Oct 31, 2016 at 11:44

Divakar · Accepted Answer · 2016-10-31 12:35:24Z

We could solve it with a combination of binary dilation and erosion to get past the first stage and then binary closing to get the final output, like so -

from scipy.ndimage.morphology import binary_erosion,binary_dilation,binary_closing

K = np.ones(3,dtype=int) # Kernel
b = binary_dilation(binary_erosion(a,K),K)
out = binary_closing(b,K) | b

Sample runs

Case #1 :

In [454]: a
Out[454]: array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

In [456]: out
Out[456]: 
array([ True,  True,  True,  True,  True,  True,  True, False, False,
       False, False, False,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False], dtype=bool)

Case #2 :

In [460]: a
Out[460]: 
array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])

In [461]: out
Out[461]: 
array([False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True, False, False, False, False,
       False,  True,  True,  True,  True,  True, False], dtype=bool)

riteshtch · Accepted Answer · 2016-10-31 13:16:16Z

1

Instead of solving the conventional way of looping and maintaining count we can convert all 0's and 1's to a single string and replace a regex match with another char say 2. Once that is done we split out the string again and check for bool() over each char.

>>> import re
>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> list(map(bool, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, True, True, False, False, True, True, True, True, True, True, True, True, True, True, False]
>>>

All the operations happen over here:

re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))

where it searches for a contiguous occurrence of 3 or more 1's followed by at most 2 0's i.e 1 or 2 0's followed by 3 or more 1's and replaces the whole matched string with same length string of 2's (used 2 because bool(2) is True). Also you can use tolist() method in NumPy to get a list out of NumPy array like this: np.array([1,2, 3, 4, 5, 6]).tolist()

Edit 1 : after a change in question, here is the updated answer:

>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> import re
>>> list(map(lambda x:False if x == 0 or x ==1 else True, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]
>>>

Edit 2 FINAL ANSWER:

>>> import re
>>> lst=[0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> while re.subn(r'[12]{3,}0{1,2}[12]{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))[1]:
...     lst=re.subn(r'[12]{3,}0{1,2}[12]{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))[0]
... 
>>> lst
'0222222222222200100111110'
>>> lst=list(re.sub(r'1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst))))
>>> lst
['0', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '0', '0', '1', '0', '0', '2', '2', '2', '2', '2', '0']
>>> list(map(lambda x:False if x == 0 or x ==1 else True, map(int, lst)))
[False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, False]
>>>

edited Oct 31, 2016 at 13:16

answered Oct 31, 2016 at 7:51

riteshtch

8,7994 gold badges28 silver badges39 bronze badges

6 Comments

hvwaldow Over a year ago

This doesn't return the desired result, after correction in original question. Should be easy to fix though.

hvwaldow Over a year ago

Well, no it doesn't work like that. Everytime you have something like 0,1,1,0,1,1,1,0,1,1,0 the three 1 will not be caught.

riteshtch Over a year ago

@hvwaldow sorry .. was offline, just saw the updated question and updated my answer accordingly

riteshtch Over a year ago

@hvwaldow so you mean 3 1's should represent true? if yes we can add another OR expression in regex as 1{3,} but I don't think that OP has mentioned anything about that

user3821012 Over a year ago

May I ask what the .group() for x does? x here is a string I guess?

|

hvwaldow · Accepted Answer · 2016-10-31 11:48:43Z

1

Many ways to to it. I would split this into grouping, apply conditions to groups and flattening operations. Like so:

from itertools import groupby, starmap
import numpy as np

a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

def condition(groups, key, newkey, minlen):
    return [(newkey, l) if l < minlen and k == key else (k, l) for k, l in groups]

def flatten(groups):
    return [k for g in starmap(lambda k, l: l * [k], groups) for k in g]

def group(l):
    return [(k, len(list(v))) for k, v in groupby(l)]

res = group(flatten(condition(group(a), 1, 0, 3)))
# groups zeros at the beginning or the end never change to ones,
# no matter their length
res = flatten([res[0]] + condition(res[1:-1], 0, 1, 3) + [res[-1]])
print [bool(v) for v in res]

edited Oct 31, 2016 at 11:48

answered Oct 31, 2016 at 10:06

hvwaldow

1,3761 gold badge11 silver badges14 bronze badges

Comments

duck · Accepted Answer · 2016-11-07 01:03:20Z

i know this is not so "python-wise", but since you talks about algorithm, i decided to give it a try (sorry i am not so familiar with python)

import numpy as np
a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
b = np.array([int])

#init 2nd array 
for x in range (0,(a.size-1)):
    b = np.append(b,0)

print (b)
#1st case
for x in range (2,(a.size)):
    if (a[x-2]==1 & a[x-1]==1 & a[x]==1): #1-1-1
        b[x] = 1
        b[x-1] = 1
        b[x-2] = 1

print (b)
#2nd case
for x in range (2,(b.size)):
    if (b[x-2]==1 & b[x]==1): #1-0-1
        if (b[x-1]==0): #sorry, i forget about logical op. in python
            b[x-1] = 1

print (b)
#3rd case
for x in range (3,(b.size)):
    if (b[x-3]==1 & b[x]==1): #1-0-0-1
        if (b[x-2]==0 & b[x]-1==0):
            b[x-1] = 1
            b[x-2] = 1

#4th case
for x in range (4,(b.size)):
    if (a[x-4]==1 & a[x-3]==1 & b[x]): #1-1-0-0-1
        if (a[x-2]==0 & a[x]-1==0):
            b[x-3] = 1
            b[x-4] = 1
print (b)

I am not sure if this is exactly your expected result, but here it is:
[1 1 1 1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0]

Collectives™ on Stack Overflow

numpy: detect consecutive 1 in an array

4 Answers 4

Comments

6 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

6 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related