Numpy 1d Array - Find Indices Of Boundaries Of Subsequences Of The Same Number

I have a numpy.array made of zeros and ones, e.g.:

import numpy
a = numpy.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1])

And now I need to get the first and last i

Solution 1:

You can find the beginning and end of these sequences by comparing shifted versions of the array with bitwise operators, then using np.where to get the corresponding indices:

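import numpy as np

def first_and_last_seq(x, n):
    # Pad both ends with a value different from n so runs touching the edges are detected
    a = np.r_[n-1, x, n-1]
    a = a == n
    # True where a run of n starts ([False, True]) or ends ([True, False])
    start = np.r_[False, ~a[:-1] & a[1:]]
    end = np.r_[a[:-1] & ~a[1:], False]
    # Subtract 1 to undo the offset introduced by the left padding
    return np.where(start | end)[0] - 1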

Checking with the proposed example:

a = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1])
first_and_last_seq(a, 1)
# array([ 3,  6,  9, 10, 14, 17])

Or with the following array:

a = np.array([5,5,5,6,2,3,5,5,5,2,3,5,5])
first_and_last_seq(a, 5)
# array([ 0,  2,  6,  8, 11, 12])

Further details:

A simple way to check for consecutive values in numpy is to use bitwise operators to compare shifted versions of an array. Note that ~a[:-1] & a[1:] is doing precisely that. The first term is the array sliced up to (but not including) the last element, and the second term is a slice from the second element (index 1) onwards, so each position pairs a value with the one that follows it.

Note that a is a boolean array, given a = a==n. In the above case we take the NOT of the first shifted boolean array (since we want True if the value is False). By taking a bitwise AND with the next value, we only get True if the next sample is True. This way we set to True only the indices where the sequences start (i.e. we've matched the subsequence [False, True]).
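To make the shifted comparison concrete, here is a minimal sketch on a small boolean array. The names prev, nxt and rising are only illustrative and do not appear in the answer's function:

```python
import numpy as np

a = np.array([False, True, True, False, True])

prev = a[:-1]   # every element except the last
nxt = a[1:]     # every element except the first

# True at position i when a[i] is False and a[i+1] is True,
# i.e. a [False, True] transition begins at i
rising = ~prev & nxt
print(rising)   # [ True False False  True]
```

The result is one element shorter than a, which is why the function prepends a False with np.r_ to realign it with the padded array.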

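The same logic applies for end, which matches the subsequence [True, False]. Taking an OR of both arrays and calling np.where on the result gives all start and end indices. Subtracting 1 compensates for the extra element padded on the left.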
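One thing to be aware of: because start and end are combined with OR, a run of length one contributes a single index rather than a pair, so the flat output cannot always be read two at a time. If explicit pairs are needed, a possible variation keeps the two masks separate. This is only a sketch, and run_bounds is a hypothetical helper name, not part of the original answer:

```python
import numpy as np

def run_bounds(x, n):
    """Hypothetical helper: (k, 2) array of inclusive [start, end] indices of each run of n."""
    # Pad with False on both sides so runs touching the edges are detected
    mask = np.r_[False, np.asarray(x) == n, False]
    # [False, True] transition at padded position i means a run starts at x[i]
    start = np.where(~mask[:-1] & mask[1:])[0]
    # [True, False] transition at padded position i means a run ends at x[i - 1]
    end = np.where(mask[:-1] & ~mask[1:])[0] - 1
    return np.column_stack((start, end))

a = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1])
print(run_bounds(a, 1))
# [[ 3  6]
#  [ 9 10]
#  [14 17]]

print(run_bounds([0, 1, 0], 1))
# [[1 1]]
```

Here a single-element run shows up as a row with equal start and end, and an input with no matching runs gives an empty array of shape (0, 2).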
