Is there a nice way in numpy to get element-wise indexes of where each element in array1 is in array2?

An example:

array1 = np.array([1, 3, 4])
array2 = np.arange(-2, 5, 1, dtype=int)

np.where(array1[0] == array2)
# (array([3]),)
np.where(array1[1] == array2)
# (array([5]),)
np.where(array1[2] == array2)
# (array([6]),)

I would like to do

np.where(array1 == array2)
# (array([3, 5, 6]),)

Is something like this possible? We are guaranteed that all entries in array1 can be found in array2.

2 Answers

Approach #1 : Use np.in1d to get a mask of the positions in array2 whose values appear in array1, then np.where to turn that mask into index positions -

np.where(np.in1d(array2, array1))
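
One thing to keep in mind with this approach: the mask is indexed by array2, so the positions come back in ascending order of array2, not in the order of array1. Here is a small sketch of that. The reordered array1 is a made-up input for illustration, and np.isin is used because it is the spelling NumPy recommends over np.in1d for new code.

```python
import numpy as np

# Hypothetical input: same values as in the question, but not in sorted order
array1 = np.array([4, 1, 3])
array2 = np.arange(-2, 5)

mask = np.isin(array2, array1)   # boolean mask over array2
idx = np.where(mask)[0]          # positions in array2, always ascending

print(idx)          # positions follow array2's order
print(array2[idx])  # values at those positions, not in array1's order
```

If array1 is sorted and has no duplicates, as in the question, the result lines up with array1 element by element.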

Approach #2 : With np.searchsorted -

np.searchsorted(array2, array1)

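Please note that np.searchsorted assumes array2 is sorted. If it is not, we need to pass the optional sorter argument (the indices that sort array2, e.g. from np.argsort), and the result then refers to positions in the sorted order.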
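
For the unsorted case, a minimal sketch could look like this. The shuffled array2 is a made-up input, and the names sidx and pos_in_sorted are only illustrative. Since searchsorted with sorter returns positions in the sorted order, they are mapped back through sidx to get indices into the original array2.

```python
import numpy as np

array1 = np.array([1, 3, 4])
# Hypothetical input: the values -2..4 from the question, in shuffled order
array2 = np.array([4, -2, 3, 0, 1, -1, 2])

sidx = np.argsort(array2)                                      # indices that sort array2
pos_in_sorted = np.searchsorted(array2, array1, sorter=sidx)   # positions in sorted order
idx = sidx[pos_in_sorted]                                      # map back to original positions

print(idx)
print(array2[idx])  # should reproduce array1
```

This relies on the guarantee from the question that every entry of array1 is present in array2. Without it, searchsorted only returns insertion points, and pos_in_sorted could even be out of range for sidx.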

Sample run -

In [14]: array1
Out[14]: array([1, 3, 4])

In [15]: array2
Out[15]: array([-2, -1,  0,  1,  2,  3,  4])

In [16]: np.where(np.in1d(array2, array1))
Out[16]: (array([3, 5, 6]),)

In [17]: np.searchsorted(array2, array1)
Out[17]: array([3, 5, 6])

Runtime test -

In [62]: array1 = np.random.choice(10000,1000,replace=0)

In [63]: array2 = np.sort(np.random.choice(100000,10000,replace=0))

In [64]: %timeit np.where(np.in1d(array2, array1))
1000 loops, best of 3: 483 µs per loop

In [65]: %timeit np.searchsorted(array2, array1)
10000 loops, best of 3: 40 µs per loop

3 Comments

Awesome! I've been playing with each individually, but didn't realise the key was to combine the two. Is there any benefit to either of the two approaches you're suggesting?
@pingul I would go with np.searchsorted if array2 is sorted already. Adding timings soon.
Thanks a lot! Both of the arrays will always be sorted.

Here is a simpler way if your arrays are not too big.

np.equal.outer(array1,array2).argmax(axis=1)

If array1 has size N and array2 has size M, this creates a temporary boolean array of shape (N, M), so the above method is not recommended when the arrays are large enough that this temporary does not fit in memory.
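
To make the temporary array visible, here is a short sketch using the arrays from the question. The names matches and found are only illustrative. Note that argmax returns 0 for a row with no True entry, so the any() check is a cheap safeguard in case the "every element is present" guarantee does not hold.

```python
import numpy as np

array1 = np.array([1, 3, 4])
array2 = np.arange(-2, 5)

# Boolean table: matches[i, j] is True where array1[i] == array2[j]
matches = np.equal.outer(array1, array2)
print(matches.shape)  # (N, M)

# Index of the first True in each row
idx = matches.argmax(axis=1)

# Rows without any match would silently give index 0, so check explicitly
found = matches.any(axis=1)
print(idx, found)
```

Unlike np.searchsorted, this does not need array2 to be sorted, and the result follows the order of array1.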
