
What's the fastest way of returning the index of the FIRST match between a variable and an element within an ndarray? I see numpy.where used a lot, but that returns all indices.

import numpy as np

match = 5000
zArray = np.array([[0,1200,200],[1320,24,5000],[5000,234,5230]])

>array([[   0, 1200,  200],
        [1320,   24, 5000],
        [5000,  234, 5230]])

np.where(zArray==match)
>(array([1, 2], dtype=int64), array([2, 0], dtype=int64))

I'd like the first index returned, i.e. just [1,2], but numpy.where returns both [1,2] and [2,0].

  • How would you define the first match? Row-major or column-major? Commented Oct 25, 2017 at 3:57
  • thanks guys, I need to clarify this a bit.. Commented Oct 25, 2017 at 4:02
  • Be aware that your example array happens to be arranged so that the two arrays in the result look like the two (x, y) index pairs you're looking for. They are not: they are the [x1, x2] and [y1, y2] indices of the matches. Try e.g. [[0,5000,200],[1320,24,1200],[234,5000,5230]] instead to see. Commented Oct 25, 2017 at 4:11
  • Just to reiterate what @Evert has mentioned, "I'd like the first index returned, i.e. just [1,2]. but numpy.where returns both [1,2] and [2,0]" needs edits, I believe. Commented Oct 25, 2017 at 4:29
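
To make the point from the comments concrete, here is a minimal sketch using the array suggested above. It shows that np.where returns one index array per axis, and that the (row, col) coordinates only appear once those arrays are paired up. The names z, rows, cols and pairs are illustrative and not from the original post.

```python
import numpy as np

match = 5000
# Array suggested in the comments, where the coincidence disappears.
z = np.array([[0, 5000, 200], [1320, 24, 1200], [234, 5000, 5230]])

# np.where returns one index array per axis, not one pair per match.
rows, cols = np.where(z == match)

# Pair the arrays up to get (row, col) coordinates in row-major order.
pairs = list(zip(rows.tolist(), cols.tolist()))
first = pairs[0]
```

Here rows holds the row index of every match and cols holds the column index of every match, so the first match is (rows[0], cols[0]).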

1 Answer

You can use np.argwhere to get the matching indices packed as a 2D array with each row holding indices for each match and then index into the first row, like so -

np.argwhere(zArray==match)[0]

Alternatively, a faster option is to use argmax to get the index of the first match in a flattened version of the array, and then np.unravel_index to convert it into a tuple of per-dimension indices -

np.unravel_index((zArray==match).argmax(), zArray.shape)

Sample run -

In [100]: zArray
Out[100]: 
array([[   0, 1200, 5000], # different from sample for a generic one
       [1320,   24, 5000],
       [5000,  234, 5230]])

In [101]: match
Out[101]: 5000

In [102]: np.argwhere(zArray==match)[0]
Out[102]: array([0, 2])

In [103]: np.unravel_index((zArray==match).argmax(), zArray.shape)
Out[103]: (0, 2)
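
One caveat with the argmax approach: if there is no match at all, the boolean mask is all False and argmax returns 0, which looks the same as a match at the very first element. A minimal sketch of a guard for that case follows; first_index is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def first_index(arr, value):
    """Return the index tuple of the first row-major match, or None.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(arr)
    mask = (arr == value)
    if mask.size == 0:
        return None  # argmax raises on empty arrays, so bail out early
    flat_pos = mask.argmax()  # position in the flattened (row-major) mask
    # argmax is 0 both for "match at position 0" and for "no match at all",
    # so check the mask at that position before trusting it.
    if not mask.flat[flat_pos]:
        return None
    return tuple(int(i) for i in np.unravel_index(flat_pos, arr.shape))

zArray = np.array([[0, 1200, 5000], [1320, 24, 5000], [5000, 234, 5230]])
found = first_index(zArray, 5000)
missing = first_index(zArray, 7)
```

Returning None for the no-match case is just one possible design choice; raising an exception would work equally well.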

Runtime test -

In [104]: a = np.random.randint(0,100,(1000,1000))

In [105]: %timeit np.argwhere(a==50)[0]
100 loops, best of 3: 2.41 ms per loop

In [106]: %timeit np.unravel_index((a==50).argmax(), a.shape)
1000 loops, best of 3: 493 µs per loop

2 Comments

argwhere is just transpose(where(...)). It's handy for getting the first occurrence, but doesn't do any sort of short-circuiting. argmax might short-circuit the boolean match (I know the nan test does).
thanks Divakar. This works well, and is perfect for my needs.
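
On the short-circuiting point above: whatever argmax does internally, the comparison a == 50 still builds the full boolean array before argmax sees it. A rough sketch of one way to stop early is to scan the flattened array in blocks and return at the first block that contains a match. first_index_chunked and the chunk size are assumptions for illustration and have not been benchmarked; whether this beats the one-liners depends on where the first match sits.

```python
import numpy as np

def first_index_chunked(arr, value, chunk=65536):
    """Scan the flattened array block by block and stop at the first match.

    Hypothetical helper for illustration; not part of NumPy.
    """
    flat = arr.ravel()  # row-major; a view for contiguous arrays, else a copy
    for start in range(0, flat.size, chunk):
        # Only this block is compared, so later blocks are never touched
        # once a match has been found.
        hits = np.flatnonzero(flat[start:start + chunk] == value)
        if hits.size:
            pos = start + int(hits[0])
            return tuple(int(i) for i in np.unravel_index(pos, arr.shape))
    return None  # no match anywhere

zArray = np.array([[0, 1200, 200], [1320, 24, 5000], [5000, 234, 5230]])
result = first_index_chunked(zArray, 5000, chunk=2)
```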
