7

I would like to obtain the index of the nearest value in a numpy array which is greater than my search value. Example: findNearestAbove(np.array([0.,1.,1.4,2.]), 1.5) should return 3 (the index of 2.).

I know that I can get the nearest index with np.abs(a-value).argmin(), and I found out that min(a[np.where(a-value >= 0.)[0]]) returns the desired array value. Hence, np.where(a == min(a[np.where(a-value >= 0.)[0]]))[0] would probably give me the desired index. However, this looks rather convoluted, and I fear that it might break in the case of multi-dimensional arrays. Any suggestions how to improve this?
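A less convoluted form of the same idea, as a minimal sketch: restrict the search to the positions whose value exceeds the target, then take the smallest of those. The function name follows the question; returning None when nothing qualifies is an assumption, since the question does not say what should happen in that case.

```python
import numpy as np

def findNearestAbove(a, value):
    """Index of the smallest element of 1-D array `a` strictly greater than `value`."""
    a = np.asarray(a)
    candidates = np.flatnonzero(a > value)   # positions of all values above the target
    if candidates.size == 0:
        return None                          # assumption: None signals "no such value"
    # Among the candidates, pick the one with the smallest value
    return int(candidates[np.argmin(a[candidates])])

print(findNearestAbove(np.array([0., 1., 1.4, 2.]), 1.5))  # prints 3
```

This does not require the array to be sorted, but it is written for 1-D input only.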

3
  • And by “nearest” you mean “leftmost”? Commented Jun 14, 2013 at 23:14
  • Are your arrays always sorted in ascending order? Commented Jun 15, 2013 at 0:28
  • To clarify: "nearest" means "by value". Also: no - the arrays are not necessarily sorted. Commented Jun 15, 2013 at 20:02

4 Answers

10

I believe you can use np.searchsorted for this:

In [15]: np.searchsorted(a,[1.5,],side='right')[0]
Out[15]: 3

assuming a is in ascending order.

This method also won't work for multi-dimensional arrays, but I'm not sure exactly how that use case would work in terms of the expected output. If you could give an example of what you imagine, I might be able to adapt this to that purpose.
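If the array is not sorted, np.searchsorted can still be used through its sorter argument, which takes the indices that would sort the array. The following is only a sketch under that assumption; the helper name nearest_above_unsorted and the None return value are made up for illustration.

```python
import numpy as np

def nearest_above_unsorted(a, value):
    """Index (into the original, unsorted `a`) of the smallest element > value."""
    a = np.asarray(a)
    order = np.argsort(a, kind="stable")      # indices that sort `a` ascending
    # Position in the *sorted* view of the first element strictly greater than value
    pos = np.searchsorted(a, value, side="right", sorter=order)
    if pos == len(a):
        return None                           # assumption: None when nothing is above
    return int(order[pos])                    # map back to an index into `a`

print(nearest_above_unsorted(np.array([2., 0., 1.4, 1.]), 1.5))  # prints 0
```

The sort costs O(n log n), so this only pays off when the same array is queried many times and the result of argsort can be reused.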

Note: you could also use np.digitize for this purpose. It requires that a be monotonic, and it executes a linear rather than a binary search, so for certain input sizes it can be a lot slower than searchsorted:

In [25]: np.digitize([1.5,], a, right=True)[0]
Out[25]: 3

3 Comments

Thanks! This may turn out to be helpful later when I am indeed dealing with sorted arrays. The use case is that of geographic coordinates: normally they would be 1-dimensional, but you can also have "curvilinear grids", in which case both longitudes and latitudes are stored as 2D arrays. I must admit that I haven't fully thought about the use case for find_nearest_above in this case. It may well be that this doesn't make sense.
As of v1.10, "np.digitize is implemented in terms of np.searchsorted." (quoting the docs). There is literally no longer a difference between the functions.
Why doesn't it work the opposite way when I type side='left'?
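On the side='left' question, a short illustration may help: side only decides where a value that is already present in the array gets inserted, so it does not switch the search from "above" to "below". The snippet is a sketch reusing the example array from the question; the variable names are illustrative.

```python
import numpy as np

a = np.array([0., 1., 1.4, 2.])

# For a value that is not in `a`, both sides give the same insertion point.
same_left = np.searchsorted(a, 1.5, side="left")
same_right = np.searchsorted(a, 1.5, side="right")
print(same_left, same_right)   # prints 3 3

# For a value that is in `a`, they differ by the number of equal elements.
left = np.searchsorted(a, 1.4, side="left")    # first i with a[i] >= 1.4
right = np.searchsorted(a, 1.4, side="right")  # first i with a[i] >  1.4
print(left, right)             # prints 2 3

# The nearest value strictly *below* 1.5 is one position before the
# side="left" insertion point (valid only if that point is greater than 0).
below = np.searchsorted(a, 1.5, side="left") - 1
print(below)                   # prints 2
```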
7

Here is one way (I am assuming that by "nearest" you mean nearest in value, not in location):

import numpy as np

def find_nearest_above(my_array, target):
    diff = my_array - target
    # Mask the negative differences and zero,
    # since we are looking for values strictly above the target
    mask = np.ma.less_equal(diff, 0)
    if np.all(mask):
        return None  # no value in my_array is greater than target
    masked_diff = np.ma.masked_array(diff, mask)
    return masked_diff.argmin()

Result:

>>> find_nearest_above(np.array([0.,1.,1.4,2.]), 1.5)
3
>>> find_nearest_above(np.array([0.,1.,1.4,-2.]), -1.5)
0
>>> find_nearest_above(np.array([0., 1, 1.4, 2]), 3)
>>> 

4 Comments

This is the fastest solution if the arrays cannot be assumed to be sorted
Indeed, this is very charming. There is only one little glitch here: if value is larger than the largest array value, then this function returns an index of zero. Hence, one needs an extra check: ind = masked_diff.argmin() ; if my_array[ind] <= target: ind = None ; return ind. Or is there a more efficient way to fix this?
if not np.any(mask): return
Thank you for the input, I updated my answer. It works properly now.
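For the 2D coordinate grids mentioned in the comments under the first answer, one possible extension is to search the flattened array and convert the flat position back to a multi-dimensional index with np.unravel_index. This is a sketch only: the name find_nearest_above_nd is hypothetical, and it assumes the data contain no infinite values.

```python
import numpy as np

def find_nearest_above_nd(my_array, target):
    """Index tuple of the smallest element strictly greater than target, or None."""
    my_array = np.asarray(my_array, dtype=float)
    # Replace every entry that is not above the target by +inf so it cannot win
    candidates = np.where(my_array > target, my_array, np.inf)
    flat_index = candidates.argmin()
    if not np.isfinite(candidates.flat[flat_index]):
        return None  # assumption: finite data, so +inf means "nothing above target"
    # Convert the flat position back into one index per dimension
    return np.unravel_index(flat_index, my_array.shape)

grid = np.array([[0., 1.], [1.4, 2.]])
print(find_nearest_above_nd(grid, 1.5))
```

For the 2x2 grid above, the result points at row 1, column 1, the position of 2.0.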
0

Here's a solution that worked nicely for me for finding the index and value of the nearest element greater than a given number in an array (no promises in terms of speed, etc.):

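import numpy as np

def findNearestGreaterThan(searchVal, inputData):
    # The float cast lets us store np.inf even for integer input
    diff = np.asarray(inputData, dtype=float) - searchVal
    diff[diff <= 0] = np.inf  # rule out values that are not strictly greater
    idx = diff.argmin()  # note: 0 if no value is greater than searchVal
    return idx, inputData[idx]

It's easily adapted for nearest but less than, too:

def findNearestLessThan(searchVal, inputData):
    diff = np.asarray(inputData, dtype=float) - searchVal
    diff[diff >= 0] = -np.inf  # rule out values that are not strictly smaller
    idx = diff.argmax()  # note: 0 if no value is less than searchVal
    return idx, inputData[idx]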


-3

Here is the right way to do this:

>>> import numpy as np
>>> def argfind(array, predicate):
...     # Return the index of the first element that satisfies predicate
...     for i in range(array.shape[0]):
...         if predicate(array[i]):
...             return i
...     return None  # no match (False would compare equal to index 0)
...
>>> def find_nearest_above(array, value):
...     return argfind(array, lambda x: x > value)
...
>>> find_nearest_above(np.array([0.,1.,1.4,2.]), 1.5)
3

The point here is that if a matching value exists, you'll get the answer as soon as this value is met. Other methods (including your own, proposed in the question) will inspect the whole array, which is a waste of time.

4 Comments

Assuming the arrays are indeed sorted, np.searchsorted will do this a lot faster
@ali_m yeah, you can be sure, I've heard about binary search. But why do you think that the arrays are sorted?
well, your solution would only work if the arrays are sorted
@ali_m the thing is that “nearest” for me means “leftmost”. As I've mentioned in a comment to the question.
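The two readings of "nearest" discussed in these comments give different indices as soon as the array is unsorted. A small sketch with a made-up unsorted array shows the difference; the variable names are illustrative.

```python
import numpy as np

a = np.array([0., 3., 1.4, 2.])   # deliberately unsorted example data
value = 1.5

above = a > value                  # boolean mask of the values above the target

# "Leftmost": the first position whose value exceeds `value`.
# np.argmax returns 0 for an all-False mask, hence the explicit check.
leftmost = int(np.argmax(above)) if above.any() else None

# "Nearest by value": the smallest value among those exceeding `value`.
nearest = int(np.where(above, a, np.inf).argmin()) if above.any() else None

print(leftmost, nearest)           # prints 1 3
```

On a sorted array the two indices coincide, which is why the loop-based answer and np.searchsorted agree on the example from the question.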
