1

For one dimensional array, I can implement it by the following way:

def binary_search(a, key, low=0, high=None):
    if high is None:
        high = len(a) - 1

    while low <= high:
        mid = (low + high) // 2
        midval = a[mid]

        if midval < key:
            low = mid + 1
        elif midval > key:
            high = mid - 1
        else:
            return mid
    raise ValueError

Or

def binary_search_bisect(lst, x):
    from bisect import bisect_left
    i = bisect_left(lst, x)
    if i != len(lst) and lst[i] == x:
        return i
    return -1

For Double Dimensional Array

First way: But this is not a better solution.

1. convert 2D to 1D
lst = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
lst2 = [ j for i in lst for j in i]

2. by binary search to find the value's index in 1D
binary_search_bisect(lst, 6) # search 6 and get its index is 5

3. convert the index of 1D to index of 2D
row = 5 / 5 # 1
col = 5 % 5 # 0

4. Verify the result
lst[1][0] # the result is 6

Second way:

started from top right corner (we can also start from bottom left corner) 
and move left if current element is greater than the value to be searched 
and bottom if current element is smaller than the value to be searched.

Question:

  • How can I implement the second way by python, I have no idea to implement it.
10
  • What is the relation between the arrays? Are the contained value ranges pair-wise exclusive? Is each array sorted? Are the arrays sorted? If so, I'd recommend a third way: Create a little meta array containing only the bounds of the arrays. On this array you conduct a binary search to find the array your value is in. Then proceed with 'normal' binary search on the found array. Commented Apr 16, 2017 at 14:15
  • 1
    @MichaelHoff It was worth a try. how to create a little meta array containing only the bounds of the arrays. if the array is [[1,2,3,4,5],[6,8,10,12,16]] Commented Apr 16, 2017 at 14:17
  • Then you can simply implement my approach. Take, e.g., the maximum of array 1, which is 5. Look if your value, e.g. 6, is bigger. If so, conduct binary search in array 2. If not, conduct binary search in array 1. Commented Apr 16, 2017 at 14:18
  • @MichaelHoff Ok, I see. Commented Apr 16, 2017 at 14:23
  • 1
    @MichaelHoff Somewhat, yes. But your to_double_idx is complicated and inefficient. See step 3 in the OPs first way to see that you can simply use / width and % width. Even if for some reason you want to generalize this to allow different lengths of the sublists, you'd better accumulate the lengths once, in the initializer, and then have to_double_idx binary search those prefix sums instead of doing your linear enumeration. Commented Apr 16, 2017 at 14:55

2 Answers 2

1

This is how you could do it (a third approach). Create a meta array containing only boundaries of the supplied arrays. Bisect the meta array to find the array in which you have to look for the value and then bisect the found array.

Sample code:

import bisect

def locate(lsts, x):
    # assuming lists sorted
    meta = [lst[-1] for lst in lsts]
    i1 = bisect.bisect_left(meta, x)

    if i1 != len(meta):
        lst = lsts[i1]
        i2 = bisect.bisect_left(lst, x)
        if i2 != len(lst) and lst[i2] == x:
            return (i1, i2)
    return -1

lsts = [[1, 4, 5], [7, 8, 9, 10], [11, 12, 14, 15]]

for i in range(17):
    t = locate(lsts, i)
    print("{} => {}".format(i, t))
    # if t != -1:
    #    row, col = t
    #    assert lsts[row][col] == i
    assert t == -1 or lsts[t[0]][t[1]] == i

produces

0 => -1
1 => (0, 0)
2 => -1
3 => -1
4 => (0, 1)
5 => (0, 2)
6 => -1
7 => (1, 0)
8 => (1, 1)
9 => (1, 2)
10 => (1, 3)
11 => (2, 0)
12 => (2, 1)
13 => -1
14 => (2, 2)
15 => (2, 3)
16 => -1

Second approach with mapping data structure as discussed in the comments. However, to_double_idx essentially requires a nested bisect lookup (or a linear implementation which it is now), which renders the above approach much more efficient. It basically inverses the behavior by doing the 'nested lookup' first only once.

class Mapper(object):
    def __init__(self, lsts):
        self._lsts = lsts
        self._lens = list(map(len, lsts))
        self._total_len = sum(self._lens)

    def __getitem__(self, abs_idx):
        if abs_idx < 0 or abs_idx >= self._total_len:
            raise ValueError()
        rel_idx = self.to_double_idx(abs_idx)
        assert rel_idx != -1
        return self._lsts[rel_idx[0]][rel_idx[1]]

    def __len__(self):
        return self._total_len

    def to_double_idx(self, abs_idx):
        rel_idx = abs_idx
        for lst_idx, lst_len in enumerate(self._lens):
            if rel_idx < lst_len:
                return (lst_idx, rel_idx)
            rel_idx -= lst_len
        return -1


def locate(lsts, x):
    mapper = Mapper(lsts)
    i = bisect.bisect_left(mapper, x)
    if i != len(mapper) and mapper[i] == x:
        return mapper.to_double_idx(i)
    return -1
Sign up to request clarification or add additional context in comments.

9 Comments

Why to add assert t == -1 or lsts[t[0]][t[1]] == i, I don't quite understand here.
You could improve the first part of your first solution from linear time to logarithmic time by searching for [x] in lsts directly: bisect_left(lsts, [x]).
I added a little 'explanation' to the code. When the element is not found in any of the arrays, simply -1 is returned. But when the element is found two indices are returned as a tuple, e.g. t == (0, 1). Then you can access your element with lsts[0][1] or lsts[t[0]][t[1]].
@StefanPochmann Correction: I will not change the implementation of to_double_idx. As you say, it would come down to a second bisect and that makes it way worse than the prior solution. I will only keep it as showcase for people who are interested. Could you elaborate on the bisect_left(lsts, [x]) solution? I tried to implement it but ran into some subtle issues. I'm having a hard time to determine the correct sub-list to proceed the lookup in.
@binbjz. max(lst) == lst[-1] for a sorted list. I will correct my answer later.
|
0

Using the following way for binary search with 2d.

def bisec_left(lst, item):
    low, high = 0, len(lst)

    while low < high:
        mid = (low + high) // 2
        mid_val = lst[mid]

        if item > mid_val:
            low = mid + 1
        else:
            high = mid
    return low


def bisec_2d(lsts, x):
    meta = [lst[-1] for lst in lsts]
    i1 = bisec_left(meta, x)

    if i1 != len(meta):
        lst = lsts[i1]
        i2 = bisec_left(lst, x)
        if i2 != len(lst) and lst[i2] == x:
            return i1, i2
    raise ValueError


if __name__ == '__main__':
    v = 9
    lsts = [[1, 4, 5], [7, 8, 9, 10], [11, 12, 14, 15]]
    print("{} => {}".format(v, bisec_2d(lsts, v)))

1 Comment

You can improve your answer with an explanation, what your script does and why you think it is (partially) better, faster, more efficient in comparison to the other approaches.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.