
I have an array arr of size n containing the integers 1 through n in a random order; for example, it might be arr = [4 5 3 6 2 1].

For each element e in arr, I need to find the closest element k that is less than e. ("Closest" in terms of position, not value; so in the above example, 5 is closer to 3 than to 6.) If there are equally-close lesser elements in both directions — as with 6 in the example above, whose immediate neighbors are both less than it — then it doesn't matter which of those elements is chosen.

So with the example above, arr = [4 5 3 6 2 1], one valid result is [3 3 2 2 1 NIL].

I know of an O(n) algorithm to solve this; it works by maintaining a stack to find the nearest smaller element to the left and right of every element.
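For reference, a minimal Python sketch of that stack-based approach might look like this (the function name and the use of None for NIL are my own choices; on ties it happens to prefer the left side, which the problem statement allows):

```python
def closest_smaller(arr):
    """For each element, return the closest (by index) smaller element,
    or None where no smaller element exists (None plays the role of NIL)."""
    n = len(arr)

    def smaller_left_distances(seq):
        # dist[i] = distance to the nearest smaller element on the left,
        # found with the classic monotonic-stack technique.
        dist = [None] * n
        stack = []  # indices whose values are strictly increasing
        for i, v in enumerate(seq):
            while stack and seq[stack[-1]] >= v:
                stack.pop()
            if stack:
                dist[i] = i - stack[-1]
            stack.append(i)
        return dist

    left = smaller_left_distances(arr)
    # Reverse the array and rerun to get distances to the right.
    right = smaller_left_distances(arr[::-1])[::-1]

    result = []
    for i in range(n):
        l, r = left[i], right[i]
        if l is None and r is None:
            result.append(None)          # arr[i] is the minimum
        elif r is None or (l is not None and l <= r):
            result.append(arr[i - l])    # left side is closer (or tied)
        else:
            result.append(arr[i + r])
    return result

print(closest_smaller([4, 5, 3, 6, 2, 1]))  # [3, 4, 2, 3, 1, None]
```

(For the example array this prints [3, 4, 2, 3, 1, None] rather than [3, 3, 2, 2, 1, NIL], because it resolves ties toward the left; both are valid results.)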

But surprisingly, a simple linear bidirectional search performed better than that algorithm, even though it seems to me to be O(n²) in the worst case:

for i = 1 to n
  right = i + 1
  left = i - 1
  while (left > 0 && right <= n)
    // ...step left/right outward, and break out of the while loop
    // once an element smaller than arr[i] is found

What is the worst-case complexity of the above? Am I right that it's O(n²)?
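For concreteness, here is a runnable Python version of that bidirectional search (my own translation of the pseudocode; note that, unlike the pseudocode's while condition, it keeps scanning in the remaining direction after one side runs out, so elements near the edges are still handled):

```python
def closest_smaller_scan(arr):
    """Bidirectional search: step outward from each index until an element
    smaller than arr[i] is found; None marks elements with no smaller element."""
    n = len(arr)
    result = []
    for i, v in enumerate(arr):
        found = None
        for d in range(1, n):                # current search distance
            left, right = i - d, i + d
            if left >= 0 and arr[left] < v:  # prefer the left side on ties
                found = arr[left]
                break
            if right < n and arr[right] < v:
                found = arr[right]
                break
        result.append(found)
    return result

print(closest_smaller_scan([4, 5, 3, 6, 2, 1]))  # [3, 4, 2, 3, 1, None]
```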

5 Comments

  • If you can only do comparisons, you cannot do this faster than O(n log n), as you would then be able to sort the array in faster than O(n log n), a contradiction. That doesn't mean it's impossible (e.g. if you use observations more powerful than mere comparison), but it does make me suspicious of your claim that "I know of a O(n) algorithm" without further elaboration. Commented Feb 26, 2017 at 18:47
  • @DanielWagner I'm not sure I can relate this to sorting. The problem is only about finding the closest smaller element for every element in the array. The O(n) algorithm works by maintaining a stack holding a subsequence of the elements traversed so far, say up to index i. While the top of the stack is greater than arr[i], we keep popping from the stack until stack[top] < arr[i]; that element is the closest smaller element to the left of i. Reverse the array and rerun to get the closest smaller on the right. We can also track the indices of these elements and compare the left/right distances to find the closer one. Commented Feb 26, 2017 at 19:46
  • Relation to sorting: we can find the max element in O(n) time by linear search. Then we execute whatever your algorithm is in O(n) time so that, given any element of the array, we know the next smaller element in O(1). Now we fill in our sorted array by starting with the max element and iterating the "get the next smaller element" operation n times, for an additional n·O(1) = O(n) cost. Since all three operations are O(n), so is doing all of them, and at the end we have a sorted list. Commented Feb 26, 2017 at 19:51
  • I seem to have left a huge confusion in the statement: "closest" means closest in terms of indices. Thanks for pointing it out, @DanielWagner. Commented Feb 26, 2017 at 19:54
  • Cool, now that I understand the question I agree it's a pretty good one! Surprisingly it seems like the algorithm is O(n) ("seems" in the sense that I haven't managed to cook up a bad case that costs more than O(n)), but I'm having quite a difficult time proving it! Commented Feb 26, 2017 at 20:09

1 Answer


It is worst-case Θ(n log n) time (assuming you implement the fix that Daniel mentions above: you need to make sure the scan can continue all the way to the end in each direction, rather than stopping as soon as either pointer goes out of bounds).

First, observe that if two elements a and b are neighbors, then either a < b or b < a. In either case, at least one of these elements will require a "search distance" of only 1.

By extension, at least half the elements of the array will require a search distance of only 1.

Similarly, if four elements [a, b, c, d] are all in a row, then one of them is less than the other three, so at least three of these elements will require a search distance of at most 3. And as noted above, two of them actually require a search distance of only 1; so really this means that two have a search distance of 1 and a third has a search distance of at most 3. The total search distance for the four is then at most 2·1 + 1·3 + (n−1), where the final n−1 bounds the smallest of the four, which may have to scan the entire array.

We can continue this logic for increasing runs of length 2^k: the total search distance is at most 2^(k−1)·(2^1 − 1) + 2^(k−2)·(2^2 − 1) + 2^(k−3)·(2^3 − 1) + ⋯ + (n−1).

If n is a power of 2, then we can write it as 2^k, and the total search distance is at most:
2^(k−1)·(2^1 − 1) + 2^(k−2)·(2^2 − 1) + 2^(k−3)·(2^3 − 1) + ⋯ + 2^0·(2^k − 1) + (n − 1) =
      = (2^k − 2^(k−1)) + (2^k − 2^(k−2)) + (2^k − 2^(k−3)) + ⋯ + (2^k − 2^0) + (2^k − 1)
      = (k·2^k − 2^k + 1) + (2^k − 1)
      = k·2^k
      = n log n

(If n is not a power of 2, then we can obtain a somewhat larger array by appending the values n+1, n+2, etc., until we do have a power of 2, and observe that the resulting total search distance is less than 2n log 2n, which is still in Θ(n log n).)

Of course, the above just establishes an upper bound (O rather than Θ); but I think it shows how to actually achieve the worst case: we want the smallest numbers spread as far apart as possible, in a powers-of-two fashion. For example, we can "expand" a list like this:

1       2
1   3   2   4
1 5 3 6 2 7 4 8

The total search distance for the above is 7 + 1 + 2 + 1 + 4 + 1 + 2 + 1 = 19, which is Θ(n log n) as predicted (for n = 8 the n log n bound works out to 24).
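The doubling construction and the resulting total search distance can be checked directly (a small Python sketch; `expand` and `total_search_distance` are names I've made up):

```python
def expand(arr):
    """One 'expansion' step: keep the current values in the even slots and
    interleave the next batch of larger values into the odd slots."""
    m = len(arr)
    return [x for i in range(m) for x in (arr[i], m + 1 + i)]

def total_search_distance(arr):
    """Sum over all elements of the distance to the closest smaller element;
    the minimum pays n - 1, a full scan that finds nothing."""
    n = len(arr)
    total = 0
    for i, v in enumerate(arr):
        dists = [abs(i - j) for j, w in enumerate(arr) if w < v]
        total += min(dists) if dists else n - 1
    return total

arr = [1, 2]
for _ in range(2):
    arr = expand(arr)
print(arr)                         # [1, 5, 3, 6, 2, 7, 4, 8]
print(total_search_distance(arr))  # 19
```

Iterating one more step (n = 16) gives a total of 47, consistent with Θ(n log n) growth while staying under the n log n bound.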
