3

I am trying to write a recursive function that returns the position of a word in a sorted word list, or return None when the word is not found. The following is the code:

def partition(t,kw,upper,lower):
    if ((upper-lower) > 1):
        pivot = lower + (upper-lower)//2
        print("pivot is now {}".format(pivot))
        if (kw >= t[pivot]):
            lower = pivot
            print("searching on the right",upper,lower)
            return(partition(t,kw,upper,lower))
        
        elif (kw <= t[pivot-1]):
            upper = pivot-1
            print("searching on the left",upper,lower)
            return(partition(t,kw,upper,lower))

        else:
            return None 
            #that means the keyword is between t[pivot-1] and t[pivot]
            #which means it doesnt exist in the list
    if (upper - lower) <= 1:
        if (kw == t[upper]):
            print("found element {} at upper bound {}".format(kw,upper))
            return(upper)
        elif (kw == t[lower]):
            print("found element {} at lower bound {}".format(kw,lower))
            return(lower)
        else:
            return None

def search(t, kw):
    u = len(t)
    partition(t,kw,u,0)

As you can see I returned the function whenever I called it(not returning is a common mistake in using recursive calls). At the same time, here's the sample array I was using:

['', '', '150', '150', '1997,with', '36', '49', 'An', 'Annotated', 'Annotation', 'Bibliography', 'China', 'Chinese', 'Chinese', 'Classical', 'During', 'Dynasty', 'Hong', 'Hong', 'Hong', 'Hong', 'Hong', 'In', 'It', 'Kong', 'Kong', 'Kong', 'Kong,', 'Kong.', 'Mainland', 'Poets', 'Qing', 'They', 'Together,', 'Writings', 'a', 'a', 'a', 'a', 'a', 'active', 'activity,', 'addition', 'almost', 'and', 'and', 'and', 'and', 'and', 'and', 'annotations', 'anthologies', 'basic', 'been', 'before.', 'bibliographic', 'bibliographies,', 'by', 'carry', 'ci-poetry', 'ci-poetry', 'classical', 'collected', 'commentaries', 'compilation,', 'compilations', 'compiled', 'covered,', 'development', 'events', 'focused', 'form', 'form', 'formation,', 'from', 'from', 'has', 'help', 'hidden', 'in', 'in', 'in', 'in', 'includes', 'individual', 'information', 'information', 'introduces', 'invaluable', 'is', 'late', 'literary', 'literati', 'literature', 'literature.', 'membership,', 'most', 'never', 'not', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'offer', 'on', 'on', 'on', 'on', 'order', 'over', 'past', 'periods', 'pioneer', 'pity', 'poetic', 'poetry', 'poetry', 'poetry', 'poet’s', 'political', 'previous', 'previously', 'published', 'refuge', 'research', 'sequel', 'shi-', 'shi-', 'societies', 'societies', 'societies', 'societies.', 'societies.', 'splendor,', 'that', 'that', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'these', 'these', 'these', 'this', 'times', 'to', 'to', 'to', 'to', 'to', 'to', 'took', 'tools', 'topic', 'tradition.', 'turmoil', 'two', 'uncover', 'understand', 'understanding', 'unique', 'up', 'various', 'very', 'volume,', 'which', 'works,', 'worthwhile', 'would', 'years,']

I was searching for the word "Qing", it should end up in the 31st position. Right now it seems like the function could find the word I need, but couldn't return:

the result is
pivot is now 92
searching on the left 91 0
pivot is now 45
searching on the left 44 0
pivot is now 22
searching on the right 44 22
pivot is now 33
searching on the left 32 22
pivot is now 27
searching on the right 32 27
pivot is now 29
searching on the right 32 29
pivot is now 30
searching on the right 32 30
pivot is now 31
searching on the right 32 31
found element Qing at lower bound 31
None

I tried searching for issues relating to recursive functions but didn't seem much related content. Most posts on SO are caused by not returning the recursive function, and I am really not sure what's wrong in here.

2
  • Please provide a complete minimal reproducible example – that includes how the code is actually called. Commented Mar 3, 2021 at 13:30
  • Are you simply wanting to return 31 from def search? If so, you need to add a return statement! Commented Mar 3, 2021 at 13:30

3 Answers 3

5

Your function seems to be working. I think you just forgot to return from search, i.e.

def search(t, kw):
    u = len(t)
    lower = partition(t,kw,u,0)
    return lower

Outputs

...
searching on the right 32 30
pivot is now 31
searching on the right 32 31
found element Qing at lower bound 31
31
Sign up to request clarification or add additional context in comments.

Comments

2

single purpose, simplified

As it's written partition is a big function with many parameters trying to do many things. Is it possible to separate some of its concerns and make it more usable? Can we abstract away some of the complexity to make the function easier to write and less prone to error?

To look at ways to improve partition, we will first look at Python's range data structure. In your implementation, you have used the following arithmetic to arrive at a solution -

  • upper - lower > 1
  • lower + (upper - lower) // 2
  • kw <= t[pivot - 1]
  • upper = pivot - 1
  • upper - lower <= 1

The cognitive overhead above is tremendous and the potential for Off-By-One Errors is high. Using a range, r, we can find the pivot p and efficiently slice using [], and simultaneously side-step all of our headaches -

r = range(1,100)
p = len(r) // 2
print("input", r)        # input range(1, 100)
print("pivot", p)        # pivot 49
print("guess", r[p])     # guess 50
print("left ", r[0:p])   # left  range(0, 50)
print("right", r[p+1:])  # right range(51, 100)

Here is the same arithmetic applied to a different range -

r = range(44,66)
p = len(r) // 2
print("input", r)        # input range(44, 66)
print("pivot", p)        # pivot 11
print("guess", r[p])     # guess 55
print("left ", r[0:p])   # left  range(44, 55)
print("right", r[p+1:])  # right range(56, 66)

partition

Using these properties, we can write a generic partition that only concerns itself with a single range -

def partition(r):
  if len(r) == 0: return (yield None)             # base case
  pivot = len(r) // 2                             # pivot
  (a, b) = yield r[pivot]                         # guess
  if a < b: yield from partition(r[0:pivot])      # recur on left
  elif a > b: yield from partition(r[pivot + 1:]) # recur on right
  else: return None                               # equal

Above, our function has no concerns for a list of words, t, a keyword to search for, kw, nor individual lower and upper bounds. Most importantly arithmetic is minimized, using only // 2 to find a pivot and +1 to calculate the right-side range. The potential for range errors has been significantly reduced.

search

Now we can write search as a specialization of partition which takes a list of words, t, and a keyword to search for, kw -

def search(t, kw):
  def loop(p, v):                     # loop
    i = p.send(v)                     # get guess
    if i is None: return None         # base case
    elif kw == t[i]: return i         # found it! return i
    else: return loop(p, (kw, t[i]))  # recur with comparables
  r = range(0, len(t))                # init range
  return loop(partition(r), None)     # loop over partition of r 

This is a simple loop over partition. On each iteration, we get an index, i, from our partition generator, p. If there is no index, we have reached the end of the search. Otherwise if kw is equal to t[i] we found the answer and can return it. Otherwise kw is not equal to t[i] and so we recur on p sending (kw, t[i]) as the next comparison.

Using our search function is straightforward -

r = search(words, "Qing")
if r is None:
  print("no result")
else:
  print(f"found at index: {r}")
found at index: 31

If we search for a word that does not exist -

r = search(words, "Wxyz")
if r is None:
  print("no result")
else:
  print(f"found at index: {r}")
no result

handle with care

Python generators have a unique interface and they're one of the rare exceptions where using try..except for program control flow is acceptable. Above I explicitly yield None in partition and check for None in loop but Python offers a StopIteration exception that allows us remove some of this clutter -

def partition(r):
  if len(r) == 0: return                    # stop
  p = len(r) // 2                           # pivot
  (a, b) = yield r[p]                       # guess
  if a < b: yield from partition(r[0:p])    # recur left
  if a > b: yield from partition(r[p+1:])   # recur right
                                            # (implict) return None

Now when we write search, we can skip null-checks for None and proceed with the assumption that we have a valid guess. Otherwise if the generator is exhausted, we catch the stop exception and return None -

def search(t, kw):
  def loop(p, v):                           # loop
    i = p.send(v)                           # get guess 
    if kw == t[i]: return i                 # match?
    return loop(p, (kw, t[i]))              # recur
  try:
    r = range(0, len(t))
    return loop(partition(r), None)
  except StopIteration:                     # base case
    return None

Over time I have come to prefer this interface as it reduces explicit nulls and the need to check for them. These new implementations of partition and search behave the same and produce the same output.

decoupling side effects

Another important thing to note here is that side effects like print are not present in the partition function. This makes it usable in other parts of your program that might compare other data structures or wishes to format output in a different way -

Below we can write a simple modification to search which gives you debug output -

def search(t, kw):
  def loop(p, v):
    i = p.send(v)
    print(f"searching index {i}: {kw} == {t[i]}?") # one change
    if kw == t[i]: return i
    return loop(p, (kw, t[i]))
  try:
    r = range(0, len(t))
    return loop(partition(r), None)
  except StopIteration:
    return None

Let's see the output of our examples with the debug in place -

r = search(words, "Qing")
if r is None:
  print("no result")
else:
  print(f"found at index: {r}")
searching index 92: Qing == literati?
searching index 46: Qing == and?
searching index 23: Qing == It?
searching index 35: Qing == a?
searching index 29: Qing == Mainland?
searching index 32: Qing == They?
searching index 31: Qing == Qing?
found at index: 31

And our second example -

r = search(words, "Wxyz")
if r is None:
  print("no result")
else:
  print(f"found at index: {r}")
searching index 92: Wxyz == literati?
searching index 46: Wxyz == and?
searching index 23: Wxyz == It?
searching index 35: Wxyz == a?
searching index 29: Wxyz == Mainland?
searching index 32: Wxyz == They?
searching index 34: Wxyz == Writings?
no result

without recursion

You'll often hear people talking-down about recursion in Python. Because binary search is very efficient, it would take a monstrous input for the recursive solution above to overflow the stack. That said, there's nothing stopping us from writing partition and search using iteration in place of recursion, if we wish -

def partition(r):
  while len(r):
    p = len(r) // 2
    (a, b) = yield r[p]
    if a < b: r = r[0:p]
    if a > b: r = r[p+1:]
    if a == b: break
def search(t, kw):
  try:
    r = range(0, len(t))
    p = partition(r)
    v = None
    while True:
      i = p.send(v)
      if kw == t[i]: return i
      v = (kw, t[i])
  except StopIteration:
    return None

Comments

1

As pointed out by this great answer, you neglected to add a return statement to your search function. There are, however, things that can be improved in your code:

def partition(t, kw, upper, lower):
    if upper - lower > 1:
        pivot = lower + (upper - lower) // 2
        print("pivot is now {}".format(pivot))
        if kw >= t[pivot]:
            lower = pivot
            print("searching on the right", upper, lower)
            return partition(t, kw, upper, lower) 
        if kw <= t[pivot - 1]:
            upper = pivot - 1
            print("searching on the left", upper, lower)
            return partition(t, kw, upper, lower)
    if upper - lower <= 1:
        if kw == t[upper]:
            print("found element {} at upper bound {}".format(kw, upper))
            return upper 
        if kw == t[lower]:
            print("found element {} at lower bound {}".format(kw, lower))
            return lower

def search(t, kw):
    u = len(t)
    return partition(t, kw, u, 0)

lst = ['', '', '150', '150', '1997,with', '36', '49', 'An', 'Annotated', 'Annotation', 'Bibliography', 'China', 'Chinese', 'Chinese', 'Classical', 'During', 'Dynasty', 'Hong', 'Hong', 'Hong', 'Hong', 'Hong', 'In', 'It', 'Kong', 'Kong', 'Kong', 'Kong,', 'Kong.', 'Mainland', 'Poets', 'Qing', 'They', 'Together,', 'Writings', 'a', 'a', 'a', 'a', 'a', 'active', 'activity,', 'addition', 'almost', 'and', 'and', 'and', 'and', 'and', 'and', 'annotations', 'anthologies', 'basic', 'been', 'before.', 'bibliographic', 'bibliographies,', 'by', 'carry', 'ci-poetry', 'ci-poetry', 'classical', 'collected', 'commentaries', 'compilation,', 'compilations', 'compiled', 'covered,', 'development', 'events', 'focused', 'form', 'form', 'formation,', 'from', 'from', 'has', 'help', 'hidden', 'in', 'in', 'in', 'in', 'includes', 'individual', 'information', 'information', 'introduces', 'invaluable', 'is', 'late', 'literary', 'literati', 'literature', 'literature.', 'membership,', 'most', 'never', 'not', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'offer', 'on', 'on', 'on', 'on', 'order', 'over', 'past', 'periods', 'pioneer', 'pity', 'poetic', 'poetry', 'poetry', 'poetry', 'poet’s', 'political', 'previous', 'previously', 'published', 'refuge', 'research', 'sequel', 'shi-', 'shi-', 'societies', 'societies', 'societies', 'societies.', 'societies.', 'splendor,', 'that', 'that', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'these', 'these', 'these', 'this', 'times', 'to', 'to', 'to', 'to', 'to', 'to', 'took', 'tools', 'topic', 'tradition.', 'turmoil', 'two', 'uncover', 'understand', 'understanding', 'unique', 'up', 'various', 'very', 'volume,', 'which', 'works,', 'worthwhile', 'would', 'years,']
print(search(lst, "Qing"))

Output:

pivot is now 92
searching on the left 91 0
pivot is now 45
searching on the left 44 0
pivot is now 22
searching on the right 44 22
pivot is now 33
searching on the left 32 22
pivot is now 27
searching on the right 32 27
pivot is now 29
searching on the right 32 29
pivot is now 30
searching on the right 32 30
pivot is now 31
searching on the right 32 31
found element Qing at lower bound 31
31

Actually, every second if statement can be eliminated as well (as in, remove the if line and unindent its contents), but for the sake of readability, leaving them is fine.

Also, if you're using python 3, the "Example number {}".format(5) elements can be replaced with f"Example number {5}".

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.