0

I have this function hash() that encrypts a given string to an integer.

letters = 'weiojpknasdjhsuert'

def hash(string_input):
  h = 3

  for i in range(0, len(string_input)):
    h = h * 43 + letters.index(string_input[i])

  return h

So if I do print(hash('post')), my output is: 11231443

How can I find what my input needs to be to get an output like 1509979332193868 if the input can only be a string from letters? There is a formula inside the loop body but I couldn't figure out how to reverse it.

7
  • 5
    Just to make straight the terminology used, please look at Fundamental difference between Hashing and Encryption algorithms Commented Oct 2, 2021 at 6:20
  • I don't think the formula is reversible, it may even be the case that multiple strings hash to the same value. Commented Oct 2, 2021 at 6:25
  • So how can I find out what the input can be given a single output? Commented Oct 2, 2021 at 6:33
  • @Barnar What if its a string made up exclusively from characters in a series of characters? Commented Oct 2, 2021 at 6:35
  • 1
    @Mark Fixed it. Letters is actually chars Commented Oct 2, 2021 at 6:48

1 Answer 1

2

It seem like since 43 is larger than your alphabet, you can just reverse the math. I don't know how to prove there are no hash collisions, so this may have edge cases. For example:

letters = 'weiojpknasdjhsuert'

def hash(string_input):
    h = 3

    for i in range(0, len(string_input)):
        h = h * 43 + letters.index(string_input[i])

    return h

n = hash('wepostjapansand')
print(n)
# 9533132150649729383107184

def rev(n):
    s = ''
    while n:
        l = n % 43  # going in reverse this is the index of the next letter
        n -= l      # now reverse the math...subtract that index
        n //= 43    # and divide by 43 to get the next n
        if n:
            s = letters[l] + s
    return s

print(rev(n))
# wepostjapansand

With a more reasonable alphabet, like lowercase ascii and a space, this still seem to be okay:

from string import ascii_lowercase 

letters = ascii_lowercase + ' '

def hash(string_input):
    h = 3

    for i in range(0, len(string_input)):
        h = h * 43 + letters.index(string_input[i])

    return h

n = hash('this is some really long text to test how well this works')
print(n)
# 4415562436659212948343839746913950248717359454765313646276852138403823934037244869651587521298

def rev(n):
    s = ''
    # with more compact logic
    while n > 3:
        s = letters[n % 43] + s
        n = (n - (n % 43)) // 43
    return s

print(rev(n))
# this is some really long text to test how well this works

The basic idea is that after all the math, the last number is:

prev * 43 + letter_index

This means you can recover the final letter index by taking the prev modulus 43. Then subtract that and divide by 43 (which is just the reverse of the math) and do it again until your number is zero.

Sign up to request clarification or add additional context in comments.

6 Comments

I didn't notice that alphabet was so small, I assumed it would include all uppercase and lowercase letters, so at least 52 characters, which is larger than 43.
Your first def rev(n) actually works. Would you mind explaining your logic?
Yeah, @Barmar, I think that's right. With all the upper and lower case letters, this seems to work with a bigger multiple like 59
@oo92, I added some explanation at the end...hope that's helpful.
I'm curious: Why doesn't the initial value h = 3 appear in the reverse?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.