Find a hash encryption input from an output

Question

I have this function hash() that encrypts a given string to an integer.

letters = 'weiojpknasdjhsuert'

def hash(string_input):
  h = 3

  for i in range(0, len(string_input)):
    h = h * 43 + letters.index(string_input[i])

  return h

So if I do print(hash('post')), my output is: 11231443

How can I find what my input needs to be to get an output like 1509979332193868 if the input can only be a string from letters? There is a formula inside the loop body but I couldn't figure out how to reverse it.

Just to make straight the terminology used, please look at Fundamental difference between Hashing and Encryption algorithms — buran
– buran, Commented Oct 2, 2021 at 6:20
I don't think the formula is reversible, it may even be the case that multiple strings hash to the same value. — Barmar
– Barmar, Commented Oct 2, 2021 at 6:25
So how can I find out what the input can be given a single output? — mchd
– mchd, Commented Oct 2, 2021 at 6:33
@Barnar What if its a string made up exclusively from characters in a series of characters? — mchd
– mchd, Commented Oct 2, 2021 at 6:35

Mark · Accepted Answer · 2021-10-02 07:25:50Z

2

It seem like since 43 is larger than your alphabet, you can just reverse the math. I don't know how to prove there are no hash collisions, so this may have edge cases. For example:

letters = 'weiojpknasdjhsuert'

def hash(string_input):
    h = 3

    for i in range(0, len(string_input)):
        h = h * 43 + letters.index(string_input[i])

    return h

n = hash('wepostjapansand')
print(n)
# 9533132150649729383107184

def rev(n):
    s = ''
    while n:
        l = n % 43  # going in reverse this is the index of the next letter
        n -= l      # now reverse the math...subtract that index
        n //= 43    # and divide by 43 to get the next n
        if n:
            s = letters[l] + s
    return s

print(rev(n))
# wepostjapansand

With a more reasonable alphabet, like lowercase ascii and a space, this still seem to be okay:

from string import ascii_lowercase 

letters = ascii_lowercase + ' '

def hash(string_input):
    h = 3

    for i in range(0, len(string_input)):
        h = h * 43 + letters.index(string_input[i])

    return h

n = hash('this is some really long text to test how well this works')
print(n)
# 4415562436659212948343839746913950248717359454765313646276852138403823934037244869651587521298

def rev(n):
    s = ''
    # with more compact logic
    while n > 3:
        s = letters[n % 43] + s
        n = (n - (n % 43)) // 43
    return s

print(rev(n))
# this is some really long text to test how well this works

The basic idea is that after all the math, the last number is:

prev * 43 + letter_index

This means you can recover the final letter index by taking the prev modulus 43. Then subtract that and divide by 43 (which is just the reverse of the math) and do it again until your number is zero.

edited Oct 2, 2021 at 7:25

answered Oct 2, 2021 at 6:51

Mark

92.6k8 gold badges116 silver badges156 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Barmar Over a year ago

I didn't notice that alphabet was so small, I assumed it would include all uppercase and lowercase letters, so at least 52 characters, which is larger than 43.

mchd Over a year ago

Your first def rev(n) actually works. Would you mind explaining your logic?

Mark Over a year ago

Yeah, @Barmar, I think that's right. With all the upper and lower case letters, this seems to work with a bigger multiple like 59

Mark Over a year ago

@oo92, I added some explanation at the end...hope that's helpful.

Barmar Over a year ago

I'm curious: Why doesn't the initial value h = 3 appear in the reverse?

|

Collectives™ on Stack Overflow

Find a hash encryption input from an output

1 Answer 1

6 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related