2

Consider the following image:

enter image description here

How can I represent a hash code in binary format (for example, as 0110011)?

Thank you

4 Answers

4
>>> import hashlib
>>> s="cat"
>>> hashcode=hashlib.md5(s.encode("utf-8")).hexdigest()
>>> bin(int(hashcode,16))
'0b11010000011101111111001001000100110111101111100010100111000011100101111010100111010110001011110110000011010100101111110011011000'
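Note that bin() adds a '0b' prefix and drops leading zero bits, so the result is not always 128 characters long. A minimal sketch of a fixed-width variant follows, assuming Python 3 and UTF-8 input; the helper name md5_bits is hypothetical and not part of hashlib.

```python
import hashlib

def md5_bits(text):
    """Return the MD5 digest of text as a 128-character string of 0s and 1s."""
    hexdigest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # The '0128b' format spec pads with leading zeros to the full digest width
    return format(int(hexdigest, 16), "0128b")

print(md5_bits("cat"))
print(len(md5_bits("A")))  # 128, even though this digest starts with a zero bit
```

Because every MD5 digest is 128 bits, padding to that width gives the same length for every input string.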

2

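Try this (originally answered by @Ashwini Chaudhary).

Note that it converts the ASCII code of each character in the hex string, not the hash value itself, and the per-character results are not padded to a fixed width: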

''.join(format(ord(n),'b') for n in hashcode)
'11001001100001101111101111100110110010110100110100110010011001011100110111000110000111011111000011001011101011100101110000111011111010111100011000101100100111000110011110101110010110011011000111100100111000'

Another update

A much cleaner approach comes from @m.wasowski. I am combining his idea here to try to solve this (I am still not sure this is exactly what the OP wants, since the words in the sentence have different lengths):

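import hashlib

def text_to_bin(text, n_bin=16):
    # n_bin is the base used to parse the hex digest, so it must cover 0-9 and a-f
    if n_bin < 16 or n_bin > 36:
        print("n_bin must be >= 16 and <= 36")
    else:
        for word in text.split():
            digest = hashlib.md5(word.encode("utf-8")).hexdigest()
            # bin() prefixes '0b' and drops leading zeros, so lengths can differ
            print('{} {}'.format(word, bin(int(digest, n_bin))[2:]))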

text_to_bin('A cat in the woods', 16)
A 1111111110001010110001001110000111001111010011100001111101010000001101001011001001101011011011100101110101011001011111000101001
cat 11010000011101111111001001000100110111101111100010100111000011100101111010100111010110001011110110000011010100101111110011011000
in 10011101101011011111111101001011011110011111000101111111001000001000111001001111101100110111101001010010110000010101011011111
the 10001111110001000010110001101101110111111001100101100110110110110011101100001001111010000100001101100101000000110100001101010111
woods 10110111101111111011100101001001001001111100010000000101011101111100000100000001100111100000000010000100110101110011001010010111

5 Comments

Thanks for the answer. Is it possible to get a fixed-length hash value for each word in a string? For example, for "A cat in the woods" I would like to get a hash value for each word (in binary representation).
Is it possible for the MD5 result for each word to have a fixed length, in the form '100011...'?
@PanosTheo, what exactly do you mean? Do you want a fixed length for '100011', like 8 bits => '00100011'?
What I really want is to create a b-bit binary hash for every word in, let's say, the string above. In the end I want to implement a simhash algorithm, so I need the b-bit binary hashes of the words to begin with. (Thank you for bearing with me. I know my English is bad.)
Have you read the comment from @m.wasowski? I will update the answer with his idea, which is much cleaner.
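The b-bit hashes asked for in these comments could be sketched as follows. This is only an illustration under stated assumptions: Python 3, MD5 as the underlying hash, the top b bits of the digest kept, and every word given equal weight. The names word_hash_bits and simhash_bits are hypothetical, and the fingerprint step is a simplified, unweighted SimHash-style vote, not a reference implementation.

```python
import hashlib

def word_hash_bits(word, b=64):
    """Return a b-bit binary string for word (hypothetical helper, 1 <= b <= 128)."""
    if not 1 <= b <= 128:
        raise ValueError("b must be between 1 and 128 for MD5")
    value = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)
    # Keep the top b bits of the 128-bit digest, then zero-pad to width b
    return format(value >> (128 - b), "0{}b".format(b))

def simhash_bits(text, b=64):
    """Unweighted SimHash-style fingerprint of text as a b-bit binary string."""
    counts = [0] * b
    for word in text.split():
        # Each word votes +1 for a 1 bit and -1 for a 0 bit at every position
        for i, bit in enumerate(word_hash_bits(word, b)):
            counts[i] += 1 if bit == "1" else -1
    return "".join("1" if c > 0 else "0" for c in counts)

for word in "A cat in the woods".split():
    print(word, word_hash_bits(word, 16))
print(simhash_bits("A cat in the woods", 16))
```

Every word yields exactly b characters regardless of its length, which is the property a bitwise vote needs.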
0

You can use the bin() function:

bin(0xd077ff) # Result: '0b11...'

To remove the 0b prefix, slice the string:

bin(0xd077ff)[2:]
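Stripping the prefix still leaves a variable-length result, because bin() omits leading zeros. One alternative, sketched here under the assumption of Python 3 and UTF-8 input, is to work on the raw digest bytes and format each byte as exactly eight bits. The variable names are illustrative.

```python
import hashlib

# digest() returns the 16 raw bytes of the MD5 hash
digest = hashlib.md5("cat".encode("utf-8")).digest()

# Format every byte as exactly 8 bits so no leading zeros are lost
bits = "".join(format(byte, "08b") for byte in digest)

print(bits)
print(len(bits))  # 128
```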

8 Comments

Thanks for the answer, but the bin() function cannot be used directly on the above string, and it produces results of uneven length. For example, "cat" and "bird" will produce binary strings of different lengths. How can I get a result of fixed length? I hope I was clear. Thanks.
Why can't bin be used? And of course cat and bird will have different lengths in binary. I am confused about what you mean by that.
What I want is to represent each string with a fixed-length hash value in the form '0011100011'.
That's nearly impossible to do unless your strings are going to be of fixed length.
Hashes are displayed as hexadecimal, so you can use bin(int('998fd550322', 16)).
0

A little late, but try this:

binhashcode = ''.join(bin(int(i, 16))[2:].zfill(4) for i in hashcode)
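Since each hex digit maps to exactly four bits, converting digit by digit should match converting the whole digest at once with zero padding. A small check of that equivalence is sketched below, assuming Python 3 and using format() with the '04b' spec in place of bin(); the names per_digit and whole are illustrative.

```python
import hashlib

hashcode = hashlib.md5("cat".encode("utf-8")).hexdigest()

# Per-digit conversion: every hex character becomes exactly 4 bits
per_digit = "".join(format(int(ch, 16), "04b") for ch in hashcode)

# Whole-number conversion, zero-padded to the 128-bit digest width
whole = format(int(hashcode, 16), "0128b")

print(per_digit == whole)
```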
