Python 3.6 equivalent of md5 digest() method?

I'm having trouble getting the same result from the md5 digest() method in Python 3.6 as I get in Python 2.7.

Python 2.7:

import md5

encryption_base  = 'cS35jJYp15kjQf01FVqA7ubRaNOXKPmYGRbLUiimX0g3frQhzOZBmTSni4IEjHLWYMMioGaliIz5z8u2:abcdefghkmnopqrstuvwxyz:4'
digest = md5.new (encryption_base).digest()
print(digest)

#T┼ǃ×ÞRK(M<¶┤#  ²

Python 3.6:

from hashlib import md5

encryption_base  = 'cS35jJYp15kjQf01FVqA7ubRaNOXKPmYGRbLUiimX0g3frQhzOZBmTSni4IEjHLWYMMioGaliIz5z8u2:abcdefghkmnopqrstuvwxyz:4'
digest = md5(encryption_base.encode()).digest()
print(digest)

#b'T\xc5\x80\x9f\x9e\xe8RK(M<\xf4\xb4#\t\xfd'

How can I get the same output as in the Python 2.7 result? .hexdigest() is not what I need here either.

Best answer:

You have exactly the same result: a bytestring. The only difference is that in Python 3, printing a bytes object gives you a debugging-friendly representation (its repr()), not the raw bytes. That's because raw bytes are not necessarily printable, and print() writes text (Unicode strings), not bytes.
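
To make this concrete, here is a small sketch (the variable names and the shortened input are illustrative, not taken from the question) showing that digest() returns a 16-byte bytes object and that what print() displays is simply its repr():

```python
from hashlib import md5

# Shortened, hypothetical input; any ASCII text behaves the same way.
data = 'abcdefghkmnopqrstuvwxyz:4'
digest = md5(data.encode('ascii')).digest()

# The digest is a bytes object holding 16 raw bytes (MD5 is 128 bits).
is_bytes = isinstance(digest, bytes)
length = len(digest)

# print() calls str() on its argument; for bytes, that is the repr(),
# i.e. the b'...' form with \x escapes for non-printable bytes.
shown_by_print = str(digest)
same_as_repr = shown_by_print == repr(digest)

print(is_bytes, length, same_as_repr)
```

The underlying bytes are what matter if you compare digests or send them over the wire; the b'...' form is only how Python 3 chooses to display them.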

If you must have the same output, write the bytes directly to the stdout buffer, bypassing the Unicode TextIOWrapper() that takes care of encoding text to the underlying locale codec:

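import sys
from hashlib import md5

# encryption_base is the str defined in the question
digest = md5(encryption_base.encode('ASCII')).digest()
# Write the raw bytes directly, bypassing print() and the text layer
sys.stdout.buffer.write(digest + b'\n')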

Note that you must either define your encryption_base value as a bytes value too, or at least encode it with an explicit codec, ASCII, like I did above.

Defining it as a bytestring gives you the same value as in Python 2 without encoding:

encryption_base  = b'cS35jJYp15kjQf01FVqA7ubRaNOXKPmYGRbLUiimX0g3frQhzOZBmTSni4IEjHLWYMMioGaliIz5z8u2:abcdefghkmnopqrstuvwxyz:4'

When you use str.encode() without explicitly passing an encoding, you are encoding to UTF-8. If your encryption_base string consists only of ASCII codepoints, the result is the same, but not if it contains any non-ASCII codepoints (Latin-1 or higher). Don't conflate bytes with Unicode codepoints! See https://nedbatchelder.com/text/unipain.html to fully understand the difference and how it applies to Python 2 and 3.
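
As a rough illustration of the point about codecs (the sample strings are made up for this sketch), pure-ASCII text encodes to the same bytes under ASCII and UTF-8, so the digest is unaffected. A single non-ASCII character is enough to change the bytes, and therefore the digest:

```python
from hashlib import md5

ascii_text = 'abc:4'            # hypothetical ASCII-only input
non_ascii_text = 'abc\xe9:4'    # same idea, with one non-ASCII character (U+00E9)

# ASCII-only text: UTF-8 (the default for str.encode()) and ASCII agree.
ascii_same = ascii_text.encode() == ascii_text.encode('ascii')

# Non-ASCII text: U+00E9 is one byte in Latin-1 but two bytes in UTF-8.
latin1_bytes = non_ascii_text.encode('latin-1')
utf8_bytes = non_ascii_text.encode('utf-8')
bytes_differ = latin1_bytes != utf8_bytes

# Different input bytes produce different MD5 digests.
digests_differ = md5(latin1_bytes).digest() != md5(utf8_bytes).digest()

print(ascii_same, bytes_differ, digests_differ)
```

So if the Python 2 code hashed a byte string in some particular encoding, the Python 3 code has to encode to that same encoding to reproduce the digest.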