Given a character, how can we transform its UTF-8 encoding to bits in Python?
As an example, 'a' corresponds to 01100001. I am aware of ord(), but something like bin(ord('a'))[2:] returns 1100001, without the leading 0. Of course, I can pad it to 8 bits with zfill(8), but I would like to know if there is a more Pythonic way of doing this. For instance, if we do not know in advance how many bits are required, the zfill(8) approach may no longer work, since the result may be 16 bits long.
Python 3 strings contain Unicode code points, not "UTF-8 characters". You can use ord() to get the Unicode code point value of a character, and .encode() to convert the string to UTF-8 bytes. Then format each byte as 8-digit binary text, and .join() them together.
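The steps above can be sketched as follows. This is a minimal illustration, not necessarily the code from the original answer; the helper name utf8_bits and its sep parameter are hypothetical, chosen only for this example.

```python
def utf8_bits(text: str, sep: str = "") -> str:
    """Return the UTF-8 encoding of `text` as a string of binary digits.

    `utf8_bits` and `sep` are illustrative names, not a standard API.
    """
    # str.encode() yields a bytes object; iterating over it gives ints 0..255.
    # The format spec "08b" renders each int as binary, zero-padded to 8 digits.
    return sep.join(f"{byte:08b}" for byte in text.encode("utf-8"))


print(utf8_bits("a"))            # 01100001
print(utf8_bits("é", sep=" "))   # 11000011 10101001
print(utf8_bits("€", sep=" "))   # 11100010 10000010 10101100

# The code point and the UTF-8 bytes are different things:
print(ord("é"))                  # 233 (U+00E9), yet UTF-8 needs two bytes
```

Because every UTF-8 byte is padded to exactly 8 digits, there is no need to know in advance how many bytes a character takes: the output is simply 8 bits per encoded byte.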