Python base64.b32hexencode not creating expected result


I expect the code below to return TPLIG0 as the base32hex (extended hex alphabet) value of 1 billion (1,000,000,000). Instead I get 7EDCK00=.

Here is my code:

import base64
num = 1000000000
needed_bytes = num.to_bytes((num.bit_length() + 7) // 8, byteorder='big')
result = base64.b32hexencode(needed_bytes).decode('utf-8')
print(result)

I have tried with b32hexencode, b32encode, as well as byteorder='little' and byteorder='big', but I cannot reproduce the expected result.

If I replace the computed length with a hardcoded 5, as in num.to_bytes(5, byteorder='big'), I get something close to the expected result: 00TPLIG0. What is going on here?

I'm using Python 3.10.7 on Windows and 3.11.0 on Ubuntu (both produce the same output).

Mark Tolonen On BEST ANSWER

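(num.bit_length() + 7) // 8 is 4, not 5. b32hexencode consumes its input in 5-bit groups starting from the most significant bit, so when the input is not a multiple of 40 bits (5 bytes) it zero-fills the last group, pads the output with =, and the digits no longer line up with the integer's base-32 digits.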

>>> base64.b32hexencode(bytes([1,2,3,4]))
b'0410610='
>>> base64.b32hexencode(bytes([1,2,3,4,5]))
b'04106105'

Use (num.bit_length() + 39) // 40 * 5 to round the byte count up to a multiple of 5, encode with base64.b32hexencode(), then strip the leading zeros from the result:

import base64
num = 1_000_000_000

def convert(n):
    if n == 0:  # zero has bit_length 0 and would encode to an empty string, so handle it specially
        return '0'
    num_bytes = (n.bit_length() + 39) // 40 * 5
    needed_bytes = n.to_bytes(num_bytes, byteorder='big')
    result = base64.b32hexencode(needed_bytes).lstrip(b'0')
    return result.decode() # bytes -> str

def display(n):
    result = convert(n)
    verify = int(result, 32)
    print(f'{result:>9} {verify:17,}')
    
display(num)
for i in range(9):
    n = 2**(i * 5)
    display(n - 1)
    display(n)

Output for the OP's value, followed by the values on either side of each 5-bit boundary:

   TPLIG0     1,000,000,000
        0                 0
        1                 1
        V                31
       10                32
       VV             1,023
      100             1,024
      VVV            32,767
     1000            32,768
     VVVV         1,048,575
    10000         1,048,576
    VVVVV        33,554,431
   100000        33,554,432
   VVVVVV     1,073,741,823
  1000000     1,073,741,824
  VVVVVVV    34,359,738,367
 10000000    34,359,738,368
 VVVVVVVV 1,099,511,627,775
100000000 1,099,511,627,776
Tranbi On

You are actually trying to convert an integer to its base-32 representation. The base64 module is designed to encode byte strings, not to convert numbers between bases.

Converting from one base to another is already covered in other SO answers (see here for instance). In your case, you could implement the following function:

def base10to32(n,symbols="0123456789ABCDEFGHIJKLMNOPQRSTUV"):
    return (base10to32(n//32)+symbols[n%32]).lstrip("0") if n>0 else "0"

base10to32(1000000000)

Or if you prefer to use an existing module, numpy can handle this type of conversion:

import numpy as np
np.base_repr(1000000000, 32)

Output:

TPLIG0
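If you would rather avoid both recursion and the numpy dependency, the same conversion can be written as a loop. This is only a sketch: the name to_base32hex is hypothetical, and rejecting negative input is my own assumption rather than something the question asks for.

```python
def to_base32hex(n, symbols="0123456789ABCDEFGHIJKLMNOPQRSTUV"):
    # Hypothetical helper: integer -> base-32 string using the base32hex alphabet.
    if n < 0:
        raise ValueError("n must be non-negative")  # assumption: negatives are not supported
    if n == 0:
        return symbols[0]
    out = []
    while n:
        n, r = divmod(n, 32)   # peel off the least significant base-32 digit
        out.append(symbols[r])
    return "".join(reversed(out))  # digits were collected least significant first

print(to_base32hex(1000000000))           # TPLIG0
print(int(to_base32hex(1000000000), 32))  # 1000000000
```

The built-in int(s, 32) accepts the same digit set, so it works as a quick round-trip check.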