Converting text to binary. 2 part issue


OK... I'm writing a Python script to convert text to binary.
I'm using easygui to quickly convert short phrases and test for issues.
The main workhorse is:

BText = bin(int(binascii.hexlify(DText),16))

The value is also returned through an easygui dialog. But when I type in a single character, I get a 15-character response.
So a) I'm getting an extra character somewhere (before the workhorse?) and
b) why isn't the returned value 16 characters?
I've also tried 4-letter words and various other sizes, and I always end up 7 characters too long. So I'm picking up an extra input character somewhere, and the result is always one character short of a full multiple of 8.
I don't know a thing about the underlying processes that make this happen, but it's something I should know. Thanks.

Alright, I tried for an hour to post my code and I guess it isn't properly formatted. I'm running Python 2.7.8.
I use easygui.textbox both to receive input and to show the output.
The input is run through the workhorse above. The 0b prefix is then stripped from the result using BText = str(BText)[2:]. The resulting string is then shown to the user via easygui.textbox.

EasyGui

#Imports
import OTPModule as TP
import easygui as EG

Plain = EG.textbox(msg='Enter Message', title='OTP', text='Hi', codebox=1)
XORD, Key = TP.Gather(Plain)
EG.textbox(msg='XORD', title='OTP - XOR Message', text=XORD, codebox=1)
EG.textbox(msg='Key', title='OTP - Key', text=Key, codebox=1)


raw_input("Press Enter To Decrypt")

XOrd = EG.textbox(msg='Enter XOR Message', title='OTP', text='01', codebox=1)
Key = EG.textbox(msg='Enter Key', title='OTP', text='10', codebox=1)
Plain = TP.Release(XORD, Key)
EG.textbox(msg='ASCII', title='OTP', text=Plain, codebox=1)

raw_input("Press Enter To Exit")

Module..

#################
#  One Time Pad #
#    (Module)   #
#  Python 2.7.8 #
#    Nov 2014   #
#Retler & Amnite#
#################


#imports
import binascii
import random

def Gather(DText):

  print(DText)#Debug

  #First Things First... Convert To Binary
  BText = bin(int(binascii.hexlify(DText),16))

  #Strip 0b
  BText = str(BText)[2:]

  print(BText)#Debug

  #Generate Key
  KText = []
  a = 0

  while a < len(BText):
    b = random.randint(0,1)
    KText.append(b)
    a = a+1

  KText = ''.join(map(str,KText))
  print(KText)#Debug
  print a

  #So apparently we have to define the XOR ourselves
  #0^0=0, 0^1=1, 1^0=1, 1^1=0
  EText = []
  a = 0

  while a < len(BText):
    if BText[a] == KText[a]:
      EText.append(0)
    else:
      EText.append(1)
    a = a+1

  EText = ''.join(map(str,EText))

  return(EText, KText)

######The Other Half#######

def Release(EText, KText):

  print(EText)#Debug
  print(KText)#Debug

  #XOR
  BText = []
  a = 0

  while a < len(EText):
    if EText[a] == KText[a]:
      BText.append(0)
    else:
      BText.append(1)
    a = a+1

  BText = ''.join(map(str,BText))

  print(BText)#Debug

  #Binary To ASCII (Re-Add 0b)
  DText = int('0b'+BText,2)
  DText = binascii.unhexlify('%x' % DText)

  return(DText)

There are 1 best solutions below


Edit

Having installed easygui and tried textbox(), I see that it returns unicode strings with a trailing newline character:

>>> Plain = EG.textbox(msg='Enter Message', title='OTP', text='Hi', codebox=1)
# hit OK in text box
>>> Plain
u'Hi\n'

That's the source of the additional character. You can get rid of the newline with:

>>> Plain = Plain.rstrip()
>>> Plain
u'Hi'
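To see how the trailing newline accounts for the 15-character result, here is a small sketch. The helper name raw_bits is made up for illustration; it just wraps the workhorse expression from the question, and byte literals stand in for what the textbox returns.

```python
import binascii

def raw_bits(data):
    # Same conversion as the question's workhorse, with the '0b' prefix stripped.
    return bin(int(binascii.hexlify(data), 16))[2:]

# 'a' followed by the newline that the textbox appends: 2 bytes = 16 bits,
# minus the one leading zero bit that bin() does not emit.
with_newline = raw_bits(b'a\n')

# After rstrip() only 'a' is left: 1 byte = 8 bits, minus one leading zero.
stripped = raw_bits(b'a\n'.rstrip())

print(len(with_newline), with_newline)
print(len(stripped), stripped)
```

The same arithmetic fits the 4-letter case: 5 bytes is 40 bits, one leading zero is dropped, which leaves 39 characters, i.e. 7 more than the 32 you expected.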

Note also that a unicode string is returned. You may run into decoding issues if you enter non-ASCII data, e.g. u'\u4000' (= 䀀). hexlify() will blow up on that, but that's another problem.
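One way to sidestep that, as a rough sketch rather than the only option, is to encode the unicode string to bytes yourself before handing it to hexlify(). The helper name to_hex and the choice of UTF-8 are assumptions made for this example; use whatever encoding suits your data.

```python
import binascii

def to_hex(text, encoding='utf-8'):
    # Encode explicitly so hexlify() always receives bytes, never unicode.
    return binascii.hexlify(text.encode(encoding)).decode('ascii')

ascii_hex = to_hex(u'Hi')      # plain ASCII text: one byte per character
cjk_hex = to_hex(u'\u4000')    # non-ASCII text: three bytes in UTF-8

print(ascii_hex)
print(cjk_hex)
```

Whoever decrypts the message has to decode with the same encoding, so it's worth fixing the choice in one place.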

Original answer

I'm not familiar with easygui but I am guessing that it's producing UTF-16 output or some other multi-byte encoded data. Try printing the input character using repr(input_string) or similar. That could be why you are apparently seeing an additional character when inputting only a single character:

>>> from binascii import hexlify
>>> bin(int(hexlify('a'), 16))[2:]
'1100001'
>>> bin(int(hexlify('a'.encode('utf-16-le')),16))[2:]
'110000100000000'

In the first example, a single character is translated to 7 bits (leading zeros not emitted by bin()). In the second example, the UTF-16 encoding is 2 bytes long:

>>> 'a'.encode('utf-16-le')
'a\x00'

and hence the result is a 15-bit string. Again, any leading zero bits are not emitted.
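If you want a full 8 bits per byte, you have to put the leading zeros back yourself. Here is a minimal sketch that formats each byte separately instead of going through bin(int(hexlify(...))). The names text_to_bits and bits_to_text are invented for this example, and UTF-8 is an assumed encoding.

```python
def text_to_bits(text, encoding='utf-8'):
    # Format every byte as exactly 8 binary digits so leading zeros survive.
    data = bytearray(text.encode(encoding))
    return ''.join(format(byte, '08b') for byte in data)

def bits_to_text(bits, encoding='utf-8'):
    # Reverse the conversion, reading the bit string 8 characters at a time.
    if len(bits) % 8:
        raise ValueError('bit string length must be a multiple of 8')
    data = bytearray(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return bytes(data).decode(encoding)

bits = text_to_bits(u'Hi')
print(len(bits), bits)
print(bits_to_text(bits))
```

Because the bit string is always a multiple of 8 long, the key and the XOR output keep the same length, and the reverse step doesn't have to guess how many zeros were dropped.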