We use struct.unpack to read a binary file created from a dump of C structure fields and their values (integers and strings). The unpacked tuples are used to build an intermediate dictionary representation of the fields and their values, which is later written to a text output file.
The text output file displays the strings as below:
ID = b'000194901137\x00\x00\x00\x00'
timestampGMT = 1489215906
timezoneDiff = -5
timestampPackage = 1489215902
version = 293
type = b'FULL\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
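For context, here is a minimal sketch of the kind of unpacking we do; the format string and record layout shown are illustrative assumptions, not our actual layout:

import struct

# Hypothetical layout: 16-byte ID, four 4-byte signed ints, 16-byte type string.
RECORD_FORMAT = '<16s4i16s'
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)
FIELD_NAMES = ('ID', 'timestampGMT', 'timezoneDiff',
               'timestampPackage', 'version', 'type')

def read_records(path):
    # Yield one dict per fixed-size record in the dump file.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break
            yield dict(zip(FIELD_NAMES, struct.unpack(RECORD_FORMAT, chunk)))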
The program was originally written in Python 2.6, where it worked fine. We used the lambda expression below to strip the unwanted non-printable characters while writing to the text file:
filtered_string = filter(lambda x: x in string.printable, line)
After moving to Python 3.5, this no longer works as-is: filter() now returns a lazy filter object rather than a string, so the result cannot be written out directly.
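For reference, joining the filter object back into a string does restore the old result, but only once the data has already been decoded from bytes to str:

import string

# Python 3: filter() returns a lazy iterator, so join it back into a str.
# This assumes `line` is already a str (i.e. the bytes have been decoded).
filtered_string = ''.join(filter(lambda x: x in string.printable, line))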
What is the Pythonic way to convert these bytes values to equivalent ASCII text (without the trailing NUL '\x00' padding), so that they are written as normal string values?
Also, since there are several thousand entries to process per file (and there are multiple files), we are looking for the most efficient approach in this context.
In Python 2 you could use the str type for both text and binary data interchangeably, and it worked fine. In Python 3, data read from a binary file is of type bytes, which no longer shares a common base class with str as it did in Python 2. Strings embedded in the binary file are therefore read in as bytes literals, which need to be decoded to the str (Unicode) type before they can be displayed or written to a file as normal strings. After I retrieve the tuple from struct.unpack(), I do the decoding as sketched below. For background, read https://docs.python.org/3/howto/pyporting.html#text-versus-binary-data
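A minimal sketch of that decoding step, assuming the embedded strings are ASCII (the function name and encoding are illustrative and may differ from your data):

def to_text(value):
    # Decode bytes fields to str and strip the trailing NUL padding;
    # pass other values (ints) through str() unchanged.
    if isinstance(value, bytes):
        # 'ascii' is an assumption; substitute the file's real encoding if it differs.
        return value.decode('ascii').rstrip('\x00')
    return str(value)

# Example: to_text(b'000194901137\x00\x00\x00\x00') returns '000194901137'

With this applied to each unpacked field, the intermediate dictionary holds plain str values and the output file no longer shows the b'...' prefixes or \x00 escapes.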