Converting binary timestamp to string


I'm trying to parse a proprietary binary format (Wintec NAL) with Python. There is existing, working C code that does the same (author: Dennis Heynlein), which I'm trying to port to Python.

I'm struggling to understand parts of the C-code. Here's the definition of the binary format in C:

/*
 * File extension:. NAL
 * File format: binary, 32 byte fixed block length
 */

/*
 * For now we will read raw structs direct from the data file, ignoring byte
 * order issues (since the data is in little-endian form compatible with i386)
 *
 * XXX TODO:  write marshalling functions to read records in the proper
 * byte-order agnostic way.
 */
#pragma pack (1)

typedef struct nal_data32 {
  unsigned char point_type; /* 0 - normal, 1 - start, 2 - marked */

  unsigned char padding_1;

  unsigned int second: 6, minute: 6, hour: 5;
  unsigned int day: 5, month: 4, year: 6; /* add 2000 to year */

  signed int latitude;    /* divide by 1E7 for degrees */
  signed int longitude;   /* divide by 1E7 for degrees */

  unsigned short height;    /* meters */

  signed char temperature;  /* °C */

  unsigned short pressure;  /* mbar */

  unsigned char cadence;    /* RPM */
  unsigned char pulse;    /* BPM */

  signed char slope;    /* degrees */

  signed short compass;   /* °Z axis */
  signed short roll;    /* °X axis */
  signed short yaw;   /* °Y axis */

  unsigned char speed;    /* km/h */

  unsigned char bike;   /* ID# 0-3 */

  unsigned char padding_2;
  unsigned char padding_3;
} nal_t;

I'm using python-bitstring to replicate this functionality in Python, but I have difficulty understanding the time format given above and adapting it to Python.

from bitstring import ConstBitStream
nal_format=('''
    uint:8,
    uint:8,
    bin:32,
    intle:32,
    intle:32,
    uint:16,
    uint:8,
    uint:16,
    uint:8,
    uint:8,
    uint:8,
    uint:16,
    uint:16,
    uint:16,
    uint:8,
    uint:8,
    uint:8,
    uint:8
''')

f = ConstBitStream('0x01009f5a06379ae1cb13f7a6b62bca010dc703000000c300fefff9ff00000000')
f.pos=0

#type,padding1,second,minute,hour,day,month,year,lat,lon,height,temp,press,cad,pulse,slope,compass,roll,yaw,speed,bike,padding2,padding3=f.peeklist(nal_format)

type,padding1,time,lat,lon,height,temp,press,cad,pulse,slope,compass,roll,yaw,speed,bike,padding2,padding3=f.readlist(nal_format)

print type
print padding1
#print second 
#print minute
#print hour
#print day
#print month
#print year
print time
print lat
print lon

While I've figured out that latitude and longitude have to be defined as little-endian, I have no idea how to adapt the 32-bit-wide timestamp so it fits the format given in the C definition. (I also couldn't figure out a matching mask for "height", so I didn't try the fields after it.)

These are the values for the hex-string above:

  • date: 2013/12/03-T05:42:31
  • position: 73.3390583° E, 33.2128666° N
  • compass: 195°, roll -2°, yaw -7°
  • alt: 458 meters
  • temp: 13 °C
  • pres: 967 mb

There are 2 best solutions below

Best answer

I'm not familiar with bitstring, so I'll convert your input into packed binary data and then use struct to handle it. Skip to the break if you're uninterested in that part.

import binascii

packed = binascii.unhexlify('01009f5a06379ae1cb13f7a6b62bca010dc703000000c300fefff9ff00000000')

I can go over this part in more detail if you want. It's just turning '0100...' into b'\x01\x00...'.

Now, the only "gotcha" in unpacking this is figuring out that you only want to unpack ONE unsigned int, since that bit field fits into 32 bits (the width of a single unsigned int):

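import struct

# '<' selects little-endian byte order with no padding between fields.
# Named fmt rather than format to avoid shadowing the built-in.
fmt = '<ccIiiHbHBBbhhhBBBB'

struct.unpack(fmt, packed)
Out[49]: 
('\x01',
 '\x00',
 923163295,
 ...
)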

That converts the packed input into values we can use. You can unpack that tuple into your long list of variables, like you were doing before.
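As a rough sketch of that unpacking step (Python 3, standard library only), the tuple can go into a namedtuple so each value carries a name. The field names are taken from the C struct in the question; the `NalRecord` and `NAL_STRUCT` names are hypothetical, and swapping `c` for `B` on the first two bytes (so they come back as integers rather than one-byte strings) is my own choice, not part of the code above.

```python
import binascii
import struct
from collections import namedtuple

# Field names follow the C struct; NalRecord itself is a hypothetical helper.
NalRecord = namedtuple('NalRecord', [
    'point_type', 'padding_1', 'time', 'latitude', 'longitude',
    'height', 'temperature', 'pressure', 'cadence', 'pulse', 'slope',
    'compass', 'roll', 'yaw', 'speed', 'bike', 'padding_2', 'padding_3',
])

# '<' = little-endian, no padding; 'B' instead of 'c' yields ints.
NAL_STRUCT = struct.Struct('<BBIiiHbHBBbhhhBBBB')

packed = binascii.unhexlify(
    '01009f5a06379ae1cb13f7a6b62bca010dc703000000c300fefff9ff00000000')

record = NalRecord(*NAL_STRUCT.unpack(packed))

print(record.time)               # raw 32-bit bit field, still to be split
print(record.latitude / 1e7)     # degrees
print(record.longitude / 1e7)    # degrees
print(record.height, record.temperature, record.pressure)
```

With the sample record from the question, the height, temperature and pressure fields come out as 458, 13 and 967, which matches the expected values listed there.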


Now, your question seemed to be centered around how to mask time (above: 923163295) to get the proper values out of the bit field. That's just a little bit of math:

second_mask = 2**6 - 1
minute_mask = second_mask << 6
hour_mask = (2**5 - 1) << (6+6)
day_mask = hour_mask << 5
month_mask = (2**4 - 1) << (6+6+5+5)
year_mask = (2**6 - 1) << (6+6+5+5+4)

time & second_mask
Out[59]: 31

(time & minute_mask) >> 6
Out[63]: 42

(time & hour_mask) >> (6+6)
Out[64]: 5

(time & day_mask) >> (6+6+5)
Out[65]: 3

(time & month_mask) >> (6+6+5+5)
Out[66]: 12

(time & year_mask) >> (6+6+5+5+4)
Out[67]: 13L

In function form, the whole thing is a bit more natural:

def unmask(num, width, offset):
     return (num & (2**width - 1) << offset) >> offset

Which (now that I think about it) rearranges into:

def unmask(num, width, offset):
     return (num >> offset) & (2**width - 1)

unmask(time, 6, 0)
Out[77]: 31

unmask(time, 6, 6)
Out[78]: 42

#etc

And if you want to get fancy,

from itertools import starmap
from functools import partial

width_offsets = [(6,0),(6,6),(5,12),(5,17),(4,22),(6,26)]

list(starmap(partial(unmask,time), width_offsets))
Out[166]: [31, 42, 5, 3, 12, 13L]

Format all those numbers correctly and finally out comes the expected date/time:

'20{:02d}/{:02d}/{:02d}-T{:02d}:{:02d}:{:02d}'.format(*reversed(_))
Out[167]: '2013/12/03-T05:42:31'
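If a real date object is more useful than a formatted string, the same shifts and masks can feed `datetime` directly. This is a minimal Python 3 sketch; `decode_timestamp` is a hypothetical helper name, and the widths and offsets are the ones derived above from the C bit field.

```python
from datetime import datetime


def decode_timestamp(raw):
    """Split the packed 32-bit value into a datetime (hypothetical helper)."""
    def unmask(width, offset):
        # Shift the field down to bit 0, then keep only `width` bits.
        return (raw >> offset) & ((1 << width) - 1)

    return datetime(
        year=2000 + unmask(6, 26),   # the struct comment says to add 2000
        month=unmask(4, 22),
        day=unmask(5, 17),
        hour=unmask(5, 12),
        minute=unmask(6, 6),
        second=unmask(6, 0),
    )


stamp = decode_timestamp(923163295)
print(stamp.strftime('%Y/%m/%d-T%H:%M:%S'))   # 2013/12/03-T05:42:31
```

One side effect of this design: `datetime` raises `ValueError` for impossible values such as month 0, so a corrupt record fails loudly instead of producing a nonsense date.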

(There is likely a way to do all of this bitwise math elegantly with that bitstring module, but I just find it satisfying to solve things from first principles.)

Another answer

The timestamp in the C structure is a C bit field. The compiler uses the number after the colon to allocate that many bits within the larger field type, in this case an unsigned int (4 bytes). Look here for a better explanation. The big gotcha with bit fields is that the order in which bits are assigned is implementation-defined and usually follows the endianness of the machine, so they aren't very portable.
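Python's standard `ctypes` module can declare bit fields in much the same way as C, which makes the layout easy to experiment with. The sketch below (Python 3) covers only the four timestamp bytes from the sample record; the class name `NalTimestamp` is hypothetical, and, just as in C, the result depends on the platform's bit-field layout, so treat it as an illustration for the usual little-endian machines rather than a portable parser.

```python
import ctypes


class NalTimestamp(ctypes.LittleEndianStructure):
    # Same widths and order as the bit fields in the C struct.
    # The third tuple element is the number of bits for that field.
    _fields_ = [
        ('second', ctypes.c_uint32, 6),
        ('minute', ctypes.c_uint32, 6),
        ('hour', ctypes.c_uint32, 5),
        ('day', ctypes.c_uint32, 5),
        ('month', ctypes.c_uint32, 4),
        ('year', ctypes.c_uint32, 6),   # add 2000 to year
    ]


# Bytes 2..5 of the sample record, i.e. the timestamp field only.
raw = bytes.fromhex('9f5a0637')
ts = NalTimestamp.from_buffer_copy(raw)

print(ctypes.sizeof(NalTimestamp))   # all six fields share one 32-bit unit
print(ts.year + 2000, ts.month, ts.day, ts.hour, ts.minute, ts.second)
```

The six widths add up to 6 + 6 + 5 + 5 + 4 + 6 = 32 bits, so the whole timestamp occupies a single unsigned int.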

There appears to be an error in your Python format declaration. It probably should have an additional 4 byte unsigned int allocated for the date. Something like:

nal_format=('''
    uint:8,
    uint:8,
    bin:32,
    bin:32,
    intle:32,
    intle:32,
''')

To represent the bit field in Python, use a Python Bit Array to represent the bits. Check out this.

One other thing to be aware of is the #pragma pack (1) on the structure. It tells the compiler to align on one-byte boundaries; in other words, don't add any padding between fields. Without it, the compiler typically aligns each field to its natural boundary (for example, a 4-byte int on a 4-byte boundary) and inserts padding where needed. Check here for more information.
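The effect of packing can be seen from Python with the standard `struct` module, whose `<` prefix means "no padding" and whose `@` prefix means "native alignment". This is only a sketch (Python 3): the format string mirrors the field types of the C struct, with `B` used for the unsigned chars, and the `layout`, `packed_size` and `native_size` names are mine.

```python
import struct

# Same field types, in order, as the nal_data32 struct.
layout = 'BBIiiHbHBBbhhhBBBB'

packed_size = struct.calcsize('<' + layout)   # standard sizes, no padding
native_size = struct.calcsize('@' + layout)   # native sizes and alignment

print(packed_size)   # 32, matching the fixed block length of the format
print(native_size)   # platform-dependent; larger wherever padding is inserted
```

The packed size agreeing with the documented 32-byte block length is a quick sanity check that the field list is complete.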