QDataStream readQString(): how to read a UTF-8 string


I am trying to decode UDP packet data from an application that encoded the data using Qt's QDataStream methods, but I am having trouble decoding the string fields. The docs say the strings were encoded as utf8. The Python QDataStream binding only has a readQString() method for strings. Numbers seem to decode fine, but the stream position gets out of step as soon as the first string is decoded improperly.

How can I decode these UTF-8 strings?

I am using some documentation from the source project to interpret the encoding: the description in the header file NetworkMessage.hpp from wsjtx-2.2.2.tgz.

Header:
   32-bit unsigned integer magic number 0xadbccbda
   32-bit unsigned integer schema number

For example, the Heartbeat message is documented with comments like this:

Heartbeat     Out/In    0                       quint32
                             Id (unique key)        utf8
                             Maximum schema number  quint32
                             version                utf8
                             revision               utf8

Example data from the socket when a heartbeat message (type 0) is received:

b'\xad\xbc\xcb\xda\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x06WSJT-X\x00\x00\x00\x03\x00\x00\x00\x052.1.0\x00\x00\x00\x0624fcd1'

def jt_decode_heart_beat(i):
    """
    Heartbeat     Out/In    0                      quint32
                             Id (unique key)        utf8
                             Maximum schema number  quint32
                             version                utf8
                             revision               utf8
    :param i: QDataStream
    :return: JT_HB_ID,JT_HB_SCHEMA,JT_HB_VERSION,JT_HB_REVISION
    """
    JT_HB_ID = i.readQString()
    JT_HB_SCHEMA = i.readInt32()
    JT_HB_VERSION = i.readQString()
    JT_HB_REVISION = i.readQString()
    print(f"HB:ID={JT_HB_ID} JT_HB_SCHEMA={JT_HB_SCHEMA} JT_HB_VERSION={JT_HB_VERSION} JT_HB_REVISION={JT_HB_REVISION}")
    return (JT_HB_ID, JT_HB_SCHEMA, JT_HB_VERSION, JT_HB_REVISION)

while 1:
    data, addr = s.recvfrom(1024)
    b = QByteArray(data)
    i = QDataStream(b)
    JT_QT_MAGIC_NUMBER  = i.readInt32()
    JT_QT_SCHEMA_NUMBER = i.readInt32()
    JT_TYPE = i.readInt32()

    if JT_TYPE == 0:
        # Heart Beat
        jt_decode_heart_beat(i)
    elif JT_TYPE == 1:
        jt_decode_status(i)

There are 2 best solutions below

baler1992 On

Long story short, the WSJT-X UDP protocol I was reading does not encode its strings as QString, so it was wrong to expect that i.readQString() would work.

Instead, each string is encoded as a QInt32 giving the length in bytes, followed by that many bytes of UTF-8 text, which is how a QByteArray is serialized.

I successfully encapsulated this functionality in a function:

def jt_decode_utf8_str(i):
    """
    strings are encoded with an int 32 indicating size
    and then an array of bytes in utf-8 of length size
    :param i:
    :return: decoded string
    """
    sz = i.readInt32()
    b = i.readRawData(sz)
    return b.decode("utf-8")
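
For anyone wondering why readQString() misbehaves on these fields: Qt's documented serialization format for QString is a quint32 byte count followed by UTF-16 data, so the UTF-8 payload gets read as 16-bit code units. The sketch below imitates that interpretation in plain Python, without Qt, on the first string field of the example packet. The helper name read_as_qstring is made up for illustration, and the odd-length check is my assumption about where things go wrong, not Qt's actual code path.

```python
import struct

def read_as_qstring(buf, offset=0):
    """Parse a field the way a QString is laid out (hypothetical helper):
    big-endian quint32 byte count, then UTF-16 big-endian data."""
    (nbytes,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    raw = buf[start:start + nbytes]
    if nbytes % 2:
        # An odd number of bytes cannot be a whole number of UTF-16 units.
        raise ValueError("odd byte count cannot be UTF-16")
    return raw.decode("utf-16-be"), start + nbytes

# First string field of the example packet: length 6, then b"WSJT-X".
field = b"\x00\x00\x00\x06WSJT-X"
text, end = read_as_qstring(field)
print(len(text), text == "WSJT-X")  # 3 False: six bytes became three wrong characters
```

The six bytes of "WSJT-X" happen to form three UTF-16 code units, so this field decodes to garbage without an error. The next string, "2.1.0", is 5 bytes long and cannot be UTF-16 at all, which is presumably where the real stream gets out of step.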

ckuhtz On

I ended up using this function, with special handling for null strings, which are encoded with a length of 0xffffffff. Also, the length values are QUInt32, not QInt32.

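def decode_utf8_str(buf):
    # The length prefix is an unsigned 32-bit integer
    length = buf.readUInt32()
    if length == 0xffffffff:  # null string
        return ""
    # Read exactly `length` bytes and decode them as UTF-8
    raw = buf.readRawData(length)
    return raw.decode("utf-8")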
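
If you would rather not depend on Qt, the same layout can be decoded with Python's standard struct module. This is a minimal sketch, assuming the big-endian byte order that QDataStream uses by default and the Heartbeat field order quoted in the question. The function names and dictionary keys are my own choices, and it returns None for a null string instead of "" so the two cases stay distinguishable.

```python
import struct

MAGIC = 0xADBCCBDA     # magic number from the header description
NULL_LEN = 0xFFFFFFFF  # length value that marks a null string

def read_u32(buf, pos):
    """Read a big-endian unsigned 32-bit integer; return (value, new position)."""
    (value,) = struct.unpack_from(">I", buf, pos)
    return value, pos + 4

def read_utf8(buf, pos):
    """Read a length-prefixed UTF-8 string; return (text or None, new position)."""
    length, pos = read_u32(buf, pos)
    if length == NULL_LEN:
        return None, pos
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated string field")
    return buf[pos:end].decode("utf-8"), end

def decode_heartbeat(packet):
    """Decode a type 0 (Heartbeat) datagram into a dict."""
    magic, pos = read_u32(packet, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    schema, pos = read_u32(packet, pos)
    msg_type, pos = read_u32(packet, pos)
    if msg_type != 0:
        raise ValueError("not a heartbeat message")
    client_id, pos = read_utf8(packet, pos)
    max_schema, pos = read_u32(packet, pos)
    version, pos = read_utf8(packet, pos)
    revision, pos = read_utf8(packet, pos)
    return {"schema": schema, "id": client_id, "max_schema": max_schema,
            "version": version, "revision": revision}

# Example datagram from the question
packet = (b"\xad\xbc\xcb\xda\x00\x00\x00\x02\x00\x00\x00\x00"
          b"\x00\x00\x00\x06WSJT-X\x00\x00\x00\x03"
          b"\x00\x00\x00\x052.1.0\x00\x00\x00\x0624fcd1")
print(decode_heartbeat(packet))
```

On the example datagram this yields schema 2, id "WSJT-X", maximum schema 3, version "2.1.0" and revision "24fcd1".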