The CBOR docs state that the most efficient encoding (the one using the fewest bytes) should be preferred.
Floats can be encoded as 64-bit, 32-bit or 16-bit floats, or, with extensions (tags), in BigFloat or DecimalFloat formats.
Standard 64-bit encoding uses 9 bytes. Some floating-point values take much less space in an alternative format (e.g. the values 0.0, 1.0 and 1.5 can each be represented in 4 bytes using BigFloats).
Other values are better represented as standard floats (e.g. 0.123456789 takes 9 bytes as a 64-bit float but 29 bytes as a BigFloat).
The cbor2 Python library emits BigFloats for the Decimal type and CBOR floats for the float type.
How can I get cbor2 to automatically emit the most efficient type depending on the actual value?
I have tried various arbitrary values using cbor2.dumps(). floats are always encoded as CBOR floats, and Decimal types are always encoded as CBOR BigFloats.
>>> from decimal import Decimal
>>> from cbor2 import dumps
>>> x=0.0 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
0.0
b'\xfb\x00\x00\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82\x00\x00'
4
>>> x=1.0 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
1.0
b'\xfb?\xf0\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82\x00\x01'
4
>>> x=1.5 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
1.5
b'\xfb?\xf8\x00\x00\x00\x00\x00\x00'
9
b'\xc4\x82 \x0f'
4
>>> x=0.123456789 ; x ; d1 = dumps(x) ; d1 ; len(d1) ; dx = Decimal(x) ; d2 = dumps(dx) ; d2 ; len(d2)
0.123456789
b'\xfb?\xbf\x9a\xdd79c_'
9
b'\xc4\x8287\xc2W\x80\xe5\x18Js\xc0\xe4\x8f-\xf1\xc9\xf0\x90\xf4u%+\x93\xa7\n\x88\xa2?'
29
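A likely reason for the 29-byte result, sketched with the standard library only: Decimal(x) converts the binary float exactly, so the mantissa carries every digit of the underlying double. Going through str(x) instead (an alternative that is not used in the session above, and that assumes the printed digits are the value actually intended) leaves a far shorter mantissa to encode.

```python
from decimal import Decimal

x = 0.123456789

# Decimal(float) is an exact conversion of the binary double,
# so it keeps every decimal digit of the stored value.
exact = Decimal(x)

# Decimal(str(float)) keeps only the digits Python prints for x.
# Assumption: those printed digits are the value you really mean.
short = Decimal(str(x))

exact_digits = len(exact.as_tuple().digits)
short_digits = len(short.as_tuple().digits)

print("exact:", exact_digits, "digits, exponent", exact.as_tuple().exponent)
print("short:", short_digits, "digits, exponent", short.as_tuple().exponent)
```

The digit count of the mantissa is what drives the encoded size, since a long mantissa has to be written as a big integer.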
So I found the answer is a combination of using the canonical=True argument to dumps() and casting the floats to lower-precision floats (using numpy) where suitable (i.e. where any loss of precision is acceptable).
NOTE: you have to cast back to a Python float, as cbor2 can't encode numpy classes at the moment.
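To illustrate the idea, here is a minimal standard-library sketch of picking the narrowest IEEE 754 width that still round-trips a value exactly, which is my understanding of what canonical encoding does for floats. It does not call cbor2 or numpy: shortest_cbor_float and round_to_float32 are hypothetical helper names, and struct stands in for the numpy cast.

```python
import struct


def shortest_cbor_float(value: float) -> bytes:
    """Encode a float in the narrowest width that round-trips exactly."""
    # CBOR initial bytes: 0xf9 = 16-bit, 0xfa = 32-bit, 0xfb = 64-bit float.
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f"), (0xFB, ">d")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # value is out of range for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # NaN never compares equal to itself, so it ends up here.
    # This sketch simply falls back to the 64-bit form.
    return b"\xfb" + struct.pack(">d", value)


def round_to_float32(value: float) -> float:
    """Lossy: round to the nearest 32-bit float, returned as a Python float."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


for v in (0.0, 1.0, 1.5, 0.123456789):
    exact = shortest_cbor_float(v)
    lossy = shortest_cbor_float(round_to_float32(v))
    print(v, len(exact), len(lossy))
```

With cbor2 and numpy installed, the same idea would presumably be written as dumps(float(numpy.float32(x)), canonical=True), where the outer float() is the cast back to a Python float mentioned in the note.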