Why encode a binary value as a byte instead of a bit?


I am used to seeing the encoding of flags (i.e., binary values) as bits. See, for example, the SYN and ACK flags in the TCP header.

I recently stumbled upon the specification of Certificate Transparency: https://www.rfc-editor.org/rfc/rfc6962.html

Long story short: the main building block of a Certificate Transparency log is a Merkle tree, a tree of hashes. To prevent second preimage attacks, the specification requires distinguishing leaf nodes from non-leaf nodes in the tree, which it does by prepending 0x00 to leaf nodes and 0x01 to non-leaf nodes before hashing.
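For concreteness, here is a minimal Python sketch of the prefix-byte scheme (my simplification; the RFC's full Merkle Tree Hash is defined over an ordered list of entries, and the function names are mine):

```python
import hashlib

def hash_leaf(data: bytes) -> bytes:
    # Leaves are hashed with a 0x00 prefix byte...
    return hashlib.sha256(b"\x00" + data).digest()

def hash_node(left: bytes, right: bytes) -> bytes:
    # ...and interior nodes with a 0x01 prefix byte, so a leaf whose
    # content happens to equal the concatenation of two child hashes
    # can never produce the same digest as an interior node.
    return hashlib.sha256(b"\x01" + left + right).digest()
```

Because the domains are separated by the first byte, `hash_leaf(left + right)` and `hash_node(left, right)` differ even when fed identical bytes.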

I'm a bit puzzled because even though this information could be encoded in one bit, the RFC specifies to encode it as a byte (0x00 or 0x01). I am not sure what the rationale is.

To clarify, I understand why they separate leaves from non-leaf nodes and what second preimage attacks are. My question is: why encode one bit of information into one whole byte? I suspect it has to do with the properties of hash functions, but perhaps there is a simpler explanation.

1 Answer
SHA-256, like many (but not all) other hash algorithms, is technically defined to operate on bits. Within the standard defining SHA-256, it is acceptable to use an input whose length in bits is not a multiple of eight.

However, as a practical matter, this is extremely inconvenient to work with. Computers store and address memory as a series of bytes, so the vast majority of programs are designed to work on whole bytes. Similarly, even though SHA-256 is defined over bit lengths that are not full bytes, the overwhelming majority of implementations support only byte-sized inputs. Hence, it makes sense to define the input as a series of whole bytes, even if that is slightly wasteful, because otherwise implementations become substantially more complicated. Typically, standards just define the other bits as reserved for future use and require that they be set to zero in the meantime.
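To make the byte-alignment point concrete, here is a small Python illustration (mine, not from the RFC) of why a 1-bit prefix is awkward for byte-oriented code:

```python
msg = b"hi"

# Prefixing a whole byte keeps the message byte-aligned:
byte_prefixed = b"\x00" + msg  # 3 bytes, trivially hashable

# Prefixing a single bit does not: the result is 17 bits long, which
# byte-oriented APIs such as hashlib cannot accept without extra
# padding conventions shifting every subsequent byte.
bit_prefixed = "0" + "".join(f"{b:08b}" for b in msg)
print(len(byte_prefixed) * 8, len(bit_prefixed))  # 24 vs. 17 bits
```

A one-byte prefix costs seven wasted bits per hash input but lets every implementation keep working on plain byte strings.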

There are algorithms, such as BLAKE2, that operate only on whole bytes, because in practice almost nobody wants to process non-byte-aligned bit streams outside of a test suite or compliance exercise. Even non-cryptographic algorithms that do operate on bit streams, such as compression algorithms, typically pad to a full byte for everyone's convenience.