So far I thought they are the same as bytes are made of bits and that both side needs to know byte size and endiannes of the other side and transform stream accordingly. However Wikipedia says that byte stream
!= bit stream
(https://en.wikipedia.org/wiki/Byte_stream ) and that bit streams
are specifically used in video coding (https://en.wikipedia.org/wiki/Bitstream_format). In this RFC https://www.rfc-editor.org/rfc/rfc107 they discuss these 2 things and describe Two separate kinds of inefficiency arose from bit streams.
. My questions are:
- what's the real difference between byte stream and bit stream?
- how bit stream works if it's different from byte stream? How does a receiving side know how many bits to process at a given time?
- why is bit stream better than byte stream in some cases?
This is a pretty broad question, I'll have to give the 10,000 feet view. Bit streams are common in two distinct usages:
very low-level, it is the fundamental way that lots of hardware operates. Best examples are the data stream that comes off a hard disk or a optical disk or the data sent across a transmission line, like a USB cable or the coax cable or telephone line through which you received this post. The RFC you found applies here.
high-level, they are common in data compression, a variable number of bits per token allows packing data tighter. Huffman coding is the most basic way to compress. The video encoding subjects you found applies here.
Byte streams are highly compatible with computers which are byte-oriented devices and the ones you'll almost always encounter in programming. Bit streams are much more low-level, only system integration engineers ever worry about them. While the payload of a bit stream is often the bytes that a computer is interested in, more overhead is typically required to ensure that the receiver can properly interpret the data. There are usually a lot more bits than necessary to encode the bytes in the data. Extra bits are needed to ensure that the receiver is properly synchronized and can detect and perhaps correct bit errors. NRZ encoding is very common.
The RFC is quite archeological, in 1971 they were still hammering out the basics of getting computers to talk to each other. Back then they were still close to the transmission line behavior, a bit stream, and many computers did not yet agree on 8 bits in a byte. They are fretting over the cost of converting bits to local bytes on very anemic hardware and the need to pack as many bits into a message as possible.
The protocol determines that, like that RFC does. In the case of a variable length bit encoding it is bit values themselves that determine it, like Huffman coding does.
Covered already I think, because it is better match for its purpose. Either because the hardware is bit-oriented or because variable bit-length coding is useful.