I am trying to turn a boost::dynamic_bitset into a string so I can pass it to a compression function. I am able to convert it using boost::to_string, but that leads to 8x more bits. When I convert back from string to boost::dynamic_bitset, it doesn't reduce the number of bits used by 8x, which would solve my problem: the total space used would end up being the original number of bits, compressed. I am looking either to stop the 8x increase when going from boost::dynamic_bitset -> string, or to reduce the space used by 8x when going from string -> boost::dynamic_bitset.
The data stored in the boost::dynamic_bitset comes from a binary file. Here is the one I am using, but theoretically any file with binary data should work.
When outputting the number of bits, I am calling string.size(), which returns the size in bytes, hence the multiplication by 8. boost::dynamic_bitset::size() returns the size in bits, so my output should be comparing apples to apples, assuming that is all correct.
Here is the output I am currently getting:
Dynamic bitset input bits = 6431936
String from dynamic_bitset bits = 51455488
Dynamic bitset from string bits = 51455488
Here is my code:
#include <cstring>
#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <boost/cstdint.hpp>
#include <boost/dynamic_bitset.hpp>
#include <boost/math/cstdfloat/cstdfloat_types.hpp>

typedef boost::float32_t float32;
typedef boost::uint32_t uint32;
int main() {
    std::vector<unsigned char> data;
    float32 value;
    uint32 internal;

    // Read the file four bytes at a time.
    std::ifstream fin("ex.bin", std::ios::binary);
    while (fin.read(reinterpret_cast<char*>(&internal), sizeof(uint32))) {
        // View the 32 bits as a float (kept from the original code; not used below).
        std::memcpy(&value, &internal, sizeof(internal));
        // Append the four raw bytes to the buffer.
        const unsigned char* cp = reinterpret_cast<const unsigned char*>(&internal);
        for (int i = 0; i < 4; ++i) {
            data.push_back(cp[i]);
        }
    }

    // Build the bitset from the bytes via the documented block-range constructor.
    boost::dynamic_bitset<unsigned char> bitset(data.begin(), data.end());
    std::cout << "Dynamic bitset input bits = " << bitset.size() << "\n";

    // to_string() writes one '0'/'1' character per bit.
    std::string buffer;
    boost::to_string(bitset, buffer);
    std::cout << "String from dynamic_bitset bits = " << buffer.size() * 8 << "\n";

    // Each character of the string becomes one 8-bit block again.
    boost::dynamic_bitset<unsigned char> from_str(buffer.begin(), buffer.end());
    std::cout << "Dynamic bitset from string bits = " << from_str.size() << "\n";
    return 0;
}
The size() method on C++ containers idiomatically refers to the number of elements, not to the number of bytes. std::string::size() gives you the count of char values in the string, while boost::dynamic_bitset::size() returns the number of bits stored in it; as you initialized it from n = buffer.size() "blocks" of 8 bits each (char is 8 bits on pretty much any "normal" platform), it's completely expected that the size it returns is 8 times as big, while the actual memory consumed is exactly the same.
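For instance, here is a minimal sketch (with a made-up 4-character string) showing the same numbers on a smaller scale:

#include <iostream>
#include <string>
#include <boost/dynamic_bitset.hpp>

int main() {
    std::string s = "abcd"; // s.size() == 4 (chars, i.e. elements)
    // Each char becomes one 8-bit block of the bitset.
    boost::dynamic_bitset<unsigned char> b(s.begin(), s.end());
    std::cout << s.size() << " chars, " << b.size() << " bits\n"; // 4 chars, 32 bits
    return 0;
}

Both objects hold the same four bytes of payload; only the unit that size() counts differs.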
Edit: after the last modification, the problem is now completely different. boost::to_string doesn't output the internal compact representation; it generates a human-readable string made of actual '0' and '1' characters, which does result in an 8x increase in required size (the output is a sequence of 8-bit bytes, of which just one bit is effectively used).
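If the goal is a compact byte string to feed to a compression function, one option is to skip to_string entirely and copy the raw blocks instead. Below is a minimal sketch (with made-up names and a hypothetical 12-bit value standing in for the file data) that uses boost::to_block_range to pack the bitset into a std::string at one byte per 8-bit block, and the block-range constructor plus resize() to rebuild it; the exact bit count has to be stored alongside the string, since the block copy rounds up to whole blocks.

#include <iostream>
#include <iterator>
#include <string>
#include <boost/dynamic_bitset.hpp>

int main() {
    // Hypothetical 12-bit payload (value 0xB5D), standing in for the file data.
    boost::dynamic_bitset<unsigned char> bits(12, 0xB5Du);

    // Pack: one char per 8-bit block, so 2 bytes here instead of 12 '0'/'1' chars.
    std::string packed;
    boost::to_block_range(bits, std::back_inserter(packed));
    std::cout << "packed bytes = " << packed.size() << "\n"; // 2

    // ... hand `packed` (plus the bit count, 12) to the compression function ...

    // Unpack: chars become 8-bit blocks again, then trim the padding bits.
    boost::dynamic_bitset<unsigned char> restored(packed.begin(), packed.end());
    restored.resize(12); // the real bit count travels out-of-band
    std::cout << "round trip ok = " << (restored == bits) << "\n"; // 1
    return 0;
}

With this representation the string holds bitset.size() / 8 bytes (rounded up) rather than bitset.size() bytes, which removes the 8x blow-up before compression even starts.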