Boost read_json throws an exception when the JSON file contains certain GBK-encoded Chinese characters

I have a JSON file encoded in GBK, without a BOM. In most cases boost::property_tree parses it successfully.

try {
    boost::property_tree::read_json(filename, tree);
}
catch (const std::exception &e) {
    cerr << e.what() << endl;
}

However, if the file contains the Chinese character "历" (GBK bytes 0xC0FA) or "繞" (0xC040), property_tree throws an exception with the message "invalid code sequence".
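One possible explanation, assuming the parser interprets narrow input as UTF-8 (an assumption, not something stated in the question): the byte 0xC0 can never occur in well-formed UTF-8, so both GBK sequences above are rejected as soon as they are decoded as UTF-8. The sketch below illustrates this with a simplified structural check. The helper name looks_like_utf8 is hypothetical and is not part of Boost.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Boost API): returns true if every lead byte in
// `s` is followed by the expected number of UTF-8 continuation bytes.
// Simplified sketch: it does not reject surrogates or every overlong form.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t trail = 0;
        if (c < 0x80) {
            trail = 0;                          // plain ASCII
        } else if (c >= 0xC2 && c <= 0xDF) {
            trail = 1;                          // two-byte sequence
        } else if (c >= 0xE0 && c <= 0xEF) {
            trail = 2;                          // three-byte sequence
        } else if (c >= 0xF0 && c <= 0xF4) {
            trail = 3;                          // four-byte sequence
        } else {
            return false;                       // 0x80..0xC1 and 0xF5..0xFF never start a sequence
        }
        if (i + trail >= s.size()) {
            return false;                       // sequence is cut off at the end of the input
        }
        for (std::size_t k = 1; k <= trail; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) {
                return false;                   // continuation bytes must look like 10xxxxxx
            }
        }
        i += trail + 1;
    }
    return true;
}
```

Under this check, looks_like_utf8("\xC0\xFA") and looks_like_utf8("\xC0\x40") both return false, whereas the UTF-8 encoding of "历", "\xE5\x8E\x86", returns true.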

There is 1 best solution below

You could try the overload that takes a stream, and imbue the stream with a suitable locale beforehand:

#include <fstream>
#include <iostream>
#include <boost/locale.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

Use Boost.Locale to generate the locale, e.g. on POSIX:

boost::locale::generator gen;
auto CN = gen.generate("zh_CN.GBK");

And then imbue that:

std::ifstream ifs(filename, std::ios::binary);
ifs.imbue(CN);

boost::property_tree::ptree pt;
boost::property_tree::read_json(ifs, pt);