How to use u8_to_u32_iterator in Boost Spirit X3?

I am using Boost Spirit X3 to create a programming language, but when I try to support Unicode, the code no longer compiles.
Here is a simplified version of that program:

#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;

struct sample : x3::symbols<unsigned> {
    sample()
    {
        add("48", 10);
    }
};

int main()
{
  const std::string s("");

  boost::u8_to_u32_iterator<std::string::const_iterator> first{cbegin(s)},
    last{cend(s)};

  x3::parse(first, last, sample{});
}

What should I do?

Best answer

As you noticed, internally char_encoding::unicode employs char32_t.

So, first change the symbols parser to use that encoding:

template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;

struct sample : symbols<unsigned> {
    sample() { add(U"48", 10); }
};

Now the code fails to compile at the call into case_compare:

/home/sehe/custom/boost_1_78_0/boost/spirit/home/x3/string/detail/tst.hpp|74 col 33| error: no match for call to ‘(boost::spirit::x3::case_compare<boost::spirit::char_encoding::unicode>) (reference, char32_t&)’

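As you can see, the second argument is a char32_t reference, but the first comes from dereferencing the u8_to_u32_iterator, which by default yields unsigned ints (std::uint32_t). Those are distinct types, so the call does not match.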

Just for comparison / sanity check: https://godbolt.org/z/1zozxq96W
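The mismatch can be reproduced without Boost. The following is only a sketch: same_type_compare and is_callable_with are hypothetical names, and it assumes (as the error message suggests) that the comparator deduces a single character type for both of its operands.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a comparator that deduces ONE character type
// for both operands (an assumption about how case_compare is invoked).
struct same_type_compare {
    template <typename Char>
    std::int32_t operator()(Char lhs, Char rhs) const {
        return static_cast<std::int32_t>(lhs) - static_cast<std::int32_t>(rhs);
    }
};

// Hypothetical trait: true when F can be called with arguments (A, B).
template <typename F, typename A, typename B, typename = void>
struct is_callable_with : std::false_type {};

template <typename F, typename A, typename B>
struct is_callable_with<
    F, A, B,
    decltype(void(std::declval<F>()(std::declval<A>(), std::declval<B>())))>
    : std::true_type {};

// char32_t is a distinct type, even though it has the same width as std::uint32_t.
static_assert(!std::is_same<std::uint32_t, char32_t>::value,
              "std::uint32_t and char32_t are different types");

// Matching operand types: deduction succeeds.
static_assert(is_callable_with<same_type_compare, char32_t, char32_t>::value,
              "callable with (char32_t, char32_t)");

// Mixed operand types: Char is deduced two different ways, so there is no match.
static_assert(!is_callable_with<same_type_compare, std::uint32_t, char32_t>::value,
              "not callable with (std::uint32_t, char32_t)");
```

So the fix has to make both sides agree on char32_t rather than rely on an implicit conversion.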

Luckily you can instruct the u8_to_u32_iterator to use another co-domain type by passing it as the second template argument:

#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>

namespace x3 = boost::spirit::x3;

template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;

struct sample : symbols<unsigned> {
    sample() { add(U"48", 10)(U"", 11); }
};

int main() {
    auto test = [](auto const& s) {
        boost::u8_to_u32_iterator<decltype(cbegin(s)), char32_t> first{
            cbegin(s)},
            last{cend(s)};

        unsigned parsed_value;
        if (x3::parse(first, last, sample{}, parsed_value)) {
            std::cout << s << " -> " << parsed_value << "\n";
        } else {
            std::cout << s << " FAIL\n";
        }
    };

    for (std::string s : {"", "48", ""})
        test(s);
}

Prints:

 -> 11
48 -> 10
 FAIL
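
For intuition about what the iterator adaptor does, here is a minimal standard-library sketch of decoding UTF-8 into 32-bit code points of a caller-chosen type. decode_utf8 and its Out parameter are hypothetical and only mirror the idea of the second template argument; this is not Boost's implementation, and it does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): decode a UTF-8 string into
// code points stored as Out, which defaults to char32_t.
template <typename Out = char32_t>
std::vector<Out> decode_utf8(const std::string& in) {
    std::vector<Out> out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t extra = 0;  // number of continuation bytes that follow
        std::uint32_t cp = 0;   // code point being assembled

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char byte = static_cast<unsigned char>(in[i + k]);
            if ((byte & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (byte & 0x3F);  // append the 6 payload bits
        }

        out.push_back(static_cast<Out>(cp));
        i += extra + 1;
    }
    return out;
}
```

Calling decode_utf8("48") yields the two code points U'4' and U'8' as char32_t, while decode_utf8<std::uint32_t>("48") yields the same values as std::uint32_t, which is the kind of difference the comparator above trips over.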