Let's say we have a regex "start:(?: ([0-9]{1,2}))? ([0-9].*)".
It will match
std::string string1 = "start: 01 0ab";
and
std::string string2 = "start: 0ab";
In both cases we can also retrieve the two captured substrings (the first one is empty for string2).
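For reference, here is a minimal sketch of that regex behaviour using std::regex. The `Fields` struct and the `match_line` helper are hypothetical names introduced only for this illustration; the pattern is the one quoted above.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: the optional first field and the mandatory second one.
struct Fields {
    bool has_first = false;  // true only if the optional group took part in the match
    std::string first;
    std::string second;
};

// Hypothetical helper: returns true when the whole input matches the regex.
bool match_line(const std::string& input, Fields& out) {
    static const std::regex re("start:(?: ([0-9]{1,2}))? ([0-9].*)");
    std::smatch m;
    if (!std::regex_match(input, m, re))
        return false;
    out.has_first = m[1].matched;  // the regex engine backtracks out of the group if needed
    if (out.has_first)
        out.first = m[1].str();
    out.second = m[2].str();
    return true;
}
```

Calling `match_line("start: 0ab", f)` succeeds with the optional group unmatched, because the regex engine backtracks after the group's first attempt consumes " 0" and the following space is missing.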
I tried to parse string2 with a boost::spirit::qi parser, but it fails to match.
qi::rule<std::string::const_iterator, std::string()> rule1 = qi::repeat(1,2)[qi::digit];
qi::rule<std::string::const_iterator, std::string()> rule2 = qi::digit >> *qi::char_;
std::vector<std::string> attr;
auto it_begin = string2.cbegin(); // const_iterator, matching the iterator type of the rules
auto it_end = string2.cend();
if (qi::parse(
        it_begin,
        it_end,
        qi::lit("start:")
            >> -(qi::lit(" ") >> rule1)
            >> qi::lit(" ") >> rule2
            >> qi::eoi,
        attr))
    std::cout << "match" << std::endl;
else
    std::cout << "not match" << std::endl;
We can of course use a look-ahead operator to check what follows rule1, but is there a more generic way to implement the regex operator '?'? Thanks!
I'm not sure what's wrong with the expectation. It is the only way for otherwise ambiguous rules, since PEG grammars are always greedy.
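To make the greedy behaviour concrete, here is a small hand-written matcher that mimics the PEG semantics of the grammar in the question. It is only a sketch: `peg_match` and its `use_lookahead` flag are hypothetical names, and the code imitates how the optional group commits rather than reproducing Spirit's implementation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical matcher for: "start:" -(' ' digit{1,2}) ' ' digit rest
// With use_lookahead, the optional group additionally requires a following space.
bool peg_match(const std::string& s, bool use_lookahead) {
    const std::string prefix = "start:";
    if (s.compare(0, prefix.size(), prefix) != 0)
        return false;
    std::size_t pos = prefix.size();

    // Optional group: a space followed by one or two digits, taken greedily.
    std::size_t p = pos;
    if (p < s.size() && s[p] == ' ') {
        ++p;
        std::size_t digits = 0;
        while (digits < 2 && p < s.size() &&
               std::isdigit(static_cast<unsigned char>(s[p]))) {
            ++p;
            ++digits;
        }
        bool body_ok = digits >= 1;
        // Look-ahead: peek at the next character without consuming it.
        if (body_ok && use_lookahead)
            body_ok = p < s.size() && s[p] == ' ';
        if (body_ok)
            pos = p;  // commit: a PEG never revisits this choice later
    }

    // Mandatory tail: a space, then a digit, then anything.
    if (pos >= s.size() || s[pos] != ' ')
        return false;
    ++pos;
    return pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]));
}
```

Without the look-ahead, "start: 0ab" fails because the optional group has already consumed " 0" and the tail then finds 'a' where it needs a space. With the look-ahead, the group declines to match and the tail consumes " 0ab" instead.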
However, maybe you didn't arrive at the most elegant form, since you were looking for something "better". Here's what I'd do.
I'd use a skipper to match spaces¹:
Where the rules are still lexemes (because they are declared without the skipper):
¹ This assumes that multiple/different whitespace is okay. If newlines should not count as whitespace, use qi::blank instead of qi::space.