Code below compiles fine with
clang++ -std=c++11 test.cpp -o test
But at run time an exception is thrown:
terminate called after throwing an instance of 'boost::lexer::runtime_error'
what(): Lookahead ('/') is not supported yet.
The problem is the slash (/) in the input and/or the regex (the regex string and the first input string in main), but I can't work out how to escape it correctly. Any hints?
#include <iostream>
#include <string>
#include <cstring>
#include <boost/spirit/include/lex.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;
namespace phoenix = boost::phoenix;

std::string regex("FOO/BAR");

template <typename Type>
struct Lexer : boost::spirit::lex::lexer<Type> {
    Lexer() : foobar_(regex) {
        this->self.add(foobar_);
    }
    boost::spirit::lex::token_def<std::string> foobar_;
};

template <typename Iterator, typename Def>
struct Grammar
    : qi::grammar <Iterator, qi::in_state_skipper<Def> > {
    template <typename Lexer> Grammar(const Lexer & _lexer);
    typedef qi::in_state_skipper<Def> Skipper;
    qi::rule<Iterator, Skipper> rule_;
};

template <typename Iterator, typename Def>
template <typename Lexer>
Grammar<Iterator, Def>::Grammar(const Lexer & _lexer)
    : Grammar::base_type(rule_) {
    rule_ = _lexer.foobar_;
}

int main() {
    // INPUT
    char const * first("FOO/BAR");
    char const * last(first + strlen(first));

    // LEXER
    typedef lex::lexertl::token<const char *> Token;
    typedef lex::lexertl::lexer<Token> Type;
    Lexer<Type> l;

    // GRAMMAR
    typedef Lexer<Type>::iterator_type Iterator;
    typedef Lexer<Type>::lexer_def Def;
    Grammar<Iterator, Def> g(l);

    // PARSE
    bool ok = lex::tokenize_and_phrase_parse (
          first
        , last
        , l
        , g
        , qi::in_state("WS")[l.self]
    );

    // CHECK
    if (!ok || first != last) {
        std::cout << "Failed parsing input file" << std::endl;
        return 1;
    }
    return 0;
}

As sehe points out, / is likely intended to be used as a lookahead operator, probably following the syntax of flex. It's unfortunate that Spirit doesn't use a more conventional lookahead syntax (not that I think the other syntax is more elegant; it just gets confusing with all the subtle variations in regex syntax).

If you look at re_tokeniser.hpp: the tokeniser thinks you're neither in an escape sequence nor inside a string, so it checks for meta characters. / is treated as the meta character for lookahead (even though the feature isn't implemented) and must be escaped, although the Boost docs don't mention that at all.

Try escaping the / in the regex (not in the input) with a backslash, i.e. "\\/", or "\/" if using a raw string. Alternatively, others have suggested using [/].

I'd consider this a bug in the Spirit Lex documentation, since it fails to point out that / must be escaped.

Edit: kudos to sehe and cv_and_he, who helped correct some of my earlier thinking. If they post an answer here, be sure to give them a +1.
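
For reference, the escaping options mentioned above can be written as plain C++ string literals. This is only a sketch of the spellings; it does not call Boost, and it assumes the lexer's regex dialect accepts \/ and [/] as a literal slash, as suggested above. The names escaped_regex, raw_regex, class_regex and check_spellings are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Three spellings of a pattern that treats '/' as a literal character.
// Assumption: the lexer accepts "\/" and "[/]" as a literal slash.

// Ordinary string literal: the backslash itself must be doubled in C++.
const std::string escaped_regex("FOO\\/BAR");

// Raw string literal (C++11): the backslash is written only once.
const std::string raw_regex(R"(FOO\/BAR)");

// Character class: avoids backslashes entirely.
const std::string class_regex("FOO[/]BAR");

// Sanity check: both backslash spellings produce identical characters.
inline void check_spellings() {
    assert(escaped_regex == raw_regex);
}
```

In the question's code, any of these could replace the initializer of the global regex string; the input passed to the tokenizer stays FOO/BAR.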