I seem to be experiencing some mental block with Boost Spirit I just cannot get by. I have a fairly simple grammar I need to handle, where I would like to put the values into a struct, that contains a std::map<> as one of it's members. The key names for the pairs are known up front, so that only those are allowed. There could be one to many keys in the map, in any order with each key name validated via qi.
The grammar looks something like this, as an example.
test .|*|<hostname> add|modify|save ( key [value] key [value] ... ) ;
//
test . add ( a1 ex00
a2 ex01
a3 "ex02,ex03,ex04" );
//
test * modify ( m1 ex10
m2 ex11
m3 "ex12,ex13,ex14"
m4 "abc def ghi" );
//
test 10.0.0.1 clear ( c1
c2
c3 );
In this example the keys for “add” being a1, a2 and a3, likewise for “modify” m1, m2, m3 and m4 and each must contain a value. For “clear” the keys of the map c1, c2 and c3 may not contain a value. Also, let's say for this example you can have up to 10 keys (a1 ... a11, m1 ... m11 and c1 ... c11) any combination of them could be used, in any order, for their corresponding action. Meaning that you cannot use the known key cX for the "add" or mX for "clear"
The structure follows this simple pattern
//
struct test
{
std::string host;
std::string action;
std::map<std::string,std::string> option;
}
So from the above examples, I would expect to have the struct contain ...
// add ...
test.host = .
test.action = add
test.option[0].first = a1
test.option[0].second = ex00
test.option[1].first = a2
test.option[1].second = ex01
test.option[2].first = a3
test.option[2].second = ex02,ex03,ex04
// modify ...
test.host = *
test.action = modify
test.option[0].first = m1
test.option[0].second = ex10
test.option[1].first = m2
test.option[1].second = ex11
test.option[2].first = m3
test.option[2].second = ex12,ex13,ex14
test.option[2].first = m3
test.option[2].second = abc def ghi
// clear ...
test.host = *
test.action = 10.0.0.1
test.option[0].first = c1
test.option[0].second =
test.option[1].first = c2
test.option[1].second =
test.option[2].first = c3
test.option[2].second =
I can get each indivudal part working, standalone, but I cannot seem to them working together. For example I have the host and action working without the map<>.
I’ve adapted a previously posted example from Sehe (here) trying to get this to work (BTW: Sehe has some awesome examples, which I’ve been using as much as the documentation).
Here is an excerpt (obviously not working), but at least shows where I’m trying to go.
namespace ast {
namespace qi = boost::spirit::qi;
//
using unused = qi::unused_type;
//
using string = std::string;
using strings = std::vector<string>;
using list = strings;
using pair = std::pair<string, string>;
using map = std::map<string, string>;
//
struct test
{
using preference = std::map<string,string>;
string host;
string action;
preference option;
};
}
//
BOOST_FUSION_ADAPT_STRUCT( ast::test,
( std::string, host )
( std::string, action ) )
( ast::test::preference, option ) )
//
namespace grammar
{
//
template <typename It>
struct parser
{
//
struct skip : qi::grammar<It>
{
//
skip() : skip::base_type( text )
{
using namespace qi;
// handle all whitespace (" ", \t, ...)
// along with comment lines/blocks
//
// comment blocks: /* ... */
// // ...
// -- ...
// # ...
text = ascii::space
| ( "#" >> *( char_ - eol ) >> ( eoi | eol ) ) // line comment
| ( "--" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "//" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "/*" >> *( char_ - "*/" ) >> "*/" ); // block comment
//
BOOST_SPIRIT_DEBUG_NODES( ( text ) )
}
//
qi::rule<It> text;
};
//
struct token
{
//
token()
{
using namespace qi;
// common
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
real = double_;
integer = int_;
//
value = ( string | identity );
// ip target
any = '*';
local = ( char_('.') | fqdn );
fqdn = +char_("a-zA-Z0-9.\\-" ); // consession
ipv4 = +as_string[ octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] ];
//
target = ( any | local | fqdn | ipv4 );
//
pair = identity >> -( attr( ' ' ) >> value );
map = pair >> *( attr( ' ' ) >> pair );
list = *( value );
//
BOOST_SPIRIT_DEBUG_NODES( ( string )
( identity )
( value )
( real )
( integer )
( any )
( local )
( fqdn )
( ipv4 )
( target )
( pair )
( keyval )
( map )
( list ) )
}
//
qi::rule<It, std::string()> string;
qi::rule<It, std::string()> identity;
qi::rule<It, std::string()> value;
qi::rule<It, double()> real;
qi::rule<It, int()> integer;
qi::uint_parser<unsigned, 10, 1, 3> octet;
qi::rule<It, std::string()> any;
qi::rule<It, std::string()> local;
qi::rule<It, std::string()> fqdn;
qi::rule<It, std::string()> ipv4;
qi::rule<It, std::string()> target;
//
qi::rule<It, ast::map()> map;
qi::rule<It, ast::pair()> pair;
qi::rule<It, ast::pair()> keyval;
qi::rule<It, ast::list()> list;
};
//
struct test : token, qi::grammar<It, ast::test(), skip>
{
//
test() : test::base_type( command_ )
{
using namespace qi;
using namespace qr;
auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );
// not sure how to enforce the "key" names!
key_ = *( '(' >> *value >> ')' );
// tried using token::map ... didn't work ...
//
add_ = ( ( "add" >> attr( ' ' ) ) [ _val = "add" ] );
modify_ = ( ( "modify" >> attr( ' ' ) ) [ _val = "modify" ] );
clear_ = ( ( "clear" >> attr( ' ' ) ) [ _val = "clear" ] );
//
action_ = ( add_ | modify_ | clear_ );
/* *** can't get from A to B here ... not sure what to do *** */
//
command_ = kw[ "test" ]
>> target
>> action_
>> ';';
BOOST_SPIRIT_DEBUG_NODES( ( command_ )
( action_ )
( add_ )
( modify_ )
( clear_ ) )
}
//
private:
//
using token::value;
using token::target;
using token::map;
qi::rule<It, ast::test(), skip> command_;
qi::rule<It, std::string(), skip> action_;
//
qi::rule<It, std::string(), skip> add_;
qi::rule<It, std::string(), skip> modify_;
qi::rule<It, std::string(), skip> clear_;
};
...
};
}
I hope this question isn't too ambiguous and if you need a working example of the problem, I can certainly provide that. Any and all help is greatly appreciated, so thank you in advance!
Notes:
with this
did you mean to require a space? Or are you really just trying to force the struct
action
field to contain a trailing space (that's what will happen).If you meant the latter, I'd do that outside of the parser¹.
If you wanted the first, use the
kw
facility:In fact, you can simplify that (again, ¹):
Which also means that you can simplify to
However, getting a bit close to your question, you could also use a symbol parser:
If you meant to capture the input representation of ipv4 addresses with
Simplify:
This obviates the range checks (see ¹ again):
However, this would build a 4-char binary string representation of the address. If you wanted that, fine. I doubt it (because you'd have written
std::array<uint8_t, 4>
oruint64_t
, right?). So if you wanted the string, again useraw[]
:Same issue as with number 1.:
This time, the problem betrays that the productions should not be in
token
; Conceptuallytoken
-izing precedes parsing and hence I'd keep the tokens skipper-less.kw
doesn't really do a lot of good in that context. Instead, I'd movepair
,map
andlist
(unused?) into the parser:Some examples
There's a very recent example I made about using
symbols
to parse (here), but this answer comes a lot closer to your question:It goes far beyond the scope of your parser because it does all kinds of actions in the grammar, but what it does show is to have generic "lookup-ish" rules that can be parameterized with a particular "symbol set": see the Identifier Lookup section of the answer:
I think you can use the same approach constructively here. One additional gimmick could be to validate that properties supplied are, in fact, unique.
Demo Work
Combining all the hints above makes it compile and "parse" the test commands:
Live On Coliru
Prints
Let's restrict the Keys
Like in the linked answer above, let's pass the
map
,pair
rules the actual key set to get their allowed values from:Now the heavy lifting starts.
Using
qi::locals<>
and inherited attributesLet's give
command_
some local space to store the selected keyset:Now we can in principle assignt to it (using the
_a
placeholder). However, there's some details:Always prefer descriptive names :)
_a
and_r1
get old pretty quick. Things are confusing enough as it is.But all in all, that doesn't read so bad?
And now at least things parse
Almost there
Live On Coliru
Prints
HOLD ON, NOT SO FAST! It's wrong
Yeah. If you enable debug, you'll see it parses things oddly:
This is actually "merely" a problem with the grammar. If the grammar cannot see the difference between a
key
andvalue
then obviouslyc2
is going to be parsed as the value of property with keyc1
.It's up to you to disambiguate the grammar. For now, I'm going to demonstrate a fix using a negative assertion: we only accept values that are not known keys. It's a bit dirty, but might be useful to you for instructional purposes:
Note I factored out the
key
rule for readability:Live On Coliru
Parses
Just what the doctor ordered!
¹ Boost Spirit: "Semantic actions are evil"?