I'm devising a very simple grammar, where I use the unary minus operator. However, I get a shift/reduce conflict. In the Bison manual, and everywhere else I look, it says that I should define a new token, give it higher precedence than the binary minus operator, and then use "%prec TOKEN" in the rule.
I've done that, but I still get the warning. Why?
I'm using bison (GNU Bison) 2.4.1. The grammar is shown below:
%{
#include <string>
extern "C" int yylex(void);
%}
%union {
std::string token;
}
%token <token> T_IDENTIFIER T_NUMBER
%token T_EQUAL T_LPAREN T_RPAREN
%right T_EQUAL
%left T_PLUS T_MINUS
%left T_MUL T_DIV
%left UNARY
%start program
%%
program : statements expr
;
statements : '\n'
| statements line
;
line : assignment
| expr
;
assignment : T_IDENTIFIER T_EQUAL expr
;
expr : T_NUMBER
| T_IDENTIFIER
| expr T_PLUS expr
| expr T_MINUS expr
| expr T_MUL expr
| expr T_DIV expr
| T_MINUS expr %prec UNARY
| T_LPAREN expr T_RPAREN
;
`%prec` doesn't do as much as you might hope here. It tells Bison that in a situation where you have `- a * b`, you want to parse this as `(- a) * b` instead of `- (a * b)`. In other words, here it will prefer the `UNARY` rule over the `T_MUL` rule. In either case, you can be certain that the `UNARY` rule will get applied eventually, and it is only a question of the order in which the input gets reduced to the unary argument.

In your grammar, things are very different. Any sequence of `line` non-terminals will make up a `statements`, and there is nothing to say that a `line` non-terminal must end at an end-of-line. In fact, any expression can be a `line`. So there are basically two ways to parse `a - b`: either as a single line with a binary minus, or as two "lines", the second starting with a unary minus. The competing reduction of `expr` to `line` has no precedence of its own, so there is nothing to decide which of these rules will apply, and the rule-based precedence won't work here yet.

The solution is to correct your line splitting, by requiring every `line` to actually end with or be followed by an end-of-line symbol.

If you really want the behaviour your grammar indicates with respect to line endings, you'd need two separate non-terminals for expressions which can and which cannot start with a `T_MINUS`. You'd have to propagate this up the tree: the first `line` may start with a unary minus, but subsequent ones must not. Inside a parenthesis, starting with a minus would be all right again.
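
As a rough sketch of the first suggestion (ending every `line` with an end-of-line symbol), the top-level rules could look something like the fragment below. The `T_NEWLINE` token is an assumption of mine: it is not in your grammar, and your lexer would have to return it for each line break. The sketch also drops the leading `'\n'` and the trailing `expr` from `program`, which may not be what you want. The precedence declarations and the `assignment` and `expr` rules stay exactly as in the question. I have not run this through Bison 2.4.1, so treat it as an illustration rather than a verified fix.

```yacc
/* Assumed token: the lexer returns T_NEWLINE at each end-of-line. */
%token T_NEWLINE

%%

program     : statements
            ;

statements  : /* empty */
            | statements line
            ;

/* Every line is now closed by T_NEWLINE, so "a - b" on one line can
   only be a binary minus: an expr cannot end a line until the
   parser actually sees T_NEWLINE. */
line        : assignment T_NEWLINE
            | expr T_NEWLINE
            | T_NEWLINE              /* blank line */
            ;
```

With the terminator in place, the parser no longer has to choose between shifting `T_MINUS` and reducing `expr` to `line` when it sees a minus after an expression, so the remaining choices inside `expr` are the ones the precedence declarations and `%prec UNARY` are meant to resolve.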