Difficulty fixing dangling else problem without assoc in Bison - parsing

I have tried to follow the advice on fixing dangling else problem for an assignments grammar but I can't quite get it. I still have the a single shift/reduce conflict. I can't use assoc or left nor right for the Bison file.
I have tried to exhaust all of the posts on StackOverflow and gone through Google to no avail.
ConditionStmt
: ClosedStatement { $$ = $1;}
| OpenStatement {$$ = $1; }
;
OpenStatement
: "if" SimpleExpression "then" Statement {$$ = ast::action::MakeStatement<ast::Branch>({$2, $4}, #1, ast::BranchType::kIf); }
| "if" SimpleExpression "then" ClosedStatement "else" OpenStatement {$$ = ast::action::MakeStatement<ast::Branch>({$2, $4, $6}, #1, ast::BranchType::kIf); }
| "while" SimpleExpression "do" OpenStatement { $$ = ast::action::MakeStatement<ast::Iterator>({$2, $4}, #1, ast::IteratorType::kWhile); }
;
ClosedStatement
: "if" SimpleExpression "then" ClosedStatement "else" ClosedStatement { \
$$ = ast::action::MakeStatement<ast::Branch>({$2, $4, $6}, #1, ast::BranchType::kIf); \
}
| "while" SimpleExpression "do" ClosedStatement { $$ = ast::action::MakeStatement<ast::Iterator>({$2, $4}, #1, ast::IteratorType::kWhile); }
| NonIfStatement
;
This is the bison output
State 148
42 ConditionStmt: ClosedStatement .
45 OpenStatement: "if" SimpleExpression "then" ClosedStatement . "else" OpenStatement
47 ClosedStatement: "if" SimpleExpression "then" ClosedStatement . "else" ClosedStatement
"else" shift, and go to state 158
"else" [reduce using rule 42 (ConditionStmt)]
$default reduce using rule 42 (ConditionStmt)
State 158
45 OpenStatement: "if" SimpleExpression "then" ClosedStatement "else" . OpenStatement
47 ClosedStatement: "if" SimpleExpression "then" ClosedStatement "else" . ClosedStatement
"if" shift, and go to state 61
"return" shift, and go to state 62
"for" shift, and go to state 63
"while" shift, and go to state 64
"break" shift, and go to state 65
"not" shift, and go to state 36
";" shift, and go to state 66
"{" shift, and go to state 67
"(" shift, and go to state 37
"-" shift, and go to state 38
"*" shift, and go to state 39
"?" shift, and go to state 40
kNumericConst shift, and go to state 41
kCharConst shift, and go to state 42
kString shift, and go to state 43
kIdentifier shift, and go to state 44
kBoolConst shift, and go to state 45
NonIfStatement go to state 69
ExpressionStmt go to state 70
CompoundStmt go to state 71
OpenStatement go to state 163
ClosedStatement go to state 164
IteratorStmt go to state 75
ReturnStmt go to state 76
BreakStmt go to state 77
Expression go to state 78
SimpleExpression go to state 79
AndExpr go to state 47
UnaryReletiveExpr go to state 48
ReletiveExpr go to state 49
SumExpr go to state 50
MulExp go to state 51
UnaryExpr go to state 52
UnaryOper go to state 53
Factor go to state 54
Mutable go to state 80
Immutable go to state 56
Call go to state 57
Constant go to state 58

These two lines in the bison .output file
"else" shift, and go to state 158
"else" [reduce using rule 42 (ConditionStmt)]
tell you that the shift-reduce conflict is between shifting an else and reducing to the non-terminal ConditionStmt. This tells you that the problem is somewhere you have ConditionStmt on the right-hand-side of a rule -- something about that context allows the ConditionStmt to be followed by an else. But in the grammar you show, ConditionStmt is never used, so we can't say what that problem is.
My best guess is that you have something in the rule for NonIfStatement (which you also don't show) that ends (directly or indirectly) with ConditionStmt on the rhs, which causes this problem.

Related

Shift/reduce conflict with ambiguous grammar

I've been stuck with some ambiguous grammar for a while now as yacc reports 6 shift/reduce conflicts. I've looked in the y.output file and have tried to understand how to look at the states and figure out what to do to fix the ambiguous grammar but to no avail. I'm legitimately stuck at how I'm supposed to fix the issues. I've looked at a lot of questions on stack overflow to see if other people's explanation would help me with my problem, but that hasn't helped me much either. For the record, I cannot use any precedence defining directives such as %left to solve the parsing conflicts.
Would someone be able to help me out by guiding me as to how I should change the grammar to fix the shift/reduce conflicts? Maybe by trying to resolve one of the issues and showing me the thinking process behind it? I know the grammar is quite long and hefty and I apologize in advance for that. If anyone is willing to spare their free time on this it would be greatly appreciated, but I realize that I may not be able to have that.
Anyways, here is my grammar in question (it is a slight expansion of the MiniJava grammar):
Grammar
0 $accept: program $end
1 program: main_class class_decl_list
2 main_class: CLASS ID '{' PUBLIC STATIC VOID MAIN '(' STRING '[' ']' ID ')' '{' statement '}' '}'
3 class_decl_list: class_decl_list class_decl
4 | %empty
5 class_decl: CLASS ID '{' var_decl_list method_decl_list '}'
6 | CLASS ID EXTENDS ID '{' var_decl_list method_decl_list '}'
7 var_decl_list: var_decl_list var_decl
8 | %empty
9 method_decl_list: method_decl_list method_decl
10 | %empty
11 var_decl: type ID ';'
12 method_decl: PUBLIC type ID '(' formal_list ')' '{' var_decl_list statement_list RETURN exp ';' '}'
13 formal_list: type ID formal_rest_list
14 | %empty
15 formal_rest_list: formal_rest_list formal_rest
16 | %empty
17 formal_rest: ',' type ID
18 type: INT
19 | BOOLEAN
20 | ID
21 | type '[' ']'
22 statement: '{' statement_list '}'
23 | IF '(' exp ')' statement ELSE statement
24 | WHILE '(' exp ')' statement
25 | SOUT '(' exp ')' ';'
26 | SOUT '(' STRING_LITERAL ')' ';'
27 | ID '=' exp ';'
28 | ID index '=' exp ';'
29 statement_list: statement_list statement
30 | %empty
31 index: '[' exp ']'
32 | index '[' exp ']'
33 exp: exp OP exp
34 | '!' exp
35 | '+' exp
36 | '-' exp
37 | '(' exp ')'
38 | ID index
39 | ID '.' LENGTH
40 | ID index '.' LENGTH
41 | INTEGER_LITERAL
42 | TRUE
43 | FALSE
44 | object
45 | object '.' ID '(' exp_list ')'
46 object: ID
47 | THIS
48 | NEW ID '(' ')'
49 | NEW type index
50 exp_list: exp exp_rest_list
51 | %empty
52 exp_rest_list: exp_rest_list exp_rest
53 | %empty
54 exp_rest: ',' exp
And here are the relevant states from y.output that have shift/reduce conflicts.
State 58
7 var_decl_list: var_decl_list . var_decl
12 method_decl: PUBLIC type ID '(' formal_list ')' '{' var_decl_list . statement_list RETURN exp ';' '}'
INT shift, and go to state 20
BOOLEAN shift, and go to state 21
ID shift, and go to state 22
ID [reduce using rule 30 (statement_list)]
$default reduce using rule 30 (statement_list)
var_decl go to state 24
type go to state 25
statement_list go to state 69
State 76
38 exp: ID . index
39 | ID . '.' LENGTH
40 | ID . index '.' LENGTH
46 object: ID .
'[' shift, and go to state 64
'.' shift, and go to state 97
'.' [reduce using rule 46 (object)]
$default reduce using rule 46 (object)
index go to state 98
State 100
33 exp: exp . OP exp
34 | '!' exp .
OP shift, and go to state 103
OP [reduce using rule 34 (exp)]
$default reduce using rule 34 (exp)
State 101
33 exp: exp . OP exp
35 | '+' exp .
OP shift, and go to state 103
OP [reduce using rule 35 (exp)]
$default reduce using rule 35 (exp)
State 102
33 exp: exp . OP exp
36 | '-' exp .
OP shift, and go to state 103
OP [reduce using rule 36 (exp)]
$default reduce using rule 36 (exp)
State 120
33 exp: exp . OP exp
33 | exp OP exp .
OP shift, and go to state 103
OP [reduce using rule 33 (exp)]
$default reduce using rule 33 (exp)
And there we have it. I apologize again for the length of this grammar and the number of shift/reduce conflicts. I just cannot seem to understand how to fix them by changing the grammar in question. Any help would be thoroughly appreciated, though if no one has time to look through such a massive post, I would understand. If anyone needs more information, don't hesitate to ask.
The basic problem is that when parsing a method_decl body, it can't tell where the var_decl_list ends and the statement_list begins. This is because when the lookahead is ID, it doesn't know whether that is the start of another var_decl or the start of the first statement, and it needs to reduce an empty statement before it can start working on a statement_list.
There are a number of ways you can deal with this:
have the lexer return different tokens for type IDs and other IDs -- that way the difference will tell the parser which is next.
don't require an empty statement at the start of a statement list. Change the grammar to:
statement_list: statement | statement_list statement ;
opt_statement_list: statement_list | %empty ;
and use opt_statement_list in the method_decl rule. This gets around the problem of having to reduce an empty statement_list before you start parsing statements. This is a process known as "unfactoring" the grammar as you are replacing rules with multiple variations. It makes the grammar more complex, and in this case, doesn't solve the problem, it just moves it; you'll then see shift/reduce conflicts betweeen statement: ID . index and type: ID on a [ lookahead. This problem can also be solved by unfactoring, but is harder.
So this brings up the general idea of resolving shift-reduce conflicts by unfactoring. The basic idea is to get rid of the rule causing the reduce half of the shift reduce conflict, replacing it with rules that are more limited in context, so don't trigger the conflict. The example above is easily solved by the "replace a 0-or-more recursive repeat with a 1-or-more recursive repeat and an optional rule". This works well for shift-reduce conflicts on the epsilon rule of the repeat if the following context means you can easily resolve when the 0-case should be legal (only when the next token is } in this case.)
The second conflict is tougher. Here the conflict is on reducing type: ID when the lookahead is [. So we need to duplicate type rules until that is not necessary. Something like:
type: simpleType | arrayType ;
simpleType: INT | BOOLEAN | ID ;
arrayType: INT '[' ']' | BOOLEAN '[' ']' | ID '[' ']'
| arrayType '[' ']' ;
replaces the "0 or more repetitions of the '[' ']' suffix" with "1 or more" and works for similar reasons (defers the reduction until after seeing the '[' ']' instead of requiring it before.) The key being that the simpleType: ID rule never needs to be reduced when the lookahead is '[' as it is only valid in other contexts.

Conflicting shift/reduce with conditions and PEMDAS EBNF in Bison

I currently have an error in my Bison with regards to my EBNF while creating the parsing for PEMDAS and conditions. I don't understand well on how I'm going to reduce the EBNF.
I got 1 shift/reduce error in the output file. Here's the part where the conflicting appears. The "expr" in "rel_cond" appears to be in conflict when I declared the "rel_expr" and "par_expr" rules.
cond : rel_cond
| par_cond
;
rel_cond : expr relop cond
| expr
;
par_cond : PAR_START cond PAR_END
;
.
.
expr : rel_expr
| par_expr
;
par_expr : PAR_START expr PAR_END
;
rel_expr : term ADD expr
| term SUB expr
| term
;
term : factor MULT factor
| factor DIV factor
| factor
;
factor : CONSTANT
| ID
;
Here's what appears in the .output file
state 47
16 rel_cond: expr . relop expr
17 | expr .
32 par_expr: PAR_START expr . PAR_END
EQT shift, and go to state 49
NOT_EQT shift, and go to state 50
LT shift, and go to state 51
GT shift, and go to state 52
LT_EQT shift, and go to state 53
GT_EQT shift, and go to state 54
BIT_AND shift, and go to state 55
LOG_AND shift, and go to state 56
BIT_OR shift, and go to state 57
LOG_OR shift, and go to state 58
PAR_END shift, and go to state 41
PAR_END [reduce using rule 17 (rel_cond)]
relop go to state 59
Your ouput file doesn't correspond to your grammar, so you're not running bison on the input file you think you are. The output file has the rule
rel_cond: expr relop expr
while your rules have
rel_cond : expr relop cond
but in either case the problem is the same -- the rule here is ambiguous because it doesn't specifiy whether it is right or left recursive. An input like
expr relop expr relop expr
can be parsed as either (expr relop expr) relop expr or expr relop (expr relop expr). The default "shift before reduce" conflict resolution means the resulting parser will be left recursive, but you can get rid of the conflict by using precedence rule, or by rewriting the grammar to be explicitly left or right recursive.

yacc shift/reduce in parser

I am writing a parser for a compiler in one homework and when I am running the command
$ bison --yacc -v --defines -o parser.c parser.y
parser.y: warning: 8 shift/reduce conflicts [-Wconflicts-sr]
$
Except of the if/else shift/reduce conflict which is expected I am taking in parser.output file conflicts in the following states,
State 34
35 term: lvalue . PLUSPLUS
37 | lvalue . MINUSMINUS
39 assignexpr: lvalue . ASSIGN expr
40 primary: lvalue .
49 member: lvalue . FULLSTOP IDENTIFIER
50 | lvalue . LEFTSQUARE expr RIGHTSQUARE
54 call: lvalue . callsuffix
ASSIGN shift, and go to state 88
PLUSPLUS shift, and go to state 89
MINUSMINUS shift, and go to state 90
LEFTSQUARE shift, and go to state 91
FULLSTOP shift, and go to state 92
LEFTPAR shift, and go to state 93
PLUSPLUS [reduce using rule 40 (primary)]
MINUSMINUS [reduce using rule 40 (primary)]
LEFTSQUARE [reduce using rule 40 (primary)]
LEFTPAR [reduce using rule 40 (primary)]
$default reduce using rule 40 (primary)
callsuffix go to state 94
normcall go to state 95
methodcall go to state 96
State 36
41 primary: call .
51 member: call . FULLSTOP IDENTIFIER
52 | call . LEFTSQUARE expr RIGHTSQUARE
53 call: call . LEFTPAR elist RIGHTPAR
LEFTSQUARE shift, and go to state 97
FULLSTOP shift, and go to state 98
LEFTPAR shift, and go to state 99
LEFTSQUARE [reduce using rule 41 (primary)]
LEFTPAR [reduce using rule 41 (primary)]
$default reduce using rule 41 (primary)
State 52
16 expr: expr . PLUS expr
17 | expr . MINUS expr
18 | expr . MUL expr
19 | expr . DIV expr
20 | expr . MOD expr
21 | expr . GREATER expr
22 | expr . GREATER_EQUAL expr
23 | expr . LESS expr
24 | expr . LESS_EQUAL expr
25 | expr . EQUAL expr
26 | expr . NOTEQUAL expr
27 | expr . AND expr
28 | expr . OR expr
95 returnstmt: RETURN expr .
PLUS shift, and go to state 74
MINUS shift, and go to state 75
MUL shift, and go to state 76
DIV shift, and go to state 77
MOD shift, and go to state 78
EQUAL shift, and go to state 79
NOTEQUAL shift, and go to state 80
OR shift, and go to state 81
AND shift, and go to state 82
GREATER shift, and go to state 83
LESS shift, and go to state 84
GREATER_EQUAL shift, and go to state 85
LESS_EQUAL shift, and go to state 86
MINUS [reduce using rule 95 (returnstmt)]
$default reduce using rule 95 (returnstmt)
Any idea how to solve it?
Its difficult to say what the problem is as you don't show your full grammar. Often times conflicts are due to other rules in the grammar (not the rules shown in the states with the conflict) due to the context, or how the rules are combined.
state 34/36:
It looks like you have some sort of circular ambiguity between the rules for primary, lvalue, and call. What are these (full) rules? How do expect to know the difference between an lvalue and a primary?
state 52: Here it looks like an ambiguity between a returnstmt and a following expression that begins with a MINUS. It looks like you do not have a statement terminator/separator?
These all might be the same underlying problem -- the parser can't figure out where one statement ends and the next begins...

Rewriting Bison grammar to fix shift/reduce conflicts

Here are the relevant parts of my Bison grammar rules:
statement:
expression ';' |
IF expression THEN statement ELSE statement END_IF ';'
;
expression:
IDENTIFIER |
IDENTIFIER '('expressions')' |
LIT_INT |
LIT_REAL |
BOOL_OP |
LOG_NOT expression |
expression operator expression |
'('expression')'
;
expressions:
expression |
expressions ',' expression
;
operator:
REL_OP |
ADD_OP |
MULT_OP |
LOG_OR |
LOG_AND
;
When compiling, I get 10 shift/reduce conflicts:
5 conflicts are caused by the LOG_NOT expression rule:
State 45
25 expression: LOG_NOT expression .
26 | expression . operator expression
REL_OP shift, and go to state 48
ADD_OP shift, and go to state 49
MULT_OP shift, and go to state 50
LOG_OR shift, and go to state 51
LOG_AND shift, and go to state 52
REL_OP [reduce using rule 25 (expression)]
ADD_OP [reduce using rule 25 (expression)]
MULT_OP [reduce using rule 25 (expression)]
LOG_OR [reduce using rule 25 (expression)]
LOG_AND [reduce using rule 25 (expression)]
$default reduce using rule 25 (expression)
operator go to state 54
5 conflicts are caused by the expressions operator expression rule:
State 62
26 expression: expression . operator expression
26 | expression operator expression .
REL_OP shift, and go to state 48
ADD_OP shift, and go to state 49
MULT_OP shift, and go to state 50
LOG_OR shift, and go to state 51
LOG_AND shift, and go to state 52
REL_OP [reduce using rule 26 (expression)]
ADD_OP [reduce using rule 26 (expression)]
MULT_OP [reduce using rule 26 (expression)]
LOG_OR [reduce using rule 26 (expression)]
LOG_AND [reduce using rule 26 (expression)]
$default reduce using rule 26 (expression)
operator go to state 54
I know that the problem has to do with precedence. For instance, if the expression was:
a + b * c
Does Bison shift after the a + and hope to find an expression, or does it reduce the a to an expression? I have a feeling that this is due to the Bison 1-token look-ahead limitation, but I can't figure out how to rewrite the rule(s) to resolve the conflicts.
My professor will take points off for shift/reduce conflicts, so I can't use %expect. My professor has also stated that we cannot use %left or %right precedence values.
This is my first post on Stack, so please let me know if I'm posting this all wrong. I've searched existing posts, but this really seems a case-by-case thing. If I use any code from Stack, I will note the source in my submitted project.
Thanks!
As written, your grammar is ambiguous. So it must have conflicts.
There is no inherent rule of binding precedences, and apparently you're not allowed to use bison's precedence declarations either. If you were allowed to, you wouldn't be able to use operator as a non-terminal, because you need to distinguish between
expr1 + expr2 * expr3 expr1 * expr2 + expr3
| | | | | |
| +---+---+ +---+---+ |
| | | |
| expr expr |
| | | |
+-----+-----+ +-----+-----+
| |
expr expr
And you cannot distinguish between them if + and * are replaced with operator. The terminals actually have to be visible.
Now, here's a quick clue:
expr1 + expr2 + expr3 reduces expr1 + expr2 first
expr1 * expr2 + expr3 reduces expr1 * expr2 first
So in non-terminal-1 + non-terminal-2, non-terminal-1 cannot produce x + y or x * y. But in non-terminal-1 * non-terminal-2, non-terminal-1 can produce `x + y
Thanks! I did some more troubleshooting, and fixed the reduce/conflict errors by rewriting the expression and operator rules:
expression:
expression LOG_OR term1 |
term1
;
term1:
term1 LOG_AND term2 |
term2
;
term2:
term2 REL_OP term3 |
term3
;
term3:
term3 ADD_OP term4 |
term4
;
term4:
term4 MULT_OP factor |
factor
;
factor:
IDENTIFIER |
IDENTIFIER '('expressions')' |
LIT_INT |
LIT_REAL |
BOOL_OP |
LOG_NOT factor |
'('expression')'
;
expressions:
expression |
expressions ',' expression
;
I had to rearrange what was actually an expression and what was actually a factor. I made a factor rule that includes all of the factors (terminals), which has the highest precedence. I then made a term# rule for each expression, which also sets them at different precedence levels (term5 has higher precedence than term4, term4 has higher precedence than term3, etc.).
This allowed me to set each operator at a difference precedence without using any of the built-in % precedence functions.
I was able to parse all of my test input files without error. Any thoughts on the design?

i am trying to build a parser for mini java where i am getting shift/reduce conflicts in the expression grammar part.I can't resolve this conflict

this is part of y.ouput file
state 65
15 Expression: Expression . "&&" Expression
16 | Expression . "<" Expression
17 | Expression . "+" Expression
18 | Expression . "-" Expression
19 | Expression . "*" Expression
20 | Expression . "[" Expression "]"
21 | Expression . "." "length"
22 | Expression . "." Identifier "(" Expression "," Expression ")"
25 | "!" Expression .
"[" shift, and go to state 67
"<" shift, and go to state 69
"+" shift, and go to state 70
"-" shift, and go to state 71
"*" shift, and go to state 72
"." shift, and go to state 73
"[" [reduce using rule 25 (Expression)]
"<" [reduce using rule 25 (Expression)]
"+" [reduce using rule 25 (Expression)]
"-" [reduce using rule 25 (Expression)]
"*" [reduce using rule 25 (Expression)]
"." [reduce using rule 25 (Expression)]
$default reduce using rule 25 (Expression)
this is how the precedence of operators is set
%left "&&"
%left '<'
%left '-' '+'
%left '*'
%right '!'
%left '.'
%left '(' ')'
%left '[' ']'
In bison, there is a difference between "x" and 'x'; they are not the same token. So, assuming you are using bison, your precedence declarations don't refer to the terminals in the productions.
Bison also allows %token definitions of the following form:
%token name quoted-string ...
For example (a short excerpt from bison's own grammar file):
%token
PERCENT_CODE "%code"
PERCENT_DEBUG "%debug"
PERCENT_DEFAULT_PREC "%default-prec"
PERCENT_DEFINE "%define"
PERCENT_DEFINES "%defines"
PERCENT_ERROR_VERBOSE "%error-verbose"
Once the symbols have been aliased, they can be used interchangeably in the grammar, making it possible to use the double-quoted string in productions; some people find such grammars easier to read. However, there is no mechanism to ensure that the lexer produces the correct token number for a double-quoted string since it only has access to the token names.
The "original" yacc, at least in the current "byacc" version maintained by Thomas Dickey, allows both single- and double-quoted token names, but does not distinguish between them; both "+" and '+' are mapped to token number 43 ('+'). It also does not provide any easy way to alias token names, so the double-quoted multi-character strings are not particularly easy to use in a reliable way.

Resources