Cannot find shift/reduce & reduce/reduce errors - parsing

I've been working on a project and have come across shift/reduce and reduce/reduce conflicts.
Below is the list of my declarations. Everything is here except the types in typeSpec, but we don't have to worry about those right now. I am fairly sure I understand what shift/reduce and reduce/reduce errors are, but (lucky for me) the compiler doesn't tell me where the conflicts are. Could someone point me in the right direction, or give me some tips on how to find where the conflicts are? Also, let me know if I am missing something from the post. Anything and everything is greatly appreciated!
%%
program
: declList { syntaxTree = $1; }
;
declList
: declList decl { $$ = AddSibling($1, $2); }
| decl { $$ = $1; }
;
decl
: varDecl { $$ = $1; }
| funDecl { $$ = $1; }
;
varDecl
: typeSpec varDeclList SEMICOLON { $$ = $2; SetType($2, $1, false); }
;
scopedVarDecl
: STATIC typeSpec varDeclList SEMICOLON { $$ = $3; SetType($3, $2, true); }
| typeSpec varDeclList SEMICOLON { $$ = $2; SetType($2, $1, false); }
;
varDeclList
: varDeclList COMMA varDeclInit { $$ = AddSibling($1, $3); }
| varDeclInit { $$ = $1; }
;
varDeclInit
: varDeclId { $$ = $1; }
| varDeclId COLON simpleExp { $$ = $1; AddChild($$, $3); }
;
varDeclId
: ID { $$ = NewDeclNode(VarK, UndefinedType, $1); $$->tmp = $1->idIndex; }
| ID LBRACKET NUMCONST RBRACKET{ $$ = NewDeclNode(VarK, UndefinedType, $1); $$->isArray = true; $$->aSize = $3->nvalue; $$->tmp = $1->idIndex;}
;
typeSpec
: {}
;
funDecl
: typeSpec ID LPAREN parms RPAREN compoundStmt { $$ = NewDeclNode(FuncK, $1, $2, $4, $6);$$->tmp = $2->idIndex; SetType($$, $1, true);}
| ID LPAREN parms RPAREN compoundStmt{ $$ = NewDeclNode(FuncK, Void, $1, $3, $5); $$->tmp = $1->idIndex; }
;
parms
: parmList { $$ = $1; }
| { $$ = NULL; }
;
parmList
: parmList SEMICOLON parmTypeList { $$ = AddSibling($1, $3); }
| parmTypeList { $$ = $1; }
;
parmTypeList
: typeSpec parmIdList { $$ = $2; SetType($2, $1, false); }
;
parmIdList
: parmIdList COMMA parmId { $$ = AddSibling($1, $3); }
| parmId { $$ = $1; }
;
parmId
: ID { $$ = NewDeclNode(ParamK, Void, $1); $$->tmp = $1->svalue; }
| ID LBRACKET RBRACKET { $$ = NewDeclNode(ParamK, Void, $1); $$->isArray = true; $$->tmp = $1->svalue; }
;
stmt
: matched { $$ = $1; }
| unmatched { $$ = $1; }
;
matched
: expStmt { $$ = $1; }
| compoundStmt { $$ = $1; }
| returnStmt { $$ = $1; } //
| breakStmt { $$ = $1; } //
| matchedSelectStmt { $$ = $1; }
| matchedIterStmt { $$ = $1; }
;
unmatched
: unmatchedSelectStmt{ $$ = $1; }
| unmatchedIterStmt { $$ = $1; }
;
expStmt
: exp SEMICOLON { $$ = $1; }
| SEMICOLON { $$ = NULL; }
;
compoundStmt
: BEG localDecls stmtList END { $$ = NewStmtNode(CompoundK, $1, $2, $3); }
;
localDecls
: localDecls scopedVarDecl { $$ = AddSibling($1, $2); }
| { $$ = NULL; }
;
stmtList
: stmtList stmt { $$ = AddSibling($1, $2); }
| { $$ = NULL; }
;
matchedSelectStmt : IF simpleExp THEN matched ELSE matched
{ $$ = NewStmtNode(IfK, $1, $2, $4, $6); }
;
unmatchedSelectStmt
: IF simpleExp THEN stmt { $$ = NewStmtNode(IfK, $1, $2, $4); }
| IF simpleExp THEN matched ELSE unmatched { $$ = NewStmtNode(IfK, $1, $2, $4, $6); }
;
matchedIterStmt
: WHILE simpleExp DO matched { $$ = NewStmtNode(WhileK, $1, $2, $4); }
| FOR ID ASGN iterRange DO matched { $$ = NewStmtNode(ForK, $1, NewDeclNode(VarK, Integer, $2), $4, $6); $$->tmp = $2->idIndex; }
;
unmatchedIterStmt
: WHILE simpleExp DO unmatched { $$ = NewStmtNode(WhileK, $1, $2, $4); }
| FOR ID ASGN iterRange DO unmatched { $$ = NewStmtNode(ForK, $1, NewDeclNode(VarK, Integer, $2), $4, $6); $$->tmp = $2->idIndex; }
;
iterRange
: simpleExp TO simpleExp { $$ = NewStmtNode(RangeK, $2, $1, $3);}
| simpleExp TO simpleExp BY simpleExp { $$ = NewStmtNode(RangeK, $2, $1, $3, $5); $$->tmp = $2->idIndex; }
;
returnStmt
: RETURN SEMICOLON { $$ = NewStmtNode(ReturnK, $1); }
| RETURN exp SEMICOLON { $$ = NewStmtNode(ReturnK, $1, $2); }
;
breakStmt
: BREAK SEMICOLON { $$ = NewStmtNode(BreakK, $1); }
;
exp
: mutable ASGN exp { $$ = NewExpNode(AssignK, $2, $1, $3); }
| mutable ADDASGN exp { $$ = NewExpNode(AssignK, $2, $1, $3); }
| mutable SUBASGN exp { $$ = NewExpNode(AssignK, $2, $1, $3); }
| mutable MULASGN exp { $$ = NewExpNode(AssignK, $2, $1, $3); }
| mutable DIVASGN exp { $$ = NewExpNode(AssignK, $2, $1, $3); }
| mutable INC { $$ = NewExpNode(AssignK, $2, $1); }
| mutable DEC { $$ = NewExpNode(AssignK, $2, $1); }
| simpleExp {$$ = $1; }
;
simpleExp
: simpleExp OR andExp { $$ = NewExpNode(OpK, $2, $1, $3); }
| andExp { $$ = $1; }
;
andExp
: andExp AND unaryRelExp { $$ = NewExpNode(OpK, $2, $1, $3); }
| unaryExp { $$ = $1; }
;
unaryRelExp
: NOT unaryRelExp { $$ = NewExpNode(OpK, $1, $2); }
| relExp { $$ = $1; }
;
relExp
: sumExp relop sumExp { $$ = $2; AddChild($$, $1); AddChild($$, $3); }
| sumExp { $$ = $1; }
;
relop
: GT { $$ = NewExpNode(OpK, $1); }
| GEQ { $$ = NewExpNode(OpK, $1); }
| LT { $$ = NewExpNode(OpK, $1); }
| LEQ { $$ = NewExpNode(OpK, $1); }
| EQ { $$ = NewExpNode(OpK, $1); }
| NEQ { $$ = NewExpNode(OpK, $1); }
;
sumExp
: sumExp sumop mulExp { $$ = $2; AddChild($$,$1); AddChild($$,$3); }
| mulExp { $$ = $1; }
;
sumop
: ADD { $$ = NewExpNode(OpK, $1); }
| MINUS { $$ = NewExpNode(OpK, $1); }
;
mulExp
: mulExp mulop unaryExp { $$ = $2; AddChild($$, $1); AddChild($$, $3); }
| unaryExp { $$ = $1; }
;
mulop
: STAR { $$ = NewExpNode(OpK, $1); }
| DIV { $$ = NewExpNode(OpK, $1); }
| PERCENT { $$ = NewExpNode(OpK, $1); }
;
unaryExp
: unaryop unaryExp { $$ = $1; AddChild($$, $2); }
| factor { $$ = $1; }
;
unaryop
: MINUS { $$ = NewExpNode(OpK, $1); }
| STAR { $$ = NewExpNode(OpK, $1); }
| QMARK { $$ = NewExpNode(OpK, $1); }
;
factor
: mutable { $$ = $1; }
| immutable { $$ = $1; }
mutable
: ID { $$ = NewExpNode(IdK, $1); $$->name = $1->idIndex; }
| ID LBRACKET exp RBRACKET { $$ = NewExpNode(OpK, $2, NewExpNode(IdK, $1), $3); $$->child[0]->name = $1->idIndex; }
;
immutable
: LPAREN exp RPAREN { $$ = $2; }
| call { $$ = $1; }
| constant { $$ = $1; }
;
call
: ID LPAREN args RPAREN { $$ = NewExpNode(CallK, $1, $3); $$->name = $1->idIndex; }
;
args
: argList { $$ = $1; }
| { $$ = NULL; }
;
argList
: argList COMMA exp { $$ = AddSibling($1, $3); $$->name = $2->svalue; }
| exp { $$ = $1; }
;
constant
: NUMCONST { $$ = NewExpNode(ConstantK, $1); $$->expType = Integer; $$->value = $1->nvalue; }
| CHARCONST { $$ = NewExpNode(ConstantK, $1); $$->expType = Char; $$->cvalue = $1->cvalue; }
| STRINGCONST { $$ = NewExpNode(ConstantK, $1); $$->expType = String; $$->string = $1->svalue; }
| BOOLCONST { $$ = NewExpNode(ConstantK, $1); $$->expType = Boolean; $$->value = $1->nvalue;}
;
%%

Related

Yacc shift/reduce that I cannot identify

I have this .y file with which I am trying to parse and evaluate a function with its parameters, but I have one shift/reduce conflict that I cannot identify:
.y
%{
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include "types.h"
#define YYDEBUG 0
/* prototypes */
nodeType *opr(int oper, int nops, ...);
nodeType *id(int i);
nodeType *con(int value);
void freeNode(nodeType *p);
int ex(nodeType *p); /* used in the grammar actions before its definition */
int yylex(void);
void yyerror(char *s);
nodeType *RadEc;
int sym[26]; /* symbol table */
%}
%union {
int iValue; /* integer value */
char sIndex; /* symbol table index */
nodeType *nPtr; /* node pointer */
};
%token <iValue> INTEGER
%token <sIndex> VARIABLE
%token WHILE IF PRINT SUBRAD ENDSUB THEN DO ENDIF RAD
%nonassoc IFX
%nonassoc ELSE
%left GE LE EQ NE '>' '<'
%left '+' '-'
%left '*' '/'
%nonassoc UMINUS
%type <nPtr> statement expr stmt_list
%type <iValue> expresie
%start program
%%
program : declaratii cod { exit(0); }
;
declaratii: SUBRAD stmt_list ENDSUB { RadEc=$2; }
| /* NULL */
;
statement : '\n' { $$ = opr(';', 2, NULL, NULL); }
| expr '\n' { $$ = $1; }
| PRINT expr '\n' { $$ = opr(PRINT, 1, $2); }
| VARIABLE '=' expr '\n' { $$ = opr('=', 2, id($1), $3); }
| DO stmt_list WHILE expr { $$ = opr(WHILE, 2, $4, $2); }
| IF expr THEN stmt_list ENDIF %prec IFX { $$ = opr(IF, 2, $2, $4); }
| IF expr THEN stmt_list ELSE stmt_list ENDIF { $$ = opr(IF, 3, $2, $4, $6); }
;
stmt_list : statement
| stmt_list statement { $$ = opr(';', 2, $1, $2); }
;
expr : INTEGER { $$ = con($1); }
| VARIABLE { $$ = id($1); }
| '-' expr %prec UMINUS { $$ = opr(UMINUS, 1, $2); }
| expr '+' expr { $$ = opr('+', 2, $1, $3); }
| expr '-' expr { $$ = opr('-', 2, $1, $3); }
| expr '*' expr { $$ = opr('*', 2, $1, $3); }
| expr '/' expr { $$ = opr('/', 2, $1, $3); }
| expr '<' expr { $$ = opr('<', 2, $1, $3); }
| expr '>' expr { $$ = opr('>', 2, $1, $3); }
| expr GE expr { $$ = opr(GE, 2, $1, $3); }
| expr LE expr { $$ = opr(LE, 2, $1, $3); }
| expr NE expr { $$ = opr(NE, 2, $1, $3); }
| expr EQ expr { $$ = opr(EQ, 2, $1, $3); }
| '(' expr ')' { $$ = $2; }
;
cod : '.' {exit(0);}
| instruc '\n' cod
;
instruc : '\n'
| PRINT expresie {printf("%d\n",$2);}
| VARIABLE '=' expresie {sym[$1]=$3;}
| RAD'('expresie','expresie','expresie')' {sym[0]=$3; sym[1]=$5; sym[2]=$7; ex(RadEc);}
;
expresie : INTEGER { $$ = $1; }
| VARIABLE { $$ = sym[$1]; }
| '-' expresie %prec UMINUS { $$ = -$2; }
| expresie '+' expresie { $$ = $1+$3; }
| expresie '-' expresie { $$ = $1-$3; }
| expresie '*' expresie { $$ = $1*$3; }
| expresie '/' expresie { $$ = $1/$3; }
| expresie '<' expresie { $$ = $1<$3; }
| expresie '>' expresie { $$ = $1>$3; }
| expresie GE expresie { $$ = $1>=$3; }
| expresie LE expresie { $$ = $1<=$3; }
| expresie NE expresie { $$ = $1!=$3; }
| expresie EQ expresie { $$ = $1==$3; }
| '(' expresie ')' { $$ = $2; }
;
%%
nodeType *con(int value)
{
nodeType *p;
/* allocate node */
if ((p = malloc(sizeof(conNodeType))) == NULL)
yyerror("out of memory");
/* copy information */
p->type = typeCon;
p->con.value = value;
return p;
}
nodeType *id(int i)
{
nodeType *p;
/* allocate node */
if ((p = malloc(sizeof(idNodeType))) == NULL)
yyerror("out of memory");
/* copy information */
p->type = typeId;
p->id.i = i;
return p;
}
nodeType *opr(int oper, int nops, ...)
{
va_list ap;
nodeType *p;
size_t size;
int i;
/* allocate node */
size = sizeof(oprNodeType) + (nops - 1) * sizeof(nodeType*);
if ((p = malloc(size)) == NULL)
yyerror("out of memory");
/* copy information */
p->type = typeOpr;
p->opr.oper = oper;
p->opr.nops = nops;
va_start(ap, nops);
for (i = 0; i < nops; i++)
p->opr.op[i] = va_arg(ap, nodeType*);
va_end(ap);
return p;
}
void freeNode(nodeType *p)
{
int i;
if (!p)
return;
if (p->type == typeOpr) {
for (i = 0; i < p->opr.nops; i++)
freeNode(p->opr.op[i]);
}
free (p);
}
int ex(nodeType *p)
{
if (!p)
return 0;
switch(p->type)
{
case typeCon: return p->con.value;
case typeId: return sym[p->id.i];
case typeOpr: switch(p->opr.oper)
{
case WHILE: while(ex(p->opr.op[0]))
ex(p->opr.op[1]);
return 0;
case IF: if (ex(p->opr.op[0]))
ex(p->opr.op[1]);
else if (p->opr.nops > 2)
ex(p->opr.op[2]);
return 0;
case PRINT: printf("%d\n", ex(p->opr.op[0]));
return 0;
case ';': ex(p->opr.op[0]);
return ex(p->opr.op[1]);
case '=': return sym[p->opr.op[0]->id.i] = ex(p->opr.op[1]);
case UMINUS: return -ex(p->opr.op[0]);
case '+': return ex(p->opr.op[0]) + ex(p->opr.op[1]);
case '-': return ex(p->opr.op[0]) - ex(p->opr.op[1]);
case '*': return ex(p->opr.op[0]) * ex(p->opr.op[1]);
case '/': return ex(p->opr.op[0]) / ex(p->opr.op[1]);
case '<': return ex(p->opr.op[0]) < ex(p->opr.op[1]);
case '>': return ex(p->opr.op[0]) > ex(p->opr.op[1]);
case GE: return ex(p->opr.op[0]) >= ex(p->opr.op[1]);
case LE: return ex(p->opr.op[0]) <= ex(p->opr.op[1]);
case NE: return ex(p->opr.op[0]) != ex(p->opr.op[1]);
case EQ: return ex(p->opr.op[0]) == ex(p->opr.op[1]);
}
}
return 0; /* unknown node type or operator */
}
void yyerror(char *s)
{
fprintf(stdout, "%s\n", s);
}
int main(void)
{
#if YYDEBUG
yydebug = 1;
#endif
yyparse();
return 0;
}
I tried different ways to see where I am going wrong, but I am pretty new at this and still cannot work out where the conflict comes from.
Any help is much appreciated.
Your grammar allows statements to be expressions and it allows two statements to appear in sequence without any separator.
Now, both of the following are expressions:
a
-1
Suppose they appear like that in a statement list. How is that different from this single expression?
a - 1
Ambiguity always shows up as a parsing conflict.
By the way, delimited if statements (with an endif marker) cannot exhibit the dangling else ambiguity. The endif bracket makes the parse unambiguous. So all of the precedence apparatus copied from a different grammar is totally redundant here.
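The two readings of a - 1 can be counted mechanically. The sketch below is a simplified model, not the grammar from the question: it assumes only the rules expr : INTEGER | VARIABLE | '-' expr | expr '-' expr, and a statement list made of one or more expressions with no separator between them. The one-character token encoding and the names is_expr, count_splits and count_statement_lists are made up for this illustration.

```c
#include <string.h>

/* Hypothetical token encoding for this sketch:
   'v' = VARIABLE, 'n' = INTEGER, '-' = minus sign. */

/* Returns 1 if tokens[lo..hi) can be derived from
   expr : 'n' | 'v' | '-' expr | expr '-' expr ; */
static int is_expr(const char *tokens, int lo, int hi)
{
    int k;
    if (hi - lo == 1)
        return tokens[lo] == 'v' || tokens[lo] == 'n';
    if (hi - lo < 2)
        return 0;
    /* unary minus: '-' expr */
    if (tokens[lo] == '-' && is_expr(tokens, lo + 1, hi))
        return 1;
    /* binary minus: expr '-' expr, trying every operator position */
    for (k = lo + 1; k < hi - 1; k++)
        if (tokens[k] == '-' && is_expr(tokens, lo, k) && is_expr(tokens, k + 1, hi))
            return 1;
    return 0;
}

/* Counts the ways tokens[lo..hi) can be cut into consecutive expressions. */
static int count_splits(const char *tokens, int lo, int hi)
{
    int cut, total = 0;
    if (lo == hi)
        return 1; /* nothing left to cut: one way to finish */
    for (cut = lo + 1; cut <= hi; cut++)
        if (is_expr(tokens, lo, cut))
            total += count_splits(tokens, cut, hi);
    return total;
}

/* Number of distinct statement lists (one or more expressions)
   that cover the whole token string. */
int count_statement_lists(const char *tokens)
{
    int n = (int)strlen(tokens);
    return n == 0 ? 0 : count_splits(tokens, 0, n);
}
```

In this model, count_statement_lists("v-n") returns 2: one list holding the single expression a - 1, and one list holding a followed by -1. A parser generator reports exactly this kind of double reading as a conflict.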

Why do I get this error, and how can I fix it?

I am trying to run my first flex bison project and this happens:
aky#aky-VirtualBox:~/wk1$ flex project1.l
aky#aky-VirtualBox:~/wk1$ bison -d project1.y
aky#aky-VirtualBox:~/wk1$ gcc -o project1 project1.c project1.tab.c lex.yy.c
project1.c: In function ‘main’:
project1.c:18:9: warning: implicit declaration of function ‘yyparse’
project1.tab.c:1213:16: warning: implicit declaration of function ‘yylex’
lex.yy.c:(.text+0x470): undefined reference to `lookup'
The related code:
project1.c ----------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "project1.h"
void yyerror(char *s)
{
fprintf(stderr, "error: %s\n", s);
}
int main(int argc, char **argv)
{
extern FILE *yyin;
++argv; --argc;
yyin = fopen(argv[0], "r");
return yyparse();
}
project1.l ------------------------
%option noyywrap nodefault yylineno
%{
#include "project1.h"
#include "project1.tab.h"
%}
EXP ([Ex][-+]?[0-9]+)
%%
".." { return DOTS; }
"+" |
"-" |
"*" |
"/" |
"=" |
"|" |
"," |
";" |
":" |
"." |
"[" |
"]" |
"{" |
"}" |
"(" |
")" { return yytext[0]; }
">" { yylval.fn = 1; return CMP; }
"<" { yylval.fn = 2; return CMP; }
"<>" { yylval.fn = 3; return CMP; }
"==" { yylval.fn = 4; return CMP; }
">=" { yylval.fn = 5; return CMP; }
"<=" { yylval.fn = 6; return CMP; }
"integer" { yylval.type_c = 'a'; return STD_TYPE; }
"real" { yylval.type_c = 'b'; return STD_TYPE; }
"program" { return PROGRAM; }
"var" { return VAR; }
"array" { return ARRAY; }
"of" { return OF; }
"begin" { return BGN; }
"end" { return END; }
"if" { return IF; }
"then" { return THEN; }
"else" { return ELSE; }
"while" {return WHILE; }
"do" { return DO; }
"print" { return PRINT; }
[a-zA-Z][a-zA-Z0-9]* { yylval.s = lookup(yytext); return ID; }
[0-9]+"."[0-9]+ |
[0-9]+ { yylval.d = atof(yytext); return NUMBER; }
"//".*
[ \t\n]
. { yyerror("Mystery character.\n"); }
%%
project1.y ------------------------
%{
#include <stdio.h>
#include <stdlib.h>
#include "project1.h"
%}
%union {
struct ast *a;
double d;
struct symbol *s;
struct symlist *sl;
struct numlist *nl;
int fn;
char type_c;
}
/* declare tokens */
%token <d> NUMBER
%token <s> ID
%token PROGRAM VAR ARRAY OF INTEGER REAL BGN END IF THEN ELSE WHILE DO DOTS PRINT
%token <type_c> STD_TYPE
%nonassoc <fn> CMP
%right '='
%left '+' '-'
%left '*' '/'
%nonassoc '|' UMINUS
%type <a> decl_list decl stmt_list stmt exp
%type <sl> id_list
%type <nl> num_list
%start program
%%
program: PROGRAM ID '(' id_list ')' ';' decl_list BGN stmt_list END '.'
{ printf("new program.\n"); }
;
decl_list: { /*$$ = NULL;*/ }
| decl ';' decl_list { printf("new declaration.\n"); }
;
decl: VAR id_list ':' STD_TYPE { }
| VAR id_list ':' ARRAY '[' NUMBER DOTS NUMBER ']' OF STD_TYPE
{ }
;
stmt: IF exp THEN '{' stmt_list '}' { }
| IF exp THEN '{' stmt_list '}' ELSE '{' stmt_list '}' { }
| WHILE exp DO '{' stmt_list '}' { }
| exp
;
stmt_list: stmt { printf("new statement.\n"); }
| stmt_list ';' stmt { }
;
exp: exp CMP exp { }
| exp '+' exp { }
| exp '-' exp { }
| exp '*' exp { }
| exp '/' exp { }
| '|' exp { }
| '(' exp ')' { }
| '-' exp %prec UMINUS { }
| NUMBER{ }
| ID { }
| ID '[' exp ']' { }
| ID '[' exp ']' '=' exp { }
| ID '=' exp { }
| ID '=' '{' num_list '}' { }
| PRINT '(' exp ')' { }
;
num_list: NUMBER { }
| NUMBER ',' num_list {}
;
id_list: ID { }
| ID ',' id_list { }
;
%%
project1.tab.h --------------------
#ifndef YY_YY_PROJECT1_TAB_H_INCLUDED
# define YY_YY_PROJECT1_TAB_H_INCLUDED
/* Debug traces. */
#ifndef YYDEBUG
# define YYDEBUG 0
#endif
#if YYDEBUG
extern int yydebug;
#endif
/* Token type. */
#ifndef YYTOKENTYPE
# define YYTOKENTYPE
enum yytokentype
{
NUMBER = 258,
ID = 259,
PROGRAM = 260,
VAR = 261,
ARRAY = 262,
OF = 263,
INTEGER = 264,
REAL = 265,
BGN = 266,
END = 267,
IF = 268,
THEN = 269,
ELSE = 270,
WHILE = 271,
DO = 272,
DOTS = 273,
PRINT = 274,
STD_TYPE = 275,
CMP = 276,
UMINUS = 277
};
#endif
/* Value type. */
#if ! defined YYSTYPE && ! defined YYSTYPE_IS_DECLARED
union YYSTYPE
{
#line 7 "project1.y" /* yacc.c:1909 */
struct ast *a;
double d;
struct symbol *s;
struct symlist *sl;
struct numlist *nl;
int fn;
char type_c;
#line 87 "project1.tab.h" /* yacc.c:1909 */
};
typedef union YYSTYPE YYSTYPE;
# define YYSTYPE_IS_TRIVIAL 1
# define YYSTYPE_IS_DECLARED 1
#endif
extern YYSTYPE yylval;
int yyparse (void);
#endif /* !YY_YY_PROJECT1_TAB_H_INCLUDED */
yyparse is declared in project1.tab.h so you need to #include that file in any translation unit which refers to yyparse.
yylex is not declared in any header. In your yacc/bison file, you need to insert a correct declaration:
int yylex(void);
That should go after the #includes.
It's not clear to me which file lookup is defined in, but you need to add that file to your final compilation command.
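A minimal sketch of the declaration point, assuming nothing about the real project beyond what is shown above. The two prototypes stand in for what project1.tab.h and the added line in project1.y would provide; the stub bodies exist only so the sketch compiles on its own (in the real build they come from project1.tab.c and lex.yy.c). The name run_parser is hypothetical.

```c
#include <stdio.h>

/* In the real project this prototype comes from #include "project1.tab.h". */
int yyparse(void);
/* In the real project this line is added to the prologue of project1.y. */
int yylex(void);

/* Stand-in definitions so the sketch links by itself;
   bison and flex generate the real ones. */
int yylex(void)
{
    return 0; /* 0 means end of input */
}

int yyparse(void)
{
    return yylex() == 0 ? 0 : 1; /* 0 means the parse succeeded */
}

/* Hypothetical driver: because yyparse is declared before this point,
   the call compiles without an "implicit declaration" warning. */
int run_parser(void)
{
    return yyparse();
}
```

The same ordering applies to the generated parser: the yylex prototype has to be visible before the generated code that calls it, which is why it belongs in the prologue of the .y file.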

unclear how to add extra productions to bison grammar to create error messages

This is not homework, but it is from a book.
I'm given the following bison spec file:
%{
#include <stdio.h>
#include <ctype.h>
int yylex();
int yyerror();
%}
%token NUMBER
%%
command : exp { printf("%d\n", $1); }
; /* allows printing of the result */
exp : exp '+' term { $$ = $1 + $3; }
| exp '-' term { $$ = $1 - $3; }
| term { $$ = $1; }
;
term : term '*' factor { $$ = $1 * $3; }
| factor { $$ = $1; }
;
factor : NUMBER { $$ = $1; }
| '(' exp ')' { $$ = $2; }
;
%%
int main() {
return yyparse();
}
int yylex() {
int c;
/* eliminate blanks*/
while((c = getchar()) == ' ');
if (isdigit(c)) {
ungetc(c, stdin);
scanf("%d", &yylval);
return (NUMBER);
}
/* makes the parse stop */
if (c == '\n') return 0;
return (c);
}
int yyerror(char * s) {
fprintf(stderr, "%s\n", s);
return 0;
} /* allows for printing of an error message */
The task is to do the following:
Rewrite the spec to add the following useful error messages:
"missing right parenthesis," generated by the string (2+3
"missing left parenthesis," generated by the string 2+3)
"missing operator," generated by the string 2 3
"missing operand," generated by the string (2+)
The simplest solution that I was able to come up with is to do the following:
half_exp : exp '+' { $$ = $1; }
| exp '-' { $$ = $1; }
| exp '*' { $$ = $1; }
;
factor : NUMBER { $$ = $1; }
| '(' exp '\n' { yyerror("missing right parenthesis"); }
| exp ')' { yyerror("missing left parenthesis"); }
| '(' exp '\n' { yyerror("missing left parenthesis"); }
| '(' exp ')' { $$ = $2; }
| '(' half_exp ')' { yyerror("missing operand"); exit(0); }
;
exp : exp '+' term { $$ = $1 + $3; }
| exp '-' term { $$ = $1 - $3; }
| term { $$ = $1; }
| exp exp { yyerror("missing operator"); }
;
These changes work, however they lead to a lot of conflicts.
Here is my question.
Is there a way to rewrite this grammar in such a way so that it wouldn't generate conflicts?
Any help is appreciated.
Yes it is possible:
command : exp { printf("%d\n", $1); }
; /* allows printing of the result */
exp: exp '+' exp {
// code
}
| exp '-' exp {
// code
}
| exp '*' exp {
// code
}
| exp '/' exp {
// code
}
|'(' exp ')' {
// code
}
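Bison allows ambiguous grammars: it resolves the resulting conflicts through precedence and associativity declarations such as %left, or otherwise through its default rules.
I don't see how you can rewrite the grammar to avoid conflicts. You have missed the point of terms, factors, etc.: you use them when you want a left-recursive context-free grammar.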
From this grammar:
E -> E+T
|T
T -> T*F
|F
F -> (E)
|num
Once you remove the left recursion, you get:
E -> TE' { num , ( }
E' -> +TE' { + }
| eps { ) , EOI }
T -> FT' { ( , num }
T' -> *FT' { * }
|eps { + , ) , EOI }
F -> (E) { ( }
|num { num }
The sets beside the rules show which input token must come next for that rule to be used. Of course, this is just an example for simple arithmetic expressions such as 2*(3+4)*5+(3*3*3+4+5*6).
If you want to learn more about this topic, I suggest you read about "left recursion context free grammar". There are some great books covering this topic, including how to compute the input sets.
But as I said above, all of this can be avoided because Bison allows ambiguous grammars.
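The grammar without left recursion maps directly onto a hand-written predictive parser, which may make the role of the input sets easier to see. The following is a rough sketch under stated assumptions, not part of the bison solution: the function names (parse_E, parse_T, parse_F, evaluate) and the global cursor are invented for the example, the tail-recursive rules E' and T' are written as loops, and malformed input is not reported.

```c
#include <ctype.h>

/* Cursor into the input string; a hypothetical global for this sketch. */
static const char *cur;

static int parse_E(void);

static void skip_blanks(void)
{
    while (*cur == ' ')
        cur++;
}

/* F -> ( E ) | num */
static int parse_F(void)
{
    int value = 0;
    skip_blanks();
    if (*cur == '(') {                     /* input set { ( } */
        cur++;
        value = parse_E();
        skip_blanks();
        if (*cur == ')')
            cur++;
        return value;
    }
    while (isdigit((unsigned char)*cur))   /* input set { num } */
        value = value * 10 + (*cur++ - '0');
    return value;
}

/* T -> F T'   and   T' -> * F T' | eps */
static int parse_T(void)
{
    int value = parse_F();
    skip_blanks();
    while (*cur == '*') {                  /* input set { * }; otherwise eps */
        cur++;
        value *= parse_F();
        skip_blanks();
    }
    return value;
}

/* E -> T E'   and   E' -> + T E' | eps */
static int parse_E(void)
{
    int value = parse_T();
    skip_blanks();
    while (*cur == '+') {                  /* input set { + }; otherwise eps */
        cur++;
        value += parse_T();
        skip_blanks();
    }
    return value;
}

/* Evaluates an expression made of numbers, +, * and parentheses. */
int evaluate(const char *text)
{
    cur = text;
    return parse_E();
}
```

With this sketch, evaluate("2*(3+4)*5+(3*3*3+4+5*6)") returns 131. Each branch is chosen by looking at a single upcoming character, which is exactly what the sets next to the rules describe.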

Context-Free Grammar for Custom Programming Language

After having completed the Compiler Design course at my university I have been playing around with making a compiler for a simple programming language, but I'm having trouble with the parser. I'm making the compiler in mosml and using its builtin parser mosmlyac for constructing the parser. Here is an excerpt from my parser showing the grammar and associativity+precedence.
...
%right ASSIGN
%left OR
%left AND
%nonassoc NOT
%left EQUAL LESS
%left PLUS MINUS
%left TIMES DIVIDE
%nonassoc NEGATE
...
Prog : FunDecs EOF { $1 }
;
FunDecs : Fun FunDecs { $1 :: $2 }
| { [] }
;
Fun : Type ID LPAR TypeIds RPAR StmtBlock { FunDec (#1 $2, $1, $4, $6, #2 $2) }
| Type ID LPAR RPAR StmtBlock { FunDec (#1 $2, $1, [], $5, #2 $2) }
;
TypeIds : Type ID COMMA TypeIds { Param (#1 $2, $1) :: $4 }
| Type ID { [Param (#1 $2, $1)] }
;
Type : VOID { Void }
| INT { Int }
| BOOL { Bool }
| CHAR { Char }
| STRING { Array (Char) }
| Type LBRACKET RBRACKET { Array ($1) }
;
StmtBlock : LCURLY StmtList RCURLY { $2 }
;
StmtList : Stmt StmtList { $1 :: $2 }
| { [] }
;
Stmt : Exp SEMICOLON { $1 }
| IF Exp StmtBlock { IfElse ($2, $3, [], $1) }
| IF Exp StmtBlock ELSE StmtBlock { IfElse ($2, $3, $5, $1) }
| WHILE Exp StmtBlock { While ($2, $3, $1) }
| RETURN Exp SEMICOLON { Return ($2, (), $1) }
;
Exps : Exp COMMA Exps { $1 :: $3 }
| Exp { [$1] }
;
Index : LBRACKET Exp RBRACKET Index { $2 :: $4 }
| { [] }
;
Exp : INTLIT { Constant (IntVal (#1 $1), #2 $1) }
| TRUE { Constant (BoolVal (true), $1) }
| FALSE { Constant (BoolVal (false), $1) }
| CHRLIT { Constant (CharVal (#1 $1), #2 $1) }
| STRLIT { StringLit (#1 $1, #2 $1) }
| LCURLY Exps RCURLY { ArrayLit ($2, (), $1) }
| ARRAY LPAR Exp RPAR { ArrayConst ($3, (), $1) }
| Exp PLUS Exp { Plus ($1, $3, $2) }
| Exp MINUS Exp { Minus ($1, $3, $2) }
| Exp TIMES Exp { Times ($1, $3, $2) }
| Exp DIVIDE Exp { Divide ($1, $3, $2) }
| NEGATE Exp { Negate ($2, $1) }
| Exp AND Exp { And ($1, $3, $2) }
| Exp OR Exp { Or ($1, $3, $2) }
| NOT Exp { Not ($2, $1) }
| Exp EQUAL Exp { Equal ($1, $3, $2) }
| Exp LESS Exp { Less ($1, $3, $2) }
| ID { Var ($1) }
| ID ASSIGN Exp { Assign (#1 $1, $3, (), #2 $1) }
| ID LPAR Exps RPAR { Apply (#1 $1, $3, #2 $1) }
| ID LPAR RPAR { Apply (#1 $1, [], #2 $1) }
| ID Index { Index (#1 $1, $2, (), #2 $1) }
| ID Index ASSIGN Exp { AssignIndex (#1 $1, $2, $4, (), #2 $1) }
| PRINT LPAR Exp RPAR { Print ($3, (), $1) }
| READ LPAR Type RPAR { Read ($3, $1) }
| LPAR Exp RPAR { $2 }
;
Prog is the %start symbol and I have left out the %token and %type declaration on purpose.
The problem is that this grammar seems to be ambiguous. Looking at the output of running mosmlyac -v on the grammar, the rules containing the token ID seem to be the problem: they create shift/reduce and reduce/reduce conflicts. The output also tells me that the rule Exp : ID is never reduced.
Can anyone help me make this grammar unambiguous?
Index has an empty production.
Now consider:
Exp : ID
| ID Index
Which of those applies? Since Index is allowed to be empty, there is no context in which only one of those is applicable. The parser generator you are using evidently prefers to reduce an empty Index, making Exp : ID unusable and creating a large number of conflicts.
I'd suggest changing Index to:
Index : LBRACKET Exp RBRACKET Index { $2 :: $4 }
| LBRACKET Exp RBRACKET { [ $2 ] }
although in the long run, you might be better off with a more traditional "lvalue/rvalue" grammar, in which lvalue includes ID and lvalue [ Exp ], and rvalue includes lvalue. (That will give a more elaborate parse tree for ID [ Exp ] [ Exp ], but there is an obvious homomorphism.)

noob wants to make a parser for a small language

I want to make a parser in Happy for the let-in-expression language. For example, I want to parse the following string:
let x = 4 in x*x
At university we are studying attribute grammars, and I want to use this technique to calculate the value of the parsed let-in expression directly. So in the Happy file, I set the data type of the parsing function to Int, and I created a new attribute called env. This attribute is a function from String to Int that associates a variable name with its value. Referring to my example:
env "x" = 4
Below is the Happy file containing my grammar:
{
module Parser where
import Token
import Lexer
}
%tokentype { Token }
%token
let { TLet }
in { TIn }
int { TInt $$ }
var { TVar $$ }
'=' { TEq }
'+' { TPlus }
'-' { TMinus }
'*' { TMul }
'/' { TDiv }
'(' { TOB }
')' { TCB }
%name parse
%attributetype { Int }
%attribute env { String -> Int }
%error { parseError }
%%
Exp : let var '=' Exp in Exp
{
$4.env = $$.env;
$2.env = (\_ -> 0);
$6.env = (\str -> if str == $2 then $4 else 0);
$$ = $6;
}
| Exp1
{
$1.env = $$.env;
$$ = $1;
}
Exp1 : Exp1 '+' Term
{
$1.env = $$.env;
$2.env = $$.env;
$$ = $1 + $3;
}
| Exp1 '-' Term
{
$1.env = $$.env;
$2.env = $$.env;
$$ = $1 - $3;
}
| Term
{
$1.env = $$.env;
$$ = $1;
}
Term : Term '*' Factor
{
$1.env = $$.env;
$2.env = $$.env;
$$ = $1 * $3;
}
| Term '/' Factor
{
$1.env = $$.env;
$2.env = $$.env;
$$ = div $1 $3;
}
| Factor
{
$1.env = $$.env;
$$ = $1;
}
Factor
: int
{
$$ = $1;
}
| var
{
$$ = $$.env $1;
}
| '(' Exp ')'
{
$1.env = $$.env;
$$ = $1;
}
{
parseError :: [Token] -> a
parseError _ = error "Parse error"
}
When I load the Haskell file generated from the Happy file above, I get the following error:
Ambiguous occurrence `Int'
It could refer to either `Parser.Int', defined at parser.hs:271:6
or `Prelude.Int',
imported from `Prelude' at parser.hs:2:8-13
(and originally defined in `GHC.Types')
I don't know why I get this, because I don't define the type Parser.Int in my Happy file. I tried to replace Int with Prelude.Int, but then I get other errors.
How can I resolve this? I would also welcome some general tips if I'm doing something suboptimal.
See the Happy explanation of attributetype: http://www.haskell.org/happy/doc/html/sec-AtrributeGrammarsInHappy.html
Your line:
%attributetype { Int }
is declaring a type named Int. This is what causes the ambiguity. Pick a name for the attribute type that does not clash with a Prelude type.
