ANTLRWorks compiling - antlrworks

I am using ANTLRWorks 1.5 and am trying to compile and debug. Sometimes I get the following error while compiling:
[11:44:52] TODO: run and send output to this console
[11:44:56] C:\Antlr\output\tryParser.java:102: error: <identifier> expected
[11:44:56] public final tryParser.poly_return poly() throws {
[11:44:56] ^
[11:44:56] C:\Antlr\output\tryParser.java:215: error: <identifier> expected
[11:44:56] public final tryParser.term_return term() throws {
[11:44:56] ^
[11:44:56] C:\Antlr\output\tryParser.java:504: error: <identifier> expected
[11:44:56] public final tryParser.exp_return exp() throws {
[11:44:56] ^
[11:44:56] 3 errors
Can anyone tell me what this error means?
Sometimes it compiles without any problem but I always get the 'timeout' error in debugging.
My code is:
grammar try;
options {output=AST;}
tokens { MULT; } // imaginary token
poly: term ('+'^ term)*
;
term: INT ID -> ^(MULT["*"] INT ID)
| INT exp -> ^(MULT["*"] INT exp)
| exp
| INT
| ID
;
exp : ID '^'^ INT
;
ID : 'a'..'z'+ ;
INT : '0'..'9'+ ;
WS : (' '|'\t'|'\r'|'\n')+ {skip();} ;

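The generated Java contains an empty throws clause (throws { with no exception type after it), which is why javac reports "<identifier> expected". This is a known code-generation problem in ANTLRWorks 1.5; you need to update to ANTLRWorks 1.5.2.
See Issue #5: ANTLRworks fails to generate proper Java Code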

Related

Overriding yylex() in Bison's C++ API

I have a handwritten parser in C++ which contains a next_token() method, and I want to use it inside yylex(). (I already did that correctly with Bison's C API, but I want to use dynamic types, so I moved to C++.) I read the relevant parts of the documentation, read the examples, and tried both existing signatures, but I still can't get it to work.
parser.yy:
%require "3.2"
%language "c++"
%define api.value.type variant
%define api.token.constructor
%define parse.assert
%code requires
{
#include <iostream>
#include <string>
}
%code{
yy::parser::symbol_type yy::parser::yylex();
void yy::parser::yyerror(const char *error);
}
%token VAR COL ITYPE
%token IDENTIFIER
%token INTEGER
%token EOL
%type <std::string> type PrimitiveType IDENTIFIER
%type <int> INTEGER
%%
program:
| program EOL
| program SimpleDeclaration { }
;
SimpleDeclaration: VariableDeclaration
;
VariableDeclaration: VAR IDENTIFIER COL type {std::cout<<"defined variable " << $2 << " with type " << $4 << std::endl; }
type: IDENTIFIER
| PrimitiveType
;
PrimitiveType: ITYPE { $$ = "int"; }
;
%%
void yy::parser::yyerror(const std::string& m)
{
std::cout << "syntax error" << std::endl;
}
// also 'yy::parser::symbol_type yy::parser::yylex(semantic_type* yylval)' does the same
int yy::parser::yylex(semantic_type* yylval)
{
return 0; // just returning a zero for now
}
int main()
{
yy::parser p;
return 0;
}
Errors:
bison -d parser.ypp
g++ -std=c++17 -o foobar parser.tab.cpp
parser.ypp:14:29: error: no declaration matches ‘yy::parser::symbol_type yy::parser::yylex()’
14 | yy::parser::symbol_type yy::parser::yylex();
| ^~
parser.ypp:14:29: note: no functions named ‘yy::parser::symbol_type yy::parser::yylex()’
In file included from parser.tab.cpp:41:
parser.tab.hpp:193:9: note: ‘class yy::parser’ defined here
193 | class parser
| ^~~~~~
parser.ypp:15:10: error: no declaration matches ‘void yy::parser::yyerror(const char*)’
15 | void yy::parser::yyerror(const char *error);
| ^~
parser.ypp:15:10: note: no functions named ‘void yy::parser::yyerror(const char*)’
In file included from parser.tab.cpp:41:
parser.tab.hpp:193:9: note: ‘class yy::parser’ defined here
193 | class parser
| ^~~~~~
parser.tab.cpp: In member function ‘virtual int yy::parser::parse()’:
parser.tab.cpp:453:38: error: ‘yylex’ was not declared in this scope; did you mean ‘yylen’?
453 | symbol_type yylookahead (yylex ());
| ^~~~~
| yylen
parser.ypp: At global scope:
parser.ypp:45:6: error: no declaration matches ‘void yy::parser::yyerror(const string&)’
45 | void yy::parser::yyerror(const std::string& m)
| ^~
parser.ypp:45:6: note: no functions named ‘void yy::parser::yyerror(const string&)’
In file included from parser.tab.cpp:41:
parser.tab.hpp:193:9: note: ‘class yy::parser’ defined here
193 | class parser
| ^~~~~~
parser.ypp:50:5: error: no declaration matches ‘int yy::parser::yylex(yy::parser::semantic_type*)’
50 | int yy::parser::yylex(semantic_type* yylval)
| ^~
parser.ypp:50:5: note: no functions named ‘int yy::parser::yylex(yy::parser::semantic_type*)’
In file included from parser.tab.cpp:41:
parser.tab.hpp:193:9: note: ‘class yy::parser’ defined here
193 | class parser
| ^~~~~~
It also seems odd that, even without any override, the generated parser calls yylex here (so this error appears even when I don't try to override it):
parser.tab.cpp: In member function ‘virtual int yy::parser::parse()’:
parser.tab.cpp:453:38: error: ‘yylex’ was not declared in this scope; did you mean ‘yylen’?
453 | symbol_type yylookahead (yylex ());
| ^~~~~
| yylen
What did I do wrong here?
Thanks in advance

How to use antlr4 to analyze the grammar of .aidl files?

I was assigned to analyze the grammar of .aidl files and extract the grammar elements using listener methods.
After a lot of thought, I came up with this .g4 file:
grammar aidl3;
file : pack* imp* parcelable? interfa? ;
pack : 'package' WS+ PAC_NAME WS* ';' WS* ;
imp : 'import' WS+ IMP_NAME WS* ';' WS* ;
parcelable : 'parcelable' WS+ PARCE_NAME WS* ';' WS* ;
interfa : INTER_TAG? WS* 'interface' WS+ INTER_NAME WS* '{' WS* methods+ WS* '}' WS*;
methods : RETURN_TYPE WS+ METHOD_NAME WS* '(' WS* argmentsa* WS* argmentsb* WS* ')' WS* ';' WS* ;
argmentsa : TAG? WS* ARG_TYPE WS+ ARG_NAME WS* ',' WS* ;
argmentsb : TAG? WS* ARG_TYPE WS+ ARG_NAME WS* ;
PAC_NAME : ~[; \n\r]+ ;
//PAC_NAME : [_a-zA-Z] [_.a-zA-Z0-9]* ;
IMP_NAME : ~[ ;\n\r]+ ;
PARCE_NAME : ~[ ;\n\r.]+ ;
INTER_TAG : 'oneway';
INTER_NAME : ~[ ;\n\r{.]+ ;
RETURN_TYPE : ~[ ;\n\r.]+ ;
METHOD_NAME : ~[ ;\n\r(]+ ;
TAG : 'in' | 'out' | 'inout' ;
//ARG_TYPE : ~[) ,\n\r]+ ;
ARG_TYPE : [a-zA-Z] ~' '* | [a-zA-Z] ~' '* ' ' '[' ']' ;
ARG_NAME : ~[ ,\n\r).]+ ;
WS: [ \t\n\r];
However, I've run into a weird problem. When I try to analyze an .aidl file such as:
package android.view.accessibility;
oneway interface IAccessibilityInteractionConnection {
void findAccessibilityNodeInfoByAccessibilityId(long accessibilityNodeId, in Region bounds,
int interactionId, IAccessibilityInteractionConnectionCallback callback, int flags,
int interrogatingPid, long interrogatingTid, in MagnificationSpec spec);
void findAccessibilityNodeInfosByViewId(long accessibilityNodeId, String viewId,
in Region bounds, int interactionId, IAccessibilityInteractionConnectionCallback callback,
int flags, int interrogatingPid, long interrogatingTid, in MagnificationSpec spec);
void findAccessibilityNodeInfosByText(long accessibilityNodeId, String text, in Region bounds,
int interactionId, IAccessibilityInteractionConnectionCallback callback, int flags,
int interrogatingPid, long interrogatingTid, in MagnificationSpec spec);
void findFocus(long accessibilityNodeId, int focusType, in Region bounds, int interactionId,
IAccessibilityInteractionConnectionCallback callback, int flags, int interrogatingPid,
long interrogatingTid, in MagnificationSpec spec);
void focusSearch(long accessibilityNodeId, int direction, in Region bounds, int interactionId,
IAccessibilityInteractionConnectionCallback callback, int flags, int interrogatingPid,
long interrogatingTid, in MagnificationSpec spec);
void performAccessibilityAction(long accessibilityNodeId, int action, in Bundle arguments,
int interactionId, IAccessibilityInteractionConnectionCallback callback, int flags,
int interrogatingPid, long interrogatingTid);
}
it would give this output:
[#0,0:6='package',<'package'>,1:0]
[#1,7:7=' ',<WS>,1:7]
[#2,8:41='android.view.accessibility;\noneway',<ARG_TYPE>,1:8]
[#3,42:42=' ',<WS>,2:6]
[#4,43:51='interface',<'interface'>,2:7]
[#5,52:52=' ',<WS>,2:16]
[#6,53:87='IAccessibilityInteractionConnection',<PAC_NAME>,2:17]
[#7,88:88=' ',<WS>,2:52]
[#8,89:89='{',<'{'>,2:53]
[#9,90:90='\n',<WS>,2:54]
[#10,91:91=' ',<WS>,3:0]
[#11,92:92=' ',<WS>,3:1]
[#12,93:93=' ',<WS>,3:2]
[#13,94:94=' ',<WS>,3:3]
[#14,95:98='void',<PAC_NAME>,3:4]
[#15,99:99=' ',<WS>,3:8]
[#16,100:146='findAccessibilityNodeInfoByAccessibilityId(long',<PAC_NAME>,3:9]
[#17,147:147=' ',<WS>,3:56]
[#18,148:167='accessibilityNodeId,',<PAC_NAME>,3:57]
...
Look at the third line of the output, [#2,8:41='android.view.accessibility;\noneway',<ARG_TYPE>,1:8]: while matching the rule 'pack', the text 'android.view.accessibility;\noneway' was tokenized as ARG_TYPE. How can that be? ARG_TYPE never appears in the rule 'pack', which should have used PAC_NAME to match 'android.view.accessibility'.
It would be nice if someone could help me figure this thing out, because I'm facing a close deadline.
I'm new to this and I know that my .g4 file doesn't look good, so if possible, could you please tell me how to write the grammar for .aidl in a better way, or even show me the right answer?
I would be really grateful if you can help me! Thanks!
ANTLR's lexer tries to create tokens with as many characters as possible. Since ARG_TYPE is able to match android.view.accessibility;\noneway (and no other rule can match more characters), an ARG_TYPE token is created. Only when two or more rules match the same number of characters will ANTLR choose the one defined first.
You must understand that the lexer does not create tokens based on what the parser is trying to match. Tokenisation is done independently of the parsing phase. Therefore, most of your rules that look like ~[ ;\n\r(]+ are way too broad.
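The longest-match behaviour can be reproduced outside ANTLR. The following is a minimal sketch in Python, not ANTLR's actual implementation: the rule list, the rule name PACKAGE, and the helper names next_token and tokenize are assumptions made for illustration, and the patterns are simplified stand-ins for the rules in the question. It tries every rule at the current position, keeps the longest match, and lets the earliest-defined rule win a tie.

```python
import re

# Hypothetical, simplified lexer rules in definition order.
# PACKAGE stands in for the implicit 'package' literal token (assumed to come first).
RULES = [
    ("PACKAGE", re.compile(r"package")),
    ("PAC_NAME", re.compile(r"[^; \n\r]+")),     # like ~[; \n\r]+
    ("ARG_TYPE", re.compile(r"[a-zA-Z][^ ]*")),  # like [a-zA-Z] ~' '*
    ("WS", re.compile(r"[ \t\n\r]")),
]


def next_token(text, pos):
    """Return (rule_name, lexeme) for the longest match starting at pos."""
    best = None
    for name, pattern in RULES:
        match = pattern.match(text, pos)
        # Only a strictly longer match replaces the current best,
        # so on a tie the rule defined first is kept.
        if match and (best is None or len(match.group()) > len(best[1])):
            best = (name, match.group())
    if best is None:
        raise ValueError(f"no rule matches at position {pos}")
    return best


def tokenize(text):
    """Split text into tokens without any knowledge of a parser."""
    tokens = []
    pos = 0
    while pos < len(text):
        name, lexeme = next_token(text, pos)
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens


tokens = tokenize("package android.view.accessibility;\noneway")
for name, lexeme in tokens:
    print(name, repr(lexeme))
```

In this sketch, 'package' is matched by three rules with the same length, so the first one (PACKAGE) wins. At 'android...', PAC_NAME stops at the ';' while ARG_TYPE runs on through the newline up to the next space, so ARG_TYPE produces the longer token, just as in the output shown in the question.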
I suggest you take a look at an existing Java grammar, and use that in order to work with AIDL files.
EDIT
If I take the grammar file posted above, and change:
formalParameter
: variableModifier* unannType variableDeclaratorId
;
into:
formalParameter
: 'in'? variableModifier* unannType variableDeclaratorId
;
and change:
interfaceModifier
: annotation
| 'public'
| 'protected'
| 'private'
| 'abstract'
| 'static'
| 'strictfp'
;
into:
interfaceModifier
: annotation
| 'public'
| 'protected'
| 'private'
| 'abstract'
| 'static'
| 'strictfp'
| 'oneway'
;
then your example file parses correctly.

Parser to verify declarations of type int and float in C language

I'm trying to write a parser to verify declarations of type int and float in C.
It should accept variable declarations, pointer variable declarations, and arrays of any dimension, for example:
float a , b , r = 5, area = r * r , * b;
int a , b , c , ** p ;
int x , mat [2][3];
This is my lex file
%{
#include "y.tab.h"
extern int yylval;
%}
%%
"int" return INT;
"float" return FLOAT;
[0-9]+ return NUM;
[_|a-z|A-Z]([_|a-z|A-Z|0-9])*{1,255} return NAME;
[+\-*/] return op;
[ \t\n];
. return yytext[0];
%%
This is my yacc file
%{
#include<stdio.h>
int yylex() ;
int yyerror();
%}
%token NUM NAME op INT FLOAT
%%
stmt_list: stmt | stmt_list stmt;
stmt: type id_list ';' { printf("Valid Declaration\n"); };
type: INT | FLOAT;
id_list: id ',' id_list | id ;
id: NAME'='expr | expr;
expr: expr op expr | POINT expr | expr MATRIX | '(' expr')' | NAME;
MATRIX: '[' NUM ']' | '[' NUM ']' MATRIX ;
POINT: '*' | '*'POINT;
%%
int main(){
yyparse();
return 0;
}
int yyerror(){
printf("Invalid Declaration\n");
return -1;
}
Even if I enter "int a;" as input, I get "Invalid Declaration". I'm not able to figure out what I'm doing wrong.

How to write yacc grammar rules to identify function definitions vs function calls?

I have started learning about YACC, and I have executed a few examples of simple toy programs. But I have never seen a practical example that demonstrates how to build a compiler that identifies and implements function definitions and function calls, array implementation and so on, nor has it been easy to find an example using Google search. Can someone please provide one example of how to generate the tree using YACC? C or C++ is fine.
Thanks in advance!
Let's parse some C code with yacc.
The file test contains valid C code that we want to parse:
int main (int c, int b) {
int a;
while ( 1 ) {
int d;
}
}
A lex file, c.l:
alpha [a-zA-Z]
digit [0-9]
%%
[ \t] ;
[ \n] { yylineno = yylineno + 1;}
int return INT;
float return FLOAT;
char return CHAR;
void return VOID;
double return DOUBLE;
for return FOR;
while return WHILE;
if return IF;
else return ELSE;
printf return PRINTF;
struct return STRUCT;
^"#include ".+ ;
{digit}+ return NUM;
{alpha}({alpha}|{digit})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
">" return GT;
"<" return LT;
"." return DOT;
\/\/.* ;
\/\*(.*\n)*.*\*\/ ;
. return yytext[0];
%%
The file c.y, the input to yacc:
%{
#include <stdio.h>
#include <stdlib.h>
extern FILE *fp;
%}
%token INT FLOAT CHAR DOUBLE VOID
%token FOR WHILE
%token IF ELSE PRINTF
%token STRUCT
%token NUM ID
%token INCLUDE
%token DOT
%right '='
%left AND OR
%left '<' '>' LE GE EQ NE LT GT
%%
start: Function
| Declaration
;
/* Declaration block */
Declaration: Type Assignment ';'
| Assignment ';'
| FunctionCall ';'
| ArrayUsage ';'
| Type ArrayUsage ';'
| StructStmt ';'
| error
;
/* Assignment block */
Assignment: ID '=' Assignment
| ID '=' FunctionCall
| ID '=' ArrayUsage
| ArrayUsage '=' Assignment
| ID ',' Assignment
| NUM ',' Assignment
| ID '+' Assignment
| ID '-' Assignment
| ID '*' Assignment
| ID '/' Assignment
| NUM '+' Assignment
| NUM '-' Assignment
| NUM '*' Assignment
| NUM '/' Assignment
| '\'' Assignment '\''
| '(' Assignment ')'
| '-' '(' Assignment ')'
| '-' NUM
| '-' ID
| NUM
| ID
;
/* Function Call Block */
FunctionCall : ID'('')'
| ID'('Assignment')'
;
/* Array Usage */
ArrayUsage : ID'['Assignment']'
;
/* Function block */
Function: Type ID '(' ArgListOpt ')' CompoundStmt
;
ArgListOpt: ArgList
|
;
ArgList: ArgList ',' Arg
| Arg
;
Arg: Type ID
;
CompoundStmt: '{' StmtList '}'
;
StmtList: StmtList Stmt
|
;
Stmt: WhileStmt
| Declaration
| ForStmt
| IfStmt
| PrintFunc
| ';'
;
/* Type Identifier block */
Type: INT
| FLOAT
| CHAR
| DOUBLE
| VOID
;
/* Loop Blocks */
WhileStmt: WHILE '(' Expr ')' Stmt
| WHILE '(' Expr ')' CompoundStmt
;
/* For Block */
ForStmt: FOR '(' Expr ';' Expr ';' Expr ')' Stmt
| FOR '(' Expr ';' Expr ';' Expr ')' CompoundStmt
| FOR '(' Expr ')' Stmt
| FOR '(' Expr ')' CompoundStmt
;
/* IfStmt Block */
IfStmt : IF '(' Expr ')'
Stmt
;
/* Struct Statement */
StructStmt : STRUCT ID '{' Type Assignment '}'
;
/* Print Function */
PrintFunc : PRINTF '(' Expr ')' ';'
;
/*Expression Block*/
Expr:
| Expr LE Expr
| Expr GE Expr
| Expr NE Expr
| Expr EQ Expr
| Expr GT Expr
| Expr LT Expr
| Assignment
| ArrayUsage
;
%%
#include"lex.yy.c"
#include<ctype.h>
int count=0;
int main(int argc, char *argv[])
{
yyin = fopen(argv[1], "r");
if(!yyparse())
printf("\nParsing complete\n");
else
printf("\nParsing failed\n");
fclose(yyin);
return 0;
}
yyerror(char *s) {
printf("%d : %s %s\n", yylineno, s, yytext );
}
A Makefile to put it together. I use flex-lexer and bison but the example will also work with lex and yacc.
miniC: c.l c.y
bison c.y
flex c.l
gcc c.tab.c -ll -ly
Compile and parse the test code:
$ make
bison c.y
flex c.l
gcc c.tab.c -ll -ly
c.tab.c: In function ‘yyparse’:
c.tab.c:1273:16: warning: implicit declaration of function ‘yylex’ [-Wimplicit-function-declaration]
yychar = yylex ();
^
c.tab.c:1402:7: warning: implicit declaration of function ‘yyerror’ [-Wimplicit-function-declaration]
yyerror (YY_("syntax error"));
^
c.y: At top level:
c.y:155:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
yyerror(char *s) {
^
$ ls
a.out c.l CMakeLists.txt c.tab.c c.y lex.yy.c Makefile README.md test
$ ./a.out test
Parsing complete
For reading resources I can recommend the books Modern Compiler Implementation in C by Andrew Appel and the flex/bison book by John Levine.

Error "Cannot find a constructor"

I'm currently trying Rascal to create a small DSL. I tried to modify the Pico example, but I'm stuck. The following code parses examples like a = 3, b = 7 begin declare x : natural, field real # cells blubb; x := 5.7 end perfectly, but the implode function fails with the error message "Cannot find a constructor for PROGRAM". I tried various constructor declarations, but none seemed to fit. Is there a way to see what the expected constructor looks like?
Syntax:
module BlaTest::Syntax
import Prelude;
lexical Identifier = [a-z][a-z0-9]* !>> [a-z0-9];
lexical NaturalConstant = [0-9]+;
lexical IntegerConstant = [\-+]? NaturalConstant;
lexical RealConstant = IntegerConstant "." NaturalConstant;
lexical StringConstant = "\"" ![\"]* "\"";
layout Layout = WhitespaceAndComment* !>> [\ \t\n\r%];
lexical WhitespaceAndComment
= [\ \t\n\r]
| @category="Comment" "%" ![%]+ "%"
| @category="Comment" "%%" ![\n]* $
;
start syntax Program
= program: {ExaOption ","}* exadomain "begin" Declarations decls {Statement ";"}* body "end"
;
syntax Domain = "domain" "{" ExaOption ", " exaoptions "}"
;
syntax ExaOption = Identifier id "=" Expression val
;
syntax Declarations
= "declare" {Declaration ","}* decls ";" ;
syntax Declaration
= variable_declaration: Identifier id ":" Type tp
| field_declaration: "field" Type tp "#" FieldLocation fieldLocation Identifier id
;
syntax FieldLocation
= exacell: "cells"
| exanode: "nodes"
;
syntax Type
= natural:"natural"
| exareal: "real"
| string :"string"
;
syntax Statement
= asgStat: Identifier var ":=" Expression val
| ifElseStat: "if" Expression cond "then" {Statement ";"}* thenPart "else" {Statement ";"}* elsePart "fi"
| whileStat: "while" Expression cond "do" {Statement ";"}* body "od"
;
syntax Expression
= id: Identifier name
| stringConstant: StringConstant stringconstant
| naturalConstant: NaturalConstant naturalconstant
| realConstant: RealConstant realconstant
| bracket "(" Expression e ")"
> left conc: Expression lhs "||" Expression rhs
> left ( add: Expression lhs "+" Expression rhs
| sub: Expression lhs "-" Expression rhs
)
;
public start[Program] program(str s) {
return parse(#start[Program], s);
}
public start[Program] program(str s, loc l) {
return parse(#start[Program], s, l);
}
Abstract:
module BlaTest::Abstract
public data TYPE = natural() | string() | exareal();
public data FIELDLOCATION = exacell() | exanode();
public alias ExaIdentifier = str;
public data PROGRAM = program(list[OPTION] exadomain, list[DECL] decls, list[STATEMENT] stats);
public data DOMAIN
= domain_declaration(list[OPTION] options)
;
public data OPTION
= exaoption(ExaIdentifier name, EXP exp)
;
public data DECL
= variable_declaration(ExaIdentifier name, TYPE tp)
| field_declaration(TYPE tp, FIELDLOCATION fieldlocation, ExaIdentifier name)
;
public data EXP
= id(ExaIdentifier name)
| naturalConstant(int iVal)
| stringConstant(str sVal)
| realConstant(real rVal)
| add(EXP left, EXP right)
| sub(EXP left, EXP right)
| conc(EXP left, EXP right)
;
public data STATEMENT
= asgStat(ExaIdentifier name, EXP exp)
| ifElseStat(EXP exp, list[STATEMENT] thenpart, list[STATEMENT] elsepart)
| whileStat(EXP exp, list[STATEMENT] body)
;
anno loc TYPE@location;
anno loc PROGRAM@location;
anno loc DECL@location;
anno loc EXP@location;
anno loc STATEMENT@location;
anno loc OPTION@location;
public alias Occurrence = tuple[loc location, ExaIdentifier name, STATEMENT stat];
Load:
module BlaTest::Load
import IO;
import Exception;
import Prelude;
import BlaTest::Syntax;
import BlaTest::Abstract;
import BlaTest::ControlFlow;
import BlaTest::Visualize;
public PROGRAM exaload(str txt) {
PROGRAM p;
try {
p = implode(#PROGRAM, parse(#Program, txt));
} catch ParseError(loc l): {
println("Parse error at line <l.begin.line>, column <l.begin.column>");
}
return p; // return will fail in case of error
}
public Program exaparse(str txt) {
Program p;
try {
p = parse(#Program, txt);
} catch ParseError(loc l): {
println("Parse error at line <l.begin.line>, column <l.begin.column>");
}
return p; // return will fail in case of error
}
Thanks a lot,
Chris
Unfortunately the current implode facility depends on a hidden semantic assumption, namely that the non-terminals in the syntax definition have the same name as the types in the data definitions. So if the non-terminal is called "Program", it should not be called "PROGRAM" but "Program" in the data definition.
We are looking for a smoother way of integrating concrete and abstract syntax trees, but for now please decapitalize your data names.

Resources