Xtext to Acceleo

I have an Xtext grammar for expressions like this:
expr : RelExp ( {LogicExp.args+=current} op=LO args+=RelExp)* ;
RelExp returns expr : ArithExp ( {RelExp.args+=current} op=RO args+=ArithExp)* ;
ArithExp returns expr : Term ( {ArithExp.args+=current} op=AO1 args+=Term)* ;
Term returns expr : Factor ( {Term.args+=current} op=AO2 args+=Factor)* ;
Factor returns expr : Atom ({PostfixOp.arg=current} uo=UO)?
| {PrefixOp} uo=UO arg=Atom ;
Atom returns expr : Literal
| {Parenteses} '(' exp=expr ')'
| lValue ;
lValue returns expr : {Var} valor=ID (
({FuncCall.def=current} '(' arg=Argument? ')') |
({FieldAccess.obj=current} '.' field=ID) |
({ArrayAccess.arr=current} '[' index=expr ']')
)*
| PointerExp ;
PointerExp : {PointerExp} '**' '(' exp=expr ')' ;
//Case : 'case' val=Atom ':' (commands+=Command)* ;
//Type : tipo=TYPELIT ('[' exp=expr? ']')?;
Literal : {IntLit} val=NUMBER | {TrueLit} val='TRUE' | {FalseLit} val='FALSE' | {StrLit} val=STRING;
I am trying to write Acceleo code to print an expression, but every time I write [stat.exp/] in Acceleo it prints org.xtext.example.scldsl.sclDsl.impl.TrueLitImpl#67af833b(val: 0).
I need only (val: 0).
Can anyone please help?

When you call [stat.exp/] in Acceleo, it adds an implicit toString() to obtain a String representation from your AST element.
If you want to use your Xtext grammar to obtain a String representation, you will need to find a way to use the Xtext serializer generated for your DSL.
As a first step, add a Java service to your Acceleo module. Implement it so that it takes an AST element (probably EObject, or a common super-type in your metamodel if you have one), has access to the Xtext serializer, and returns the serialized text of that element.
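A minimal sketch of what such a service could look like. The class name, the method name and the setSerializer hook are hypothetical, and the Function field is only a stand-in so that the example runs without Xtext on the classpath. In a real project it would delegate to Xtext's ISerializer.serialize(EObject), with the serializer taken from the injector of your DSL.

```java
import java.util.function.Function;

/**
 * Sketch of an Acceleo Java service. Class and method names are hypothetical.
 * A real service class would be public and live in the generator plug-in.
 */
final class ExpressionTextService {

    // Stand-in for the Xtext serializer (assumption). A real implementation
    // would call org.eclipse.xtext.serializer.ISerializer#serialize(EObject).
    private static Function<Object, String> serializer = String::valueOf;

    /** Lets the generator launcher plug in the real serializer once at start-up. */
    static void setSerializer(Function<Object, String> newSerializer) {
        serializer = newSerializer;
    }

    /** Service method: returns the text of a model element instead of its toString(). */
    public String toText(Object element) {
        return element == null ? "" : serializer.apply(element);
    }

    public static void main(String[] args) {
        // A fake serializer standing in for the Xtext one, for demonstration only.
        setSerializer(element -> "TRUE");
        System.out.println(new ExpressionTextService().toText(new Object()));
    }
}
```

In the template you would then call the service on stat.exp instead of printing the element directly, so the implicit toString() is never used.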

Related

Problems with precedence in ANTLR4 Grammar

I developed this small grammar, which I have an issue with:
grammar test;
term : above_term | below_term;
above_term :
<assoc=right> 'forall' binders ',' forall_term
| <assoc=right> above_term '->' above_term
| <assoc=right> above_term '->' below_term
| <assoc=right> below_term '->' above_term
| <assoc=right> below_term '->' below_term
;
below_term :
<assoc = right> below_term arg (arg)*
| '#' qualid (term)*
| below_term '%' IDENT
| qualid
| sort
| '(' term ')'
;
forall_term : term;
arg : term| '(' IDENT ':=' term ')';
binders : binder (binder)*;
binder : name |<assoc=right>name (name)* ':' term | '(' name (name)* ':' term ')' |<assoc=right> name (':' term)? ':=' term;
name : IDENT | '_';
qualid : IDENT | qualid ACCESS_IDENT;
sort : 'Prop' | 'Set' | 'Type' ;
/**************************************
* LEXER RULES
**************************************/
/*
* STRINGS
*/
STRING : '"' (~["])* '"';
/*
* IDENTIFIER AND ACCESS IDENTIFIER
*/
ACCESS_IDENT : '.' IDENT;
IDENT : FIRST_LETTER (SUBSEQUENT_LETTER)*;
fragment FIRST_LETTER : [a-z] | [A-Z] | '_' | UNICODE_LETTER;
fragment SUBSEQUENT_LETTER : [a-z] | [A-Z] | DIGIT | '_' | '"' | UNICODE_LETTER | UNICODE_ID_PART;
fragment UNICODE_LETTER : '\\' 'u' HEX HEX HEX HEX;
fragment UNICODE_ID_PART : '\\' 'u' HEX HEX HEX HEX;
fragment HEX : [0-9a-fA-F];
/*
* NATURAL NUMBERS AND INTEGERS
*/
NUM : DIGIT (DIGIT)*;
INTEGER : ('-')? NUM;
fragment DIGIT : [0-9];
WS : [ \n\t\r] -> skip;
You can copy this grammar and test it with ANTLR if you want; it will work. Now for my question:
Consider an expression like this: a b -> c d -> forall n:nat, c.
According to my grammar, the ("->") rule (right after the forall rule) has the highest precedence.
I therefore want this term to be parsed so that both ("->") rules are at the top of the parse tree, like this (please note that this is an abstract view; I know that there are many above and below terms between the leaves):
Sadly, it doesn't get parsed this way but this way:
How come the parser doesn't put both (->) rules at the top of the parse tree? Is this a precedence issue?
Changing term to below_term in the arg rule fixes the problem:
arg : below_term | '(' IDENT ':=' term ')';
Take the expression a b c as an example.
Once the parser sees that the pattern a b matches the rule below_term arg (arg)*, it takes a as a below_term and tries to match b with the arg rule. Since arg now points to the below_term rule, no above_term is allowed there unless it is parenthesized. This solved my problem.
The term a b -> a b c -> forall n:nat, n now gets parsed this way:

How to make certain rules mandatory in Antlr

I wrote the following grammar, which should check a conditional expression.
The examples below show what I want to achieve with this grammar:
test: invalid
test = 1: valid
test = 1 and another_test>=0.2: valid
test = 1 kasd y = 1: invalid (two conditions MUST be separated by AND/OR)
a = 1 or (b=1 and c): invalid (there cannot be a lone literal like 'c'; it must always be a triplet, i.e. literal operator literal)
grammar expression;
expr
: literal_value
| expr ( '='|'<>'| '<' | '<=' | '>' | '>=' ) expr
| expr K_AND expr
| expr K_OR expr
| function_name '(' ( expr ( ',' expr )* | '*' )? ')'
| '(' expr ')'
;
literal_value
: NUMERIC_LITERAL
| STRING_LITERAL
| IDENTIFIER
;
keyword
: K_AND
| K_OR
;
name
: any_name
;
function_name
: any_name
;
database_name
: any_name
;
table_name
: any_name
;
column_name
: any_name
;
any_name
: IDENTIFIER
| keyword
| STRING_LITERAL
| '(' any_name ')'
;
K_AND : A N D;
K_OR : O R;
IDENTIFIER
: '"' (~'"' | '""')* '"'
| '`' (~'`' | '``')* '`'
| '[' ~']'* ']'
| [a-zA-Z_] [a-zA-Z_0-9]*
;
NUMERIC_LITERAL
: DIGIT+ ( '.' DIGIT* )? ( E [-+]? DIGIT+ )?
| '.' DIGIT+ ( E [-+]? DIGIT+ )?
;
STRING_LITERAL
: '\'' ( ~'\'' | '\'\'' )* '\''
;
fragment DIGIT : [0-9];
fragment A : [aA];
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];
WS: [ \n\t\r]+ -> skip;
So my question is: how can I get the grammar to work for the examples above? Can certain words be made mandatory between two triplets (literal operator literal)? In a sense, I'm just trying to get a parser to validate a WHERE clause condition in which only simple conditions and functions are permitted. I also want to have a visitor in Java that retrieves values such as functions, parentheses and literals. How can I achieve that?
Yes and no.
You can change your grammar to only allow expressions that are comparisons and logical operations on the same:
expr
: term ( '='|'<>'| '<' | '<=' | '>' | '>=' ) term
| expr K_AND expr
| expr K_OR expr
| '(' expr ')'
;
term
: literal_value
| function_name '(' ( expr ( ',' expr )* | '*' )? ')'
;
The issue comes if you want to allow boolean variables or functions -- you need to classify the functions/vars in your lexer and have a different terminal for each, which is tricky and error prone.
Instead, it is generally better to NOT do this kind of checking in the parser -- have your parser be permissive and accept anything expression-like, and generate an expression tree for it. Then have a separate pass over the tree (called a type checker) that checks the types of the operands of operations and the arguments to functions.
This latter approach (with a separate type checker) generally ends up being much simpler, clearer, more flexible, and gives better error messages (rather than just 'syntax error').
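A minimal sketch of the second approach, assuming the permissive parser has already produced an expression tree. The Expr, Literal and Binary classes are hypothetical stand-ins for whatever a visitor would build from the ANTLR parse tree, and function calls are left out for brevity. The check runs as a separate pass and reports a specific message rather than a generic syntax error.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: a permissive expression tree plus a separate checking pass. */
final class ConditionChecker {

    /** Node of the permissive tree (hypothetical, not ANTLR-generated). */
    interface Expr {}

    /** A literal or identifier such as 1, 'abc' or test. */
    static final class Literal implements Expr {
        final String text;
        Literal(String text) { this.text = text; }
    }

    /** A binary node: a comparison (=, <, ...) or a logical operation (AND, OR). */
    static final class Binary implements Expr {
        final String op;
        final Expr left, right;
        Binary(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isLogical(String op) {
        return op.equalsIgnoreCase("AND") || op.equalsIgnoreCase("OR");
    }

    /** Returns the error messages; an empty list means the condition is well-formed. */
    static List<String> check(Expr root) {
        List<String> errors = new ArrayList<>();
        requireCondition(root, errors);
        return errors;
    }

    // A condition is either a comparison of two values or AND/OR of two conditions.
    private static void requireCondition(Expr e, List<String> errors) {
        if (e instanceof Literal) {
            errors.add("'" + ((Literal) e).text
                    + "' is not a condition; expected: literal operator literal");
            return;
        }
        Binary b = (Binary) e;
        if (isLogical(b.op)) {
            requireCondition(b.left, errors);
            requireCondition(b.right, errors);
        } else {
            requireValue(b.left, errors);
            requireValue(b.right, errors);
        }
    }

    // Operands of a comparison must be plain values, not nested operations.
    private static void requireValue(Expr e, List<String> errors) {
        if (e instanceof Binary) {
            errors.add("operand of a comparison must be a value, found operator "
                    + ((Binary) e).op);
        }
    }

    public static void main(String[] args) {
        // a = 1 or (b = 1 and c): the lone 'c' is reported by the checking pass.
        Expr bad = new Binary("or",
                new Binary("=", new Literal("a"), new Literal("1")),
                new Binary("and",
                        new Binary("=", new Literal("b"), new Literal("1")),
                        new Literal("c")));
        System.out.println(check(bad));
    }
}
```

With this split, the grammar can stay as permissive as the original expr rule, and rules such as "AND/OR only between comparisons" live in one place that is easy to extend, for example with boolean variables or functions.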

ANTLR4 doesn't parse obvious tree

I want to create a Grammar that will parse the input statement
myvar is 43+23
and
otherVar of myvar is "hallo"
But the parser doesn't recognize anything here.
(Sorry, I am not allowed to post images. Imagine a statement node with the tokens
[myvar] [is] [43] [+] [23] as children, all marked red. The same goes for the other statement.)
I'm getting error messages that confuse me:
line 2:7 no viable alternative at input 'myvaris'
line 3:19 no viable alternative at input 'otherVarofmyvaris'
Where have the spaces gone? I assume it's something with my lexer, but I can't see what the problem is. Just in case, here is the grammar for these statements:
statement
: envCall #call_Environment_Function
| identifier IS expression # assignment_statement // This one should be used
| loopHeader statement_block # loop_statement
etc...
expression
: '(' expression ')' #bracket_Expression
| mathExpression #math_Expression
| identifier #identifier_Expression // this one should be used
| objectExpression #object_Expression
etc ...
identifier //both of these should be used
: selector=IDENTIFIER OF object=expression #ofIdentifier
| selector=IDENTIFIER #idLocal
;
Here are all the lexer rules I have so far:
IdentifierNamespace: IDENTIFIER '.' IDENTIFIER;
FromIn: FROM | IN;
OPENBLOCK: NEWLINE? '{';
CLOSEBLOCK: '}' NEWLINE;
NEWLINE: ['\n''\t']+;
NUMBER: INT | FLOAT;
INT: [0-9]+;
FLOAT: [0-9]* '.' [0-9]+;
IsAre: IS | ARE;
OF: 'of';
IS: 'is';
ARE: 'are';
DO: 'do';
FROM: 'from';
IN: 'in';
IDENTIFIER : [a-zA-Z]+ ;
//WHITESPACE: [ \t]+ -> skip;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX : [0-9a-fA-F] ;
fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
STRING : '"' (ESC | ~["\\])* '"' ;
END: 'END'[.]* EOF;
WHITESPACE : ( '\t' | ' ' )+ -> skip ;
OK, found it. A compOP rule was defined in the parser, and it was messing up the tree generation.
compOP: '<'
| '>'
| '=' // the programmers '=='
| '>='
| '<='
| '<>'
| '!='
| 'in'
| 'not' 'in'
// | 'is' (removed this alternative and it works now)
;
So: never define the same keyword both as a literal in a parser rule and as a lexer rule, I guess.

Context-Free-Grammar for assignment statements in ANTLR

I'm writing an ANTLR lexer/parser for a context-free grammar.
This is what I have now:
statement
: assignment_statement
;
assignment_statement
: IDENTIFIER '=' expression ';'
;
term
: IDENT
| '(' expression ')'
| INTEGER
| STRING_LITERAL
| CHAR_LITERAL
| IDENT '(' actualParameters ')'
;
negation
: 'not'* term
;
unary
: ('+' | '-')* negation
;
mult
: unary (('*' | '/' | 'mod') unary)*
;
add
: mult (('+' | '-') mult)*
;
relation
: add (('=' | '/=' | '<' | '<=' | '>=' | '>') add)*
;
expression
: relation (('and' | 'or') relation)*
;
IDENTIFIER : LETTER (LETTER | DIGIT)*;
fragment DIGIT : '0'..'9';
fragment LETTER : ('a'..'z' | 'A'..'Z');
So my assignment statement is identified by the form
IDENTIFIER = expression;
However, assignment statement should also take into account cases when the right hand side is a function call (the return value of the statement). For example,
items = getItems();
What grammar rule should I add for this? I thought of adding a function call to the "expression" rule, but I wasn't sure whether a function call should be regarded as an expression.
Thanks
This grammar looks fine to me. I am assuming that IDENT and IDENTIFIER are the same and that you have additional productions for the remaining terminals.
This production seems to define a function call.
| IDENT '(' actualParameters ')'
You need a production for the actual parameters, something like this, where the parameter list may be empty:
actualParameters : ( expression ( ',' expression )* )? ;

Why does ANTLR not parse the entire input?

I am quite new to ANTLR, so this is likely a simple question.
I have defined a simple grammar which is supposed to include arithmetic expressions with numbers and identifiers (strings that start with a letter and continue with one or more letters or numbers.)
The grammar looks as follows:
grammar while;
@lexer::header {
package ConFreeG;
}
@header {
package ConFreeG;
import ConFreeG.IR.*;
}
@parser::members {
}
arith:
term
| '(' arith ( '-' | '+' | '*' ) arith ')'
;
term returns [AExpr a]:
NUM
{
int n = Integer.parseInt($NUM.text);
a = new Num(n);
}
| IDENT
{
a = new Var($IDENT.text);
}
;
fragment LOWER : ('a'..'z');
fragment UPPER : ('A'..'Z');
fragment NONNULL : ('1'..'9');
fragment NUMBER : ('0' | NONNULL);
IDENT : ( LOWER | UPPER ) ( LOWER | UPPER | NUMBER )*;
NUM : '0' | NONNULL NUMBER*;
fragment NEWLINE:'\r'? '\n';
WHITESPACE : ( ' ' | '\t' | NEWLINE )+ { $channel=HIDDEN; };
I am using ANTLR v3 with the ANTLR IDE Eclipse plugin. When I parse the expression (8 + a45) using the interpreter, only part of the parse tree is generated:
Why does the second term (a45) not get parsed? The same happens if both terms are numbers.
You'll want to create a parser rule that has an EOF (end of file) token in it so that the parser will be forced to go through the entire token stream.
Add this rule to your grammar:
parse
: arith EOF
;
and let the interpreter start at that rule instead of the arith rule.
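The effect of the EOF anchor can be illustrated without ANTLR. The following is a hand-written recursive-descent sketch of the arith and term rules, not ANTLR-generated code. The class name, the pre-split token list and the simplified NUM pattern are assumptions made for the example. It shows that a rule without an end-of-input check may stop after a valid prefix, while a parse rule that demands EOF either consumes everything or reports the leftover tokens.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: why a start rule should end with EOF. Hand-written, not ANTLR output. */
final class EofDemo {
    private final List<String> tokens;
    private int pos = 0;

    EofDemo(String... tokens) { this.tokens = Arrays.asList(tokens); }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private void expect(String token) {
        if (!peek().equals(token)) {
            throw new IllegalStateException("expected " + token + " but found " + peek());
        }
        pos++;
    }

    /** arith : term | '(' arith ('-' | '+' | '*') arith ')' ; */
    void arith() {
        if (peek().equals("(")) {
            expect("(");
            arith();
            String op = peek();
            if (!op.equals("-") && !op.equals("+") && !op.equals("*")) {
                throw new IllegalStateException("expected operator but found " + op);
            }
            pos++;
            arith();
            expect(")");
        } else {
            term();
        }
    }

    /** term : NUM | IDENT ; (NUM simplified to any digit sequence) */
    void term() {
        if (!peek().matches("[0-9]+|[A-Za-z][A-Za-z0-9]*")) {
            throw new IllegalStateException("expected NUM or IDENT but found " + peek());
        }
        pos++;
    }

    /** parse : arith EOF ; */
    void parse() {
        arith();
        if (pos != tokens.size()) {
            throw new IllegalStateException("expected <EOF> but found " + peek());
        }
    }

    /** Number of tokens consumed so far. */
    int consumed() { return pos; }

    public static void main(String[] args) {
        EofDemo loose = new EofDemo("8", "+", "a45");
        loose.arith(); // succeeds, but only the first token was looked at
        System.out.println("arith consumed " + loose.consumed() + " of 3 tokens");
        try {
            new EofDemo("8", "+", "a45").parse();
        } catch (IllegalStateException e) {
            System.out.println("parse reported: " + e.getMessage());
        }
    }
}
```

Starting from arith alone, the unparenthesized input 8 + a45 is accepted after the first term and the rest is silently ignored. Starting from parse, the same input is rejected, and a fully parenthesized input is consumed to the end.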
