I can't get Tatsu to parse a grammar that includes a literal '#'.
Here is a minimal example:
import tatsu
G = r'''
atom = /[0-9]+/
| '#' atom
;
'''
p = tatsu.compile(G)
p.parse('#345', trace=True)
The parse throws a FailedParse exception. The trace seems to show that the parser is not matching the '#' literal:
<atom ~1:1
#345
!'' /[0-9]+/
!'#'
!atom ~1:1
#345
If I change the grammar to use a symbol other than '#', it works fine. For example this works:
G1 = r'''
atom = /[0-9]+/
| '#' atom
;
'''
tatsu.parse(G1, '#345') --> ['#', '345']
Unfortunately, I can't change the format of the input data.
This is likely a bug in the version of TatSu you are using.
If you need to stick to that version, please try including @@eol_comments :: // or a similar pattern in the grammar, so that '#' is no longer skipped as the start of an end-of-line comment.
This works for me:
[ins] In [1]: import tatsu
[ins] In [2]: G = r'''
...: atom = /[0-9]+/
...: | '#' atom
...: ;
...: '''
...:
...: p = tatsu.compile(G)
...: p.parse('#345', trace=True)
↙atom ~1:1
#345
≢'' /[0-9]+/
#345
≡'#'
345
↙atom↙atom ~1:2
345
≡'345' /[0-9]+/
≡atom↙atom
≡atom
Out[2]: ('#', '345')
AFTERNOTE: Yes, the above output is from the master version of TatSu (sequences return tuple), but I just checked against v4.4.0, and it's equivalent.
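To illustrate why overriding the end-of-line comment pattern can help, here is a minimal sketch in plain Python (it does not use TatSu at all). It assumes, as the workaround above implies, that the affected version skips anything matching a pattern like #.*?$ before it tries to match a token. The helper names skip_eol_comments and match_literal are hypothetical and exist only for this illustration.

```python
import re


def skip_eol_comments(text, pos, eol_comments_re):
    """Advance past whitespace and end-of-line comments, as a tokenizer might."""
    while True:
        start = pos
        ws = re.compile(r"\s+").match(text, pos)
        if ws:
            pos = ws.end()
        if eol_comments_re:
            m = re.compile(eol_comments_re, re.MULTILINE).match(text, pos)
            # Ignore empty matches so an empty pattern cannot loop forever.
            if m and m.end() > pos:
                pos = m.end()
        if pos == start:
            return pos


def match_literal(text, literal, eol_comments_re):
    """Return True if `literal` is found right after comments are skipped."""
    pos = skip_eol_comments(text, 0, eol_comments_re)
    return text.startswith(literal, pos)


# Assumed default: '#' starts a comment, so the whole input is swallowed.
print(match_literal("#345", "#", r"#.*?$"))
# Pattern overridden with an empty regex: '#' is matched as a literal.
print(match_literal("#345", "#", ""))
```

With the assumed comment pattern, the whole input is consumed as a comment and the '#' literal never gets a chance to match, which would be consistent with the failing trace in the question.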
I'm trying to create a list in Xtext. If anybody can help me write a grammar for it, that would be really helpful. I tried writing the following, but it's not in Xtext format, so I'm getting errors.
List:
'List' name=ID type = Nlist;
Nlist:
Array | Object
;
Array:
{Array} "[" values*=Value[','] "]"
;
Value:
STRING | FLOAT | BOOL | Object | Array | "null"
;
Object:
"{" members*=Member[','] "}"
;
Member:
key=STRING ':' value=Value
I'm new to this, so any help would be appreciated.
Thank you.
The usual syntax for comma-separated lists in Xtext is, e.g.:
MyList: '#[' (elements+=Element (',' elements+=Element )*)? ']';
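The shape of that rule (an optional first element, followed by any number of ',' element pairs) can be sketched as a small hand-written recognizer in Python. This is only an illustration of the pattern, not generated Xtext code; parse_my_list is a hypothetical name, and elements are assumed to be simple words.

```python
import re


def parse_my_list(text):
    """Recognize '#[' (Element (',' Element)*)? ']' and return the elements."""
    tokens = re.findall(r"#\[|[\],]|\w+", text)
    if not tokens or tokens[0] != "#[":
        raise SyntaxError("expected '#['")
    i, elements = 1, []
    # Optional first element.
    if i < len(tokens) and tokens[i] not in ("]", ","):
        elements.append(tokens[i])
        i += 1
        # Zero or more (',' Element) repetitions.
        while i < len(tokens) and tokens[i] == ",":
            i += 1
            if i >= len(tokens) or tokens[i] in ("]", ","):
                raise SyntaxError("expected an element after ','")
            elements.append(tokens[i])
            i += 1
    if i != len(tokens) - 1 or tokens[i] != "]":
        raise SyntaxError("expected closing ']'")
    return elements


print(parse_my_list("#[a, b, c]"))
print(parse_my_list("#[]"))
```

Because the separator only ever appears in front of a further element, a trailing comma such as #[a,] is rejected, which is the same behaviour the Xtext rule describes.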
We are working on a new version of a JavaScript grammar using Rascal. Based on the language specification (version 6) and existing JavaScript grammars, the following are valid productions for expressions:
syntax Expression = ...
| new: "new" Expression
| call: Expression Args
syntax Args = ...
However, when we try to parse an expression like "new Date().getTime()" we get an Ambiguity error. We tried to fix it using a combination of the "left" and ">" operators, something like
| left "new" Expression
> Expression Args
but we were not able to fix the problem. I believe that this might be simple to solve, but after spending a couple of hours, we could not figure out a solution.
I've tried to complete your example here, and this works without throwing an Ambiguity() exception.
module Test
import IO;
import ParseTree;
lexical Ident = [A-Za-z]+ !>> [a-zA-Z];
layout Whitespace = [\t\n\r\ ]*;
syntax Expression
= Ident
| "new" Expression
> Expression "(" {Expression ","}* ")"
> right Expression "." Expression
;
void main() {
Expression ex = parse(#Expression, "a().b()");
println(ex);
Expression ex2 = parse(#Expression, "new a().b()");
println(ex2);
}
I have tried to cut down my problem to the simplest problem I can in xtext - I would like to use the following grammar:
M: lines += T*;
T:
DT
| BDT
| N
;
BDT:
name = ('a' | 'b' | 'c')
;
DT:
'd' name=ID
('(' (ts += BDT (','ts += BDT)*) ')')?
;
N:
'n' name=ID ':' type=[T]
;
I am intending to parse expressions of the form d f(a,b,b) for example which works fine. I would also like to be able to parse n g:f which also works, but not n g:a - where a here is part of the BDT rule. The error given is "Missing RULE_ID at 'a'".
I'd like to allow the grammar to parse n g:a for example, and I'd be very grateful if anyone could point out where I'm going wrong here on this very simple grammar.
Lexing is context-free, so a keyword can never be an ID. You can address this through parser rules.
You can introduce a datatype rule:
MyID: ID | "a" | ... | "c";
and use it wherever you currently use ID.
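The effect can be sketched in plain Python. This is a simplified model, not Xtext's actual lexer: the function names and the token representation are hypothetical. It assumes that every word equal to a grammar literal ('a', 'b', 'c', 'd', 'n') is always lexed as a keyword, regardless of where it appears.

```python
import re

# Literals used as keywords in the grammar from the question.
KEYWORDS = {"a", "b", "c", "d", "n"}


def lex(text):
    """Context-free lexing: a word equal to a keyword is always a keyword."""
    tokens = []
    for word in re.findall(r"[A-Za-z_]\w*|[:(),]", text):
        if word in KEYWORDS:
            tokens.append(("KEYWORD", word))
        elif re.fullmatch(r"[A-Za-z_]\w*", word):
            tokens.append(("ID", word))
        else:
            tokens.append(("PUNCT", word))
    return tokens


def accepts_id(token):
    """Models a rule position that expects the ID terminal."""
    return token[0] == "ID"


def accepts_my_id(token):
    """Models the datatype rule MyID: ID | 'a' | 'b' | 'c'."""
    kind, value = token
    return kind == "ID" or (kind == "KEYWORD" and value in {"a", "b", "c"})


last = lex("n g:a")[-1]
print(last, accepts_id(last), accepts_my_id(last))
```

In this model the final 'a' in n g:a arrives as a keyword token, so a position that expects ID rejects it, while a position that expects MyID accepts it.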
When I import the Lisra recipe,
import demo::lang::Lisra::Syntax;
This creates the syntax:
layout Whitespace = [\t-\n\r\ ]*;
lexical IntegerLiteral = [0-9]+ !>> [0-9];
lexical AtomExp = (![0-9()\t-\n\r\ ])+ !>> ![0-9()\t-\n\r\ ];
start syntax LispExp
= IntegerLiteral
| AtomExp
| "(" LispExp* ")"
;
Through the start syntax-definition, layout should be ignored around the input when it is parsed, as is stated in the documentation: http://tutor.rascal-mpl.org/Rascal/Declarations/SyntaxDefinition/SyntaxDefinition.html
However, when I type:
rascal>(LispExp)` (something)`
This gives me a concrete syntax fragment error (or a ParseError when using the parse-function), in contrast to:
rascal>(LispExp)`(something)`
Which successfully parses. I tried this both with one of the latest versions of Rascal as well as the Eclipse plugin version. Am I doing something wrong here?
Thank you.
Ps. Lisra's parse-function:
public Lval parse(str txt) = build(parse(#LispExp, txt));
Also fails on the example:
rascal>parse(" (something)")
|project://rascal/src/org/rascalmpl/library/ParseTree.rsc|(10329,833,<253,0>,<279,60>): ParseError(|unknown:///|(0,1,<1,0>,<1,1>))
at *** somewhere ***(|project://rascal/src/org/rascalmpl/library/ParseTree.rsc|(10329,833,<253,0>,<279,60>))
at parse(|project://rascal/src/org/rascalmpl/library/demo/lang/Lisra/Parse.rsc|(163,3,<7,44>,<7,47>))
at $shell$(|stdin:///|(0,13,<1,0>,<1,13>))
When you define a start non-terminal Rascal defines two non-terminals in one go:
rascal>start syntax A = "a";
ok
One non-terminal is A, the other is start[A]. Given a layout non-terminal in scope, say L, the latter is automatically defined by (something like) this rule:
syntax start[A] = L before A top L after;
If you call a parser or wish to parse a concrete fragment, you can use either non-terminal:
parse(#start[A], " a ") // parse using the start non-terminal and extra layout
parse(#A, "a") // parse only an A
(start[A]) ` a ` // concrete fragment for the start-non-terminal
(A) `a` // concrete fragment for only an A
[start[A]] " a "
[A] "a"
I have a little grammar containing a few commands which have to be used with Numbers and some of these commands return Numbers as well.
My grammar snippet looks like this:
Command:
name Numbers
| Numbers "test"
;
name:
"abs"
| "acos"
;
Numbers:
NUMBER
| numberReturn
;
numberReturn:
name Numbers
;
terminal NUMBER:
('0'..'9')+("."("0".."9")+)?
;
After I added the "Numbers 'test'" alternative to the Command rule, the compiler complains about non-LL(*) decisions and tells me I have to work around them (left-factoring, syntactic predicates, backtracking). My problem is that I have no idea what kind of input makes the grammar non-LL(*) in this case, nor do I know how to left-factor my grammar (I don't want to turn on backtracking).
EDIT:
A few examples of what this grammar should match:
abs 3;
acos abs 4; //interpreted as "acos (abs 4)"
acos 3 test; //(acos 3) test
Best regards
Raven
The grammar you are trying to achieve is left-recursive; that means the parser cannot distinguish between (acos 10) test and acos (10 test) when the parentheses are omitted. However, you can give the parser some hints so that it knows the correct order, such as parenthesized expressions.
This would be a valid Xtext grammar, with parenthesized test expressions:
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Model
: operations += UnaryOperation*
;
UnaryOperation returns Expression
: 'abs' exp = Primary
| 'acos' exp = Primary
| '(' exp = Primary 'test' ')'
;
Primary returns Expression
: NumberLiteral
| UnaryOperation
;
NumberLiteral
: value = INT
;
The parser will correctly recognize expressions such as:
(acos abs (20 test) test)
acos abs 20
acos 20
(20 test)
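To check the claim by hand, the UnaryOperation and Primary rules above can be transcribed into a small recursive-descent parser in Python. This is only a sketch for experimenting with the grammar's shape, not what Xtext generates. The function names and the tuple-based tree are hypothetical, and it parses a single UnaryOperation rather than a whole Model.

```python
import re


def tokenize(text):
    """Split the input into numbers, words, and parentheses."""
    return re.findall(r"\d+|[A-Za-z]+|[()]", text)


def parse_primary(tokens, i):
    """Primary: NumberLiteral | UnaryOperation. Returns (tree, next_index)."""
    if i < len(tokens) and tokens[i].isdigit():
        return int(tokens[i]), i + 1
    return parse_unary(tokens, i)


def parse_unary(tokens, i):
    """UnaryOperation: 'abs' Primary | 'acos' Primary | '(' Primary 'test' ')'."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok in ("abs", "acos"):
        operand, i = parse_primary(tokens, i + 1)
        return (tok, operand), i
    if tok == "(":
        operand, i = parse_primary(tokens, i + 1)
        if tokens[i:i + 2] != ["test", ")"]:
            raise SyntaxError("expected 'test' ')'")
        return ("test", operand), i + 2
    raise SyntaxError("unexpected token %r" % tok)


def parse_operation(text):
    """Parse exactly one UnaryOperation and require that all input is used."""
    tokens = tokenize(text)
    tree, i = parse_unary(tokens, 0)
    if i != len(tokens):
        raise SyntaxError("unexpected trailing input %r" % tokens[i])
    return tree


print(parse_operation("(acos abs (20 test) test)"))
print(parse_operation("acos abs 20"))
```

Each alternative starts with a distinct token ('abs', 'acos', '(' or a number), so one token of lookahead is enough to choose between them. An unparenthesized input such as acos 3 test is rejected, because 'test' is only accepted directly before a closing parenthesis.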
These articles may be helpful for you:
https://dslmeinte.wordpress.com/tag/unary-operator/
http://blog.efftinge.de/2010/08/parsing-expressions-with-xtext.html