Why does the ANTLR-generated parser have no parse/start/begin methods?

I am trying to generate a parser with ANTLR for the minimalistic first-order logic grammar that can be found in the ANTLR grammars repository: https://github.com/antlr/grammars-v4/blob/master/fol/fol.g4
The strange thing is that the generated parser has no parse, begin or start methods, which appear in every tutorial. A listener is generated as well, but I am interested in getting the parse tree (for later manipulation), and besides, many tutorials that mention listeners use one of those three methods anyway. What has gone wrong? Are there parser generation options I am missing?
The methods are not in the generated code. They might be in the base class, but Eclipse flags their use as an error (undefined methods).
I am using ANTLR 4.8.

The generated parser class will have methods with the same names as the rules you define in your grammar. So if your grammar has a rule named foobar and you want to parse your input according to that rule, you'd call parser.foobar() to do so.
If the code in your tutorial calls a method named parse, begin or start, then the grammar in that tutorial almost certainly defines a rule with that name.
In the grammar you linked, the main rule is called condition, so that's the method that you should be calling.


No completion for Xbase expressions when embedded in another Xtext DSL

I am building an Xtext DSL and I want to embed Xbase expressions in some specific places so that I can interpret parts of my models with the Xbase interpreter, but I cannot get method completion in the generated editor.
I reused the examples provided here: https://www.eclipse.org/Xtext/documentation/201_sevenlang_introduction.html, and managed to integrate Xbase as part of my grammar. Keyword completion proposals work fine (e.g. do, for, while ...), but I can't find a way to get completion for Java/Xbase methods (e.g. newArrayList, or myArray.add(X)).
Clarification from the comments below: if I write var x = newArrayList in the editor, the method is not styled in italics, but I don't get any error either.
This is a sample version of the grammar I am using:
grammar org.xtext.example.common2.Common2 with org.eclipse.xtext.xbase.Xbase
generate common2 "http://www.xtext.org/example/common2/Common2"
import "http://www.eclipse.org/xtext/xbase/Xbase"
Test returns Test:
{Test}
'test'
expressions+=Script
;
Script returns XBlockExpression:
{Script}
'{'
(expressions+=XExpressionOrVarDeclaration ';'?)*
'}'
;
I found out that if I change my grammar to the following one I can have the completion as expected:
grammar org.xtext.example.common2.Common2 with org.eclipse.xtext.xbase.Xbase
generate common2 "http://www.xtext.org/example/common2/Common2"
import "http://www.eclipse.org/xtext/xbase/Xbase"
Test returns XBlockExpression:
{Test}
'test'
expressions+=Script
;
Script returns XBlockExpression:
{Script}
'{'
(expressions+=XExpressionOrVarDeclaration ';'?)*
'}'
;
My guess is that the whole tree must be composed of instances of XExpression to enable completion, but I don't understand why. To me, Test should not be a subclass of XBlockExpression (in my real-world use case Test has additional attributes/references); it should contain an XBlockExpression.
Is there a way to achieve this? Any help or resource to look at would be much appreciated.
Note: I already checked the SO question "How to embed XBase expressions in an Xtext DSL", and I already have xbase.lib on my build path.

ANTLR check for matching XML start and end tags

When using ANTLR to parse XML, can ANTLR validate that an end tag matches its start tag? The XML parser in the ANTLR book doesn’t check for this.
I could imagine a generic approach like this (but never actually tried it myself):
tag: openTag = TAG_OPEN content closeTag = TAG_CLOSE { tagsMatch($openTag, $closeTag) }?;
You'd use a validating predicate to fail the entire rule if the tag parts don't match. Might give you problems with error reporting, but that can be solved.
For arbitrary XML tags, a context-free parser can't do this. ANTLR in its pure state is essentially context-free.
You can hack most parsers (probably including ANTLR) to build a tag stack. When <tagname... gets parsed (or lexed, as you find convenient), you push the tag name onto the stack. When </tagname... is parsed/lexed, you compare the tag name with the top of the stack and complain if they don't match.
I used the lexer version in my XML parser (see bio); it seems to work pretty well.
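The tag-stack idea can be sketched independently of any parser generator. The following is a minimal illustration in plain Java, not ANTLR code and not the answerer's implementation: the class name TagStackChecker and the method firstMismatch are made up for this example, and the regular expression is a simplification that ignores comments, CDATA sections and attribute values containing '>'.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of ANTLR): checks tag nesting with an explicit stack. */
final class TagStackChecker {
    // Group 1: "/" for an end tag; group 2: tag name; group 3: "/" for a self-closing tag.
    // Simplified on purpose: comments, CDATA and '>' inside attribute values are not handled.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z_][A-Za-z0-9_.:-]*)[^<>]*?(/?)>");

    /** Returns null if every end tag matches its start tag, else a description of the first problem. */
    static String firstMismatch(String xml) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            String name = m.group(2);
            if (closing) {
                if (open.isEmpty()) {
                    return "unexpected </" + name + ">";
                }
                // The most recently opened tag must be the one being closed.
                String expected = open.pop();
                if (!expected.equals(name)) {
                    return "expected </" + expected + "> but found </" + name + ">";
                }
            } else if (!selfClosing) {
                open.push(name); // start tag: remember it until its end tag arrives
            }
        }
        return open.isEmpty() ? null : "unclosed <" + open.peek() + ">";
    }
}
```

In a real lexer or parser the push and the comparison would live in the actions attached to the start-tag and end-tag tokens or rules; only the stack discipline carries over from this sketch.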

How to simulate a user's input (only internally)?

At runtime, my Eclipse plugin (created with Xtext) needs to parse some text and treat it as if the user had typed it, but without it actually showing up in the editor as part of the user's own input.
For example, I have this input:
for "i" from 1 to 3 do {};
My problem with this input is that the variable i is not declared as a normal declaration with a "=", but I need the parser to treat it as one. So I want to let the parser parse
i = 1;
so it recognizes it as a normal declaration and allows cross-references to it.
Greeting Krzmbrzl
EDIT:
All in all, what I want is to add a statement i=1; to the AST.
I just want to have Eclipse support for an existing language, so I'm writing neither an interpreter nor a generator. The problem is that when I have a for-loop like the one above, the actual interpreter of that language declares a variable i (or whatever it's named in the loop header), and therefore this variable is available in the loop body. In my case my parser doesn't recognise i as a variable, because it only knows that a declaration is done via "=", so I can't use i in the loop body (if I try, I get the error that the declaration i cannot be resolved). This is why I want to add this declaration manually when such a loop is created. I don't need to worry about any compiling or interpreting difficulties because I don't do that myself. As I already said, I just want to have all the cool Eclipse features for this language.
Ok, now I understand your problem. It is still not a good idea to add an element to the AST just to resolve a cross-reference. Don't do that! Instead, you should try to refactor your grammar so that "i" in for "i" from ... is a compatible declaration of a variable. There are several tricks to do that. Have you read the Xtext documentation completely? Have you also read the Xtext book? Both tell you a lot about how to make Xtext do things you would not expect.
Anyway, two tricks I often use are:
Introduce an abstract parser rule that is never called from another rule (so it never appears as a containment reference) and use it only as the target type of a cross-reference.
AbstractDecl:
VarDecl | ForVarDecl;
VarDecl:
name=ID ...;
ForVarDecl:
'"' name=ID '"';
For:
'for' decl=ForVarDecl 'from' from=INT 'to' to=INT 'do' block=Block;
...
StatementWithCR:
ref=[AbstractDecl] ...;
Define a parser rule that returns another type.
ForDecl returns VarDecl:
'"' name=ID '"';
If you post the grammar that corresponds to this specific problem, we could develop a solution that is safe. Adding anything to the AST while the editor is processing its content live will lead to a faulty state that can destroy your document.

Xtext: referring to objects from other languages; namespaces and aliases for importURI?

I'm developing an Xtext-based language which should refer to objects defined in a vendor-specific file format.
E.g. this file format defines messages, and my language shall define rules that work with these messages. Of course I want to use Xtext features, e.g. to autocomplete/validate message names, attributes, etc.
Not sure if that is a good idea, but I came up with the following:
Use one xtext project to describe the file format
Add a dependency for this project to my DSL project, import the file format grammar to my grammar
import the description files via importURI
FileFormat grammar:
grammar com.example.xtext.fileformat.FileFormat
generate fileformat "http://xtext.example.com/fileformat/FileFormat"
[...]
DSL grammar:
grammar com.example.xtext.dsl.DSL
import "http://xtext.example.com/fileformat/FileFormat" as ff
Model:
rules += Rule*;
Rule: ImportFileRule | SampleRule;
ImportFileRule: "IMPORT" importURI=STRING "AS" name=ID ";";
SampleRule: "FORWARD" msg=[ff::Message] ";";
First of all: This works fine.
Now, different imported files may define messages with colliding names,
and I want to use fully qualified names for messages anyways.
The prefix for the message names should be defined in my DSL, e.g. the name of the ImportFileRule.
So I would like to use something like:
IMPORT "first-incredibly-long-filename-with-version-and-stuff.ff" AS first;
IMPORT "second-incredibly-long-filename-with-version-and-stuff.ff" AS second;
FORWARD first.msg_1; // references to msg_1 in first file
FORWARD second.msg_1; // references to msg_1 in second file
Unfortunately I don't see an easy way to achieve this with Xtext.
At the moment I'm using an ID for the namespace qualifier and custom ProposalProvider/Validator classes,
which is ugly in detail and bypasses the EMF index, so it becomes slow with files of 1000 messages and 50000 attributes...
Would there be a right way to do it?
Was it a good idea to use xtext to parse the definition files in the first place?
I have two ideas for what to check.
Xtext has a scope provider called ImportedNamespaceAwareLocalScopeProvider. By using an overridden version of it, you could specify other headers to consider.
Check the implementation of the Xtext grammar language itself, as it supports such a feature with EPackage imports. I am not exactly sure how it operates, but it should work this way.
Finally, I ended up using the SimpleNamesFragment, the ImportURIScopingFragment and a custom ScopeProvider derived from AbstractDeclarativeScopeProvider.
That way, I had to implement ScopeProvider methods for quite a few rules, but I was much more flexible in using my "namespace prefix".
E.g. it is simple to implement syntaxes like
FORWARD FROM first: msg_01, msg_02;
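The lookup that such a custom scope has to perform can be illustrated without any Xtext classes. The sketch below is plain Java and does not use or imitate the Xtext scoping API; the class AliasScope and its methods are hypothetical names chosen for this example. It only shows the idea of resolving "alias.message" against the messages contributed by the import with that alias.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, Xtext-independent illustration of alias-qualified name lookup. */
final class AliasScope {
    // alias (the name given in an IMPORT statement) -> message names defined in that file
    private final Map<String, Set<String>> messagesByAlias = new LinkedHashMap<>();

    /** Registers the messages that one imported file contributes under the given alias. */
    void addImport(String alias, Set<String> messageNames) {
        messagesByAlias.put(alias, messageNames);
    }

    /** Returns true only if the name has the form "alias.message" and that alias defines that message. */
    boolean resolves(String qualifiedName) {
        int dot = qualifiedName.indexOf('.');
        if (dot < 0) {
            return false; // unqualified names are rejected in this sketch
        }
        Set<String> names = messagesByAlias.get(qualifiedName.substring(0, dot));
        return names != null && names.contains(qualifiedName.substring(dot + 1));
    }
}
```

With two imports registered as first and second, first.msg_1 and second.msg_1 resolve independently even though both files define msg_1, which is the collision the question wanted to avoid.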

Flex input buffer reset after error

I'm using flex & bison to parse a custom language and I'm in the situation described here: http://www.gnu.org/software/bison/manual/html_node/How-Can-I-Reset-the-Parser.html.
To be more precise:
I invoke yyparse several times, and on correct input it works
properly; but when a parse error is found, all the other calls fail
too. How can I reset the error flag of yyparse?
My parser and scanner run inside a separate thread, but there is only one thread working with the input file. In my understanding I don't need to write a reentrant scanner since there is only one thread working with the input file. In that page the problem is clearly explained but the solution is not clear to me.
It says:
Therefore, whenever you change yyin, you must tell the Lex-generated
scanner to discard its current buffer and switch to the new one. This
depends upon your implementation of Lex; see its documentation for
more. For Flex, it suffices to call ‘YY_FLUSH_BUFFER’ after each
change to yyin. If your Flex-generated scanner needs to read from
several input streams to handle features like include files, you might
consider using Flex functions like ‘yy_switch_to_buffer’ that
manipulate multiple input buffers
My parser thread calls yyparse in order to build my AST. What is not clear to me is when and where I have to call yy_flush_buffer to fix the problem. As I understand it, the scanner code (generated by Flex) is called by the parser code (generated by Bison), and the parser code is generated from the grammar. As a result, the parser code is not under my direct control: I cannot put the call to yy_flush_buffer into the parser code, since it would be overwritten every time I regenerate the parser from the grammar. That means I should put the yy_flush_buffer call somewhere in the grammar file. But where?
I fixed the problem by doing:
...
FILE *f = fopen(_filename, "r");
yyrestart(f);
yyparse();
...
I'm leaving the question up since it could be useful to other people.
