Antlr Time-out waiting to connect to the remote parser - timeout

I am using the newest ANTLR. I get this error message while trying to debug this grammar:
grammar Grammar;
options { language = Java;
}
#header {
package parser;
import java.util.HashMap;
import viewmodel.*;
import java.util.List;
}
#members {
/** Map variable name to Integer object holding value */
HashMap memory = new HashMap();
}
prog returns [DiagramNode node]
: clas
{$node = $clas.node;}
;
clas returns [DiagramNode node]
:VISIBILITY* CLASSORINTERFACE name=NAME '{' classDef '}' NEWLINE
{$node = $classDef.node;
$node.setName(name.getText());
}
;
classDef returns [DiagramNode node]
:{$node = new DiagramNode(); }
fieldDef ';' NEWLINE?
{$node.getFields().add($fieldDef.field);}
;
fieldDef returns [DiagramField field]
:{$field = new DiagramField();}
type=NAME name=NAME ';' NEWLINE?
{$field.setType(type.getText());
$field.setName(name.getText());
}
;
VISIBILITY
: ('public' | 'private' | 'protected');
CLASSORINTERFACE
: ('class' | 'inerface');
NAME
: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9') *;
INT : '0'..'9'+ ;
NEWLINE:'\r'? '\n' {skip();};
WS : (' '|'\t')+ {skip();} ;
The input is:
class Abc {
Type1 Name1;
Type2 Name2;
}
I am assuming that it's grammar's fault, cause another one was compiling and working fine. Could you point me possible errors?

I see that there are no people here who know anything about ANTLR. Luckily I've figured it out on my own. The cause of the problem was the logic inside the grammar - I don't know what exactly, it could be wrong import, using variables, arguments or returns in a wrong way, or eventually missing package declaration for lexer:
#lexer::header{
package parser;
}

Related

A case where ANTLR4 terminates parsing successfully before the end of file is reached due to a parsing error

I gave ANTLR4 the following parser and lexer grammar in separate files (referring to a simple grammar for BNF grammar )
parser grammar BNFParser;
options {tokenVocab = BNFLexer;}
compileUnit
: grammar_rule+
;
grammar_rule : NON_TERMINAL COLON (OR? grammar_rule_alternative)* SEMICOLON
;
grammar_rule_alternative : (NON_TERMINAL|TERMINAL)+
;
and
lexer grammar BNFLexer;
TERMINAL : [A-Z][A-Za-z0-9_]*;
NON_TERMINAL : [a-z][A-Za-z0-9_]*;
OR : '|';
COLON : ':';
SEMICOLON : ';';
WS
: [ \t\r\n]+ -> skip
;
The main program
private static void Main(string[] args) {
StreamReader reader = new StreamReader(args[0]);
AntlrInputStream stream = new AntlrInputStream(reader);
BNFLexer lexer = new BNFLexer(stream);
CommonTokenStream tokens = new CommonTokenStream(lexer);
BNFParser parser = new BNFParser(tokens);
IParseTree root = parser.compileUnit();
Console.WriteLine(root.ToStringTree());
}
Also supplied the following test file for testing the grammar
compileunit : x a
;
x : S b
;
S : compileunit f
;
Please notice from the lexer grammar that Non-Terminals begin with a lowercase letters while Terminals begin with an uppercase letter. This given grammar has an error. The third rule uses a capital letter ( S ) to define Non-Terminal S. The expected behaviour would be to report this as an error. In the contrary parsing succeeds by consuming the first 2 rules and ignoring the third for S without reporting any error. I have also seen the generated files and i noticed the following
try {
EnterOuterAlt(_localctx, 1);
{
State = 7;
_errHandler.Sync(this);
_la = _input.La(1);
do {
{
{
State = 6; grammar_rule();
}
}
State = 9;
_errHandler.Sync(this);
_la = _input.La(1);
} while ( _la==NON_TERMINAL );
}
}
catch (RecognitionException re) {
_localctx.exception = re;
_errHandler.ReportError(this, re);
_errHandler.Recover(this, re);
}
The above code shows that the parser expects a Non-Terminal symbol at the start of a grammar_rule which is what i expect. However what happens when this is not the case? Also another weird issue is that the CommonTokenStream object that contains the tokens recognized by the lexer contains only the tokens until the end of the second rule but non of the tokens of the third rule (S). Is this proper behaviour?
Add an EOF token to your main rule (compileUnit). That will force the parser to use all input until EOF and report an error if that didn't fully match.

Grammatically restricted JVMModelInferrer inheritence with Xtext and XBase

The following canonical XBase entities grammar (from "Implementing Domain-Specific Languages with Xtext and Xtend" Bettini) permits entities to extend any Java class. As the commented line indicates, I would like to grammatically force entities to only inherit from entities.
grammar org.example.xbase.entities.Entities with org.eclipse.xtext.xbase.Xbase
generate entities "http://www.example.org/xbase/entities/Entities"
Model:
importSection=XImportSection?
entities+=Entity*;
Entity:
'entity' name=ID ('extends' superType=JvmParameterizedTypeReference)? '{'
// 'entity' name=ID ('extends' superType=[Entity|QualifiedName])? '{'
attributes += Attribute*
constructors+=Constructor*
operations += Operation*
'}';
Attribute:
'attr' (type=JvmTypeReference)? name=ID ('=' initexpression=XExpression)? ';';
Operation:
'op' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
body=XBlockExpression;
Constructor: 'new'
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
body=XBlockExpression;
Here is a working JVMModelInferrer for the model above, again where the commented line (and extra method) reflect my intention.
package org.example.xbase.entities.jvmmodel
import com.google.inject.Inject
import org.eclipse.xtext.common.types.JvmTypeReference
import org.eclipse.xtext.naming.IQualifiedNameProvider
import org.eclipse.xtext.xbase.jvmmodel.AbstractModelInferrer
import org.eclipse.xtext.xbase.jvmmodel.IJvmDeclaredTypeAcceptor
import org.eclipse.xtext.xbase.jvmmodel.JvmTypesBuilder
import org.example.xbase.entities.entities.Entity
class EntitiesJvmModelInferrer extends AbstractModelInferrer {
#Inject extension JvmTypesBuilder
#Inject extension IQualifiedNameProvider
def dispatch void infer(Entity entity, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(entity.toClass("entities." + entity.name)) [
documentation = entity.documentation
if (entity.superType !== null) {
superTypes += entity.superType.cloneWithProxies
//superTypes += entity.superType.jvmTypeReference.cloneWithProxies
}
entity.attributes.forEach [ a |
val type = a.type ?: a.initexpression?.inferredType
members += a.toField(a.name, type) [
documentation = a.documentation
if (a.initexpression != null)
initializer = a.initexpression
]
members += a.toGetter(a.name, type)
members += a.toSetter(a.name, type)
]
entity.operations.forEach [ op |
members += op.toMethod(op.name, op.type ?: inferredType) [
documentation = op.documentation
for (p : op.params) {
parameters += p.toParameter(p.name, p.parameterType)
}
body = op.body
]
]
entity.constructors.forEach [ con |
members += entity.toConstructor [
for (p : con.params) {
parameters += p.toParameter(p.name, p.parameterType)
}
body = con.body
]
]
]
}
def JvmTypeReference getJvmTypeReference(Entity e) {
e.toClass(e.fullyQualifiedName).typeRef
}
}
The following simple instance parses and infers perfectly (with the comments in place).
entity A {
attr String y;
new(String y) {
this.y=y
}
}
entity B extends A {
new() {
super("Hello World!")
}
}
If, however, I uncomment (and comment in the corresponding line above) both the grammar and the inferrer (and regenerate), the above instance no longer parses. The message is "The method super(String) is undefined".
I understand how to leave the inheritance "loose" and restrict using validators, etc., but would far prefer to strongly type this into the model.
I am lost as to how to solve this, as I am not sure where things are breaking, given the role of XBase and the JvmModelInferrer. A pointer (or reference) would suffice.
[... I am able to implement all the scoping issues for a non-xbase version of this grammar ...]
this won't work. you either have to leave the grammar as is and customize proposal provider and validation. or you have to use "f.q.n.o.y.Entity".typeRef. You can use NodeModelUtils to read the FQN or try something like ("entities."+entity.superType.name).typeRef

Disable wrapping in Xtext formatter

I have a xtext grammar which consists of one declaration per line. When I format the code, all the declarations end up in the same line, the line breaks are removed.
As I didn't manage to change the grammar to require line breaks, I would like to disable the removal of line breaks. How do I do that? Bonus points if someone can tell me how to require line breaks at the end of each declaration.
Part of the Grammar:
grammar com.example.Msg with org.eclipse.xtext.common.Terminals
hidden(WS, SL_COMMENT)
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate msg_idl "http://www.example.com/ex/ample/msg"
Model:
MsgDef
;
MsgDef:
(definitions+=definition)+
;
definition:
type=fieldType ' '+ name=ValidID (' '* '=' ' '* const=Value)?
;
fieldType:
value = ( builtinType | header)
;
builtinType:
BOOL = "bool"
| INT32 = "int32"
| CHAR = "char"
;
header:
value="Header"
;
Bool_l:
target=BOOL_E
;
String_l:
target = ('""'|STRING)
;
Number_l:
Double_l | Integer_l | NegInteger_l
;
NegInteger_l:
target=NEG_INT
;
Integer_l :
target=INT
;
Double_l:
target=DOUBLE
;
terminal NEG_INT returns ecore::EInt:
'-' INT
;
terminal DOUBLE returns ecore::EDouble :
('-')? ('0'..'9')* ('.' INT) |
('-')? INT ('.') |
('-')? INT ('.' ('0'..'9')*)? (('e'|'E')('-'|'+')? INT )|
'nan' | 'inf' | '-inf'
;
enum BOOL_E :
true | false
;
ValidID:
"bool"
| "string"
| "time"
| "duration"
| "char"
| ID ;
Value:
String_l | Number_l
;
terminal SL_COMMENT :
' '* '#' !('\n'|'\r')* ('\r'? '\n')?
;
Example data
string left
string top
string right
string bottom
I already tried:
class MsgFormatter extends AbstractDeclarativeFormatter {
extension MsgGrammarAccess msgGrammarAccess = grammarAccess as MsgGrammarAccess
override protected void configureFormatting(FormattingConfig c) {
c.setLinewrap(0, 1, 2).before(SL_COMMENTRule)
c.setLinewrap(0, 1, 2).before(ML_COMMENTRule)
c.setLinewrap(0, 1, 1).after(ML_COMMENTRule)
c.setLinewrap().before(definitionRule); // does not work
c.setLinewrap(1,1,2).before(definitionRule); // does not work
c.setLinewrap().before(fieldTypeRule); // does not work
}
}
In general it is a bad idea to encode whitespace into the language itself. Most of the time it is better to write the language in a way that you can use all kinds of whitespaces (blanks, tabs, newlines ...) to separate tokens.
You should implement a custom formatter for your language that inserts the line breaks after each statement. Xtext comes with two formatter APIs (an old one and a new one starting with Xtext 2.8). I propose to use the new one.
Here you extend AbstractFormatter2 and implement the format methods.
You can find a bit information in the online manual: https://www.eclipse.org/Xtext/documentation/303_runtime_concepts.html#formatting
Some more explanation in the folowing blog post: https://blogs.itemis.com/en/tabular-formatting-with-the-new-formatter-api
Some technical background: https://de.slideshare.net/meysholdt/xtexts-new-formatter-api

Matching a "text" in line by line file with XText

I try to write the Xtext BNF for Configuration files (known with the .ini extension)
For instance, I'd like to successfully parse
[Section1]
a = Easy123
b = This *is* valid too
[Section_2]
c = Voilà # inline comments are ignored
My problem is matching the property value (what's on the right of the '=').
My current grammar works if the property matches the ID terminal (eg a = Easy123).
PropertyFile hidden(SL_COMMENT, WS):
sections+=Section*;
Section:
'[' name=ID ']'
(NEWLINE properties+=Property)+
NEWLINE+;
Property:
name=ID (':' | '=') value=ID ';'?;
terminal WS:
(' ' | '\t')+;
terminal NEWLINE:
// New line on DOS or Unix
'\r'? '\n';
terminal ID:
('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;
terminal SL_COMMENT:
// Single line comment
'#' !('\n' | '\r')*;
I don't know how to generalize the grammar to match any text (eg c = Voilà).
I certainly need to introduce a new terminal
Property:
name=ID (':' | '=') value=TEXT ';'?;
Question is: how should I define this TEXT terminal?
I have tried
terminal TEXT: ANY_OTHER+;
This raises a warning
The following token definitions can never be matched because prior tokens match the same input: RULE_INT,RULE_STRING,RULE_ML_COMMENT,RULE_ANY_OTHER
(I think it doesn't matter).
Parsing Fails with
Required loop (...)+ did not match anything at input 'à'
terminal TEXT: !('\r'|'\n'|'#')+;
This raises a warning
The following token definitions can never be matched because prior tokens match the same input: RULE_INT
(I think it doesn't matter).
Parsing Fails with
Missing EOF at [Section1]
terminal TEXT: ('!'|'$'..'~'); (which covers most characters, except # and ")
No warning during the generation of the lexer/parser.
However Parsing Fails with
Mismatch input 'Easy123' expecting RULE_TEXT
Extraneous input 'This' expecting RULE_TEXT
Required loop (...)+ did not match anything at 'is'
Thanks for your help (and I hope this grammar can be useful for others too)
This grammar does the trick:
grammar org.xtext.example.mydsl.MyDsl hidden(SL_COMMENT, WS)
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
import "http://www.eclipse.org/emf/2002/Ecore"
PropertyFile:
sections+=Section*;
Section:
'[' name=ID ']'
(NEWLINE+ properties+=Property)+
NEWLINE+;
Property:
name=ID value=PROPERTY_VALUE;
terminal PROPERTY_VALUE: (':' | '=') !('\n' | '\r')*;
terminal WS:
(' ' | '\t')+;
terminal NEWLINE:
// New line on DOS or Unix
'\r'? '\n';
terminal ID:
('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;
terminal SL_COMMENT:
// Single line comment
'#' !('\n' | '\r')*;
Key is, that you do not try to cover the complete semantics only in the grammar but take other services into account, too. The terminal rule PROPERTY_VALUE consumes the complete value including leading assignment and optional trailing semicolon.
Now just register a value converter service for that language and take care of the insignificant parts of the input, there:
import org.eclipse.xtext.conversion.IValueConverter;
import org.eclipse.xtext.conversion.ValueConverter;
import org.eclipse.xtext.conversion.ValueConverterException;
import org.eclipse.xtext.conversion.impl.AbstractDeclarativeValueConverterService;
import org.eclipse.xtext.conversion.impl.AbstractIDValueConverter;
import org.eclipse.xtext.conversion.impl.AbstractLexerBasedConverter;
import org.eclipse.xtext.nodemodel.INode;
import org.eclipse.xtext.util.Strings;
import com.google.inject.Inject;
public class PropertyConverters extends AbstractDeclarativeValueConverterService {
#Inject
private AbstractIDValueConverter idValueConverter;
#ValueConverter(rule = "ID")
public IValueConverter<String> ID() {
return idValueConverter;
}
#Inject
private PropertyValueConverter propertyValueConverter;
#ValueConverter(rule = "PROPERTY_VALUE")
public IValueConverter<String> PropertyValue() {
return propertyValueConverter;
}
public static class PropertyValueConverter extends AbstractLexerBasedConverter<String> {
#Override
protected String toEscapedString(String value) {
return " = " + Strings.convertToJavaString(value, false);
}
public String toValue(String string, INode node) {
if (string == null)
return null;
try {
String value = string.substring(1).trim();
if (value.endsWith(";")) {
value = value.substring(0, value.length() - 1);
}
return value;
} catch (IllegalArgumentException e) {
throw new ValueConverterException(e.getMessage(), node, e);
}
}
}
}
The follow test case will succeed, after you registered the service in the runtime module like this:
#Override
public Class<? extends IValueConverterService> bindIValueConverterService() {
return PropertyConverters.class;
}
Test case:
import org.junit.runner.RunWith
import org.eclipse.xtext.junit4.XtextRunner
import org.xtext.example.mydsl.MyDslInjectorProvider
import org.eclipse.xtext.junit4.InjectWith
import org.junit.Test
import org.eclipse.xtext.junit4.util.ParseHelper
import com.google.inject.Inject
import org.xtext.example.mydsl.myDsl.PropertyFile
import static org.junit.Assert.*
#RunWith(typeof(XtextRunner))
#InjectWith(typeof(MyDslInjectorProvider))
class ParserTest {
#Inject
ParseHelper<PropertyFile> helper
#Test
def void testSample() {
val file = helper.parse('''
[Section1]
a = Easy123
b : This *is* valid too;
[Section_2]
# comment
c = Voilà # inline comments are ignored
''')
assertEquals(2, file.sections.size)
val section1 = file.sections.head
assertEquals(2, section1.properties.size)
assertEquals("a", section1.properties.head.name)
assertEquals("Easy123", section1.properties.head.value)
assertEquals("b", section1.properties.last.name)
assertEquals("This *is* valid too", section1.properties.last.value)
val section2 = file.sections.last
assertEquals(1, section2.properties.size)
assertEquals("Voilà # inline comments are ignored", section2.properties.head.value)
}
}
The problem (or one problem anyway) with parsing a format like that is that, since the text part may contain = characters, a line like foo = bar will be interpreted as a single TEXT token, not an ID, followed by a '=', followed by a TEXT. I can see no way to avoid that without disallowing (or requiring escaping of) = characters in the text part.
If that is not an option, I think, the only solution would be to make a token type LINE that matches an entire line and then take that apart yourself. You'd do that by removing TEXT and ID from your grammar and replacing them with a token type LINE that matches everything up to the next line break or comment sign and must start with a valid ID. So something like this:
LINE :
('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*
WS* '=' WS*
!('\r' | '\n' | '#')+
;
This token would basically replace your Property rule.
Of course this is a rather unsatisfactory solution as it will give you the entire line as a string and you still have to pick it apart yourself to separate the ID from the text part. It also prevents you from highlighting the ID part or the = sign as the entire line is one token and you can't highlight part of a token (as far as I know). Overall this does not buy you all that much over not using XText at all, but I don't see a better way.
As a workaround, I have changed
Property:
name=ID ':' value=ID ';'?;
Now, of course, = is not in conflict any more, but this is certainly not a good solution, because properties can usually defined with name=value
Edit: Actually, my input is a specific property file, and the properties are know in advance.
My code now looks like
Section:
'[' name=ID ']'
(NEWLINE (properties+=AbstractProperty)?)+;
AbstractProperty:
ADef
| BDef
ADef:
'A' (':'|'=') ID;
BDef:
'B' (':'|'=') Float;
There is an extra benefit, the property names are know as keywords, and colored as such. However, autocompletion only suggest '[' :(

Parsing unstructured text with ANTLR

As an example, lets say I want to parse mostly unstructured text with single markup element, double star **. This is my ANTLR grammar:
text : (plain | tag)+ ;
plain : ~(TAG) ;
tag : TAG tag_inner TAG ;
tag_inner : ~(TAG) ;
TAG : '**' ;
TEXT : ('a'..'z' | ' ' | '.')+ ;
This grammar works just fine if the text I'm parsing is syntactically correct, that is for every opening ** there is a closing **. If there is an odd number of **s, ANTLR complains, and errors out.
How would one fix this, so that ANTLR will look ahead for a closing double star, and if there is no one treat that lone double star as plain text? I'm pretty sure ANTLR can do this and that syntactic/semantic predicates are the answer, but after an our spent reading the docs, I still can't work it out.
This will get messy when you expand your grammar! :)
But, sure, it is possible using predicates. Here's a demo:
T.g
grammar T;
options {
output=AST;
}
tokens {
ROOT;
PROPER_TAG;
}
parse
: text+ EOF -> ^(ROOT text+)
;
text
: (tag)=> tag // syntactic predicate here! (the `(...)=>`)
| plain
| TAG
;
plain
: ~TAG
;
tag
: TAG plain TAG -> ^(PROPER_TAG plain)
;
TAG : '**' ;
TEXT : ('a'..'z' | ' ' | '.')+ ;
Main.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;
public class Main {
public static void main(String[] args) throws Exception {
TLexer lexer = new TLexer(new ANTLRStringStream("this **is** just **a simple** demo **."));
TParser parser = new TParser(new CommonTokenStream(lexer));
CommonTree tree = (CommonTree)parser.parse().getTree();
DOTTreeGenerator gen = new DOTTreeGenerator();
StringTemplate st = gen.toDOT(tree);
System.out.println(st);
}
}
Run the demo
java -cp antlr-3.3.jar org.antlr.Tool T.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
which will produce some DOT-output that corresponds to the following AST:
(image created using graphviz-dev.appspot.com)

Resources