Scope Provider for Nested Objects - xtext

I am trying to write a scope provider for an entity mapping model in which an entity's attributes can reference other entities, and the mapping part of the DSL can use the attributes of the nested entities.
I followed the instructions in this blog and it works perfectly, but I have run into a complication with QualifiedNames.
My DSL looks like this:
Model:
elements+=BaseElement*;
BaseElement:
Context | Entity | Mapping;
QualifiedName:
ID ('.' ID)*;
QualifiedNameWithWildcard:
QualifiedName '.*'?;
Context:
'context' name=QualifiedName '{'
entities+=Entity*
'}';
Entity:
'entity' name=ID
fields+=Field*;
Field:
Attribute | Reference
;
Attribute:
name=ID type=TYPE_ENUM
;
Reference:
name=ID type=[Entity]
;
enum TYPE_ENUM:
INT | LONG | STRING | BOOLEAN | UUID
;
Mapping:
'mapping' name=ID
mappings+=FieldMapping*
;
FieldMapping:
name=ID '=>' ref=DotExpression
;
DotExpression returns Ref:
EntityRef ({DotExpression.ref=current} "." tail=[Field])*
;
EntityRef returns Ref:
{EntityRef} entity=[Entity]
;
With this model everything works perfectly:
entity entityA {
x STRING
y STRING
z INT
}
entity EntityB {
a INT
b STRING
}
entity EntityC {
m STRING
n INT
p EntityA
}
mapping Absender
Sender_PostCode => Entity.p.x
The complication arises when I use the Context element and change the DSL to use QualifiedName accordingly:
context org.test.example {
entity entityA {
x STRING
y STRING
z INT
}
entity EntityB {
a INT
b STRING
}
entity EntityC {
m STRING
n INT
p EntityA
}
}
mapping Absender
Sender_PostCode => Entity.p.x
with this change to the grammar:
Reference:
name=ID type=[Entity|QualifiedName]
;
EntityRef returns Ref:
{EntityRef} entity=[Entity|QualifiedName]
;
Now Eclipse's smart help (content assist) is not able to resolve 'Entity.p.x'.
I guess QualifiedName confuses the parser for DotExpression with the following ScopeProvider:
@Override
public IScope getScope(EObject context, EReference reference) {
if(context instanceof DotExpression) {
DotExpression dotExpression = (DotExpression) context;
Ref head = dotExpression.getRef();
return recursive(head);
}
return super.getScope(context, reference);
}
private IScope recursive(Ref head) {
if(head instanceof EntityRef) {
EntityRef entity = (EntityRef) head;
return Scopes.scopeFor(entity.getEntity().getFields());
} else if(head instanceof DotExpression) {
DotExpression nextDE = (DotExpression) head;
if(nextDE.getTail() instanceof Attribute) {
return IScope.NULLSCOPE;
} else if(nextDE.getTail() instanceof Reference) {
Reference ref = (Reference) nextDE.getTail();
return Scopes.scopeFor(ref.getType().getFields());
} else if(nextDE.getTail() instanceof DotExpression){
DotExpression tail = (DotExpression) nextDE.getTail();
return recursive((tail.getRef()));
} else {
return IScope.NULLSCOPE;
}
} else {
return IScope.NULLSCOPE;
}
}
Any ideas on how I can integrate QualifiedName with DotExpression?
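If the guess above is right, the likely cause is that the QualifiedName rule (ID ('.' ID)*) is greedy: once the cross-reference is declared as [Entity|QualifiedName], the whole text 'EntityC.p.x' can be consumed as one entity name, leaving nothing for the DotExpression tail. A minimal plain-Java sketch of one way to think about the split (try the longest prefix that names a known entity and treat the rest as the field path) follows. DottedPathSplitter and Split are hypothetical names for illustration and are not part of Xtext; how this would be wired into a grammar or scope provider is left open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not an Xtext class: splits a dotted path into entity name and field path. */
final class DottedPathSplitter {

    /** Result of a split: the qualified entity name and the remaining field names. */
    static final class Split {
        final String entityName;
        final List<String> fieldPath;

        Split(String entityName, List<String> fieldPath) {
            this.entityName = entityName;
            this.fieldPath = fieldPath;
        }
    }

    /**
     * Tries the longest prefix first and shortens it segment by segment.
     * Returns null if no prefix of the path names a known entity.
     */
    static Split split(String dottedPath, Set<String> knownEntities) {
        String[] segments = dottedPath.split("\\.");
        for (int end = segments.length; end >= 1; end--) {
            String candidate = String.join(".", Arrays.copyOfRange(segments, 0, end));
            if (knownEntities.contains(candidate)) {
                // Everything after the entity name is treated as a chain of fields.
                return new Split(candidate,
                        Arrays.asList(Arrays.copyOfRange(segments, end, segments.length)));
            }
        }
        return null;
    }
}
```

With the known entity "org.test.example.EntityC", the path "org.test.example.EntityC.p.x" splits into that entity name and the field path [p, x].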

How to change the delimiter of QualifiedNames

I want to change the delimiter of the QualifiedName from '.' to '#'. Below is my attempt, based on the following example from the online documentation.
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Domainmodel:
(elements+=AbstractElement)*;
PackageDeclaration:
'package' name=QualifiedName '{'
(elements+=AbstractElement)*
'}';
AbstractElement:
PackageDeclaration | Type;
QualifiedName:
ID ('#' ID)*;
Type:
DataType | Entity;
DataType:
'datatype' name=ID;
Entity:
'entity' name=ID ('extends' superType=[Entity|QualifiedName])? '{'
(features+=Feature)*
'}';
Feature:
(many?='many')? name=ID ':' type=[Type|QualifiedName];
package org.xtext.example.mydsl
import org.eclipse.xtext.naming.IQualifiedNameConverter
import org.eclipse.xtext.naming.DefaultDeclarativeQualifiedNameProvider
/**
* Use this class to register components to be used at runtime / without the Equinox extension registry.
*/
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
override bindIQualifiedNameProvider() {
return MyDslQualifiedNameProvider
}
}
class MyDslQualifiedNameProvider extends DefaultDeclarativeQualifiedNameProvider {
val converter = new MyDslQualifiedNameConverter();
override getConverter(){
converter
}
}
class MyDslQualifiedNameConverter extends IQualifiedNameConverter.DefaultImpl {
override getDelimiter() {
return "#";
}
}
I cannot refer to an Entity inside a package, such as "my#company#blog#Blog", in the following example. The IDE suggests the expression "my#company#blog.Blog", but that doesn't work either.
datatype String
package my#company#blog{
entity Blog{
title : String
}
}
entity Blog2 extends my#company#blog#Blog{
title : String
}
Binding the converter through Guice solves it. Here is how it is done. There is no bind method to override in 'AbstractMyDslRuntimeModule'; the parent class, 'AbstractGenericModule', does the trick via its own 'getBindings' method.
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
def Class<? extends IQualifiedNameConverter> bindIQualifiedNameConverter() {
return MyDslQualifiedNameConverter;
}
}
class MyDslQualifiedNameConverter extends IQualifiedNameConverter.DefaultImpl {
override getDelimiter() {
return "#";
}
}
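To make the role of the delimiter concrete, here is a minimal plain-Java sketch of what a delimiter-based name converter does: it splits text into segments and joins segments back into text. DelimiterNameConverter is a hypothetical stand-in written for illustration, not the Xtext IQualifiedNameConverter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a qualified-name converter; not the Xtext class. */
final class DelimiterNameConverter {
    private final String delimiter;

    DelimiterNameConverter(String delimiter) {
        this.delimiter = delimiter;
    }

    /** Splits the text at every occurrence of the delimiter. */
    List<String> toSegments(String qualifiedName) {
        // Pattern.quote keeps delimiters such as "." from being read as regex syntax.
        return Arrays.asList(qualifiedName.split(Pattern.quote(delimiter)));
    }

    /** Joins the segments with the delimiter in between. */
    String toText(List<String> segments) {
        return String.join(delimiter, segments);
    }
}
```

With "#" as delimiter, "my#company#blog#Blog" yields four segments. With "." as delimiter, "my#company#blog.Blog" yields only two, the first being the whole text "my#company#blog". That would be consistent with the IDE's suggestion in the question, assuming the default "." converter was still the one in use there.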

ANTLR's parser enters into "wrong" rule

I'm trying to create an interpreter for a simple programming language using ANTLR. So far it consists of print and numeric expressions.
I created a 'simpleExpr' parser rule to handle negative numbers. I tried other ways too, but this is the only one that seems to work for me. However, for some reason my visitor enters this rule even when I would expect it to visit my 'number' rule. I don't think it is the visitor's fault, because even the tree drawn by ANTLR shows this behavior. That is odd but would be acceptable; my actual problem is that when I try to print the result of a simple addition, e.g. print(1+2);, it doesn't do that, but enters the 'number' rule instead of the 'Plus' rule.
My grammar:
grammar BatshG;
/*
* Parser Rules
*/
compileUnit: (expression | ( println ';') | ( print ';' ))+;
expression:
left=expression '/' right=simpleExpr #Divi
| left=expression '*' right=simpleExpr #Mult
| left=expression '-' right=simpleExpr #Minus
| left=expression '+' right=simpleExpr #Plus
| number=simpleExpr #Number
;
println: 'println' '(' argument=expression ')';
print: 'print' '(' argument=expression ')';
simpleExpr
: (MINUS)?
(FLOAT | INTEGER)
;
MINUS: '-';
INTEGER: [0-9] [0-9]*;
DIGIT : [0-9] ;
FRAC : '.' DIGIT+ ;
EXP : [eE] [-+]? DIGIT+ ;
FLOAT : DIGIT* FRAC EXP? ;
WS: [ \n\t\r]+ -> channel(HIDDEN);
If it helps, here is my visualized tree generated by ANTLR for
print(1+2);
Update:
The visitor class, if it counts:
public class BatshGVisitor : BatshGBaseVisitor<ResultValue>
{
public ResultValue Result { get; set; }
public StringBuilder OutputForPrint { get; set; }
public override ResultValue VisitCompileUnit([NotNull] BatshGParser.CompileUnitContext context)
{
OutputForPrint = new StringBuilder("");
var resultvalue = VisitChildren(context);
Result = new ResultValue() { ExpType = "string", ExpValue = resultvalue.ExpValue };
return Result;
}
public override ResultValue VisitPlus([NotNull] BatshGParser.PlusContext context)
{
var leftExp = VisitChildren(context.left);
var rigthExp = VisitChildren(context.right);
return new ResultValue()
{
ExpType = "number",
ExpValue = (double)leftExp.ExpValue + (double)rigthExp.ExpValue
};
}
//public override ResultValue VisitNumber([NotNull] BatshGParser.NumberContext context)
//{
// return new ResultValue()
// {
// ExpType = "number",
// ExpValue = Double.Parse(context.GetChild(0).GetText()
// + context.GetChild(1).GetText()
// + context.GetChild(2).GetText()
// , CultureInfo.InvariantCulture)
// };
//}
public override ResultValue VisitPrint([NotNull] BatshGParser.PrintContext context)
{
var viCh = VisitChildren(context.argument);
var viChVa = viCh.ExpValue;
string printInner = viChVa.ToString();
var toPrint = new ResultValue()
{
ExpType = viCh.ExpType,
ExpValue = printInner
};
OutputForPrint.Append(toPrint.ExpValue);
return toPrint;
}
public override ResultValue VisitSimpleExpr([NotNull] BatshGParser.SimpleExprContext context)
{
string numberToConvert = "";
if (context.ChildCount == 1)
{
numberToConvert = context.GetChild(0).GetText();
}
else if (context.GetChild(0).ToString() == "-")
{
if (context.ChildCount == 2)
{
numberToConvert = "-" + context.GetChild(1);
}
if (context.ChildCount == 4)
{
numberToConvert = context.GetChild(0).ToString() + context.GetChild(1).ToString() +
context.GetChild(2).ToString() + context.GetChild(3).ToString();
}
}
return new ResultValue()
{
ExpType = "number",
ExpValue = Double.Parse(numberToConvert, CultureInfo.InvariantCulture)
};
}
protected override ResultValue AggregateResult(ResultValue aggregate, ResultValue nextResult)
{
if (aggregate == null)
return new ResultValue()
{
ExpType = nextResult.ExpType,
ExpValue = nextResult.ExpValue
};
if (nextResult == null)
{
return aggregate;
}
return null;
}
}
What's the problem with my grammar?
Thank you!
Inside the visit method for print statements, you have this:
var viCh = VisitChildren(context.argument);
So let's say your input was print(1+2);. Then context.argument would be the PlusContext for 1+2, and the children of context.argument would be a NumberContext for 1, a terminal node for the + token and a SimpleExprContext for 2. By calling VisitChildren, you visit those children directly, which is why it never runs VisitPlus and goes straight to the numbers.
Generally, you rarely want to visit the children of some other node. You usually want to visit your own children, not skip the children and directly visit the grand children. So what you should do instead is to call Visit(context.argument);.
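The difference between the two calls can be shown with a tiny hand-written tree. This is only a sketch in plain Java: ExprNode, NumberNode, PlusNode and MiniVisitor are hypothetical names, not ANTLR classes, and the sketch assumes that visiting children keeps the last child's result.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal stand-in for a parse tree node; not ANTLR's API. */
abstract class ExprNode {
    final List<ExprNode> children = new ArrayList<>();

    abstract Double accept(MiniVisitor visitor);
}

final class NumberNode extends ExprNode {
    final double value;

    NumberNode(double value) {
        this.value = value;
    }

    @Override
    Double accept(MiniVisitor visitor) {
        return visitor.visitNumber(this);
    }
}

final class PlusNode extends ExprNode {
    PlusNode(ExprNode left, ExprNode right) {
        children.add(left);
        children.add(right);
    }

    @Override
    Double accept(MiniVisitor visitor) {
        return visitor.visitPlus(this);
    }
}

final class MiniVisitor {
    /** Dispatches to the handler of the node itself. */
    Double visit(ExprNode node) {
        return node.accept(this);
    }

    /** Skips the node's own handler and visits only its children (assumption: last result wins). */
    Double visitChildren(ExprNode node) {
        Double result = null;
        for (ExprNode child : node.children) {
            result = visit(child);
        }
        return result;
    }

    Double visitNumber(NumberNode node) {
        return node.value;
    }

    Double visitPlus(PlusNode node) {
        return visit(node.children.get(0)) + visit(node.children.get(1));
    }
}
```

For a PlusNode built from 1 and 2, visit runs visitPlus and yields 3.0, while visitChildren never reaches visitPlus and only yields the value of the last operand.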

How to input null-value into specflow step definition table

How can I input a null value in Specflow through a table?
Let's look at an overly simplistic example:
When a tire is attached to a car
| CarId | TireModel | FabricationDate | Batch |
| 1 | Nokian Hakka R | 2015-09-1 | |
SpecFlow interprets the empty Batch cell as text, i.e. as an empty string. Is there a special syntax to mark that column as null?
You can create your own IValueRetriever and replace the default one with yours:
public class StringValueRetriver : IValueRetriever
{
public bool CanRetrieve(KeyValuePair<string, string> keyValuePair, Type targetType, Type propertyType)
{
return propertyType == typeof(string);
}
public object Retrieve(KeyValuePair<string, string> keyValuePair, Type targetType, Type propertyType)
{
return string.IsNullOrEmpty(keyValuePair.Value) ? null : keyValuePair.Value;
}
}
Somewhere in your scenario steps:
[BeforeScenario]
public void BeforeScenario()
{
Service.Instance.ValueRetrievers.Unregister<TechTalk.SpecFlow.Assist.ValueRetrievers.StringValueRetriever>();
Service.Instance.ValueRetrievers.Register(new StringValueRetriver());
}
Older syntax:
[BeforeScenario]
public void BeforeScenario()
{
var defaultStringValueRetriever = Service.Instance.ValueRetrievers.FirstOrDefault(vr => vr is TechTalk.SpecFlow.Assist.ValueRetrievers.StringValueRetriever);
if (defaultStringValueRetriever != null)
{
Service.Instance.UnregisterValueRetriever(defaultStringValueRetriever);
Service.Instance.RegisterValueRetriever(new StringValueRetriver());
}
}
From SpecFlow 3 onwards, you can write the null value in the feature file like this and register the retriever in your Steps class with the code after it. When you then use the CreateSet function, the value will be deserialized correctly.
| Id | Value  |
| 1  | <null> |
[Binding]
public static class YourStepClass
{
[BeforeTestRun]
public static void BeforeTestRun()
{
Service.Instance.ValueRetrievers.Register(new NullValueRetriever("<null>"));
}
}
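The marker idea itself does not depend on SpecFlow. As a rough plain-Java sketch (NullMarker is a hypothetical helper, not a SpecFlow type), a table cell is turned into null only when it contains the agreed marker text, so an empty cell stays distinguishable from null:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper illustrating the marker idea; not part of SpecFlow. */
final class NullMarker {
    private final String marker;

    NullMarker(String marker) {
        this.marker = marker;
    }

    /** Returns null for the marker text and the cell text unchanged otherwise. */
    String retrieve(String cell) {
        return marker.equals(cell) ? null : cell;
    }

    /** Applies retrieve to every cell of one table row (header to cell text). */
    Map<String, String> convertRow(Map<String, String> row) {
        // LinkedHashMap keeps the column order and permits null values.
        Map<String, String> converted = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            converted.put(cell.getKey(), retrieve(cell.getValue()));
        }
        return converted;
    }
}
```

Compared with treating every empty string as null, an explicit marker lets a scenario still express an intentionally empty value.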
I don't believe there is a special syntax for null and I think you'll have to just handle the conversion yourself. The value retrievers have been revised in the v2 branch and you might be able to handle this by deregistering the standard string value retriever and registering your own implementation which looks for some special syntax and returns null.
In the current 1.9.* version though I think you'll just have to check for empty string and return null yourself.
I've just chosen to do this on a case-by-case basis using a simple extension method.
In the step handler I take the example value passed in as a parameter and call NullIfEmpty() on it.
Example usage:
AndICheckTheBatchNumber(string batch) {
batch = batch.NullIfEmpty();
//use batch as null how you intended
}
Extension method
using System;
namespace Util.Extensions
{
public static class StringExtensions
{
public static string NullIfEmpty(this string str)
{
if (string.IsNullOrEmpty(str))
{
return null;
}
return str;
}
}
}
Combining answers, I did the following:
using TechTalk.SpecFlow;
using TechTalk.SpecFlow.Assist;
using TechTalk.SpecFlow.Assist.ValueRetrievers;
namespace Util.Extensions
{
public class NullValueComparer : IValueComparer
{
private readonly string _nullValue;
public NullValueComparer(string nullValue)
{
_nullValue = nullValue;
}
public bool CanCompare(object actualValue)
{
return actualValue is null || actualValue is string;
}
public bool Compare(string expectedValue, object actualValue)
{
if (_nullValue == expectedValue)
{
return actualValue == null;
}
return expectedValue == (string)actualValue;
}
}
}
And referenced it like this:
[Binding]
public class MyStepDefinitions
{
private MyTestDto _testDto;
private AnotherDtoFromElsewhere _actual;
[BeforeScenario]
public void BeforeTestRun()
{
Service.Instance.ValueRetrievers.Register(new NullValueRetriever("<null>"));
Service.Instance.ValueComparers.Register(new NullValueComparer("<null>"));
}
[When(@"Some test with table:")]
public void WhenTestWithTable(Table table)
{
_testDto = table.CreateInstance<MyTestDto>();
var actual = new AnotherDtoFromElsewhere();
table.CompareToInstance(actual);
}
[Then(@"X should match:")]
public void ThenShouldMatch(Table table)
{
table.CompareToInstance(_actual);
}
}

Grails GORM: How do I create a composite primary key and use it for a table relationship?

I have two tables. One of them (the legacy table, A) has two fields that should serve as a composite foreign key, and the other one (the new table, B) should use a composite primary key, so that each row of A has one row of B. How do I describe these tables in terms of GORM?
So far I've been able to create a domain class that reflects the legacy table A:
class A {
...
//composite foreign key to link B class
String className;
String eventName;
B b; //instance of B to be related
static mapping = {
table 'a_table';
id column: 'id';
className column: 'class_name';
eventName column: 'event_name';
//b: ???
}
}
which works, but I can't create a new class:B and the relationship.
I tried to declare B as:
class B implements Serializable{
static auditable = true;
String name;
String className;
String eventName;
static mapping = {
//supposed to make a composite PK
id composite:[className, eventName]
}
}
but this won't compile with a
ERROR context.GrailsContextLoader - Error executing bootstraps: Error evaluating ORM mappings block for domain [com.package.B]: No such property: eventName for class: org.codehaus.groovy.grails.orm.hibernate.cfg.HibernateMappingBuilder
What I want is something like:
static mapping = {
...
b composite: [b.className:className, b.eventName:eventName]
//or whatever is the right way for this to be done.
}
for the A class to make GORM handle this relation.
Did you try using the attribute names (as strings) instead of the attribute values?
class B implements Serializable{
String name;
String className;
String eventName;
static mapping = {
//supposed to make a composite PK
id composite:['className', 'eventName']
}
}
And the mapping in A:
class A {
static hasMany = [ b : B ]
}
There is no need to have className or eventName in A.
The B class has to be declared Serializable and must implement the equals and hashCode methods:
import org.apache.commons.lang.builder.HashCodeBuilder
class B implements Serializable{
static auditable = true;
String name;
String className;
String eventName;
boolean equals(other) {
if (!(other instanceof B)) {
return false
}
other.className == className && other.eventName == eventName
}
int hashCode() {
def builder = new HashCodeBuilder()
builder.append className
builder.append eventName
builder.toHashCode()
}
static mapping = {
//supposed to make a composite PK
id composite:["className", "eventName"]
}
}
The A class then just needs an attribute of type B, and GORM creates the composite foreign key:
class A {
//composite foreign key to link B class
B b; //instance of B to be related
static mapping = {
table 'a_table';
id column: 'id';
}
}
This creates two tables in the database:
+--------------------------------------------------+
| A_TABLE |
+--------+-----------+--------------+--------------+
| id | version | b_className | b_eventName |
+--------+-----------+--------------+--------------+
--where the primary key is "id" and the foreign key columns are "b_className" and "b_eventName"
+--------------------------------------------------+
| B |
+--------+-----------+--------------+--------------+
| name | version | className | eventName |
+--------+-----------+--------------+--------------+
--where the primary key consists of "className" and "eventName"
If you want to change the column names, just add a columns clause to the mapping block:
class A {
//composite foreign key to link B class
B b; //instance of B to be related
static mapping = {
table 'a_table';
id column: 'id';
columns {
b {
column name: "className"
column name: "eventName"
}
}
}
}
import org.apache.commons.lang.builder.HashCodeBuilder
class Person implements Serializable {
String firstName
String lastName
boolean equals(other) {
if (!(other instanceof Person)) {
return false
}
other.firstName == firstName && other.lastName == lastName
}
int hashCode() {
def builder = new HashCodeBuilder()
builder.append firstName
builder.append lastName
builder.toHashCode()
}
static mapping = {
id composite: ['firstName', 'lastName']
}
}
This is the example you can find in the official Grails docs:
http://grails.org/doc/latest/guide/GORM.html#compositePrimaryKeys
Follow it and your problem should be solved. For an explanation, refer to the link above.
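As background on why the answers above insist on equals and hashCode: a composite key is compared by the values of its parts rather than by object identity. A minimal plain-Java sketch of such a value-based key follows; EventKey is a hypothetical class mirroring B's className/eventName pair and is not something GORM generates.

```java
import java.io.Serializable;
import java.util.Objects;

/** Hypothetical composite key mirroring B's className/eventName pair. */
final class EventKey implements Serializable {
    private static final long serialVersionUID = 1L;

    final String className;
    final String eventName;

    EventKey(String className, String eventName) {
        this.className = className;
        this.eventName = eventName;
    }

    /** Two keys are equal when both parts are equal. */
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof EventKey)) {
            return false;
        }
        EventKey that = (EventKey) other;
        return Objects.equals(className, that.className)
                && Objects.equals(eventName, that.eventName);
    }

    /** Must agree with equals so that hash-based lookups find equal keys. */
    @Override
    public int hashCode() {
        return Objects.hash(className, eventName);
    }
}
```

Without these two methods, two separately created keys with the same className and eventName would not find each other in a HashMap or HashSet.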

mapping list of string into hierarchical structure of objects

This is not a homework problem. This question was asked of a friend of mine in an interview test.
I have a list of lines read from a file as input. Each line has an identifier such as (A,B,NN,C,DD) at the start of the line. Depending on the identifier, I need to map the list of records into a single object A which contains a hierarchical structure of objects.
Description of Hierarchy :
Each A can have zero or more B types.
Each B identifier can have zero or more NN and C as children. Similarly, each C segment can have zero or more NN and DD children, and each DD can have zero or more NN as children.
Mapping classes and their hierarchy:
Every class has a value field that holds the String value of the current line.
**A - will have list of B**
class A {
List<B> bList;
String value;
public A(String value) {
this.value = value;
}
public void addB(B b) {
if (bList == null) {
bList = new ArrayList<B>();
}
bList.add(b);
}
}
**B - will have list of NN and list of C**
class B {
List<C> cList;
List<NN> nnList;
String value;
public B(String value) {
this.value = value;
}
public void addNN(NN nn) {
if (nnList == null) {
nnList = new ArrayList<NN>();
}
nnList.add(nn);
}
public void addC(C c) {
if (cList == null) {
cList = new ArrayList<C>();
}
cList.add(c);
}
}
**C - will have list of DDs and NNs**
class C {
List<DD> ddList;
List<NN> nnList;
String value;
public C(String value) {
this.value = value;
}
public void addDD(DD dd) {
if (ddList == null) {
ddList = new ArrayList<DD>();
}
ddList.add(dd);
}
public void addNN(NN nn) {
if (nnList == null) {
nnList = new ArrayList<NN>();
}
nnList.add(nn);
}
}
**DD - will have list of NNs**
class DD {
String value;
List<NN> nnList;
public DD(String value) {
this.value = value;
}
public void addNN(NN nn) {
if (nnList == null) {
nnList = new ArrayList<NN>();
}
nnList.add(nn);
}
}
**NN- will hold the line only**
class NN {
String value;
public NN(String value) {
this.value = value;
}
}
What I Did So Far :
The method public A parse(List<String> lines) reads the input list and returns the object A. Since there might be multiple B, I have created a separate method parseB to parse each occurrence.
The parseB method loops from i = startIndex + 1 to i < lines.size() and checks the start of each line. An occurrence of "NN" is added to the current B object. If "C" is detected at the start, it calls another method, parseC. The loop breaks when we detect "B" or "A" at the start.
Similar logic is used in parseC_DD.
public class GTTest {
public A parse(List<String> lines) {
A a;
for (int i = 0; i < lines.size(); i++) {
String curLine = lines.get(i);
if (curLine.startsWith("A")) {
a = new A(curLine);
continue;
}
if (curLine.startsWith("B")) {
i = parseB(lines, i); // returns index i to skip all the lines that are read inside parseB(...)
continue;
}
}
return a; // return mapped object
}
private int parseB(List<String> lines, int startIndex) {
int i;
B b = new B(lines.get(startIndex));
for (i = startIndex + 1; i < lines.size(); i++) {
String curLine = lines.get(i);
if (curLine.startsWith("NN")) {
b.addNN(new NN(curLine));
continue;
}
if (curLine.startsWith("C")) {
i = parseC(b, lines, i);
continue;
}
a.addB(b);
if (curLine.startsWith("B") || curLine.startsWith("A")) { //ending condition
System.out.println("B A "+curLine);
--i;
break;
}
}
return i; // return nextIndex to read
}
private int parseC(B b, List<String> lines, int startIndex) {
int i;
C c = new C(lines.get(startIndex));
for (i = startIndex + 1; i < lines.size(); i++) {
String curLine = lines.get(i);
if (curLine.startsWith("NN")) {
c.addNN(new NN(curLine));
continue;
}
if (curLine.startsWith("DD")) {
i = parseC_DD(c, lines, i);
continue;
}
b.addC(c);
if (curLine.startsWith("C") || curLine.startsWith("A") || curLine.startsWith("B")) {
System.out.println("C A B "+curLine);
--i;
break;
}
}
return i;//return next index
}
private int parseC_DD(C c, List<String> lines, int startIndex) {
int i;
DD d = new DD(lines.get(startIndex));
c.addDD(d);
for (i = startIndex; i < lines.size(); i++) {
String curLine = lines.get(i);
if (curLine.startsWith("NN")) {
d.addNN(new NN(curLine));
continue;
}
if (curLine.startsWith("DD")) {
d=new DD(curLine);
continue;
}
c.addDD(d);
if (curLine.startsWith("NN") || curLine.startsWith("C") || curLine.startsWith("A") || curLine.startsWith("B")) {
System.out.println("NN C A B "+curLine);
--i;
break;
}
}
return i;//return next index
}
public static void main(String[] args) {
GTTest gt = new GTTest();
List<String> list = new ArrayList<String>();
list.add("A1");
list.add("B1");
list.add("NN1");
list.add("NN2");
list.add("C1");
list.add("NNXX");
list.add("DD1");
list.add("DD2");
list.add("NN3");
list.add("NN4");
list.add("DD3");
list.add("NN5");
list.add("B2");
list.add("NN6");
list.add("C2");
list.add("DD4");
list.add("DD5");
list.add("NN7");
list.add("NN8");
list.add("DD6");
list.add("NN7");
list.add("C3");
list.add("DD7");
list.add("DD8");
A a = gt.parse(list);
//show values of a
}
}
My logic is not working properly. Is there any other approach you can figure out? Do you have any suggestions/improvements to my way?
Use a hierarchy of objects:
public interface Node {
Node getParent();
Node getLastChild();
boolean addChild(Node n);
void setValue(String value);
Deque<Node> getChildren();
}
private static abstract class NodeBase implements Node {
...
abstract boolean canInsert(Node n);
public String toString() {
return value;
}
...
}
public static class A extends NodeBase {
boolean canInsert(Node n) {
return n instanceof B;
}
}
public static class B extends NodeBase {
boolean canInsert(Node n) {
return n instanceof NN || n instanceof C;
}
}
...
public static class NN extends NodeBase {
boolean canInsert(Node n) {
return false;
}
}
Create a tree class:
public class MyTree {
Node root;
Node lastInserted = null;
public void insert(String label) {
Node n = NodeFactory.create(label);
if (lastInserted == null) {
root = n;
lastInserted = n;
return;
}
Node current = lastInserted;
while (!current.addChild(n)) {
current = current.getParent();
if (current == null) {
throw new RuntimeException("Impossible to insert " + n);
}
}
lastInserted = n;
}
...
}
And then print the tree:
public class MyTree {
...
public static void main(String[] args) {
List<String> input;
...
MyTree tree = new MyTree();
for (String line : input) {
tree.insert(line);
}
tree.print();
}
public void print() {
printSubTree(root, "");
}
private static void printSubTree(Node root, String offset) {
Deque<Node> children = root.getChildren();
Iterator<Node> i = children.descendingIterator();
System.out.println(offset + root);
while (i.hasNext()) {
printSubTree(i.next(), offset + " ");
}
}
}
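Since the snippets above elide several parts, here is a self-contained sketch of the same idea: starting from the last inserted node, climb towards the root until some node accepts the new child. Item and HierarchyBuilder are hypothetical names chosen for this sketch, and a table of allowed child kinds stands in for the canInsert methods; it is one possible reading of the approach, not the answer's exact code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One parsed line: its kind ("A", "B", "C", "DD", "NN"), its text and its children. */
final class Item {
    final String kind;
    final String value;
    final List<Item> children = new ArrayList<>();
    Item parent;

    Item(String kind, String value) {
        this.kind = kind;
        this.value = value;
    }
}

final class HierarchyBuilder {
    // Which child kinds each kind accepts, taken from the hierarchy description.
    private static final Map<String, List<String>> ALLOWED = new HashMap<>();
    static {
        ALLOWED.put("A", Arrays.asList("B"));
        ALLOWED.put("B", Arrays.asList("NN", "C"));
        ALLOWED.put("C", Arrays.asList("NN", "DD"));
        ALLOWED.put("DD", Arrays.asList("NN"));
        ALLOWED.put("NN", Collections.<String>emptyList());
    }

    private static final String[] KINDS = {"DD", "NN", "A", "B", "C"};

    static String kindOf(String line) {
        for (String kind : KINDS) {
            if (line.startsWith(kind)) {
                return kind;
            }
        }
        throw new IllegalArgumentException("Unknown identifier: " + line);
    }

    /** Builds the tree; the first line must be an A record. */
    static Item build(List<String> lines) {
        Item root = null;
        Item last = null;
        for (String line : lines) {
            Item node = new Item(kindOf(line), line);
            if (root == null) {
                if (!node.kind.equals("A")) {
                    throw new IllegalArgumentException("Input must start with A: " + line);
                }
                root = node;
                last = node;
                continue;
            }
            // Climb from the last inserted node until an ancestor accepts this kind.
            Item current = last;
            while (current != null && !ALLOWED.get(current.kind).contains(node.kind)) {
                current = current.parent;
            }
            if (current == null) {
                throw new IllegalArgumentException("Cannot place " + line);
            }
            node.parent = current;
            current.children.add(node);
            last = node;
        }
        return root;
    }
}
```

Calling HierarchyBuilder.build on the list from the question's main method returns the A1 item with B1 and B2 as children. A consequence of the climbing rule is that a line always attaches to the nearest node that accepts it, so an NN directly after a DD becomes a child of that DD.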
A Mealy automaton solution with 5 states:
wait for A,
seen A,
seen B,
seen C, and
seen DD.
The parse is done completely in one method. There is one current Node, which is the last Node seen other than the NN ones. Every Node except the root has a parent Node. In a state "seen X", the current Node represents an X (e.g. in state seen C, current can be C1 in the example above). The fiddliest state is seen DD, which has the most outgoing edges (B, C, DD, and NN).
public final class Parser {
private final static class Token { /* represents A1 etc. */ }
public final static class Node implements Iterable<Node> {
/* One Token + Node children, knows its parent */
}
private enum State { ExpectA, SeenA, SeenB, SeenC, SeenDD, }
public Node parse(String text) {
return parse(Token.parseStream(text));
}
private Node parse(Iterable<Token> tokens) {
State currentState = State.ExpectA;
Node current = null, root = null;
while(there are tokens) {
Token t = iterator.next();
switch(currentState) {
/* do stuff for all states */
/* example snippet for SeenC */
case SeenC:
if(t.Prefix.equals("B")) {
current.PN.PN.AddChildNode(new Node(t, current.PN.PN));
currentState = State.SeenB;
} else if(t.Prefix.equals("C")) {
/* ... handle the remaining prefixes ... */
}
}
}
return root;
}
}
I'm not satisfied with those trainwrecks that go up the hierarchy to insert a Node somewhere else (current.PN.PN). Possibly, explicit state classes would make the private parse method more readable. Then the solution becomes more akin to the one provided by @AlekseyOtrubennikov. Maybe a straight LL approach yields more beautiful code. It may be best to just rephrase the grammar as BNF and delegate parser creation.
A straightforward LL parser, one production rule:
// "B" ("NN" || C)*
private Node rule_2(TokenStream ts, Node parent) {
// Literal "B"
Node B = literal(ts, "B", parent);
if(B == null) {
// error
return null;
}
while(true) {
// check for "NN"
Node nnLit = literal(ts, "NN", B);
if(nnLit != null)
B.AddChildNode(nnLit);
// check for C
Node c = rule_3(ts, parent);
if(c != null)
B.AddChildNode(c);
// finished when both rules did not match anything
if(nnLit == null && c == null)
break;
}
return B;
}
TokenStream enhances Iterable<Token> by allowing lookahead into the stream. It is LL(1) because the parser must choose between the literal NN and descending into a nested rule in two cases (rule_2 being one of them). It looks nice; however, I'm missing some C# features here...
@Stefan and @Aleksey are correct: this is a simple parsing problem.
You can define your hierarchy constraints in Extended Backus-Naur Form:
A ::= { B }
B ::= { NN | C }
C ::= { NN | DD }
DD ::= { NN }
This description can be transformed into state machine and implemented. But there are a lot of tools that can effectively do this for you: Parser generators.
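For comparison with a generated parser, the four EBNF rules above can also be written by hand as a small recursive-descent parser, one method per rule. The following is only a sketch in plain Java; Tree and DescentParser are hypothetical names, and each input line is assumed to start with its identifier as in the question.

```java
import java.util.ArrayList;
import java.util.List;

/** A labelled tree node; the label is the full input line. */
final class Tree {
    final String label;
    final List<Tree> children = new ArrayList<>();

    Tree(String label) {
        this.label = label;
    }
}

/** Hypothetical recursive-descent sketch with one method per EBNF rule. */
final class DescentParser {
    private final List<String> lines;
    private int pos = 0;

    DescentParser(List<String> lines) {
        this.lines = lines;
    }

    /** Parses the whole input as a single A record and rejects leftover lines. */
    Tree parse() {
        Tree a = parseA();
        if (pos < lines.size()) {
            throw new IllegalArgumentException(
                    "Unexpected line " + (pos + 1) + ": " + lines.get(pos));
        }
        return a;
    }

    /** One line of lookahead: does the next line start with the prefix? */
    private boolean peek(String prefix) {
        return pos < lines.size() && lines.get(pos).startsWith(prefix);
    }

    /** Consumes the next line, which must start with the prefix. */
    private Tree expect(String prefix) {
        if (!peek(prefix)) {
            throw new IllegalArgumentException("Expected " + prefix + " at line " + (pos + 1));
        }
        return new Tree(lines.get(pos++));
    }

    // A ::= { B }
    private Tree parseA() {
        Tree a = expect("A");
        while (peek("B")) {
            a.children.add(parseB());
        }
        return a;
    }

    // B ::= { NN | C }
    private Tree parseB() {
        Tree b = expect("B");
        while (peek("NN") || peek("C")) {
            b.children.add(peek("NN") ? expect("NN") : parseC());
        }
        return b;
    }

    // C ::= { NN | DD }
    private Tree parseC() {
        Tree c = expect("C");
        while (peek("NN") || peek("DD")) {
            c.children.add(peek("NN") ? expect("NN") : parseDD());
        }
        return c;
    }

    // DD ::= { NN }
    private Tree parseDD() {
        Tree dd = expect("DD");
        while (peek("NN")) {
            dd.children.add(expect("NN"));
        }
        return dd;
    }
}
```

Usage would be new DescentParser(lines).parse(). For the input A1, B2, DD3 it throws an IllegalArgumentException pointing at line 3, because a DD cannot appear directly under a B.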
I am posting my answer only to show that it's quite easy to solve such problems with Haskell (or some other functional language).
Here is a complete program that reads strings from stdin and prints the parsed tree to stdout.
-- We are using some standard libraries.
import Control.Applicative ((<$>), (<*>))
import Text.Parsec
import Data.Tree
-- This is EBNF-like description of what to do.
-- You can almost read it like a prose.
yourData = nodeA +>> eof
nodeA = node "A" nodeB
nodeB = node "B" (nodeC <|> nodeNN)
nodeC = node "C" (nodeNN <|> nodeDD)
nodeDD = node "DD" nodeNN
nodeNN = (`Node` []) <$> nodeLabel "NN"
node lbl children
= Node <$> nodeLabel lbl <*> many children
nodeLabel xx = (xx++)
<$> (string xx >> many digit)
+>> newline
-- And this is some auxiliary code.
f +>> g = f >>= \x -> g >> return x
main = do
txt <- getContents
case parse yourData "" txt of
Left err -> print err
Right res -> putStrLn $ drawTree res
Executing it with your data in zz.txt will print this nice tree:
$ ./xxx < zz.txt
A1
+- B1
| +- NN1
| +- NN2
| `- C1
| +- NN2
| +- DD1
| +- DD2
| | +- NN3
| | `- NN4
| `- DD3
| `- NN5
`- B2
+- NN6
+- C2
| +- DD4
| +- DD5
| | +- NN7
| | `- NN8
| `- DD6
| `- NN9
`- C3
+- DD7
`- DD8
And here is how it handles malformed input:
$ ./xxx
A1
B2
DD3
(line 3, column 1):
unexpected 'D'
expecting "B" or end of input
