How can I convert concrete syntax values to other kinds of values? - rascal

Given some concrete syntax value, how can I map it to a different type of value (in this case an int)?
// Syntax
start syntax MyTree = \node: "(" MyTree left "," MyTree right ")"
| leaf: Leaf leaf
;
layout MyLayout = [\ \t\n\r]*;
lexical Leaf = [0-9]+;
This does not work unfortunately:
public Tree increment() {
MyTree tree = (MyTree)`(3, (1, 10))`;
return visit(tree) {
case l:(Leaf)`3` => l + 1
};
}
Or is the only way to implode into an ADT where I specified the types?

Your question has different possible answers:
Using implode you can convert a parse tree to an abstract tree. If the constructors of the target abstract language expect int, then lexical trees which happen to match [0-9]+ will be automatically converted. For example, the syntax tree for syntax Exp = intValue: IntValue; could be converted to the constructor data Exp = intValue(int i); and it will actually build an int.
In general, to convert one type of value to another in Rascal you write (mutually) recursive functions, as in int eval(MyTree t) and int eval(Leaf l).
If you want to actually increment the syntactic representation of a Leaf value, you have to convert the resulting int back to a Leaf (by parsing or via a concrete pattern).
Example:
import String;
MyTree increment() {
MyTree tree = (MyTree)`(3, (1, 10))`;
return visit(tree) {
case Leaf l => [Leaf] "<toInt("<l>") + 1>";
};
}
First the lexical is converted to a string with "<l>". That string is parsed as an int using toInt(), and we add 1 to it. The resulting int is then mapped back to a string with "< ... >", after which we can call the Leaf parser on it using [Leaf].

Related

Get children count of a tree node

The docs for Node only mention the following operations:
Equal, GreaterThan, GreaterThanOrEqual, LessThan, LessThanOrEqual, NotEqual, Slice, Subscription
They do mention how to access a child by index using Subscription, but how can I find out how many children a node has, so that I can iterate over them?
Here is my use case:
Exp parsed = parse(#Exp, "2+(4+3)*48");
println("the number of root children is: " + size(parsed));
But it yields an error, as size() seems to only work with a list.
There are several possible answers, each with aspects that are better or worse. Here are a few:
import ParseTree;
int getChildrenCount1(Tree parsed) {
return (0 | it + 1 | _ <- parsed.args);
}
getChildrenCount1 iterates over the raw children of a parse tree node. This includes whitespace and comment nodes (layout) and keywords (literals). You might want to filter for those, or compensate by division.
On the other hand, this seems a bit indirect. We could also just directly ask for the length of the children list:
import List;
import ParseTree;
int getChildrenCount2(Tree parsed) {
return size(parsed.args) / 2 + 1; // here we divide by two assuming every other node is a layout node
}
There is also the meta-data route. Every parse tree node carries a declarative description of its production, which can be queried and explored:
import ParseTree;
import List;
// immediately match on the meta-structure of a parse node:
int getChildrenCount3(appl(Production prod, list[Tree] args)) {
return size(prod.symbols);
}
This length of symbols should be the same as the length of args.
// To filter for "meaningful" children in a declarative way:
int getChildrenCount4(appl(prod(_, list[Symbol] symbols, _), list[Tree] args)) {
return (0 | it + 1 | sort(_) <- symbols);
}
The sort pattern filters for context-free non-terminals as declared with syntax rules. Lexical children would match lex, while layout and literals would match layouts and lit respectively.
Without all that pattern matching:
int getChildrenCount5(Tree tree) {
return (0 | it + 1 | s <- tree.prod.symbols, isInteresting(s));
}
bool isInteresting(Symbol s) = s is sort || s is lex;
So far this seems to work, but it is awful:
int getChildrenCount(Tree parsed) {
int infinity = 1000;
for (int i <- [0..infinity]) {
try parsed[i];
catch: return i;
}
return infinity;
}
void main() {
Exp parsed = parse(#Exp, "132+(4+3)*48");
println("the number of root children is: ");
println(getChildrenCount(parsed));
}

Making Rascal structures look better

To keep structures clear, is it possible to name them? Essentially I am asking for a 'struct' in Rascal. So for example, going from:
list[tuple[map[str,int],int]]
to:
treeLabel :: str
occurences :: int
treeData :: map[treeLabel,int]
treeNode :: tuple[treeData,int]
tree :: list[treeNode]
tree x=[];
Thanks,
Jos
How about using Abstract Data Types?
See Rascal Tutor. The above could then look like this:
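// field types are spelled out, because a field name such as treeLabel cannot be used as a type
data MyStruct = ms(str treeLabel,
int occurrence,
map[str, int] treeData,
tuple[map[str, int] td, int n] treeNode,
list[tuple[map[str, int] td, int n]] tree);
Given some variable m holding a MyStruct value, you can access and update its fields with the usual dot notation: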
m.treeLabel;
m.treeLabel = "xyz";
etc.

Parsing values contained inside nested brackets

I'm just fooling about and strangely found it a bit tricky to parse nested brackets in a simple recursive function.
For example, if the program's purpose is to look up user details, it may go from {{name surname} age} to {Bob Builder age} and then to Bob Builder 20.
Here is a mini-program for summing totals in curly brackets that demonstrates the concept.
// Parses string recursively by eliminating brackets
def parse(s: String): String = {
if (!s.contains("{")) s
else {
parse(resolvePair(s))
}
}
// Sums one pair and returns the string, starting at deepest nested pair
// e.g.
// {2+10} lollies and {3+{4+5}} peanuts
// should return:
// {2+10} lollies and {3+9} peanuts
def resolvePair(s: String): String = {
??? // Replace the deepest nested pair with its sumString result
}
// Sums values in a string, returning the result as a string
// e.g. sumString("3+8") returns "11"
def sumString(s: String): String = {
val v = s.split("\\+")
v.foldLeft(0)(_.toInt + _.toInt).toString
}
// Should return "12 lollies and 12 peanuts"
parse("{2+10} lollies and {3+{4+5}} peanuts")
Any ideas for a clean bit of code that could replace the ??? would be great. I'm searching for an elegant solution to this problem mostly out of curiosity.
Parser combinators can handle this kind of situation:
import scala.util.parsing.combinator.RegexParsers
object BraceParser extends RegexParsers {
override def skipWhitespace = false
def number = """\d+""".r ^^ { _.toInt }
def sum: Parser[Int] = "{" ~> (number | sum) ~ "+" ~ (number | sum) <~ "}" ^^ {
case x ~ "+" ~ y => x + y
}
def text = """[^{}]+""".r
def chunk = sum ^^ {_.toString } | text
def chunks = rep1(chunk) ^^ {_.mkString} | ""
def apply(input: String): String = parseAll(chunks, input) match {
case Success(result, _) => result
case failure: NoSuccess => scala.sys.error(failure.msg)
}
}
Then:
BraceParser("{2+10} lollies and {3+{4+5}} peanuts")
//> res0: String = 12 lollies and 12 peanuts
There is some investment before getting comfortable with parser combinators but I think it is really worth it.
To help you decipher the syntax above:
Regular expressions and strings have implicit conversions that create primitive parsers with String results; they have type Parser[String].
The ^^ operator applies a function to the parsed elements.
It can convert a Parser[String] into a Parser[Int] by doing ^^ {_.toInt}.
Parser is a monad, and Parser[T].^^(f) is equivalent to Parser[T].map(f).
The ~, ~> and <~ operators require some inputs to be in a certain sequence.
The ~> and <~ operators drop one side of the input from the result.
The case a ~ b pattern allows you to pattern match on the results.
Parser is a monad, and (p ~ q) ^^ { case a ~ b => f(a, b) } is equivalent to for (a <- p; b <- q) yield (f(a, b)).
(p <~ q) ^^ f is equivalent to for (a <- p; _ <- q) yield f(a).
rep1 is a repetition of one or more elements.
| tries to match the input with the parser on its left and, if that fails, tries the parser on its right.
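The equivalences in the list above can be made concrete with a toy parser type. This is only a sketch under assumptions: ToyParsers, Parser, lit and number below are hypothetical stand-ins written from scratch, not the types from scala.util.parsing.combinator, and the toy ~ returns a plain tuple instead of the library's ~ case class.

```scala
object ToyParsers {
  // A toy parser: consumes a prefix of the input and returns (result, rest),
  // or None on failure. Hypothetical type, not the library's Parser.
  final case class Parser[A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser(in => run(in).map { case (a, rest) => (f(a), rest) })

    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })

    // ^^ is just another name for map.
    def ^^[B](f: A => B): Parser[B] = map(f)

    // ~ runs both parsers in sequence and keeps both results (as a tuple here).
    def ~[B](q: Parser[B]): Parser[(A, B)] =
      for (a <- this; b <- q) yield (a, b)

    // | tries this parser first and falls back to q on failure.
    def |(q: Parser[A]): Parser[A] =
      Parser(in => run(in).orElse(q.run(in)))
  }

  // Matches a fixed string at the start of the input.
  def lit(s: String): Parser[String] =
    Parser(in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None)

  // Matches one or more decimal digits and converts them to an Int.
  val number: Parser[Int] = Parser { in =>
    val digits = in.takeWhile(_.isDigit)
    if (digits.isEmpty) None else Some((digits.toInt, in.drop(digits.length)))
  }

  // The same parser written twice: once with operators, once as a for comprehension.
  val sumViaOperators: Parser[Int] =
    (number ~ lit("+") ~ number) ^^ { case ((x, _), y) => x + y }

  val sumViaFor: Parser[Int] =
    for (x <- number; _ <- lit("+"); y <- number) yield x + y
}
```

Both sumViaOperators and sumViaFor accept an input such as "2+10" and produce the same result, which is the point of the "Parser is a monad" remarks.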
How about
def resolvePair(s: String): String = {
val open = s.lastIndexOf('{')
val close = s.indexOf('}', open)
if((open >= 0) && (close > open)) {
val (a,b) = s.splitAt(open+1)
val (c,d) = b.splitAt(close-open-1)
resolvePair(a.dropRight(1)+sumString(c).toString+d.drop(1))
} else
s
}
I know it's ugly but I think it works fine.
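For comparison, the same idea (always resolve a pair that contains no other braces) can be sketched with a regular expression. This is one possible variant, not the code from either answer: the BraceSum object and the Innermost pattern are names made up for this sketch, and unlike the resolvePair described in the question it resolves every innermost pair in one pass rather than a single pair. It assumes each pair contains only "+"-separated integers.

```scala
import scala.util.matching.Regex

object BraceSum {
  // Matches one innermost pair: braces with no other brace between them.
  private val Innermost: Regex = """\{([^{}]*)\}""".r

  // Sums the "+"-separated integers in a string, e.g. "3+8" gives "11".
  // Throws NumberFormatException if a part is not an integer.
  def sumString(s: String): String =
    s.split("\\+").map(_.trim.toInt).sum.toString

  // Replaces every innermost pair with its sum (one nesting level per call).
  def resolvePair(s: String): String =
    Innermost.replaceAllIn(s, m => Regex.quoteReplacement(sumString(m.group(1))))

  // Repeats until a pass changes nothing, i.e. no complete pair is left.
  @annotation.tailrec
  def parse(s: String): String = {
    val next = resolvePair(s)
    if (next == s) s else parse(next)
  }
}
```

With the input from the question, BraceSum.parse("{2+10} lollies and {3+{4+5}} peanuts") first yields "12 lollies and {3+9} peanuts" and then "12 lollies and 12 peanuts". Unbalanced braces are left in place rather than reported as errors.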

Scala read-eval-print using RegexParser without so much boilerplate?

I'm implementing part of a Scala program that takes input strings of the form "functionName arg1=x1 arg2=x2 ...", parses the xi to the correct types, and then calls a corresponding Scala function functionName(x1,x2,...). The code below is an example implementation with two functions foo and bar, which take different kinds of arguments.
Notice that the types and argument names of foo and bar have to be handwritten into the code in several places: the original function definitions, defining the case classes that the parser returns, and the parsers themselves. The case classes returned by the parser also do basically nothing interesting -- I'm tempted to just call foo and bar from within the parser, but I feel like that would be icky.
My question is: can this implementation be simplified? In practice, I will have many functions with complicated argument types, and I'd prefer to be able to specify those types as few times as possible, and perhaps also not have to define corresponding case classes.
import scala.util.parsing.combinator.RegexParsers
type Word = String
// the original function definitions
def foo(x: Int, w: Word) = println("foo called with " + x + " and " + w)
def bar(y: Int, z: Int) = println("bar called with " + y + " and " + z)
// the return type for the parser
abstract class Functions
case class Foo(x: Int, w: Word) extends Functions
case class Bar(y: Int, z: Int) extends Functions
object FunctionParse extends RegexParsers {
val int = """-?\d+""".r ^^ (_.toInt)
val word = """[a-zA-Z]\w*""".r
val foo = "foo" ~> ("x=" ~> int) ~ ("w=" ~> word) ^^ { case x~w => Foo(x,w) }
val bar = "bar" ~> ("y=" ~> int) ~ ("z=" ~> int) ^^ { case y~z => Bar(y,z) }
val function = foo | bar
def parseString(s: String) = parse(function, s)
}
def main(args: Array[String]) = {
FunctionParse.parseString(args.mkString(" ")) match {
case FunctionParse.Success(result, _) => result match {
case Foo(x, w) => foo(x, w)
case Bar(y, z) => bar(y, z)
}
case _ => println("sux.")
}
}
Edit: I should note that in my case, the specific format above for the input string is not very important -- I'm happy to change it (use xml or whatever) if it results in cleaner, simpler Scala code.
You want reflection, to put it simply. Reflection means finding out, instantiating and calling classes and methods at runtime instead of compile time. For example:
scala> val clazz = Class forName "Foo"
clazz: Class[_] = class Foo
scala> val constructors = clazz.getConstructors
constructors: Array[java.lang.reflect.Constructor[_]] = Array(public Foo(int,java.lang.String))
scala> val constructor = constructors(0)
constructor: java.lang.reflect.Constructor[_] = public Foo(int,java.lang.String)
scala> constructor.getParameter
getParameterAnnotations getParameterTypes
scala> val parameterTypes = constructor.getParameterTypes
parameterTypes: Array[Class[_]] = Array(int, class java.lang.String)
scala> constructor.newInstance(5: Integer, "abc")
res6: Any = Foo(5,abc)
This is all Java reflection. Scala 2.9 still doesn't have a Scala-specific reflection interface, though one is already in development and might well be available in the next version of Scala.
What you're doing looks very reasonable. The only way to 'simplify' it in my mind would be to have less explicit types and/or use reflection to look up the appropriate function...
Update: Daniel's answer is a good example of how to use reflection. In terms of less explicit types, you would have to have the function arguments to be Any...
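A middle ground between the case classes and full reflection is a lookup table keyed by function name, so that each function's argument names and types are written once more instead of three more times. The following is only a sketch under assumptions: Dispatch, Args, registry and run are hypothetical names, foo and bar return their message instead of printing it so the result can be inspected, and the input is split by hand rather than with RegexParsers.

```scala
object Dispatch {
  type Word = String
  type Args = Map[String, String] // argument name -> raw text

  // Stand-ins for the original functions; they return the message instead of printing it.
  def foo(x: Int, w: Word): String = s"foo called with $x and $w"
  def bar(y: Int, z: Int): String = s"bar called with $y and $z"

  // Hypothetical registry: each entry converts its own arguments and calls the function.
  val registry: Map[String, Args => String] = Map(
    "foo" -> ((a: Args) => foo(a("x").toInt, a("w"))),
    "bar" -> ((a: Args) => bar(a("y").toInt, a("z").toInt))
  )

  // Splits "name k1=v1 k2=v2" and dispatches; Left carries an error message.
  def run(input: String): Either[String, String] =
    input.trim.split("\\s+").toList match {
      case name :: rest if registry.contains(name) =>
        val args = rest.map(_.split("=", 2)).collect { case Array(k, v) => k -> v }.toMap
        try Right(registry(name)(args))
        catch { case e: Exception => Left(s"bad arguments for $name: ${e.getMessage}") }
      case other =>
        Left(s"unknown function: ${other.headOption.getOrElse("")}")
    }
}
```

For example, Dispatch.run("foo x=3 w=hi") gives Right("foo called with 3 and hi"). The trade-off is that a missing or mistyped argument is only detected at runtime, which is the same price reflection pays.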

Scala: Using StandardTokenParser for parsing hexadecimal numbers

I am using Scala parser combinators by extending scala.util.parsing.combinator.syntactical.StandardTokenParsers. This class provides the following methods:
def ident : Parser[String] for parsing identifiers and
def numericLit : Parser[String] for parsing a number (decimal I suppose)
I am using scala.util.parsing.combinator.lexical.Scanners from scala.util.parsing.combinator.lexical.StdLexical for lexing.
My requirement is to parse a hexadecimal number (without the 0x prefix) which can be of any length. Basically a grammar like: ([0-9]|[a-f])+
I tried integrating Regex parser but there are type issues there. Other ways to extend the definition of lexer delimiter and grammar rules lead to token not found!
As I suspected, the problem can be solved by extending the behavior of the lexer rather than the parser. The standard lexer accepts only decimal digits, so I created a new lexer:
class MyLexer extends StdLexical {
override type Elem = Char
override def digit = ( super.digit | hexDigit )
lazy val hexDigits = Set[Char]() ++ "0123456789abcdefABCDEF".toArray
lazy val hexDigit = elem("hex digit", hexDigits.contains(_))
}
And my parser (which has to be a StandardTokenParsers) can be extended as follows:
object ParseAST extends StandardTokenParsers{
override val lexical:MyLexer = new MyLexer()
lexical.delimiters += ( "(" , ")" , "," , "#")
...
}
The construction of the "number" from digits is taken care of by the StdLexical class:
class StdLexical {
...
def token: Parser[Token] =
...
| digit~rep(digit)^^{case first ~ rest => NumericLit(first :: rest mkString "")}
}
StdLexical returns the parsed number as a plain String, which is not a problem for me, as I am not interested in its numeric value anyway.
You can use the RegexParsers with an action associated to the token in question.
import scala.util.parsing.combinator._
object HexParser extends RegexParsers {
val hexNum: Parser[Int] = """[0-9a-f]+""".r ^^
{ case s:String => Integer.parseInt(s,16) }
def seq: Parser[Any] = repsep(hexNum, ",")
}
This defines a parser that reads comma-separated hex numbers with no leading 0x. Each number is actually returned as an Int.
val result = HexParser.parse(HexParser.seq, "1, 2, f, 10, 1a2b34d")
scala> println(result)
[1.21] parsed: List(1, 2, 15, 16, 27439949)
Note that there is no way to distinguish decimal notation numbers. Also, I'm using Integer.parseInt, which is limited to the size of an Int. To handle numbers of any length you may have to write your own parser and use BigInteger or arrays.
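For the "any length" requirement, Scala's BigInt (a wrapper around java.math.BigInteger) can parse a string in radix 16 directly. The sketch below is an assumption-laden illustration, not part of the answer above: HexNumbers, parseHex and parseHexList are hypothetical names, and the input is split on commas by hand instead of going through RegexParsers or StdLexical.

```scala
object HexNumbers {
  // One or more hex digits, no 0x prefix; upper case is accepted here as well.
  private val Hex = """[0-9a-fA-F]+""".r

  // Parses one hexadecimal literal of any length; None if it is not valid hex.
  def parseHex(s: String): Option[BigInt] =
    if (Hex.pattern.matcher(s).matches()) Some(BigInt(s, 16)) else None

  // Parses a comma-separated list; None if any element is not valid hex.
  def parseHexList(input: String): Option[List[BigInt]] = {
    val parsed = input.split(",").map(_.trim).toList.map(parseHex)
    if (parsed.forall(_.isDefined)) Some(parsed.flatten) else None
  }
}
```

HexNumbers.parseHexList("1, 2, f, 10, 1a2b34d") yields the same values as the Int-based parser, while a 20-digit input such as "ffffffffffffffffffff" no longer overflows.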
