The docs for Node only mention the following methods:
Equal, GreaterThan, GreaterThanOrEqual, LessThan, LessThanOrEqual, NotEqual, Slice, Subscription
They do mention how to access a child by index using Subscription, but how can I find out how many children a node has so that I can iterate over them?
Here is my use case:
Exp parsed = parse(#Exp, "2+(4+3)*48");
println("the number of root children is: " + size(parsed));
But it yields an error, as size() seems to work only on a list.
There are several ways to do this, each with its own trade-offs. Here are a few:
import ParseTree;
int getChildrenCount1(Tree parsed) {
return (0 | it + 1 | _ <- parsed.args);
}
getChildrenCount1 iterates over the raw children of a parse tree node. This includes whitespace and comment nodes (layout) and keywords (literals). You might want to filter those out, or compensate by division.
On the other hand, this seems a bit indirect. We could also just directly ask for the length of the children list:
import List;
import ParseTree;
int getChildrenCount2(Tree parsed) {
return size(parsed.args) / 2 + 1; // here we divide by two assuming every other node is a layout node
}
There is also the meta-data route. Every parse tree node carries a declarative description of its production, which can be queried and explored:
import ParseTree;
import List;
// immediately match on the meta-structure of a parse node:
int getChildrenCount3(appl(Production prod, list[Tree] args)) {
return size(prod.symbols);
}
This length of symbols should be the same as the length of args.
// To filter for "meaningful" children in a declarative way:
int getChildrenCount4(appl(prod(_, list[Symbol] symbols, _), list[Tree] args)) {
return (0 | it + 1 | sort(_) <- symbols);
}
The sort pattern filters for context-free non-terminals as declared with syntax rules. Lexical children would match lex, and layout and literals would match layouts and lit.
Without all that pattern matching:
int getChildrenCount5(Tree tree) {
return (0 | it + 1 | s <- tree.prod.symbols, isInteresting(s));
}
bool isInteresting(Symbol s) = s is sort || s is lex;
So far this seems to work, but it is awful:
int getChildrenCount(Tree parsed) {
int infinity = 1000;
for (int i <- [0..infinity]) {
try parsed[i];
catch: return i;
}
return infinity;
}
void main() {
Exp parsed = parse(#Exp, "132+(4+3)*48");
println("the number of root children is: ");
println(getChildrenCount(parsed));
}
I've got more of my expression parser working (Dart PetitParser to get at AST datastructure created with ExpressionBuilder). It appears to be generating accurate ASTs for floats, parens, power, multiply, divide, add, subtract, unary negative in front of both numbers and expressions. (The nodes are either literal strings, or an object that has a precedence with a List payload that gets walked and concatenated.)
I'm stuck now on visiting the nodes. I have clean access to the top node (thanks to Lukas), but I'm stuck on deciding whether or not to add a paren. For example, in 20+30*40, we don't need parens around 30*40, and the parse tree correctly has the node for this closer to the root so I'll hit it first during traversal. However, I don't seem to have enough data when looking at the 30*40 node to determine if it needs parens before going on to the 20+.. A very similar case would be (20+30)*40, which gets parsed correctly with 20+30 closer to the root, so once again, when visiting the 20+30 node I need to add parens before going on to *40.
This has to be a solved problem, but I never went to compiler school, so I know just enough about ASTs to be dangerous. What "a ha" am I missing?
// rip-common.dart:
import 'package:petitparser/petitparser.dart';
// import 'package:petitparser/debug.dart';
class Node {
int precedence;
List<dynamic> args;
Node([this.precedence = 0, this.args = const []]) {
// nodeList.add(this);
}
@override
String toString() => 'Node($precedence $args)';
String visit([int fromPrecedence = -1]) {
print('=== visiting $this ===');
var buf = StringBuffer();
var parens = (precedence > 0) &&
(fromPrecedence > 0) &&
(precedence < fromPrecedence);
print('<$fromPrecedence $precedence $parens>');
// for debugging:
var curlyOpen = '';
var curlyClose = '';
buf.write(parens ? '(' : curlyOpen);
for (var arg in args) {
if (arg is Node) {
buf.write(arg.visit(precedence));
} else if (arg is String) {
buf.write(arg);
} else {
print('not Node or String: $arg');
buf.write('$arg');
}
}
buf.write(parens ? ')' : curlyClose);
print('$buf for buf');
return '$buf';
}
}
class RIPParser {
Parser _make_parser() {
final builder = ExpressionBuilder();
var number = char('-').optional() &
digit().plus() &
(char('.') & digit().plus()).optional();
// precedence 5
builder.group()
..primitive(number.flatten().map((a) => Node(0, [a])))
..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a]));
// negation is a prefix operator
// precedence 4
builder.group()..prefix(char('-').trim(), (op, a) => Node(4, [op, a]));
// power is right-associative
// precedence 3
builder.group()..right(char('^').trim(), (a, op, b) => Node(3, [a, op, b]));
// multiplication and addition are left-associative
// precedence 2
builder.group()
..left(char('*').trim(), (a, op, b) => Node(2, [a, op, b]))
..left(char('/').trim(), (a, op, b) => Node(2, [a, op, b]));
// precedence 1
builder.group()
..left(char('+').trim(), (a, op, b) => Node(1, [a, op, b]))
..left(char('-').trim(), (a, op, b) => Node(1, [a, op, b]));
final parser = builder.build().end();
return parser;
}
Result _result(String input) {
var parser = _make_parser(); // eventually cache
var result = parser.parse(input);
return result;
}
String parse(String input) {
var result = _result(input);
if (result.isFailure) {
return result.message;
} else {
print('result.value = ${result.value}');
return '$result';
}
}
String visit(String input) {
var result = _result(input);
var top_node = result.value; // result.isFailure ...
return top_node.visit();
}
}
// rip_cmd_example.dart
import 'dart:io';
import 'package:rip_common/rip_common.dart';
void main() {
print('start');
String input;
while (true) {
input = stdin.readLineSync();
if (input.isEmpty) {
break;
}
print(RIPParser().parse(input));
print(RIPParser().visit(input));
}
;
print('done');
}
As you've observed, the ExpressionBuilder already assembles the tree in the right precedence order based on the operator groups you've specified.
The same holds for the parentheses node created here: ..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a])). If I emit parentheses only for this node, I get back the input string for your example expressions: var parens = precedence == 0 && args.length == 1 && args[0] is Node;.
Unless I am missing something, there should be no reason for you to track the precedence manually. I would also recommend that you create different node classes for the different operators: ValueNode, ParensNode, NegNode, PowNode, MulNode, ... A bit verbose, but much easier to understand what is going on, if each of them can just visit (print, evaluate, optimize, ...) itself.
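To make the "one node type per operator" suggestion concrete, here is a minimal sketch, written in Rust rather than Dart. The Expr enum, its variants, and the render and value helpers are hypothetical names chosen for illustration; they are not part of PetitParser. The point is only that when the parser records an explicit parentheses node, printing needs no precedence bookkeeping at all:

```rust
// Hypothetical node types, loosely mirroring ValueNode, ParensNode, NegNode, ...
enum Expr {
    Value(String),
    Parens(Box<Expr>),
    Neg(Box<Expr>),
    Binary(Box<Expr>, char, Box<Expr>),
}

impl Expr {
    // Each variant prints itself. Parentheses are emitted only where the
    // parser recorded a Parens node, so no precedence is passed around.
    fn render(&self) -> String {
        match self {
            Expr::Value(text) => text.clone(),
            Expr::Parens(inner) => format!("({})", inner.render()),
            Expr::Neg(inner) => format!("-{}", inner.render()),
            Expr::Binary(left, op, right) => {
                format!("{}{}{}", left.render(), op, right.render())
            }
        }
    }
}

// Small convenience constructor for leaf values (illustrative only).
fn value(text: &str) -> Box<Expr> {
    Box::new(Expr::Value(text.to_string()))
}

fn main() {
    // (20+30)*40: the sum is wrapped in an explicit Parens node.
    let sum = Expr::Binary(value("20"), '+', value("30"));
    let wrapped = Box::new(Expr::Parens(Box::new(sum)));
    let tree = Expr::Binary(wrapped, '*', value("40"));
    assert_eq!(tree.render(), "(20+30)*40");

    // 20+30*40: no Parens node was recorded, so none is printed.
    let product = Expr::Binary(value("30"), '*', value("40"));
    let tree = Expr::Binary(value("20"), '+', Box::new(product));
    assert_eq!(tree.render(), "20+30*40");

    // Unary negation prints as a prefix.
    assert_eq!(Expr::Neg(value("5")).render(), "-5");
}
```

The same shape carries over to evaluate or optimize methods: each variant handles its own case in one match arm.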
I am making a singly-linked list. When you delete a node, the previous node's next should become the current node's next (prev->next = curr->next;) and return data if the index matches. Otherwise, the previous node becomes the current node and the current node becomes the next node (prev = curr; curr = curr->next;):
struct Node<T> {
data: T,
next: Option<Box<Node<T>>>,
}
struct LinkedList<T> {
head: Option<Box<Node<T>>>,
}
impl LinkedList<i64> {
fn remove(&mut self, index: usize) -> i64 {
if self.len() == 0 {
panic!("LinkedList is empty!");
}
if index >= self.len() {
panic!("Index out of range: {}", index);
}
let mut count = 0;
let mut head = &self.head;
let mut prev: Option<Box<Node<i64>>> = None;
loop {
match head {
None => {
panic!("LinkedList is empty!");
}
Some(c) => {
// I have borrowed here
if count == index {
match prev {
Some(ref p) => {
p.next = c.next;
// ^ cannot move out of borrowed content
}
_ => continue,
}
return c.data;
} else {
count += 1;
head = &c.next;
prev = Some(*c);
// ^^ cannot move out of borrowed content
}
}
}
}
}
fn len(&self) -> usize {
unimplemented!()
}
}
fn main() {}
error[E0594]: cannot assign to field `p.next` of immutable binding
--> src/main.rs:31:33
|
30 | Some(ref p) => {
| ----- consider changing this to `ref mut p`
31 | p.next = c.next;
| ^^^^^^^^^^^^^^^ cannot mutably borrow field of immutable binding
error[E0507]: cannot move out of borrowed content
--> src/main.rs:31:42
|
31 | p.next = c.next;
| ^ cannot move out of borrowed content
error[E0507]: cannot move out of borrowed content
--> src/main.rs:40:37
|
40 | prev = Some(*c);
| ^^ cannot move out of borrowed content
Playground Link for more info.
How can I do this? Is my approach wrong?
Before you start, go read Learning Rust With Entirely Too Many Linked Lists. People think that linked lists are easy because they've been taught them in languages that either don't care if you introduce memory unsafety or completely take away that agency from the programmer.
Rust does neither, which means that you have to think about things you might never have thought of before.
There are a number of issues with your code. The one that you ask about, "cannot move out of borrowed content" is already well-covered by numerous other questions, so there's no reason to restate all those good answers:
Cannot move out of borrowed content
Cannot move out of borrowed content when trying to transfer ownership
error[E0507]: Cannot move out of borrowed content
TL;DR: You are attempting to move ownership of next from out of a reference; you cannot.
p.next = c.next;
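As a small illustration of this point, here is a sketch (not taken from the question) of how an owned value can be taken out from behind a mutable reference. The detach_tail function is a hypothetical helper; Option::take is the standard library method that leaves None in place and returns the previous value:

```rust
struct Node {
    data: i64,
    next: Option<Box<Node>>,
}

// Hypothetical helper: detach and return everything after `node`.
// Writing `node.next` by value here would be a move out of a reference,
// which is rejected. `take()` swaps in `None` and hands back the owned value.
fn detach_tail(node: &mut Node) -> Option<Box<Node>> {
    node.next.take()
}

fn main() {
    let mut first = Node {
        data: 1,
        next: Some(Box::new(Node { data: 2, next: None })),
    };

    let tail = detach_tail(&mut first);

    // The original node is still valid; it simply has no successor now.
    assert!(first.next.is_none());
    assert_eq!(first.data, 1);
    assert_eq!(tail.map(|n| n.data), Some(2));
}
```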
You are attempting to modify an immutable reference:
let mut head = &self.head;
You allow for people to remove one past the end, which doesn't make sense to me:
if index >= self.len()
You iterate the entire list not once, but twice, before iterating it again to perform the removal:
if self.len() == 0
if index >= self.len()
All of that pales in comparison to the fact that your algorithm is flawed in the eyes of Rust because you attempt to introduce mutable aliasing. If your code were able to compile, you'd have a mutable reference to previous as well as a mutable reference to current. However, you can get a mutable reference to current from previous. This would allow you to break Rust's memory safety guarantees!
Instead, you can only keep track of current and, when the right index is found, break it apart and move the pieces:
fn remove(&mut self, index: usize) -> T {
self.remove_x(index)
.unwrap_or_else(|| panic!("index {} out of range", index))
}
fn remove_x(&mut self, mut index: usize) -> Option<T> {
let mut head = &mut self.head;
while index > 0 {
head = match { head }.as_mut() {
Some(n) => &mut n.next,
None => return None,
};
index -= 1;
}
match head.take().map(|x| *x) {
Some(Node { data, next }) => {
*head = next;
Some(data)
}
None => None,
}
}
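For experimenting with the single-cursor approach, here is a self-contained sketch under a few assumptions: new and push_front are hypothetical helpers added only so the example runs, and remove_at is a variant of the same idea written with the ? operator instead of an explicit match:

```rust
struct Node<T> {
    data: T,
    next: Option<Box<Node<T>>>,
}

struct LinkedList<T> {
    head: Option<Box<Node<T>>>,
}

impl<T> LinkedList<T> {
    fn new() -> Self {
        LinkedList { head: None }
    }

    // Hypothetical helper (not from the question): prepend a value.
    fn push_front(&mut self, data: T) {
        let next = self.head.take();
        self.head = Some(Box::new(Node { data, next }));
    }

    // Walk one mutable cursor to the link that owns the node to remove,
    // then take the node out of that link and splice its tail back in.
    fn remove_at(&mut self, mut index: usize) -> Option<T> {
        let mut link = &mut self.head;
        while index > 0 {
            // Returns None early if the list is shorter than `index`.
            link = &mut link.as_mut()?.next;
            index -= 1;
        }
        let node = *link.take()?; // move the node out of its Box
        *link = node.next;
        Some(node.data)
    }
}

fn main() {
    let mut list = LinkedList::new();
    for value in [30, 20, 10] {
        list.push_front(value);
    }
    // The list is now 10 -> 20 -> 30.
    assert_eq!(list.remove_at(1), Some(20));
    assert_eq!(list.remove_at(5), None);
    assert_eq!(list.remove_at(0), Some(10));
    assert_eq!(list.remove_at(0), Some(30));
    assert_eq!(list.remove_at(0), None);
}
```

Only one mutable reference into the list exists at any time, so no previous pointer is needed: the cursor points at the link itself rather than at the node.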
See also:
Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time
How do I get an owned value out of a `Box`?
Playground Link for more info.
There are numerous problems with the rest of your code, such as the fact that the result of your insert method is unlike any I've ever seen before.
How I'd write it.
I am learning F# at the moment but I'm having a hard time understanding this:
let allPrimes =
let rec allPrimes' n =
seq {
if isPrime n then
yield n
yield! allPrimes' (n + 1) }
allPrimes' 2
I am not able to figure out what the yield! operator exactly does even though I've read other simpler examples and it seems yield! returns an inner sequence.
The yield! ("yield bang") operator merges the subsequence produced by the called sequence expression into the final sequence. In simpler words: it "flattens" the returned sequence so that its elements become elements of the final sequence.
For your example: Without the yield bang operator you would get something like
{ prime1 { prime2 { prime3 .... }}}
with the yield bang operator you get
{ prime1 prime2 prime3 ... }
where each { denotes a new sequence. Side note: the actual result of the first example would contain even more nesting, because a prime is only yielded when n is prime, so for every composite n you would get a sequence that contains nothing but another sequence.
Given some concrete syntax value, how can I map it to a different type of value (in this case an int)?
// Syntax
start syntax MyTree = \node: "(" MyTree left "," MyTree right ")"
| leaf: Leaf leaf
;
layout MyLayout = [\ \t\n\r]*;
lexical Leaf = [0-9]+;
This does not work unfortunately:
public Tree increment() {
MyTree tree = (MyTree)`(3, (1, 10))`;
return visit(tree) {
case l:(Leaf)`3` => l + 1
};
}
Or is the only way to implode into an ADT where I specified the types?
Your question has different possible answers:
using implode you can convert a parse tree to an abstract tree. If the constructors of the target abstract language expect int, then lexical trees which happen to match [0-9]+ will be automatically converted. For example the syntax tree for syntax Exp = intValue: IntValue; could be converted to constructor data Exp = intValue(int i); and it will actually build an int.
in general to convert one type of values to another in Rascal you write (mutually) recursive functions, as in int eval(MyTree t) and int eval(Leaf l).
if you want to actually increment the syntactic representation of a Leaf value, you have to convert back (parse or via a concrete pattern) from the resulting int back to the Leaf.
Example:
import String;
MyTree increment() {
MyTree tree = (MyTree)`(3, (1, 10))`;
return visit(tree) {
case Leaf l => [Leaf] "<toInt("<l>") + 1>";
};
}
First the lexical is converted to a string with "<l>". That string is parsed as an int using toInt() and we add 1 to it. The resulting int is mapped back to a string with "< ... >", after which we can call the Leaf parser on it using [Leaf].
I would like to know how I can handle multiple optionals without concrete pattern matching for each possible permutation.
Below is a simplified example of the problem I am facing:
lexical Int = [0-9]+;
syntax Bool = "True" | "False";
syntax Period = "Day" | "Month" | "Quarter" | "Year";
layout Standard = [\ \t\n\f\r]*;
syntax Optionals = Int? i Bool? b Period? p;
str printOptionals(Optionals opt){
str res = "";
if(!isEmpty("<opt.i>")) { // opt has i is always true (same for opt.i?)
res += printInt(opt.i);
}
if(!isEmpty("<opt.b>")){
res += printBool(opt.b);
}
if(!isEmpty("<opt.p>")) {
res += printPeriod(opt.p);
}
return res;
}
str printInt(Int i) = "<i>";
str printBool(Bool b) = "<b>";
str printPeriod(Period p) = "<p>";
However this gives the error message:
The called signature: printInt(opt(lex("Int"))), does not match the declared signature: str printInt(sort("Int"));
How do I get rid of the opt part when I know it is there?
I'm not sure how ideal this is, but you could do this for now:
if (/Int i := opt.i) {
res += printInt(i);
}
This will extract the Int from within opt.i if it is there, but the match will fail if Int was not provided as one of the options.
The current master on GitHub has the following feature to deal with optionals: they can be iterated over.
For example:
if (Int i <- opt.i) {
res += printInt(i);
}
The <- will produce false immediately if the optional value is absent, and otherwise loop once through and bind the value which is present to the pattern.
An untyped solution is to project out the element from the parse tree:
rascal>opt.i.args[0];
Tree: `1`
Tree: appl(prod(lex("Int"),[iter(\char-class([range(48,57)]))],{}),[appl(regular(iter(\char-class([range(48,57)]))),[char(49)])[@loc=|file://-|(0,1,<1,0>,<1,1>)]])[@loc=|file://-|(0,1,<1,0>,<1,1>)]
However, then to transfer this back to an Int you'd have to pattern match, like so:
rascal>if (Int i := opt.i.args[0]) { printInt(i); }
str: "1"
One could write a generic cast function to help out here:
rascal>&T cast(type[&T] t, value v) { if (&T a := v) return a; throw "cast exception"; }
ok
rascal>printInt(cast(#Int, opt.i.args[0]))
str: "1"
Still, I believe Rascal is missing a feature here. Something like this would be a good feature request:
rascal>Int j = opt.i.value;
rascal>opt.i has value
bool: true