Position of a node? - rascal

Rascal is rooted in term rewriting. Does it have built-in support for term/node position as commonly defined in term rewriting so that I can query for the position of a sub-term inside a term or the other way around?

I don't believe explicit positions are commonly defined in the semantics of term rewriting, but nevertheless Rascal defines all kinds of operations on terms such that positions are explicit or can be made explicit. Please also have a look at the manuals at http://www.rascal-mpl.org
The main operation on terms is pattern matching using normal first-order congruence, deep (higher-order) match, negative match, disjunctive match, etc.:
if (and(and(_, _), _) := and(and(true(),false()), false())) // pattern match operator :=
println("yes!");
and(true(), b) = b; // function definition, aka rewrite rule
and(false(), _) = false();
[ a | and(a,b) <- booleanList]; // comprehension with pattern as filter on a generator
innermost visit (t) { // explicit automated traversal with strategies
    case and(a,b) => or(a,b)
}
b.leftHandSide = true(); // assign new child term to the leftHandSide field of the term assigned to the b variable (non-destructively, you get a new b)
b[0] = false(); // same but to the anonymous first child.
Then there are the normal projection operators: indexing into the children with term[0], and child-by-name with term.myChildName if a many-sorted term signature was defined using field labels.
If you want to know at which position a child occurs, I would perhaps write it like this:
int getPos(node t, value child) = [*pre, child, *_] := getChildren(t) ? size(pre) : -1;
but there are other ways of achieving the same.
Rascal does not have pointers to the parents of a term.
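For comparison only, the same idea (scan the child list and return the index of the first match, or -1) can be sketched in F#; this is an illustration of the technique, not Rascal code:
// Hypothetical F# analogue of the getPos function above: index of the first
// child equal to the given value, or -1 if it does not occur.
let getPos (children: 'a list) (child: 'a) =
    match List.tryFindIndex ((=) child) children with
    | Some i -> i
    | None -> -1
// getPos ["a"; "b"; "c"] "b"   // 1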

Related

Creating an 'add' computation expression

I'd like the example computation expression and values below to return 6. For some reason the numbers aren't being added up as I'd expect. What's the step I'm missing to get my result? Thanks!
type AddBuilder() =
    let mutable x = 0
    member _.Yield i = x <- x + i
    member _.Zero() = 0
    member _.Return() = x

let add = AddBuilder()
(* Compiler tells me that each of the numbers in add don't do anything
and suggests putting '|> ignore' in front of each *)
let result = add { 1; 2; 3 }
(* Currently the result is 0 *)
printfn "%i should be 6" result
Note: This is just for creating my own computation expression to expand my learning. Seq.sum would be a better approach. I'm open to the idea that this example completely misses the value of computation expressions and is no good for learning.
There is a lot wrong here.
First, let's start with mere mechanics.
In order for the Yield method to be called, the code inside the curly braces must use the yield keyword:
let result = add { yield 1; yield 2; yield 3 }
But now the compiler will complain that you also need a Combine method. See, the semantics of yield is that each of them produces a finished computation, a resulting value. And therefore, if you want to have more than one, you need some way to "glue" them together. This is what the Combine method does.
Since your computation builder doesn't actually produce any results, but instead mutates its internal variable, the ultimate result of the computation should be the value of that internal variable. So that's what Combine needs to return:
member _.Combine(a, b) = x
But now the compiler complains again: you need a Delay method. Delay is not strictly necessary, but it's required in order to mitigate performance pitfalls. When the computation consists of many "parts" (like in the case of multiple yields), it's often the case that some of them should be discarded. In these situations, it would be inefficient to evaluate all of them and then discard some. So the compiler inserts a call to Delay: it receives a function, which, when called, would evaluate a "part" of the computation, and Delay has the opportunity to put this function in some sort of deferred container, so that later Combine can decide which of those containers to discard and which to evaluate.
In your case, however, since the result of the computation doesn't matter (remember: you're not returning any results, you're just mutating the internal variable), Delay can just execute the function it receives so that it produces its side effect (mutating the variable):
member _.Delay(f) = f ()
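Putting the pieces so far together, the complete mutable-state builder looks like this (assembled from the snippets above):
type AddBuilder() =
    let mutable x = 0
    // yield adds to the internal state instead of producing a value
    member _.Yield i = x <- x + i
    member _.Zero() = 0
    member _.Return() = x
    // ignores both arguments and returns the accumulated state
    member _.Combine(a, b) = x
    // runs the deferred part immediately, for its side effect
    member _.Delay(f) = f ()

let add = AddBuilder()
let result = add { yield 1; yield 2; yield 3 }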
And now the computation finally compiles, and behold: its result is 6. This result comes from whatever Combine is returning. Try modifying it like this:
member _.Combine(a, b) = "foo"
Now suddenly the result of your computation becomes "foo".
And now, let's move on to semantics.
The above modifications will let your program compile and even produce the expected result. However, I think you misunderstood the whole idea of computation expressions in the first place.
The builder isn't supposed to have any internal state. Instead, its methods are supposed to manipulate complex values of some sort, some methods creating new values, some modifying existing ones. For example, the seq builder¹ manipulates sequences. That's the type of values it handles. Different methods create new sequences (Yield) or transform them in some way (e.g. Combine), and the ultimate result is also a sequence.
In your case, it looks like the values that your builder needs to manipulate are numbers. And the ultimate result would also be a number.
So let's look at the methods' semantics.
The Yield method is supposed to create one of those values that you're manipulating. Since your values are numbers, that's what Yield should return:
member _.Yield x = x
The Combine method, as explained above, is supposed to combine two of such values that got created by different parts of the expression. In your case, since you want the ultimate result to be a sum, that's what Combine should do:
member _.Combine(a, b) = a + b
Finally, the Delay method should just execute the provided function. In your case, since your values are numbers, it doesn't make sense to discard any of them:
member _.Delay(f) = f()
And that's it! With these three methods, you can add numbers:
type AddBuilder() =
    member _.Yield x = x
    member _.Combine(a, b) = a + b
    member _.Delay(f) = f ()

let add = AddBuilder()
let result = add { yield 1; yield 2; yield 3 }
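To see where the 6 comes from, it helps to look at how the compiler roughly translates the expression into builder calls; this is an approximation of the translation, not actual compiler output:
// Approximate desugaring of: add { yield 1; yield 2; yield 3 }
let resultDesugared =
    add.Delay(fun () ->
        add.Combine(add.Yield(1),
            add.Delay(fun () ->
                add.Combine(add.Yield(2),
                    add.Delay(fun () -> add.Yield(3))))))
// Combine(1, Combine(2, 3)) = 1 + (2 + 3) = 6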
I think numbers are not a very good example for learning about computation expressions, because numbers lack the inner structure that computation expressions are supposed to handle. Try instead creating a maybe builder to manipulate Option<'a> values.
Added bonus - there are already implementations you can find online and use for reference.
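As a starting point, a minimal maybe builder might look like this (a sketch only; the member set shown here is one common choice, not the only one):
type MaybeBuilder() =
    // let! unwraps an Option and feeds the value to the rest of the computation
    member _.Bind(m, f) = Option.bind f m
    member _.Return(x) = Some x
    member _.ReturnFrom(m: 'a option) = m

let maybe = MaybeBuilder()

// Hypothetical helper: division that fails with None instead of throwing.
let tryDivide x y = if y = 0 then None else Some (x / y)

let example =
    maybe {
        let! a = tryDivide 100 5   // Some 20
        let! b = tryDivide a 2     // Some 10
        return a + b               // Some 30; a None anywhere above short-circuits
    }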
¹ seq is not actually a computation expression. It predates computation expressions and is treated in a special way by the compiler. But it's good enough for examples and comparisons.

Array product given a dynamic number of arguments

I have a function that does an array product:
arrayProduct(l1,l2,l3) = [[a, b, c] |
    a := l1[_]
    b := l2[_]
    c := l3[_]
]
If I have three arrays defined as follows:
animals1 = ["hippo", "giraffe"]
animals2 = ["lion", "zebra"]
animals3 = ["deer", "bear"]
Then the output of arrayProduct(animals1, animals2, animals3) would be:
[["hippo","lion","deer"],["hippo","lion","bear"],["hippo","zebra","deer"],["hippo","zebra","bear"],["giraffe","lion","deer"],["giraffe","lion","bear"],["giraffe","zebra","deer"],["giraffe","zebra","bear"]]
If I can guarantee that the inputs will always be lists, is there a way I could make a function that does the same thing except that it accepts a dynamic number of lists as input instead of just 3?
I'm also exploring whether it would be possible to do this with only one argument containing all the arrays, as opposed to accepting multiple arguments. For example:
[["hippo", "giraffe"], ["lion", "zebra"], ["deer", "bear"], ["ostrich", "flamingo"]]
Any insight into a solution with either approach would be appreciated.
There's no known way to compute an arbitrary N-way cross product in Rego without a builtin.
Why something can't be written in a language can be tricky to explain because it amounts to a proof-sketch. We need to make the argument that there is no policy in Rego that computes an N-way cross product. The formal proofs of expressiveness/complexity have not been worked out, so the best we can do is try to articulate why it might not be possible.
For the N-way cross product, it boils down to the fact that Rego guarantees termination for all policies on all inputs, and to do that it restricts how deeply nested iteration can be. In your example (using some and indentation for clarity) you have 3 nested loops with indexes i, j, k.
arrayProduct(l1,l2,l3) = [[a, b, c] |
    some i
    a := l1[i]
    some j
    b := l2[j]
    some k
    c := l3[k]
]
To implement an N-way cross product arrayProduct([l1, l2, ..., ln]) you would need something equivalent to N nested loops:
# NOT valid Rego
arrayProduct([l1,l2,...,ln]) = [[a, b, ..., n] |
    some i1
    a := l1[i1]
    some i2
    b := l2[i2]
    ...
    n := ln[in]
]
where importantly the degree of nested iteration N depends on the input.
To guarantee termination, Rego restricts the degree of nested iteration in a policy. You can only nest iteration as many times as you have some (or more properly variables) appearing in your policy. This is analogous to SQL restricting the number of JOINs to those that appear in the query and view definitions.
Since the degree of nesting required for an N-way cross product is N, and N can be larger than the number of somes in the policy, there is no way to implement the N-way cross product.
As a point of contrast, the number of keys or values that are iterated over inside any one loop CAN and usually DO depend on the input. It's the number of loops that cannot depend on the input.
It's not possible to compute an n-ary product of lists/arrays (or sets or objects) in Rego without adding a built-in function.
In the scenario described above, providing a dynamic number of arrays as input to the function would be equivalent to passing an array of arrays (like you mentioned at the end):
arrayProduct([arr1, arr2, ..., arrN])
This works, except that when we try to implement arrayProduct we get stuck, because Rego does not permit recursion and iteration only occurs when you inject a variable into a reference. In your original example, l1[_] is a reference to the elements in the first list, and _ is a unique variable referring to the array indices in that list.
OPA/Rego evaluates that expression by finding assignments to each _ that satisfy the query. The "problem" is that this requires one variable for each list in the input. If the length of the array of arrays is unknown, we would need an infinite number of variables.
If you really need an n-ary product function I would suggest you implement a custom built-in function for now.
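For contrast, in a general-purpose language with recursion the N-way product over a list of lists is a short recursive function; it is exactly this kind of definition that Rego rules out. A sketch in F# (illustration only, not Rego):
// Recursive N-way cross product over a list of lists.
// The product of zero lists is a single empty combination.
let rec crossProduct (lists: 'a list list) : 'a list list =
    match lists with
    | [] -> [ [] ]
    | first :: rest ->
        [ for x in first do
            for tail in crossProduct rest do
                yield x :: tail ]

// crossProduct [ ["hippo"; "giraffe"]; ["lion"; "zebra"]; ["deer"; "bear"] ]
// returns the 8 combinations listed in the question.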

Maps, filters, folds and more? Do we really need these in Erlang?

Maps, filters, folds and more : http://learnyousomeerlang.com/higher-order-functions#maps-filters-folds
The more I read, the more I get confused.
Can anybody help simplify these concepts?
I am not able to understand the significance of these concepts. In what use cases will these be needed?
I think it is mainly because of the syntax that I find it difficult to follow the flow.
The concepts of mapping, filtering and folding prevalent in functional programming actually are simplifications - or stereotypes - of different operations you perform on collections of data. In imperative languages you usually do these operations with loops.
Let's take map for an example. These three loops all take a sequence of elements and return a sequence of squares of the elements:
// C - a lot of bookkeeping
int data[] = {1,2,3,4,5};
int squares_1_to_5[sizeof(data) / sizeof(data[0])];
for (int i = 0; i < sizeof(data) / sizeof(data[0]); ++i)
    squares_1_to_5[i] = data[i] * data[i];
// C++11 - less bookkeeping, still not obvious
std::vector<int> data{1,2,3,4,5};
std::vector<int> squares_1_to_5;
for (auto i = begin(data); i < end(data); i++)
    squares_1_to_5.push_back((*i) * (*i));
// Python - quite readable, though still not obvious
data = [1,2,3,4,5]
squares_1_to_5 = []
for x in data:
    squares_1_to_5.append(x * x)
The property of a map is that it takes a collection of elements and returns the same number of somehow modified elements. No more, no less. Is it obvious at first sight in the above snippets? No, at least not until we read loop bodies. What if there were some ifs inside the loops? Let's take the last example and modify it a bit:
data = [1,2,3,4,5]
squares_1_to_5 = []
for x in data:
    if x % 2 == 0:
        squares_1_to_5.append(x * x)
This is no longer a map, though it's not obvious before reading the body of the loop. It's not clearly visible that the resulting collection might have fewer elements (maybe none?) than the input collection.
We filtered the input collection, performing the action only on some elements from the input. This loop is actually a map combined with a filter.
Tackling this in C would be even more noisy due to allocation details (how much space to allocate for the output array?) - the core idea of the operation on data would be drowned in all the bookkeeping.
A fold is the most generic one, where the result doesn't have to contain any of the input elements, but somehow depends on (possibly only some of) them.
Let's rewrite the first Python loop in Erlang:
lists:map(fun (E) -> E * E end, [1,2,3,4,5]).
It's explicit. We see a map, so we know that this call will return a list as long as the input.
And the second one:
lists:map(fun (E) -> E * E end,
          lists:filter(fun (E) when E rem 2 == 0 -> true;
                           (_) -> false end,
                       [1,2,3,4,5])).
Again, filter will return a list at most as long as the input, map will modify each element in some way.
The latter of the Erlang examples also shows another useful property - the ability to compose maps, filters and folds to express more complicated data transformations. It's not possible with imperative loops.
They are used in almost every application, because they abstract different kinds of iteration over lists.
map is used to transform one list into another. Let's say you have a list of key-value tuples and you want just the keys. You could write:
keys([]) -> [];
keys([{Key, _Value} | T]) ->
    [Key | keys(T)].
Then you want to have values:
values([]) -> [];
values([{_Key, Value} | T]) ->
    [Value | values(T)].
Or a list of only the third element of each tuple:
third([]) -> [];
third([{_First, _Second, Third} | T]) ->
    [Third | third(T)].
Can you see the pattern? The only difference is what you take from the element, so instead of repeating the code, you can simply write what you do for one element and use map.
Third = fun({_First, _Second, Third}) -> Third end,
lists:map(Third, List).
This is much shorter, and the shorter your code is, the fewer bugs it has. Simple as that.
You don't have to think about corner cases (what if the list is empty?), and for an experienced developer it is much easier to read.
filter searches lists. You give it a function that takes an element; if it returns true, the element will be in the returned list, and if it returns false, it will not. For example, filtering the logged-in users from a list.
foldl and foldr are used, when you have to do additional bookkeeping while iterating over the list - for example summing all the elements or counting something.
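For example, summing and counting a list in a single pass is a fold: the accumulator carries the bookkeeping while the fold visits each element. A minimal sketch, written here in F#, whose List.fold plays the same role as Erlang's lists:foldl:
// The accumulator is a (sum, count) pair threaded through the whole list.
let sumAndCount xs =
    List.fold (fun (sum, count) x -> (sum + x, count + 1)) (0, 0) xs

sumAndCount [1; 2; 3; 4; 5]   // (15, 5)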
The best explanations I've found of those functions are in books about Lisp: "Structure and Interpretation of Computer Programs" and "On Lisp", Chapter 4.

Extending Query Expressions

Are there any documents or examples out there on how one can extend/add new keywords to query expressions? Is this even possible?
For example, I'd like to add a lead/lag operator.
In addition to the query builder for the Rx Framework mentioned by @pad, there is also a talk by Wonseok Chae from the F# team about Computation Expressions that includes query expressions. I'm not sure if the meeting was recorded, but there are very detailed slides with a cool example on query syntax for generating .NET IL code.
The source code of the standard F# query builder is probably the best resource for finding out what types of operations are supported and how to annotate them with attributes.
The key attributes that you'll probably need are demonstrated by the where clause:
[<CustomOperation("where",MaintainsVariableSpace=true,AllowIntoPattern=true)>]
member Where :
    source:QuerySource<'T,'Q> *
    [<ProjectionParameter>] predicate:('T -> bool) -> QuerySource<'T,'Q>
The CustomOperation attribute defines the name of the operation. The (quite important) parameter MaintainsVariableSpace allows you to say that the operation returns the same type of values as it takes as the input. In that case, the variables defined earlier are still available after the operation. For example:
query { for p in db.Products do
        let name = p.ProductName
        where (p.UnitPrice.Value > 100.0M)
        select name }
Here, the variables p and name are still accessible after where because where only filters the input, but it does not transform the values in the list.
Finally, the ProjectionParameter attribute allows you to say that p.UnitPrice.Value > 100.0M should actually be turned into a function that takes the context (available variables) and evaluates this expression. If you do not specify this attribute, then the operation just gets the value of the argument, as in:
query { for p in .. do
        take 10 }
Here, the argument 10 is just a simple expression that cannot use values in p.
Pretty cool feature of the language. I just implemented reverse for QuerySource.
It's a simple example, but it demonstrates the idea.
module QueryExtensions

type ExtendedQueryBuilder() =
    inherit Linq.QueryBuilder()

    /// Defines an operation 'reverse' that reverses the sequence
    [<CustomOperation("reverse", MaintainsVariableSpace = true)>]
    member __.Reverse (source : Linq.QuerySource<'T,System.Collections.IEnumerable>) =
        let reversed = source.Source |> List.ofSeq |> List.rev
        new Linq.QuerySource<'T,System.Collections.IEnumerable>(reversed)

let query = ExtendedQueryBuilder()
And now here it is being used.
let a = [1 .. 100]

let specialReverse =
    query {
        for i in a do
        select i
        reverse
    }
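Following the same pattern, the lead/lag operators asked about in the question could be approached with a custom operation. Below is a hypothetical lag operation that pairs each element with its predecessor; the name and details are invented for illustration, so treat it as a sketch rather than a finished implementation:
type LagQueryBuilder() =
    inherit Linq.QueryBuilder()

    /// Pairs every element with its predecessor; the first element is dropped
    /// because it has none. The element type changes, so the variable space is
    /// not maintained.
    [<CustomOperation("lag")>]
    member __.Lag (source : Linq.QuerySource<'T, System.Collections.IEnumerable>) =
        let paired = source.Source |> List.ofSeq |> List.pairwise
        new Linq.QuerySource<'T * 'T, System.Collections.IEnumerable>(paired)

let lagQuery = LagQueryBuilder()

// Each result is (previous, current): [(1, 2); (2, 3); (3, 4); (4, 5)]
let lagged =
    lagQuery {
        for i in [1 .. 5] do
        select i
        lag
    }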

How to create a recursive data structure value in (functional) F#?

How can a value of type:
type Tree =
    | Node of int * Tree list
have a value that references itself generated in a functional way?
The resulting value should be equal to x in the following Python code, for a suitable definition of Tree:
x = Tree()
x.tlist = [x]
Edit: Obviously more explanation is necessary. I am trying to learn F# and functional programming, so I chose to implement the cover tree which I have programmed before in other languages. The relevant thing here is that the points of each level are a subset of those of the level below. The structure conceptually goes to level -infinity.
In imperative languages a node has a list of children which includes itself. I know that this can be done imperatively in F#. And no, it doesn't create an infinite loop given the cover tree algorithm.
Tomas's answer suggests two possible ways to create recursive data structures in F#. A third possibility is to take advantage of the fact that record fields support direct recursion (when used in the same assembly that the record is defined in). For instance, the following code works without any problem:
type 'a lst = Nil | NonEmpty of 'a nelst
and 'a nelst = { head : 'a; tail : 'a lst }
let rec infList = NonEmpty { head = 1; tail = infList }
Using this list type instead of the built-in one, we can make your code work:
type Tree = Node of int * Tree lst
let rec x = Node(1, NonEmpty { head = x; tail = Nil })
You cannot do this directly if the recursive reference is not delayed (e.g. wrapped in a function or lazy value). I think the motivation is that there is no way to create the value with immediate references "at once", so this would be awkward from the theoretical point of view.
However, F# supports recursive values - you can use those if the recursive reference is delayed (the F# compiler will then generate some code that initializes the data structure and fills in the recursive references). The easiest way is to wrap the reference inside a lazy value (a function would work too):
type Tree =
    | Node of int * Lazy<Tree list>

// Note you need 'let rec' here!
let rec t = Node(0, lazy [t; t])
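Once the value exists you can walk it like any other tree, forcing the lazy children only as far as you need; a small sketch, assuming the Lazy<Tree list> definition above:
// Sum the node values down to a bounded depth; t refers to itself, so without
// the depth bound this walk would never terminate.
let rec sumToDepth depth (Node(v, children)) =
    if depth = 0 then v
    else v + (children.Force() |> List.sumBy (sumToDepth (depth - 1)))

sumToDepth 2 t   // every node carries 0, so the result is 0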
Another option is to write this using mutation. Then you also need to make your data structure mutable. You can for example store ref<Tree> instead of Tree:
type Tree =
    | Node of int * ref<Tree> list

// empty node that is used only for initialization
let empty = Node(0, [])
// create two references that will be mutated after creation
let a, b = ref empty, ref empty
// create a new node
let t = Node(0, [a; b])
// replace empty node with recursive reference
a := t; b := t
As James mentioned, if you're not allowed to do this, you get some nice properties, such as that any program that walks the data structure will terminate (because the data structure is finite and cannot be recursive). So, you'll need to be a bit more careful with recursive values :-)
