Dafny recursion hits every element in sequence, can't verify

The following function, seqToSet, takes a sequence of elements and returns a set containing all (and only) the elements in the given sequence. It does this by calling a recursive helper function, addSeqToSet, with the same sequence and the empty set. The helper function adds each element in the sequence to the set it is given and returns the result. It does this via recursion on the head/tail structure of the sequence. It is done when the sequence is empty, and otherwise calls itself recursively with the tail of the sequence and the union of the set with the singleton set containing the head of the sequence.
Dafny can't verify on its own the postcondition stating that the resulting set contains all of the elements in the original sequence.
What's the right strategy for helping it to see that this is the case?
function method seqToSet(q: seq<int>): set<int>
{
  addSeqToSet(q, {})
}

function method addSeqToSet(q: seq<int>, t: set<int>): (r: set<int>)
  ensures forall i :: i in q ==> i in r
{
  if |q| == 0 then t
  else addSeqToSet(q[1..], t + {q[0]})
}

When Dafny tries to verify postconditions of recursive functions, it reasons by induction: assume the postcondition holds on any recursive calls, and show that it also holds for the current call.
Imagine how Dafny reasons about addSeqToSet.
In the first branch, | q | == 0 implies q is empty, so the postcondition holds trivially, since there are no elements i in q.
In the second branch, Dafny assumes the postcondition of the recursive call:
forall i :: i in q[1..] ==> i in r
and then tries to prove the postcondition for the current call:
forall i :: i in q ==> i in r
Notice that since addSeqToSet directly returns the answer from the recursive call, r is the same in both of the above formulas.
If you think about it for a minute, you can see that the postcondition of the outer call does not follow from the postcondition of the recursive call, because the recursive call says nothing about q[0].
You need to strengthen the postcondition in some way so that you know that q[0] in r as well.
One such strengthening is to add an additional postcondition about t:
ensures forall i :: i in t ==> i in r
Dafny then verifies both postconditions.
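For reference, the strengthened helper then reads:
function method addSeqToSet(q: seq<int>, t: set<int>): (r: set<int>)
  ensures forall i :: i in q ==> i in r
  ensures forall i :: i in t ==> i in r
{
  if |q| == 0 then t
  else addSeqToSet(q[1..], t + {q[0]})
}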

Related

How to count number of non-empty nodes in binary tree in F#

Consider the binary tree algebraic datatype
type btree = Empty | Node of btree * int * btree
and a new datatype defined as follows:
type finding = NotFound | Found of int
Here's my code so far:
let s = Node (Node(Empty, 5, Node(Empty, 2, Empty)), 3, Node (Empty, 6, Empty))
(*
       (3)
      /   \
    (5)   (6)
    / \   | \
   () (2) () ()
      / \
     () ()
*)
(* size: btree -> int *)
let rec size t =
    match t with
    Empty -> false
    | Node (t1, m, t2) -> if (m != Empty) then sum+1 || (size t1) || (size t2)

let num = occurs s
printfn "There are %i nodes in the tree" num
This probably isn't close; I took a function that checks whether an integer exists in a tree and tried changing it to do what I need here.
I am very new to using F# and would appreciate any help. I am trying to count all non-empty nodes in the tree. For example, the tree I'm using should print the value 4.
I did not run the compiler on your code, but I believe it does not even compile.
However, your idea to use a pattern match in a recursive function is good.
As rmunn commented, you want to determine the number of nodes in each case:
An empty tree has no nodes, hence the result is zero.
A non-empty tree has its root node plus the counts of its left and right subtrees.
So something along the lines of the following should work:
let rec size t =
    match t with
    | Empty -> 0
    | Node (t1, _, t2) -> 1 + (size t1) + (size t2)
The most important detail here is that you do not need a global variable sum to store any intermediate values. The whole idea of a recursive function is that those intermediate values are the results of recursive calls.
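On the tree s from the question, this gives the expected count of 4:
let num = size s
printfn "There are %i nodes in the tree" num  // prints: There are 4 nodes in the tree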
As a remark, your tree in the comment should look like this, I believe.
(*
       (3)
      /   \
    (5)   (6)
    / \   | \
   () (2) () ()
      / \
     () ()
*)
Edit: I misread the misaligned () as leaves of an empty tree, where in fact they are leaves of the subtree (2). So it was just an ASCII art issue :-)
Friedrich already posted a simple version of the size function that will work for most trees. However, that solution is not "tail-recursive", so it can cause a stack overflow for large trees.
In functional programming languages like F#, recursion is often the preferred technique for things like counting and other aggregate functions. However, recursive functions generally consume a stack frame for each recursive call, which means that for large structures the call stack can be exhausted before the function completes. To avoid this problem, compilers can optimize functions that are considered "tail-recursive" so that they use only one stack frame regardless of how many times they recurse. Unfortunately, this optimization cannot be applied to just any recursive algorithm. It requires that the recursive call be the last thing the function does, ensuring that the compiler never has to jump back into the function after the call and can therefore overwrite the stack frame instead of adding another one.
In order to change the size function to be tail-recursive, we need some way to avoid having to call it twice in the case of a non-empty node, so that the call can be the last step of the function, instead of the addition between the two calls in Friedrich's solution. This can be accomplished using a couple different techniques, generally either using an accumulator or using Continuation Passing Style. The simpler solution is often to use an accumulator to keep track of the total size instead of having it be the return value, while Continuation Passing Style is a more general solution that can handle more complex recursive algorithms.
In order to make an accumulator pattern work for a tree where we have to sum both the left and right sub-trees, we need some way to make one tail-call at the end of the function, while still making sure that both sub-trees are evaluated. A simple way to do that is to also accumulate the right sub-trees in addition to the total count, so we can make subsequent tail-calls to evaluate those trees while evaluating the left sub-trees first. That solution might look something like this:
let size t =
    let rec size acc ts = function
        | Empty ->
            match ts with
            | [] -> acc
            | head :: tail -> head |> size acc tail
        | Node (t1, _, t2) ->
            t1 |> size (acc + 1) (t2 :: ts)
    t |> size 0 []
This adds the acc parameter and the ts parameter to represent the total count and the remaining unevaluated sub-trees. When we hit a populated node, we evaluate the left sub-tree while adding the right sub-tree to our list of trees to evaluate later. When we hit an empty node, we start evaluating any ts we've accumulated, until we have no further populated nodes or unevaluated sub-trees. This isn't the best possible solution for computing the tree-size, and most real solutions would use Continuation Passing Style to make it tail-recursive, but that should make a good exercise as you get more familiar with the language.
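For reference, a continuation-passing version might look roughly like the following sketch (mine, not from the original answers): every recursive call and every call to the continuation k is a tail call, and the pending additions live inside k instead of on the stack.
let sizeCps t =
    let rec go t k =
        match t with
        | Empty -> k 0
        | Node (t1, _, t2) ->
            // compute the left size, then the right size, then hand the sum to k
            go t1 (fun leftSize ->
                go t2 (fun rightSize ->
                    k (1 + leftSize + rightSize)))
    go t id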

Surprising Dafny failure to verify boundedness of set comprehension

Dafny has no problem with this definition of a set intersection function.
function method intersection(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A && x in B
}
But when it comes to union, Dafny complains: "a set comprehension must produce a finite set, but Dafny's heuristics can't figure out how to produce a bounded set of values for 'x'". A and B are finite, so clearly their union is, too.
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A || x in B
}
What explains this behavior, which seems discrepant to a beginner?
This is indeed potentially surprising!
First, let me note that in practice, Dafny has built-in operators for intersection and union that it knows preserve finiteness. So you don't need to use set comprehensions to express these ideas. Instead you could just say A * B and A + B respectively.
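For the union in question, that is simply:
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  A + B
}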
However, my guess is that you're running into a more complicated example where you're using a set comprehension with a disjunction and are confused about why Dafny can't prove it finite.
Dafny uses syntactic heuristics to determine whether a set comprehension is finite. Unfortunately, these heuristics are not well documented anywhere. For purposes of this question, the key point is that the heuristics either depend on the type of the comprehension's bound variables, or look for a conjunct that constrains elements to be bounded in some other way. For example, Dafny can prove
set x: int | 0 <= x < 10 && ...
finite, as well as
set x:A | x in S && ...
In both cases, it is essential that the relevant bounds be conjuncts. Dafny has no syntactic heuristic for proving a bound for disjunctions, although one could imagine adding one. That is why Dafny cannot prove your union function finite.
As an aside, another workaround would be to use potentially infinite sets (written iset in Dafny). If you don't need to use the cardinality of the sets, then these might work better.
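A rough sketch of that workaround (my illustration, not from the original answer); note that iset values are ghost-only, so this has to be a ghost function rather than a function method:
// Assumes the cardinality of the result is never needed.
function union(A: iset<int>, B: iset<int>): (r: iset<int>)
{
  iset x | x in A || x in B
}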

Erlang: implementing an amb operator

Wikipedia says that using call/cc you can implement the amb operator for nondeterministic choice. My question is: how would you implement the amb operator in a language whose only support for continuations is writing in continuation-passing style, like Erlang?
If you can encode the constraints for what constitutes a successful solution or choice as guards, list comprehensions can be used to generate solutions. For example, the list comprehension documentation shows an example of solving Pythagorean triples, which is a problem frequently solved using amb (see for example exercise 4.35 of SICP, 2nd edition). Here's the more efficient solution, pyth1/1, shown on the list comprehensions page:
pyth1(N) ->
    [ {A,B,C} ||
        A <- lists:seq(1,N-2),
        B <- lists:seq(A+1,N-1),
        C <- lists:seq(B+1,N),
        A+B+C =< N,
        A*A+B*B == C*C
    ].
One important aspect of amb is efficiently searching the solution space, which is done here by generating possible values for A, B, and C with lists:seq/2 and then constraining and testing those values with guards. Note that the page also shows a less efficient solution named pyth/1 where A, B, and C are all generated identically using lists:seq(1,N); that approach generates all permutations but is slower than pyth1/1 (for example, on my machine, pyth(50) is 5-6x slower than pyth1(50)).
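For comparison, pyth/1 looks roughly like this (reconstructed from that documentation page):
pyth(N) ->
    [ {A,B,C} ||
        A <- lists:seq(1,N),
        B <- lists:seq(1,N),
        C <- lists:seq(1,N),
        A+B+C =< N,
        A*A+B*B == C*C
    ].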
If your constraints can't be expressed as guards, you can use pattern matching and try/catch to deal with failing solutions. For example, here's the same algorithm in pyth/1 rewritten as regular functions triples/1 and the recursive triples/5:
-module(pyth).
-export([triples/1]).

triples(N) ->
    triples(1,1,1,N,[]).

triples(N,N,N,N,Acc) ->
    lists:reverse(Acc);
triples(N,N,C,N,Acc) ->
    triples(1,1,C+1,N,Acc);
triples(N,B,C,N,Acc) ->
    triples(1,B+1,C,N,Acc);
triples(A,B,C,N,Acc) ->
    NewAcc = try
                 true = A+B+C =< N,
                 true = A*A+B*B == C*C,
                 [{A,B,C}|Acc]
             catch
                 error:{badmatch,false} ->
                     Acc
             end,
    triples(A+1,B,C,N,NewAcc).
We're using pattern matching for two purposes:
In the function heads, to control values of A, B and C with respect to N and to know when we're finished
In the body of the final clause of triples/5, to assert that conditions A+B+C =< N and A*A+B*B == C*C match true
If both conditions match true in the final clause of triples/5, we insert the solution into our accumulator list, but if either fails to match, we catch the badmatch error and keep the original accumulator value.
Calling triples/1 yields the same result as the list comprehension approaches used in pyth/1 and pyth1/1, but it's also half the speed of pyth/1. Even so, with this approach any constraint could be encoded as a normal function and tested for success within the try/catch expression.

Erlang pass-by-reference nuances

9> A = lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
13> Fn = fun (L) -> [0|L] end.
#Fun<erl_eval.6.90072148>
14> Fn(A).
[0,1,2,3,4,5,6,7,8,9,10]
15> A.
[1,2,3,4,5,6,7,8,9,10]
If Erlang internally passes by reference (see this), why does the value of A not reflect the change?
What fundamental thing am I missing about pass-by-reference or Erlang?
A list is a recursive construction of the form L = [Head|Tail], where Head is any valid Erlang term and Tail should be a list (if it is something else, L is called an improper list, which is out of the scope of this discussion).
Saying that L is passed as a reference means that:
it is not necessary to make a copy of the list in the function parameters (good for the process stack :o);
the function returns a value; it never modifies any parameter;
and in your particular case, it is not even necessary to make a copy of A to create the returned list. Since variables are not mutable, if you write B = Fn(A), then B will contain A; it will be exactly [0|A].
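A quick shell check of that last point (continuing the session above; the prompt numbers are only illustrative):
16> B = Fn(A).
[0,1,2,3,4,5,6,7,8,9,10]
17> B =:= [0|A].
true
18> A.
[1,2,3,4,5,6,7,8,9,10]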

Tail recursion in Erlang

I am really struggling to understand tail recursion in Erlang.
I have the following eunit test:
db_write_many_test() ->
    Db = db:new(),
    Db1 = db:write(francesco, london, Db),
    Db2 = db:write(lelle, stockholm, Db1),
    ?assertEqual([{lelle, stockholm},{francesco, london}], Db2).
And here is my implementation:
-module(db).
-include_lib("eunit/include/eunit.hrl").
-export([new/0,write/3]).

new() ->
    [].

write(Key, Value, Database) ->
    Record = {Key, Value},
    [Record|append(Database)].

append([H|T]) ->
    [H|append(T)];
append([]) ->
    [].
Is my implementation tail recursive and if not, how can I make it so?
Thanks in advance
Your implementation is not tail recursive because append must hold onto the head of the list while computing the tail. For a function to be tail-recursive, the return value must not rely on any value other than what is returned from the recursive call.
You could rewrite it like so:
append(Acc, []) ->                      %% termination clause
    Acc;
append(Acc, [H|T]) ->
    Acc2 = Acc ++ [dosomethingto(H)],   %% maybe you meant this to be your write function?
    append(Acc2, T).                    %% tail recursive
Notice that all the work is finished once the tail-recursive call occurs, so the append function can forget everything in its body and only needs to remember the values of the arguments it passes into the next call.
Also notice that I put the termination clause before the recursive clause. Erlang evaluates clauses in order, and since termination clauses are typically more specific, a less specific recursive clause placed first would hide them, preventing the function from ever returning, which is most likely not your desired behaviour.
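Applied to the db example (a hypothetical sketch of one way to use the technique, not the only one), write/3 could seed the accumulator with the new record and let a tail-recursive append/2 copy the rest; this keeps the ordering the eunit test expects:
write(Key, Value, Database) ->
    append([{Key, Value}], Database).

append(Acc, []) ->
    Acc;
append(Acc, [H|T]) ->
    append(Acc ++ [H], T).   %% the recursive call is the last thing the function does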
