Specifying modification of part of an array in Dafny

I am writing a partition method in Dafny as part of a quicksort implementation, and I want to specify that this method only modifies part of the backing array.
Here is the header for my method:
method partitionSegment (a : array<int>, first : int, len : int) returns (p : int)
modifies a
...
The idea is that the first and len parameters specify a segment of array a (elements a[first] ... a[first+len-1]); partitionSegment partitions this segment, returning the index of the pivot, which will be between first and first+len-1.
In my modifies clause I would like to indicate that only a[first] ... a[first+len-1] can be modified, rather than all of a. However, when I try using a set comprehension such as:
method partitionSegment (a : array<int>, first : int, len : int) returns (p : int)
modifies (set x | first <= x < first+len :: a[x])
the type checker balks, saying this is a set of integers rather than a set of memory locations. (So a[x] is being interpreted as the value stored at a[x], and not the memory location a[x].)
Is there any way I can do this in dafny: specify part of an array in a modifies annotation?

The best way to do this is to keep the modifies clause as modifies a and then add a postcondition that uses old to guarantee that only the intended parts of a change.
Something like this:
ensures forall i | 0 <= i < a.Length ::
!(first <= i < first + len) ==> a[i] == old(a[i])
In other words, this says that the elements at all indices outside the intended range are equal to their values from before the method executed.
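Putting the pieces together, the method header might look like the following sketch. (The requires clauses bounding the segment and the pivot-range postcondition are my additions for illustration; adapt them to your actual contract.)

```dafny
method partitionSegment(a: array<int>, first: int, len: int) returns (p: int)
  requires 0 <= first && 0 <= len && first + len <= a.Length
  modifies a
  ensures first <= p < first + len
  // frame condition: everything outside the segment is unchanged
  ensures forall i :: 0 <= i < a.Length && !(first <= i < first + len)
          ==> a[i] == old(a[i])
```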
In general, you should think of Dafny's modifies clauses as being relatively coarse-grained. They typically constrain which objects can be modified, not which parts of those objects. If you want to specify that some fields of an object don't change, then you can do that with a postcondition like foo.f == old(foo.f).

How to count number of non-empty nodes in binary tree in F#

Consider the binary tree algebraic datatype
type btree = Empty | Node of btree * int * btree
and a new datatype defined as follows:
type finding = NotFound | Found of int
Here's my code so far:
let s = Node (Node(Empty, 5, Node(Empty, 2, Empty)), 3, Node (Empty, 6, Empty))
(*
(3)
/ \
(5) (6)
/ \ | \
() (2) () ()
/ \
() ()
*)
(* size: btree -> int *)
let rec size t =
match t with
Empty -> false
| Node (t1, m, t2) -> if (m != Empty) then sum+1 || (size t1) || (size t2)
let num = occurs s
printfn "There are %i nodes in the tree" num
This probably isn't close; I took a function that would find whether an integer existed in a tree and tried changing the code to do what I wanted.
I am very new to using F# and would appreciate any help. I am trying to count all non-empty nodes in the tree. For example, the tree I'm using should give the value 4.
I did not run the compiler on your code, but I believe it doesn't even compile.
However, your idea to use a pattern match in a recursive function is good.
As rmunn commented, you want to determine the number of nodes in each case:
An empty tree has no nodes, hence the result is zero.
A non-empty tree has its root node plus the counts of its left and right subtrees.
So something along the lines of the following should work:
let rec size t =
match t with
| Empty -> 0
| Node (t1, _, t2) -> 1 + (size t1) + (size t2)
The most important detail here is that you do not need a global variable sum to store any intermediate values. The whole idea of a recursive function is that those intermediate values are the results of recursive calls.
As a remark, your tree in the comment should look like this, I believe.
(*
(3)
/ \
(5) (6)
/ \ | \
() (2) () ()
/ \
() ()
*)
Edit: I misread the misaligned () as leaves of an empty tree, where in fact they are leaves of the subtree (2). So it was just an ASCII art issue :-)
Friedrich already posted a simple version of the size function that will work for most trees. However, the solution is not "tail-recursive", so it can cause a Stack Overflow for large trees. In functional programming languages like F#, recursion is often the preferred technique for things like counting and other aggregate functions. However, recursive functions generally consume a stack frame for each recursive call. This means that for large structures, the call stack can be exhausted before the function completes. In order to avoid this problem, compilers can optimize functions that are considered "tail-recursive" so that they use only one stack frame regardless of how many times they recurse. Unfortunately, this optimization cannot just be implemented for any recursive algorithm. It requires that the recursive call be the last thing that the function does, thereby ensuring that the compiler does not have to worry about jumping back into the function after the call, allowing it to overwrite the stack frame instead of adding another one.
In order to change the size function to be tail-recursive, we need some way to avoid having to call it twice in the case of a non-empty node, so that the call can be the last step of the function, instead of the addition between the two calls in Friedrich's solution. This can be accomplished using a couple different techniques, generally either using an accumulator or using Continuation Passing Style. The simpler solution is often to use an accumulator to keep track of the total size instead of having it be the return value, while Continuation Passing Style is a more general solution that can handle more complex recursive algorithms.
In order to make an accumulator pattern work for a tree where we have to sum both the left and right sub-trees, we need some way to make one tail-call at the end of the function, while still making sure that both sub-trees are evaluated. A simple way to do that is to also accumulate the right sub-trees in addition to the total count, so we can make subsequent tail-calls to evaluate those trees while evaluating the left sub-trees first. That solution might look something like this:
let size t =
    let rec size acc ts = function
        | Empty ->
            match ts with
            | [] -> acc
            | head :: tail -> head |> size acc tail
        | Node (t1, _, t2) ->
            t1 |> size (acc + 1) (t2 :: ts)
    t |> size 0 []
This adds the acc parameter and the ts parameter to represent the total count and remaining unevaluated sub-trees. When we hit a populated node, we evaluate the left sub-tree while adding the right sub-tree to our list of trees to evaluate later. When we hit an empty node, we start evaluating any ts we've accumulated, until we have no further populated nodes or unevaluated sub-trees. This isn't the best possible solution for computing the tree-size, and most real solutions would use Continuation Passing Style to make it tail-recursive, but that should make a good exercise as you get more familiar with the language.
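For completeness, a Continuation Passing Style version might look like the following sketch (untested; go and k are names I chose). Instead of accumulating a worklist of sub-trees, the pending additions are captured in a continuation function, so every recursive call is a tail call:

```fsharp
let size t =
    // k is the continuation: "what to do with the count once we have it".
    let rec go t k =
        match t with
        | Empty -> k 0
        | Node (t1, _, t2) ->
            // Count the left sub-tree; its continuation counts the right
            // sub-tree and finally adds 1 for the current node.
            go t1 (fun n1 -> go t2 (fun n2 -> k (n1 + n2 + 1)))
    go t id
```

The trade-off is that the pending work now lives in heap-allocated closures rather than on the call stack, which is what avoids the stack overflow.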

assertion violation when verifying Max function in Dafny?

The following program results in an assertion violation on assert v==40: why? The program verifies when the array a contains only one element.
method Max(a:array<int>) returns(max:int)
requires 1<=a.Length
ensures forall j:int :: 0<=j< a.Length ==> max >= a[j]
ensures exists j:int :: 0<=j< a.Length && max == a[j]
{
max:=a[0];
var i :=1;
while(i < a.Length)
invariant 1<=i<=a.Length
decreases a.Length-i
invariant forall j:int :: 0<=j<i ==> max >= a[j]
invariant exists j:int :: 0<=j<i && max == a[j]
{
if(a[i] >= max){max := a[i];}
i := i + 1;
}
}
method Test(){
var a := new int[2];
a[0],a[1] := 40,10;
var v:int:=Max(a);
assert v==40;
}
This is indeed strange! It boils down to the way Dafny handles quantifiers.
Let's start with a human-level proof that the assertion is actually valid. From the postconditions of Max, we know two things about v: (1) it is at least as big as every element in a, and (2) it is equal to some element of a. By (2), v is either 40 or 10, and by (1), v is at least 40 (because it's at least as big as a[0], which is 40). Since 10 is not at least 40, v can't be 10, so it must be 40.
Now, why does Dafny fail to understand this automatically? It's because of the forall quantifier in (1). Dafny (really Z3) internally uses "triggers" to approximate the behavior of universal quantifiers. (Without any approximation, reasoning with quantifiers is undecidable in general, so some restriction like this is required.) The way triggers work is that for each quantifier in the program, a syntactic pattern called the trigger is inferred. Then, that quantifier is completely ignored unless the trigger matches some expression in the context.
In this example, fact (1) will have a trigger of a[j]. (You can see what triggers are inferred in Visual Studio or VSCode or emacs by hovering over the quantifier. Or on the command line, by passing the option /printTooltips and looking for the line number.) That means that the quantifier will be ignored unless there is some expression of the form a[foo] in the context, for any expression foo. Then (1) will be instantiated with foo for j, and we'll learn max >= a[foo].
Since your Test method's assertion doesn't mention any expression of the form a[foo], Dafny will not be able to use fact (1) at all, which results in the spurious assertion violation.
One way to fix your Test method is to add the assertion
assert v >= a[0];
just before the other assertion. This is the key consequence of fact (1) that we needed in our human level proof, and it contains the expression a[0], which matches the trigger, allowing Dafny to instantiate the quantifier. The rest of the proof then goes through automatically.
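Concretely, the fixed Test method reads:

```dafny
method Test() {
  var a := new int[2];
  a[0], a[1] := 40, 10;
  var v := Max(a);
  assert v >= a[0]; // mentions a[0], which matches the trigger a[j]
  assert v == 40;   // now follows: v is 40 or 10, and v >= 40
}
```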
For more information about triggers in general and how to write them manually, see this answer.

Defining algebraic datatypes with constraints in Z3

I've seen some online materials for defining algebraic datatypes like an IntList in Z3. I'm wondering how to define an algebraic datatype with logical constraints. For example, how to define a PosSort that stands for positive integers.
Total functions in SMT
Functions are always total in SMT, which raises the question of how to encode partial functions, such as a data type constructor for PosSort. Thus, I would be surprised if Z3's/SMT's built-in support for algebraic data types supported partial data type constructors (and the SMT-LIB 2.6 standard appears to agree).
Encoding partial functions: the theory
However, not all hope is lost, but you'll probably have to encode ADTs yourself. Assume a total function f: A -> B, which should model a partial data type constructor function f': A ~> B whose domain are all a that satisfy p(a). Here, A could be Int, B could be List[A], p(a) could be 0 < a and f(a) could be defined as f(a) := a :: Nil (I am using pseudo-code here, but you should get the idea).
One approach is to ensure that f is never applied to an a that is not positive. Depending on where your SMT code comes from, it might be possible to check that constraint before each application of f (and to raise an error if f isn't applicable).
The other approach is to underspecify f and conditionally define it, e.g. along the lines of 0 < a ==> f(a) := a :: Nil. This way, f remains total (which, as said before, you'll most likely have to live with), but its value is undefined for a <= 0. Hence, when you try to prove something about f(a) with a <= 0, e.g. that head(f(a)) == a, the proof should fail (assuming that head(a :: _) is defined as a).
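As a rough SMT-LIB sketch of this underspecification approach (all names here are hypothetical, and the sort and functions are uninterpreted rather than a built-in ADT):

```smt2
; Hypothetical "positive list" via an uninterpreted sort and functions.
(declare-sort PosList 0)
(declare-fun nil () PosList)
(declare-fun cons (Int PosList) PosList)
(declare-fun head (PosList) Int)

; cons stays total, but head-over-cons is only specified for positive
; heads; the value of (head (cons a l)) is undefined when a <= 0.
(assert (forall ((a Int) (l PosList))
  (=> (< 0 a) (= (head (cons a l)) a))))
```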
Encoding partial functions: a practical example
I am too lazy to code up an example in SMT, but this encoding of an integer list (in a verification language called Viper) should give you a very concrete idea of how to encode an integer list using uninterpreted functions and axioms. The example can basically be translated to SMT-LIB in a one-to-one manner.
Changing that example such that it axiomatises a list of positive integers is straightforward: just add the constraint 0 < head to every axiom that talks about list heads. I.e. use the following alternative axioms:
axiom destruct_over_construct_Cons {
forall head: Int, tail: list :: {Cons(head, tail)}
0 < head ==>
head_Cons(Cons(head, tail)) == head
&& tail_Cons(Cons(head, tail)) == tail
}
...
axiom type_of_Cons {
forall head: Int, tail: list ::
0 < head ==> type(Cons(head, tail)) == type_Cons()
}
If you run the example online with these changes, the test method test_quantifiers() should fail immediately. Adding the necessary constraints on the list elements, i.e. changing it to
method test_quantifiers() {
/* The elements of a deconstructed Cons are equivalent to the corresponding arguments of Cons */
assert forall head: Int, tail: list, xs: list ::
0 < head ==>
is_Cons(xs) ==> (head == head_Cons(xs) && tail == tail_Cons(xs) <==> Cons(head, tail) == xs)
/* Two Cons are equal iff their constructors' arguments are equal */
assert forall head1: Int, head2: Int, tail1: list, tail2: list ::
(0 < head1 && 0 < head2) ==>
(Cons(head1, tail1) == Cons(head2, tail2)
<==>
head1 == head2 && tail1 == tail2)
}
should make the verification succeed again.
What you are looking for is called predicate-subtyping; and as far as I know Yices is the only SMT solver that supported it out of the box: http://yices.csl.sri.com/old/language.shtml
In particular, see the examples here: http://yices.csl.sri.com/old/language.shtml#language_dependent_types
Unfortunately, this is "old" Yices, and I don't think this particular input-language is supported any longer. As Malte mentioned, SMTLib doesn't have support for predicate subtyping either.
Assuming your output SMTLib is "generated," you can insert "checks" to make sure all elements remain within the domain. But this is rather cumbersome and it is not clear how to deal with partiality. Underspecification is a nice trick, but it can get really hairy and lead to specifications that are very hard to debug.
If you really need predicate subtyping, perhaps SMT solvers are not the best choice for your problem domain. Theorem provers, dependently typed languages, etc. might be more suitable. A practical example, for instance, is the LiquidHaskell system for Haskell programs, which allows predicates to be attached to types to do precisely what you are trying; and uses an SMT-solver to discharge the relevant conditions: https://ucsd-progsys.github.io/liquidhaskell-blog/
If you want to stick to SMT-solvers and don't mind using an older system, I'd recommend Yices with its support for predicate subtyping for modeling such problems. It was (and still is) one of the finest implementations of this very idea in the context of SMT-solving.

Dafny recursion hits every element in sequence, can't verify

The following function, seqToSet, takes a sequence of elements and returns a set containing all (and only) the elements in the given sequence. It does this by calling a recursive helper function, addSeqToSet, with the same sequence and the empty set. The helper function adds each element in the sequence to the set it is given and returns the result. It does this via recursion on the head/tail structure of the sequence. It is done when the sequence is empty, and otherwise calls itself recursively with the tail of the sequence and the union of the set with the singleton set containing the head of the sequence.
Dafny can't verify on its own the postcondition stating that the resulting set contains all of the elements in the original sequence.
What's the right strategy for helping it to see that this is the case?
function method seqToSet(q: seq<int>): set<int>
{
addSeqToSet(q, {})
}
function method addSeqToSet(q: seq<int>, t: set<int>): (r: set<int>)
ensures forall i :: i in q ==> i in r
{
if | q | == 0 then t
else addSeqToSet (q[1..], t + { q[0] })
}
When Dafny tries to verify postconditions of recursive functions, it reasons by induction: assume the postcondition holds on any recursive calls, and show that it also holds for the current call.
Imagine how Dafny reasons about addSeqToSet.
In the first branch, | q | == 0 implies q is empty, so the postcondition holds trivially, since there are no elements i in q.
In the second branch, Dafny assumes the postcondition of the recursive call:
forall i :: i in q[1..] ==> i in r
and then tries to prove the postcondition for the current call:
forall i :: i in q ==> i in r
Notice that since addSeqToSet directly returns the answer from the recursive call, r is the same in both of the above formulas.
If you think about it for a minute, you can see that the postcondition of the outer call does not follow from the postcondition of the recursive call, because the recursive call says nothing about q[0].
You need to strengthen the postcondition in some way so that you know that q[0] in r as well.
One such strengthening is to add the additional postcondition about t:
ensures forall i :: i in t ==> i in r
Dafny then verifies both postconditions.
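For reference, the strengthened helper is:

```dafny
function method addSeqToSet(q: seq<int>, t: set<int>): (r: set<int>)
  ensures forall i :: i in q ==> i in r
  ensures forall i :: i in t ==> i in r  // the strengthening
{
  if |q| == 0 then t
  else addSeqToSet(q[1..], t + {q[0]})
}
```

In the recursive case, the new postcondition of the inner call now tells Dafny that every element of t + {q[0]}, in particular q[0], ends up in r.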

Sum of the elements of a sequence of integers: loop invariant might not be maintained by the loop

After reading Getting Started with Dafny: A Guide, I decided to create my first program: given a sequence of integers, compute the sum of its elements. However, I am having a hard time in getting Dafny to verify the program.
function G(a: seq<int>): int
decreases |a|
{
if |a| == 0 then 0 else a[0] + G(a[1..])
}
method sum(a: seq<int>) returns (r: int)
ensures r == G(a)
{
r := 0;
var i: int := 0;
while i < |a|
invariant 0 <= i <= |a|
invariant r == G(a[0..i])
{
r := r + a[i];
i := i + 1;
}
}
I get
stdin.dfy(12,2): Error BP5003: A postcondition might not hold on this return path.
stdin.dfy(8,12): Related location: This is the postcondition that might not hold.
stdin.dfy(14,16): Error BP5005: This loop invariant might not be maintained by the loop.
I suspect that Dafny needs some "help" in order to verify the program (lemmas maybe?) but I do not know where to start.
Here is a version of your program that verifies.
There were two things to fix: the proof that the postcondition follows after the loop, and the proof that the loop invariant is preserved.
The postcondition
Dafny needs a hint that it might be helpful to try to prove a == a[..|a|]. Asserting that equality is enough to finish this part of the proof: Dafny automatically proves the equality and uses it to prove the postcondition from the loop invariant.
This is a common pattern. You can try to see what is bothering Dafny by doing the proof "by hand" in Dafny by making various assertions that you would use to prove it to yourself on paper.
The loop invariant
This one is a bit more complicated. We need to show that updating r and incrementing i preserves r == G(a[..i]). To do this, I used a calc statement, which lets one prove an equality via a sequence of intermediate steps. (It is always possible to prove such things without calc, if you prefer, by asserting all the relevant equalities as well as any assertions inside the calc. But I think calc is nicer.)
I placed the calc statement before the updates to r and i occur. I know that after the updates occur, I will need to prove r == G(a[..i]) for the updated values of r and i. Thus, before the updates occur, it suffices to prove r + a[i] == G(a[..i+1]) for the un-updated values. My calc statement starts with r + a[i] and works toward G(a[..i+1]).
First, by the loop invariant on entry to the loop, we know that r == G(a[..i]) for the current values.
Next, we want to bring the a[i] inside the G. This fact is not entirely trivial, so we need a lemma. I chose to prove something slightly more general than necessary, which is that G(a + b) == G(a) + G(b) for any integer sequences a and b. I call this lemma G_append. Its proof is discussed below. For now, we just use it to bring the a[i] inside as a singleton sequence.
The last step in this calc is to combine a[0..i] + [a[i]] into a[0..i+1]. This is another sequence extensionality fact, and thus needs to be asserted explicitly.
That completes the calc, which proves the invariant is preserved.
The lemma
The proof of G_append proceeds by induction on a. The base case where a == [] is handled automatically. In the inductive case, we need to show G(a + b) == G(a) + G(b), assuming the induction hypothesis for any subsequences of a. I use another calc statement for this.
Beginning with G(a + b), we first expand the definition of G. Next, we note that (a + b)[0] == a[0] since a != []. Similarly, we have that (a + b)[1..] == a[1..] + b, but since this is another sequence extensionality fact, it must be explicitly asserted. Finally, we can use the induction hypothesis (automatically invoked by Dafny) to show that G(a[1..] + b) == G(a[1..]) + G(b).
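Putting all of that together, a version along these lines should verify. This is a sketch reconstructed from the description above, showing where each hint goes:

```dafny
function G(a: seq<int>): int
  decreases |a|
{
  if |a| == 0 then 0 else a[0] + G(a[1..])
}

lemma G_append(a: seq<int>, b: seq<int>)
  ensures G(a + b) == G(a) + G(b)
{
  if a == [] {
    assert a + b == b;  // sequence extensionality
  } else {
    calc {
      G(a + b);
      == { assert (a + b)[0] == a[0]; }
      a[0] + G((a + b)[1..]);
      == { assert (a + b)[1..] == a[1..] + b; }  // extensionality
      a[0] + G(a[1..] + b);
      == { G_append(a[1..], b); }  // induction hypothesis
      a[0] + G(a[1..]) + G(b);
      ==
      G(a) + G(b);
    }
  }
}

method sum(a: seq<int>) returns (r: int)
  ensures r == G(a)
{
  r := 0;
  var i: int := 0;
  while i < |a|
    invariant 0 <= i <= |a|
    invariant r == G(a[..i])
  {
    calc {
      r + a[i];
      ==  // by the loop invariant
      G(a[..i]) + a[i];
      == { G_append(a[..i], [a[i]]); }
      G(a[..i] + [a[i]]);
      == { assert a[..i] + [a[i]] == a[..i+1]; }  // extensionality
      G(a[..i+1]);
    }
    r := r + a[i];
    i := i + 1;
  }
  assert a == a[..|a|];  // hint for the postcondition
}
```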