The following program results in an assertion violation on assert v==40. Why? The program can be verified when the array a contains only one element.
method Max(a: array<int>) returns (max: int)
  requires 1 <= a.Length
  ensures forall j: int :: 0 <= j < a.Length ==> max >= a[j]
  ensures exists j: int :: 0 <= j < a.Length && max == a[j]
{
  max := a[0];
  var i := 1;
  while (i < a.Length)
    invariant 1 <= i <= a.Length
    invariant forall j: int :: 0 <= j < i ==> max >= a[j]
    invariant exists j: int :: 0 <= j < i && max == a[j]
    decreases a.Length - i
  {
    if (a[i] >= max) { max := a[i]; }
    i := i + 1;
  }
}

method Test() {
  var a := new int[2];
  a[0], a[1] := 40, 10;
  var v: int := Max(a);
  assert v == 40;
}
This is indeed strange! It boils down to the way Dafny handles quantifiers.
Let's start with a human-level proof that the assertion is actually valid. From the postconditions of Max, we know two things about v: (1) it is at least as big as every element in a, and (2) it is equal to some element of a. By (2), v is either 40 or 10, and by (1), v is at least 40 (because it's at least as big as a[0], which is 40). Since 10 is not at least 40, v can't be 10, so it must be 40.
Now, why does Dafny fail to understand this automatically? It's because of the forall quantifier in (1). Dafny (really Z3) internally uses "triggers" to approximate the behavior of universal quantifiers. (Without any approximation, reasoning with quantifiers is undecidable in general, so some restriction like this is required.) The way triggers work is that for each quantifier in the program, a syntactic pattern called the trigger is inferred. Then, that quantifier is completely ignored unless the trigger matches some expression in the context.
In this example, fact (1) will have a trigger of a[j]. (You can see what triggers are inferred in Visual Studio or VSCode or emacs by hovering over the quantifier. Or on the command line, by passing the option /printTooltips and looking for the line number.) That means that the quantifier will be ignored unless there is some expression of the form a[foo] in the context, for any expression foo. Then (1) will be instantiated with foo for j, and we'll learn max >= a[foo].
Since your Test method's assertion doesn't mention any expression of the form a[foo], Dafny will not be able to use fact (1) at all, which results in the spurious assertion violation.
One way to fix your Test method is to add the assertion
assert v >= a[0];
just before the other assertion. This is the key consequence of fact (1) that we needed in our human level proof, and it contains the expression a[0], which matches the trigger, allowing Dafny to instantiate the quantifier. The rest of the proof then goes through automatically.
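Putting the fix together, a minimal sketch of the whole program might look as follows. Max is the method from the question, only re-indented; the sole change is the extra assertion in Test, which mentions a[0] and thereby gives the verifier a term that matches the trigger a[j].

```dafny
method Max(a: array<int>) returns (max: int)
  requires 1 <= a.Length
  ensures forall j: int :: 0 <= j < a.Length ==> max >= a[j]
  ensures exists j: int :: 0 <= j < a.Length && max == a[j]
{
  max := a[0];
  var i := 1;
  while i < a.Length
    invariant 1 <= i <= a.Length
    invariant forall j: int :: 0 <= j < i ==> max >= a[j]
    invariant exists j: int :: 0 <= j < i && max == a[j]
    decreases a.Length - i
  {
    if a[i] >= max { max := a[i]; }
    i := i + 1;
  }
}

method Test() {
  var a := new int[2];
  a[0], a[1] := 40, 10;
  var v: int := Max(a);
  // Mentions a[0], which matches the trigger a[j] of the forall postcondition,
  // so the verifier instantiates it with j := 0 and learns v >= 40.
  assert v >= a[0];
  assert v == 40;
}
```

The extra assertion adds no new obligation in spirit: it is itself a consequence of the postcondition, and its only job is to put the term a[0] into the proof context.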
For more information about triggers in general and how to write them manually, see this answer.
I am trying to understand why the assertion I have below, fails. I can understand that it is because of the loop invariant, but why does Dafny do this? Why does it rely so much on the loop invariant when I have clearly stated the loop condition is while i < n? Is this because Dafny is looking at the verification code first? That is, the invariant line, before it looks at the actual code?
method T()
{
  var n := 10;
  var i := 0;
  while i < n
    invariant 0 <= i <= n + 2
  {
    i := i + 1;
  }
  assert i == n;
}
To figure out what is true at the end of a loop Dafny relies on two things. One is the loop invariant, which it will assume is true, and the other is the loop guard (i < n in your case), which it will assume is false.
So immediately after the loop you can assert 0 <= i <= n+2 && !(i < n). This implies that n <= i <= n+2. If you change the assert to
assert n <= i <= n+2 ;
That will verify.
Now, if you strengthen the invariant to
invariant 0 <= i <= n
Then you can strengthen the assert at the end to
assert n <= i <= n ;
Or more simply
assert i == n ;
Dafny is able to infer some loop invariants that you don't tell it. In this case it will notice that n is not changed by the loop. Essentially it will add n==10 as an extra loop invariant. So, if you change the invariant as I suggested, you can assert that i==10 at the end. But it's still only relying on the guard being false and the loop invariants (both the ones you tell it and the ones it infers on its own) being true.
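The two variants discussed above can be sketched side by side. The method names TWeak and TStrong are made up for this illustration; everything else follows the question's code.

```dafny
// Original (weak) invariant: after the loop Dafny knows only
// 0 <= i <= n + 2 and !(i < n), so only the weaker assertion is provable.
method TWeak()
{
  var n := 10;
  var i := 0;
  while i < n
    invariant 0 <= i <= n + 2
  {
    i := i + 1;
  }
  assert n <= i <= n + 2;
}

// Strengthened invariant: invariant plus negated guard give n <= i <= n,
// that is, i == n.
method TStrong()
{
  var n := 10;
  var i := 0;
  while i < n
    invariant 0 <= i <= n
  {
    i := i + 1;
  }
  assert i == n;
}
```

In both cases the loop body is ignored when reasoning about the code after the loop; only the invariant and the negated guard carry over.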
For the loop you posted, Dafny is unable to infer that i <= n is a loop invariant, even though it may be perfectly obvious to you that this is an invariant.
It would be interesting if Dafny worked backwards from the post-condition to arrive at something it might try as a loop invariant. E.g. it might look at i==n and i<n and work out that it would need as a loop invariant i<n || i==n (or something stronger) and try that. But that's not something Dafny does (yet).
This is the code that I wrote for a method that returns the maximum of two integers:
predicate greater(x: int, a: int, b: int) {
  (x >= a) && (x >= b)
}

method Max(a: int, b: int) returns (max: int)
  ensures max >= a
  ensures max >= b
  ensures forall x /*{:trigger greater(x,a,b)}*/ :: (greater(x,a,b)) ==> x >= max
{
  if (a > b) {
    max := a;
  } else {
    max := b;
  }
  // assert greater(max, a, b); - trivial assertion
}

method Main() {
  var res := Max(4, 5);
  assert res == 5;
}
As you can see, I have tried both of the techniques mentioned in the Wiki page (manual trigger assignment and adding a trivial, non-useful assertion in the method body). However, I still get an assertion error.
I am not sure what else to do. I have read other answers like this, this and this, but none have helped me so far.
PS: I know there is a simpler way to write the postconditions for this particular method, however, I really wish to model the postconditions in terms of the forall quantifier only.
Let's forget greater for a while and just take a look at what you're trying to achieve. After the call to Max in Main, you know the following (from the postcondition of Max):
res >= 4
res >= 5
forall x :: x >= 4 && x >= 5 ==> x >= res
You're trying to prove res == 5 from this. The second of these three things immediately gives you half of that equality, so all you need to do is obtain 5 >= res. If you instantiate the quantifier with 5 for x, you will get
5 >= 4 && 5 >= 5 ==> 5 >= res
which simplifies to 5 >= res, which is what you need, so that's the end of your proof.
In summary, the proof comes down to instantiating the quantifier with 5 for x. Next, you need to know a little about how the Dafny verifier instantiates quantifiers. Essentially, it does this by looking at the "shape" of the quantifier and looking for similar things in the context of what you're trying to prove. By "shape", I mean things like "the functions and predicates it uses". Usually, this technique works well, but in your case, the quantifier is so plain that it doesn't have any "shape" to speak of. Consequently, the verifier fails to come up with the needed instantiation.
It would be nice if we could just say "hey, try instantiating that quantifier with 5 for x". Well, we can, if we give the quantifier some "shape" that we can refer to. That's what those wiki and other guidelines are trying to say. This is where it's useful to introduce the predicate greater. (Don't try to manually write trigger annotations.)
Alright, after introducing greater, your specification says
ensures greater(max, a, b)
ensures forall x :: greater(x, a, b) ==> x >= max
This says "max satisfies greater(max, a, b)" and "among all values x that satisfy greater(x, a, b), max is the smallest". After the call to Max in Main, we then have:
greater(res, 4, 5)
forall x :: greater(x, 4, 5) ==> x >= res
Recall, I said the verifier tries to figure out quantifier instantiations by looking at the quantifier and looking at the context around your assertion, and you're trying to instantiate the quantifier with 5 for x. So, if you can add something to the context just before the assertion that tempts the verifier to do that instantiation, then you're done.
Here's the answer: you want to introduce the term greater(5, 4, 5). This has a shape much like the greater(x, 4, 5) in the quantifier. Because of this similarity, the verifier will instantiate x with 5, which gives
greater(5, 4, 5) ==> 5 >= res
And since greater(5, 4, 5) is easily proved to be true, the needed fact 5 >= res follows.
So, change the body of Main to
var res := Max(4, 5);
assert greater(5, 4, 5);
assert res == 5;
and you're done. The verifier will prove both assertions. The first is trivial, and after proving it, the verifier gets to use the term greater(5, 4, 5) in the proof of the second assertion. That term is what triggers the quantifier, which produces the fact 5 >= res, which proves the second assertion.
I want to point out that most quantifiers we try to prove do have some shape already. In your case, the predicate greater was introduced in order to give some shape to the quantifier. The technique of adding the extra assertion (here, assert greater(5, 4, 5)) is the same whether or not greater was already defined or was introduced as a trivial predicate that provides shape.
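For reference, here is a sketch of the complete program with the revised specification and the extra assertion in place. It only assembles the pieces described above; the formatting is mine.

```dafny
predicate greater(x: int, a: int, b: int) {
  x >= a && x >= b
}

method Max(a: int, b: int) returns (max: int)
  ensures greater(max, a, b)
  ensures forall x :: greater(x, a, b) ==> x >= max
{
  if a > b {
    max := a;
  } else {
    max := b;
  }
}

method Main() {
  var res := Max(4, 5);
  // Introduces the term greater(5, 4, 5), which has the same shape as
  // greater(x, 4, 5) in the quantifier and so prompts the instantiation x := 5.
  assert greater(5, 4, 5);
  assert res == 5;
}
```

The first postcondition gives res >= 5 once greater(res, 4, 5) is unfolded, and the instantiated quantifier gives 5 >= res, which together yield the equality.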
I am trying to write a function to get the minimum of a non-empty set.
Here is what I came up with:
method minimum(s: set<int>) returns (out: int)
  requires |s| >= 1
  ensures forall t : int :: t in s ==> out <= t
{
  var y :| y in s;
  if (|s| > 1) {
    var m := minimum(s - {y});
    out := (if y < m then y else m);
    assert forall t : int :: t in (s - {y}) ==> out <= t;
    assert out <= y;
  } else {
    assert |s| == 1;
    assert y in s;
    assert |s - {y}| == 0;
    assert s - {y} == {};
    assert s == {y};
    return y;
  }
}
This is suboptimal for two reasons:
1. Dafny gives a "No terms found to trigger on." warning for the line
assert forall t : int :: t in (s - {y}) ==> out <= t;
However, removing this line causes the code to fail to verify. My understanding is that the trigger warning isn't really bad, it's just a warning that Dafny might have trouble with the line. (Even though it actually seems to help.) So it makes me feel like I'm doing something suboptimal or non-idiomatic.
2. This is pretty inefficient. (It constructs a new set each time, so it would be O(n^2).) But I don't see any other way to iterate through a set. Is there a faster way to do this? Are sets really intended for programming "real" non-ghost code in Dafny?
So my question (in addition to the above) is: is there a better way to write the minimum function?
In this case, I recommend ignoring the trigger warning, since it seems to work fine despite the warning. (Dafny's trigger inference is a little bit overly conservative when it comes to the set theoretic operators, and Z3 is able to infer a good trigger at the low level.) If you really want to fix it, here is one way. Replace the "then" branch of your code with
var s' := (s - {y});
var m := minimum(s');
out := (if y < m then y else m);
assert forall t :: t in s ==> t == y || t in s';
assert forall t : int :: t in s' ==> out <= t;
assert out <= y;
The second problem (about efficiency) is somewhat fundamental. (See Rustan's paper "Compiling Hilbert's Epsilon Operator" where it is mentioned that compiling let-such-that statements results in quadratic performance.) I prefer to think of Dafny's set as a mathematical construct that should not be compiled. (The fact that it can be compiled is a convenience for toy programs, not for real systems, where one would expect a standard library implementation of sets based on a data structure.)
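For completeness, here is how the whole method might look with the suggested "then" branch substituted in. This is a sketch that simply combines the question's code with the replacement above; the "else" branch is kept exactly as in the question, and I have not tried to minimize the assertions.

```dafny
method minimum(s: set<int>) returns (out: int)
  requires |s| >= 1
  ensures forall t: int :: t in s ==> out <= t
{
  var y :| y in s;
  if |s| > 1 {
    // Naming the smaller set gives the quantifiers below a term (t in s')
    // that Dafny can select as a trigger.
    var s' := s - {y};
    var m := minimum(s');
    out := if y < m then y else m;
    assert forall t :: t in s ==> t == y || t in s';
    assert forall t: int :: t in s' ==> out <= t;
    assert out <= y;
  } else {
    assert |s| == 1;
    assert y in s;
    assert |s - {y}| == 0;
    assert s - {y} == {};
    assert s == {y};
    return y;
  }
}
```

The efficiency concern is unchanged by this rewrite: each recursive call still builds a new set.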
After reading Getting Started with Dafny: A Guide, I decided to create my first program: given a sequence of integers, compute the sum of its elements. However, I am having a hard time in getting Dafny to verify the program.
function G(a: seq<int>): int
  decreases |a|
{
  if |a| == 0 then 0 else a[0] + G(a[1..])
}

method sum(a: seq<int>) returns (r: int)
  ensures r == G(a)
{
  r := 0;
  var i: int := 0;
  while i < |a|
    invariant 0 <= i <= |a|
    invariant r == G(a[0..i])
  {
    r := r + a[i];
    i := i + 1;
  }
}
I get
stdin.dfy(12,2): Error BP5003: A postcondition might not hold on this return path.
stdin.dfy(8,12): Related location: This is the postcondition that might not hold.
stdin.dfy(14,16): Error BP5005: This loop invariant might not be maintained by the loop.
I suspect that Dafny needs some "help" in order to verify the program (lemmas maybe?) but I do not know where to start.
Here is a version of your program that verifies.
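The code itself appears to be missing from this copy of the answer, so what follows is a reconstruction from the explanation below rather than the original listing. The lemma name G_append and the overall structure come from that explanation; the explicit base-case assertion and the explicit recursive call in the lemma are extra precautions I added, and Dafny's automatic induction may make them unnecessary.

```dafny
function G(a: seq<int>): int
  decreases |a|
{
  if |a| == 0 then 0 else a[0] + G(a[1..])
}

// G distributes over sequence concatenation.
lemma G_append(a: seq<int>, b: seq<int>)
  ensures G(a + b) == G(a) + G(b)
{
  if a == [] {
    assert a + b == b;  // extensionality hint (assumed precaution)
  } else {
    calc {
      G(a + b);
    ==  // definition of G; a + b is non-empty because a is
      (a + b)[0] + G((a + b)[1..]);
    ==  { assert (a + b)[1..] == a[1..] + b; }
      a[0] + G(a[1..] + b);
    ==  { G_append(a[1..], b); }  // induction hypothesis, invoked explicitly
      a[0] + G(a[1..]) + G(b);
    ==  // definition of G, folded back up
      G(a) + G(b);
    }
  }
}

method sum(a: seq<int>) returns (r: int)
  ensures r == G(a)
{
  r := 0;
  var i: int := 0;
  while i < |a|
    invariant 0 <= i <= |a|
    invariant r == G(a[..i])
  {
    // Proved before the updates, for the un-updated values of r and i.
    calc {
      r + a[i];
    ==  // loop invariant on entry
      G(a[..i]) + a[i];
    ==  { G_append(a[..i], [a[i]]); }
      G(a[..i] + [a[i]]);
    ==  { assert a[..i] + [a[i]] == a[..i+1]; }
      G(a[..i+1]);
    }
    r := r + a[i];
    i := i + 1;
  }
  // Hint needed to get from the invariant (with i == |a|) to the postcondition.
  assert a == a[..|a|];
}
```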
There were two things to fix: the proof that the postcondition follows after the loop, and the proof that the loop invariant is preserved.
The postcondition
Dafny needs a hint that it might be helpful to try to prove a == a[..|a|]. Asserting that equality is enough to finish this part of the proof: Dafny automatically proves the equality and uses it to prove the postcondition from the loop invariant.
This is a common pattern. You can try to see what is bothering Dafny by doing the proof "by hand" in Dafny by making various assertions that you would use to prove it to yourself on paper.
The loop invariant
This one is a bit more complicated. We need to show that updating r and incrementing i preserves r == G(a[..i]). To do this, I used a calc statement, which lets one prove an equality via a sequence of intermediate steps. (It is always possible to prove such things without calc, if you prefer, by asserting all the relevant equalities as well as any assertions inside the calc. But I think calc is nicer.)
I placed the calc statement before the updates to r and i occur. I know that after the updates occur, I will need to prove r == G(a[..i]) for the updated values of r and i. Thus, before the updates occur, it suffices to prove r + a[i] == G(a[..i+1]) for the un-updated values. My calc statement starts with r + a[i] and works toward G(a[..i+1]).
First, by the loop invariant on entry to the loop, we know that r == G(a[..i]) for the current values.
Next, we want to bring the a[i] inside the G. This fact is not entirely trivial, so we need a lemma. I chose to prove something slightly more general than necessary, which is that G(a + b) == G(a) + G(b) for any integer sequences a and b. I call this lemma G_append. Its proof is discussed below. For now, we just use it to bring the a[i] inside as a singleton sequence.
The last step in this calc is to combine a[0..i] + [a[i]] into a[0..i+1]. This is another sequence extensionality fact, and thus needs to be asserted explicitly.
That completes the calc, which proves the invariant is preserved.
The lemma
The proof of G_append proceeds by induction on a. The base case where a == [] is handled automatically. In the inductive case, we need to show G(a + b) == G(a) + G(b), assuming the induction hypothesis for any subsequences of a. I use another calc statement for this.
Beginning with G(a + b), we first expand the definition of G. Next, we note that (a + b)[0] == a[0] since a != []. Similarly, we have that (a + b)[1..] == a[1..] + b, but since this is another sequence extensionality fact, it must be explicitly asserted. Finally, we can use the induction hypothesis (automatically invoked by Dafny) to show that G(a[1..] + b) == G(a[1..]) + G(b).
For a correct method, can Z3 find a model for the method's verification condition?
I had thought not, but here is an example where the method is correct
yet verification finds a model.
This was with Dafny 1.9.7.
What Malte says is correct (and I found it nicely explained as well).
Dafny is sound, in the sense that it will only verify correct programs. In other words, if a program is incorrect, the Dafny verifier will never say that it is correct. However, the underlying decision problems are in general undecidable. Therefore, unavoidably, there will be cases where a program meets its specifications and the verifier still gives an error message. Indeed, in such cases, the verifier may even show a purported counterexample. It may be a false counterexample (as in the example above) -- it simply means that, as far as the verifier can tell, this is a counterexample. If the verifier just spent a little more time or if it was clever enough to unroll more function definitions, apply induction hypotheses, or do a host of other good-things-to-do, it may be possible to determine that the counterexample is bogus. So, any error message you get (including any counterexample that may accompany such an error message) should be interpreted as a possible error (and possible counterexample).
Similar situations frequently occur if you're trying to verify the correctness of a loop and you don't supply a strong enough loop invariant. The Dafny verifier may then show some values of variables on entry to the loop that can never occur in actuality. The counterexample is then trying to give you an idea of how to strengthen your loop invariant appropriately.
Finally, let me add two notes to what Malte said.
First, there's at least one other source of incompleteness involved in this example, namely non-linear arithmetic. It can sometimes be difficult to navigate around.
Second, the trick of using function Dummy can be simplified. It suffices (at least in this example) to mention the Pow call, for example like this:
lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b)
  ensures Pow(a, b) == Pow(a*a, b/2)
{
  if b != 0 {
    var dummy := Pow(a, b - 2);
  }
}
Still, I like the other two manual proofs better, because they do a better job of explaining to the user what the proof is.
Rustan
Dafny fails to prove the lemma due to a combination of two possible sources of incompleteness: recursive definitions (here Pow) and induction. The proof effectively fails because of too little information, i.e. because the problem is underconstrained, which in turn explains why a counterexample can be found.
Induction
Automating induction is difficult because it requires computing an induction hypothesis, which is not always possible. However, Dafny has some heuristics for applying induction (which might or might not work), and these can be switched off, as in the following code:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    EvenPowerLemma_manual(a, b - 2);
  }
}
With the heuristics switched off, you need to manually "call" the lemma, i.e. use the induction hypothesis (here, only in the case where b >= 2), in order to get the proof through.
In your case, the heuristics were activated, but they were not "good enough" to get the proof done. I'll explain why next.
Recursive definitions
Reasoning statically about recursive definitions by unfolding them is prone to infinite descent because it is in general undecidable when to stop. Hence, by default Dafny unrolls function definitions only once. In your example, unrolling the definition of Pow only once is not enough to get the induction heuristics to work because the induction hypothesis must be applied to Pow(a, b-2), which does not "appear" in the proof (since unrolling once only gets you to Pow(a, b - 1)). Explicitly mentioning Pow(a, b-2) in the proof, even in an otherwise meaningless formula, triggers the induction heuristics, however:
function Dummy(a: int): bool
{ true }

lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    assert Dummy(Pow(a, b - 2));
  }
}
The Dummy function is there to make sure that the assertion provides no information beyond syntactically including Pow(a, b-2). A less odd-looking assertion would be assert Pow(a, b) == a * a * Pow(a, b - 2).
Calculational Proof
FYI: You can also make the proof steps explicit and have Dafny check them:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    calc {
      Pow(a, b);
      == a * Pow(a, b - 1);
      == a * a * Pow(a, b - 2);
      == {EvenPowerLemma_manual(a, b - 2);}
      a * a * Pow(a*a, (b-2)/2);
      == Pow(a*a, (b-2)/2 + 1);
      == Pow(a*a, b/2);
    }
  }
}