I have been using abstract lemmas and functions (without bodies) in a modeling task. In this example,
lemma py(c: int) returns (a: int, b: int)
  ensures a*a + b*b == c*c

lemma main(c: int) returns (a: int, b: int)
  ensures a*a + b*b == c*c
{
  a, b := py(c);
}
calling py in main ensures that the postcondition is true irrespective of how py is implemented. I have two questions:
Is it safe to use abstract lemmas/functions in Dafny? The following modification to the above code is verified by Dafny:
lemma py(c: int) returns (a: int, b: int)
  ensures a*a + b*b == c*c
  ensures a*a + b*b != c*c
while I think that Dafny should perhaps have thrown an error here.
Should I say lemma {:axiom} py(...)? I haven't observed a difference when including {:axiom} or {:imported}.
As James mentions, a lemma without a body can be useful when modeling behavior that you're not implementing. If you don't give a body, the verifier does not attempt to verify the correctness of the lemma. Therefore, it is essentially an axiom.
Even without the /noCheating flag that James mentions, the Dafny compiler will complain about body-less lemmas (and methods and functions). Note that the compiler does not kick in until after the verifier is satisfied. The {:axiom} attribute says you're willing to accept responsibility for the truth of the lemma. For a body-less non-ghost method, you can also use the {:extern} attribute, which lets you link with code written in other languages. Again, you assume responsibility for the correctness of that external code, since the Dafny verifier won't check it.
Rustan
Methods and functions without bodies are indeed useful when modeling parts of a system that you're not implementing.
However, one has to be careful when giving such methods and functions postconditions, because those become trusted, and are not checked by Dafny. In other words, it is potentially not safe to use lemmas or functions without bodies if they have postconditions.
That said, such methods and functions are indispensable for modeling, so the fact that they are potentially unsafe does not mean you shouldn't use them. Instead, you should just be extra careful when writing down the postconditions, because they will not be checked.
If you pass Dafny the /noCheating:1 flag, it will complain about any methods or functions without bodies that have postconditions, and force you to use the {:axiom} attribute. This can be helpful even when not passing /noCheating:1, just to mark the fact that the postcondition is trusted. It's up to you whether to pass /noCheating:1 or whether to use the attribute anyways.
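For example, the body-less lemma from the question would then be written as follows; the {:axiom} attribute documents that the postcondition is trusted rather than verified:
lemma {:axiom} py(c: int) returns (a: int, b: int)
  ensures a*a + b*b == c*c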
Related
I've seen some online materials for defining algebraic datatypes like an IntList in Z3. I'm wondering how to define an algebraic datatype with logical constraints. For example, how to define a PosSort that stands for positive integers.
Total functions in SMT
Functions are always total in SMT, which raises the question of how to encode partial functions, such as a data type constructor for PosSort. Thus, I would be surprised if Z3's/SMT's built-in support for algebraic data types supported partial data type constructors (and the SMT-LIB 2.6 standard appears to agree).
Encoding partial functions: the theory
However, not all hope is lost, but you'll probably have to encode ADTs yourself. Assume a total function f: A -> B, which should model a partial data type constructor function f': A ~> B whose domain is all a that satisfy p(a). Here, A could be Int, B could be List[A], p(a) could be 0 < a, and f(a) could be defined as f(a) := a :: Nil (I am using pseudo-code here, but you should get the idea).
One approach is to ensure that f is never applied to an a that is not positive. Depending on where your SMT code comes from, it might be possible to check that constraint before each application of f (and to raise an error if f isn't applicable).
The other approach is to underspecify f and conditionally define it, e.g. along the lines of 0 < a ==> f(a) := a :: Nil. This way, f remains total (which, as said before, you'll most likely have to live with), but its value is undefined for a <= 0. Hence, when you try to prove something about f(a), e.g. that head(f(a)) == a, then this should fail (assuming that head(a :: _) is defined as a).
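To illustrate underspecification concretely, here is a small SMT-LIB sketch of my own (the names are made up): f is axiomatized only for positive arguments, so a query that relies on f(-1) cannot be proved.
(declare-sort MyList 0)
(declare-fun f (Int) MyList)    ; total function modeling the partial constructor
(declare-fun head (MyList) Int)
; conditional definition: f is only specified for positive arguments
(assert (forall ((a Int)) (=> (< 0 a) (= (head (f a)) a))))
; attempt to refute head(f(-1)) == -1; expect sat or unknown (not unsat),
; i.e. the property cannot be proved for a non-positive argument
(assert (not (= (head (f (- 1))) (- 1))))
(check-sat)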
Encoding partial functions: a practical example
I am too lazy to code up an example in SMT, but this encoding of an integer list (in a verification language called Viper) should give you a very concrete idea of how to encode an integer list using uninterpreted functions and axioms. The example can basically be translated to SMT-LIB in a one-to-one manner.
Changing that example so that it axiomatises a list of positive integers is straightforward: just add the constraint 0 < head to every axiom that talks about list heads. I.e. use the following alternative axioms:
axiom destruct_over_construct_Cons {
  forall head: Int, tail: list :: {Cons(head, tail)}
    0 < head ==>
         head_Cons(Cons(head, tail)) == head
      && tail_Cons(Cons(head, tail)) == tail
}

...

axiom type_of_Cons {
  forall head: Int, tail: list ::
    0 < head ==> type(Cons(head, tail)) == type_Cons()
}
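As an aside, to give an idea of the one-to-one translation to SMT-LIB, the first of these axioms could be rendered roughly as follows (my sketch, not part of the original example):
(declare-sort list 0)
(declare-fun Cons (Int list) list)
(declare-fun head_Cons (list) Int)
(declare-fun tail_Cons (list) list)
; destruct-over-construct, guarded by the positivity constraint on the head
(assert (forall ((head Int) (tail list))
  (! (=> (< 0 head)
         (and (= (head_Cons (Cons head tail)) head)
              (= (tail_Cons (Cons head tail)) tail)))
     :pattern ((Cons head tail)))))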
If you run the example online with these changes, the test method test_quantifiers() should fail immediately. Adding the necessary constraints on the list elements, i.e. changing it to
method test_quantifiers() {
  /* The elements of a deconstructed Cons are equivalent
     to the corresponding arguments of Cons */
  assert forall head: Int, tail: list, xs: list ::
    0 < head ==>
      is_Cons(xs) ==> (head == head_Cons(xs) && tail == tail_Cons(xs) <==> Cons(head, tail) == xs)

  /* Two Cons are equal iff their constructors' arguments are equal */
  assert forall head1: Int, head2: Int, tail1: list, tail2: list ::
    (0 < head1 && 0 < head2) ==>
      (Cons(head1, tail1) == Cons(head2, tail2)
         <==>
       head1 == head2 && tail1 == tail2)
}
should make the verification succeed again.
What you are looking for is called predicate subtyping; as far as I know, Yices is the only SMT solver that supported it out of the box: http://yices.csl.sri.com/old/language.shtml
In particular, see the examples here: http://yices.csl.sri.com/old/language.shtml#language_dependent_types
Unfortunately, this is "old" Yices, and I don't think this particular input-language is supported any longer. As Malte mentioned, SMTLib doesn't have support for predicate subtyping either.
Assuming your output SMTLib is "generated," you can insert "checks" to make sure all elements remain within the domain. But this is rather cumbersome and it is not clear how to deal with partiality. Underspecification is a nice trick, but it can get really hairy and lead to specifications that are very hard to debug.
If you really need predicate subtyping, perhaps SMT solvers are not the best choice for your problem domain. Theorem provers, dependently typed languages, etc. might be more suitable. A practical example, for instance, is the LiquidHaskell system for Haskell programs, which allows predicates to be attached to types to do precisely what you are trying; and uses an SMT-solver to discharge the relevant conditions: https://ucsd-progsys.github.io/liquidhaskell-blog/
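To give a flavor of that style, here is a minimal LiquidHaskell sketch (the refinement-type annotations live in special {-@ ... @-} comments; incr is a made-up example):
{-@ type Pos = {v:Int | 0 < v} @-}

-- The refinement says: incr maps positive integers to positive integers.
-- LiquidHaskell discharges the proof obligations with an SMT solver.
{-@ incr :: Pos -> Pos @-}
incr :: Int -> Int
incr x = x + 1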
If you want to stick to SMT-solvers and don't mind using an older system, I'd recommend Yices with its support for predicate subtyping for modeling such problems. It was (and still is) one of the finest implementations of this very idea in the context of SMT-solving.
Dafny has no problem with this definition of a set intersection function.
function method intersection(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A && x in B
}
But when it comes to union, Dafny complains, "a set comprehension must produce a finite set, but Dafny's heuristics can't figure out how to produce a bounded set of values for 'x'". A and B are finite, and so, clearly, the union is too.
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A || x in B
}
What explains this behavior, which seems inconsistent to a beginner?
This is indeed potentially surprising!
First, let me note that in practice, Dafny has built-in operators for intersection and union that it knows preserve finiteness. So you don't need to use set comprehensions to express these ideas. Instead you could just say A * B and A + B respectively.
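For instance, both of the functions from the question can be written with the built-in operators, which Dafny accepts without any finiteness complaints:
function method intersection(A: set<int>, B: set<int>): set<int>
{
  A * B  // built-in set intersection
}

function method union(A: set<int>, B: set<int>): set<int>
{
  A + B  // built-in set union; Dafny knows this preserves finiteness
}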
However, my guess is that you're running into a more complicated example where you're using a set comprehension with a disjunction and are confused about why Dafny can't prove it finite.
Dafny uses syntactic heuristics to determine whether a set comprehension is finite. Unfortunately, these heuristics are not well documented anywhere. For purposes of this question, the key point is that the heuristics either depend on the type of the comprehension's bound variables, or look for a conjunct that constrains elements to be bounded in some other way. For example, Dafny can prove
set x: int | 0 <= x < 10 && ...
finite, as well as
set x:A | x in S && ...
In both cases, it is essential that the relevant bounds be conjuncts. Dafny has no syntactic heuristic for proving a bound for disjunctions, although one could imagine adding one. That is why Dafny cannot prove your union function finite.
As an aside, another workaround would be to use potentially infinite sets (written iset in Dafny). If you don't need to use the cardinality of the sets, then these might work better.
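For example (a sketch; since iset is a ghost type, this is a plain function rather than a function method):
function union(A: iset<int>, B: iset<int>): iset<int>
{
  // No finiteness check is needed for a potentially infinite set comprehension.
  iset x | x in A || x in B
}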
After reading Getting Started with Dafny: A Guide, I decided to create my first program: given a sequence of integers, compute the sum of its elements. However, I am having a hard time in getting Dafny to verify the program.
function G(a: seq<int>): int
  decreases |a|
{
  if |a| == 0 then 0 else a[0] + G(a[1..])
}

method sum(a: seq<int>) returns (r: int)
  ensures r == G(a)
{
  r := 0;
  var i: int := 0;
  while i < |a|
    invariant 0 <= i <= |a|
    invariant r == G(a[0..i])
  {
    r := r + a[i];
    i := i + 1;
  }
}
I get
stdin.dfy(12,2): Error BP5003: A postcondition might not hold on this return path.
stdin.dfy(8,12): Related location: This is the postcondition that might not hold.
stdin.dfy(14,16): Error BP5005: This loop invariant might not be maintained by the loop.
I suspect that Dafny needs some "help" in order to verify the program (lemmas maybe?) but I do not know where to start.
Here is a version of your program that verifies.
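(The code below is my reconstruction of such a version, following the steps explained afterwards; the lemma name G_append and the exact assertions are one possible choice.)
function G(a: seq<int>): int
  decreases |a|
{
  if |a| == 0 then 0 else a[0] + G(a[1..])
}

lemma G_append(a: seq<int>, b: seq<int>)
  ensures G(a + b) == G(a) + G(b)
  decreases |a|
{
  // The base case a == [] is handled automatically.
  if a != [] {
    calc {
      G(a + b);
      == (a + b)[0] + G((a + b)[1..]);
      == { assert (a + b)[1..] == a[1..] + b; }
         a[0] + G(a[1..] + b);
      == { G_append(a[1..], b); }  // induction hypothesis
         a[0] + G(a[1..]) + G(b);
      == G(a) + G(b);
    }
  }
}

method sum(a: seq<int>) returns (r: int)
  ensures r == G(a)
{
  r := 0;
  var i: int := 0;
  while i < |a|
    invariant 0 <= i <= |a|
    invariant r == G(a[..i])
  {
    calc {
      r + a[i];
      == G(a[..i]) + a[i];
      == G(a[..i]) + G([a[i]]);
      == { G_append(a[..i], [a[i]]); }
         G(a[..i] + [a[i]]);
      == { assert a[..i] + [a[i]] == a[..i+1]; }
         G(a[..i+1]);
    }
    r := r + a[i];
    i := i + 1;
  }
  assert a == a[..|a|];
}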
There were two things to fix: the proof that the postcondition follows after the loop, and the proof that the loop invariant is preserved.
The postcondition
Dafny needs a hint that it might be helpful to try to prove a == a[..|a|]. Asserting that equality is enough to finish this part of the proof: Dafny automatically proves the equality and uses it to prove the postcondition from the loop invariant.
This is a common pattern. You can try to see what is bothering Dafny by doing the proof "by hand" in Dafny by making various assertions that you would use to prove it to yourself on paper.
The loop invariant
This one is a bit more complicated. We need to show that updating r and incrementing i preserves r == G(a[..i]). To do this, I used a calc statement, which lets one prove an equality via a sequence of intermediate steps. (It is always possible to prove such things without calc, if you prefer, by asserting all the relevant equalities as well as any assertions inside the calc. But I think calc is nicer.)
I placed the calc statement before the updates to r and i occur. I know that after the updates occur, I will need to prove r == G(a[..i]) for the updated values of r and i. Thus, before the updates occur, it suffices to prove r + a[i] == G(a[..i+1]) for the un-updated values. My calc statement starts with r + a[i] and works toward G(a[..i+1]).
First, by the loop invariant on entry to the loop, we know that r == G(a[..i]) for the current values.
Next, we want to bring the a[i] inside the G. This fact is not entirely trivial, so we need a lemma. I chose to prove something slightly more general than necessary, namely that G(a + b) == G(a) + G(b) for any integer sequences a and b. I call this lemma G_append. Its proof is discussed below. For now, we just use it to bring the a[i] inside as a singleton sequence.
The last step in this calc is to combine a[..i] + [a[i]] into a[..i+1]. This is another sequence extensionality fact, and thus needs to be asserted explicitly.
That completes the calc, which proves the invariant is preserved.
The lemma
The proof of G_append proceeds by induction on a. The base case where a == [] is handled automatically. In the inductive case, we need to show G(a + b) == G(a) + G(b), assuming the induction hypothesis for any subsequences of a. I use another calc statement for this.
Beginning with G(a + b), we first expand the definition of G. Next, we note that (a + b)[0] == a[0] since a != []. Similarly, we have that (a + b)[1..] == a[1..] + b, but since this is another sequence extensionality fact, it must be explicitly asserted. Finally, we can use the induction hypothesis (automatically invoked by Dafny) to show that G(a[1..] + b) == G(a[1..]) + G(b).
I am working with Boogie and I have come across some behaviors I do not understand.
I have been using assert(false) as a way to check if the previous assume statements are absurd.
For instance in the following case, the program is verified without errors...
type T;
const t1, t2: T;

procedure test()
{
  assume (t1 == t2);
  assume (t1 != t2);
  assert(false);
}
...as t1 == t2 && t1 != t2 is an absurd statement.
On the other hand if I have something like
type T;
var map: [T]bool;
const t1, t2: T;

procedure test()
{
  assume(forall a1: T, a2: T :: !map[a1] && map[a2]);
  //assert(!map[t1]);
  assert(false);
}
The assert(false) only fails when the commented line is uncommented. Why is the commented assert changing the result of the assert(false)?
Gist: the SMT solver underlying Boogie will not instantiate the quantifier if you don't mention a ground instance of map[...] in your program.
Here is why: SMT solvers (that use e-matching) typically use syntactic heuristics to decide when to instantiate a quantifier. Consider the following quantifier:
forall i: int :: f(i)
This quantifier admits infinitely many instantiations (since i ranges over an unbounded domain); trying them all would thus result in non-termination. Instead, SMT solvers expect syntactic hints instructing them for which i a quantifier should be instantiated. These hints are called patterns or triggers. In Boogie, they can be written as follows:
forall i: int :: {f(i)} f(i)
This trigger instructs the SMT solver to instantiate the quantifier for each i for which f(i) is mentioned in the program (or rather, current proof search). E.g., if you assume f(5), then the quantifier will be instantiated with 5 substituted for i.
In your example, you don't provide a pattern explicitly, so the SMT solver might pick one for you, by inspecting the quantifier body. It will most likely pick {map[a1], map[a2]} (multiple function applications are allowed, patterns must cover all quantified variables). If you uncomment the assume, the ground term map[t1] becomes available, and the SMT solver can instantiate the quantifier with a1, a2 mapped to t1, t1. Hence, the contradiction is obtained.
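Concretely, here is a variant of your second example with the pattern written explicitly (a sketch; the behavior is the same with the automatically inferred pattern):
type T;
var map: [T]bool;
const t1, t2: T;

procedure test()
{
  // Explicit multi-pattern: instantiate the quantifier whenever ground terms
  // matching map[a1] and map[a2] occur in the proof search.
  assume (forall a1: T, a2: T :: {map[a1], map[a2]} !map[a1] && map[a2]);
  assert (!map[t1]); // mentions the ground term map[t1] ...
  assert (false);    // ... so the quantifier is instantiated (a1 := t1, a2 := t1)
                     // and the contradiction !map[t1] && map[t1] is available
}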
See the Z3 guide for more details on patterns. More involved treatments of patterns can be found in the research literature on e-matching.
For a correct method, can Z3 find a model for the method's verification condition?
I had thought not, but here is an example where the method is correct yet verification finds a model.
This was with Dafny 1.9.7.
What Malte says is correct (and I found it nicely explained as well).
Dafny is sound, in the sense that it will only verify correct programs. In other words, if a program is incorrect, the Dafny verifier will never say that it is correct. However, the underlying decision problems are in general undecidable. Therefore, unavoidably, there will be cases where a program meets its specifications and the verifier still gives an error message.

Indeed, in such cases, the verifier may even show a purported counterexample. It may be a false counterexample (as in the example above) -- it simply means that, as far as the verifier can tell, this is a counterexample. If the verifier just spent a little more time, or if it were clever enough to unroll more function definitions, apply induction hypotheses, or do a host of other good-things-to-do, it might be able to determine that the counterexample is bogus. So, any error message you get (including any counterexample that may accompany it) should be interpreted as a possible error (and a possible counterexample).
Similar situations frequently occur if you're trying to verify the correctness of a loop and you don't supply a strong enough loop invariant. The Dafny verifier may then show some values of variables on entry to the loop that can never occur in actuality. The counterexample is then trying to give you an idea of how to strengthen your loop invariant appropriately.
Finally, let me add two notes to what Malte said.
First, there's at least another source of incompleteness involved in this example, namely non-linear arithmetic. It can sometimes be difficult to navigate around.
Second, the trick of using function Dummy can be simplified. It suffices (at least in this example) to mention the Pow call, for example like this:
lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b)
  ensures Pow(a, b) == Pow(a*a, b/2)
{
  if b != 0 {
    var dummy := Pow(a, b - 2);
  }
}
Still, I like the other two manual proofs better, because they do a better job of explaining to the user what the proof is.
Rustan
Dafny fails to prove the lemma due to a combination of two possible sources of incompleteness: recursive definitions (here Pow) and induction. The proof effectively fails because of too little information, i.e. because the problem is underconstrained, which in turn explains why a counterexample can be found.
Induction
Automating induction is difficult because it requires computing an induction hypothesis, which is not always possible. However, Dafny has some heuristics for applying induction (that might or might not work), which can be switched off, as in the following code:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    EvenPowerLemma_manual(a, b - 2);
  }
}
With the heuristics switched off, you need to manually "call" the lemma, i.e. use the induction hypothesis (here, only in the case where b >= 2), in order to get the proof through.
In your case, the heuristics were activated, but they were not "good enough" to get the proof done. I'll explain why next.
Recursive definitions
Reasoning statically about recursive definitions by unfolding them is prone to infinite descent because it is in general undecidable when to stop. Hence, Dafny by default unrolls function definitions only once. In your example, unrolling the definition of Pow only once is not enough to get the induction heuristics to work because the induction hypothesis must be applied to Pow(a, b-2), which does not "appear" in the proof (since unrolling once only gets you to Pow(a, b - 1)). Explicitly mentioning Pow(a, b-2) in the proof, even in an otherwise meaningless formula, triggers the induction heuristics, however:
function Dummy(a: int): bool
{ true }

lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    assert Dummy(Pow(a, b - 2));
  }
}
The Dummy function is there to make sure that the assertion provides no information beyond syntactically including Pow(a, b-2). A less oddly-looking assertion would be assert Pow(a, b) == a * a * Pow(a, b - 2).
Calculational Proof
FYI: You can also make the proof steps explicit and have Dafny check them:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    calc {
      Pow(a, b);
      == a * Pow(a, b - 1);
      == a * a * Pow(a, b - 2);
      == { EvenPowerLemma_manual(a, b - 2); }
         a * a * Pow(a*a, (b-2)/2);
      == Pow(a*a, (b-2)/2 + 1);
      == Pow(a*a, b/2);
    }
  }
}