Surprising Dafny failure to verify boundedness of set comprehension

Dafny has no problem with this definition of a set intersection function.
function method intersection(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A && x in B
}
But when it comes to union, Dafny complains, "a set comprehension must produce a finite set, but Dafny's heuristics can't figure out how to produce a bounded set of values for 'x'". A and B are finite, so clearly the union is too.
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A || x in B
}
What explains this behavior, which seems inconsistent to a beginner?

This is indeed potentially surprising!
First, let me note that in practice, Dafny has built-in operators for intersection and union that it knows preserve finiteness. So you don't need to use set comprehensions to express these ideas. Instead you could just say A * B and A + B respectively.
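For instance (a minimal sketch, mirroring your signatures):

function method intersectionOp(A: set<int>, B: set<int>): set<int>
{
  A * B  // built-in intersection
}
function method unionOp(A: set<int>, B: set<int>): set<int>
{
  A + B  // built-in union, so finiteness is immediate
}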
However, my guess is that you're running into a more complicated example where you're using a set comprehension with a disjunction and are confused about why Dafny can't prove it finite.
Dafny uses syntactic heuristics to determine whether a set comprehension is finite. Unfortunately, these heuristics are not well documented anywhere. For purposes of this question, the key point is that the heuristics either depend on the type of the comprehension's bound variables, or look for a conjunct that constrains elements to be bounded in some other way. For example, Dafny can prove
set x: int | 0 <= x < 10 && ...
finite, as well as
set x:A | x in S && ...
In both cases, it is essential that the relevant bounds be conjuncts. Dafny has no syntactic heuristic for proving a bound for disjunctions, although one could imagine adding one. That is why Dafny cannot prove your union function finite.
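If you do want to keep the comprehension style, one sketch of a workaround is to restate the range as a single membership conjunct over a set Dafny already knows is finite:

function method unionComprehension(A: set<int>, B: set<int>): set<int>
{
  // x in A + B is one bounding conjunct, so the syntactic heuristic accepts it
  set x | x in A + B
}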
As an aside, another workaround would be to use potentially infinite sets (written iset in Dafny). If you don't need to use the cardinality of the sets, these might work better.

Related

What is the operator "(>=)" called in this context?

Now I understand that the first line of code can be shortened to the second one. This is the first time I'm running into it and I can't find any documentation on what the operator is called. Is it an abstract concept that can be used for other things as well?
let contains optValue value =
    Option.exists (fun v -> v >= value) optValue

let contains optValue value =
    Option.exists ((>=) value) optValue
You've already been told that the second example should have been (=) for your two functions to be equivalent, so I won't go over that. But I want to warn you that using the >= operator in this way might work differently than what you expect. The underlying reason has to do with how F# does partial application, and https://fsharpforfunandprofit.com/series/thinking-functionally.html is the best reference for that. (The relevant parts are the articles on currying and partial application, but you'll want to read the whole thing in order since later articles build on concepts explained in earlier articles).
Now, if you've read those articles, you know that F# allows partial application of functions: if you have a two-parameter function f a b, but you call it with just one parameter f a, the result is a function that is expecting a parameter b, and when it receives that, it executes f a b. When you wrap an operator in parentheses, F# treats it like a function, so when you do (>=) value, you get a function that's awaiting a second parameter x, and will then evaluate (>=) value x. And when F# evaluates (op) a b, whatever the operator is, it's the same as a op b, so (>=) value x is the same as value >= x.
And that's the bit that trips most people up at first. Because when you read Option.exists ((>=) value) optValue, you naturally want to read it as "Does the option contain something greater than or equal to value?" But in fact, it's actually saying "Does the option contain a value x such that value >= x is true?", i.e., something less than or equal to value.
So the simple rules of partial application, applied consistently, can lead to unexpected results with greater-than or less-than operators, or in fact any operator that isn't commutative. Be aware of this, and if you want to use partial application with non-commutative operators, double-check your logic.
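If you do want the "greater than or equal" reading while still using partial application, you have to swap the arguments yourself. A small sketch (the flip helper is illustrative, not a standard library function):

// the explicit lambda leaves no doubt which side value is on
let containsGE optValue value =
    Option.exists (fun v -> v >= value) optValue

// flip swaps an operator's arguments: flip (>=) value is (fun x -> x >= value)
let flip f a b = f b a
let containsGE' optValue value =
    Option.exists (flip (>=) value) optValue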

Defining algebraic datatypes with constraints in Z3

I've seen some online materials for defining algebraic datatypes like an IntList in Z3. I'm wondering how to define an algebraic datatype with logical constraints. For example, how to define a PosSort that stands for positive integers.
Total functions in SMT
Functions are always total in SMT, which raises the question of how to encode partial functions such as a data type constructor for PosSort. Thus, I would be surprised if Z3's/SMT's built-in support for algebraic data types supported partial data type constructors (and the SMT-LIB 2.6 standard appears to agree).
Encoding partial functions: the theory
However, not all hope is lost, but you'll probably have to encode ADTs yourself. Assume a total function f: A -> B, which should model a partial data type constructor function f': A ~> B whose domain is all a that satisfy p(a). Here, A could be Int, B could be List[A], p(a) could be 0 < a and f(a) could be defined as f(a) := a :: Nil (I am using pseudo-code here, but you should get the idea).
One approach is to ensure that f is never applied to an a that is not positive. Depending on where your SMT code comes from, it might be possible to check that constraint before each application of f (and to raise an error if f isn't applicable).
The other approach is to underspecify f and conditionally define it, e.g. along the lines of 0 < a ==> f(a) := a :: Nil. This way, f remains total (which, as said before, you'll most likely have to live with), but its value is undefined for a <= 0. Hence, when you try to prove something about f(a), e.g. that head(f(a)) == a, then this should fail (assuming that head(a :: _) is defined as a).
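Here is a minimal SMT-LIB sketch of the underspecification approach (all sort and function names are made up for illustration):

; positive-integer lists via an uninterpreted sort and functions
(declare-sort PList 0)
(declare-const nil PList)
(declare-fun cons (Int PList) PList)
(declare-fun head (PList) Int)
; head is only pinned down when the head element is positive
(assert (forall ((a Int) (l PList))
  (=> (> a 0) (= (head (cons a l)) a))))
; provable: (= (head (cons 1 nil)) 1)
; not provable: anything about (head (cons (- 1) nil))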
Encoding partial functions: a practical example
I am too lazy to code up an example in SMT, but this encoding of an integer list (in a verification language called Viper) should give you a very concrete idea of how to encode an integer list using uninterpreted functions and axioms. The example can basically be translated to SMT-LIB in a one-to-one manner.
Changing that example so that it axiomatises a list of positive integers is straightforward: just add the constraint 0 < head to every axiom that talks about list heads. I.e. use the following alternative axioms:
axiom destruct_over_construct_Cons {
  forall head: Int, tail: list :: {Cons(head, tail)}
    0 < head ==>
      head_Cons(Cons(head, tail)) == head
      && tail_Cons(Cons(head, tail)) == tail
}

...

axiom type_of_Cons {
  forall head: Int, tail: list ::
    0 < head ==> type(Cons(head, tail)) == type_Cons()
}
If you run the example online with these changes, the test method test_quantifiers() should fail immediately. Adding the necessary constraints on the list elements, i.e. changing it to
method test_quantifiers() {
  /* The elements of a deconstructed Cons are equivalent to the corresponding arguments of Cons */
  assert forall head: Int, tail: list, xs: list ::
    0 < head ==>
      is_Cons(xs) ==> (head == head_Cons(xs) && tail == tail_Cons(xs) <==> Cons(head, tail) == xs)

  /* Two Cons are equal iff their constructors' arguments are equal */
  assert forall head1: Int, head2: Int, tail1: list, tail2: list ::
    (0 < head1 && 0 < head2) ==>
      (Cons(head1, tail1) == Cons(head2, tail2)
       <==>
       head1 == head2 && tail1 == tail2)
}
should make the verification succeed again.
What you are looking for is called predicate-subtyping; and as far as I know Yices is the only SMT solver that supported it out of the box: http://yices.csl.sri.com/old/language.shtml
In particular, see the examples here: http://yices.csl.sri.com/old/language.shtml#language_dependent_types
Unfortunately, this is "old" Yices, and I don't think this particular input-language is supported any longer. As Malte mentioned, SMTLib doesn't have support for predicate subtyping either.
Assuming your output SMTLib is "generated," you can insert "checks" to make sure all elements remain within the domain. But this is rather cumbersome and it is not clear how to deal with partiality. Underspecification is a nice trick, but it can get really hairy and lead to specifications that are very hard to debug.
If you really need predicate subtyping, perhaps SMT solvers are not the best choice for your problem domain. Theorem provers, dependently typed languages, etc. might be more suitable. A practical example is the LiquidHaskell system for Haskell programs, which allows predicates to be attached to types to do precisely what you are trying to do, and uses an SMT solver to discharge the relevant conditions: https://ucsd-progsys.github.io/liquidhaskell-blog/
If you want to stick to SMT-solvers and don't mind using an older system, I'd recommend Yices with its support for predicate subtyping for modeling such problems. It was (and still is) one of the finest implementations of this very idea in the context of SMT-solving.

Erlang implementing an amb operator.

On Wikipedia it says that using call/cc you can implement the amb operator for nondeterministic choice. My question is: how would you implement the amb operator in a language in which the only support for continuations is to write in continuation-passing style, like in Erlang?
If you can encode the constraints for what constitutes a successful solution or choice as guards, list comprehensions can be used to generate solutions. For example, the list comprehension documentation shows an example of solving Pythagorean triples, which is a problem frequently solved using amb (see for example exercise 4.35 of SICP, 2nd edition). Here's the more efficient solution, pyth1/1, shown on the list comprehensions page:
pyth1(N) ->
    [ {A,B,C} ||
        A <- lists:seq(1,N-2),
        B <- lists:seq(A+1,N-1),
        C <- lists:seq(B+1,N),
        A+B+C =< N,
        A*A+B*B == C*C
    ].
One important aspect of amb is efficiently searching the solution space, which is done here by generating possible values for A, B, and C with lists:seq/2 and then constraining and testing those values with guards. Note that the page also shows a less efficient solution named pyth/1 where A, B, and C are all generated identically using lists:seq(1,N); that approach generates all permutations but is slower than pyth1/1 (for example, on my machine, pyth(50) is 5-6x slower than pyth1(50)).
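You can reproduce such a comparison with timer:tc/1 (a rough sketch; it assumes both functions live in a module named pyth_lc, and absolute numbers will of course vary):

%% timer:tc/1 returns {Microseconds, Result}
{TPyth, _} = timer:tc(fun() -> pyth_lc:pyth(50) end),
{TPyth1, _} = timer:tc(fun() -> pyth_lc:pyth1(50) end),
io:format("pyth/1: ~p us, pyth1/1: ~p us~n", [TPyth, TPyth1]).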
If your constraints can't be expressed as guards, you can use pattern matching and try/catch to deal with failing solutions. For example, here's the same algorithm in pyth/1 rewritten as regular functions triples/1 and the recursive triples/5:
-module(pyth).
-export([triples/1]).

triples(N) ->
    triples(1,1,1,N,[]).

triples(N,N,N,N,Acc) ->
    lists:reverse(Acc);
triples(N,N,C,N,Acc) ->
    triples(1,1,C+1,N,Acc);
triples(N,B,C,N,Acc) ->
    triples(1,B+1,C,N,Acc);
triples(A,B,C,N,Acc) ->
    NewAcc = try
                 true = A+B+C =< N,
                 true = A*A+B*B == C*C,
                 [{A,B,C}|Acc]
             catch
                 error:{badmatch,false} ->
                     Acc
             end,
    triples(A+1,B,C,N,NewAcc).
We're using pattern matching for two purposes:
In the function heads, to control values of A, B and C with respect to N and to know when we're finished
In the body of the final clause of triples/5, to assert that conditions A+B+C =< N and A*A+B*B == C*C match true
If both conditions match true in the final clause of triples/5, we insert the solution into our accumulator list, but if either fails to match, we catch the badmatch error and keep the original accumulator value.
Calling triples/1 yields the same result as the list comprehension approaches used in pyth/1 and pyth1/1, but it's also half the speed of pyth/1. Even so, with this approach any constraint could be encoded as a normal function and tested for success within the try/catch expression.
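For example, from the Erlang shell (a sketch; for N of 12, the only solutions are the two orderings of {3,4,5}):

1> c(pyth).
{ok,pyth}
2> pyth:triples(12).
[{4,3,5},{3,4,5}]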

Coq: typeclasses vs dependent records

I can't understand the difference between typeclasses and dependent records in Coq. The reference manual gives the syntax of typeclasses, but says nothing about what they really are and how you should use them. A bit of thinking and searching reveals that typeclasses essentially are dependent records with a bit of syntactic sugar that allows Coq to automatically infer some implicit instances and parameters. It seems that the algorithm for typeclasses works better when there is more or less only one possible instance of a class in any given context, but that's not a big issue, since we can always move all fields of a typeclass to its parameters, removing ambiguity. Also, the Instance declaration is automatically added to the hint database, which can often ease proofs but will also sometimes break them, if the instances are too general and cause proof-search loops or explosions. Are there any other issues I should be aware of? What is the heuristic for choosing between the two? E.g. would I lose anything if I use only records and set their instances as implicit parameters whenever possible?
You are right: type classes in Coq are just records with special plumbing and inference (there's also the special case of single-method type classes, but it doesn't really affect this answer in any way). Therefore, the only reason you would choose type classes over "pure" dependent records is to benefit from the special inference that you get with them: inference with plain dependent records is not very powerful and doesn't allow you to omit much information.
As an example, consider the following code, which defines a monoid type class, instantiating it with natural numbers:
Class monoid A := Monoid {
  op : A -> A -> A;
  id : A;
  opA : forall x y z, op x (op y z) = op (op x y) z;
  idL : forall x, op id x = x;
  idR : forall x, op x id = x
}.

Require Import Arith.

Instance nat_plus_monoid : monoid nat := {|
  op := plus;
  id := 0;
  opA := plus_assoc;
  idL := plus_O_n;
  idR := fun n => eq_sym (plus_n_O n)
|}.
Using type class inference, we can use any definitions that work for any monoid directly with nat, without supplying the type class argument, e.g.
Definition times_3 (n : nat) := op n (op n n).
However, if you make the above definition into a regular record by replacing Class and Instance by Record and Definition, the same definition fails:
Toplevel input, characters 38-39:
Error: In environment
n : nat
The term "n" has type "nat" while it is expected to have type "monoid ?11".
The only caveat with type classes is that the instance inference engine gets a bit lost sometimes, causing hard-to-understand error messages to appear. That being said, it's not really a disadvantage over dependent records, given that this possibility isn't even available there.
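For comparison, here is a sketch of what the plain record version forces on you (assuming the Record/Definition variant of the code above): the structure has to be passed explicitly at every use site.

(* with Record, the projection op takes the monoid structure as an explicit argument *)
Definition times_3_rec {A} (M : monoid A) (n : A) :=
  op M n (op M n n).

(* call sites must name the structure explicitly *)
Definition six := times_3_rec nat_plus_monoid 2.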

Ranges A to B where A > B in F#

I've just found something I'd call a quirk in F# and would like to know whether it's by design or by mistake, and if it's by design, why it is so...
If you write any range expression where the first term is greater than the second term, the returned sequence is empty. A look at Reflector suggests this is by design, but I can't really find a reason why it would have to be so.
An example to reproduce it is:
[1..10] |> List.length
[10..1] |> List.length
The first evaluates to 10 while the second evaluates to 0.
Tests were made in F# CTP 1.9.6.2.
EDIT: thanks for suggesting making the range explicit, but there's still one case (which is what inspired me to ask this question) that won't be covered. What if A and B are variables and neither is always greater than the other, although they're always different?
Considering that the range expression does not seem to get optimized at compile time anyway, is there any good reason why the code that determines the step (when it's not explicitly specified) for int ranges doesn't allow negative steps?
As suggested by other answers, you can do
[10 .. -1 .. 1] |> List.iter (printfn "%A")
e.g.
[start .. step .. stop]
Adam Wright - But you should be able to change the binding for types you're interested in to behave in any way you like (including counting down if x > y).
Taking Adam's suggestion into code:
let (..) a b =
    if a < b then seq { a .. b }
    else seq { a .. -1 .. b }

printfn "%A" (seq { 1 .. 10 })
printfn "%A" (seq { 10 .. 1 })
This works for int ranges. Have a look at the source code for (..): you may be able to use that to make this work over other types of ranges, but I'm not sure how you would get the right value of -1 for your specific type.
What "should" happen is, of course, subjective. Normal range notation in my mind defines [x..y] as the set of all elements greater than or equal to x AND less than or equal to y; an empty set if y < x. In this case, we need to appeal to the F# spec.
Range expressions expr1 .. expr2 are evaluated as a call to the overloaded operator (..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr2) values, using an increment of 1. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
Range expressions expr1 .. expr2 .. expr3 are evaluated as a call to the overloaded operator (.. ..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr3) values, using an increment of expr2. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
The standard doesn't seem to define the .. operator (at least, not that I can find). But you should be able to change the binding for types you're interested in to behave in any way you like (including counting down if x > y).
In Haskell, you can write [10, 9 .. 1]. Perhaps it works the same in F# (I haven't tried it)?
Edit:
It seems that the F# syntax is different, maybe something like [10..-1..1]
Ranges are generally expressed (in the languages and frameworks that support them) like this:
low_value <to> high_value
Can you give a good argument why a range ought to be able to be expressed differently? Since you were requesting a range from a higher number to a lower number does it not stand to reason that the resulting range would have no members?
