Coq: typeclasses vs dependent records

I can't understand the difference between typeclasses and dependent records in Coq. The reference manual gives the syntax of typeclasses, but says nothing about what they really are and how you should use them. A bit of thinking and searching reveals that typeclasses are essentially dependent records with a bit of syntactic sugar that allows Coq to automatically infer some implicit instances and parameters. It seems that the algorithm for typeclasses works better when there is more or less only one possible instance in any given context, but that's not a big issue, since we can always move all fields of a typeclass into its parameters, removing the ambiguity. Also, each Instance declaration is automatically added to the hint database, which can often ease proofs but will also sometimes break them, if the instances are too general and cause proof search loops or explosions. Are there any other issues I should be aware of? What is the heuristic for choosing between the two? E.g., would I lose anything if I used only records and set their instances as implicit parameters whenever possible?

You are right: type classes in Coq are just records with special plumbing and inference (there's also the special case of single-method type classes, but it doesn't really affect this answer in any way). Therefore, the only reason you would choose type classes over "pure" dependent records is to benefit from the special inference that you get with them: inference with plain dependent records is not very powerful and doesn't allow you to omit much information.
As an example, consider the following code, which defines a monoid type class, instantiating it with natural numbers:
Class monoid A := Monoid {
  op : A -> A -> A;
  id : A;
  opA : forall x y z, op x (op y z) = op (op x y) z;
  idL : forall x, op id x = x;
  idR : forall x, op x id = x
}.
Require Import Arith.

Instance nat_plus_monoid : monoid nat := {|
  op := plus;
  id := 0;
  opA := plus_assoc;
  idL := plus_O_n;
  idR := fun n => eq_sym (plus_n_O n)
|}.
Using type class inference, we can use any definitions that work for any monoid directly with nat, without supplying the type class argument, e.g.
Definition times_3 (n : nat) := op n (op n n).
However, if you make the above definition into a regular record by replacing Class and Instance by Record and Definition, the same definition fails:
Toplevel input, characters 38-39:
Error: In environment
n : nat
The term "n" has type "nat" while it is expected to have type "monoid ?11".
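For comparison, here is a minimal sketch (untested) of what the record version requires: the instance, and possibly the type argument, must be passed explicitly.
(* With Record/Definition instead of Class/Instance, op is a plain
   projection, so the record must be supplied by hand: *)
Definition times_3 (n : nat) :=
  op _ nat_plus_monoid n (op _ nat_plus_monoid n n).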
The only caveat with type classes is that the instance inference engine gets a bit lost sometimes, causing hard-to-understand error messages to appear. That being said, it's not really a disadvantage over dependent records, given that this possibility isn't even available there.

Related

F#: Meaning of "type X = Y of Z"

What does type X = Y of Z mean in F#? In particular, does the Y token have any material purpose? If I want to express the fact that X is a type backed by the underlying type Z, why do I need Y in that expression?
According to https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/keyword-reference, the of keyword is used to express one of these:
Discriminated Unions
Delegates
Exception Types
Strictly speaking, according to the docs, type X = Y of Z doesn't fall into ANY of the above 3 categories.
It's not Discriminated Union syntax because it's lacking the pipe | character.
It's not a Delegate, because the delegate keyword is absent.
It's not an Exception Type because the exception keyword is absent.
So it seems like type X = Y of Z is invalid syntax even though it is used pervasively in https://fsharpforfunandprofit.com/posts/conciseness-type-definitions/. Very puzzling - is it that the docs are inaccurate?
Contrary to your intuition, such a declaration is, in fact, a discriminated union, one with just one case. Yes, discriminated unions with only one case are totally legal and are sometimes used for type safety or for hiding implementation details.
If you don't want to have that extra Y in there, that is also legal:
type X = Z
This is called a "type alias". Its advantage over a single-case discriminated union is lower resource consumption (in both performance and memory), but its drawback is that it doesn't provide any extra safety, because X ends up being equivalent to and interchangeable with Z in all contexts.
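To make the trade-off concrete, here is a small sketch (the names are made up):
type UserId = int              // alias: interchangeable with int everywhere
type OrderId = OrderId of int  // single-case union: a distinct type

let uid : UserId = 42          // any int is accepted
let oid : OrderId = OrderId 42 // the value must be wrapped explicitly
// let bad : OrderId = 42      // rejected by the compiler: int is not an OrderId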
It's not Discriminated Union syntax because it's lacking the pipe | character.
It's possible to omit the first pipe in a Discriminated union:
type DU = A of int | B
This is bad style for standard multi-line definitions but can be appropriate for one-liners. In your example there is only one case so it is possible to omit the pipe completely, which admittedly causes confusion. It would improve clarity to include the bar: type X = | Y of Z
In type X = Y of Z, does the Y token have any material purpose?
Y is used to disambiguate between cases and for matching against data. If there are likely to be more cases in the future, then it makes sense to have this scaffolding for matching. If you know that there will only ever be one case, then a DU is not the most appropriate type, since a record type X = { Y: Z } is a simpler construct that achieves the same purpose.
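For instance, taking Z to be int, the case name is what you match on to recover the payload:
type X = Y of int

let value x =
    match x with
    | Y n -> n

// or, destructured directly in the parameter list:
let value' (Y n) = n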

What is the operator "(>=)" called in this context?

Now I understand that the first line of code can be shortened to the second one. This is the first time I'm running into it and I can't find any documentation on what the operator is called. Is it an abstract concept that can be used for other things as well?
let contains optValue value =
    Option.exists (fun v -> v >= value) optValue

let contains optValue value =
    Option.exists ((>=) value) optValue
You've already been told that the second example should have been (=) for your two functions to be equivalent, so I won't go over that. But I want to warn you that using the >= operator in this way might work differently than what you expect. The underlying reason has to do with how F# does partial application, and https://fsharpforfunandprofit.com/series/thinking-functionally.html is the best reference for that. (The relevant parts are the articles on currying and partial application, but you'll want to read the whole thing in order since later articles build on concepts explained in earlier articles).
Now, if you've read those articles, you know that F# allows partial application of functions: if you have a two-parameter function f a b, but you call it with just one parameter f a, the result is a function that is expecting a parameter b, and when it receives that, it executes f a b. When you wrap an operator in parentheses, F# treats it like a function, so when you do (>=) value, you get a function that's awaiting a second parameter x, and will then evaluate (>=) value x. And when F# evaluates (op) a b, whatever the operator is, it's the same as a op b, so (>=) value x is the same as value >= x.
And that's the bit that trips most people up at first. Because when you read Option.exists ((>=) value) optValue, you naturally want to read that as "Does the option contain something greater than or equal to value"? But in fact, it's actually saying "Does the option contain a value x such that value >= x is true?", i.e., something less than or equal to value.
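A concrete illustration (values made up):
let atMost5 = (>=) 5    // partially applied: awaits x, then evaluates 5 >= x

let a = atMost5 3       // true:  5 >= 3
let b = atMost5 7       // false: 5 >= 7

// hence Option.exists ((>=) value) optValue finds elements that are <= value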
So the simple rules of partial application, applied consistently, can lead to unexpected results with greater-than or less-than operators, or in fact any operator that isn't commutative. Be aware of this, and if you want to use partial application with non-commutative operators, double-check your logic.

Defining algebraic datatypes with constraints in Z3

I've seen some online materials for defining algebraic datatypes like an IntList in Z3. I'm wondering how to define an algebraic datatype with logical constraints. For example, how to define a PosSort that stands for positive integers.
Total functions in SMT
Functions are always total in SMT, which raises the question of how to encode partial functions such as a data type constructor for PosSort. Thus, I would be surprised if Z3's/SMT's built-in support for algebraic data types supported partial data type constructors (and the SMT-LIB 2.6 standard appears to agree).
Encoding partial functions: the theory
However, not all hope is lost, although you'll probably have to encode ADTs yourself. Assume a total function f: A -> B, which should model a partial data type constructor function f': A ~> B whose domain is all a that satisfy p(a). Here, A could be Int, B could be List[A], p(a) could be 0 < a, and f(a) could be defined as f(a) := a :: Nil (I am using pseudo-code here, but you should get the idea).
One approach is to ensure that f is never applied to an a that is not positive. Depending on where your SMT code comes from, it might be possible to check that constraint before each application of f (and to raise an error if f isn't applicable).
The other approach is to underspecify f and conditionally define it, e.g. along the lines of 0 < a ==> f(a) := a :: Nil. This way, f remains total (which, as said before, you'll most likely have to live with), but its value is undefined for a <= 0. Hence, when you try to prove something about f(a), e.g. that head(f(a)) == a, then this should fail (assuming that head(a :: _) is defined as a).
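For illustration, here is a minimal SMT-LIB sketch of the underspecification approach (untested; the names and encoding are mine, not a Z3 built-in):
(declare-sort PosList 0)
(declare-const nil PosList)
(declare-fun cons (Int PosList) PosList)
(declare-fun head (PosList) Int)
(declare-fun tail (PosList) PosList)

; head and tail are constrained only for positive heads (underspecification):
(assert (forall ((x Int) (xs PosList))
  (=> (< 0 x) (= (head (cons x xs)) x))))
(assert (forall ((x Int) (xs PosList))
  (=> (< 0 x) (= (tail (cons x xs)) xs))))

; provable: head(cons(1, nil)) == 1
(push)
(assert (not (= (head (cons 1 nil)) 1)))
(check-sat) ; expect unsat
(pop)

; not provable: head(cons(-1, nil)) == -1, since -1 is outside the domain
(push)
(assert (not (= (head (cons (- 1) nil)) (- 1))))
(check-sat) ; expect sat or unknown: the defining axiom does not apply
(pop)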
Encoding partial functions: a practical example
For a fuller worked example, this encoding of an integer list (in a verification language called Viper) should give you a very concrete idea of how to encode an integer list using uninterpreted functions and axioms. The example can basically be translated to SMT-LIB in a one-to-one manner.
Changing that example so that it axiomatises a list of positive integers is straightforward: just add the constraint 0 < head to every axiom that talks about list heads. I.e. use the following alternative axioms:
axiom destruct_over_construct_Cons {
  forall head: Int, tail: list :: {Cons(head, tail)}
    0 < head ==>
         head_Cons(Cons(head, tail)) == head
      && tail_Cons(Cons(head, tail)) == tail
}

...

axiom type_of_Cons {
  forall head: Int, tail: list ::
    0 < head ==> type(Cons(head, tail)) == type_Cons()
}
If you run the example online with these changes, the test method test_quantifiers() should fail immediately. Adding the necessary constraints on the list elements, i.e. changing it to
method test_quantifiers() {
  /* The elements of a deconstructed Cons are equivalent to the corresponding arguments of Cons */
  assert forall head: Int, tail: list, xs: list ::
    0 < head ==>
    is_Cons(xs) ==> (head == head_Cons(xs) && tail == tail_Cons(xs) <==> Cons(head, tail) == xs)

  /* Two Cons are equal iff their constructors' arguments are equal */
  assert forall head1: Int, head2: Int, tail1: list, tail2: list ::
    (0 < head1 && 0 < head2) ==>
    (Cons(head1, tail1) == Cons(head2, tail2)
      <==>
     head1 == head2 && tail1 == tail2)
}
should make the verification succeed again.
What you are looking for is called predicate subtyping, and as far as I know Yices is the only SMT solver that supported it out of the box: http://yices.csl.sri.com/old/language.shtml
In particular, see the examples here: http://yices.csl.sri.com/old/language.shtml#language_dependent_types
Unfortunately, this is "old" Yices, and I don't think this particular input language is supported any longer. As Malte mentioned, SMT-LIB doesn't have support for predicate subtyping either.
Assuming your output SMT-LIB is generated, you can insert "checks" to make sure all elements remain within the domain. But this is rather cumbersome, and it is not clear how to deal with partiality. Underspecification is a nice trick, but it can get really hairy and lead to specifications that are very hard to debug.
If you really need predicate subtyping, perhaps SMT solvers are not the best choice for your problem domain. Theorem provers, dependently typed languages, etc. might be more suitable. A practical example is the LiquidHaskell system for Haskell programs, which allows predicates to be attached to types to do precisely what you are trying to do, and uses an SMT solver to discharge the relevant conditions: https://ucsd-progsys.github.io/liquidhaskell-blog/
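For a flavour of what that looks like, here is a tiny LiquidHaskell sketch (the names are mine; see the LiquidHaskell documentation for the authoritative syntax):
module PosDemo where

{-@ type Pos = {v:Int | v > 0} @-}

-- Only positive integers may be consed onto a list of positives:
{-@ posCons :: Pos -> [Pos] -> [Pos] @-}
posCons :: Int -> [Int] -> [Int]
posCons x xs = x : xs

-- posCons 0 [] is rejected by the verifier, since 0 does not satisfy v > 0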
If you want to stick to SMT-solvers and don't mind using an older system, I'd recommend Yices with its support for predicate subtyping for modeling such problems. It was (and still is) one of the finest implementations of this very idea in the context of SMT-solving.

Surprising Dafny failure to verify boundedness of set comprehension

Dafny has no problem with this definition of a set intersection function.
function method intersection(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A && x in B
}
But when it comes to union, Dafny complains, "a set comprehension must produce a finite set, but Dafny's heuristics can't figure out how to produce a bounded set of values for 'x'". A and B are finite, and so, clearly the union is, too.
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A || x in B
}
What explains this behavior, which seems discrepant to a beginner?
This is indeed potentially surprising!
First, let me note that in practice, Dafny has built-in operators for intersection and union that it knows preserve finiteness. So you don't need to use set comprehensions to express these ideas. Instead you could just say A * B and A + B respectively.
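For instance, a union function written with the built-in operator verifies immediately, with no finiteness heuristics involved:
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  A + B
}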
However, my guess is that you're running into a more complicated example where you're using a set comprehension with a disjunction and are confused about why Dafny can't prove it finite.
Dafny uses syntactic heuristics to determine whether a set comprehension is finite. Unfortunately, these heuristics are not well documented anywhere. For purposes of this question, the key point is that the heuristics either depend on the type of the comprehension's bound variables, or look for a conjunct that constrains elements to be bounded in some other way. For example, Dafny can prove
set x: int | 0 <= x < 10 && ...
finite, as well as
set x:A | x in S && ...
In both cases, it is essential that the relevant bounds be conjuncts. Dafny has no syntactic heuristic for proving a bound for disjunctions, although one could imagine adding one. That is why Dafny cannot prove your union function finite.
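If you want to keep a comprehension, one workaround (a sketch, untested against any particular Dafny version) is to restate the disjunction as membership in a set Dafny already knows to be finite:
function method unionViaComprehension(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A + B // the membership conjunct bounds x, so the heuristic applies
}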
As an aside, another workaround would be to use potentially infinite sets (written iset in Dafny). If you don't need to use the cardinality of the sets, then these might work better.

Boogie strange assert(false) behavior

I am working with Boogie and I have come across some behaviors I do not understand.
I have been using assert(false) as a way to check if the previous assume statements are absurd.
For instance in the following case, the program is verified without errors...
type T;
const t1, t2: T;

procedure test()
{
  assume (t1 == t2);
  assume (t1 != t2);
  assert(false);
}
...as t1 == t2 && t1 != t2 is an absurd statement.
On the other hand if I have something like
type T;
var map: [T]bool;
const t1, t2: T;

procedure test()
{
  assume (forall a1: T, a2: T :: !map[a1] && map[a2]);
  //assert(!map[t1]);
  assert(false);
}
The assert(false) fails unless the commented line is uncommented. Why is that assert changing the result of the assert(false)?
Gist: the SMT solver underlying Boogie will not instantiate the quantifier if you don't mention a ground instance of map[...] in your program.
Here is why: SMT solvers (that use e-matching) typically use syntactic heuristics to decide when to instantiate a quantifier. Consider the following quantifier:
forall i: Int :: f(i)
This quantifier admits infinitely many instantiations (since i ranges over an unbounded domain); trying all of them would thus result in non-termination. Instead, SMT solvers expect syntactic hints instructing them for which i the quantifier should be instantiated. These hints are called patterns or triggers. In Boogie, they can be written as follows:
forall i: Int :: {f(i)} f(i)
This trigger instructs the SMT solver to instantiate the quantifier for each i for which f(i) is mentioned in the program (or rather, current proof search). E.g., if you assume f(5), then the quantifier will be instantiated with 5 substituted for i.
In your example, you don't provide a pattern explicitly, so the SMT solver might pick one for you by inspecting the quantifier body. It will most likely pick {map[a1], map[a2]} (multiple function applications are allowed; patterns must cover all quantified variables). If you uncomment the assert, the ground term map[t1] becomes available, and the SMT solver can instantiate the quantifier with both a1 and a2 mapped to t1. Hence, the contradiction is obtained.
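A hedged sketch of the same effect without the intermediate assert, reusing the declarations from the question: mention a ground instance of map[...] yourself, e.g. via a trivially true assumption (untested, but it follows directly from the e-matching explanation above).
procedure test2()
{
  assume (forall a1: T, a2: T :: !map[a1] && map[a2]);
  assume (map[t1] || !map[t1]); // trivially true, but supplies the ground term map[t1]
  assert(false);                // now verifies: the quantifier gets instantiated
}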
See the Z3 guide for more details on patterns; more involved treatments of patterns and e-matching can be found in the research literature.