I am trying to get the habit to use Dafny as a friendly SAT-QBF solver for some simple formulae, since doing it in, for instance, Z3 is too uncomfortable.
The context for this is that I have implemented Cooper's algorithm for quantifier elimination and, when all the variables are bounded, it can be used as a decision procedure: therefore, I want to know which is the result that I should get before executing it.
However, I encountered a problem in Dafny.
Let us raise, for instance, this formula (written in Dafny):
assert forall x_1: int :: exists y_1: int :: forall x_2: int :: exists y_2 : int
:: (y_2<y_1) && (x_2<y_2) && (x_1<x_2);
In my Cooper, it returns False, while Dafny returns assertion violation (along with the typical triggerswarnings), which I interpret as False too. Okay, so no problem with this.
But if I raise:
assert exists x_1: int :: exists y_1: int :: exists x_2: int :: exists y_2 : int
:: (y_2<y_1) && (x_2<y_2) && (x_1<x_2);
In my Cooper, it returns True, while Dafny also returns assertion violation. I have done a manual Cooper execution (pencil and paper) and I think the True is right one.
Any idea of what is going on?
PS: I have not tried it in Z3, yet, because I am doing first other attempts with other theories.
EDIT
Trigger warnings can be avoided using a simple trick to instantiate the quantified variables: creating an uninterpreted function.
method Main() {
assert exists x_1 : int {:trigger P(x_1)} :: exists y_1: int {:trigger P(y_1)}
:: exists x_2: int {:trigger P(x_2)} :: exists y_2 : int {:trigger P(y_2)}
:: (y_2<y_1) && (x_2<y_2) && (x_1<x_2);
}
predicate P(a: int)
{
true
}
You cannot do this with Dafny. While Dafny supports quantifiers, booleans, arithmetic, and many other things (recursive functions, sets, sequences, objects and references, multi-dimensional arrays, induction, inductive and coinductive datatypes, bitvectors, greatest and least fixpoints of monotonic functions, etc.), it is not suitable for SAT-QBF (or QBF + artihmetic) benchmarks.
Dafny's errors, including the assertion violation, tell you that the verifier was not able to do the proof. It may be that the property still holds, but you'll need to supply more of the proof yourself. In other words, you should interpret the assertion violation as a "don't know" answer. Stated differently, you cannot decide (only semi-decide) formulas with Dafny.
Dafny uses quantifiers in the SMT solvers via matching patterns, aka triggers. When a quantifier has no good triggers, which is what Dafny's "no trigger" warning is telling you, you may see bad performance, unstable verifications, and so-called butterfly effects (where a small and seemingly unrelated part of the program causes a change in the automatic construction of other proofs). Triggers are driven by uninterpreted function symbols, which your example doesn't have at all.
If you want a readable syntax, you may be able to do what you're trying through Boogie. I have not tried that, but you could try putting Boogie in its monomorphic mode and then supplying prover options to ask for the SAT-QBF or something similar (see Boogie's /help). Otherwise, if you're interested in deciding these problems, then going directly to SMT solver is the way to go.
Related
I have tried to implement Floor and Ceiling Function as defined in the following link
https://math.stackexchange.com/questions/3619044/floor-or-ceiling-function-encoding-in-first-order-logic/3619320#3619320
But Z3 query returning counterexample.
Floor Function
_X=Real('_X')
_Y=Int('_Y')
_W=Int('_W')
_n=Int('_n')
_Floor=Function('_Floor',RealSort(),IntSort())
..
_s.add(_X>=0)
_s.add(_Y>=0)
_s.add(Implies(_Floor(_X)==_Y,And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W ==_Y,_W<_Y))))))
_s.add(Implies(And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W==_Y,_W<_Y))),_Floor(_X)==_Y))
_s.add(Not(_Floor(0.5)==0))
Expected Result - Unsat
Actual Result - Sat
Ceiling Function
_X=Real('_X')
_Y=Int('_Y')
_W=Int('_W')
_Ceiling=Function('_Ceiling',RealSort(),IntSort())
..
..
_s.add(_X>=0)
_s.add(_Y>=0)
_s.add(Implies(_Ceiling(_X)==_Y,And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W ==_Y,_Y<_W))))))
_s.add(Implies(And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W==_Y,_Y<_W)))),_Ceiling(_X)==_Y))
_s.add(Not(_Ceilng(0.5)==1))
Expected Result - Unsat
Actual Result - Sat
[Your encoding doesn't load to z3, it gives a syntax error even after eliminating the '..', as your call to Implies needs an extra argument. But I'll ignore all that.]
The short answer is, you can't really do this sort of thing in an SMT-Solver. If you could, then you can solve arbitrary Diophantine equations. Simply cast it in terms of Reals, solve it (there is a decision procedure for Reals), and then add the extra constraint that the result is an integer by saying Floor(solution) = solution. So, by this argument, you can see that modeling such functions will be beyond the capabilities of an SMT solver.
See this answer for details: Get fractional part of real in QF_UFNRA
Having said that, this does not mean you cannot code this up in Z3. It just means that it will be more or less useless. Here's how I would go about it:
from z3 import *
s = Solver()
Floor = Function('Floor',RealSort(),IntSort())
r = Real('R')
f = Int('f')
s.add(ForAll([r, f], Implies(And(f <= r, r < f+1), Floor(r) == f)))
Now, if I do this:
s.add(Not(Floor(0.5) == 0))
print(s.check())
you'll get unsat, which is correct. If you do this instead:
s.add(Not(Floor(0.5) == 1))
print(s.check())
you'll see that z3 simply loops forever. To make this usefull, you'd want the following to work as well:
test = Real('test')
s.add(test == 2.4)
result = Int('result')
s.add(Floor(test) == result)
print(s.check())
but again, you'll see that z3 simply loops forever.
So, bottom line: Yes, you can model such constructs, and z3 will correctly answer the simplest of queries. But with anything interesting, it'll simply loop forever. (Essentially whenever you'd expect sat and most of the unsat scenarios unless they can be constant-folded away, I'd expect z3 to simply loop.) And there's a very good reason for that, as I mentioned: Such theories are just not decidable and fall well out of the range of what an SMT solver can do.
If you are interested in modeling such functions, your best bet is to use a more traditional theorem prover, like Isabelle, Coq, ACL2, HOL, HOL-Light, amongst others. They are much more suited for working on these sorts of problems. And also, give a read to Get fractional part of real in QF_UFNRA as it goes into some of the other details of how you can go about modeling such functions using non-linear real arithmetic.
I've written a class that represents a binary relation on a set, S, with two fields: that set, S, and a second set of pairs of values drawn from S. The class defines a bunch of properties of relations, such as being single-valued (i.e., being a function, as defined in an "isFunction()" predicate). After the class definition I try to define some subset types. One is meant to define a subtype of these relations that are also actually "functions". It's not working, and it's a bit hard to decode the resulting error codes. Note that the Valid() and isFunction() predicates do declare "reads this;". Any ideas on where I should be looking? Is it that Dafny can't tell that the subset type is inhabited? Is there way to convince it that it is?
type func<T> = f: binRelOnS<T> | f.Valid() && f.isFunction()
[Dafny VSCode] trying witness null: result of operation might violate subset type constraint for 'binRelOnS'
Subset types and non-emptiness
A subset type definition of the form
type MySubset = x: BaseType | RHS(x)
introduces MySubset as a type that stands for those values x of type BaseType that satisfy the boolean expression RHS(x). Since every type in Dafny must be nonempty, there is a proof obligation to show that the type you declared has some member. Dafny may find some candidate values and will try to see if any one of them satisfies RHS. If the candidates don't, you get an error message like the one you saw. Sometimes, the error message will tell you which candidate values Dafny tried.
In your case, the only candidate value that Dafny tried is the value null. As James points out, the value null doesn't even get to first base, because the BaseType in your example is a type of non-null references. If you change binRelOnS<T> to binRelOnS?<T>, then null stands a chance of being a possible witness for showing your subset type to be nonempty.
User-supplied witnesses
Since Dafny is not too clever about coming up with candidate witnesses, you may have to supply one yourself. You do this by adding a witness clause at the end of the declaration. For example:
type OddInt = x: int | x % 2 == 1 witness 73
Since 73 satisfies the RHS constraint x % 2 == 1, Dafny accepts this type. In some programs, it can happen that the witness you have in mind is available only in ghost code. You can then write ghost witness instead of witness, which allows the subsequent expression to be ghost. A ghost witness can be used to convince the Dafny verifier that a type is nonempty, but it does not help the Dafny compiler in initializing variables of that type, so you will still need to initialize such variables yourself.
Using a witness clause, you could attempt to supply your own witness using your original definition of the subset type func. However, a witness clause takes an expression, not a statement, which means you are not able to use new. If you don't care about compiling your program and you're willing to trust yourself about the existence of the witness, you can declare a body-less function that promises to return a suitable witness:
type MySubset = x: BaseType | RHS(x) ghost witness MySubsetWitness()
function MySubsetWitness(): BaseType
ensures RHS(MySubsetWitness())
You'll either need ghost witness or function method. The function MySubsetWitness will forever be left without a body, so there's room for you to make a mistake about some value satisfying RHS.
The witness clause was introduced in Dafny version 2.0.0. The 2.0.0 release notes mention this, but apparently don't give much of an explanation. If you want to see more examples of witness, search for that keyword in the Dafny test suite.
Subset types of classes
In your example, if you change the base type to be a possibly-null reference type:
type func<T> = f: binRelOnS?<T> | f.Valid() && f.isFunction()
then your next problem will be that the RHS dereferences f. You can fix this by weakening your subset-type constraint as follows:
type func<T> = f: binRelOnS?<T> | f != null ==> f.Valid() && f.isFunction()
Now comes the part that may be a deal breaker. Subset types cannot depend on mutable state. This is because types are a very static notion (unlike specifications, which often depend on the state). It would be a disaster if a value could satisfy a type one moment and then, after some state change in the program, not satisfy the type. (Indeed, almost all type systems with subset/refinement/dependent types are for functional languages.) So, if your Valid or isFunction predicates have a reads clause, then you cannot define func in the way you have hoped. But, as long as both Valid and isFunction depend only on the values of const fields in the class, then no reads clause is needed and you are all set.
Rustan
Say I have a function,foo/1, whose spec is -spec foo(atom()) -> #r{}., where #r{} is a record defined as -record(r, {a :: 1..789})., however, I have foo(a) -> 800. in my code, when I run dialyzer against it, it didn't warn me about this, (800 is not a "valid" return value for function foo/1), can I make dialyzer warn me about this?
Edit
Learn You Some Erlang says:
Dialyzer reserves the right to expand this range into a bigger one.
But I couldn't find how to disable this.
As of Erlang 18, the handling of integer ranges is done by erl_types:t_from_range/2. As you can see, there are a lot of generalizations happening to get a "safe" overapproximation of a range.
If you tried to ?USE_UNSAFE_RANGES (see the code) it is likely that your particular error would be caught, but at a terrible cost: native compilation and dialyzing of recursive integer functions would not ever finish!
The reason is that the type analysis for recursive functions uses a simple fixpoint approach, where the initial types accept the base cases and are repeatedly expanded using the recursive cases to include more values. At some point overapproximations must happen if the process is to terminate. Here is a concrete example:
fact(1) -> 1;
fact(N) -> N * fact(N - 1).
Initially fact/1 is assumed to have type fun(none()) -> none(). Using that to analyse the code, the second clause is 'failing' and only the first one is ok. Therefore after the first iteration the new type is fun(1) -> 1. Using the new type the second clause can succeed, expanding the type to fun(1|2) -> 1|2. Then fun(1|2|3) -> 1|2|6 this continues until the ?SET_LIMIT is reached in which case t_from_range stops using the individual values and type becomes fun(1..255) -> pos_integer(). The next iteration expands 1..255 to pos_integer() and then fun(pos_integer()) -> pos_integer() is a fixpoint!
Incorrect answer follows (explains the first comment below):
You should get a warning for this code if you use the -Woverspecs option. This option is not enabled by default, since Dialyzer operates under the assumption that it is 'ok' to over-approximate the return values of a function. In your particular case, however, you actually want any extra values to produce warnings.
We are running into performance problems with what I believe is the part of Z3 that treats non-linear arithmetic. Here is a simple concrete Boogie example, that when verified with Z3 (version 4.1) takes a long time (on the order of 3 minutes) to complete.
const D: int;
function f(n: int) returns (int) { n * D }
procedure test() returns ()
{
var a, b, c: int;
var M: [int]int;
var t: int;
assume 0 < a && 1000 * a < f(1);
assume 0 < c && 1000 * c < f(1);
assume f(100) * b == a * c;
assert M[t] > 0;
}
It seems that the problem is caused by an interaction of functions, range assumption on integer variables as well as multiplications of (unknown) integer values. The assertion in the end should not be provable. Instead of failing quickly, it seems that Z3 has ways to instantiate lots of terms somehow, as its memory consumptions grows fairly quickly to about 300 MB, at which point it gives up.
I'm wondering if this is a bug, or whether it is possible to improve the heuristics on when Z3 should stop the search in the particular direction it is currently trying to solve the problem.
One interesting thing is that inlining the function by using
function {:inline} f(n: int) returns (int) { n * D }
makes the verification terminate very quickly.
Background: This is a minimal test case for a problem that we see in our verifier Chalice. There, the Boogie programs get much longer, potentially with multiple assumptions of a similar kind. Often, the verification does not appear to be terminating at all.
Any ideas?
Yes, the non-termination is due to nonlinear integer arithmetic. Z3 has a new nonlinear solver, but it is for "nonlinear real arithmetic", and can only be used in quantifier free problems that only use arithmetic (i.e., no uninterpreted functions like in your example).
Thus, the old arithmetic solver is used in your example. This solver has very limited support for integer arithmetic. Your analysis of the problem is correct. Z3 has trouble finding a solution for the block of nonlinear integer constraints. Note that if we replace f(100) * b == a * c with f(100) * b <= a * c, then Z3 returns immediately with an "unknown" answer.
We can avoid the non-termination by limiting the number of nonlinear arithmetic reasoning in Z3. The option NL_ARITH_ROUNDS=100 will use the nonlinear module at most 100 times for each Z3 query. This is not an ideal solution, but at least, we avoid the non-termination.
I just asked a question about how the Erlang compiler implements pattern matching, and I got some great responses, one of which is the compiled bytecode (obtained with a parameter passed to the c() directive):
{function, match, 1, 2}.
{label,1}.
{func_info,{atom,match},{atom,match},1}.
{label,2}.
{test,is_tuple,{f,3},[{x,0}]}.
{test,test_arity,{f,3},[{x,0},2]}.
{get_tuple_element,{x,0},0,{x,1}}.
{test,is_eq_exact,{f,3},[{x,1},{atom,a}]}.
return.
{label,3}.
{badmatch,{x,0}}
Its all just plain Erlang tuples. I was expecting some cryptic binary thingy, guess not. I am asking this on impulse here (I could look at the compiler source but asking questions always ends up better with extra insight), how is this output translated in the binary level?
Say {test,is_tuple,{f,3},[{x,0}]} for example. I am assuming this is one instruction, called 'test'... anyway, so this output would essentially be the AST of the bytecode level language, from which the binary encoding is just a 1-1 translation?
This is all so exciting, I had no idea that I can this easily see what the Erlang compiler break things into.
ok so I dug into the compiler source code to find the answer, and to my surprise the asm file produced with the 'S' parameter to the compile:file() function is actually consulted in as is (file:consult()) and then the tuples are checked one by one for further action(line 661 - beam_consult_asm(St) -> - compile.erl). further on then there's a generated mapping table in there (compile folder of the erlang source) that shows what the serial number of each bytecode label is, and Im guessing this is used to generate the actual binary signature of the bytecode.
great stuff. but you just gotta love the consult() function, you can almost have a lispy type syntax for a random language and avoid the need for a parser/lexer fully and just consult source code into the compiler and do stuff with it... code as data data as code...
The compiler has a so-called pattern match compiler which will take a pattern and compile it down to what is essentially a series of branches, switches and such. The code for Erlang is in v3_kernel.erl in the compiler. It uses Simon Peyton Jones, "The Implementation of Functional
Programming Languages", available online at
http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/
Another worthy paper is the one by Peter Sestoft,
http://www.itu.dk/~sestoft/papers/match.ps.gz
which derives a pattern match compiler by inspecting partial evaluation of a simpler system. It may be an easier read, especially if you know ML.
The basic idea is that if you have, say:
% 1
f(a, b) ->
% 2
f(a, c) ->
% 3
f(b, b) ->
% 4
f(b, c) ->
Suppose now we have a call f(X, Y). Say X = a. Then only 1 and 2 are applicable. So we check Y = b and then Y = c. If on the other hand X /= a then we know that we can skip 1 and 2 and begin testing 3 and 4. The key is that if something does not match it tells us something about where the match can continue as well as when we do match. It is a set of constraints which we can solve by testing.
Pattern match compilers seek to optimize the number of tests so there are as few as possible before we have conclusion. Statically typed language have some advantages here since they may know that:
-type foo() :: a | b | c.
and then if we have
-spec f(foo() -> any().
f(a) ->
f(b) ->
f(c) ->
and we did not match f(a), f(b) then f(c) must match. Erlang has to check and then fail if it doesn't match.