Are existential quantifiers nested under foralls skolemised once? What do quantifier instantiation statistics mean for these quantifiers? - z3

In a (rather large) Z3 problem, we have a few axioms of the shape:
forall xs :: ( (P(xs) ==> (exists ys :: Q(xs,ys))) && ((exists zs :: Q(xs,zs)) ==> P(xs)) )
All three quantifiers (including the existentials) have explicit triggers provided (omitted here). When running the problem and gathering quantifier statistics, we observed the following data (amongst many other instantiations):
[quantifier_instances] k!244 : 804 : 3 : 4
[quantifier_instances] k!232 : 10760 : 29 : 30
Here, line 244 corresponds to the end of the outer forall quantifier, and line 232 to the end of the first inner exists. Furthermore, there are no reported instantiations of the second inner exists (which I believe Z3 will pull out into a forall); given the triggers this is surprising.
My understanding is that existentials in this inner position should be skolemised by a function (depending on the outer quantifier). It's not clear to me what quantifier statistics mean for such existentials.
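(For reference, [quantifier_instances] lines like the ones above come from Z3's quantifier instantiation profiling, which can be enabled on the command line; the input file name below is a placeholder:)

z3 smt.qi.profile=true problem.smt2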
Here are my specific questions:
Are quantifier statistics meaningful for existential quantifiers (those which remain as existentials - i.e. in positive positions)? If so, what do they mean?
Does skolemisation of such an existential happen once and for all, or each time the outer quantifier is instantiated? Why is a substantially higher number reported for this quantifier than for the outer forall?
Does Z3 apply some internal rewriting of this kind of (A==>B)&&(B==>A) assertion? If so, how does that affect quantifier statistics for quantifiers in A and B?
From our point of view, understanding question 2 is most urgent, since we are trying to investigate how the generated existentials affect the performance of the overall problem.
The original smt file is available here:
https://gist.github.com/anonymous/16e489ce5c513e8c4bc6
and a summary of the statistics generated (with Z3 4.4.0, but we observed the same with 4.3.2) is here:
https://gist.github.com/anonymous/ce7b96acf712ac16299e

The answer to all these questions is 'it depends', mainly on what else appears in the problem and what options are set. For instance, if there are only bit-vector variables, then Skolemization will indeed be performed during preprocessing, once and for all, but this is not the case for all other theories or theory combinations.
Briefly looking at your SMT2 file, it seems to me that all existentials appear on the left-hand side of implications, i.e., they are in fact negated (and actually rewritten into universals somewhere along the line), so those statistics do make sense for the existentials appearing in this particular problem.
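To illustrate the difference between the two polarities, here is a minimal SMT-LIB sketch of the two conjuncts as separate assertions (the sort, functions, and triggers are invented for this example). The first existential occurs positively and remains an existential to be Skolemised; the second occurs negatively and is equivalent to an extra universally bound variable, so it reports statistics like any ordinary forall:

(declare-sort T 0)
(declare-fun P (T) Bool)
(declare-fun Q (T T) Bool)
; positive occurrence: the exists survives (and is Skolemised under the forall)
(assert (forall ((xs T))
  (! (=> (P xs) (exists ((ys T)) (! (Q xs ys) :pattern ((Q xs ys)))))
     :pattern ((P xs)))))
; negative occurrence: equivalent to a forall over xs and zs
(assert (forall ((xs T) (zs T))
  (! (=> (Q xs zs) (P xs)) :pattern ((Q xs zs)))))
(check-sat)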

Related

What does "quantifier free logic" mean in SMT context?

Even for the simplest arithmetic SMT problems, the existential quantifier is required to declare symbolic variables. And the ∀ quantifier can be turned into ∃ by inverting the constraint. So I can use both of them in QF_* logics and it works.
I take it that "quantifier free" means something else for such SMT logics, but what exactly?
The claim is that
∀ quantifier can be turned into ∃ by inverting the constraint
AFAIK, the following two relations hold:
∀x.φ(x) <=> ¬∃x.¬φ(x)
¬∀x.φ(x) <=> ∃x.¬φ(x)
Since a quantifier-free SMT formula φ(x) is equisatisfiable with its existential closure ∃x.φ(x), we can use the quantifier-free fragment of an SMT theory to express a (simple) negated occurrence of universal quantification, and [AFAIK] also a (simple) positive occurrence of universal quantification over trivial formulas (e.g. if [∃x.]φ(x) is unsat, then ∀x.¬φ(x)¹).
¹: assuming φ(x) is quantifier-free; as Levent Erkok points out in his answer, this approach is inconclusive when both φ(x) and ¬φ(x) are satisfiable.
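As a small sketch of that trick (the formula here is invented): to establish ∀x. x + 1 > x, assert its negation with x left free (implicitly existential) and check for unsatisfiability:

(set-logic QF_LIA)
(declare-const x Int)
(assert (not (> (+ x 1) x)))
(check-sat) ; unsat, hence forall x. x + 1 > x holds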
However, we cannot, for example, find a model for the following quantified formula using the quantifier-free fragment of SMT:
[∃y.]((∀x.y <= f(x)) and (∃z.y = f(z)))
For the record, this is an encoding of the OMT problem min(y), y = f(x) as a quantified SMT formula. [related paper]
A term t is quantifier-free iff t syntactically contains no quantifiers. A quantifier-free formula φ is equisatisfiable with its existential closure
(∃x1.(∃x2. . . . (∃xn.φ) . . .))
where x1, x2, . . . , xn is any enumeration of free(φ), the free variables in φ.
The set of free variables of a term t, free(t), is defined inductively as:
free(x) = {x} if x is a variable,
free((f t1 t2 . . . tk)) = free(t1) ∪ free(t2) ∪ . . . ∪ free(tk) for function applications,
free(∀x.φ) = free(φ) \ {x}, and
free(∃x.φ) = free(φ) \ {x}.
[source]
Patrick gave an excellent answer, but here are a few more thoughts. (I'd have put this up as a comment, but StackOverflow thinks it's too long for that!)
Notice that you cannot always play the "negate and check the opposite" trick. This only works because if the negation of a property is unsatisfiable, then the property must be true for all inputs. But it doesn't go the other way around: A property can be satisfiable, and its negation can be satisfiable as well. Simple example: x < 10. This is obviously satisfiable, and so is its negation x >= 10. So, you cannot always get rid of quantifiers by playing this trick. It only works if you want to prove something: Then you can negate it and see if that negation is unsatisfiable. If you're concerned about finding a model to a formula, the method doesn't apply.
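A sketch of this inconclusive case, using the x < 10 example:

(declare-const x Int)
(push 1)
(assert (< x 10))
(check-sat) ; sat
(pop 1)
(assert (>= x 10))
(check-sat) ; sat as well, so neither check settles a universal claim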
You can always skolemize a formula and eliminate all the existential quantifiers by replacing them with uninterpreted functions. What you then end up with is an equisatisfiable formula that has all prefix universals. Clearly, this is not quantifier free, but this is a very common trick that most tools do for you automatically.
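For instance, a hand-Skolemisation sketch (the predicate and function names are invented): ∀x.∃y. P(x, y) becomes ∀x. P(x, sk(x)) for a fresh uninterpreted function sk, an equisatisfiable formula with only prefix universals:

(declare-fun P (Int Int) Bool)
(declare-fun sk (Int) Int)
; instead of (assert (forall ((x Int)) (exists ((y Int)) (P x y)))):
(assert (forall ((x Int)) (P x (sk x))))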
Where all this hurts is alternating quantifiers. Regardless of skolemization, if you have alternating quantifiers then your problem is already too difficult to deal with. The Wikipedia page on quantifier elimination is rather terse, but it gives a very good introduction: https://en.wikipedia.org/wiki/Quantifier_elimination Bottom line: not every theory admits quantifier elimination, and even those that do might require exponential algorithms to get rid of the quantifiers, causing performance issues.

Dafny rejects a simple postcondition

Below is a first attempt to prove various simple theorems, in this case about parity. Dafny (v. 1.9.9.40414) verifies that adding 2 to an even number yields an even number, but it does not accept either of the commented-out postconditions.
function IsEven(a : int) : bool
  requires a >= 0
{
  if a == 0 then true
  else if a == 1 then false
  else IsEven(a - 2)
}

method Check1(a : int)
  requires a >= 0
  ensures IsEven(a) ==> IsEven(a + 2)
  //ensures IsEven(a) ==> IsEven(a + a)
  //ensures IsEven(a) ==> IsEven(a * a)
{
}
As I have just started to study this wonderful tool, my approach or the implementation might be incorrect. Any advice would be appreciated.
There are a few different things going on here. I will discuss each of the three postconditions in turn.
The first and second postconditions
Since IsEven is a recursively defined predicate, facts about it will, in general, require proofs by induction. Your first postcondition is simple enough not to require induction, which is why it goes through.
Your second postcondition does require induction to be proved. Dafny has heuristics for automatically performing induction, but these heuristics are only invoked in certain contexts. In particular, Dafny will only attempt induction on "ghost methods" (also called "lemmas").
If you add the keyword ghost in front of method in Check1 (or change method to lemma, which is equivalent), you will see that the second postcondition goes through. This is because Dafny's induction heuristic gets invoked and manages to complete the proof.
The third postcondition
The third postcondition is more complex, because it involves nonlinear arithmetic. (In other words, it involves nontrivial reasoning about multiplying two variables together.) Dafny's underlying solver has trouble reasoning about such things, and so the heuristic proof by induction doesn't go through.
A proof that a * a is even if a is even
One way to prove it is sketched below. I have factored out IsEven(a) ==> IsEven(a * a) into its own lemma, called EvenSquare. I have also changed it to require IsEven(a) as a precondition, rather than put an implication in the postcondition. (A similar proof also goes through with the implication instead, but using preconditions on lemmas like this instead of implications is idiomatic Dafny.)
The proof of EvenSquare is by (manual) induction on a. The base case is handled automatically. In the inductive case (the body of the if statement), I invoke the induction hypothesis (ie, I make a recursive method call to EvenSquare to establish that (a - 2) * (a - 2) is even). I then assert that a * a can be written as the sum of (a - 2) * (a - 2) and some offset. The assertion is dispatched automatically. The proof will be done if I can show that the right hand side of this equality is even.
To do this, I already know that (a - 2) * (a - 2) is even, so I first invoke another lemma to show that the offset is even, because it is twice something else. Finally, I invoke one last lemma to show that the sum of two even numbers is even.
This completes the proof, assuming the two lemmas.
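Since the original link is not reproduced here, the following is a sketch of EvenSquare along the lines just described (the exact details may differ from the linked original; the two helper lemmas are sketched further below):

lemma EvenSquare(a: int)
  requires a >= 0
  requires IsEven(a)
  ensures IsEven(a * a)
{
  if a >= 2 {
    EvenSquare(a - 2);  // induction hypothesis: (a - 2) * (a - 2) is even
    assert a * a == (a - 2) * (a - 2) + (4 * a - 4);
    EvenDouble(2 * a - 2);                   // 4 * a - 4 is twice 2 * a - 2
    EvenPlus((a - 2) * (a - 2), 4 * a - 4);  // sum of two even numbers is even
  }
}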
Proofs of the two lemmas
It remains to show that twice anything is even, and that the sum of two even numbers is even. While not completely trivial, neither is as complex as EvenSquare.
The lemma EvenDouble proves that twice anything is even. (This is in fact a stronger version of your second postcondition. Your second postcondition says that doubling any even number is even. In fact, doubling any (non-negative, under your definition of evenness) number at all is even.) The proof of EvenDouble proceeds by (manual) induction on a. The base case is handled automatically. The inductive case only requires explicitly invoking the induction hypothesis.
The lemma EvenPlus is almost proved automatically by Dafny's induction heuristic, except that it trips over a bug or some other problem which causes a loop in the solver. After a little debugging, I determined that the annotation {:induction x} (or {:induction y}, for that matter) makes the proof not loop. These annotations tell Dafny's heuristics which variable(s) to try to induct on. By default in this case, Dafny tries to induct on both x and y, which for some reason causes the solver to loop. But inducting on either variable alone works. I'm investigating this problem further, but the current solution works.
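Here are matching sketches of the two lemmas, again reconstructed from the description above:

lemma EvenDouble(a: int)
  requires a >= 0
  ensures IsEven(a + a)
{
  if a >= 1 {
    EvenDouble(a - 1);  // induction hypothesis: (a - 1) + (a - 1) is even
  }
}

lemma {:induction x} EvenPlus(x: int, y: int)
  requires x >= 0
  requires y >= 0
  requires IsEven(x)
  requires IsEven(y)
  ensures IsEven(x + y)
{
  // body left empty: the induction heuristic (with the hint) completes the proof
}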

Getting a counterexample from µZ3 (Horn solver)

Using Z3's Horn clause solver:
If the answer is SAT, one can get a satisfying assignment to the unknown predicates (which, in most applications, correspond to inductive invariants of some kind of transition system or procedure call system).
If the answer is unsat, then this means there exists an unfolding of the Horn clauses and an assignment to the universally quantified variables in the Horn clauses such that at least one of the safety conditions (the clauses with a false head) is violated. This constitutes a concrete witness of why the system has no solution.
I suspect that if Z3 can conclude unsat, then it has some form of such witness internally (and this anyway is the case in PDR, if I remember well). Is there a way to print it out?
Maybe I misread the documentation, but I can't find a way. (get-proof) prints something unreadable, and, besides, (set-option :produce-proofs true) makes some problems intractable.
The refutation that Z3 produces for HORN logic problems is in the form of a tree of unit-resulting resolution steps. The counterexample you're looking for is hiding in the conclusions of the unit-resolution steps. These conclusions (the last arguments of the rules) are ground facts that correspond to program states (or procedure summaries or whatever) in the counterexample. The variable bindings that produce these facts can be found in "quant-inst" rules.
Obviously, this is not human readable, and actually is pretty hard to read by machine. For Boogie I implemented a more regular format, but it is currently only available with the duality engine and only for the fixedpoint format using "rule" and "query". You can get this using the following command.
(query :engine duality :print-certificate true)
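For a minimal sketch of that fixedpoint format (the relation and variable names here are invented):

(declare-var x Int)
(declare-rel Inv (Int))
(declare-rel Err ())
(rule (Inv 0))                                   ; initial state
(rule (=> (and (Inv x) (< x 5)) (Inv (+ x 1))))  ; transition
(rule (=> (and (Inv x) (> x 3)) Err))            ; a reachable "bad" state
(query Err :engine duality :print-certificate true)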

How does incremental solving work in Z3?

I have a question regarding how Z3 incrementally solves problems. After reading through some answers here, I found the following:
There are two ways to use Z3 for incremental solving: one is push/pop (stack) mode, the other is using assumptions. (Soft/Hard constraints in Z3)
In stack mode, Z3 will forget all learned lemmas, even those from the global scope, after a single local "pop" (am I right?). (Efficiency of constraint strengthening in SMT solvers)
In assumptions mode (I don't know the official name; that is just the name that comes to my mind), Z3 will not simplify some formulas, e.g. by value propagation. (z3 behaviour changing on request for unsat core)
I did some comparison (you are welcome to ask for the formulas; they are just too large to put on rise4fun), and here are my observations: on some formulas involving quantifiers, assumptions mode is faster; on some formulas with lots of Boolean (assumption) variables, stack mode is faster than assumptions mode.
Are they implemented for specific purposes? How does incremental solving work in Z3?
Yes, there are essentially two incremental modes.
Stack based: using push(), pop() you create a local context that follows a stack discipline. Assertions added under a push() are removed after a matching pop(). Furthermore, any lemmas that are derived under a push are removed. Use push()/pop() to emulate freezing a state, adding additional constraints over the frozen state, and then resuming from the frozen state. It has the advantage that any additional memory overhead (such as learned lemmas) built up within the scope of a push() is released. The working assumption is that lemmas learned under a push would not be useful any longer.
Assumption based: using additional assumption literals passed to check()/check_sat() you can (1) extract unsatisfiable cores over the assumption literals, and (2) gain local incrementality without garbage collecting lemmas that are derived independently of the assumptions. In other words, if Z3 learns a lemma that does not contain any of the assumption literals, it is not garbage collected. To use assumption literals effectively, you have to add them to formulas too, so the tradeoff is that clauses used with assumptions contain some amount of bloat. For example, if you want to locally assume some formula (<= x y), then you add a clause (=> p (<= x y)) and assume p when calling check_sat(). Note that the original assumption was a unit, and Z3 propagates units efficiently. With the formulation that uses assumption literals it is no longer a unit at the base level of search, which incurs some extra overhead: units become binary clauses, binary clauses become ternary clauses, etc.
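A small sketch contrasting the two styles, using the (<= x y) example above:

(declare-const x Int)
(declare-const y Int)
; stack mode: the assertion and any lemmas learned under the push vanish at pop
(push 1)
(assert (<= x y))
(check-sat)
(pop 1)
; assumption mode: guard the formula with a fresh literal p
(declare-const p Bool)
(assert (=> p (<= x y)))
(check-sat-assuming (p))        ; p is assumed for this call only
(check-sat-assuming ((not p)))  ; lemmas independent of p are retained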
This description of push/pop functionality holds for Z3's default SMT engine, which is the engine most formulas will be using. Z3 contains a portfolio of engines; for example, for pure bit-vector problems, Z3 may end up using the SAT-based engine. Incrementality in the SAT-based engine is implemented differently from the default engine: there, incrementality is implemented using assumption literals. Any assertion you add within the scope of a push is asserted as an implication (=> scope_literals formula), and check_sat() within such a scope will have to deal with assumption literals. On the flip side, any consequence (lemma) that does not depend on the current scope is not garbage collected on pop().
In optimization mode, when you assert optimization objectives, or when you use the optimization objects over the API, you can also invoke push/pop. Likewise with fixedpoints. For these two features, push/pop are essentially for user-convenience. There is no internal incrementality. The reason is that these two modes use substantial pre-processing that is super non-incremental.

Can Z3 check the satisfiability of recursive functions on bounded data structures?

I know that Z3 cannot check the satisfiability of formulas that contain recursive functions. But, I wonder if Z3 can handle such formulas over bounded data structures. For example, I've defined a list of length at most two in my Z3 program and a function, called last, to return the last element of the list. However, Z3 does not terminate when asked to check the satisfiability of a formula that contains last.
Is there a way to use recursive functions over bounded lists in Z3?
(Note that this is related to your other question as well.) We looked at such cases as part of the Leon verifier project. What we do there is avoid the use of quantifiers and instead "unroll" the recursive function definitions: if we see the term length(lst) in the formula, we expand it using the definition of length by introducing a new equality: length(lst) = if(isNil(lst)) 0 else 1 + length(tail(lst)). You can view this as a manual quantifier instantiation procedure.
If you're interested in lists of length at most two, doing the manual instantiation for all terms, then doing it once more for the new list terms should be enough, as long as you add the term:
isCons(lst) => (isCons(tail(lst)) => isNil(tail(tail(lst))))
for each list. In practice you of course don't want to generate these equalities and implications manually; in our case, we wrote a program that is essentially a loop around Z3 adding more such axioms when needed.
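A sketch of what such generated axioms can look like in SMT-LIB, for a single list bounded to length at most two (the datatype and names are invented to match the discussion; in practice the loop around Z3 keeps adding unrollings for new list terms):

(declare-datatypes ((Lst 0)) (((nil) (cons (head Int) (tail Lst)))))
(declare-fun length (Lst) Int)
(declare-const lst Lst)
; two unrollings of the definition of length:
(assert (= (length lst)
           (ite ((_ is nil) lst) 0 (+ 1 (length (tail lst))))))
(assert (= (length (tail lst))
           (ite ((_ is nil) (tail lst)) 0 (+ 1 (length (tail (tail lst)))))))
; the bound: lst has length at most two
(assert (=> ((_ is cons) lst)
            (=> ((_ is cons) (tail lst)) ((_ is nil) (tail (tail lst))))))
(check-sat)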
A very interesting property (closely related to your question) is that for some functions (such as length), successive unrollings give you a complete decision procedure. I.e., even if you don't constrain the size of the data structures, you will eventually be able to conclude SAT or UNSAT (for the quantifier-free case).
You can find more details in our paper Satisfiability Modulo Recursive Programs, or I'm happy to give more here.
You may be interested in the work of Erik Reeber on SULFA, the "Subclass of Unrollable List Formulas in ACL2". He showed in his PhD thesis how a large class of list-oriented formulas can be proven by unrolling function definitions and applying SAT-based methods. He proved decidability for the SULFA class using these methods.
See, e.g., http://www.cs.utexas.edu/~reeber/IJCAR-2006.pdf .
