I am experimenting with optimizing the use of Z3 for proving facts about a first-order theory. Currently, I specify a first-order theory in Python, ground the quantifiers there and send all the clauses along with the negation of the proof goal to Z3. I have the following idea that I hope could optimize the outcome: I only want to send the formulas in the theory to Z3 that are relevant to the proof goal. I will not discuss this concept in detail, but I think the intuition is simple: my theory is a conjunction of formulas, and I only want to send conjuncts that can possibly affect the truth value of the proof goal.
My question is the following: can this lead to an improvement in efficiency, or does Z3 already use a similar method? I would guess not, because I don't think that Z3 always assumes that the last assertion is the proof goal, so it has no way of optimizing this.
Yes, removing irrelevant facts can make a big difference. Suppose that we have a unsatisfiable formula of the form F_1 and F_2 and (not G). Moreover, let us assume that F_1 and (not G) is unsatisfiable, and F_2 is satisfiable. F_2 is what you call irrelevant. If there is a cheap way to remove F_2 before sending the formulat to Z3, it will probable make a big difference.
Z3 has heuristics for "ignoring" irrelevant facts, but they are just heuristics. For our example, the worst case scenario is a F_2 that is really hard for Z3 to satisfy. Z3 is essentially trying to build an interpretation/solution that satisfies the input formula (the formula F_1 an F_2 and (not G) in our working example). A formula is unsatisfiable when Z3 can show it is impossible to build the interpretation. In practice, the formula F_2 is irrelevant for Z3 only if it can quickly show it to be satisfiable, and the interpretation/solution for F_2 does not conflicts F_1 and (not G). If that is not the case, Z3 can waste a lot of resources with F_2.
Related
While reading Extending Sledgehammer with SMT solvers I read the following:
In the original Sledgehammer architecture, the available lemmas were rewritten
to clause normal form using a naive application of distributive laws before the relevance filter was invoked. To avoid clausifying thousands of lemmas on each invocation, the clauses were kept in a cache. This design was technically incompatible with the (cache-unaware) smt method, and it was already unsatisfactory for ATPs, which include custom polynomial-time clausifiers.
My understanding of SMT so far is as follows: SMTs don't work over clauses. Instead, they try to build a model for the quantifier-free part of a problem. The search is refined by instantiating quantifiers according to some set of active terms. Thus, indeed no clausal form is needed for SMT solvers.
We rewrote the relevance filter so that it operates on arbitrary HOL formulas, trying to simulate the old behavior. To mimic the penalty associated with Skolem functions in the clause-based code, we keep track of polarities and detect quantifiers that give rise to Skolem functions.
What's the penalty associated with Skolem functions? I could understand they are not good for SMTs, but here it seems that they are bad for ATPs too...
First, SMT solvers do work over clauses and there is definitely some (non-naive) normalization internally (e.g., miniscoping). But you do not need to do the normalization before calling the SMT solver (especially, since it will be more naive and generate a larger number of clauses).
Anyway, Section 6.6.7 explains why skolemization was done on the Isabelle side. To summarize: it is not possible to introduce polymorphic constants in a proof in Isabelle; hence it must be done before starting the proof.
It seems likely that, when writing the paper, not changing the filtering lead to worse performance and, hence, the penalty was added. However, I tried to find the relevant code simulating clausification in Sledgehammer, so I don't believe that this happens anymore.
In many programming languages, branching efficiency is dependent on the order in which the clauses are provided. E.g., in Python,
if p or q :
will branch into the if statement as soon as p evaluates to true, so it is generally a good idea to provide computationally light clauses first. I'm wondering if the same is true for satisfiability checking in Z3. In other words, is there any difference between checking And(P, Q) and And(Q, P), provided that one of the formulas is significantly more complex than the other?
Yes, it make can make a difference what order you specify the clauses. However, there is probably not much to be gained by rearranging the clauses when it comes to Z3. Also, it might be difficult (and perhaps not worth your while) to determine a generic "optimal order" beforehand.
With other SMT solvers, I have observed what you are describing in an extreme fashion. With some solvers, it can make or break the computation, in the sense that I have seen certain cases where a solver times out when the clauses are presented in one order, but quickly finds a solution when the same clauses are presented in a different order. However, to me, this is an indication of a poorly designed solver. Z3 and other modern SMT solvers such as CVC4 are much less vulnerable to this kind of problem than older or less robust SMT solvers.
Regarding actually finding the optimal ordering, what constitutes "computationally light" or "more complex" for an SMT solver is different than a traditional computer language. What really matters is whether or not a certain clause in a conjunction is likely to lead to a conflict, or whether a particular clause in a disjunction is likely to lead to a solution. This is akin to trying to find your way out of a maze -- if you make a wrong turn early on, then you might spend a long time following a dead end, before you finally decided to go back to the beginning and take a different branch. Again, modern SMT and SAT solvers have techniques for mitigating this.
Also, When it comes to Z3 specifically, if Z3 could be made more efficient in real-world applications by sorting the clauses in terms of some human-understandable and generic measure of complexity, then that has probably already added as a pre-processing step to Z3. In general, for such complicated optimization questions where there are multiple factors at play (e.g. the particulars of the Z3 implementation) you have to try multiple things and benchmark over a suite of test examples relevant for your application.
As a bit of evidence for some of my claims above, I wrote this little example SMT problem:
(declare-fun p () Int)
(declare-fun q () Int)
(declare-fun n () Int)
(assert (> p 1))
(assert (> q 1))
(assert (and (= n 18679565357) (= n (* p q))))
(check-sat)
(get-value (p q n))
(exit)
Swapping the order of the clauses in the and statement makes very little difference on my system (less than 1%). If I break it up into two separate assertions, then the difference is larger, but to know if this difference holds in general (across different SMT problems of this class) I would have to run a benchmark suite with many such problems.
As far as I understand, Z3, when encountering quantified linear real/rational arithmetic, applies a form of quantifier elimination described in Bjørner, IJCAR 2010 and more recent work by Bjørner and Monniaux (that's what qe_sat_tactic.cpp says, at least).
I was wondering
Whether it still works if the formula is multilinear, in the sense that the "constants" are symbolic. E.g. ∀x, ax≤b ⇒ ax ≤ 0 can be dealt with by separating the cases a<0, a=0 and a>0. This is possible using Weispfenning's virtual substitution approach, but I don't know what ended up being implemented in Z3 (that is, whether it implements the general approach or the one restricted to constant coefficients).
Whether it is possible, in Z3, to output the result of elimination instead of just solving for one model. There might be a Z3 tactic to do so but I don't know how this is supposed to be requested.
Whether it is possible, in Z3, to perform elimination as described above, then use the new nonlinear solver to obtain a model. Again, a succession of tactics might do the trick, but I don't know how this is supposed to be requested.
Thanks.
After long travels (including a travel where I met David at a conference), here is a short summary to answer the questions as they are posed.
There is no specific support for multi-linear forms.
The 'qe' tactic produces results of elimination, but may as a side-effect decide satisfiablity.
This is a very interesting problem to investigate, but it is not supported out of the box.
Can SMT solver efficiently find a solution (or an assignment) for the pseudo-Boolean problem as described as follows:
\sum {i..m} f_i x1 x2.. xn *w_i
where f_i x1 x2 .. xn is a Boolean function, and w_i is a weight of Int type.
For your convenience, I highlight the contents in page 1 and 3, which is enough for specifying
the pseudo-Boolean problem.
SMT solvers typically address the question: given a logical formula, optionally using functions and predicates from underlying theories (such as the theory of arithmetic, the theory of bit-vectors, arrays), is the formula satisfiable or not.
They typically don't expose a way for you specify objective functions
and typically don't have built-in optimization procedures.
Some special cases are formulas that only use Booleans or a combination of Booleans and either bit-vectors or integers. Pseudo Boolean constraints can be formulated with either integers or encoded (with some care taking overflow semantics into account) using bit-vectors, or they can be encoded directly into SAT. For some formulas using bounded integers that fall in the class of psuedo-boolean problems, Z3 will try automatic reductions into bit-vectors. This applies only to benchmkars in the SMT-LIB2 format tagged as QF_LIA or applies if you explicitly invoke a tactic that performs this reduction (the "qflia" tactic should apply).
While Z3 does not directly expose objective functions, the question of augmenting
SMT solvers with objective functions is actively pursued in the research community.
One approach suggested by Nieuwenhuis and Oliveras in SAT 2006 was to build in
solving for the "weighted max SMT" problem as a custom theory. Yices comes with built-in
features for weighted max SMT, Z3 does not, but it is possible to write a custom
theory that performs the backtracking search of a weighted max SMT solver, but nothing
out of the box.
Sometimes people try to specify objective functions using quantified formulas.
In theory one could hope that quantifier elimination procedures then can solve
for the objective.
This is generally pretty bad when it comes to performance. Quantifier elimination
is an overfit and the routines (that we have) will not be efficient.
For your problem, if you want to find an optimized (maximum or minimum) result from the sum, yes Z3 has this ability. You can use the Optimize class of Z3 library instead of Solver class. The class provides two methods for 'maximization' and 'minimization' respectively. You can pass the SMT variable that is needed to be optimized and Optimization class model will give the solution for you. It actually worked with C# API using Microsoft.Z3 library. For your inconvenience, I am attaching a snippet:
Optimize opt; // initializing object
opt.MkMaximize(*your variable*);
opt.MkMinimize(*your variable*);
opt.Assert(*anything you need to do*);
Has anyone tried proving Z3 with Z3 itself?
Is it even possible, to prove that Z3 is correct, using Z3?
More theoretical, is it possible to prove that tool X is correct, using X itself?
The short answer is: “no, nobody tried to prove Z3 using Z3 itself” :-)
The sentence “we proved program X to be correct” is very misleading.
The main problem is: what does it mean to be correct.
In the case of Z3, one could say that Z3 is correct if, at least, it never returns “sat” for an unsatisfiable problem, and “unsat” for a satisfiable one.
This definition may be improved by also including additional properties such as: Z3 should not crash; the function X in the Z3 API has property Y, etc.
After we agree on what we are supposed to prove, we have to create models of the runtime, programming language semantics (C++ in the case of Z3), etc.
Then, a tool (aka verifier) is used to convert the actual code into a set of formulas that we should check using a theorem prover such as Z3.
We need the verifier because Z3 does not “understand” C++.
The Verifying C Compiler (VCC) is an example of this kind of tool.
Note that, proving Z3 to be correct using this approach does not provide a definitive guarantee that Z3 is really correct since our models may be incorrect, the verifier may be incorrect, Z3 may be incorrect, etc.
To use verifiers, such as VCC, we need to annotate the program with the properties we want to verify, loop invariants, etc. Some annotations are used to specify what code fragments are supposed to do. Other annotations are used to "help/guide" the theorem prover. In some cases, the amount of annotations is bigger than the program being verified. So, the process is not completely automatic.
Another problem is cost, the process would be very expensive. It would be much more time consuming than implementing Z3.
Z3 has 300k lines of code, some of this code is based on very subtle algorithms and implementation tricks.
Another problem is maintenance, we are regularly adding new features and improving performance. These modifications would affect the proof.
Although the cost may be very high, VCC has been used to verify nontrivial pieces of code such as the Microsoft Hyper-V hypervisor.
In theory, any verifier for programming language X can be used to prove itself if it is also implemented in language X.
The Spec# verifier is an example of such tool.
Spec# is implemented in Spec#, and several parts of Spec# were verified using Spec#.
Note that, Spec# uses Z3 and assumes it is correct. Of course, this is a big assumption.
You can find more information about these issues and Z3 applications on the paper:
http://research.microsoft.com/en-us/um/people/leonardo/ijcar10.pdf
No, it is not possible to prove that a nontrivial tool is correct using the tool itself. This was basically stated in Gödel's second incompleteness theorem:
For any formal effectively generated theory T including basic arithmetical truths and also certain truths about formal provability, if T includes a statement of its own consistency then T is inconsistent.
Since Z3 includes arithmetic, it cannot prove its own consistency.
Because it was mentioned in a comment above: Even if the user provides invariants, Gödels's theorem still applies. This is not a question of computability. The theorem states that no such prove can exist in a consistent system.
However you could verify parts of Z3 with Z3.
Edit after 5 years:
Actually the argument is easier than Gödel's incompleteness theorem.
Let's say Z3 is correct if it only returns UNSAT for unsatisfiable formulas.
Assume we find a formula A, such that if A is unsatisfiable then Z3 is correct (and we somehow have proven this relation).
We can give this formula to Z3, but
if Z3 returns UNSAT it could be because Z3 is correct or because of a bug in Z3. So we have not verified anything.
if Z3 returns SAT and a countermodel, we might be able to find a bug in Z3 by analyzing the model
otherwise we don't know anything.
So we can use Z3 to find bugs in Z3 and to improve confidence about Z3 (to an extremely high level), but not to formally verify it.