In "Extending Sledgehammer with SMT Solvers" I found this quote:
Certificates make it possible to store Z3 proofs alongside Isabelle formalizations, allowing SMT proof replay without Z3. Only if the formalizations change must the certificates be regenerated.
What does a Z3 certificate look like? Is it just some sort of balanced tree where the inference steps obtained in Z3 are stored?
A certificate is simply the proof produced by Z3. Here is an example (taken from the file SMT_Examples.certs you can find in the Isabelle distribution):
23f5eb3b530a4577da2f8947333286ff70ed557f 11 0
unsat
((set-logic AUFLIA)
(proof
(let (($x29 (exists ((?v0 A$) )(! (g$ ?v0) :qid k!7))
))
(let (($x30 (f$ $x29)))
(let (($x31 (=> $x30 true)))
(let (($x32 (not $x31)))
(let ((#x42 (trans (monotonicity (rewrite (= $x31 true)) (= $x32 (not true))) (rewrite (= (not true) false)) (= $x32 false))))
(mp (asserted $x32) #x42 false))))))))
A Z3 proof is, in essence, a proof tree with false as its conclusion, not a balanced tree. The reconstruction and the proof format are described in a paper by Sascha Böhme.
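To make the tree structure concrete, here is a minimal Python sketch that reads the proof term from the certificate above as a plain S-expression and peels off the let-bindings to reach the final inference step. The helper names (tokenize, parse, unwrap_lets) are made up for this illustration and are not part of Isabelle or Z3; the sketch only inspects the shape of the term and does not check the proof.

```python
import re

# Proof term copied from the certificate above (without the outer wrapper).
PROOF = """
(proof
(let (($x29 (exists ((?v0 A$) )(! (g$ ?v0) :qid k!7))
))
(let (($x30 (f$ $x29)))
(let (($x31 (=> $x30 true)))
(let (($x32 (not $x31)))
(let ((#x42 (trans (monotonicity (rewrite (= $x31 true)) (= $x32 (not true))) (rewrite (= (not true) false)) (= $x32 false))))
(mp (asserted $x32) #x42 false)))))))
"""

def tokenize(text):
    """Split S-expression text into parentheses and atoms."""
    return re.findall(r"[()]|[^\s()]+", text)

def parse(tokens):
    """Recursively turn a token list into nested Python lists (hypothetical helper)."""
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return node
    return tok

def unwrap_lets(term):
    """Collect the names bound by nested lets and return them with the innermost body."""
    names = []
    while isinstance(term, list) and term and term[0] == "let":
        names.extend(binding[0] for binding in term[1])
        term = term[2]
    return names, term

tree = parse(tokenize(PROOF))      # ["proof", <let-term>]
names, root = unwrap_lets(tree[1])
print("bound names:", names)
print("final rule:", root[0], "| conclusion:", root[-1])
```

The let-bindings name shared subterms and subproofs once and reuse them, so the proof is a term with sharing; the last step is the mp application whose conclusion is false.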
Note that Sledgehammer itself has nothing to do with certificates. Whenever you have an smt call (whether you wrote it by hand or used Sledgehammer to produce it), you can use certificates. However, I don't know of anyone doing it.
Is it possible to get assertion names inside Z3 (version 4.8.9) proofs?
As a minimal example:
(set-option :produce-proofs true)
(assert (! false :named name))
(check-sat)
(get-proof)
I would like to have the following output:
unsat
((proof (asserted name)))
However, this is the actual output:
unsat
((proof (asserted false)))
Is it possible to have the proof refer to the assertion names instead of the actual formulas?
By experimenting, I found that adding (set-option :unsat-core true) makes the assertion names appear in the proof.
However, this makes the proof more complicated. With the option set, the output is:
unsat
((proof
(let (($x27 (not name)))
(let ((#x30 (mp (asserted (=> name false)) (rewrite (= (=> name false) $x27)) $x27)))
(unit-resolution #x30 (asserted name) false)))))
Also, I am not sure whether enabling proof and unsat-core generation simultaneously is allowed. In https://github.com/Z3Prover/z3/issues/189#issuecomment-129786093 NikolajBjorner states:
Z3 doesn't really support simultaneous proof and core generation, ...
This is extremely unlikely. Names are pretty much only used in unsat-core generation. Proof objects in z3 remain more or less a black-box. However, if you restrict yourself to only bit-vectors, it might work better. See: https://link.springer.com/chapter/10.1007/978-3-642-25379-9_15
This paper is also quite relevant in this context: http://homepage.divms.uiowa.edu/~ajreynol/lpar15.pdf. In short, BV-solvers usually reduce the problem to propositional reasoning, and thus resolution-style proofs that can be easily checked by external tools are much easier to construct. But for other logics, and especially when quantifiers are involved, the proof steps can be rather opaque and replay will be much more difficult.
I'm using the Z3 theorem prover as a backend of a compiler to verify that function calls respect their contracts. However, Z3 appears to be stuck when confronted with solving seemingly simple existential queries.
I'm using Z3 version 4.8.5 - 64 bit (in Linux 5.0). I understand that the SMT solver is not complete for first order logic (as soon as quantifiers are involved), but still I would have expected the following to work.
This is a minimal example showing the problem, which does not terminate:
(declare-datatypes ()
((Term (structure (constructor Int) (arguments TermList)))
(TermList empty (cons (head Term) (tail TermList)))))
(assert
(forall ((A TermList) (B Term))
(implies
(= A (cons B empty))
(exists ((C Term))
(= A (cons C empty))))))
(check-sat)
Is this a well known bug or limitation of Z3?
Are there any reasonable alternatives to represent this query in such a way that Z3 can handle it?
These sorts of problems are just not suitable for SMT solvers. There have been many queries along these lines; here are some of the most relevant ones:
Creating a transitive and not reflexive function in Z3
parthood definition in Z3
max element in Z3 Seq Int
Long story short, use a more powerful system to conduct such proofs, one that uses SMT solvers as proof tools under the hood. You'll have to do some manual "guiding", but the tactic languages of theorem provers are by now developed well enough to discharge most goals of this form automatically for you. (See this paper for some Isabelle-specific details: https://people.mpi-inf.mpg.de/~jblanche/frocos2011-dis-proof.pdf )
Is it possible to have Z3 serialise a proof for some assertion, and replay the proof on later invocations instead of running a proof-search again? I know Z3 can output counter-examples for unsat, but can it provide proofs for models that are sat?
Terminology note: Z3 (and SAT/SMT solvers in general) output models for sat, and proofs for unsat.
Proof generation is actually an SMT-Lib feature. See page 56 of http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf
And Z3 indeed supports it, here's the simplest example:
(set-option :produce-proofs true)
(declare-fun a () Bool)
(assert (= a (not a)))
(check-sat)
(get-proof)
Z3 says:
unsat
((proof
(mp (asserted (= a (not a))) (rewrite (= (= a (not a)) false)) false)))
The format is solver-specific. The SMTLib document says:
(get-proof) asks the solver for a proof of unsatisfiability for the
set of all formulas in the current context. The command can be issued
only if the most recent check command had an empty set of assumptions.
The solver responds by printing a refutation proof on its regular
output channel. The format of the proof is solver-specific. The
only requirement is that, like all responses, it be a member of
s_expr.
As far as I know, there's no "public" switch to tell Z3 to read this proof back and do anything with it. It wouldn't surprise me, however, if they had internal tools to consume this output.
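If the goal is only to avoid repeating the proof search for an unchanged input, one workaround is to store the solver's output keyed by a hash of the script. The Python sketch below is an assumption-laden illustration: cached_check and fake_solver are hypothetical names, fake_solver stands in for an actual Z3 invocation, and the cache simply trusts the stored text. Nothing here checks or replays the proof.

```python
import hashlib
import tempfile
from pathlib import Path

def cached_check(script_text, cache_dir, run_solver):
    """Return the solver output for script_text, reusing a stored copy if one exists."""
    key = hashlib.sha256(script_text.encode("utf-8")).hexdigest()
    entry = Path(cache_dir) / (key + ".txt")
    if entry.exists():
        return entry.read_text(encoding="utf-8")
    output = run_solver(script_text)          # only reached on a cache miss
    entry.write_text(output, encoding="utf-8")
    return output

calls = []

def fake_solver(text):
    """Stand-in for running Z3; returns the output shown in the example above."""
    calls.append(text)
    return "unsat\n((proof\n(mp (asserted (= a (not a))) (rewrite (= (= a (not a)) false)) false)))"

SCRIPT = (
    "(set-option :produce-proofs true)\n"
    "(declare-fun a () Bool)\n"
    "(assert (= a (not a)))\n"
    "(check-sat)\n"
    "(get-proof)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    first = cached_check(SCRIPT, tmp, fake_solver)                 # miss: solver runs
    second = cached_check(SCRIPT, tmp, fake_solver)                # hit: solver skipped
    third = cached_check(SCRIPT + "(exit)\n", tmp, fake_solver)    # changed input: solver runs again

print(len(calls))
```

Any change to the script changes the hash, so the stored result is regenerated. Unlike genuine proof replay, the stored output is never validated.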
Replaying in a theorem prover
The Isabelle theorem prover can read Z3's proofs back and replay them internally to construct the corresponding proof. This is probably closer to what you are looking for. Here's a paper that describes this work: http://www21.in.tum.de/~boehmes/proofrec.pdf. Of course, precisely which logics are supported and whether the connection is actively maintained is a different question! You might find the "related work" section of that paper quite helpful.
Officially, there is no trig support in Z3. For example, see this question, or this one. However, there are undocumented trigonometric operators in Z3 -- they are used for example in the regression tests. There is even a built-in symbol called pi. Z3 can even do some trivial proofs with these operators, e.g.:
(declare-fun x () Real)
(assert (= (cos pi) x))
(check-sat)
(get-value (x))
Comes back with:
sat
((x (- 1.0)))
These operators do not work well. For example, this little input file will cause a seg fault with Z3 4.4.1, or cause a rapid explosion in memory usage with the master branch as of this commit (now):
(declare-fun x () Real)
(assert (< (sin x) -1.0))
(check-sat)
I'm not surprised that an undocumented feature that the team says doesn't exist doesn't work. My question is: is it possible to fix these operators? What level of performance would justify adding them to Z3? I know that I can do a number of trigonometric proofs with Z3 using uninterpreted functions plus trigonometric identities. Is there any interest in this among the Z3 team?
Thanks, Z3 should not crash in such cases. It should be more graceful about handling these operations. I have now checked in a fix for this, 9b91e6f..cb29c07.
OTOH, there is essentially no theory reasoning for such operators.
For example, Z3 does not know the bounds for sin. You would have to axiomatize such properties yourself. Z3 returns "unknown" (or "unsat", but not "sat") when you use the built-in functions that have no (partial) decision procedure support.
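As a rough illustration of axiomatizing such a property yourself, the Python sketch below generates an SMT-LIB script that declares an uninterpreted function and asserts a range axiom for it. The function name my_sin and the helper bounded_function_script are assumptions made for this example, not Z3 built-ins. The sketch only produces text, and how a given Z3 version handles the quantified axiom is not guaranteed.

```python
def bounded_function_script(name, lower, upper, query):
    """Build an SMT-LIB script with an uninterpreted Real -> Real function and a range axiom."""
    lines = [
        "(declare-fun x () Real)",
        f"(declare-fun {name} (Real) Real)",
        # User-supplied axiom: every value of the function lies in [lower, upper].
        f"(assert (forall ((t Real)) (and (<= {lower} ({name} t)) (<= ({name} t) {upper}))))",
        f"(assert {query})",
        "(check-sat)",
    ]
    return "\n".join(lines)

# Same query as the crashing example above, but against the axiomatized function.
script = bounded_function_script("my_sin", "(- 1.0)", "1.0", "(< (my_sin x) (- 1.0))")
print(script)
```

The printed script can be fed to the solver as a file. With the bound stated explicitly, the query contradicts the axiom, which is the kind of fact Z3 does not know about the built-in sin.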
I have several SMTLIB2 examples which z3 normally finds unsat in 10s of milliseconds, yet, when I add in a request for it to generate unsat cores, the check-sat keeps going for minutes without returning. Is this behaviour to be expected? Does requesting unsat cores do more than just switch on instrumentation recording dependencies, and change which procedures and options z3 runs with? Is it possible to set further options so I see the same behaviour when I'm using unsat core generation as I see when I'm not using it?
I'm using Z3 4.3.1 (stable branch) on Scientific Linux 6.3.
The examples are in AUFNIRA, though several involve no reals and probably are not non-linear.
Thanks,
Paul.
The unsat cores are tracked using "answer literals" (aka assumptions).
When we enable unsat core extraction and use assertions such as
(assert (! (= x 10) :named a1))
Z3 will internally create a fresh Boolean variable for the name a1, and assert
(assert (=> a1 (= x 10)))
When check-sat is invoked, it assumes all these auxiliary variables are true. That is, Z3 tries to show the problem is unsat/sat modulo these assumptions. For satisfiable instances, it will terminate as usual with a model. For unsatisfiable instances, it will terminate whenever it generates a lemma that contains only these assumed Boolean variables. The lemma is of the form (or (not a_i1) ... (not a_in)) where the a_i's are a subset of the assumed Boolean variables.
As far as I know, this technique was introduced by the MiniSAT solver. It is described here (Section 3). I really like it because it is simple to implement and we essentially get unsat core generation for free.
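The encoding can be tried out on a toy propositional problem. The Python sketch below guards each named assertion with a fresh selector variable and then looks for a subset of selectors that is already inconsistent. Note the substitution: brute-force enumeration and deletion-based shrinking stand in for the conflict analysis a real solver uses to derive the final lemma, and the variable numbering and function names are invented for this example.

```python
from itertools import product

def satisfiable(clauses, assumptions, num_vars):
    """Brute force: is there an assignment satisfying all clauses with every assumption true?"""
    for bits in product([False, True], repeat=num_vars):
        value = lambda lit: bits[abs(lit) - 1] == (lit > 0)
        if all(value(a) for a in assumptions) and all(
            any(value(lit) for lit in clause) for clause in clauses
        ):
            return True
    return False

def unsat_core(clauses, assumptions, num_vars):
    """Drop every assumption that is not needed to keep the problem unsatisfiable."""
    core = list(assumptions)
    for a in list(assumptions):
        trial = [b for b in core if b != a]
        if not satisfiable(clauses, trial, num_vars):
            core = trial
    return core

# Variables: p = 1, q = 2; selectors a1 = 3, a2 = 4, a3 = 5.
NAMES = {3: "a1", 4: "a2", 5: "a3"}
clauses = [
    [-3, 1],    # a1 => p
    [-4, -1],   # a2 => (not p)
    [-5, 2],    # a3 => q
]
assumptions = [3, 4, 5]

is_sat = satisfiable(clauses, assumptions, 5)
core = [NAMES[a] for a in unsat_core(clauses, assumptions, 5)]
print(is_sat, core)
```

Here the core consists of a1 and a2, which corresponds to the lemma (or (not a1) (not a2)); a3 plays no part in the conflict.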
However, this approach has some disadvantages. First, some preprocessing steps are not applicable anymore. If we just assert
(assert (= x 10))
Z3 will replace x with 10 everywhere. We say Z3 is performing "value propagation". This preprocessing step is not applied if the assertion is of the form
(assert (=> a1 (= x 10)))
This is just one example; many other preprocessing steps are affected.
During solving time, some of the simplification steps are also disabled.
If we inspect the Z3 source file smt_context.cpp we will find code such as:
void context::simplify_clauses() {
// Remark: when assumptions are used m_scope_lvl >= m_search_lvl > m_base_lvl. Therefore, no simplification is performed.
if (m_scope_lvl > m_base_lvl)
return;
...
}
The condition (m_scope_lvl > m_base_lvl) is always true when "answer literals"/assumptions are used.
So, when we enable unsat core generation, we may significantly affect performance. It seems that nothing is really for free :)