Satisfying Models under Tseitin Encoding - z3

I am using the following code fragment in z3 4.0 to convert a formula to CNF.
(set-logic QF_UF)
(set-option :produce-models true)
; ------ snip -------
;
; declarations,
; and assert statement
; of "original" formula
; here.
;
; ------ snap -------
(apply
  (then (! simplify :elim-and true)
        tseitin-cnf))
I get something like the following:
(goals
(goal
; ------ snip -------
;
; Lots of lines here
;
; ------ snap -------
:precision precise :depth 2)
)
I was assuming that each of the expressions that follows goal is one clause of the CNF, i.e., all those expressions should be conjoined to yield the actual formula. I will refer to this conjunction as the "encoded" formula.
Obviously, the original formula and the encoded formula are not equivalent, as the encoded formula contains new variables k!0, k!1, ... which do the Tseitin encoding. However, I was expecting that they are equisatisfiable, or actually that they are satisfied by the same models (when disregarding the k!i variables).
I.e., I was expecting that (encoded formula) AND (NOT original formula) is unsatisfiable. Unfortunately, this does not seem to be the case; I have a counterexample where this check actually returns sat.
Is this a bug in z3, am I using it wrong, or are any of my assumptions not valid?

This is a bug in the new tseitin-cnf tactic. I fixed the bug, and the fix will be available in the next release (Z3 4.1). In the meantime, you can work around the bug by using two rounds of simplification.
That is, use
(apply
(then (! simplify :elim-and true)
(! simplify :elim-and true)
tseitin-cnf))
instead of
(apply
(then (! simplify :elim-and true)
tseitin-cnf))
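The property the question expects from a correct encoding can be sanity-checked by brute force on a small formula. The sketch below is plain Python rather than Z3: it hand-builds a Tseitin-style CNF for the illustrative formula (a AND b) OR c, with k0 standing in for a fresh variable like k!0. The formula and all names are assumptions made up for this example, not output of the tseitin-cnf tactic.

```python
from itertools import product

def original(a, b, c):
    # Illustrative "original" formula: (a AND b) OR c
    return (a and b) or c

def encoded(a, b, c, k0):
    # Hand-written Tseitin-style CNF with fresh variable k0 <-> (a AND b)
    clauses = [
        (not k0) or a,             # k0 -> a
        (not k0) or b,             # k0 -> b
        k0 or (not a) or (not b),  # (a AND b) -> k0
        k0 or c,                   # top-level clause: k0 OR c
    ]
    return all(clauses)

# Models of (encoded AND NOT original); a correct encoding admits none.
counterexamples = [
    (a, b, c, k0)
    for a, b, c, k0 in product([False, True], repeat=4)
    if encoded(a, b, c, k0) and not original(a, b, c)
]

# Conversely, every model of the original extends to a model of the encoding.
extendable = all(
    any(encoded(a, b, c, k0) for k0 in (False, True))
    for a, b, c in product([False, True], repeat=3)
    if original(a, b, c)
)

print(counterexamples)  # []
print(extendable)       # True
```

An empty counterexample list corresponds to the unsat answer the question expected from (encoded formula) AND (NOT original formula).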

Related

Solving predicate calculus problems with Z3 SMT

I'd like to use Z3 to solve problems that are most naturally expressed in terms of atoms (symbols), sets, predicates, and first order logic. For example (in pseudocode):
A = {a1, a2, a3, ...} # A is a set
B = {b1, b2, b3...}
C = {c1, c2, c3...}
def p = (a:A, b:B, c:C) -> Bool # p is unspecified predicate
def q = (a:A, b:B, c:C) -> Bool
# Predicates can be defined in terms of other predicates:
def teaches = (a:A, b:B) -> there_exists c:C
such_that [ p(a, b, c) OR q(a, b, c) ]
constraint1 = forall b:B there_exists a:A
such_that teaches(a, b)
solve(constraint1)
What are good ways to express atoms, sets, predicates, relations, and first order quantifiers in Z3 (or other SMTs)?
Is there a standard idiom for this? Must it be done manually? Is there perhaps a translation library (not necessarily specific to Z3) that can convert them?
I believe Alloy uses SMT to implement predicate logic and relations, but Alloy seems designed more for interactive use to explore consistency of models, rather than to find specific solutions for problems.
"Alloy seems designed more for interactive use to explore consistency of models, rather than to find specific solutions for problems."
IMHO, Alloy shines when it comes to validating your own way of thinking. You model something, and through the visualization of several instances you can sometimes come to realize that what you modeled is not exactly what you'd have hoped for.
In that sense, I agree with you.
Yet, Alloy can also be used to find specific solutions to problems. You can overload a model with constraints so that only one instance can be found (i.e. your solution).
It also works quite well when your domain space remains relatively small.
Here's your model translated to Alloy:
sig A, B, C {}
pred teaches(a: A, b: B) {
  some c: C | a->b->c in REL.q or a->b->c in REL.p
}
// I'm a bit rusty, so this is my inelegant take on defining an "undefined predicate"
one sig REL {
  q: A->B->C,
  p: A->B->C
}
fact constraint1 {
  all b: B | some a: A | teaches[a, b]
}
run {}
If you want to define the atoms in sets A, B, C yourself and refer to them in predicates, you could always over-constrain this model as follows:
abstract sig A, B, C {}
one sig A1, A2 extends A {}
one sig B1 extends B {}
one sig C1, C2, C3 extends C {}
pred teaches(a: A, b: B) {
  some c: C | a->b->c in REL.q or a->b->c in REL.p
}
one sig REL {
  q: A->B->C,
  p: A->B->C
} {
  // here you could for example define the content of p and q yourself
  q = A1->B1->C2 + A2->B1->C3
  p = A1->B1->C3 + A1->B1->C2
}
fact constraint1 {
  all b: B | some a: A | teaches[a, b]
}
run {}
Modeling predicate logic in SMTLib is indeed possible; though it might be a bit cumbersome compared to a regular theorem prover like Isabelle/HOL etc. And interpreting the results can require a fair amount of squinting.
Having said that, here's a direct encoding of your sample problem using SMTLib:
(declare-sort A)
(declare-sort B)
(declare-sort C)
(declare-fun q (A B C) Bool)
(declare-fun p (A B C) Bool)
(assert (forall ((b B))
          (exists ((a A))
            (exists ((c C)) (or (p a b c) (q a b c))))))
(check-sat)
(get-model)
A few notes:
declare-sort creates an uninterpreted sort. It's essentially a non-empty set of values. (Can be infinite as well, there are no cardinality assumptions made, aside from the fact that it's not empty.) For your specific problem, it doesn't seem to matter what this sort actually is since you didn't use any of its elements directly. If you do so, you might also want to try a "declared" sort, i.e., a data-type declaration. This can be an enumeration, or something even more complicated; depending on the problem. For the current question as posed, an uninterpreted sort works just fine.
declare-fun tells the solver that there's an uninterpreted function with that name and the signature. But otherwise it neither defines it, nor constrains it in any way. You can add "axioms" about them to be more specific on how they behave.
Quantifiers are supported, as you see with forall and exists in how your constraint1 is encoded. Note that SMTLib isn't that suitable for code reuse, and one usually programs in a higher-level binding. (Bindings from C/C++/Java/Python/Scala/OCaml/Haskell etc. are provided, with similar but varying degrees of support and features.) Otherwise, it should be easy to read.
We finally issue check-sat and get-model, to ask the solver to create a universe where all the asserted constraints are satisfied. If there is one, it'll print sat and will have a model. Otherwise, it'll print unsat if there's no such universe; or it can also print unknown (or loop forever!) if it cannot decide. Quantifiers are difficult for SMT solvers to deal with, and heavy use of quantifiers will likely lead to unknown as the answer. This is an inherent consequence of first-order predicate calculus being only semi-decidable.
When I run this specification through z3, I get:
sat
(
;; universe for A:
;; A!val!1 A!val!0
;; -----------
;; definitions for universe elements:
(declare-fun A!val!1 () A)
(declare-fun A!val!0 () A)
;; cardinality constraint:
(forall ((x A)) (or (= x A!val!1) (= x A!val!0)))
;; -----------
;; universe for B:
;; B!val!0
;; -----------
;; definitions for universe elements:
(declare-fun B!val!0 () B)
;; cardinality constraint:
(forall ((x B)) (= x B!val!0))
;; -----------
;; universe for C:
;; C!val!0 C!val!1
;; -----------
;; definitions for universe elements:
(declare-fun C!val!0 () C)
(declare-fun C!val!1 () C)
;; cardinality constraint:
(forall ((x C)) (or (= x C!val!0) (= x C!val!1)))
;; -----------
(define-fun q ((x!0 A) (x!1 B) (x!2 C)) Bool
(and (= x!0 A!val!0) (= x!2 C!val!0)))
(define-fun p ((x!0 A) (x!1 B) (x!2 C)) Bool
false)
)
This takes a bit of squinting to understand fully. The first set of values tells you how the solver constructed a model for the uninterpreted sorts A, B, and C, with witness elements and cardinality constraints. You can mostly ignore this part, though it does contain useful information. For instance, it tells us that A is a set with two elements (named A!val!0 and A!val!1), as is C, and that B has only one element. Depending on your constraints, you'll get different sets of elements.
For p, we see:
(define-fun p ((x!0 A) (x!1 B) (x!2 C)) Bool
false)
This means p always is False; i.e., it's the empty set, regardless of what the arguments passed to it are.
For q we get:
(define-fun q ((x!0 A) (x!1 B) (x!2 C)) Bool
(and (= x!0 A!val!0) (= x!2 C!val!0)))
Let's rewrite this a little more simply:
q (a, b, c) = a == A0 && c == C0
where A0 and C0 are the members of the sorts A and C respectively; see the sort declarations above. So, it says q is True whenever a is A0, c is C0, and it doesn't matter what b is.
You can convince yourself that this model does indeed satisfy the constraint you wanted.
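One way to convince yourself is to replay the model by hand. The sketch below is plain Python (no Z3 involved) that hard-codes the universes and the definitions of p and q from the output above, with A!val!0 shortened to A0 and so on, and evaluates the quantified constraint by enumeration.

```python
# Universes reported by the solver (names shortened from A!val!0 etc.)
A = ["A0", "A1"]
B = ["B0"]
C = ["C0", "C1"]

def p(a, b, c):
    return False  # p is false everywhere in this model

def q(a, b, c):
    return a == "A0" and c == "C0"  # the value of b is irrelevant

def teaches(a, b):
    # there exists c in C such that p(a, b, c) or q(a, b, c)
    return any(p(a, b, c) or q(a, b, c) for c in C)

# for all b in B there exists a in A such that teaches(a, b)
constraint1 = all(any(teaches(a, b) for a in A) for b in B)
print(constraint1)  # True
```

The witness is a = A0 with c = C0, which works for the single element of B.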
To sum up: modeling these problems in z3 is indeed possible, though a bit clumsy, and heavy use of quantifiers can make the solver loop forever or return unknown. Interpreting the output can be a bit cumbersome, though you'll realize that the models follow a similar schema: first the uninterpreted sorts, and then the definitions for the predicates.
Side note
As I mentioned, programming z3 in SMTLib is cumbersome and error-prone. Here's the same program done using the Python interface:
from z3 import *
A = DeclareSort('A')
B = DeclareSort('B')
C = DeclareSort('C')
p = Function('p', A, B, C, BoolSort())
q = Function('q', A, B, C, BoolSort())
dummyA = Const('dummyA', A)
dummyB = Const('dummyB', B)
dummyC = Const('dummyC', C)
def teaches(a, b):
    return Exists([dummyC], Or(p(a, b, dummyC), q(a, b, dummyC)))
constraint1 = ForAll([dummyB], Exists([dummyA], teaches(dummyA, dummyB)))
s = Solver()
s.add(constraint1)
print(s.check())
print(s.model())
This has some of its idiosyncrasies as well, though hopefully it'll provide a starting point for your explorations should you choose to program z3 in Python. Here's the output:
sat
[p = [else -> And(Var(0) == A!val!0, Var(2) == C!val!0)],
q = [else -> False]]
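This has essentially the same info as the SMTLib output, though written slightly differently. (In this run the roles of p and q happen to be swapped; since the constraint is symmetric in p and q, that is an equally valid model.)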
Function definition style
Note that we defined teaches as a regular Python function. This is the usual style in z3py programming, as the expression it produces gets substituted as calls are made. You can also create a z3 function, like this:
teaches = Function('teaches', A, B, BoolSort())
s.add(ForAll([dummyA, dummyB],
teaches(dummyA, dummyB) == Exists([dummyC], Or(p(dummyA, dummyB, dummyC), q(dummyA, dummyB, dummyC)))))
Note that this style of definition will rely on quantifier instantiation internally, instead of the general function-definition facilities of SMTLib. So, you should prefer the Python function style in general, as it translates to "simpler" internal constructs. It is also much easier to define and use in general.
One case where you need the z3 function definition style is if the function you're defining is recursive and its termination relies on a symbolic argument. For a discussion of this, see: https://stackoverflow.com/a/68457868/936310

SMT let expression binding scope

I'm using a simple let expression to shorten my SMT formula. I want bindings to use previously defined bindings as follows, but if I swap in the commented-out line so that n refers to s, it doesn't work:
;;;;;;;;;;;;;;;;;;;;;
; ;
; This is our state ;
; ;
;;;;;;;;;;;;;;;;;;;;;
(declare-datatypes ((State 0))
(((rec
(myArray String)
(index Int))))
)
;;;;;;;;;;;;;;;;;;;;;;;;;;
; ;
; This is our function f ;
; ;
;;;;;;;;;;;;;;;;;;;;;;;;;;
(define-fun f ((in State)) State
(let (
(s (myArray in))
(n (str.len (myArray in))))
;;;;;;;;;;(n (str.len s)))
in
(rec (str.substr s 1 n) 1733))
)
I looked at the documentation here, and it's not clear whether it's indeed forbidden to have bindings refer to other (previously defined) bindings:
The whole let construct is entirely equivalent to replacing each new
parameter by its expression in the target expression, eliminating the
new symbols completely (...)
I guess it's a "shallow" replacement?
From Section 3.6.1 of http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf:
Let. The let binder introduces and defines one or more local variables in parallel. Semantically, a term of the form (let ((x1 t1) ... (xn tn)) t) (3.3) is equivalent to the term t[t1/x1, ..., tn/xn] obtained from t by simultaneously replacing each free occurrence of xi in t by ti, for each i = 1, ..., n, possibly after a suitable renaming of t's bound variables to avoid capturing any variables in t1, ..., tn. Because of the parallel semantics, the variables x1, ..., xn in (3.3) must be pairwise distinct.
Remark 3 (No sequential version of let). The language does not have a sequential version of let. Its effect is achieved by nesting lets, as in (let ((x1 t1)) (let ((x2 t2)) t)).
As indicated in Remark 3, if you want to refer to an earlier definition you have to nest the let-expressions.
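To make the parallel semantics concrete, here is a small sketch of a toy evaluator in plain Python. The term language (integers, variable names, "+" and "let" as tuples) and the evaluate function are made up for this illustration and are not part of any SMT solver. Each bound term is evaluated in the outer environment, so a binding cannot see its siblings, while nesting brings the earlier binding into scope.

```python
def evaluate(term, env):
    """Evaluate a tiny term language: ints, names, ('+', x, y), ('let', bindings, body)."""
    if isinstance(term, int):
        return term
    if isinstance(term, str):
        return env[term]  # KeyError if the variable is not in scope
    if term[0] == "+":
        return evaluate(term[1], env) + evaluate(term[2], env)
    if term[0] == "let":
        _, bindings, body = term
        # Parallel semantics: every bound term is evaluated in the OUTER env,
        # so a binding cannot refer to a sibling binding.
        new_env = dict(env)
        new_env.update({name: evaluate(t, env) for name, t in bindings})
        return evaluate(body, new_env)
    raise ValueError(f"unknown term: {term!r}")

# (let ((s 5) (n (+ s 1))) (+ s n)): n refers to sibling s, which is not in scope.
flat = ("let", [("s", 5), ("n", ("+", "s", 1))], ("+", "s", "n"))
# (let ((s 5)) (let ((n (+ s 1))) (+ s n))): nesting puts s in scope for n.
nested = ("let", [("s", 5)], ("let", [("n", ("+", "s", 1))], ("+", "s", "n")))

try:
    flat_result = evaluate(flat, {})
except KeyError as err:
    flat_result = f"unbound variable {err}"

print(flat_result)           # unbound variable 's'
print(evaluate(nested, {}))  # 11
```

Applied to the function in the question, the nested form would bind s in an outer let and n in an inner one: (let ((s (myArray in))) (let ((n (str.len s))) (rec (str.substr s 1 n) 1733))).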

Z3: How to best encode a "switch statement"?

I want to create an expression that selects one of a given set of expressions. Given an array of expressions
Expr[] availableExprs = ...;
with statically known length, I want Z3 to select any one of these (like a switch statement). In case the problem is SAT I need a way to find out which of these was selected in the model (its index in the array).
What is the fastest way to encode this?
I considered these approaches so far:
Have an integer restricted to [0, arrayLength) and use ITE to select one of those expressions. The model allows me to extract this integer. Unfortunately, this introduces the integer theory to the model (which previously did not use integers at all).
Have one boolean for each possible choice. Use ITE to select an expression. Assert that exactly one of those booleans is true. This strategy does not need any special theory (I think) but the encoding might be too verbose.
Store the expressions into an array expression and read from that array using an integer. This saves the ITE chain but introduces the array theory.
Clearly, all of these work, but they all seem to have drawbacks. What is the best strategy?
If all you want to encode is that an element v is in a finite set {e1, ..., en} (with sort U), you can do this using smtlib2 as follows:
(declare-fun v () U)
(declare-fun p1 () Bool)
...
(assert (= p1 (= v e1)))
(assert (= p2 (= v e2)))
...
(assert (= pn (= v en)))
(assert (or p1 ... pn))
The variable v will be equal to one of the elements in "the array" of {e1 ... en}. And pi must be true if the selection variable v is equal to ei. This is basically a restatement of Nikolaj's suggestion, but recast for arbitrary sorts.
Note that multiple pi may be set to true as there is no guarantee ei != ej. If you need to ensure no two elements are both selected, you will need to figure out what semantics you want. If the {e1... en} are already entailed to be distinct, you don't need to add anything. If the "array" elements must be distinct but are not yet entailed to be distinct, you can assert
(assert (distinct e1 ... en))
(This will probably be internally expanded to something quadratic in n.)
You can instead say that no two p variables can be true at once. Note that this is a weaker statement. To see this, suppose v = e1 = 1 and e2 = e3 = 0. Then p1 = true and p2 = p3 = false, so at most one p is true even though e2 and e3 are not distinct. The obvious encoding for these constraints is quadratic:
(assert (or (not pi) (not pj))) ; for all i < j
If you need a better encoding for this, try taking a look at how to encode "p1+ ... + pn <= 1" in Translating Pseudo-Boolean Constraints into SAT section 5.3. I would only try this if the obvious encoding fails though.
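As a quick check of the obvious pairwise encoding, the following plain-Python sketch generates one clause per pair i < j and verifies by enumeration that the clauses hold exactly when at most one p_i is true. The helper names at_most_one_clauses and satisfies are made up for this example.

```python
from itertools import combinations, product

def at_most_one_clauses(n):
    """Pairwise encoding: one clause (not p_i or not p_j) for every i < j."""
    return list(combinations(range(n), 2))

def satisfies(assignment, clauses):
    # Each pair (i, j) stands for the clause (not p_i) or (not p_j).
    return all((not assignment[i]) or (not assignment[j]) for i, j in clauses)

n = 4
clauses = at_most_one_clauses(n)

# The clauses hold exactly when at most one p_i is true.
equivalent = all(
    satisfies(bits, clauses) == (sum(bits) <= 1)
    for bits in product([False, True], repeat=n)
)

print(len(clauses))  # 6, i.e. n * (n - 1) / 2
print(equivalent)    # True
```

The clause count grows as n(n-1)/2, which is the quadratic blow-up mentioned above.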
I think you want to make sure that each expression is quantifier free and uses only functions and predicates already present in the formula. If this is not the case then introduce a fresh propositional variable p_i for each index and assert ctx.MkIff(p_i, availableExprs[i]) to the solver.
When Z3 produces a model, use model.Eval(p_i) and check if the result is the expression "True".

z3py: assumptions from (check-sat ...) statement

Is there a way to pass assumptions from the (check-sat ...) statement of an SMT2 formula into the solver?
Consider the following example formula stored in ex.smt2:
# cat ex.smt2
(declare-fun p () Bool)
(assert (not p))
(check-sat p)
Running z3 on it gives unsat, as expected. Now, I'd like to solve with assumptions (p) through z3py interface:
In [30]: ctx = z3.Context()
In [31]: s = z3.Solver(ctx=ctx)
In [32]: f = z3.parse_smt2_file("ex.smt2", ctx=ctx)
In [33]: s.add(f)
In [34]: s.check()
Out[34]: sat
Is there an API to get the assumptions (i.e. (p) in this example) from the parser? Or even better, to just tell the solver to solve with the assumptions read from the input file?
No, there is no such API. The parse_smt2_file API is very simple, and only provides access to the assertions in the input file. Extending this API is in the TODO list, but nobody is currently working on that.
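Since the parser does not expose them, one possible workaround is to pull the assumption symbols out of the file text yourself. The sketch below is plain Python and makes several assumptions: the helper check_sat_assumptions is hypothetical, it only handles assumptions that are plain symbols (not compound terms such as (not p)), and its comment stripping is naive and would be confused by a ';' inside a string literal or quoted symbol.

```python
import re

def check_sat_assumptions(smt2_text):
    """Return the symbols listed in each (check-sat ...) command, in order."""
    # Drop ';' comments so commented-out commands are ignored.
    text = re.sub(r";[^\n]*", "", smt2_text)
    # Match (check-sat sym1 sym2 ...) where every argument is a bare symbol.
    pattern = re.compile(r"\(\s*check-sat((?:\s+[^\s()]+)*)\s*\)")
    return [m.group(1).split() for m in pattern.finditer(text)]

example = """
(declare-fun p () Bool)
(assert (not p))
(check-sat p)
"""

print(check_sat_assumptions(example))  # [['p']]
```

With z3py, the extracted names could then be turned into Boolean constants in the same context and passed as arguments to the solver's check call, which accepts assumptions.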

Why does a query result change if I comment out an intermediate `(check-sat)` call?

While debugging an UNSAT query I noticed an interesting difference in the query status. The query structure is:
assert(...)
(push) ; commenting out either of these two calls
(check-sat) ; makes the whole query UNSAT, otherwise it is SAT
assert(...)
(check-sat) ; SAT or UNSAT depending on existence of previous call
(exit)
There are no pop calls in the query. The query that triggers this behaviour is here.
Ideas why?
Note: I don't actually need incrementality, it is for debugging purposes only. Z3 version is 3.2.
This is a bug in one of the quantifier reasoning engines. This bug will be fixed. In the meantime, you can avoid the bug by using datatypes instead of uninterpreted sorts + cardinality constraints. That is, you declare Q and T as:
(declare-datatypes () ((Q q_accept_S13 q_T0_init q_accept_S7
q_accept_S6 q_accept_S5 q_accept_S4 q_T0_S3 q_accept_S12 q_accept_S10
q_accept_S9 q_accept_all)))
(declare-datatypes () ((T t_0 t_1 t_2 t_3 t_4 t_5 t_6 t_7)))
The declarations above are essentially defining two "enumeration" types.
With these declarations, you will get a consistent answer for the second query.

Resources