Z3: How to best encode a "switch statement"? - z3

I want to create an expression that selects one of a given set of expressions. Given an array of expressions
Expr[] availableExprs = ...;
with statically known length, I want Z3 to select any one of these (like a switch statement). In case the problem is SAT I need a way to find out which of these was selected in the model (its index in the array).
What is the fastest way to encode this?
I considered these approaches so far:
Have an integer restricted to [0, arrayLength) and use ITE to select one of those expressions. The model allows me to extract this integer. Unfortunately, this introduces the integer theory to the model (which previously did not use integers at all).
Have one boolean for each possible choice. Use ITE to select an expression. Assert that exactly one of those booleans is true. This strategy does not need any special theory (I think) but the encoding might be too verbose.
Store the expressions into an array expression and read from that array using an integer. This saves the ITE chain but introduces the array theory.
Clearly, all of these work, but they all seem to have drawbacks. What is the best strategy?

If all you want is to encode is that an element v is in a finite set {e1, ..., en} (with sort U), you can do this using smtlib2 as follows:
(declare-fun v () U)
(declare-fun p1 () Bool)
...
(assert (= p1 (= v e1)))
(assert (= p2 (= v e2)))
...
(assert (= pn (= v en)))
(assert (or p1 ... pn))
The variable v will be equal to one of the elements in "the array" of {e1 ... en}. And pi must be true if the selection variable v is equal to ei. This is basically a restatement of Nikolaj's suggestion, but recast for arbitrary sorts.
Note that multiple pi may be set to true as there is no guarantee ei != ej. If you need to ensure no two elements are both selected, you will need to figure out what semantics you want. If the {e1... en} are already entailed to be distinct, you don't need to add anything. If the "array" elements must be distinct but are not yet entailed to be distinct, you can assert
(assert (distinct e1 ... en))
(This will probably be internally expanded to something quadratic in n.)
You can instead say that no 2 p variables can be true as once. Note that this is a weaker statement. To see this, suppose v = e1 = 1 and e2 = e3 = 0. Then p1 = true and p2 = p3 = false. The obvious encoding for these constraints is quadratic:
(assert (or (not pi) (not pj))) ; for all i < j
If you need a better encoding for this, try taking a look at how to encode "p1+ ... + pn <= 1" in Translating Pseudo-Boolean Constraints into SAT section 5.3. I would only try this if the obvious encoding fails though.

I think you want to make sure that each expression is quantifier free and uses only functions and predicates already present in the formula. If this is not the case then introduce a fresh propositional variable p_i for each index and assert ctx.MkIff(p_i, availableExprs[i]) to the solver.
When Z3 produces a model, use model.Eval(p_i) and check if the result is the expression "True".

Related

In Z3, I cannot understand result of quantifier elimination of Exists y. Forall x. (x>=2) => ((y>1) /\ (y<=x))

In Z3-Py, I am performing quantifier elimination (QE) over the following formulae:
Exists y. Forall x. (x>=2) => ((y>1) /\ (y<=x))
Forall x. Exists y. (x>=2) => ((y>1) /\ (y<=x)),
where both x and y are Integers. I did QE in the following way:
x, y = Ints('x, y')
t = Tactic("qe")
negS0= (x >= 2)
s1 = (y > 1)
s2 = (y <= x)
#EA
ea = Goal()
ea.add(Exists([y],Implies(negS0, (ForAll([x], And(s1,s2))))))
ea_qe = t(ea)
print(ea_qe)
#AE
ae = Goal()
ae.add(ForAll([x],Implies(negS0, (Exists([y], And(s1,s2))))))
ae_qe = t(ae)
print(ae_qe)
Result QE for ae is as expected: [[]] (i.e., True). However, as for ea, QE outputs: [[Not(x, >= 2)]], which is a results that I do not know how to interpret since (1) it has not really performed QE (note the resulting formula still contains x and indeed does not contain y which is the outermost quantified variable) and (2) I do not understand the meaning of the comma in x, >=. I cannot get the model either:
phi = Exists([y],Implies(negS0, (ForAll([x], And(s1,s2)))))
s_def = Solver()
s_def.add(phi)
print(s_def.model())
This results in the error Z3Exception: model is not available.
I think the point is as follows: since (x>=2) is an implication, there are two ways to satisfy the formula; by making the antecedent False or by satisfying the consequent. In the second case, the model would be y=2. But in the first case, the result of QE would be True, thus we cannot get a single model (as it happens with a universal model):
phi = ForAll([x],Implies(negS0, (Exists([y], And(s1,s2)))))
s_def = Solver()
s_def.add(phi)
print(s_def.model())
In any case, I cannot 'philosophically' understand the meaning of a QE of x where x is part of the (quantifier-eliminated) answer.
Any help?
There are two separate issues here, I'll address them separately.
The mysterious comma This is a common gotcha. You declared:
x, y = Ints('x, y')
That is, you gave x the name "x," and y the name "y". Note the comma after the x in the name. This should be
x, y = Ints('x y')
I guess you can see the difference: The name you gave to the variable x is "x," when you do the first; i.e., comma is part of the name. Simply skip the comma on the right hand side, which isn't what you intended anyhow. And the results will start being more meaningful. To be fair, this is a common mistake, and I wish the z3 developers ignored the commas and other punctuation in the string you give; but that's just not the case. They simply break at whitespace.
Quantification
This is another common gotcha. When you write:
ea.add(Exists([y],Implies(negS0, (ForAll([x], And(s1,s2))))))
the x that exists in negS0 is not quantified over by your ForAll, since it's not in the scope. Perhaps you meant:
ea.add(Exists([y],ForAll([x], Implies(negS0, And(s1,s2)))))
It's hard to guess what you were trying to do, but I hope the above makes it clear that the x wasn't quantified. Also, remember that a top-level exist quantifier in a formula is more or less irrelevant. It's equivalent to a top-level declaration for all practical purposes.
Once you make this fix, I think things will become more clear. If not, please ask further clarifying questions. (As a separate question on Stack-overflow; as edits to existing questions only complicate the matters.)

forall usage in SMT

How does forall statement work in SMT? I could not find information about usage. Can you please simply explain this? There is an example from
https://rise4fun.com/Z3/Po5.
(declare-fun f (Int) Int)
(declare-fun g (Int) Int)
(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (forall ((x Int))
(! (= (f (g x)) x)
:pattern ((g x)))))
(assert (= (g a) c))
(assert (= (g b) c))
(assert (not (= a b)))
(check-sat)
For general information on quantifiers (and everything else SMTLib) see the official SMTLib document:
http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf
Quoting from Section 3.6.1:
Exists and forall quantifiers. These binders correspond to the usual
universal and existential quantifiers of first-order logic, except
that each variable they quantify is also associated with a sort. Both
binders have a non-empty list of variables, which abbreviates a
sequential nesting of quantifiers. Specifically, a formula of the form
(forall ((x1 σ1) (x2 σ2) · · · (xn σn)) ϕ) (3.1) has the same
semantics as the formula (forall ((x1 σ1)) (forall ((x2 σ2)) (· · ·
(forall ((xn σn)) ϕ) · · · ) (3.2) Note that the variables in the list
((x1 σ1) (x2 σ2) · · · (xn σn)) of (3.1) are not required to be
pairwise disjoint. However, because of the nested quantifier
semantics, earlier occurrences of same variable in the list are
shadowed by the last occurrence—making those earlier occurrences
useless. The same argument applies to the exists binder.
If you have a quantified assertion, that means the solver has to find a satisfying instance that makes that formula true. For a forall quantifier, this means it has to find a model that the assertion is true for all assignments to the quantified variables of the relevant sorts. And likewise, for exists the model needs to be able to exhibit a particular value satisfying the assertion.
Top-level exists quantifiers are usually left-out in SMTLib: By skolemization, declaring a top-level variable fills that need, and also has the advantage of showing up automatically in models. (That is, any top-level declared variable is automatically existentially quantified.)
Using forall will typically make the logic semi-decidable. So, you're likely to get unknown as an answer if you use quantifiers, unless some heuristic can find a satisfying assignment. Similarly, while the syntax allows for nested quantifiers, most solvers will have very hard time dealing with them. Patterns can help, but they remain hard-to-use to this day. To sum up: If you use quantifiers, then SMT solvers are no longer decision-procedures: They may or may not terminate.
If you are using the Python interface for z3, also take a look at: https://ericpony.github.io/z3py-tutorial/advanced-examples.htm. It does contain some quantification examples that can clarify things for you. (Even if you don't use the Python interface, I heartily recommend going over that page to see what the capabilities are. They more or less translate to SMTLib directly.)
Hope that gets you started. Stack-overflow works the best if you ask specific questions, so feel free to ask clarification on actual code as you need.
Semantically, a quantifier forall x: T . e(x) is equivalent to e(x_1) && e(x_2) && ..., where the x_i are all the values of type T. If T has infinitely many (or statically unknown many) values, then it is intuitively clear that an SMT solver cannot simply turn a quantifier into the equivalent conjunction.
The classical approach in this case are patterns (also called triggers), pioneered by Simplify and available in Z3 and others. The idea is rather simple: users annotate a quantifier with a syntactical pattern that serves a heuristic for when (and how) to instantiate the quantifier.
Here is an example (in pseudo-code):
assume forall x :: {foo(x)} foo(x) ==> false
Here, {foo(x)} is the pattern, indicating to the SMT solver that the quantifier should be instantiated whenever the solver gets a ground term foo(something). For example:
assume forall x :: {foo(x)} foo(x) ==> 0 < x
assume foo(y)
assert 0 < y
Since the ground term foo(y) matches the trigger foo(x) when the quantified variable x is instantiated with y, the solver will instantiate the quantifier accordingly and learn 0 < y.
Patterns and quantfier triggering is difficult, though. Consider this example:
assume forall x :: {foo(x)} (foo(x) || bar(x)) ==> 0 < y
assume bar(y)
assert 0 < y
Here, the quantifier won't be instantiated because the ground term bar(y) does not match the chosen pattern.
The previous example shows that patterns can cause incompletenesses. However, they can also cause termination problems. Consider this example:
assume forall x :: {f(x)} !f(x) || f(f(x))
assert f(y)
The pattern now admits a matching loop, which can cause nontermination. Ground term f(y) allows to instantiate the quantifier, which yields the ground term f(f(y)). Unfortunately, f(f(y)) matches the trigger (instantiate x with f(y)), which yields f(f(f(y))) ...
Patterns are dreaded by many and indeed tricky to get right. On the other hand, working out a triggering strategy (given a set of quantifiers, find patterns that allow the right instantiations, but ideally not more than these) ultimately "only" required logical reasoning and discipline.
Good starting points are:
* https://rise4fun.com/Z3/tutorial/, section "Quantifiers"
* http://moskal.me/smt/e-matching.pdf
* https://dl.acm.org/citation.cfm?id=1670416
* http://viper.ethz.ch/tutorial/, section "Quantifiers"
Z3 also has offers Model-based Quantifier Instantiation (MBQI), an approach to quantifiers that doesn't use patterns. As far as I know, it is unfortunately also much less well documented, but the Z3 tutorial has a short section on MBQI as well.

Clarification SMT types; bug in variable truncation?; allowing Bool*Int

I have the following questions:
A. If I have a variable x which I declare it (case 1) to be Bool (case 2) to be Int and assert x=0 or x=1 (case 3) to be Int and assert 0<=x<=1. What happens in Z3, or more generally in SMT, in each of the cases? Is, (case 1) desired since things happen at SAT and not SMT level? Do you have some reference paper here?
B. I have the following statement in Python (for using Z3 Python API)
tmp.append(sum([self.a[i * self.nrVM + k] * componentsRequirements[i][1] for i
in range(self.nrComp)]) <= self.MemProv[k])
where a is declared as a Z3 Int variable (0 or 1), componentsRequirements is a Python variable which is to be considered float, self.MemProv is declared as a Z3 Real variable.
The strange thing is that for float values for componentsRequirements, e.g 0.5, the constraint built by the solver considers it 0 and not 0.5. For example in:
(assert (<= (to_real (+ 0 (* 0 C1_VM2)
(* 0 C2_VM2)
(* 2 C3_VM2)
(* 2 C4_VM2)
(* 3 C5_VM2)
(* 1 C6_VM2)
(* 0 C7_VM2)
(* 2 C8_VM2)
(* 1 C9_VM2)
(* 2 C10_VM2)))
StorageProv2))
the 0 above should be actually 0.5. When changing a to a Real value then floats are considered, but this is admitted by us because of the semantics of a. I tried to use commutativity between a and componentsRequirements but with no success.
C. If I'd like to use x as type Bool and multiply it by an Int, I get, of course, an error (Bool*Real/Int not allowed), in Z3. Is there a way to overcome this problem but keeping the types of both multipliers? An example in this sense is the above (a - Bool, componentsRequirements - Real/Int):
a[i * self.nrVM + k] * componentsRequirements[i][1]
Thanks
Regarding A
Patrick gave a nice explanation for this one. In practice, you really have to try and see what happens. Different problems might exhibit different behavior due to when/how heuristics kick in. I don't think there's a general rule of thumb here. My personal preference would be to keep booleans as booleans and convert as necessary, but that's mostly from a programming/maintenance perspective, not an efficiency one.
Regarding B
Your "division" getting truncated problem is most likely the same as this: Retrieve a value in Z3Py yields unexpected result
You can either be explicit and write 1.0/2.0 etc.; or use:
from __future__ import division
at the top of your program.
Regarding C
SMTLib is a strongly typed language; you cannot multiply by a bool. You have to convert it to a number first. Something like (* (ite b 1 0) x), where b is your boolean. Or you can write a custom function that does this for you and call that instead; something like:
(define-fun mul_bool_int ((b Bool) (i Int)) Int (ite b i 0))
(assert (= 0 (mul_bool_int false 5)))
(assert (= 5 (mul_bool_int true 5)))
I honestly can't answer for z3 specifically, but I want to state my opinion on point A).
In general, the constraint x=0 v x=1 is abstracted into t1 v t2, where t1 is x=0 and t2 is x=1, at the SAT engine level.
Thus, the SAT engine might try to assign both t1 and t2 to true during the construction of a satisfiable truth assignment for the input formula. It is important to note that this is a contradiction in the theory of LAR, but the SAT engine is not capable of such reasoning. Therefore, the SAT engine might continue its search for a while after taking this decision.
When the LAR Solver is finally invoked, it will detect the LAR-unsatisfiability of the given (possibly partial) truth assignment. As a consequence, it (should) hand the clause not t1 or not t2 to the SAT engine as learning clause so as to block the bad assignment of values to be generated again.
If there are many of such bad assignments, it might require multiple calls to the LAR Solver so as to generate all blocking clauses that are needed to rectify the SAT truth assignment, possibly up to one for each of those bad combinations.
Upon receiving the conflict clause, the SAT engine will have to backjump to the place where it made the bad assignment and do the right decision. Obviously, not all the work done by the SAT engine since it made the bad assignment of values gets wasted: any useful clause that was learned in the meanwhile will certainly be re-used. However, this can still result in a noticeable performance hit on some formulas.
Compare all of this back-and-forth computation being wasted with simply declaring x as a Bool: now the SAT engine knows that it can only assign one of two values to it, and won't ever assign x and not x contemporarily.
In this specific situation, one might mitigate this problem by providing the necessary blocking clauses right into the input formula. However, it is not trivial to conclude that this is always the best practice: explicitly listing all known blocking clauses might result in an explosion of the formula size in some situations, and into an even worse degradation of performance in practice. The best advice is to try different approaches and perform an experimental evaluation before settling for any particular encoding of a problem.
Disclaimer: it is possible that some SMT solvers are smarter than others, and automatically generate appropriate blocking clauses for 0/1 variables upon parsing the formula or avoid this situation altogether through other means (i.e. afaik, Model-Based SMT solvers shouldn't incur in the same problem)

partial assignments in Z3

I have a Boolean formula (format: CNF) whose satisfiablity I check using the Z3 SAT solver. I am interested in obtaining partial assignments when the formula is satisfiable. I tried model.partial=true on a simple formula for an OR gate and did not get any partial assignment.
Can you suggest how this can be done? I have no constraints on the assignment other than that it be partial.
Z3's partial model mode are just for models of functions. For propositional formulas there is no assurance that models use a minimal number of assignments. On the contrary, the default mode of SAT solvers is that they find complete assignments.
Suppose you are interested in a minimized number of literals, such that the conjunction of the literals imply the formula. You can use unsat cores to get such subsets. The idea is that you first find a model of your formula F as a conjunction of literals l1, l2, ..., ln. Then given that this is a model of F we have that l1 & l2 & ... & ln & not F is unsatisfiable.
So the idea is to assert "not F" and check satisfiability of "not F" modulo assumptions l1, l2, .., ln. Since the result is unsat, you can query Z3 to retrieve an unsat core among l1, l2, .., ln.
From python what you would do is to create two solver objects:
s1 = Solver()
s2 = Solver()
Then you add F, respectively, Not(F):
s1.add(F)
s2.add(Not(F))
then you find a reduced model for F using both solvers:
is_Sat = s1.check()
if is_Sat != sat:
# do something else, return
m = s1.model()
literals = [sign(m, m[idx]()) for idx in range(len(m)) ]
is_sat = s2.check(literals)
if is_Sat != unsat:
# should not happen
core = s2.unsat_core()
print core
where
def sign(m, c):
val = m.eval(c)
if is_true(val):
return c
else if is_false(val):
return Not(c)
else:
# should not happen for propositional variables.
return BoolVal(True)
There are of course other ways to get reduced set of literals. A partial cheap way is to eagerly evaluate each clause and add literals from the model until each clause is satisfied by at least one literal in the model. In other words, you are looking for a minimal hitting set. You would have to implement this outside of Z3.

Avoiding quantifiers in Z3

I am experimenting with Z3 where I combine the theories of arithmetic, quantifiers and equality. This does not seem to be very efficient, in fact it seems to be more efficient to replace the quantifiers with all instantiated ground instances when possible. Consider the following example, in which I have encoded the unique names axiom for a function f that takes two arguments of sort Obj and returns an interpreted sort S. This axiom states that each unique list of arguments to f returns a unique object:
(declare-datatypes () ((Obj o1 o2 o3 o4 o5 o6 o7 o8)))
(declare-sort S 0)
(declare-fun f (Obj Obj) S)
(assert (forall ((o11 Obj) (o12 Obj) (o21 Obj) (o22 Obj))
(=>
(not (and (= o11 o21) (= o12 o22)))
(not (= (f o11 o12) (f o21 o22))))))
Although this is a standard way of defining such an axiom in logic, implementing it like this is computationally very expensive. It contains 4 quantified variables, which each can have 8 values. This means that this results in 8^4 = 4096 equalities. It takes Z3 0.69s and 2016 quantifier instantiations to prove this. When I write a simple script that generates the instances of this formula:
(assert (distinct (f o1 o1) (f o1 o2) .... (f o8 o7) (f o8 o8)))
It takes 0.002s to generate these axioms, and another 0.01s (or less) to prove it in Z3. When we increase the objects in the domain, or the number of arguments to the function f this different increases rapidly, and the quantified case quickly becomes unfeasible.
This makes me wonder: when we have a bounded domain, why would we use quantifiers in Z3 in the first place? I know that SMT uses heuristics to find solutions, but I get the feeling that it still cannot compete in efficiency with a simple domain-specific grounder that feeds the grounded instances to SMT, which is then nothing more than SAT solving. Is my intuition correct?
Your intuition is correct. The heuristics for handling quantifiers in Z3 are not tuned for problems where universal variables range over finite/bounded domains.
In this kind of problem, using quantifiers is a good option only if a very small percentage of the instances are needed to show that a problem is unsatisfiable.
I usually suggest that users should expand this quantifiers using the programmatic API.
Here a two related posts. They contain links to Python code that implements this approach.
Does Z3 take a longer time to give an unsat result compared to a sat result?
Quantifier Vs Non-Quantifier
Here is one of the code fragments:
VFunctionAt = Function('VFunctionAt', IntSort(), IntSort(), IntSort())
s = Solver()
s.add([VFunctionAt(V,S) >= 0 for V in range(1, 5) for S in range(1, 9)])
print s
In this example, I'm essentially encoding forall V in [1,4] S in [1,8] VFunctionAt(V,S) >= 0.
Finally, your encoding (assert (distinct (f o1 o1) (f o1 o2) .... (f o8 o7) (f o8 o8)) is way more compact than expanding the quantifier 4096 times. However, even if we use a naive encoding (just expand the quantifier 4096 times), it is stil faster to solve the expanded version.

Resources