How to bias Z3's (Python) SAT solving towards a criterion, such as 'preferring' to have more negated literals - z3

In Z3 (Python), is there any way to 'bias' the SAT search towards a 'criterion'?
A case example: I would like Z3 to obtain a model, but not just any model: if possible, give me a model that has a large number of negated literals.
Thus, for instance, if we have to satisfy A or B, a possible model is [A = True, B = True], but I would rather have received the model [A = True, B = False] or the model [A = False, B = True], since they have more False assignments.
Of course, I guess the 'criterion' must be much more concrete (say: if possible, I prefer models with at least half of the literals set to False), but I think the idea is understandable.
I do not care whether the method is native or not. Any help?

There are two main ways to handle this sort of problem in z3: either using the optimizer, or computing manually via multiple calls to the solver.
Using the optimizer
As Axel noted, one way to handle this problem is to use the optimizing solver. Your example would be coded as follows:
from z3 import *
A, B = Bools('A B')
o = Optimize()
o.add(Or(A, B))
o.add_soft(Not(A))
o.add_soft(Not(B))
print(o.check())
print(o.model())
This prints:
sat
[A = False, B = True]
You can add weights to soft-constraints, which gives a way to associate a penalty if the constraint is violated. For instance, if you wanted to make A true if at all possible, but don't care much for B, then you'd associate a bigger penalty with Not(B):
from z3 import *
A, B = Bools('A B')
o = Optimize()
o.add(Or(A, B))
o.add_soft(Not(A))
o.add_soft(Not(B), weight = 10)
print(o.check())
print(o.model())
This prints:
sat
[A = True, B = False]
The way to think about this is as follows: You're asking z3 to:
Satisfy all regular constraints. (i.e., those you put in with add)
Satisfy as many of the soft constraints as possible (i.e., those you put in with add_soft.) If a solution isn't possible that satisfies them all, then the solver is allowed to "violate" them, trying to minimize the total cost of all violated constraints, computed by summing the weights up.
When no weight is given, it is assumed to be 1. You can also group these constraints, though I doubt you need that generality.
So, in the second example, z3 violated Not(A), because doing so has a cost of 1, while violating Not(B) would've incurred a cost of 10.
Note that when you use the optimizer, z3 uses a different engine than the one it uses for regular SMT solving: In particular, this engine is not incremental. That is, if you call check twice on it (after introducing some new constraints), it'll solve the whole problem from scratch, instead of learning from the results of the first. Also, the optimizing solver is not as optimized as the regular solver (pun intended!), so it usually performs worse on straight satisfiability as well. See https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/nbjorner-nuz.pdf for details.
Manual approach
If you don't want to use the optimizer, you can also do this "manually" using the idea of tracking variables. The idea is to identify soft-constraints (i.e., those that can be violated at some cost), and associate them with tracker variables.
Here's the basic algorithm:
Make a list of your soft constraints. In the above example, they'll be Not(A) and Not(B). (That is, you'd like these to be satisfied to get negated literals, but obviously only if possible.) Call these S_i. Let's say you have N of them.
For each such constraint, create a new tracker variable, which will be a boolean. Call these t_i.
Assert N regular constraints, each of the form Implies(t_i, S_i), for each soft-constraint.
Use a pseudo-boolean constraint, of the form AtMost, to force that at most K of these tracker variables t_i are false. Then use a binary-search-like scheme to find the optimal value of K. Note that since you're using the regular solver, you can use it in incremental mode, i.e., with calls to push and pop.
For a detailed description of this technique, see https://stackoverflow.com/a/15075840/936310.
Summary
Which of the above two methods works better is problem dependent. Using the optimizer is easiest from an end-user point of view, but you're pulling in heavy machinery that you may not need, and can thus suffer a performance penalty. The second method might be faster, at the risk of more complicated (and thus error-prone!) programming. As usual, do some benchmarking to see which works best for your particular problem.

Z3py features an optimizing solver Optimize. This has a method add_soft with the following description:
Add soft constraint with optional weight and optional identifier.
If no weight is supplied, then the penalty for violating the soft constraint
is 1.
Soft constraints are grouped by identifiers. Soft constraints that are
added without identifiers are grouped by default.
A small example can be found here:
The Optimize context provides three main extensions to satisfiability checking:
o = Optimize()
x, y = Ints('x y')
o.maximize(x + 2*y) # maximizes LIA objective
u, v = BitVecs('u v', 32)
o.minimize(u + v) # minimizes BV objective
o.add_soft(x > 4, 4) # soft constraint with optional weight

Related

Some questions about incremental SAT in Z3: can it be deactivated? Which techniques are used inside?

I am still in the process of learning the guts of Z3 (Python).
It was brought to my attention that Z3 performs incremental SAT solving by default (see SAT queries are slowing down in Z3-Python: what about incremental SAT?): specifically, every time you use the s.add command (where s is a solver), it adds that clause to s without forgetting everything learned before.
First question: can non-incremental SAT solving be done in Z3? That is, somehow 'deactivate' the incremental solving. What would this mean? That we are creating a new solver for each enlarged formula?
For instance, this approach would be Z3-default-incremental:
...
phi = a_formula
s = Solver()
s.add(phi)
while s.check() == sat:
    m = s.model()
    phi = add_modelNegation(m)
    s.add(phi)  # in order not to explore the same model again
...
That is, once we get a model, we attach the negated model to the same solver.
While this one is 'forcing' Z3 to be non-incremental:
...
phi_st = a_formula
s = Solver()
s.add(phi_st)
negatedModelsStack = []
while s.check() == sat:
    m = s.model()
    phi_n = add_modelNegation(m)
    negatedModelsStack.append(phi_n)
    original_plus_negated = And(phi_st, And(negatedModelsStack))
    s = Solver()
    s.add(original_plus_negated)  # in order not to explore the same model again
...
That is, once we get a model, we attach the obtained models to a new solver.
Am I right?
On the other hand, in the attached link, the following is stated:
Compare this to, for instance, CVC4; which is by default not incremental. If you want to use CVC4 in incremental mode, you have to pass a specific command line argument
Does this mean in CVC4 you must create a new solver every time? Like in the second code?
Second question: how can I know exactly what techniques Z3 uses for incremental solving? I have been reading about incremental SAT theory and I see that one of those techniques is CDCL (http://www.cril.univ-artois.fr/~audemard/slidesMontpellier2014.pdf); is this used in Z3's incremental search?
References: In order not to inundate Stack with similar questions, which readings do you recommend for incremental SAT in general and Z3's incremental SAT in particular? Also, is the incremental SAT of Z3 similar to the ones of other solvers such as MiniSAT or PySAT?
I'm not sure why you're trying to get z3 to act in a non-incremental way. But if that's your goal, simply do not call check more than once: that's equivalent to being non-incremental. (Think of incrementality as an "additional feature." You don't have to use it. The only difference between z3 and CVC4 is that the latter requires you to tell it ahead of time that you intend to use it in an incremental fashion, while the former is incremental by default.) As an end user, you don't really need to know or care about the difference.
Side note If you start cvc4 without telling it to be incremental and call check twice, it'll complain. z3 won't. But otherwise the experience should be the same.
I don't think knowing how solvers implement incrementality is really helpful from a programming perspective. (It is, of course, paramount if you are implementing your own SMT solver.) There are many papers online on many aspects of SMT; if you want to study the topic from scratch, I recommend Daniel Kroening and Ofer Strichman's book on decision procedures: http://www.decision-procedures.org

How to check if cvxpy's solve() is successful?

This is my first time trying to use cvxpy. I have two very simple constraints:
x = cp.Variable((5, 5))
constraints = [cp.sum(x) == 1.0, 0 <= x]
The solution worked most of the time, satisfying both constraints. But sometimes the solution only satisfied the first constraint and spat out negative values. I am wondering if there is a way to have the solver indicate whether it has succeeded or not.
This kind of information is always part of some status-information filled by the solver. In cvxpy's case this is documented here:
So something like:
problem.solve()
if problem.status == 'optimal':
    ...
else:
    ...
is the usual route.
Remark:
The solver decides this status, and feasibility and optimality decisions depend on tolerances in general (floating-point math!).
Furthermore, most solvers within cvxpy are interior-point-like solvers (some are even first-order solvers) which slowly converge to some arbitrarily accurate approximate solution, such that:
your simplex constraint (sum(x) == 1) might be off (compared to 1.0) by some small epsilon like 1e-12
some non-negative variable might be negative by some small epsilon like 1e-12
This is totally normal (for these kinds of solvers; things are different when using simplex-like solvers or simplex-based crossover post-optimization). The user needs to take care, and the approach to choose usually depends on the use case, e.g. post-clipping x = np.clip(x.value, 0.0, np.inf), rounding, and so on.
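As an illustration of the post-clipping idea (plain NumPy; the numbers are made up to mimic solver output with tiny epsilon violations):

```python
import numpy as np

# Hypothetical solver output: feasible up to tiny floating-point violations
x_raw = np.array([0.2500000001, 0.25, -1e-12, 0.25, 0.2499999999])

# Post-clipping: push tiny negative entries back into the feasible region
x = np.clip(x_raw, 0.0, np.inf)

print(x.min() >= 0.0)             # no negative entries remain
print(abs(x.sum() - 1.0) < 1e-9)  # the sum constraint still holds within tolerance
```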
For me, problem.status == 'optimal' didn't work: it reported optimal even when the constraints were not met. This worked better:
result = prob.solve()
if np.isnan(result):
    print('no solution found')

Extracting upper and/or lower bound of a numerical variable in Z3

Is it possible to extract the upper and/or lower bound of some numerical variables in Z3? Suppose there are some constraints on a numerical variable x, and the effect of the constraints is that x must lie in the interval [x_min, x_max]. Is there a way in Z3 to extract these bounds (x_min and x_max), in case the solver calculates these values internally, without doing optimization (minimization and maximization)?
You could try to increase Z3's verbosity, maybe you can find bounds in the output.
I doubt it, though: since Z3 is ultimately a SAT solver, any numerical solver that (tries to) decide satisfiability could be applied, but deciding satisfiability doesn't necessarily require computing (reasonable) numerical bounds.
Out of curiosity: why would you like to avoid optimisation queries?
In general, no.
The minimum/maximum optimal values of a variable x provide the tightest over-approximation of the satisfiable domain interval of x. This requires enumerating all possible Boolean assignments, not just one.
The (Dual) Simplex Algorithm inside the T-Solver for linear arithmetic keeps track of bounds for all arithmetic variables. However, these bounds are only valid for the (possibly partial) Boolean assignment that is currently being constructed by the SAT engine. In early pruning calls, there is no guarantee whatsoever about the significance of these bounds: the corresponding domain for a given variable x may be an under-approximation, an over-approximation or neither (compared to the domain of x wrt. the input formula).
The Theory Combination approach implemented by an SMT solver can also affect the significance of the bounds available inside the LA-Solver. In this regard, I can vouch that Model-Based Theory Combination can be particularly nasty to deal with. With this approach, the SMT solver may not generate some interface equalities/inequalities when the T-Solvers agree on the model value of an interface variable. However, this is counterproductive when one wants to know from the LA-Solver the valid domain of a variable x, because it can provide an over-approximated interval even after finding a model of the input formula for a given total Boolean assignment.
Unless the original problem (after preprocessing) contains terms of the form (x [< | <= | = | >= | >] K) for all possibly interesting values of K, it is unlikely that the SMT solver generates any valid T-lemma of this form during the search. The main exception is when x is an Int and the LIA-Solver uses splitting on demand. As a consequence, the Boolean stack is not much help for discovering bounds either; even if such terms were generated, they would only provide an under-approximation of the feasible interval of x (when they are contained in a satisfiable total Boolean assignment).

z3py: What is a correct way of asserting a constraint of "something does not exist"

I want to assert a constraint that "something must not exist" in z3py. I tried using "Not(Exists(...))". A simple example is as follows: I want to find an assignment for a and b such that no such c exists.
from z3 import *
s = Solver()
a = Int('a')
b = Int('b')
c = Int('c')
s.add(a+b==5)
s.add(Not(Exists(c,And(c>0,c<5,a*b+c==10))))
print(s.check())
print(s.model())
The output is
sat
[b = 5, a = 0]
Which seems to be correct.
But when I write "Not(Exists(...))" constraint in a more complex problem, it would take hours without generating a solution.
I wonder if this is the correct and the most efficient way to assert "not exist" constraint? Or such problems with quantifiers are intrinsically hard to solve by any solver?
The way you wrote that constraint is just fine. And it is not surprising that Z3 (or any other solver) would have a hard time solving such problems as you have both quantifiers and non-linear arithmetic. Such problems are intrinsically hard to solve.
You might look into Z3's nlsat tactic, which might provide some relief here: How does Z3 handle non-linear integer arithmetic?
Or, you can try reals instead of integers, or bit-vectors (i.e., machine integers). Of course, whether you can actually use these types would depend on your problem domain. (Reals will have "fractional" values obviously, and bitvectors are subject to modular-arithmetic.)

Z3: Function Expansion and Encoding in QBVF

I am trying to encode formulas with functions in Z3 and I have an encoding problem. Consider the following example:
f(x) = x + 42
g(x1, x2) = f(x1) / f(x2)
h(x1, x2) = g(x1, x2) % g(x2, x1)
k(x1, x2, x3) = h(x1, x2) - h(x2, x3)
sat( k(y1, y2, y3) == 42 && k(y3, y2, y1) == 42 * 2 && ... )
I would like my encoding to be both efficient (no expression duplication) and allow Z3 to re-use lemmas about functions across subproblems. Here is what I have tried so far:
Inline the functions for every free variable instantiation y1, y2, etc. This introduces duplication and performance is not as good as I hoped for.
Assert the function declarations with universal quantifiers. This works for very specific examples - from the solving times it seems that Z3 can (?) re-use results from previous queries that involve the same functions. However, solving times vary greatly and in many cases (1) turns out to be faster.
Use function definitions (i.e., quantifiers + the MACRO_FINDER option). If my understanding of the documentation is correct, this should expand the functions and thus should be close to (1). However, in terms of performance the results were a bit surprising (">" means faster):
For problems where (1) > (2) I get: (1) > (3) > (2)
For problems where (2) > (1) I get: (2) > (1) = (3)
I have also tried tweaking the MBQI option (and others) with most of the above. However, it is not clear what is the best combination. I am using Z3 4.0.
The question is: What is the "right" way to encode the problem? Note that I only have interpreted functions (I do not really need UF). Could I use this fact for a more efficient encoding and avoid function expansion?
Thanks
I think there's no clear answer to this question. Some techniques work better for one type of benchmarks and other techniques work better for others. For the QBVF benchmarks we've looked at so far, we found macros give us the best combination of small benchmark size and small solving times, but this may not apply in this case.
Your understanding of the documentation is correct, the macro finder will identify quantifiers that look like function definitions and replace all other calls to that function with its definition. It's possible that not all of your macros are picked up or that you are using quasi-macros which aren't detected correctly, either of which could go towards explaining why the performance is sometimes worse than your (1). How much is the difference in the case that (1) > (3)? A little overhead is to be expected, but vast variations in runtime are probably due to some macros being malformed or not being detected.
In general, there is no "right" way to encode these problems. Function expansion cannot always be avoided. The trade-off is essentially between expanding eagerly (1, 3) and doing it lazily (2). It may be that there is a correlation of the form SAT (1, 3 faster) and UNSAT (2 faster), but this is also not guaranteed to be the case.
