Obtaining representation of SMT formula as SAT formula - z3

I've come up with an SMT formula in Z3 which outputs one solution to a constraint-solving problem, using only BitVectors and IntVectors of fixed length. The logic I use for the IntVectors is only simple Presburger arithmetic (constraints of the form x[i] - x[i + 1] <= z or x[i] - x[i + 1] >= z for some x and z). I also take the sum of all of the bits in the bitvector (NOT its binary value), and constrain that sum to lie within a range [a, b].
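The bit-sum constraint described above can be sketched in SMT-LIB along these lines (a minimal, hypothetical 4-bit instance with the made-up range [1, 3]; the names v, bit3, and popcount are illustrative only):

```smt
; count the set bits of a 4-bit vector v and require the count to be in [1, 3]
(declare-const v (_ BitVec 4))
; widen each extracted bit to 3 bits so the additions cannot overflow
(define-fun bit3 ((b (_ BitVec 1))) (_ BitVec 3) ((_ zero_extend 2) b))
(define-fun popcount () (_ BitVec 3)
  (bvadd (bit3 ((_ extract 0 0) v)) (bit3 ((_ extract 1 1) v))
         (bit3 ((_ extract 2 2) v)) (bit3 ((_ extract 3 3) v))))
(assert (bvuge popcount #b001))
(assert (bvule popcount #b011))
(check-sat)
(get-model)
```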
This works perfectly. The only problem is that, since z3 always takes the easiest path towards determining satisfiability, I always get the same answer back, whereas in my domain I'd like to find a variety of substantially different solutions (I know for a fact that multiple, very different solutions exist). I'd like to use this nifty tool I found, https://bitbucket.org/kuldeepmeel/weightgen, which lets you uniformly sample a constrained space of possibilities using SAT. To use it, though, I need to convert my SMT formula into a SAT formula.
Do you know of any resources that would help me learn how to encode Presburger arithmetic and the bit-sum of a bitvector as a SAT instance? Alternatively, do you know of any SMT solver which, as an intermediate step, outputs a readable description of the problem as a SAT instance?
Many thanks!
[Edited to reflect the fact that I really do need the uniform sampling feature.]
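For the bit-sum part specifically, cardinality constraints have well-known direct CNF encodings. Below is a minimal sketch (not the only encoding, and the helper names are made up) of Sinz's sequential-counter encoding for "at most k of these literals are true"; a range [a, b] over n bits is then "at most b true" plus "at most n - a false":

```python
from itertools import product

def at_most_k(lits, k, next_var):
    """Sinz sequential-counter CNF for: at most k of `lits` are true.
    `lits` are DIMACS-style literals (negative = negated variable).
    Returns (clauses, next_var) where fresh auxiliaries start at next_var."""
    n = len(lits)
    if k >= n:
        return [], next_var
    if k == 0:
        return [[-l] for l in lits], next_var
    # s[i][j] <=> "at least j+1 of the first i+1 literals are true"
    s = [[next_var + i * k + j for j in range(k)] for i in range(n)]
    clauses = [[-lits[0], s[0][0]]]
    clauses += [[-s[0][j]] for j in range(1, k)]
    for i in range(1, n):
        clauses.append([-lits[i], s[i][0]])
        clauses.append([-s[i - 1][0], s[i][0]])
        for j in range(1, k):
            clauses.append([-lits[i], -s[i - 1][j - 1], s[i][j]])
            clauses.append([-s[i - 1][j], s[i][j]])
        # overflow: the (k+1)-th true literal is forbidden
        clauses.append([-lits[i], -s[i - 1][k - 1]])
    return clauses, next_var + n * k

def bit_sum_in_range(xs, a, b, next_var):
    """CNF forcing a <= (number of true vars in xs) <= b."""
    upper, next_var = at_most_k(xs, b, next_var)
    # "at least a true" == "at most n - a false"
    lower, next_var = at_most_k([-x for x in xs], len(xs) - a, next_var)
    return upper + lower, next_var

def brute_force_sat(clauses, fixed, num_vars):
    """Satisfiability check with some variables pre-assigned (tiny instances only)."""
    free = [v for v in range(1, num_vars + 1) if v not in fixed]
    for bits in product([False, True], repeat=len(free)):
        assign = dict(fixed)
        assign.update(zip(free, bits))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

As a sanity check, with three variables and range [1, 2] the encoding accepts exactly the assignments whose popcount is 1 or 2. The difference constraints x[i] - x[i+1] <= z can be handled the same way once the integers are bounded and bit-blasted to vectors of Booleans.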

Related

Extracting upper and/or lower bound of a numerical variable in Z3

Is it possible to extract the upper and/or lower bound of some numerical variables in Z3? Suppose there are some constraints on a numerical variable x, and the consequence of the constraints is that x must lie in the interval [x_min, x_max]. Is there a way in Z3 to extract these bounds (x_min and x_max), in case the solver calculates these values internally, without doing optimization (minimization and maximization)?
You could try to increase Z3's verbosity, maybe you can find bounds in the output.
I doubt it, though: since Z3 is ultimately a SAT solver, any numerical procedure that (tries to) decide satisfiability could be applied, but deciding satisfiability doesn't necessarily require computing (reasonable) numerical bounds.
Out of curiosity: why would you like to avoid optimisation queries?
In general, no.
The minimum/maximum optimal values of a variable x provide the tightest over-approximation of the satisfiable domain interval of x. Computing them requires enumerating all possible Boolean assignments, not just one.
The (Dual) Simplex Algorithm inside the T-Solver for linear arithmetic keeps track of bounds for all arithmetic variables. However, these bounds are only valid for the (possibly partial) Boolean assignment that is currently being constructed by the SAT engine. In early pruning calls, there is no guarantee whatsoever about the significance of these bounds: the corresponding domain for a given variable x may be an under-approximation, an over-approximation or neither (compared to the domain of x wrt. the input formula).
The Theory Combination approach implemented by an SMT solver can also affect the significance of the bounds available inside the LA-Solver. In this regard, I can vouch that Model-Based Theory Combination can be particularly nasty to deal with. With this approach, the SMT solver may not generate some interface equalities/inequalities when the T-Solvers agree on the model value of an interface variable. However, this is counterproductive when one wants to extract the valid domain of a variable x from the LA-Solver, because it can report an over-approximated interval even after a model of the input formula has been found for a given total Boolean assignment.
Unless the original problem --after preprocessing-- contains terms of the form (x op K), with op in {<, <=, =, >=, >}, for all possibly interesting values of K, it is unlikely that the SMT solver generates any valid T-lemma of this form during the search. The main exception is when x is an Int and the LIA-Solver uses splitting on demand. As a consequence, the Boolean stack is not much help in discovering bounds either; even if such literals were generated, they would only provide an under-approximation of the feasible interval of x (when they are contained in a satisfiable total Boolean assignment).
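If optimization queries turn out to be acceptable after all, Z3's built-in optimizer recovers exactly these bounds. A minimal sketch in Z3's extended SMT-LIB syntax (the constraints on x are made up for illustration):

```smt
(declare-const x Int)
(assert (and (>= x 2) (<= x 9)))  ; hypothetical constraints on x
(minimize x)                      ; yields x_min
(maximize x)                      ; yields x_max
(check-sat)
(get-objectives)
```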

Complexity of finding a solution to SMT system with quantifier

I need to find a solution to a problem whose formulas I generate using z3py. The formulas are generated depending on input from the user. During the generation of the formulas, temporary SMT variables are created that can take on only a limited set of values; e.g., if a variable is an integer, only even values may be allowed. For this case, let the temporary variables be a and b, and let their relation with the global variables x and y be defined by the predicate P(a, b, x, y).
An example generated using SMT-LIB like syntax:
(set-info :status unknown)
(declare-fun y () Int)
(declare-fun x () Int)
(assert
(forall (
(a Int) (b Int) (z Int) )
(let
(($x22 (exists ((z Int))(and (< x z) (> z y)))))
(=>
(P a b x y)
$x22))))
(check-sat)
where
z is a variable of which all possible values must be considered
a and b represent variables whose allowed values are restricted by the predicate P
the variables x and y need to be computed such that the formula is satisfied.
Questions:
Does the predicate P reduce the time needed by z3 to find a solution?
Alternatively: given that z3 performs a search over all possible values of z and a, will the predicate P reduce the size of the search space?
Note: The question was updated after remarks from Levent Erkok.
The SMT-LIB example you gave (generated or hand-written) doesn't make much sense to me. You have universal quantification over z, and inside of it you existentially quantify z again, so the whole formula seems meaningless. But perhaps that's not your point and this is just a toy. So, I'll simply ignore that.
Typically, "redundant equations" (as you put it), should not impact performance. (By redundant, I assume you mean things that are derivable from other facts you presented?) Aside: a=z in your above formula is not redundant at all.
This should be true as long as you remain in the decidable subset of the logics, which typically means linear and quantifier-free.
The issue here is that you have quantifier and in particular you have nested quantifiers. SMT-solvers do not deal well with them. (Search stack-overflow for many questions regarding quantifiers and z3.) So, if you have performance issues, the best strategy is to see if you really need them. Just by looking at the example you posted, it is impossible to tell as it doesn't seem to be stating a legitimate fact. So, see if you can express your property without quantifiers.
If you have to have quantifiers, then you are at the mercy of the e-matcher and the heuristics, and all bets are off. I've seen wild performance characteristics in that case. And if reasoning with quantifiers is your goal, then I'd argue that SMT solvers are just not the right tool for you, and you should instead use theorem provers like HOL/Isabelle/Coq etc., that have built-in support for quantifiers and higher-order logic.
If you were to post an actual example of what you're trying to have z3 prove for you, we might be able to see if there's another way to formulate it that might make it easier for z3 to handle. Without a specific goal and an example, it's impossible to opine any further on performance.
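One common way to drop the quantifiers, when the restricted variables range over a small finite set, is to instantiate them explicitly. A hypothetical sketch (P here stands in for whatever constraint the generator emits; the even values 0, 2, 4 are made up):

```smt
; instead of (assert (forall ((a Int)) (=> (in-domain a) (P a x y)))),
; enumerate the finitely many allowed values of a:
(declare-const x Int)
(declare-const y Int)
(declare-fun P (Int Int Int) Bool)
(assert (P 0 x y))
(assert (P 2 x y))
(assert (P 4 x y))
(check-sat)
```

This keeps the problem quantifier-free at the cost of formula size, which is usually a good trade for SMT solvers.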

Control the solving strategy of Z3

So let's assume I have a large problem to solve in Z3, and if I try to solve it in one take, it would take too much time. So I divide this problem into parts and solve them individually.
As a toy example, let's assume that my complex problem is to solve these 3 equations:
eq1: x>5
eq2: y<6
eq3: x+y = 10
So my question is whether it would be possible, for example, to solve eq1 and eq2 first, and then use the result to solve eq3.
(assert eq1)
(assert eq2)
(check-sat)
(assert eq3)
(check-sat)
(get-model)
This seems to work, but I'm not sure whether it makes sense performance-wise.
Would incremental solving maybe help me out here? Or is there any other feature of Z3 that I can use to partition my problem?
The problems considered are usually satisfiability problems, i.e., the goal is to find one solution (model). A solution (model) that satisfies eq1 does not necessarily satisfy eq3, so you can't just cut the problem in half. We would have to find all solutions (models) of eq1 so that we can replace x in eq3 with that set of solutions. (For example, this is what happens in Gaussian elimination after the matrix is diagonal.)
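That said, incremental solving is still useful when later constraints refine earlier ones rather than replace them: Z3 can reuse work across check-sat calls, and push/pop lets you retract assertions. A sketch for the toy example (eq1-eq3 as in the question):

```smt
(declare-const x Int)
(declare-const y Int)
(assert (> x 5))         ; eq1
(assert (< y 6))         ; eq2
(check-sat)
(push 1)
(assert (= (+ x y) 10))  ; eq3
(check-sat)              ; solver state from the first call is reused
(get-model)
(pop 1)                  ; retract eq3, keeping eq1 and eq2
```

Note that the second check-sat still solves the whole conjunction; incrementality saves repeated setup work, it does not let a model of eq1/eq2 stand in for a model of all three.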

Optimize Solver Tactics for Circuit SAT

I am using the Z3 solver with Python API to tackle a Circuit SAT problem.
It consists of many Xor expressions (with up to 21 inputs each) and three-input And expressions. Z3 is able to solve my smaller examples but does not cope with the bigger ones.
Rather than creating the Solver object with
s = Solver()
I tried to optimize the solver tactics like in
t = Then('simplify', 'symmetry-reduce', 'aig', 'tseitin-cnf', 'sat' )
s = t.solver()
I got the tactics list via describe_tactics()
Unfortunately, my attempts have not been fruitful. The default sequence of tactics seems to do a pretty good job. The tactics tutorial previously available in rise4fun is no longer accessible.
Another attempt - without visible effect - was to set the phase parameter, as I am expecting the majority of my variables to have false values. (cf related post)
set_option("sat.phase", "always-false")
What sequence of tactics is recommended for Circuit SAT problems?
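For intuition about what the tseitin-cnf tactic does with such circuits: each gate gets a fresh output variable plus clauses making that variable equivalent to the gate function. A minimal, self-contained sketch (not Z3's actual implementation; literal numbering is made up) for a binary Xor and a three-input And, using DIMACS-style literals:

```python
def tseitin_xor(a, b, out):
    """Clauses asserting out <-> (a XOR b)."""
    return [[-a, -b, -out],  # a, b     => not out
            [a, b, -out],    # !a, !b   => not out
            [a, -b, out],    # !a, b    => out
            [-a, b, out]]    # a, !b    => out

def tseitin_and3(a, b, c, out):
    """Clauses asserting out <-> (a AND b AND c)."""
    return [[-out, a], [-out, b], [-out, c],  # out => each input
            [-a, -b, -c, out]]                # all inputs => out

def evaluate(clauses, assign):
    """True iff every clause has a satisfied literal under `assign`."""
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)
```

Wide Xors are chained from binary ones with intermediate variables, which is exactly why the CNF stays linear in circuit size. Whether the aig step before tseitin-cnf helps depends on how much sharing it finds in the circuit.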

Z3: Function Expansion and Encoding in QBVF

I am trying to encode formulas with functions in Z3 and I have an encoding problem. Consider the following example:
f(x) = x + 42
g(x1, x2) = f(x1) / f(x2)
h(x1, x2) = g(x1, x2) % g(x2, x1)
k(x1, x2, x3) = h(x1, x2) - h(x2, x3)
sat( k(y1, y2, y3) == 42 && k(y3, y2, y1) == 42 * 2 && ... )
I would like my encoding to be both efficient (no expression duplication) and allow Z3 to re-use lemmas about functions across subproblems. Here is what I have tried so far:
Inline the functions for every free variable instantiation y1, y2, etc. This introduces duplication and performance is not as good as I hoped for.
Assert the function declarations with universal quantifiers. This works for very specific examples - from the solving times it seems that Z3 can (?) re-use results from previous queries that involve the same functions. However, solving times vary greatly and in many cases (1) turns out to be faster.
Use function definitions (i.e., quantifiers + the MACRO_FINDER option). If my understanding of the documentation is correct, this should expand the functions and thus should be close to (1). However, in terms of performance the results were a bit surprising (">" means faster):
For problems where (1) > (2) I get: (1) > (3) > (2)
For problems where (2) > (1) I get: (2) > (1) = (3)
I have also tried tweaking the MBQI option (and others) with most of the above. However, it is not clear which combination is best. I am using Z3 4.0.
The question is: What is the "right" way to encode the problem? Note that I only have interpreted functions (I do not really need UF). Could I use this fact for a more efficient encoding and avoid function expansion?
Thanks
I think there's no clear answer to this question. Some techniques work better for one type of benchmarks and other techniques work better for others. For the QBVF benchmarks we've looked at so far, we found macros give us the best combination of small benchmark size and small solving times, but this may not apply in this case.
Your understanding of the documentation is correct, the macro finder will identify quantifiers that look like function definitions and replace all other calls to that function with its definition. It's possible that not all of your macros are picked up or that you are using quasi-macros which aren't detected correctly, either of which could go towards explaining why the performance is sometimes worse than your (1). How much is the difference in the case that (1) > (3)? A little overhead is to be expected, but vast variations in runtime are probably due to some macros being malformed or not being detected.
In general, there is no "right" way to encode these problems. Function expansion cannot always be avoided. The trade-off is essentially between expanding eagerly (1, 3) and expanding lazily (2). There may be a correlation of the type SAT (1, 3 faster) and UNSAT (2 faster), but this is not guaranteed to be the case either.
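For concreteness, here is the difference between (2) and (3) in SMT-LIB terms, using the f from the question (a sketch; the macro finder recognizes the quantified form, while define-fun is always expanded at parse time):

```smt
; (2) lazy: uninterpreted symbol constrained by a universal axiom
(declare-fun f (Int) Int)
(assert (forall ((x Int)) (= (f x) (+ x 42))))

; (3) eager: a definition that is expanded like a macro at every call site
(define-fun f2 ((x Int)) Int (+ x 42))
```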
