I am trying to encode formulas with functions in Z3 and I have an encoding problem. Consider the following example:
f(x) = x + 42
g(x1, x2) = f(x1) / f(x2)
h(x1, x2) = g(x1, x2) % g(x2, x1)
k(x1, x2, x3) = h(x1, x2) - h(x2, x3)
sat( k(y1, y2, y3) == 42 && k(y3, y2, y1) == 42 * 2 && ... )
I would like my encoding to be both efficient (no expression duplication) and allow Z3 to re-use lemmas about functions across subproblems. Here is what I have tried so far:
(1) Inline the functions for every free variable instantiation y1, y2, etc. This introduces duplication and performance is not as good as I hoped for.
(2) Assert the function declarations with universal quantifiers. This works for very specific examples - from the solving times it seems that Z3 can (?) re-use results from previous queries that involve the same functions. However, solving times vary greatly and in many cases (1) turns out to be faster.
(3) Use function definitions (i.e., quantifiers + the MACRO_FINDER option). If my understanding of the documentation is correct, this should expand the functions and thus should be close to (1). However, in terms of performance the results were a bit surprising (">" means faster):
For problems where (1) > (2) I get: (1) > (3) > (2)
For problems where (2) > (1) I get: (2) > (1) = (3)
I have also tried tweaking the MBQI option (and others) with most of the above. However, it is not clear what is the best combination. I am using Z3 4.0.
The question is: What is the "right" way to encode the problem? Note that I only have interpreted functions (I do not really need UF). Could I use this fact for a more efficient encoding and avoid function expansion?
Thanks
I think there's no clear answer to this question. Some techniques work better for one type of benchmark and other techniques work better for others. For the QBVF benchmarks we've looked at so far, we found that macros give us the best combination of small benchmark size and short solving times, but this may not apply in this case.
Your understanding of the documentation is correct: the macro finder will identify quantifiers that look like function definitions and replace all other calls to that function with its definition. It's possible that not all of your macros are picked up, or that you are using quasi-macros which aren't detected correctly; either could help explain why the performance is sometimes worse than your (1). How large is the difference in the cases where (1) > (3)? A little overhead is to be expected, but vast variations in runtime are probably due to some macros being malformed or not being detected.
In general, there is no "right" way to encode these problems. Function expansion cannot always be avoided. The trade-off is essentially between expanding eagerly (1, 3) and doing it lazily (2). There may be a correlation with the result, i.e., SAT (1, 3 faster) versus UNSAT (2 faster), but this is not guaranteed to be the case either.
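To put a number on the duplication cost of eager expansion (1), here is a minimal sketch in plain Python, not Z3. It inlines f, g, h and k from the question as nested tuples and compares the size of the fully expanded tree with the number of distinct subterms. The tuple representation and the helper names tree_size and dag_size are made up for illustration.

```python
# Inlined versions of the functions from the question, built as nested tuples.
# A term is either a leaf (variable name or integer) or (operator, child, child).
def f(x):
    return ("+", x, 42)

def g(x1, x2):
    return ("/", f(x1), f(x2))

def h(x1, x2):
    return ("%", g(x1, x2), g(x2, x1))

def k(x1, x2, x3):
    return ("-", h(x1, x2), h(x2, x3))

def tree_size(term):
    """Count nodes when every repeated subterm is stored again."""
    if not isinstance(term, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in term[1:])

def dag_size(term):
    """Count distinct subterms, i.e. nodes when identical subterms are shared."""
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        if isinstance(node, tuple):
            for child in node[1:]:
                visit(child)

    visit(term)
    return len(seen)

expr = k("y1", "y2", "y3")
print(tree_size(expr), dag_size(expr))
```

For k(y1, y2, y3) the expanded tree has 31 nodes but only 14 distinct subterms, so how much inlining costs depends a lot on whether identical subterms are shared.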
This is my first time trying to use cvxpy. I have 2 very simple constraints:
import cvxpy as cp
x = cp.Variable((5, 5))
constrains = [cp.sum(x) == 1.0, 0 <= x]
The solution worked most of the time, satisfying both constraints. But sometimes the solution only satisfied the first constraint and spat out negative values. I am wondering if there is a way to have the solver indicate whether it has succeeded or not.
This kind of information is always part of some status-information filled by the solver. In cvxpy's case this is documented here:
So something like:
problem.solve()
if problem.status == 'optimal':
    ...
else:
    ...
is the usual route.
Remark:
The solver decides this, and feasibility and optimality decisions depend on tolerances in general (floating-point math!).
Furthermore, most solvers within cvxpy are interior-point like solvers (some even first-order based solvers) which slowly converge to some arbitrarily accurate approximate solution such that:
your simplex-constraint (sum(x) == 1) might be off (compared to 1.0) by some small epsilon like 1e-12
some non-negative variable might be negative by some small epsilon like 1e-12
This is totally normal (for these kinds of solvers; things are different when using simplex-like solvers or simplex-based crossover post-opt). The user needs to take care, and the approach chosen usually depends on the use case, e.g. post-clipping with x = np.clip(x.value, 0.0, np.inf), rounding and so on.
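As a rough sketch of that last point, the check below accepts a solution only if both constraints hold up to a tolerance, and then clips the tiny negative entries. It is plain Python on nested lists so that it runs without cvxpy or NumPy. The function names and the 1e-8 default tolerance are assumptions chosen for illustration, not cvxpy settings.

```python
def within_tolerance(values, tol=1e-8):
    """Return True if sum(values) == 1 and values >= 0 both hold up to tol."""
    flat = [v for row in values for v in row]
    sum_ok = abs(sum(flat) - 1.0) <= tol
    nonneg_ok = min(flat) >= -tol
    return sum_ok and nonneg_ok

def clip_nonnegative(values):
    """Replace negative entries by 0.0, like np.clip(x, 0.0, np.inf)."""
    return [[max(v, 0.0) for v in row] for row in values]

# A made-up solution that is slightly negative in one entry.
x = [[0.5, -1e-12], [0.5, 0.0]]
if within_tolerance(x):
    x = clip_nonnegative(x)
print(x)
```

With cvxpy, something like x.value.tolist() could be passed in after solve(). How tight the tolerance should be depends on the solver and the use case.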
For me, problem.status == 'optimal' didn't work: it said it was optimal when the constraints were not met. This worked better:
result = prob.solve()
if np.isnan(result):
    print('no solution found')
I need to find a solution to a problem by generating formulas with z3py. The formulas are generated depending on the user's input. During the generation of the formulas, temporary SMT variables are created that can take on only a limited number of values, e.g., if a variable is an integer, only even values are allowed. For this case, let the temporary variables be a and b; their relation to the global variables x and y is defined by the predicate P(a,b,x,y).
An example generated using SMT-LIB like syntax:
(set-info :status unknown)
(declare-fun y () Int)
(declare-fun x () Int)
(assert
  (forall ((a Int) (b Int) (z Int))
    (let (($x22 (exists ((z Int)) (and (< x z) (> z y)))))
      (=> P(a, b, x, y)
          $x22))))
(check-sat)
where
z is a variable for which all possible values must be considered
a and b are variables whose allowed values are restricted by the predicate P
x and y are the variables whose values must be computed so that the formula is satisfied.
Questions:
Does the predicate P reduce the time needed by z3 to find a solution?
Alternatively: given that z3 searches over all possible values for z and a, will the predicate P reduce the size of the search space?
Note: The question was updated after remarks from Levent Erkok.
The SMTLib example you gave (generated or hand-written) doesn't make much sense to me. You have universal quantification over x and z, and inside of that you existentially quantify z again, and the whole formula seems meaningless. But perhaps that's not your point and this is just a toy. So, I'll simply ignore that.
Typically, "redundant equations" (as you put it), should not impact performance. (By redundant, I assume you mean things that are derivable from other facts you presented?) Aside: a=z in your above formula is not redundant at all.
This should be true as long as you remain in the decidable subset of the logics, which typically means linear and quantifier-free.
The issue here is that you have quantifiers, and in particular nested quantifiers. SMT solvers do not deal well with them. (Search Stack Overflow for the many questions regarding quantifiers and z3.) So, if you have performance issues, the best strategy is to see if you really need them. Just by looking at the example you posted, it is impossible to tell, as it doesn't seem to be stating a legitimate fact. So, see if you can express your property without quantifiers.
If you have to have quantifiers, then you are at the mercy of the e-matcher and the heuristics, and all bets are off. I've seen wild performance characteristics in that case. And if reasoning with quantifiers is your goal, then I'd argue that SMT solvers are just not the right tool for you, and you should instead use theorem provers like HOL/Isabelle/Coq etc., that have built-in support for quantifiers and higher-order logic.
If you were to post an actual example of what you're trying to have z3 prove for you, we might be able to see if there's another way to formulate it that might make it easier for z3 to handle. Without a specific goal and an example, it's impossible to opine any further on performance.
I've come up with an SMT formula in Z3 which outputs one solution to a constraint solving problem using only BitVectors and IntVectors of fixed length. The logic I use for the IntVectors is only simple Presburger arithmetic (of the form (x[i] - x[i + 1] <=/>= z) for some x and z). I also take the sum of all of the bits in the bitvector (NOT the binary value), and set that value to be within a range of [a, b].
This works perfectly. The only problem is that, as z3 works by always taking the easiest path towards determining satisfiability, I always get the same answer back, whereas in my domain I'd like to find a variety of substantially different solutions (I know for a fact that multiple, very different solutions exist). I'd like to use this nifty tool I found https://bitbucket.org/kuldeepmeel/weightgen, which lets you uniformly sample a constrained space of possibilities using SAT. To use this though, I need to convert my SMT formula into a SAT formula.
Do you know of any resources that would help me learn how to perform Presburger arithmetic and adding the bits of a bitvector as a SAT instance? Alternatively, do you know of any SMT solver which as an intermediate step outputs a readable description of the problem as a SAT instance?
Many thanks!
[Edited to reflect the fact that I really do need the uniform sampling feature.]
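For the "sum of the bits within [a, b]" part, one textbook option is the naive (binomial) cardinality encoding, sketched below in plain Python. This is only an illustration under assumptions: the function names are made up, the bits are numbered 1..n as DIMACS variables, and the encoding blows up combinatorially for larger n, so it is not necessarily what an SMT solver does internally.

```python
from itertools import combinations, product

def popcount_range_cnf(n, a, b):
    """CNF clauses stating that between a and b of the variables 1..n are true."""
    variables = range(1, n + 1)
    clauses = []
    # At most b true: every group of b + 1 variables contains a false one.
    for subset in combinations(variables, b + 1):
        clauses.append([-v for v in subset])
    # At least a true: every group of n - a + 1 variables contains a true one.
    for subset in combinations(variables, n - a + 1):
        clauses.append(list(subset))
    return clauses

def to_dimacs(n, clauses):
    """Render the clauses in DIMACS CNF format."""
    lines = [f"p cnf {n} {len(clauses)}"]
    lines += [" ".join(str(lit) for lit in clause) + " 0" for clause in clauses]
    return "\n".join(lines)

def satisfies(assignment, clauses):
    """Check a tuple of booleans (variable i is assignment[i - 1]) against the CNF."""
    return all(
        any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
        for clause in clauses
    )

def check_by_brute_force(n, a, b):
    """Compare the encoding with a direct bit count on every assignment."""
    clauses = popcount_range_cnf(n, a, b)
    return all(
        satisfies(bits, clauses) == (a <= sum(bits) <= b)
        for bits in product([False, True], repeat=n)
    )

print(to_dimacs(4, popcount_range_cnf(4, 1, 2)))
print(check_by_brute_force(4, 1, 2))
```

The difference constraints on the integer vector would still need their own encoding, for example by representing each integer with a fixed number of bits.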
I would like to prove properties of expressions involving matrices and vectors (potentially large size, but size is fixed).
For example I want to prove that the outcome of an expression is a diagonal matrix or a triangular matrix, or it is positive definite, ...
To that end I'd like to encode well-known properties and identities from linear algebra, such as:
||x + y|| <= ||x|| + ||y||
(A * B) * C = A * (B * C)
det(A * B) = det(A) * det(B)
Tr(zA) = z * Tr(A)
(I + AB) ^ (-1) = I - A(I + BA) ^ (-1) * B
...
I have attempted to implement this in Z3. But even for simple properties it returns unknown or times out. I've tried with array theory and quantifiers.
I'd like to know whether this problem can be solved with an SMT solver, or whether SMT solvers are not suited to this kind of problem. Could you give a hint by giving a small example?
You can certainly use Z3 to do this.
I have constructed a small example here, which defines the identity matrix and what it means to be a diagonal matrix, and then proves that the identity matrix is diagonal.
So, it is definitely possible to do this kind of work in Z3. Though you may find you have a better time using a tool built on top of Z3 that has more interactive proving features, such as Dafny or F*.
So let's assume I have a large problem to solve in Z3, and if I try to solve it in one go, it would take too much time. So I divide this problem into parts and solve them individually.
As a toy example, let's assume that my complex problem is to solve these 3 equations:
eq1: x>5
eq2: y<6
eq3: x+y = 10
So my question is whether for example it would be possible to solve eq1 and eq2 first. And then using the result solve eq3.
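; assumption: x and y are integers
(declare-const x Int)
(declare-const y Int)
(assert (> x 5))          ; eq1
(assert (< y 6))          ; eq2
(check-sat)
(assert (= (+ x y) 10))   ; eq3
(check-sat)
(get-model)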
This seems to work, but I'm not sure whether it makes sense performance-wise.
Would incremental solving maybe help me out there? Or is there any other feature of z3 that I can use to partition my problem?
The problems considered are usually satisfiability problems, i.e., the goal is to find one solution (model). A solution (model) that satisfies eq1 does not necessarily satisfy eq3, thus you can't just cut the problem in half. We would have to find all solutions (models) for eq1 so that we can replace x in eq3 with that (set of) solutions. (For example, this is what happens in Gaussian elimination after the matrix is diagonal.)
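The toy example can be used to see this concretely. The sketch below is plain Python rather than Z3: it enumerates candidates by brute force, assuming for illustration that x and y are integers in the range 0..10, and the helper name first_model is made up.

```python
from itertools import product

# The three constraints from the question.
def eq1(x, y):
    return x > 5

def eq2(x, y):
    return y < 6

def eq3(x, y):
    return x + y == 10

DOMAIN = range(0, 11)  # assumed search range, for illustration only

def first_model(constraints):
    """Return the first (x, y) in DOMAIN x DOMAIN satisfying all constraints."""
    for x, y in product(DOMAIN, repeat=2):
        if all(c(x, y) for c in constraints):
            return (x, y)
    return None

partial = first_model([eq1, eq2])    # one model of eq1 and eq2 only
full = first_model([eq1, eq2, eq3])  # a model of the whole problem

print(partial, eq3(*partial))
print(full)
```

Here the first model found for eq1 and eq2 is (6, 0), which violates eq3, while (6, 4) satisfies all three. A single model of a subproblem cannot simply be carried over to the rest.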