Z3PY extremely slow with many variables? - z3

I have been working with the optimizer in Z3PY, and only using Z3 ints and (x < y)-like constraints in my project. It has worked really well. I have been using up to 26 variables (Z3 ints), and it takes the solver about 5 seconds to find a solution and I have maybe 100 soft constraints, at least. But now I tried with 49 variables, and it does not solve it at all (I shut it down after 1 hour).
So I made a little experiment to find out what was slowing it down, is it the amount of variables or the amount of soft constraints? It seems like the bottle neck is the amount of variables.
I created 26 Z3-ints. Then I added as hard constraints, that it should not be lower than 1 or more than 26. Also, all numbers must be unique. No other constraints was added at all.
In other words, the solution that the solver will find is a simple order [1,2,3,4,5....up to 26]. Ordered in a way that the solver finds out.
I mean this is a simple thing, there are really no constraints except those I mentioned. And the solver solves this in 0.4 seconds or something like that, fast and sufficient. Which is expected. But if I increase the amount of variables to 49 (and of course the constraints now are that it should not be lower than 1 or more than 49), it takes the solver about 1 minute to solve. That seems really slow for such a simple task? Should it be like this, anybody knows? The time complexity is really extremely increased?
(I know that I can use Solver() instead of Optimizer() for this particular experiment, and it will be solved within a second, but in reality I need it to be done with Optimizer since I have a lot of soft constraints to work with.)
EDIT: Adding some code for my example.
I declare an array with Z3 ints that I call "reqs".
The array is consisting of 26 variables in one example and 49 in the other example I am talking about.
solver = Optimize()
for i in (reqs):
solver.add(i >= 1)
for i in (reqs):
solver.add(i <= len(reqs))
d = Distinct(reqs)
solver.add(d)
res = solver.check()
print(res)

Each benchmark is unique, and it's impossible to come up with a good strategy that applies equally well in all cases. But the scenario you describe is simple enough to deal with. The performance problem comes from the fact that Distinct creates too many inequalities (quadratic in number) for the solver, and the optimizer is having a hard time dealing with them as you increase the number of variables.
As a rule of thumb, you should avoid using Distinct if you can. For this particular case, it'd suffice to impose a strict ordering on the variables. (Of course, this may not always be possible depending on your other constraints, but it seems what you're describing can benefit from this trick.) So, I'd code it like this:
from z3 import *
reqs = [Int('i_%d' % i) for i in range(50)]
solver = Optimize()
for i in reqs:
solver.add(i >= 1, i <= len(reqs))
for i, j in zip(reqs, reqs[1:]):
solver.add(i < j)
res = solver.check()
print(res)
print(solver.model())
When I run this, I get:
$ time python a.py
sat
[i_39 = 40,
i_3 = 4,
...
i_0 = 1,
i_2 = 3]
python a.py 0.27s user 0.09s system 98% cpu 0.365 total
which is pretty snippy. Hopefully you can generalize this to your original problem.

Related

How to improve binary search based optimization in Z3py

I am trying to optimize with Z3py an instance of Set Covering Problem (SCP41) based on minimize.
The results are the following:
Using
(1) I know that Z3 supports optimization (https://rise4fun.com/Z3/tutorial/optimization). Many times I get to the optimum in SCP41 and others instances, a few do not.
(2) I understand that if I use the Z3py API without the optimization module I would have to do the typical sequential search described in (Minimum and maximum values of integer variable) by #Leonardo de Moura. It never gives me results.
My approach
(3) I have tried to improve the sequential search approach by implementing a binary search similar to how it explains #Philippe in (Does Z3 have support for optimization problems), when I run my algorithm it waits and I do not get any result.
I understand that the binary search should be faster and work in this case? I also know that the instance SCP41 is something big and that many restrictions are generated and it becomes extremely combinatorial, this is my full code (Code large instance) and this is my binary search it:
def min(F, Z, M, LB, UB, C):
i = 0
s = Solver()
s.add(F[0])
s.add(F[1])
s.add(F[2])
s.add(F[3])
s.add(F[4])
s.add(F[5])
r = s.check()
if r == sat:
UB = s.model()[Z]
while int(str(LB)) <= int(str(UB)):
C = int(( int(str(LB)) + int(str(UB)) / 2))
s.push()
s.add( Z > LB, Z <= C)
r = s.check()
if r==sat:
UB = Z
return s.model()
elif r==unsat:
LB = C
s.pop()
i = i + 1
if (i > M):
raise Z3Exception("maximum not found, maximum number of iterations was reached")
return unsat
And, this is another instance (Code short instance) that I used in initial tests and it worked well in any case.
What is incorrect binary search or some concept of Z3 not applied correctly?
regards,
Alex
I don't think your problem is to do with minimization itself. If you put a print r after r = s.check() in your program, you see that z3 simply struggles to return a result. So your loop doesn't even execute once.
It's impossible to read through your program since it's really large! But I see a ton of things of the form:
Or(X250 == 0, X500 == 1)
This suggests your variables X250 X500 etc. (and there's a ton of them) are actually booleans, not integers. If that is indeed true, you should absolutely stick to booleans. Solving integer constraints is significantly harder than solving pure boolean constraints, and when you use integers to model booleans like this, the underlying solver simply explores the search space that's just unreachable.
If this is indeed the case, i.e., if you're using Int values to model booleans, I'd strongly recommend modelling your problem to get rid of the Int values and just use booleans. If you come up with a "small" instance of the problem, we can help with modeling.
If you truly do need Int values (which might very well be the case), then I'd say your problem is simply too difficult for an SMT solver to deal with efficiently. You might be better off using some other system that is tuned for such optimization problems.

(Sub)optimal way to get a legit range info when using a SMT constraint with Z3

This question is related to my previous question
Is it possible to get a legit range info when using a SMT constraint with Z3
So it seems that "efficiently" finding the maximum range info is not proper, given typical 32-bit vectors and so on. But on the other hand, I am thinking whether it is feasible to find certain "sub-maximum" range info, which hopefully becomes more efficient. Another thing is that we may want to have certain "safe" guarantee, say for all elements in the sub-maximum range, they must satisfy the constraint, but there could exist some other solutions that would satisfy the constraint as well.
I am currently exploring whether model counting technique could make sense in this setting. Any thoughts would be appreciated very much. Thanks.
General case
This is not just a question of efficiency. Consider a problem where you have two variables a and b, and a single constraint:
a != b
What's the range of b? (maximum or otherwise?)
You can say all values are legitimate. But that would be wrong, as obviously the choice of a impacts the choice of b. The more variables you have around, the more complicated the problem will become. I don't think the problem is even well defined in this case, so searching for a solution (efficient or otherwise) doesn't make much sense.
Single variable assumption
Having said that, I think you can come up with a solution if you assume there's precisely one variable in the system. (Or, alternatively, if you fix all the other variables to some predefined constants.) If you're willing to go down this path, then you can implement a binary search algorithm to find a reasonably sized range by simply proving the quantified formula
Exists([b], And(b >= minBound, b <= maxBound, Not(constraints)))
Once you get unsat for this, you have your range. So long as you get sat, you can adjust your minBound/maxBound to search within smaller ranges. In the worst case, this can turn into a linear walk, but you can "cut-down" this search by making sure you go down a significant size in each step. That could be a parameter to the whole search, depending on how large you want your intervals to be. It'll have to be a choice between trying to find a maximal range, and how long you want to spend in this search. Of course, if you cut-down too much, you can miss a big interval, but that's the cost of efficiency.
Example1 (Good case) There's a single constraint that says b != 5. Then your search will be quick and depending on which branch you'll go, you'll either find [0, 4] or [6, 255] assuming 8-bit words.
Example2 (Bad case) There's a single constraint that says b is even. Then your search will exhibit worst-case behavior, and if your "cut-down" size is 1, you'll possibly iterate 255 times before you settle down on [0, 0]; assuming z3 gives you the maximum odd number in each call.
I hope that illustrates the point. In general, though, I'd assume you'd be closer to the "good case" for practical applications and even if your cut-down size is minimal you can most likely converge in a few iterations. Of course, this entirely depends on your problem domain, but I'd expect it to hold for software analysis in general.

Updating z3 Variable or Linear Solving

I'm new to Z3, so this is likely a silly question. I'm trying to model an execution flow of a program. Doing this through manual z3 calls for now. That said, I end up trying to model something like the following:
x = 1
x += 1
Performing the following commands gives me unsat, and I understand why.
x = z3.Int('x')
s.add(x == 1)
s.add(x == x + 1)
In small scale, it might be reasonable to manually change x == 1 to x == 2. My question is, is there a way to do this in z3 where I don't have to go back and attempt to modify the variables I put into the solver? The equations obviously would get much harrier than just +1, and attempting to work through that logic manually seems error prone and sloppy.
EDIT: After adjusting my program to use SSA as suggested, it works very easily now. I did opt to keep multiple versions of the variable, but that didn't turn into too much extra work.
You can rename variables such that they are in SSA form (https://en.wikipedia.org/wiki/Static_single_assignment_form).
Also, you probably don't need to introduce names for intermediate expressions. The only Z3 variables should be program inputs or things like that.

how to set a pattern in a variable using Z3Py

I'm pretty new in Z3, but a thing that my problem could be resolved with it.
I have two variables A and B and two pattern like this:
pattern_1: 1010x11x
pattern_2: x0x01111
where 1 and 0 are the bits zero and one, and x (dont care) cold be the bit 0 or 1.
I would like to use Z3Py for check if A with the pattern_1 and B with the pattern_2 can be true at the same time.
In this case if A = 10101111 and B = 10101111 than A and B cold be true ate the same time.
Can anyone help me with this?? It is possible resolve this with Z3Py
revised answer after clarification
Here's one way you could represent those constraints. There is an operation called Extract that can be applied to bit-vector terms. It is defined as follows:
def Extract(high, low, a):
"""Create a Z3 bit-vector extraction expression."""
where high is the high bit to be extracted, low is the low bit to be extracted, and a is the bitvector. This function represents the bits of a between high and low, inclusive.
Using the Extract function you can constrain each bit of whatever term you want to check so that it matches the pattern. For example, if the seventh bit of D must be a 1, then you can write s.add(Extract(7, 7, D) == 1). Repeat this for each bit in a pattern that isn't an x.

Does Z3 have support for optimization problems

I saw in a previous post from last August that Z3 did not support optimizations.
However it also stated that the developers are planning to add such support.
I could not find anything in the source to suggest this has happened.
Can anyone tell me if my assumption that there is no support is correct or was it added but I somehow missed it?
Thanks,
Omer
If your optimization has an integer valued objective function, one approach that works reasonably well is to run a binary search for the optimal value. Suppose you're solving the set of constraints C(x,y,z), maximizing the objective function f(x,y,z).
Find an arbitrary solution (x0, y0, z0) to C(x,y,z).
Compute f0 = f(x0, y0, z0). This will be your first lower bound.
As long as you don't know any upper-bound on the objective value, try to solve the constraints C(x,y,z) ∧ f(x,y,z) > 2 * L, where L is your best lower bound (initially, f0, then whatever you found that was better).
Once you have both an upper and a lower bound, apply binary search: solve C(x,y,z) ∧ 2 * f(x,y,z) > (U - L). If the formula is satisfiable, you can compute a new lower bound using the model. If it is unsatisfiable, (U - L) / 2 is a new upper-bound.
Step 3. will not terminate if your problem does not admit a maximum, so you may want to bound it if you are not sure it does.
You should of course use push and pop to solve the succession of problems incrementally. You'll additionally need the ability to extract models for intermediate steps and to evaluate f on them.
We have used this approach in our work on Kaplan with reasonable success.
Z3 currently does not support optimization. This is on the TODO list, but it has not been implemented yet. The following slide decks describe the approach that will be used in Z3:
Exact nonlinear optimization on demand
Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals
The library for computing with infinitesimals has already been implemented, and is available in the unstable (work-in-progress) branch, and online at rise4fun.

Resources