(Semi-decidable) combination of first-order theories is possible in Z3, but what about an actual semantic/signature-wise combination?

Disclaimer: This is a rather theoretical question, but I think it fits here; if not, let me know an alternative :)
Z3 seems expressive
Recently, I realized I can specify this type of formulae in Z3:
Exists x,y::Integer s.t. [Exists i::Integer s.t. (0<=i<|seq|) & (avg(seq)+t<seq[i])] & (y<Length(seq)) & (y<x)
Here is the code (in Python):
from z3 import *
# Average function
IntSeqSort = SeqSort(IntSort())
sumArray = RecFunction('sumArray', IntSeqSort, IntSort())
sumArrayArg = FreshConst(IntSeqSort)
RecAddDefinition( sumArray
                , [sumArrayArg]
                , If(Length(sumArrayArg) == 0
                    , 0
                    , sumArrayArg[0] + sumArray(SubSeq(sumArrayArg, 1, Length(sumArrayArg) - 1))
                    )
                )
def avgArray(arr):
    return ToReal(sumArray(arr)) / ToReal(Length(arr))
###The specification
t = Int('t')
y = Int('y')
x = Int('x')
i = Int('i') #Has to be declared, even if it is only used in the Existential
seq = Const('seq', SeqSort(IntSort()))
avg_seq = avgArray(seq)
phi_0 = And(2<t, t<10)
phi_1 = And(0 <= i, i< Length(seq))
phi_2 = (t+avg_seq<seq[i])
big_vee = And([phi_0, phi_1, phi_2])
phi = Exists(i, big_vee)
phi_3 = (y<Length(seq))
phi_4 = (y>x)
union = And([phi, phi_3, phi_4]) # use the quantified phi, not its unquantified body big_vee
phiTotal = Exists([x,y], union)
s = Solver()
s.add(phiTotal)
print(s.check())
#s.model()
solve(phiTotal) #prettier display
We can see how it outputs sat and outputs models.
But...
However, even if this expressivity is useful (at least for me), there is something I am missing: formalization.
I mean, I am combining first-order theories that have different signatures and semantics: a sequence-like theory, an integer arithmetic theory and also an (uninterpreted?) function avg. Thus, I would like to combine these theories with a Nelson-Oppen-like procedure, but this procedure only works with quantifier-free fragments.
I guess this combined theory is semi-decidable (because of quantifiers and because of sequences), but can we formalize it? If so, I would like to combine these theories in a correct way, but I have no idea how.
An exercise (as an orientation)
Thus, in order to understand this, I proposed a simpler exercise to myself: take the decidable array property fragment (What's decidable about arrays? http://theory.stanford.edu/~arbrad/papers/arrays.pdf), which has a particular set of formulae and signature.
Now, suppose I want to add an avg function to it. How can I do it?
Do I have to somehow combine the array property fragment with some kind of recursive function theory and an integer theory? How would I do this? Note that these theories involve quantifiers.
Do I have to first combine these theories and then create a decision procedure for the combined theory? How?
Maybe it suffices with creating a decision procedure within the array property fragment?
Or maybe it suffices with a syntactic adding to the signature?
Also, is the theory array property fragment with an avg function still decidable?

A non-answer answer: Start by reading Chapter 10 of https://www.decision-procedures.org/toc/
A short answer: Unless your theory supports quantifier-elimination, SMT solvers won't have a decision-procedure. Assuming they all admit quantifier-elimination, then you can use Nelson-Oppen. Adding functions like avg does not add significantly to expressive power: they are definitions that are "unfolded" as needed, and so long as you don't need induction, they're more or less conveniences. (This is a very simplified account, of course. In practice, you'll most likely need induction for any interesting property.)
If these are your concerns, it's probably best to move to more expressive systems than push-button solvers. Start looking at Lean: Sure, it's not push-button, but it provides a very nice framework for general purpose theorem proving: https://leanprover.github.io
An even longer answer is possible, but Stack Overflow isn't the right forum for it. You're now looking into the theory of decision procedures and theorem-proving, something that won't fit into any answer in any internet-based forum. (Though https://cstheory.stackexchange.com might be better, if you want to give it a try.)

Related

How to retrieve the optimization problem class

Is there any way to show what class of optimization problem we have? E.g. SOCP, SDP, or totally nonconvex?
Or do we just need to try a bunch of solvers and see when they fail? Is SNOPT the only one that supports nonconvex optimization?
You can refer to GetProgramType, which returns whether the program is an LP, QP, SDP, etc.
Is SNOPT the only one that supports nonconvex optimization?
No, we also have IPOPT and NLOpt for non-convex nonlinear optimization.
Or do we just need to try a bunch of solvers and see when they fail?
You can use ChooseBestSolver to find the best-suited solver to the program.
from pydrake.solvers import mathematicalprogram as mp
# Make a program.
prog = mp.MathematicalProgram()
x = prog.NewContinuousVariables(2, "x")
prog.AddLinearConstraint(x[0] + x[1] == 0)
prog.AddLinearConstraint(2*x[0] - x[1] == 1)
# Find the best solver.
solver_id = mp.ChooseBestSolver(prog)
solver = mp.MakeSolver(solver_id)
assert solver.solver_id().name() == "Linear system"
# Solve.
result = solver.Solve(prog)
There is also a mp.Solve function that does all of that at once.
Is SNOPT the only one that supports nonconvex optimization?
The documentation has the full list of solvers. At the moment SNOPT, Ipopt, NLopt are the available non-linear solvers.

Z3PY extremely slow with many variables?

I have been working with the optimizer in Z3PY, and only using Z3 ints and (x < y)-like constraints in my project. It has worked really well. I have been using up to 26 variables (Z3 ints), and it takes the solver about 5 seconds to find a solution and I have maybe 100 soft constraints, at least. But now I tried with 49 variables, and it does not solve it at all (I shut it down after 1 hour).
So I made a little experiment to find out what was slowing it down: is it the number of variables or the number of soft constraints? It seems the bottleneck is the number of variables.
I created 26 Z3 ints. Then I added, as hard constraints, that no value should be lower than 1 or more than 26. Also, all numbers must be unique. No other constraints were added at all.
In other words, the solution the solver finds is some permutation of [1, 2, 3, ..., 26], in whatever order the solver picks.
This is a simple task; there are really no constraints except those I mentioned. The solver solves it in about 0.4 seconds, which is fast and expected. But if I increase the number of variables to 49 (and of course the constraints now say the values should not be lower than 1 or more than 49), it takes the solver about 1 minute. That seems really slow for such a simple task. Is this expected? Does the solving time really grow that sharply?
(I know that I can use Solver() instead of Optimizer() for this particular experiment, and it will be solved within a second, but in reality I need it to be done with Optimizer since I have a lot of soft constraints to work with.)
EDIT: Adding some code for my example.
I declare an array with Z3 ints that I call "reqs".
The array is consisting of 26 variables in one example and 49 in the other example I am talking about.
solver = Optimize()
for i in (reqs):
    solver.add(i >= 1)
for i in (reqs):
    solver.add(i <= len(reqs))
d = Distinct(reqs)
solver.add(d)
res = solver.check()
print(res)
Each benchmark is unique, and it's impossible to come up with a good strategy that applies equally well in all cases. But the scenario you describe is simple enough to deal with. The performance problem comes from the fact that Distinct creates too many inequalities (quadratic in number) for the solver, and the optimizer is having a hard time dealing with them as you increase the number of variables.
As a rule of thumb, you should avoid using Distinct if you can. For this particular case, it'd suffice to impose a strict ordering on the variables. (Of course, this may not always be possible depending on your other constraints, but it seems what you're describing can benefit from this trick.) So, I'd code it like this:
from z3 import *
reqs = [Int('i_%d' % i) for i in range(50)]
solver = Optimize()
for i in reqs:
    solver.add(i >= 1, i <= len(reqs))
for i, j in zip(reqs, reqs[1:]):
    solver.add(i < j)
res = solver.check()
print(res)
print(solver.model())
When I run this, I get:
$ time python a.py
sat
[i_39 = 40,
i_3 = 4,
...
i_0 = 1,
i_2 = 3]
python a.py 0.27s user 0.09s system 98% cpu 0.365 total
which is pretty snappy. Hopefully you can generalize this to your original problem.
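To make the "quadratic in number" remark concrete, here is a small plain-Python count (no Z3 involved) of how many pairwise x_i != x_j constraints are needed to state that n variables are all different, next to the n - 1 constraints of a strict ordering. That Z3 expands Distinct exactly this way internally is an assumption, and the helper name is made up for illustration.

```python
from itertools import combinations

def pairwise_disequalities(n):
    """Number of x_i != x_j constraints needed to say n variables are all different."""
    return sum(1 for _ in combinations(range(n), 2))

for n in (26, 49):
    pairs = pairwise_disequalities(n)
    print(n, "variables:", pairs, "pairwise constraints vs", n - 1, "ordering constraints")
```

Going from 26 to 49 variables raises the pair count from 325 to 1176, while the ordering encoding only grows from 25 to 48 constraints.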

How to improve binary search based optimization in Z3py

I am trying to optimize with Z3py an instance of Set Covering Problem (SCP41) based on minimize.
The results are the following:
Using
(1) I know that Z3 supports optimization (https://rise4fun.com/Z3/tutorial/optimization). I often reach the optimum for SCP41 and other instances, but for a few I do not.
(2) I understand that if I use the Z3py API without the optimization module, I have to do the typical sequential search described in (Minimum and maximum values of integer variable) by @Leonardo de Moura. It never gives me results.
My approach
(3) I have tried to improve on the sequential search by implementing a binary search similar to the one @Philippe explains in (Does Z3 have support for optimization problems). When I run my algorithm, it hangs and I do not get any result.
Shouldn't the binary search be faster and work in this case? I also know that the SCP41 instance is fairly big, that many constraints are generated, and that it becomes extremely combinatorial. This is my full code (Code large instance) and this is my binary search:
def min(F, Z, M, LB, UB, C):
    i = 0
    s = Solver()
    s.add(F[0])
    s.add(F[1])
    s.add(F[2])
    s.add(F[3])
    s.add(F[4])
    s.add(F[5])
    r = s.check()
    if r == sat:
        UB = s.model()[Z]
    while int(str(LB)) <= int(str(UB)):
        C = int(( int(str(LB)) + int(str(UB)) / 2))
        s.push()
        s.add( Z > LB, Z <= C)
        r = s.check()
        if r==sat:
            UB = Z
            return s.model()
        elif r==unsat:
            LB = C
        s.pop()
        i = i + 1
        if (i > M):
            raise Z3Exception("maximum not found, maximum number of iterations was reached")
    return unsat
And, this is another instance (Code short instance) that I used in initial tests and it worked well in any case.
What is incorrect: the binary search itself, or some Z3 concept that I have not applied correctly?
regards,
Alex
I don't think your problem is to do with minimization itself. If you put a print r after r = s.check() in your program, you see that z3 simply struggles to return a result. So your loop doesn't even execute once.
It's impossible to read through your program since it's really large! But I see a ton of things of the form:
Or(X250 == 0, X500 == 1)
This suggests your variables X250 X500 etc. (and there's a ton of them) are actually booleans, not integers. If that is indeed true, you should absolutely stick to booleans. Solving integer constraints is significantly harder than solving pure boolean constraints, and when you use integers to model booleans like this, the underlying solver simply explores the search space that's just unreachable.
If this is indeed the case, i.e., if you're using Int values to model booleans, I'd strongly recommend modelling your problem to get rid of the Int values and just use booleans. If you come up with a "small" instance of the problem, we can help with modeling.
If you truly do need Int values (which might very well be the case), then I'd say your problem is simply too difficult for an SMT solver to deal with efficiently. You might be better off using some other system that is tuned for such optimization problems.

how to set a pattern in a variable using Z3Py

I'm pretty new to Z3, but I think my problem could be solved with it.
I have two variables A and B and two patterns like this:
pattern_1: 1010x11x
pattern_2: x0x01111
where 1 and 0 are the bits one and zero, and x (don't care) could be either 0 or 1.
I would like to use Z3Py to check whether A with pattern_1 and B with pattern_2 can be true at the same time.
In this case, if A = 10101111 and B = 10101111, then A and B could be true at the same time.
Can anyone help me with this? Is it possible to solve this with Z3Py?
revised answer after clarification
Here's one way you could represent those constraints. There is an operation called Extract that can be applied to bit-vector terms. It is defined as follows:
def Extract(high, low, a):
    """Create a Z3 bit-vector extraction expression."""
where high is the high bit to be extracted, low is the low bit to be extracted, and a is the bitvector. This function represents the bits of a between high and low, inclusive.
Using the Extract function you can constrain each bit of whatever term you want to check so that it matches the pattern. For example, if bit 7 of D (counting from bit 0 as the least significant bit) must be a 1, then you can write s.add(Extract(7, 7, D) == 1). Repeat this for each bit in a pattern that isn't an x.

Does Z3 have support for optimization problems

I saw in a previous post from last August that Z3 did not support optimizations.
However it also stated that the developers are planning to add such support.
I could not find anything in the source to suggest this has happened.
Can anyone tell me if my assumption that there is no support is correct or was it added but I somehow missed it?
Thanks,
Omer
If your optimization has an integer valued objective function, one approach that works reasonably well is to run a binary search for the optimal value. Suppose you're solving the set of constraints C(x,y,z), maximizing the objective function f(x,y,z).
1. Find an arbitrary solution (x0, y0, z0) to C(x,y,z).
2. Compute f0 = f(x0, y0, z0). This will be your first lower bound.
3. As long as you don't know any upper bound on the objective value, try to solve the constraints C(x,y,z) ∧ f(x,y,z) > 2 * L, where L is your best lower bound (initially f0, then whatever you found that was better). Note that the doubling only makes progress when L is positive.
4. Once you have both an upper and a lower bound, apply binary search: solve C(x,y,z) ∧ 2 * f(x,y,z) > (U + L). If the formula is satisfiable, you can compute a new lower bound using the model. If it is unsatisfiable, (U + L) / 2 is a new upper bound.
Step 3 will not terminate if your problem does not admit a maximum, so you may want to bound it if you are not sure it does.
You should of course use push and pop to solve the succession of problems incrementally. You'll additionally need the ability to extract models for intermediate steps and to evaluate f on them.
We have used this approach in our work on Kaplan with reasonable success.
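The steps above can be sketched in plain Python, with the solver hidden behind a callback. Everything named here is hypothetical: better_than(bound) stands in for "push, assert f > bound, check, read f from the model, pop", returning the objective value of a solution that beats bound, or None if there is none. One deliberate deviation from step 3: the sketch grows the probe by a doubling step added to L rather than doubling L itself, so that it also makes progress when L is zero or negative.

```python
from typing import Callable, Optional

def maximize(better_than: Callable[[int], Optional[int]],
             f0: int, max_probes: int = 64) -> int:
    """Maximize an integer objective given only a 'find something better' oracle."""
    lower = f0      # best objective value actually achieved so far
    upper = None    # a bound such that no solution has f > upper
    step = 1
    # Phase 1: look for an upper bound by probing ever further above lower.
    for _ in range(max_probes):
        probe = lower + step
        found = better_than(probe)
        if found is None:
            upper = probe
            break
        lower = found
        step *= 2
    if upper is None:
        raise RuntimeError("no upper bound found; the objective may be unbounded")
    # Phase 2: binary search between the achieved value and the upper bound.
    while lower < upper:
        mid = (lower + upper) // 2
        found = better_than(mid)
        if found is None:
            upper = mid      # nothing beats mid, so the maximum is at most mid
        else:
            lower = found    # found > mid, so the gap strictly shrinks
    return lower

# Toy oracle standing in for the solver: the objective can only take these values.
achievable = [3, 7, 12, 20, 41]

def toy_oracle(bound):
    above = [v for v in achievable if v > bound]
    return min(above) if above else None

best = maximize(toy_oracle, f0=3)
print(best)
```

The max_probes cap plays the role of the bound suggested for step 3 when the problem may not admit a maximum.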
Z3 currently does not support optimization. This is on the TODO list, but it has not been implemented yet. The following slide decks describe the approach that will be used in Z3:
Exact nonlinear optimization on demand
Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals
The library for computing with infinitesimals has already been implemented, and is available in the unstable (work-in-progress) branch, and online at rise4fun.

Resources