Z3 getting last valid model - z3

I using Z3 C++ api to find a satisfiable formula that is minimal with respect to some boolean variables (let us call them b0,...,bn) being true.
I have a formula that includes boolean variables b0,...,bn and I want to find some satisfiable formula where I have the least number of b0,...,bn set to true.
I do this by initially finding a subset of b0,...,bn that can be assigned to true and satisfy my formula, and I incrementally ask the solver to find smaller subsets (i.e. where one of these boolean variables is flipped to false).
I find my local minimum when I cannot find a smaller subset, i.e. I get a unsat result from the Z3. At this point, I would like to access the last valid model.
Is that possible? Does Z3 modify the model when a call to "check" is unsat?
If so, how can I do this using the C++ api?
Many thanks in advance,

You can retrieve a model if the solver returns "sat". The model refers to the state of the solver, so if you add assertions, the state changes and models are no longer valid until you check satisfiability and it returns sat.
So you can retrieve a model every time the solver returns SAT, and then discharge all but the last model.

As Nikolaj mentioned, you need to keep track of models after each call that results in sat and return the last one when you get an unsat if you follow the strategy you outlined.
However, there might be another alternative that avoids repeated calls altogether. Instead of a satisfaction problem, you can cast your problem as an optimization one. You mentioned you have control variables b0, b1, .. bn such that you want to minimize the number of them getting set to true for a satisfying model. Create a metric that counts the number of ones in these variables. Something like:
metric = (if b0 then 1 else 0)
+ (if b1 then 1 else 0)
+ ...
+ (if bn then 1 else 0)
Then use Z3's optimization routines to minimize metric. I believe this will provide you with the solution you are looking for in one call only.
Some helpful references:
Here's the Z3 optimization tutorial: http://rise4fun.com/z3opt/tutorialcontent/guide.
This example, in particular, talks about soft-constraints, and might quite be applicable in your case as well: http://rise4fun.com/z3opt/tutorialcontent/guide#h25.
Here's the C++ API reference for the optimizer: http://z3prover.github.io/api/html/classz3_1_1optimize.html.

Related

(Semi-decidable) combination of first-order theories is possible in Z3, but what about an actual semantic/signature-wise combination?

Disclaimer: This is a rather theoretical question, but think it fits here; in case not, let me know an alternative :)
Z3 seems expressive
Recently, I realized I can specify this type of formulae in Z3:
Exists x,y::Integer s.t. [Exists i::Integer s.t. (0<=i<|seq|) & (avg(seq)+t<seq[i])] & (y<Length(seq)) & (y<x)
Here is the code (in Python):
from z3 import *
#Average function
IntSeqSort = SeqSort(IntSort())
sumArray = RecFunction('sumArray', IntSeqSort, IntSort())
sumArrayArg = FreshConst(IntSeqSort)
RecAddDefinition( sumArray
, [sumArrayArg]
, If(Length(sumArrayArg) == 0
, 0
, sumArrayArg[0] + sumArray(SubSeq(sumArrayArg, 1, Length(sumArrayArg) - 1))
)
)
def avgArray(arr):
return ToReal(sumArray(arr)) / ToReal(Length(arr))
###The specification
t = Int('t')
y = Int('y')
x = Int('x')
i = Int('i') #Has to be declared, even if it is only used in the Existential
seq = Const('seq', SeqSort(IntSort()))
avg_seq = avgArray(seq)
phi_0 = And(2<t, t<10)
phi_1 = And(0 <= i, i< Length(seq))
phi_2 = (t+avg_seq<seq[i])
big_vee = And([phi_0, phi_1, phi_2])
phi = Exists(i, big_vee)
phi_3 = (y<Length(seq))
phi_4 = (y>x)
union = And([big_vee, phi_3, phi_4])
phiTotal = Exists([x,y], union)
s = Solver()
s.add(phiTotal)
print(s.check())
#s.model()
solve(phiTotal) #prettier display
We can see how it outputs sat and outputs models.
But...
However, even if this expressivity is useful (at least for me), there is something I am missing: formalization.
I mean, I am combining first-order theories that have different signature and semantics: a sequence-like theory, an integer arithmetic theory and also a (uninterpreted?) function avg. Thus, I would like to combine these theories with a Nelson-Oppen-like procedure, but this procedure only works with quantifier-free fragments.
I mean, I guess this combined theory is semi-decidable (because of quantifiers and because of sequences), but can we formalize it? In case yes, I would like to (in a correct way) combine these theories, but I have no idea how.
An exercise (as an orientation)
Thus, in order to understand this, I proposed a simpler exercise to myself: take the decidable array property fragment (What's decidable about arrays?http://theory.stanford.edu/~arbrad/papers/arrays.pdf), which has a particular set of formulae and signature.
Now, suppose I want to add an avg function to it. How can I do it?
Do I have to somehow combine the array property fragment with some kind of recursive function theory and an integer theory? How would I do this? Note that these theories involve quantifiers.
Do I have to first combine these theories and then create a decision procedure for the combined theory? How?
Maybe it suffices with creating a decision procedure within the array property fragment?
Or maybe it suffices with a syntactic adding to the signature?
Also, is the theory array property fragment with an avg function still decidable?
A non-answer answer: Start by reading Chapter 10 of https://www.decision-procedures.org/toc/
A short answer: Unless your theory supports quantifier-elimination, SMT solvers won't have a decision-procedure. Assuming they all admit quantifier-elimitation, then you can use Nelson-Oppen. Adding functions like avg etc. do not add significantly to expressive power: They are definitions that are "unfolded" as needed, and so long as you don't need induction, they're more or less conveniences. (This is a very simplified account, of course. In practice, you'll most likely need induction for any interesting property.)
If these are your concerns, it's probably best to move to more expressive systems than push-button solvers. Start looking at Lean: Sure, it's not push-button, but it provides a very nice framework for general purpose theorem proving: https://leanprover.github.io
Even longer answer is possible, but stack-overflow isn't the right forum for it. You're now looking into the the theory of decision procedures and theorem-proving, something that won't fit into any answer in any internet-based forum. (Though https://cstheory.stackexchange.com might be better, if you want to give it a try.)

How to improve binary search based optimization in Z3py

I am trying to optimize with Z3py an instance of Set Covering Problem (SCP41) based on minimize.
The results are the following:
Using
(1) I know that Z3 supports optimization (https://rise4fun.com/Z3/tutorial/optimization). Many times I get to the optimum in SCP41 and others instances, a few do not.
(2) I understand that if I use the Z3py API without the optimization module I would have to do the typical sequential search described in (Minimum and maximum values of integer variable) by #Leonardo de Moura. It never gives me results.
My approach
(3) I have tried to improve the sequential search approach by implementing a binary search similar to how it explains #Philippe in (Does Z3 have support for optimization problems), when I run my algorithm it waits and I do not get any result.
I understand that the binary search should be faster and work in this case? I also know that the instance SCP41 is something big and that many restrictions are generated and it becomes extremely combinatorial, this is my full code (Code large instance) and this is my binary search it:
def min(F, Z, M, LB, UB, C):
i = 0
s = Solver()
s.add(F[0])
s.add(F[1])
s.add(F[2])
s.add(F[3])
s.add(F[4])
s.add(F[5])
r = s.check()
if r == sat:
UB = s.model()[Z]
while int(str(LB)) <= int(str(UB)):
C = int(( int(str(LB)) + int(str(UB)) / 2))
s.push()
s.add( Z > LB, Z <= C)
r = s.check()
if r==sat:
UB = Z
return s.model()
elif r==unsat:
LB = C
s.pop()
i = i + 1
if (i > M):
raise Z3Exception("maximum not found, maximum number of iterations was reached")
return unsat
And, this is another instance (Code short instance) that I used in initial tests and it worked well in any case.
What is incorrect binary search or some concept of Z3 not applied correctly?
regards,
Alex
I don't think your problem is to do with minimization itself. If you put a print r after r = s.check() in your program, you see that z3 simply struggles to return a result. So your loop doesn't even execute once.
It's impossible to read through your program since it's really large! But I see a ton of things of the form:
Or(X250 == 0, X500 == 1)
This suggests your variables X250 X500 etc. (and there's a ton of them) are actually booleans, not integers. If that is indeed true, you should absolutely stick to booleans. Solving integer constraints is significantly harder than solving pure boolean constraints, and when you use integers to model booleans like this, the underlying solver simply explores the search space that's just unreachable.
If this is indeed the case, i.e., if you're using Int values to model booleans, I'd strongly recommend modelling your problem to get rid of the Int values and just use booleans. If you come up with a "small" instance of the problem, we can help with modeling.
If you truly do need Int values (which might very well be the case), then I'd say your problem is simply too difficult for an SMT solver to deal with efficiently. You might be better off using some other system that is tuned for such optimization problems.

(Sub)optimal way to get a legit range info when using a SMT constraint with Z3

This question is related to my previous question
Is it possible to get a legit range info when using a SMT constraint with Z3
So it seems that "efficiently" finding the maximum range info is not proper, given typical 32-bit vectors and so on. But on the other hand, I am thinking whether it is feasible to find certain "sub-maximum" range info, which hopefully becomes more efficient. Another thing is that we may want to have certain "safe" guarantee, say for all elements in the sub-maximum range, they must satisfy the constraint, but there could exist some other solutions that would satisfy the constraint as well.
I am currently exploring whether model counting technique could make sense in this setting. Any thoughts would be appreciated very much. Thanks.
General case
This is not just a question of efficiency. Consider a problem where you have two variables a and b, and a single constraint:
a != b
What's the range of b? (maximum or otherwise?)
You can say all values are legitimate. But that would be wrong, as obviously the choice of a impacts the choice of b. The more variables you have around, the more complicated the problem will become. I don't think the problem is even well defined in this case, so searching for a solution (efficient or otherwise) doesn't make much sense.
Single variable assumption
Having said that, I think you can come up with a solution if you assume there's precisely one variable in the system. (Or, alternatively, if you fix all the other variables to some predefined constants.) If you're willing to go down this path, then you can implement a binary search algorithm to find a reasonably sized range by simply proving the quantified formula
Exists([b], And(b >= minBound, b <= maxBound, Not(constraints)))
Once you get unsat for this, you have your range. So long as you get sat, you can adjust your minBound/maxBound to search within smaller ranges. In the worst case, this can turn into a linear walk, but you can "cut-down" this search by making sure you go down a significant size in each step. That could be a parameter to the whole search, depending on how large you want your intervals to be. It'll have to be a choice between trying to find a maximal range, and how long you want to spend in this search. Of course, if you cut-down too much, you can miss a big interval, but that's the cost of efficiency.
Example1 (Good case) There's a single constraint that says b != 5. Then your search will be quick and depending on which branch you'll go, you'll either find [0, 4] or [6, 255] assuming 8-bit words.
Example2 (Bad case) There's a single constraint that says b is even. Then your search will exhibit worst-case behavior, and if your "cut-down" size is 1, you'll possibly iterate 255 times before you settle down on [0, 0]; assuming z3 gives you the maximum odd number in each call.
I hope that illustrates the point. In general, though, I'd assume you'd be closer to the "good case" for practical applications and even if your cut-down size is minimal you can most likely converge in a few iterations. Of course, this entirely depends on your problem domain, but I'd expect it to hold for software analysis in general.

Non-Evaluation of Numerical Expression in Maxima

I start with a simple Maxima question, the answer to which may provide the answer to the actual problem I'm grappling with.
Related Simple Question:
How can I get maxima to calculate:
bfloat((1+%i)^0.3);
Might there be an option variable that can be set so that this evaluates to a complex number?
Actual Question:
In evaluating approximations for numerical time integration for finite element methods, for this purpose I'm using spectral analysis, which requires the calculation of the eigenvalues of a 4 x 4 matrix. This matrix "cav" is also calculated within maxima, using some of the algebra capabilities of maxima, but sustituting numerical values, so that matrix is entirely numerical, i.e. containing no variables. I've calculated the eigenvalues with Mathematica and it returns 4 real eigenvalues. However Maxima calculates horrenduously complicated expressions for this case, which apparently it does not "know" how to simplify, even numerically as "bigfloat". Perhaps this problem arises because Maxima first approximates the matrix "cac" by rational numbers (i.e. fractions) and then tries to solve the problem fully exactly, instead of simply using numerical "bigfloat" computations throughout. Is there I way I can change this?
Note that if you only change the input value of gzv to say 0.5 it works fine, and returns numerical values of complex eigenvalues.
I include the code below. Note that all of the code up until "cav:subst(vs,ca)$" is just for the definition of the matrix cav and seems to work fine. It is in the few statements thereafter that it fails to calculate numerical values for the eigenvalues.
v1:v0+ (1-gg)*a0+gg*a1$
d1:d0+v0+(1/2-gb)*a0+gb*a1$
obf:a1+(1+ga)*(w^2*d1 + 2*gz*w*(d1-d0)) -
ga *(w^2*d0 + 2*gz*w*(d0-g0))$
obf:expand(obf)$
cd:subst([a1=1,d0=0,v0=0,a0=0,g0=0],obf)$
fd:subst([a1=0,d0=1,v0=0,a0=0,g0=0],obf)$
fv:subst([a1=0,d0=0,v0=1,a0=0,g0=0],obf)$
fa:subst([a1=0,d0=0,v0=0,a0=1,g0=0],obf)$
fg:subst([a1=0,d0=0,v0=0,a0=0,g0=1],obf)$
f:[fd,fv,fa,fg]$
cad1:expand(cd*[1,1,1/2-gb,0] - gb*f)$
cad2:expand(cd*[0,1,1-gg,0] - gg*f)$
cad3:expand(-f)$
cad4:[cd,0,0,0]$
cad:matrix(cad1,cad2,cad3,cad4)$
gav:-0.05$
ggv:1/2-gav$
gbv:(ggv+1/2)^2/4$
gzv:1.1$
dt:0.01$
wv:bfloat(dt*2*%pi)$
vs:[ga=gav,gg=ggv,gb=gbv,gz=gzv,w=wv]$
cav:subst(vs,ca)$
cav:bfloat(cav)$
evam:eigenvalues(cav)$
evam:bfloat(evam)$
eva:evam[1]$
The main problem here is that Maxima tries pretty hard to make computations exact, and it's hard to tell it to ease up and allow inexact results.
Is there a mistake in the code you posted above? You have cav:subst(vs,ca) but ca is not defined. Is that supposed to be cav:subst(vs,cad) ?
For the short problem, usually rectform can simplify complex expressions to something more usable:
(%i58) rectform (bfloat((1+%i)^0.3));
`rat' replaced 1.0B0 by 1/1 = 1.0B0
(%o58) 2.59023849130283b-1 %i + 1.078911979230303b0
About the long problem, if fixed-precision (i.e. ordinary floats, not bigfloats) is acceptable to you, then you can use the LAPACK function dgeev to compute eigenvalues and/or eigenvectors.
(%i51) load (lapack);
<bunch of messages here>
(%o51) /usr/share/maxima/5.39.0/share/lapack/lapack.mac
(%i52) dgeev (cav);
(%o52) [[- 0.02759949957202372, 0.06804641655485913, 0.997993508502892, 0.928429191717788], false, false]
If you really need variable precision, I don't know what to try. In principle it's possible to rework the LAPACK code to work with variable-precision floats, but that's a substantial task and I'm not sure about the details.

Does Z3 have support for optimization problems

I saw in a previous post from last August that Z3 did not support optimizations.
However it also stated that the developers are planning to add such support.
I could not find anything in the source to suggest this has happened.
Can anyone tell me if my assumption that there is no support is correct or was it added but I somehow missed it?
Thanks,
Omer
If your optimization has an integer valued objective function, one approach that works reasonably well is to run a binary search for the optimal value. Suppose you're solving the set of constraints C(x,y,z), maximizing the objective function f(x,y,z).
Find an arbitrary solution (x0, y0, z0) to C(x,y,z).
Compute f0 = f(x0, y0, z0). This will be your first lower bound.
As long as you don't know any upper-bound on the objective value, try to solve the constraints C(x,y,z) ∧ f(x,y,z) > 2 * L, where L is your best lower bound (initially, f0, then whatever you found that was better).
Once you have both an upper and a lower bound, apply binary search: solve C(x,y,z) ∧ 2 * f(x,y,z) > (U - L). If the formula is satisfiable, you can compute a new lower bound using the model. If it is unsatisfiable, (U - L) / 2 is a new upper-bound.
Step 3. will not terminate if your problem does not admit a maximum, so you may want to bound it if you are not sure it does.
You should of course use push and pop to solve the succession of problems incrementally. You'll additionally need the ability to extract models for intermediate steps and to evaluate f on them.
We have used this approach in our work on Kaplan with reasonable success.
Z3 currently does not support optimization. This is on the TODO list, but it has not been implemented yet. The following slide decks describe the approach that will be used in Z3:
Exact nonlinear optimization on demand
Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals
The library for computing with infinitesimals has already been implemented, and is available in the unstable (work-in-progress) branch, and online at rise4fun.

Resources