Use Z3 to determine difficulty of quantifier elimination for BV-queries - z3

I'm currently using the Z3 C++ API for solving queries over bitvectors. Some queries may contain an existential quantifier at the top level.
Often the quantifier elimination is simple and Z3 performs it quickly. However, in cases where the quantifier elimination falls back to enumerating thousands of feasible solutions, I'd like to abort this tactic and handle the query myself in some other way.
I've tried wrapping the 'qe' tactic in a 'try-for' tactic, hoping that if quantifier elimination fails (within, say, 100 ms) I'd know that I should handle the query in some other way. Unfortunately, the 'try-for' tactic fails to cancel the quantifier elimination (for any time bound).
In an old post a similar issue is discussed and the 'smt' tactic is blamed for not being responsive. Does the same reasoning apply to the 'qe' tactic? The same post indicates that 'future' versions should be more responsive, though. Is there any way or heuristic to determine whether quantifier elimination would take long (besides running the solver in a separate thread and killing it on timeout)?
I've attached a minimal example so you can try it yourselves:
#include <iostream>
#include <z3++.h>

int main() {
    z3::context ctx;
    z3::expr bv1 = ctx.bv_const("bv1", 10);
    z3::expr bv2 = ctx.bv_const("bv2", 10);
    z3::goal goal(ctx);
    goal.add(z3::exists(bv1, bv1 != bv2));
    z3::tactic t = z3::try_for(z3::tactic(ctx, "qe"), 100);
    auto res = t.apply(goal);
    std::cout << res << std::endl;
}
Thanks!

The timeout cancellations have to be checked periodically by the tactic that is running.
We basically have to ensure that the code checks for cancellation and does not descend into a long-running loop without checking. You can probably identify the code segment that fails to check for cancellation by running your code in a debugger, interrupting it while qe is stuck, and looking at the call stack to see which procedures it is in. Then file a bug on GitHub to have the cancellation flag checked in the place where it will help.
Overall, the quantifier elimination tactic is currently fairly simplistic when it comes to bit-vectors, so it would be better to avoid qe for all but simple cases.
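To make the point about periodic checks concrete, here is a minimal sketch in plain Python. It is not Z3 code, and the names long_running_task and run_with_timeout are made up for illustration. A timer sets a flag, and the work loop only stops because it polls that flag every so often; a loop that never polls would ignore the timeout, which is the behaviour described in the question.

```python
import threading

def long_running_task(cancel, check_every=1000):
    """Toy stand-in for a tactic: a long loop that polls a cancellation flag."""
    steps = 0
    while steps < 10**9:
        steps += 1
        # Without this periodic check the timeout could never interrupt the loop.
        if steps % check_every == 0 and cancel.is_set():
            return ("cancelled", steps)
    return ("finished", steps)

def run_with_timeout(task, timeout_s):
    """Run task(cancel) and set the cancel flag after timeout_s seconds."""
    cancel = threading.Event()
    timer = threading.Timer(timeout_s, cancel.set)
    timer.start()
    try:
        return task(cancel)
    finally:
        timer.cancel()  # stop the timer if the task ended first

status, steps = run_with_timeout(long_running_task, 0.1)
print(status, steps)
```

The check_every parameter shows the trade-off: polling more often makes the task more responsive to the timeout but adds overhead to every iteration.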

Related

(Semi-decidable) combination of first-order theories is possible in Z3, but what about an actual semantic/signature-wise combination?

Disclaimer: This is a rather theoretical question, but I think it fits here; if not, let me know of an alternative :)
Z3 seems expressive
Recently, I realized I can specify this type of formulae in Z3:
Exists x,y::Integer s.t. [Exists i::Integer s.t. (0<=i<|seq|) & (avg(seq)+t<seq[i])] & (y<Length(seq)) & (y<x)
Here is the code (in Python):
from z3 import *
#Average function
IntSeqSort = SeqSort(IntSort())
sumArray = RecFunction('sumArray', IntSeqSort, IntSort())
sumArrayArg = FreshConst(IntSeqSort)
RecAddDefinition(
    sumArray,
    [sumArrayArg],
    If(
        Length(sumArrayArg) == 0,
        0,
        sumArrayArg[0] + sumArray(SubSeq(sumArrayArg, 1, Length(sumArrayArg) - 1)),
    ),
)
def avgArray(arr):
    return ToReal(sumArray(arr)) / ToReal(Length(arr))
###The specification
t = Int('t')
y = Int('y')
x = Int('x')
i = Int('i') #Has to be declared, even if it is only used in the Existential
seq = Const('seq', SeqSort(IntSort()))
avg_seq = avgArray(seq)
phi_0 = And(2<t, t<10)
phi_1 = And(0 <= i, i< Length(seq))
phi_2 = (t+avg_seq<seq[i])
big_vee = And([phi_0, phi_1, phi_2])
phi = Exists(i, big_vee)
phi_3 = (y<Length(seq))
phi_4 = (y>x)
union = And([big_vee, phi_3, phi_4])
phiTotal = Exists([x,y], union)
s = Solver()
s.add(phiTotal)
print(s.check())
#s.model()
solve(phiTotal) #prettier display
We can see how it outputs sat and outputs models.
But...
However, even if this expressivity is useful (at least for me), there is something I am missing: formalization.
I mean, I am combining first-order theories that have different signatures and semantics: a sequence-like theory, an integer arithmetic theory and also an (uninterpreted?) function avg. Thus, I would like to combine these theories with a Nelson-Oppen-like procedure, but this procedure only works with quantifier-free fragments.
I guess this combined theory is semi-decidable (because of the quantifiers and because of the sequences), but can we formalize it? If so, I would like to combine these theories correctly, but I have no idea how.
An exercise (as an orientation)
Thus, in order to understand this, I set myself a simpler exercise: take the decidable array property fragment (What's decidable about arrays? http://theory.stanford.edu/~arbrad/papers/arrays.pdf), which has a particular set of formulae and signature.
Now, suppose I want to add an avg function to it. How can I do it?
Do I have to somehow combine the array property fragment with some kind of recursive function theory and an integer theory? How would I do this? Note that these theories involve quantifiers.
Do I have to first combine these theories and then create a decision procedure for the combined theory? How?
Maybe it suffices to create a decision procedure within the array property fragment?
Or maybe a purely syntactic addition to the signature suffices?
Also, is the array property fragment extended with an avg function still decidable?
A non-answer answer: Start by reading Chapter 10 of https://www.decision-procedures.org/toc/
A short answer: Unless your theory supports quantifier elimination, SMT solvers won't have a decision procedure. Assuming all your theories admit quantifier elimination, you can use Nelson-Oppen. Adding functions like avg does not add significantly to expressive power: they are definitions that are "unfolded" as needed, and so long as you don't need induction, they're more or less conveniences. (This is a very simplified account, of course. In practice, you'll most likely need induction for any interesting property.)
If these are your concerns, it's probably best to move to more expressive systems than push-button solvers. Start looking at Lean: Sure, it's not push-button, but it provides a very nice framework for general purpose theorem proving: https://leanprover.github.io
An even longer answer is possible, but Stack Overflow isn't the right forum for it. You're now looking into the theory of decision procedures and theorem proving, something that won't fit into any answer in any internet-based forum. (Though https://cstheory.stackexchange.com might be better, if you want to give it a try.)

How to alter assertions in a solver without having to repeatedly create a new solver in z3 Python API

I create a large SMT formula (that I get from an external source) and run Solver.check() on it. If the call fails, I perform a rewrite on some assertions in the solver using the rewrite(s,f,t) presented here.
Now I am wondering how I can replace the assertions in the solver that failed with the new ones obtained after the rewrite. That is, I want a solver that still contains the same function definitions/declarations etc. as the previous one, but with the rewritten assertions in place of the old ones.
For example, this is how I would do it now. I wonder if there was a better/more efficient way:
import z3
solver = z3.Solver()
f = z3.Function('f', z3.IntSort(), z3.IntSort())
solver.add(f(3) == 5)
solver.check()
solver.model()
# I don't like the model, so let's rewrite
new_assertions = solver.assertions().rewrite(...)
new_solver = z3.Solver()
new_solver.add(new_assertions)
new_solver.check()
...
This is what the push, and pop statements are for:
push: Creates a back-tracking point that you can jump back to
pop: Goes back to the last push'ed point
That is, you push before you add your assertions, and when you want to change them pop back to where you were. You can create as many backtracking points as you want.
This style of solver usage is called "incremental" and is described in Section 4.1.4 of the SMTLib document. To simplify programming with your rewrite function, you might want to keep the assertions in a list of your own, so they are easy to manipulate; but that's more or less tangential to the discussion at hand here.

Measure and bound time spent in arithmetic sub-solvers

Q1: Is it possible to query the times Z3 spent in different sub-solvers?
Calling (get-info :all-statistics) gives the overall run time of Z3, but I would like to break it down into individual sub-solvers.
I am particularly interested in the time spent in the arithmetic-related sub-solvers, more precisely, in those that give rise to the statistics grobner and nonlinear-horner.
Q2: Furthermore, is it possible to put a timeout on a sub-solver?
I could imagine something like defining a timeout per check-sat and sub-solver that bounds the time Z3 can spend in that sub-solver. Z3 would repeatedly call n different sub-solvers, and if the time bound of one of them is reached it continues, but only uses the remaining n-1 sub-solvers.
I read the tactics tutorial and got the impression that this might actually be possible by something along the lines of
(repeat
(par-or
(try-for <arithmetic-solvers> 500)
<all-other-solvers>))
but I couldn't figure out which solvers to use.
For Q1: No, you'd have to add your own timers on that and I would expect this to be nontrivial as it's not clear what exactly should and shouldn't be counted.
Q2: Yes, you can build your own custom strategies/tactics. Note that par-or means parallel or, i.e., it will try to run the provided tactics in parallel.
Not everything we call a "solver" has its own tactic, so this might require some fiddling. Note that "solver" in this context is not necessarily the same as the Z3 C++ object called "solver". Some "solvers" are also integral parts of the SMT kernel.

How to estimate time spent in SAT solving part in z3 for SMT?

I have profiled my problems, which are in the (pseudo-nonlinear) integer/real fragment, using the profiler gprof (stats here including the call graph), and I am trying to separate the time taken into two classes:
I) The SAT solving part (including [purely] Boolean propagation, [purely] Boolean conflict clause detection, backjumping, and any other propositional manipulation)
II) The theory solving part (including theory consistency checks, generation of theory conflict clauses and theory propagation).
Do lines 3280-3346 in smt_context.cpp within bounded_search() constitute the top-level DPLL(X) loop?
I believe it is easier to sum up the time in the SAT solver functions (since there are fewer of them); the rest can then be considered the theory solvers' time. I am trying to figure out which functions I should consider as falling under class I above. Are they smt::context::decide() and smt::context::bcp() within smt::context::propagate()? Any others?
smt::context::resolve_conflict() seems to be mixed with calls to the theory solvers?
Is it correct that smt::context::propagate() seems to be mostly theory propagation (class II) except its bcp() function? Also, smt::context::final_check() seems to be purely in class II.
Any hints greatly appreciated. Thanks.
You are correct, bcp() and decide() are part of the "SAT solver".
The function final_check() is just theory reasoning. It executes procedures that Z3 "claims" to be too "expensive". The resolve_conflict() procedure is mixed: it performs lemma learning, and backtracking. To generate new lemmas, Z3 uses Boolean resolution (which is in "SAT part"). In several cases, the most expensive part of resolve_conflict is backtracking the state of the theory solvers.

Does Z3 have support for optimization problems

I saw in a previous post from last August that Z3 did not support optimizations.
However it also stated that the developers are planning to add such support.
I could not find anything in the source to suggest this has happened.
Can anyone tell me if my assumption that there is no support is correct or was it added but I somehow missed it?
Thanks,
Omer
If your optimization has an integer valued objective function, one approach that works reasonably well is to run a binary search for the optimal value. Suppose you're solving the set of constraints C(x,y,z), maximizing the objective function f(x,y,z).
1. Find an arbitrary solution (x0, y0, z0) to C(x,y,z).
2. Compute f0 = f(x0, y0, z0). This will be your first lower bound.
3. As long as you don't know any upper bound on the objective value, try to solve the constraints C(x,y,z) ∧ f(x,y,z) > 2 * L, where L is your best lower bound (initially f0, then whatever better value you found). If this is unsatisfiable, 2 * L is an upper bound.
4. Once you have both an upper bound U and a lower bound L, apply binary search: solve C(x,y,z) ∧ 2 * f(x,y,z) > (U + L). If the formula is satisfiable, you can compute a new lower bound using the model. If it is unsatisfiable, (U + L) / 2 is a new upper bound.
Step 3 will not terminate if your problem does not admit a maximum, so you may want to bound it if you are not sure it does.
You should of course use push and pop to solve the succession of problems incrementally. You'll additionally need the ability to extract models for intermediate steps and to evaluate f on them.
We have used this approach in our work on Kaplan with reasonable success.
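The bound bookkeeping is easy to get wrong, so here is a small solver-agnostic sketch of the search in Python. Everything in it is hypothetical: find_solution is a brute-force stand-in for a solver call (with Z3 it would be push, add the extra constraint, check, read the model, pop), and the constraints and objective are toy examples. Two choices differ from the description above and are mine: the unbounded phase grows the target by a doubling step instead of using 2 * L, so that it also makes progress when L is zero or negative, and the binary search uses the midpoint (L + U) / 2 rounded down.

```python
from itertools import product

def f(x, y):
    """Toy objective to maximize."""
    return 3 * x + y

def find_solution(extra):
    """Stand-in for a solver call: some (x, y) satisfying the toy constraints
    x + 2*y <= 40 and x <= 15, plus the extra predicate on f, or None."""
    for x, y in product(range(50), repeat=2):
        if x + 2 * y <= 40 and x <= 15 and extra(f(x, y)):
            return (x, y)
    return None

def maximize():
    model = find_solution(lambda v: True)       # any solution
    if model is None:
        return None                             # constraints are unsatisfiable
    low = f(*model)                             # first lower bound (attained)
    step = 1
    while True:                                 # look for an upper bound
        target = low + step
        model = find_solution(lambda v: v > target)
        if model is None:
            high = target                       # nothing exceeds target
            break
        low = f(*model)
        step *= 2
    while low < high:                           # binary search on [low, high]
        mid = (low + high) // 2
        model = find_solution(lambda v: v > mid)
        if model is None:
            high = mid                          # mid is a valid upper bound
        else:
            low = f(*model)                     # the model beats mid
    return low

best = maximize()
print(best)
```

The invariant is that low is always a value some solution attains and high is a value no solution exceeds, so the loop ends with low equal to the maximum.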
Z3 currently does not support optimization. This is on the TODO list, but it has not been implemented yet. The following slide decks describe the approach that will be used in Z3:
Exact nonlinear optimization on demand
Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals
The library for computing with infinitesimals has already been implemented, and is available in the unstable (work-in-progress) branch, and online at rise4fun.
