Is it possible to clone Z3_context? - z3

I need it for incremental solving in the context of symbolic execution (Klee).
At branch points in symbolic execution, the solver context has to be split into two: one with the branch condition asserted as true, the other with it asserted as false. Of course, there is an expensive workaround: create an empty context and replay all constraints.
Is there a way to split Z3_context? Do you plan to add such functionality?
Note: splitting the context can be avoided with depth-first symbolic exploration, that is, exploring the current execution path until it reaches its "end", so that the path never needs to be explored again. In that case it is enough to pop until the branch point is reached and then continue with the other branch of the condition. But in Klee many symbolic paths are explored "simultaneously" (exploration of the true and false branches is interleaved), so you need both solver context switching (each method takes a Z3_context argument) and context branching (there are no methods for this, and that is what I need).
Thanks!

No, the current version of Z3 (3.2) does not support this feature. We realize this is an important capability, and an equivalent feature will be available in the next release.
The idea is to separate the concepts of Context and Solver. In the next release, we will have APIs for creating (and copying) solvers. So, you will be able to use a different solver for each branch of the search. In a nutshell, the Context is used to manage/create Z3 expressions, and the Solver for checking satisfiability.

The approach I currently use for this sort of thing is to assert formulas like p => A instead of A, where p is a fresh Boolean literal. Then in my client I maintain the association between the list of guard literals that correspond to each branch, and use check_assumptions(). In my situation I happen to be able to get away with leaving all formulas allocated during each search, but YMMV. Even for depth-first explorations, I seem to get much more incremental reuse this way than by using push/pop.

Related

SAT queries are slowing down in Z3-Python: what about incremental SAT?

In Z3 (Python) my SAT queries inside a loop are slowing down, can I use incremental SAT to overcome this problem?
The problem is the following: I am performing a concrete SAT search inside a loop. On each iteration, I get a model (of course, I store the negation of the model in order not to explore the same model again). And also, if that model satisfies a certain property, then I also add a subquery of it and add other restrictions to the formula. And iterate again, until UNSAT (i.e. "no more models") is obtained.
Here is a rough sketch of the code:
...
s = Solver()
s.add(True)
while s.check() == sat:
    s.check()
    m = s.model()
    phi = add_modelNegation(m)
    s.add(phi)  # in order not to explore the same model again
    if holds_property(m):  # if the model holds a property
        s = add_moreConstraints(s, m)  # add other constraints to the formula
...
The issue is that, as the formula that s has to solve grows, Z3 has more and more trouble finding those models. That is expected, since finding a model is harder with the added restrictions. However, in my case the slowdown is too large: the computation speed has halved, i.e. the time the solver needs to find a new model has doubled after some iterations.
Thus, I would like to implement some kind of incremental solving and wondered whether there are native methods in Z3 to do so.
I have been reading about this in many pages (see, for instance, How incremental solving works in Z3?), but only found this response in How to use incremental solving with z3py interesting:
The Python API is automatically "incremental". This simply means the ability to call the command check() multiple times, without the solver forgetting what it has seen before (i.e., call check(), assert more facts, call check() again; the second check() will take into account all the assertions from the very beginning).
I am not sure I understand, so I will ask a simple question: does this response mean that incremental SAT is indeed used in Z3's SAT solving? I think I am looking for a different kind of incrementality. For example: if at SAT iteration number 230 it becomes inevitable that a variable (say b1) is true, then that fact will not change afterwards, so you can set it to 1, simplify the formula, and not re-reason about anything to do with b1, because all remaining models (if any) will have b1 true. Does Z3's incremental SAT handle these kinds of cases?
In case not, how could I implement this?
I know there are some implementations in PySat or in MiniSat, but I would like to do it in Z3.
As with anything related to performance of z3 solving, there's no one size fits all. Each specific problem can benefit from different ideas.
Incremental Solving
The term "incremental solving" has a very specific meaning in the SAT/SMT context. It means that you can continue to add assertions to the system after a call to check, without it forgetting the assertions you added beforehand. This is what makes it incremental. Additionally, you can set jump-points; i.e., you can tell the solver to "forget" the assertions you put in after a certain point in your program, essentially moving through a stack of assertions. For details, see Section 3.9 of https://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2021-05-12.pdf, specifically the part where it talks about the "Assertion Stack."
And, as noted before, you don't have to do anything specific for z3 to be incremental. It is incremental by default, i.e., you can simply add new assertions after calling check, or use push/pop calls etc. (Compare this to, for instance, CVC4; which is by default not incremental. If you want to use CVC4 in incremental mode, you have to pass a specific command line argument.) The main reason for this is that incremental mode requires extra bookkeeping, which CVC4 isn't willing to pay for unless you explicitly ask it to do so. For z3, the developers decided to always make it incremental without any command line switches.
Regarding your particular question about what happens if b1 is true: Well, if you somehow determined b1 is always true, simply assert it. And the solver will automatically take advantage of this; nothing special needs to be done. Note that z3 learns a ton of lemmas as it works through your program such as these and adds them to its internal database anyhow. But if you have some external mechanism that lets you deduce a particular constraint, just go ahead and add it. (Of course, the soundness of this will be on you, not on z3; but that's a different question.)
One specific "trick" in speeding up enumerating "find me all-solutions" loops like you are doing is to do a divide-and-conquer approach, instead of the "add the negation of the previous model and iterate." In practice this can make a significant difference in performance. I think you should try this idea. It's explained here: https://theory.stanford.edu/~nikolaj/programmingz3.html#sec-blocking-evaluations As you can see, the all_smt function defined at the end of that section takes specific advantage of incrementality (note the calls to push/pop) to speed up the model-search process, by carefully dividing the search space into disjoint segments, instead of doing a random-walk. Using this might give you the speed-up you need. But, again, as with anything performance specific, you'll need to tell us more about exactly what problem you are solving: None of these methods can avoid performance problems caused by modeling issues. (For instance, using integers to model booleans is one common pitfall.) See this answer for some generic advice: https://stackoverflow.com/a/57661441/936310

What is pb.conflict in Z3?

I am trying to find an optimal solution using the Z3 API for Python. I have used set_option("verbose", 1) to print the statements that Z3 generates while checking for sat. Among them are pb.conflict statements, which look something like this:
[image: pb.conflict statements]
I want to know what exactly is pb.conflict. What do these statements signify? Also, what are the two numbers that get printed along with it?
pb stands for Pseudo-boolean. A pseudo-boolean function is a function from booleans to some other domain, usually Real. A conflict happens when the choice of a variable leads to an unsatisfiable clause set, at which point the solver has to backtrack. Keeping the backtracking to a minimum is essential for efficiency, and many of the SAT engines carefully track that number. While the details are entirely solver specific (i.e., those two numbers you're asking about), in general the higher the numbers, the more conflict cases the solver met, and hence might decide to reset the state completely or take some other action. Often, there are parameters that users can set to specify when such actions are taken and exactly what those are. But again, this is entirely solver and implementation specific.
A google search on pseudo-boolean optimization will result in a bunch of scholarly articles that you might want to peruse.
If you really want to find Z3's treatment of pseudo-booleans, then your best bet is probably to look at the implementation itself: https://github.com/Z3Prover/z3/blob/master/src/smt/theory_pb.cpp

Why does Z3 branching (find_nl_var_for_branching) skip real variables?

The Z3 function find_nl_var_for_branching (https://github.com/Z3Prover/z3/blob/bba005154c2c753f0da108e39eb6abac2b3c7640/src/smt/theory_arith_nl.h#L719), which "tries to find an integer variable for performing branching," skips real-valued variables. Is there a fundamental reason for that which I am missing? It seems to me that the only requirement for branching on real variables (in contrast to integers) is that probably the order of preference needs to be reconsidered (eg simply, always select randomly between real variables) to ensure all variables get branched eventually.
This seemed potentially like a straightforward enhancement to make, but since I'm pretty new to the z3 source and SMT solvers in general, figured I should ask first.

Incremental SMT solver with ability to drop specific constraint

Is there an incremental SMT solver or an API for some incremental SMT solver where I can add constraints incrementally, where I can uniquely identify each constraint by some label/name?
The reason I want to identify the constraints uniquely is so that I can drop them later by specifying that label/name.
The need for dropping constraints is due to the fact that my earlier constraints become irrelevant with time.
I see that with Z3 I cannot use the push/pop based incremental approach because it follows a stack based idea whereas my requirement is to drop specific earlier/old constraints.
With the other incremental approach of Z3 based on assumptions, I would have to perform check-sat of the format "(check-sat p1 p2 p3)" i.e. if I had three assertions to check then I would require three boolean constants p1,p2,p3, but in my implementation I would have thousands of assertions to check at a time, indirectly requiring thousands of boolean constants.
I also checked JavaSMT, a Java API for SMT solvers, to see if it provides a better way of handling this requirement, but the only ways I see to add constraints are "addConstraint" and "push", and I was unable to find any way of dropping or removing specific constraints, since pop is the only option available.
I would like to know if there is any incremental solver where I can add or drop constraints uniquely identified by names, or an API where there is an alternative way to handle it. I would appreciate any suggestion or comments.
The "stack" based approach is pretty much ingrained into SMTLib, so I think it'll be tough to find a solver that does exactly what you want. Although I do agree it would be a nice feature.
Having said that, I can think of two solutions. But neither will serve your particular use-case well, though they will both work. It comes down to the fact that you want to be able to cherry-pick your constraints at each call to check-sat. Unfortunately this is going to be expensive. Each time the solver does a check-sat it learns a lot of lemmas based on all the present assertions, and a lot of internal data-structures are correspondingly modified. The stack-based approach essentially allows the solver to "backtrack" to one of those learned states. But of course, that does not allow cherry-picking as you observed.
So, I think you're left with one of the following:
Using check-sat-assuming
This is essentially what you described already. But to recap, instead of asserting booleans, you simply give them names. So, this:
(assert complicated_expression)
becomes
; for each constraint ci, do this:
(declare-const ci Bool)
(assert (= ci complicated_expression))
; then, check with whatever subset you want
(check-sat-assuming (ci cj ck..))
This does increase the number of boolean constants you have to manage, but in a sense these are the "names" you want anyhow. I understand you do not like this as it introduces a lot of variables; and that is indeed the case. And there's a good reason for that. See this discussion here: https://github.com/Z3Prover/z3/issues/1048
Using reset-assertions and :global-declarations
This is the variant that allows you to arbitrarily cherry-pick the assertions at each call to check-sat. But it will not be cheap. In particular, the solver will forget everything it learned each time you follow this recipe. But it will do precisely what you wanted. First, issue:
(set-option :global-declarations true)
Then keep track of all your assertions yourself in your wrapper. Now, if you want to arbitrarily "add" a constraint, you don't need to do anything special: just assert it. If you want to remove something, then you say:
(reset-assertions)
(assert your-tracked-assertion-1)
(assert your-tracked-assertion-2)
;(assert your-tracked-assertion-3) <-- Note the comment, we're skipping
(assert your-tracked-assertion-4)
..etc
That is, you re-assert everything except the ones you want to "remove". Note that the :global-declarations option is important: it makes sure all your data declarations and other bindings stay intact when you call reset-assertions, which tells the solver to start from a clean slate with respect to what it assumed and learned.
Effectively, you're managing your own constraints, as you wanted in the first place.
Summary
Neither of these solutions is precisely what you wanted, but they will work. There's simply no SMTLib compliant way to do what you want without resorting to one of these two solutions. Individual solvers, however, might have other tricks up their sleeve. You might want to check with their developers to see if they might have something custom for this use case. While I doubt that is the case, it would be nice to find out!
Also see this previous answer from Nikolaj which is quite related: How incremental solving works in Z3?

How incremental solving works in Z3?

I have a question regarding how Z3 incrementally solves problems. After reading through some answers here, I found the following:
There are two ways to use Z3 for incremental solving: one is push/pop (stack) mode, the other is using assumptions. (See: Soft/Hard constraints in Z3.)
In stack mode, z3 will forget all learned lemmas in global (am I right?) scope even after one local "pop". (See: Efficiency of constraint strengthening in SMT solvers.)
In assumptions mode (I don't know the name, that is the name that comes to my mind), z3 will not simplify some formulas, e.g. value propagation. (See: z3 behaviour changing on request for unsat core.)
I did some comparison (you are welcome to ask for the formulas, they are just too large to put on the rise4fun), but here are my observations: On some formulas, including quantifiers, the assumptions mode is faster. On some formulas with lots of boolean variables (assumptions variables), stack mode is faster than assumptions mode.
Are they implemented for specific purposes? How does incremental solving work in Z3?
Yes, there are essentially two incremental modes.
Stack based: using push(), pop() you create a local context, that follows a stack discipline. Assertions added under a push() are removed after a matching pop(). Furthermore, any lemmas that are derived under a push are removed. Use push()/pop() to emulate freezing a state and adding additional constraints over the frozen state, then resume to the frozen state. It has the advantage that any additional memory overhead (such as learned lemmas) built up within the scope of a push() is released. The working assumption is that learned lemmas under a push would not be useful any longer.
Assumption based: using additional assumption literals passed to check()/check_sat() you can (1) extract unsatisfiable cores over the assumption literals, (2) gain local incrementality without garbage collecting lemmas that get derived independently of the assumptions. In other words, if Z3 learns a lemma that does not contain any of the assumption literals, it is expected to keep that lemma rather than garbage collect it. To use assumption literals effectively, you would have to add them to formulas too. So the tradeoff is that clauses used with assumptions contain some amount of bloat. For example, if you want to locally assume some formula (<= x y), then you add a clause (=> p (<= x y)), and assume p when calling check_sat(). Note that the original assumption was a unit. Z3 propagates units efficiently. With the formulation that uses assumption literals it is no longer a unit at the base level of search. This incurs some extra overhead. Units become binary clauses, binary clauses become ternary clauses, etc.
The differentiation between push/pop functionality holds for Z3's default SMT engine. This is the engine most formulas will be using. Z3 contains some portfolio of engines. For example, for pure bit-vector problems, Z3 may end up using the sat based engine. Incrementality in the sat based engine is implemented differently from the default engine. Here incrementality is implemented using assumption literals. Any assertion you add within the scope of a push is asserted as an implication (=> scope_literals formula). check_sat() within such a scope will have to deal with assumption literals. On the flip-side, any consequence (lemma) that does not depend on the current scope is not garbage collected on pop().
In optimization mode, when you assert optimization objectives, or when you use the optimization objects over the API, you can also invoke push/pop. Likewise with fixedpoints. For these two features, push/pop are essentially for user-convenience. There is no internal incrementality. The reason is that these two modes use substantial pre-processing that is super non-incremental.
