I have a huge set of linear real arithmetic constraints to solve, and I am feeding them to the solver incrementally. Z3 always seems to get stuck after a while. Will Z3 internally change its strategy for solving the constraints, such as moving away from the Simplex algorithm and trying others, or do I have to explicitly instruct it to do so? I am using Z3py.
Without further details it's impossible to answer this question precisely.
Generally, when no logic is set and the default tactic is run, or (check-sat) is called without further options, Z3 will switch to a different solver the first time it sees a push command; prior to that it can use a non-incremental solver.
The incremental solver comes with all the positives and negatives of incremental solvers: it may be faster initially, but it may not be able to exploit previously learned lemmas after some time, and it may simply remember too many irrelevant facts. Also, the heuristics may 'remember' information that doesn't apply at a later time; e.g., a 'good' variable ordering may turn into a bad one after everything is popped and a different problem over the same variables is pushed. In the past, some users have found it works better to use the incremental solver for some number of queries, but to start from scratch when it becomes too slow.
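For illustration, here is a minimal Z3py sketch (the variables and constraint batches are placeholders) contrasting the incremental push/pop style with starting over on a fresh solver when things slow down:

```python
from z3 import Solver, Real

# Hypothetical variables and constraint batches; substitute your own.
xs = [Real(f"x{i}") for i in range(100)]

def make_constraints(k):
    return [xs[i] + xs[i + 1] >= k for i in range(99)]

s = Solver()
for k in range(10):
    s.push()                      # the first push switches Z3 to the incremental solver
    s.add(make_constraints(k))
    print(k, s.check())
    s.pop()

# If the incremental solver degrades after many queries, start over:
s = Solver()                      # a fresh solver carries no accumulated lemmas or heuristic state
s.add(make_constraints(5))
print(s.check())
```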
Related
Currently, I have a somewhat superficial understanding of how SMT solvers work (the basics of algorithms like E-matching, MBQI, and CVC4/5's inductive reasoning). However, it's very frustrating to debug by trial-and-error.
Is there any guidance on how to debug SMT scripts that make heavy use of quantifiers?
A badly written script often goes into an infinite loop, but I cannot tell whether it's my mistake or whether the solver is just taking a long time to respond.
SMT solvers tend to hide their internals from users, so it's quite hard to figure out why the solver is stuck. Is there any way to print the "solving context"?
Or maybe I'm using SMT solvers the wrong way? Should I design my own verification algorithm and employ SMT solvers only for local decisions?
Any help is appreciated!
This is a very subjective question, and largely opinion-based. But a couple of general remarks:
Don't directly program in SMTLib. It is not meant for human consumption. Instead, use a higher-level API and script the solver from a language you're more familiar with. There are bindings available for any number of languages, including C/C++/Java/Python/OCaml/Haskell/Scala, etc. Just doing this will get rid of most of the mundane mistakes you make (see the sketch after these remarks).
Turn on the solver's verbose output. You might be able to notice patterns in the log. Unfortunately this is very solver-specific and can be hard to decipher, but it can indicate, for instance, that you're stuck in an e-matching loop in the presence of quantifiers.
If there's a custom algorithm for your verification problem (Hoare triples, separation logic, abstract interpretation, ...), apply those techniques first and delegate local sub-lemmas to an SMT solver. Do not expect the SMT solver to carry out large proofs, or anything that requires actual induction, out of the box.
Try reducing complexity by adding over-constraints and seeing which ones help. Based on your findings you might be able to do a case split, for instance if the over-constraints enumerate a reasonably small search space.
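Taking the first two remarks together, here is a minimal Z3py sketch (the constraints are just placeholders) that avoids hand-written SMTLib and turns up the solver's verbosity:

```python
from z3 import Ints, Solver, set_param

set_param("verbose", 10)          # print progress information while solving

x, y = Ints("x y")
s = Solver()
s.add(x > 0, y > x, x + y == 10)  # placeholder constraints
print(s.check())
print(s.model())
```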
Again, these are very general remarks, and whether they apply to your specific problem is anyone's guess. But I'd start with coding against a higher-level API if you aren't already doing so.
Z3 supports the SMT-lib set-logic statement to restrict to specific fragments. In a program using (set-logic QF_LRA) for example, quantifier-free linear real arithmetic is enabled. If multiple theories are enabled, it makes sense to me that SAT would be required. However, it's not clear to me if it's possible to enable a single theory and guarantee that SAT is never run, thereby reducing Z3 purely to a solver for that single theory alone. This would be useful for example to claim that a tool honors the particular performance bound of the solver for a given theory.
Is there a way to do this in SMT-lib, or directly in Z3? Or is guaranteeing that the SAT solver is disabled impossible?
The Nelson-Oppen theory combination algorithm that many SMT solvers essentially implement crucially relies on the SAT solver: In a sense, the SAT solver is the engine that solves your SMT query, consulting the theory solvers to make sure the conflicts it finds are propagated/resolved according to the theory rules. So, it isn't really possible to talk about an SMT solver without an underlying SAT engine, and neither SMTLib nor any other tool I'm aware of allows you to "turn off" SAT. It's an integral piece of the whole system that cannot be turned on/off at will. Here's a nice set of slides for Nelson-Oppen: https://web.stanford.edu/class/cs357/lecture11.pdf
I suppose it would be possible to construct an SMT solver that did not use this architecture; but then every theory solver would need to essentially have a SAT solver embedded into itself. So, even in that scenario, extracting the "SAT" bits out is really not possible.
If you're after precise measurements of which part of the solver spends what amount of time, your best bet is to instrument the solver to collect statistics on where it spends its time. Even then, precisely determining which parts belong to the SAT solver, which parts belong to the theory solvers, and which parts go to their combination will be tricky.
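That said, you can at least ask Z3 for a solver tuned to a single logic; this does not disable the SAT engine, but it restricts the machinery that gets used. A minimal Z3py sketch (the constraints are placeholders):

```python
from z3 import SolverFor, Reals

x, y = Reals("x y")
s = SolverFor("QF_LRA")           # quantifier-free linear real arithmetic
s.add(x + y <= 1, x - y >= 0)
print(s.check())
```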
I would like to improve the scalability of SMT solving. I have already implemented incremental solving, but I would like to improve performance further. Are there any other general methods that don't require knowledge of the problem itself?
There's no single "trick" that can make z3 scale better for an arbitrary problem. It really depends on what the actual problem is and what sort of constraints you have. Of course, this goes for any general computing problem, but it really applies in the context of an SMT solver.
Having said that, here are some general ideas based on my experience, roughly in order of ease of use:
Read the Programming Z3 book. This is a very nice write-up and will teach you a ton about how z3 is architected and what the best idioms are. You might hit something in there that directly applies to your problem: https://theory.stanford.edu/~nikolaj/programmingz3.html
Keep booleans as booleans, not integers. Never use integers to represent booleans (that is, 1 for true, 0 for false, multiplication for and, etc.); this is a terrible idea that kills the powerful SAT engine underneath. Explicitly convert if necessary. Most problems where people deploy such tricks involve counting how many booleans are true: such problems should be solved using the pseudo-boolean facilities built into the solver (look up PbEq, PbLe, etc.).
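For example, a small Z3py sketch using pseudo-boolean constraints instead of 0/1 integers (the variable names are made up):

```python
from z3 import Bools, Solver, PbEq, AtMost

flags = Bools("f0 f1 f2 f3 f4")
s = Solver()
s.add(PbEq([(f, 1) for f in flags], 3))   # exactly 3 of the booleans are true
s.add(AtMost(flags[0], flags[1], 1))      # at most one of f0, f1 is true
print(s.check(), s.model())
```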
Don't optimize unless absolutely necessary. The optimization engine is not incremental, nor is it well optimized (pun intended). It works rather slowly compared to all other engines, and for good reason: optimization modulo theories is a very tricky thing to do. Avoid it unless you really have an optimization problem to tackle. You might also try to optimize "outside" the solver: make a satisfiability call, get the result, and make subsequent calls asking for "smaller" cost values. You may not hit the optimum using this trick, but the values might be good enough after a couple of iterations. Of course, how good the results are depends entirely on your problem.
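A sketch of this "optimize outside the solver" loop in Z3py, with a placeholder cost term and placeholder constraints:

```python
from z3 import Ints, Solver, sat

x, y = Ints("x y")
cost = x + y                               # hypothetical cost term

s = Solver()
s.add(x >= 1, y >= 1, 2 * x + 3 * y >= 12) # hypothetical constraints

best = None
while s.check() == sat:
    best = s.model().eval(cost)
    s.add(cost < best)                     # ask for a strictly smaller cost next time
print("best cost found:", best)
```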
Case split. Try reducing your constraints by case-splitting on key variables. Example: if you're dealing with floating-point constraints, do a case split on normal, denormal, infinity, and NaN values separately. Depending on your particular domain, you might have such semantic categories where the underlying algorithms take different paths, and mixing and matching them will always give the solver a hard time. Case splitting based on context can speed things up.
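A toy Z3py sketch of manual case splitting, here on the sign of a variable (the constraints are placeholders); each case becomes a smaller, separate query:

```python
from z3 import Reals, Solver

x, y = Reals("x y")
base = [x * x + y * y <= 100, x + y >= 5]  # hypothetical constraints

for case in [x >= 0, x < 0]:
    s = Solver()
    s.add(base)
    s.add(case)
    print(case, "->", s.check())
```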
Use a faster machine and more memory. This goes without saying, but having plenty of memory can really speed up certain problems, especially if you have a lot of variables. Get the biggest machine you can!
Make use of your cores. You probably have a machine with many cores, and your operating system most likely provides fine-grained multitasking. Make use of this: start many instances of z3 working on the same problem but with different tactics, random seeds, etc., and take the result of the first one that completes. Random seeds can play a significant role if you have a huge constraint set, so running more instances with different seed values can get you "lucky" on average.
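As a sketch, different seeds can be set via the smt.random_seed and sat.random_seed parameters; here they are tried sequentially on a placeholder problem, but in practice you would launch separate processes and keep the first answer:

```python
from z3 import Ints, Solver, set_param

x, y = Ints("x y")
for seed in [1, 7, 42, 1234]:
    set_param("smt.random_seed", seed)     # seed for the SMT core
    set_param("sat.random_seed", seed)     # seed for the SAT engine
    s = Solver()
    s.add(x > 0, y > 0, x + y == 10)       # placeholder constraints
    print("seed", seed, "->", s.check())
```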
Try parallel solving. Most SAT/SMT solver algorithms are sequential in nature. There have been a number of papers on how to parallelize some of the algorithms, but most engines do not have parallel counterparts. z3 has an interface for parallel solving, though it's less advertised and rather finicky. Give it a try and see if it helps. Details are here: https://theory.stanford.edu/~nikolaj/programmingz3.html#sec-parallel-z3
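A sketch of switching on the experimental parallel mode from Z3py (parameter names and behaviour may vary between versions; check the parameter listing of your build):

```python
from z3 import Int, Solver, set_param

set_param("parallel.enable", True)         # let the solver use multiple threads
set_param("parallel.threads.max", 4)       # cap the number of worker threads

x = Int("x")
s = Solver()
s.add(x > 10, x < 100)                     # placeholder constraint
print(s.check())
```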
Profile. Profile the z3 source code itself as it runs on your problem and see where the hot spots are. See if you can recommend code improvements to the developers to address them (better yet, submit a pull request!). Needless to say, this requires a deep study of z3 itself and is probably not suitable for end users.
Bottom line: there's no free lunch. No single method will make z3 run better for your problems, but the above ideas might help improve run times. If you describe the particular problem you're working on in some detail, you'll most likely get better advice as it applies to your constraints.
DPLL SAT solvers typically apply a Phase Saving heuristic. The idea is to remember the last assignment of each variable and use it first in branching.
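As a toy illustration (not any particular solver's code), the heuristic boils down to a lookup table consulted at each branching decision:

```python
# Toy sketch of phase saving in a DPLL-style solver.
saved_phase = {}                  # variable -> last value it was assigned

def pick_branch_value(var, default=False):
    # Prefer the previously saved phase; fall back to a default polarity.
    return saved_phase.get(var, default)

def assign(var, value):
    saved_phase[var] = value      # remember the phase for later branches/restarts
    # ... a real solver would also update the trail, watched literals, etc.
```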
To experiment with the effects of branching polarity and phase saving, I tried several SAT solvers and modified the phase settings. All are Windows 64-bit ports, run in single-threaded mode. I always used the same example input of moderate complexity (5619 variables, 11261 clauses; in the solution 4% of all variables are true, 96% false).
The resulting run-times are listed below:
It might be just (bad) luck, but the differences are remarkably big. It is a particular surprise that MiniSat outperformed all the modern solvers on my example.
My question:
Are there any explanations for the differences?
Best practices for polarity and phase saving?
Nothing conclusive can be deduced from your test. DPLL and solvers based on it are known to exhibit heavy-tailed behavior depending on initial search conditions. This means the same solver can have both short and long runtimes on the same instance depending on factors like when random restarts occur. Search times across different solvers can vary wildly depending on (for example) how they choose decision variables, even without the added complications of phase saving and random restarts.
I am trying to use a tactic-based solver in Z3 to solve a problem with some X constraints, as opposed to the general-purpose solver.
I am using the following tactics:
simplify purify-arith elim-term-ite reduce-args propagate-values solve-eqs symmetry-reduce smt sat sat-preprocess
I apply the tactics one after another to the problem using the Z3_tactic_and_then API.
Also, I am using this technique to configure the timeout for the solver.
Now, for the same problem, if I use the general solver, it times out at the specified timeout. However, if I use the mentioned tactics, it does not time out within the given time; it runs much longer.
For example, I specified a timeout of 180*1000 milliseconds, but it only stopped after 730900 milliseconds.
I tried removing a few of the tactics mentioned above, but the behaviour was still the same.
Z3 version 4.1
Unfortunately, not every tactic respects the timeout. The tactic smt is a big "offender". This tactic wraps a very old solver implemented in Z3. Unfortunately, this solver cannot be interrupted during some expensive computations because that would leave the system in a corrupted state; that is, Z3 would crash in future operations. When this solver was implemented, a very simplistic design was used: if we want to interrupt the process, kill it. Of course, this is not acceptable when Z3 is embedded in bigger applications. Newer code is usually much more responsive to timeouts, and we try to avoid this kind of poor design.
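For completeness, here is how a timeout can be attached to a tactic pipeline from Z3py with TryFor (a sketch with placeholder constraints; as explained above, old versions of the smt tactic may still overrun the limit):

```python
from z3 import Tactic, AndThen, TryFor, Ints

t = AndThen(Tactic("simplify"), Tactic("propagate-values"),
            Tactic("solve-eqs"), Tactic("smt"))
t = TryFor(t, 180 * 1000)         # best-effort limit of 180 seconds

x, y = Ints("x y")
s = t.solver()                    # turn the tactic into a solver
s.add(x > 0, y > x, x + y == 10)  # placeholder constraints
print(s.check())
```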