Measure and bound time spent in arithmetic sub-solvers - timeout

Q1: Is it possible to query the times Z3 spent in different sub-solvers?
Calling (get-info :all-statistics) gives the overall run time of Z3, but I would like to break it down into individual sub-solvers.
I am particularly interested in the time spent in the arithmetic-related sub-solvers, more precisely in those that give rise to the statistics grobner and nonlinear-horner.
Q2: Furthermore, is it possible to put a timeout on a sub-solver?
I could imagine something like defining a timeout per check-sat and sub-solver that bounds the time Z3 can spend in that sub-solver. Z3 would repeatedly call n different sub-solvers, and if the time bound of one of them is reached, it continues, but only with the remaining n-1 sub-solvers.
I read the tactics tutorial and got the impression that this might actually be possible by something along the lines of
(repeat
  (par-or
    (try-for <arithmetic-solvers> 500)
    <all-other-solvers>))
but I couldn't figure out which solvers to use.

For Q1: No, you'd have to add your own timers for that, and I would expect this to be nontrivial, as it's not clear what exactly should and shouldn't be counted.
Q2: Yes, you can build your own custom strategies/tactics. Note that par-or means parallel or, i.e., it will try to run the provided tactics in parallel.
Not everything we call a "solver" has its own tactic, so this might require some fiddling. Note that "solver" in this context is not necessarily the same as the Z3 C++ object called "solver". Some "solvers" are also integral parts of the SMT kernel.
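For instance (only a sketch, not a definitive recipe: qfnra-nlsat and smt are existing Z3 tactics, but whether they correspond to the "arithmetic solvers" you have in mind would need checking), a sequential fall-back with a per-attempt time bound could look like this:
(declare-const x Real)
(declare-const y Real)
(assert (> (* x x y) 1.0))
(check-sat-using
  (or-else
    (try-for qfnra-nlsat 500)  ; give the nonlinear arithmetic tactic at most 500 ms
    smt))                      ; if it times out, fall back to the general SMT tactic
Unlike par-or, or-else tries its arguments one after the other, moving on when a tactic fails (which includes hitting the try-for bound).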

Related

Are statistics accumulated across multiple check-sats?

Are the numbers returned by (get-info :all-statistics) accumulated across multiple calls to check-sat, and across multiple push-pop scopes? Or are they reset at each check-sat (or at push or pop)?
Phrased differently, if I get statistics at various points during a run of Z3, and if a certain statistic, e.g. quant-instantiations, always has the same value, can I then conclude that no quantifier instantiation happened in between getting these statistics?
I did a quick search and it appears that no obj::reset_statistics() is called between check-sat calls for various objs. There are a few re-initializations happening though, so no guarantee that any of this is precise enough for your purpose.
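One way to check empirically, for the statistics you care about, is to dump them after each check-sat and compare the values; a minimal sketch (the constraints are placeholders):
(declare-fun f (Int) Int)
(declare-const a Int)
(assert (forall ((x Int)) (>= (f x) 0)))
(assert (= (f a) 5))
(check-sat)
(get-info :all-statistics)   ; note e.g. quant-instantiations here
(push)
(assert (= (f (+ a 1)) 7))
(check-sat)
(get-info :all-statistics)   ; compare against the first dump
(pop)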

which is more important, number of variables or subexpressions?

I presume the technique of detecting shared expressions is applied in most modern SMT solvers. The performance should be very good when a solver processes a sequence of similar expressions. However, I got unexpected results after running Z3 on input1 and input2. Instead of building one long constraint A as in "input1", "input2" defines some intermediate variables that map to the sub-expressions of A. In that case, input1 has fewer variables, which I expected to be solved faster than input2. I cannot find useful information in the statistics, as they are exactly the same except for the solving time and the memory consumed.
I would very much appreciate it if someone could explain what affects the performance of SMT solvers more: the number of variables or the number of subexpressions?
I've done some profiling, and it seems that both inputs behave exactly the same in the solver. All (check-sat) commands take exactly the same time. Note that input2 is a file of size 255KB, but input1 is a file of size 240MB, i.e., input1 is about 1000 times larger. According to my profiler, all of the additional time required to solve these queries is spent in the parser. So, it simply takes a long time to read and check the input; the actual queries are all easy.
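To illustrate the two encodings the question contrasts (a toy example, not the actual inputs): after parsing, Z3 stores the repeated subexpression as a single shared term either way, so the difference is essentially confined to the amount of text the parser has to read, which matches the profiling result above.
; inline style: write the subexpression out each time
(declare-const x Int)
(declare-const y Int)
(assert (> (+ (* x x) (* y y)) 0))
(assert (< (+ (* x x) (* y y)) 100))
; named style: bind the subexpression to an intermediate symbol
; (either block alone would do; both are shown here only for comparison)
(define-fun s () Int (+ (* x x) (* y y)))
(assert (> s 0))
(assert (< s 100))
(check-sat)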

How to estimate time spent in SAT solving part in z3 for SMT?

I have profiled my problems, which are in the (pseudo-nonlinear) integer/real fragment, using the profiler gprof (stats here including the call graph), and was trying to separate out the time taken into two classes:
I) The SAT solving part (including [purely] boolean propagation, [purely] boolean conflict clause detection, backjumping, and any other propositional manipulation)
II) The theory solving part (including theory consistency checks, generation of theory conflict clauses, and theory propagation).
Do lines 3280-3346 in smt_context.cpp within bounded_search() constitute the top-level DPLL(X) loop?
I believe it is easier to sum up the time in SAT solver functions (since they are fewer)
and then the rest can be considered as the theory solvers' time. I am trying to figure out which functions I should consider as falling under class I above. Are they smt::context::decide() and smt::context::bcp() within smt::context::propagate()? Any others?
smt::context::resolve_conflict() seems to be mixed with calls to the theory solvers?
Is it correct that smt::context::propagate() seems to be mostly theory propagation (class II) except for its bcp() call? Also, smt::context::final_check() seems to be purely in class II.
Any hints greatly appreciated. Thanks.
You are correct, bcp() and decide() are part of the "SAT solver".
The function final_check() is just theory reasoning: it executes procedures that Z3 "claims" to be too "expensive". The resolve_conflict() procedure is mixed: it performs lemma learning and backtracking. To generate new lemmas, Z3 uses Boolean resolution (which is in the "SAT part"). In several cases, the most expensive part of resolve_conflict() is backtracking the state of the theory solvers.

What are the advantages of the "apply" functions? When are they better to use than "for" loops, and when are they not? [duplicate]

Possible Duplicate:
Is R's apply family more than syntactic sugar
Just what the title says. Stupid question, perhaps, but my understanding has been that when using an "apply" function, the iteration is performed in compiled code rather than in the R parser. This would seem to imply that lapply, for instance, is only faster than a "for" loop if there are a great many iterations and each operation is relatively simple. For instance, if a single call to a function wrapped up in lapply takes 10 seconds, and there are only, say, 12 iterations of it, I would imagine that there's virtually no difference at all between using "for" and "lapply".
Now that I think of it, if the function inside the "lapply" has to be parsed anyway, why should there be ANY performance benefit from using "lapply" instead of "for" unless you're doing something that there are compiled functions for (like summing or multiplying, etc)?
Thanks in advance!
Josh
There are several reasons why one might prefer an apply family function over a for loop, or vice-versa.
Firstly, for(), apply(), and sapply() will generally be just as quick as each other if used correctly. lapply() does more of its operating in compiled code within the R internals than the others, so it can be faster than those functions. The speed advantage appears greatest when the act of "looping" over the data is a significant part of the compute time; in many general day-to-day uses you are unlikely to gain much from the inherently quicker lapply(). In the end, all of these will be calling R functions, so those functions need to be interpreted and then run.
for() loops can often be easier to implement, especially if you come from a programming background where loops are prevalent. Working in a loop may be more natural than forcing the iterative computation into one of the apply family functions. However, to use for() loops properly, you need to do some extra work to set up storage and manage plugging the output of the loop back together again. The apply functions do this for you automagically. E.g.:
IN <- runif(10)
OUT <- logical(length = length(IN))
for(i in seq_along(IN)) {
    OUT[i] <- IN[i] > 0.5
}
That is a silly example, as > is a vectorised operator, but I wanted something to make the point that you have to manage the output yourself. The main thing is that with for() loops, you should always allocate sufficient storage to hold the outputs before you start the loop. If you don't know how much storage you will need, then allocate a reasonable chunk, and in the loop check whether you have exhausted that storage, bolting on another big chunk when you do.
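A minimal sketch of that grow-in-chunks pattern (the chunk size and the stopping rule here are arbitrary placeholders):
chunk <- 100L                        # initial guess at how much storage is needed
OUT <- numeric(chunk)
n_used <- 0L
repeat {
    x <- runif(1)                    # stand-in for "compute the next result"
    if (x > 0.99) break              # stand-in for "no more results"
    n_used <- n_used + 1L
    if (n_used > length(OUT)) {      # storage exhausted: bolt on another chunk
        OUT <- c(OUT, numeric(chunk))
    }
    OUT[n_used] <- x
}
OUT <- OUT[seq_len(n_used)]          # trim the unused tail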
The main reason, in my mind, for using one of the apply family of functions is for more elegant, readable code. Rather than managing the output storage and setting up the loop (as shown above) we can let R handle that and succinctly ask R to run a function on subsets of our data. Speed usually does not enter into the decision, for me at least. I use the function that suits the situation best and will result in simple, easy to understand code, because I'm far more likely to waste more time than I save by always choosing the fastest function if I can't remember what the code is doing a day or a week or more later!
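For instance, the loop above collapses to a single call (vapply() is used here because the result is a logical vector of known shape; plain IN > 0.5 would of course also do, since > is vectorised):
IN  <- runif(10)
OUT <- vapply(IN, function(x) x > 0.5, logical(1))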
The apply family lend themselves to scalar or vector operations. A for() loop will often lend itself to doing multiple iterated operations using the same index i. For example, I have written code that uses for() loops to do k-fold or bootstrap cross-validation on objects. I probably would never entertain doing that with one of the apply family as each CV iteration needs multiple operations, access to lots of objects in the current frame, and fills in several output objects that hold the output of the iterations.
As to the last point, about why lapply() can possibly be faster than for() or apply(), you need to realise that the "loop" can be performed in interpreted R code or in compiled code. Yes, both will still be calling R functions that need to be interpreted, but if you are doing the looping and calling directly from compiled C code (as lapply() does) then that is where the performance gain can come from, over, say, apply(), which boils down to a for() loop in actual R code. See the source for apply() to see that it is a wrapper around a for() loop, and then look at the code for lapply(), which is:
> lapply
function (X, FUN, ...)
{
    FUN <- match.fun(FUN)
    if (!is.vector(X) || is.object(X))
        X <- as.list(X)
    .Internal(lapply(X, FUN))
}
<environment: namespace:base>
and you should see why there can be a difference in speed between lapply() and for() and the other apply family functions. The .Internal() is one of R's ways of calling compiled C code used by R itself. Apart from a manipulation, and a sanity check on FUN, the entire computation is done in C, calling the R function FUN. Compare that with the source for apply().
From Burns' R Inferno (pdf), p25:
Use an explicit for loop when each iteration is a non-trivial task. But a simple loop can be more clearly and compactly expressed using an apply function. There is at least one exception to this rule ... if the result will be a list and some of the components can be NULL, then a for loop is trouble (big trouble) and lapply gives the expected answer.
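A quick illustration of the NULL pitfall Burns alludes to (a toy example, not from the book): assigning NULL to a list component with [[<- deletes the component, whereas lapply() keeps NULL components in place.
x <- list(a = 1, b = 2, c = 3)
x[["b"]] <- NULL                 # deletes the component entirely
length(x)                        # 2
res <- lapply(1:3, function(i) if (i == 2) NULL else i)
length(res)                      # 3; res[[2]] is NULL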

Which improvements can be made to the Anytime Weighted A* algorithm?

Firstly, for those of you who don't know: an anytime algorithm is an algorithm that gets as input the amount of time it is allowed to run, and it should give the best solution it can find within that time.
Weighted A* is the same as A* with one difference in the f function
(where g is the path cost up to the node, and h is the heuristic estimate of the remaining cost to a goal):
Original: f(node) = g(node) + h(node)
Weighted: f(node) = (1-w)*g(node) + h(node)
My anytime algorithm runs Weighted A* with the weight decreasing from 1 to 0.5 until it reaches the time limit.
My problem is that most of the time it takes a long time to reach a solution, and if given something like 10 seconds it usually doesn't find a solution at all, while other algorithms like anytime beam search find one in 0.0001 seconds.
Any ideas what to do?
If I were you I'd throw the unbounded heuristic away. Admissible heuristics are much better, in that, given the weight used for a solution you've found, you can say that it is at most 1/weight times the length of an optimal solution.
A big problem when implementing A* derivatives is the data structures. When I implemented a bidirectional search, just changing from array lists to a combination of hash augmented priority queues and array lists on demand, cut the runtime cost by three orders of magnitude - literally.
The main problem is that most of the papers only give pseudo-code for the algorithm using set logic - it's up to you to actually figure out how to represent the sets in your code. Don't be afraid of using multiple ADTs for a single conceptual list, e.g. your open list. I'm not 100% sure about Anytime Weighted A*; I've done other derivatives such as Anytime Dynamic A* and Anytime Repairing A*, but not AWA* itself.
Another issue is that when you set the g-value too low, it can sometimes take far longer to find any solution than it would with a higher g-value. A common pitfall is forgetting to check your closed list for duplicate states, thus ending up in a loop (an infinite one if your g-value gets reduced to 0). I'd try starting with something reasonably higher than 0 if you're getting quick results with a beam search.
Some pseudo-code would likely help here! Anyhow, these are just my thoughts on the matter; you may have solved it already - if so, good on you :)
Beam search is not complete, since it prunes unfavorable states, whereas A* search is complete. Depending on what problem you are solving, if incompleteness does not prevent you from finding a solution (usually many correct paths exist from origin to destination), then go for beam search; otherwise, stay with AWA*. However, you can always run both in parallel if there are sufficient hardware resources.

Resources