Using Z3 for ALLSAT

I'm using Z3 as a black box to find all possible combinations of some real-world objects with C# code like this:
while (solver.Check() == Status.SATISFIABLE)
{
    SATModel = solver.Model;
    ....
    // invert the model
    ....
    solver.Assert(InvertedModel);
}
For most of my problems the program is working fine, but now I have a bigger problem, where there would be 8.5E+64 possible combinations without constraints.
I'm starting with some 6000 constraints.
What I observe is that a check takes less than 0.02 seconds at the beginning and slows down gradually. After 100000 solutions found it already takes 1 second per iteration, and after 130000 iterations I measure 2 seconds.
Is there an easy way to improve the performance?

It's not unreasonable for the solver to take longer and longer with each added constraint. But to make sure it's not some sort of memory leak on the C# side, you should check that the time taken in your while loop is really spent in the Check part and not in the invert/assert part. If you determine that z3 is the responsible party, filing an issue at https://github.com/Z3Prover/z3/issues might get you a better answer from the developers.
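One way to check where the time goes is to accumulate wall-clock time per phase of the loop. The following is a minimal, language-neutral sketch in Python rather than C#; `PhaseTimer`, `enumerate_all`, `check` and `block` are hypothetical names, and the toy "solver" simply walks through all 3-bit assignments, so it stands in for Z3 only structurally.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time and call counts per named phase."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1


def enumerate_all(check, block, timer):
    """ALLSAT-style loop: check() returns a model or None,
    block(model) excludes that model from future checks."""
    models = []
    while True:
        with timer.phase("check"):
            model = check()
        if model is None:
            return models
        models.append(model)
        with timer.phase("invert+assert"):
            block(model)


# Hypothetical stand-in for a solver: all assignments of three bits.
candidates = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
blocked = set()


def check():
    for m in candidates:
        if m not in blocked:
            return m
    return None


def block(model):
    blocked.add(model)


timer = PhaseTimer()
found = enumerate_all(check, block, timer)
for name, total in timer.totals.items():
    print(f"{name}: {total:.6f}s over {timer.counts[name]} calls")
```

If the "check" total dominates and grows with the number of blocked models, the slowdown is in the solver. If "invert+assert" grows instead, the problem is more likely on the calling side.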

Related

Apache-camel Xpathbuilder performance

I have the following question. I set up a Camel project to parse certain XML files. I have to selectively extract certain nodes from a file.
I have two files, 246 KB and 347 KB in size. In this example I am extracting a parent-child pair for each of 250 nodes.
With the default factory the times are 77 secs and 106 secs for the 246 KB and 347 KB files respectively. I wanted to improve the performance, so I switched to Saxon, and the times are now 47 secs and 54 secs. That cut the time by roughly 40 to 50 percent.
Is it possible to cut the time further? Any other factory or optimizations I can use would be appreciated.
I am using XPathBuilder to evaluate the XPaths; here is an example. Is it possible to avoid creating an XPathBuilder repeatedly? It seems that it has to be constructed for every XPath. I would like to have one instance and keep feeding XPaths into it; maybe that would improve performance further.
return XPathBuilder.xpath(nodeXpath)
        .saxon()
        .namespace(Consts.XPATH_PREFIX, nameSpace)
        .evaluate(exchange.getContext(), exchange.getIn().getBody(String.class), String.class);
Adding more details based on Michael's comments: I am in effect joining the results, as the example below should make clear. I am combining them into a JSON document.
Let's say we have the following mappings for the first and second path.
pData.tinf.rexd: bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:ReqdExctnDt/text()
pData.tinf.pIdentifi.instId://bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:CdtTrfTxInf[{1}]/bm:PmtId/bm:InstrId/text()
This would result in a JSON document like the one below:
pData: {
    tinf: {
        rexd: <value_from_xml>,
        pIdentifi: {
            instId: <value_from_xml>
        }
    }
}
Hard to say without seeing your actual XPath expression, but given the file sizes and execution time my guess would be that you're doing a join which is being executed naively as a cartesian product, i.e. with O(n*m) performance. There is probably some way of reorganizing it to have logarithmic performance, but the devil is in the detail. Saxon-EE is quite good at optimizing join queries automatically; if not, there are often ways of doing it manually -- though XSLT gives you more options (e.g. using xsl:key or xsl:merge) than XPath does.
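To illustrate the difference between a naive join and an indexed one, here is a minimal sketch in Python (not XPath or XSLT). The record layout and the function names are hypothetical. The hash index built here gives roughly linear behaviour, which is the kind of improvement an index on the join key is meant to provide.

```python
from collections import defaultdict


def nested_loop_join(parents, children):
    """Naive join: compares every parent with every child, O(n*m)."""
    return [
        (p["id"], c["value"])
        for p in parents
        for c in children
        if c["parent_id"] == p["id"]
    ]


def indexed_join(parents, children):
    """Builds a hash index on the join key once, then looks up, O(n+m)."""
    by_parent = defaultdict(list)
    for c in children:
        by_parent[c["parent_id"]].append(c["value"])
    return [(p["id"], v) for p in parents for v in by_parent[p["id"]]]


# Hypothetical data: 3 parents, 6 children spread evenly over them.
parents = [{"id": i} for i in range(3)]
children = [{"parent_id": i % 3, "value": f"v{i}"} for i in range(6)]

naive = nested_loop_join(parents, children)
indexed = indexed_join(parents, children)
```

Both functions return the same pairs; only the amount of work differs as the inputs grow.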
Actually, I was able to bring the time down to 10 secs. I am using Apache Camel, so I added threads there so that multiple files can be read in separate threads. Once a file had been read, the processing was serial, with its duration depending on the number of nodes that had to be traversed. I realized that it did not need to be serial, so I introduced parallelStream, and that gave it enough power. One thing to guard against is a proliferation of threads, since that can degrade performance, so I try to restrict the number of threads to two or three times the number of cores on the machine.
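The two ideas in this thread, parsing the document once and bounding the number of worker threads, can be sketched as follows. This is Python with the standard library's ElementTree, not Camel or Saxon. The namespace URI, the sample document and the `lookup` helper are made up for illustration, and Python threads are used only to show the pool-size rule, not to claim a speed-up.

```python
import os
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

NS = {"bm": "urn:example:bm"}  # hypothetical namespace URI

XML = """<bm:Document xmlns:bm="urn:example:bm">
  <bm:PmtInf><bm:ReqdExctnDt>2020-01-01</bm:ReqdExctnDt></bm:PmtInf>
  <bm:PmtInf><bm:ReqdExctnDt>2020-01-02</bm:ReqdExctnDt></bm:PmtInf>
</bm:Document>"""

# Parse once and reuse the tree for every lookup.
root = ET.fromstring(XML)


def lookup(index):
    """Returns the requested-execution date of the index-th PmtInf (1-based)."""
    path = f"bm:PmtInf[{index}]/bm:ReqdExctnDt"
    return root.findtext(path, namespaces=NS)


# Bound the pool to twice the core count, as suggested above.
max_workers = 2 * (os.cpu_count() or 1)
with ThreadPoolExecutor(max_workers=max_workers) as pool:
    dates = list(pool.map(lookup, [1, 2]))
```

`pool.map` preserves input order, so the results can be combined into the output document in a predictable sequence.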

(Sub)optimal way to get a legit range info when using a SMT constraint with Z3

This question is related to my previous question:
Is it possible to get a legit range info when using a SMT constraint with Z3
It seems that "efficiently" finding the maximum range is not practical, given typical 32-bit vectors and so on. On the other hand, I am wondering whether it is feasible to find a "sub-maximum" range, which would hopefully be more efficient. We may also want a "safe" guarantee: all elements in the sub-maximum range must satisfy the constraint, although other solutions outside the range may satisfy the constraint as well.
I am currently exploring whether model counting techniques could make sense in this setting. Any thoughts would be appreciated very much. Thanks.
General case
This is not just a question of efficiency. Consider a problem where you have two variables a and b, and a single constraint:
a != b
What's the range of b? (maximum or otherwise?)
You can say all values are legitimate. But that would be wrong, as obviously the choice of a impacts the choice of b. The more variables you have around, the more complicated the problem will become. I don't think the problem is even well defined in this case, so searching for a solution (efficient or otherwise) doesn't make much sense.
Single variable assumption
Having said that, I think you can come up with a solution if you assume there's precisely one variable in the system. (Or, alternatively, if you fix all the other variables to some predefined constants.) If you're willing to go down this path, then you can implement a binary search algorithm to find a reasonably sized range by simply proving the quantified formula
Exists([b], And(b >= minBound, b <= maxBound, Not(constraints)))
Once you get unsat for this, you have your range. So long as you get sat, you can adjust your minBound/maxBound to search within smaller ranges. In the worst case, this can turn into a linear walk, but you can "cut-down" this search by making sure you go down a significant size in each step. That could be a parameter to the whole search, depending on how large you want your intervals to be. It'll have to be a choice between trying to find a maximal range, and how long you want to spend in this search. Of course, if you cut-down too much, you can miss a big interval, but that's the cost of efficiency.
Example1 (Good case) There's a single constraint that says b != 5. Then your search will be quick and depending on which branch you'll go, you'll either find [0, 4] or [6, 255] assuming 8-bit words.
Example2 (Bad case) There's a single constraint that says b is even. Then your search will exhibit worst-case behavior, and if your "cut-down" size is 1, you'll possibly iterate 255 times before you settle down on [0, 0]; assuming z3 gives you the maximum odd number in each call.
I hope that illustrates the point. In general, though, I'd assume you'd be closer to the "good case" for practical applications and even if your cut-down size is minimal you can most likely converge in a few iterations. Of course, this entirely depends on your problem domain, but I'd expect it to hold for software analysis in general.
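The shrinking search can be sketched as below, in Python and without Z3. The solver query is replaced by a brute-force oracle over 8-bit values that returns the largest counterexample, matching the assumption made in Example2. `max_violation` and `safe_range` are hypothetical names, and the "cut-down" size is effectively 1: each step drops only the counterexample and keeps the larger remaining side.

```python
from typing import Callable, Optional, Tuple


def max_violation(constraint: Callable[[int], bool], lo: int, hi: int) -> Optional[int]:
    """Brute-force stand-in for the query
    Exists b. lo <= b <= hi and not constraint(b).
    Returns the largest counterexample, or None (the 'unsat' case)."""
    for b in range(hi, lo - 1, -1):
        if not constraint(b):
            return b
    return None


def safe_range(
    constraint: Callable[[int], bool], lo: int = 0, hi: int = 255
) -> Tuple[Optional[Tuple[int, int]], int]:
    """Shrinks [lo, hi] until every value in it satisfies the constraint.
    Returns (range or None if nothing is left, number of oracle calls)."""
    calls = 0
    while lo <= hi:
        calls += 1
        cex = max_violation(constraint, lo, hi)
        if cex is None:
            return (lo, hi), calls
        # Keep the larger side of the counterexample.
        if cex - lo >= hi - cex:
            hi = cex - 1
        else:
            lo = cex + 1
    return None, calls


good_range, good_calls = safe_range(lambda b: b != 5)
bad_range, bad_calls = safe_range(lambda b: b % 2 == 0)
print(good_range, good_calls)
print(bad_range, bad_calls)
```

With this particular oracle and tie-breaking rule, the good case settles after two queries while the bad case needs one query per odd value. A larger cut-down step would trade interval size for fewer queries.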

No solutions found in MT StrategyTester optimization mode when using INIT_FAILED or INIT_PARAMETERS_INCORRECT

I'm trying to optimize my current EA, which has approximately 40 different inputs, with MetaTrader's genetic algorithm.
The inputs have constraints such as I1 < I2 < I3, I24 > 0, ... for a total of about 20 constraints.
I tried to filter out the solutions that do not respect the constraints with the following code:
int OnInit(){
    if(I1 >= I2 || I2 >= I3) {
        return(INIT_FAILED);
    }
    ...
}
The problem is then the following: no viable solutions are found after the first 512 iterations and the optimization stops (the same happens with the non-genetic optimizer).
If I remove the constraints, the algorithm runs and optimizes the solutions, but then those solutions do not respect the constraints.
Has anyone already faced similar issues? Currently I think I'll have to use an external tool to optimize, but this does not feel right.
With the shortcut located in the OnInit(){...} handler, as Daniel recommended yesterday, the genetic-mode optimiser will, and has to, give up, because it has seen no progress on its evolutionary journey across the recent population modifications/mutations.
What surprised me is that the fully-meshed mode (going across the whole Cartesian parameterSetSPACE) refused to test each and every parameterSetSPACE vector, one after another. Having spent hundreds of machine-years on this very sort of testing, I find this at odds with my prior MT4 [Strategy Tester] experience.
One more trick:
Let me share one more option:
let the tested code pass through OnInit(){...}, but make the conditions shortcut the OnTick(){...} event handler, returning straight away upon entering it. We invented this trick so that our code could simulate delayed starts (an internal time-based iterator for a sliding-window location in the flow of time) of the actual trading under test. This way one can simulate the adverse effect of "wrong" parameterSet vectors, and the genetic optimiser can evolve further, even finding as a side effect which kinds of parametrisation get penalised :o)
SearchSpace having 40+ parameters? ... The Performance!
If this is your concern, the next level of performance comes once you start using a distributed-computing testing farm, where many machines run tests on centrally distributed parameterSet vectors and report back the results.
This was indeed a performance booster for our Quant R&D.
After some time, we also implemented a "standalone" farm for (again, distributed-computing) off-platform Quant R&D prototyping and testing.

Measure and bound time spent in arithmetic sub-solvers

Q1: Is it possible to query the times Z3 spent in different sub-solvers?
Calling (get-info :all-statistics) gives the overall run time of Z3, but I would like to break it down into individual sub-solvers.
I am particularly interested in the time spent in arithmetic-related sub-solver, more precisely, in those that give rise to the statistics grobner and nonlinear-horner.
Q2: Furthermore, is it possible to put a timeout on sub-solver?
I could imagine something like defining a timeout per check-sat and sub-solver that bounds the time Z3 can spend in that sub-solver. Z3 would repeatedly call n different sub-solvers, and if the time bound of one of them is reached it continues, but only uses the remaining n-1 sub-solvers.
I read the tactics tutorial and got the impression that this might actually be possible by something along the lines of
(repeat
  (par-or
    (try-for <arithmetic-solvers> 500)
    <all-other-solvers>))
but I couldn't figure out which solvers to use.
For Q1: No, you'd have to add your own timers on that and I would expect this to be nontrivial as it's not clear what exactly should and shouldn't be counted.
Q2: Yes, you can build your own custom strategies/tactics. Note that par-or means parallel or, i.e., it will try to run the provided tactics in parallel.
Not everything we call a "solver" has its own tactic, so this might require some fiddling. Note that "solver" in this context is not necessarily the same as the Z3 C++ object called "solver". Some "solvers" are also integral parts of the SMT kernel.

Which Improvements can be done to AnyTime Weighted A* Algorithm?

Firstly, for those of you who don't know: an anytime algorithm is an algorithm that gets as input the amount of time it can run and should give the best solution it can in that time.
Weighted A* is the same as A* with one difference in the f function:
(where g is the path cost up to the node, and h is the heuristic estimate from the node to a goal)
Original = f(node) = g(node) + h(node)
Weighted = f(node) = (1-w)g(node) +h(node)
My anytime algorithm runs Weighted A* with the weight decreasing from 1 to 0.5 until it reaches the time limit.
My problem is that most of the time it takes a lot of time until it reaches a solution, and if given something like 10 seconds it usually doesn't find a solution, while other algorithms like anytime beam search find one in 0.0001 seconds.
Any ideas what to do?
If I were you I'd throw the unbounded heuristic away. Admissible heuristics are much better in that given a weight value for a solution you've found, you can say that it is at most 1/weight times the length of an optimal solution.
A big problem when implementing A* derivatives is the data structures. When I implemented a bidirectional search, just changing from array lists to a combination of hash augmented priority queues and array lists on demand, cut the runtime cost by three orders of magnitude - literally.
The main problem is that most of the papers only give pseudo-code for the algorithm using set logic - it's up to you to actually figure out how to represent the sets in your code. Don't be afraid of using multiple ADTs for a single list, i.e. your open list. I'm not 100% sure on Anytime Weighted A*, I've done other derivatives such as Anytime Dynamic A* and Anytime Repairing A*, not AWA* though.
Another issue is when you set the g-value too low: sometimes it can take far longer to find any solution than it would with a higher g-value. A common pitfall is forgetting to check your closed list for duplicate states, thus ending up in a loop (an infinite one if your g-value gets reduced to 0). I'd try starting with something reasonably higher than 0 if you're getting quick results with a beam search.
Some pseudo-code would likely help here! Anyhow these are just my thoughts on the matter, you may have solved it already - if so good on you :)
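For what it's worth, here is a minimal sketch of the data-structure advice in Python: a binary heap as the open list, a dict for the best known g per state, and a closed set to skip duplicates. Note that it uses the common formulation f = g + eps * h with eps >= 1, not the (1-w) form from the question, and that the grid world, `weighted_astar` and `anytime_wastar` are hypothetical. The deadline is only checked between runs, so the first (greediest) run always completes.

```python
import heapq
import time
from typing import Dict, List, Optional, Sequence, Tuple

Cell = Tuple[int, int]


def weighted_astar(grid: List[str], start: Cell, goal: Cell, eps: float) -> Optional[int]:
    """Weighted A* on a 4-connected grid ('#' is a wall), f = g + eps * h.
    Returns the cost of the path found, or None if the goal is unreachable."""

    def h(cell: Cell) -> int:
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_g: Dict[Cell, int] = {start: 0}       # hash lookup of best known g
    open_heap = [(eps * h(start), 0, start)]   # priority queue ordered by f
    closed = set()
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale duplicate entry, already expanded
        if cell == goal:
            return g
        closed.add(cell)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + eps * h((nr, nc)), ng, (nr, nc)))
    return None


def anytime_wastar(
    grid: List[str],
    start: Cell,
    goal: Cell,
    budget_s: float,
    eps_schedule: Sequence[float] = (5.0, 3.0, 2.0, 1.5, 1.0),
) -> Optional[int]:
    """Runs weighted A* with decreasing eps and keeps the best cost found.
    The time budget is checked between runs only."""
    deadline = time.monotonic() + budget_s
    best: Optional[int] = None
    for eps in eps_schedule:
        cost = weighted_astar(grid, start, goal, eps)
        if cost is not None and (best is None or cost < best):
            best = cost
        if time.monotonic() >= deadline:
            break
    return best


grid = ["....",
        ".##.",
        "...."]
best_cost = anytime_wastar(grid, (0, 0), (2, 3), budget_s=5.0)
```

Starting with a large eps makes the first run behave almost greedily, which is usually what produces a quick first solution; later runs with smaller eps can only improve on it.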
Beam search is not complete since it prunes unfavorable states whereas A* search is complete. Depending on what problem you are solving, if incompleteness does not prevent you from finding a solution (usually many correct paths exist from origin to destination), then go for Beam search, otherwise, stay with AWA*. However, you can always run both in parallel if there are sufficient hardware resources.
