Does Drake guarantee an optimization will succeed if the constraints are satisfied by the initial guess?

If I create a mathematical program in Drake, with some constraints and some costs, and give it an initial guess that satisfies all the constraints, are all Drake solvers guaranteed to find a solution? And is that solution guaranteed to have a cost less than or equal to the cost of the initial guess?

No, none of Drake's supported solvers guarantee these.
I assume you have a nonlinear, non-convex optimization problem. For these problems Drake supports the SNOPT, IPOPT and NLopt solvers. None of these solvers guarantees that the constraints remain satisfied during the iterative optimization process. Temporarily violating the constraints enables the solver to find possibly better solutions. For example, if you have the problem
min x
s.t. x² = 1
and you start from the initial guess x = 1, you would probably hope the solver finds the better solution x = -1, but to jump from x = 1 to x = -1 the solver has to pass through intermediate values (like x = 0.5, 0, ...) that violate the constraint.
Nor is the solution guaranteed to have a cost less than or equal to the cost of the initial guess; the solver could end up with a worse solution. In non-convex optimization there isn't much we can guarantee. If you find the solver returns a worse solution than your initial guess, you can simply keep the initial guess.
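None of this is specific to Drake; any local NLP solver behaves this way. As a quick illustration (a sketch using scipy's SLSQP rather than Drake's API), starting the example above at the feasible guess x = 1 typically just returns x = 1, which is a local solution, without ever discovering x = -1:

```python
from scipy.optimize import minimize

# min x  subject to  x^2 = 1, starting from the feasible guess x = 1.
# x = 1 is a KKT point of this problem, so a local solver is entitled to
# stop right there and never visit the better solution x = -1.
res = minimize(
    fun=lambda x: x[0],
    x0=[1.0],
    method="SLSQP",
    constraints=[{"type": "eq", "fun": lambda x: x[0] ** 2 - 1.0}],
)
print(res.x[0])  # stays at (or very near) the initial guess x = 1
```

A solver started from a different guess (say x = 0.5) may well pass through infeasible intermediate iterates on its way to whichever root it converges to, which is exactly the behavior described above.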

Related

Impact of input order on performance of constraint solver

Does the order of the input (boolean and arithmetic equations) matter to constraint solvers like Gecode and SMT solvers like Microsoft Z3? If yes, which of the two will perform better, given that I can take advantage of some pre-known heuristics using the branching function in Gecode?
(Note: I don't know whether a function similar to branch() in Gecode exists in Z3.)
In theory, no: the order of assertions should not make a difference. In practice, however, it can have an impact, as heuristics might end up spending a lot of time in dead ends. SMT solvers usually work as black boxes, i.e., it's hard to see how they are progressing unless you know their exact internals. You can, however, turn up verbosity (use z3's -v flag) and look at the output to see if you spot any divergent behavior.
As with any general "performance of SMT solver" question, it is impossible to answer in the abstract. Each problem instance has specific characteristics, and there might be different ways of coding it to make it easier for the solver. If you post specific problems, you can get better suggestions.

How to leverage Z3 SMT solver for ILP problems

Problem
I'm trying to use z3 to disprove reachability assertions on a Petri net.
So I declare N state variables v0, ..., vN-1, which are positive integers, one for each place of the Petri net.
My main strategy, given an atomic proposition P on states, is the following:
compute (with an exterior engine) any "easy" positive invariants as linear constraints on the variables, of the form alpha_0 * v_0 + ... = constant, with only positive or zero alpha_i; then check_sat whether any state reachable under these constraints satisfies P; if unsat, conclude; else
compute (externally to z3) generalized invariants, where the alpha_i can be negative as well, and check_sat; conclude if unsat; else
add one positive variable t_i per transition of the system, and assert the Petri net state equation: any reachable state has a Parikh firing-count vector (a valuation of the t_i) such that M0, the initial state, plus the product of the incidence matrix by this Parikh vector gives the reached state. This step introduces many new variables, but the problem stays a linear integer program.
I separate the steps because, since I want UNSAT, any check_sat that returns unsat stops the procedure, and the last step in particular is very costly.
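For readers unfamiliar with the state equation in step 3: a reachable marking M must satisfy M = M0 + C·t for some nonnegative integer Parikh vector t, where C is the incidence matrix. A tiny numpy sketch of the relation being encoded, on a hypothetical 2-place, 2-transition net:

```python
import numpy as np

# Hypothetical net with 2 places and 2 transitions; C[i, j] is the net
# effect of firing transition j on place i (post minus pre).
C = np.array([[-1, 1],
              [ 1, -1]])
M0 = np.array([1, 0])                 # initial marking

# State equation: a reachable marking M must satisfy M = M0 + C @ t
# for some nonnegative integer Parikh vector t (firing counts).
t = np.array([1, 0])                  # fire transition 0 once
M = M0 + C @ t
print(M)                              # [0 1]
```

The solver-side constraints additionally demand M >= 0 and t >= 0; note that satisfying the state equation is necessary, not sufficient, for reachability, which is why an unsat answer (and only an unsat answer) is conclusive here.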
I have issues with larger models, where I get prohibitively long answer times or even the dreaded "unknown" answer, particularly when adding state equation (step 3).
Background
So, besides splitting the problem into incrementally harder segments, I've tried setting the logic to QF_LRA rather than QF_LIA, declaring the variables as Real rather than Int.
This overapproximation is computationally friendly (z3 is fast on these!), but unfortunately for many models the solutions are not integers, nor is there an integer solution.
So I've tried keeping Reals but specifying that each variable is either = 0 or >= 1, to remove solutions with fractions of firings below 1. This does eliminate spurious solutions, but it "kills" z3 (timeout or unknown) in many cases; the problem is apparently much harder (e.g. harder than with just integers).
Examples
I don't have a small example to show, though I can produce some easily. The problem is that if I go for QF_LIA, it becomes prohibitively slow beyond a certain number of variables. For scale, there are many more transitions than places, so adding the state equation really ups the variable count.
This code is generating the examples I'm asking about.
Slides 5 and 6 of this general presentation express the problem I'm encoding precisely, and slides 7 and 8 develop the results of what "unsat" gives us, if you want more mathematical background.
I'm generating problems from the Model Checking Contest, with up to thousands of places (primary variables) and in some cases over a hundred thousand transitions. Those are the extremes; the middle range, which I would really like to deal with, is a few thousand places and maybe 20 thousand transitions.
Reals plus the "= 0 or >= 1" constraint are not a good solution even for some smaller problems. Integers are slow from the get-go.
I could try Reals, then move to Integers if I get a non-integral solution. I have not tried that; though it involves pretty much killing and restarting the solver, it might be a decent approach on my benchmark set.
What I'm looking for
I'm looking for settings for Z3 that can help it better deal with the problems I'm feeding it, to give it some insight.
I have some a priori ideas about what could solve these problems: traditionally they've been fed to ILP solvers. So I'm hoping to trigger a simplex of some sort, but maybe there are conditions preventing z3 from using the "good" solution strategy in some cases.
I've become a decent-level SMT/Z3 user, but I've never played with the fine-grained option settings to guide the solver.
Have any of you tried feeding what are basically ILP problems to SMT, and found option settings or particular encodings that help it deploy the right strategies? Thanks.

Maximizing minimum in z3

I want to find the n-dimensional point (x1, ..., xn) in integer space that satisfies some properties, while also maximizing the minimum distance between x and any element of a collection of m pre-defined, constant n-dimensional points (z11...z1n, z21...z2n, ..., zm1...zmn). Is there a way to do this using Z3?
Sure. See: https://rise4fun.com/Z3/tutorial/optimization
The above link covers the SMTLib interface, but the same functionality is also available from the Python interface (and from most other bindings to Z3).
Note that optimization is largely for linear properties. If you have non-linear terms, you might want to reformulate them so that a linear counterpart can be optimized instead. Even with non-linear terms you might get good results; it's impossible to know without trying.
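Independently of the solver, the objective itself is easy to state. Here is a brute-force, pure-Python sketch of the max-min criterion (no Z3; the names and the tiny instance are my own illustration), which is also handy for cross-checking an optimizer's answer on small cases:

```python
from itertools import product

def maximin_point(zs, lo, hi):
    """Over the integer box [lo, hi]^n, return the point maximizing the
    minimum Euclidean distance to the fixed points in zs, and that distance."""
    best, best_d = None, -1.0
    n = len(zs[0])
    for x in product(range(lo, hi + 1), repeat=n):
        d = min(sum((a - b) ** 2 for a, b in zip(x, z)) ** 0.5 for z in zs)
        if d > best_d:
            best, best_d = x, d
    return best, best_d

best, d = maximin_point([(0, 0), (4, 0)], 0, 4)
print(best, d)  # (2, 4) with distance sqrt(20)
```

In a Z3 Optimize model, the same criterion would be expressed by introducing a variable d, asserting d <= dist(x, z_j) for every j, and maximizing d (with squared distances to stay in linear/quadratic integer arithmetic).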

Improving scalability when using Z3Py for passive learning

My team has been using the Z3 solver to perform passive learning. Passive learning entails obtaining from a set of observations a model consistent with all observations in the set. We consider models of different formalisms, the simplest being Deterministic Finite Automata (DFA) and Mealy machines. For DFAs, observations are just positive or negative samples.
The approach is very simplistic. Given the formalism and observations, we encode each observation into a Z3 constraint over (uninterpreted) functions which correspond to functions in the formalism definition. For DFAs for example, this definition includes a transition function (trans: States X Inputs -> States) and an output function (out: States -> Boolean).
Encoding say the observation (aa, +) would be done as follows:
out(trans(trans(start,a),a)) == True
Where start is the initial state. To construct a model, we add all the observation constraints to the solver. We also add a constraint which limits the number of states in the model. We solve the constraints for a limit of 1, 2, 3... states until the solver can find a solution. The solution is a minimum state-model that is consistent with the observations.
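As a z3-free illustration of what this search is doing (brute force only, so viable just for toy sizes; the actual encoding uses uninterpreted functions as described above): enumerate all transition and output functions for k = 1, 2, ... states and return the first k for which some DFA is consistent with the samples.

```python
from itertools import product

def run(trans, word, start=0):
    """Run the DFA's transition function from `start`, return the final state."""
    q = start
    for a in word:
        q = trans[(q, a)]
    return q

def min_dfa(samples, alphabet, max_states=4):
    """Smallest k such that some k-state DFA is consistent with `samples`."""
    for k in range(1, max_states + 1):
        keys = [(q, a) for q in range(k) for a in alphabet]
        for trans_vals in product(range(k), repeat=len(keys)):
            trans = dict(zip(keys, trans_vals))
            for out in product([False, True], repeat=k):
                if all(out[run(trans, w)] == label for w, label in samples):
                    return k
    return None

# ("", -), ("a", +), ("aa", -): "odd number of a's" needs two states.
samples = [("", False), ("a", True), ("aa", False)]
print(min_dfa(samples, "a"))  # 2
```

The SMT version replaces the two inner enumeration loops with a single check_sat over symbolic trans and out, which is exactly where the solver's search heuristics take over.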
I posted a code snippet using Z3Py which does just this. Predictably, our approach is not scalable (the problem is NP-complete). I was wondering if there were any (small) tweaks we could perform to improve scalability? (in the way of trying out different sorts, strategies...)
We have already tried arranging all observations into a Prefix Tree and using this tree in encoding, but scalability was only marginally improved. I am well aware that there are much more scalable SAT-based approaches to this problem (reducing it to a graph coloring problem). We would like to see how far a simple SMT-based approach can take us.
So far, what I have found is that the best Sorts for defining inputs and states are uninterpreted sorts created with DeclareSort. It also helps if we eliminate quantifiers from the state-size constraint. Interestingly, incremental solving did not really help, but it could be that I am not using it properly (I am an utter novice in SMT theory).
Thanks! BTW, I am unsure how viable/useful this test is as a benchmark for SMT solvers.

Are heuristic functions that produce negative values inadmissible?

As far as I understand, admissibility for a heuristic means staying within the bound of the actual cost to the goal for a given, evaluated node. I've had to design some heuristics for an A* search over state spaces and have seen large efficiency gains using a heuristic that sometimes returns negative values, thereby giving nodes that are more 'closely formed' to the goal state a higher place in the frontier.
However, I worry that this is inadmissible, but can't find enough information online to verify it. I did find one paper from the University of Texas that seems to mention, in one of the later proofs, that "...since heuristic functions are nonnegative". Can anyone confirm this? I assume it is because returning a negative value from your heuristic function would make the f-cost negative (and therefore interfere with the 'default' Dijkstra-esque behavior of A*).
Conclusion: heuristic functions that produce negative values are not inadmissible per se, but they have the potential to break the guarantees of A*.
Interesting question. Fundamentally, the only requirement for admissibility is that a heuristic never over-estimates the distance to the goal. This is important, because an overestimate in the wrong place could artificially make the best path look worse than another path, and prevent it from ever being explored. Thus a heuristic that can provide overestimates loses any guarantee of optimality. Underestimating does not carry the same costs. If you underestimate the cost of going in a certain direction, eventually the edge weights will add up to be greater than the cost of going in a different direction, so you'll explore that direction too. The only problem is loss of efficiency.
If all of your edges have positive costs, a negative heuristic value can only ever be an underestimate. In theory, an underestimate should only ever be worse than a more precise estimate, because it provides strictly less information about the potential cost of a path and is likely to result in more nodes being expanded. Nevertheless, it is not inadmissible.
However, here is an example that demonstrates that it is theoretically possible for negative heuristic values to break the guaranteed optimality of A*:
In this graph, it is obviously better to go through nodes A and B: that path has a cost of three, as opposed to six, the cost of going through nodes C and D. However, the negative heuristic values for C and D will cause A* to reach the end through them before exploring nodes A and B. In essence, the heuristic function keeps thinking that this path is going to get drastically better, until it is too late. Most implementations of A* will return the wrong answer here, although you can correct for the problem by continuing to explore other nodes until the smallest f(n) remaining in the frontier is greater than the cost of the path you found. Note that there is nothing inadmissible or inconsistent about this heuristic. I'm actually really surprised that non-negativity is not more frequently mentioned as a rule for A* heuristics.
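A concrete, runnable version of such an example (the graph, edge costs, and heuristic values below are my own illustration, not the original figure): path S→A→B→G costs 3, path S→C→D→G costs 6, and a heuristic that never overestimates the true remaining cost but is negative on C, D, and the goal sends A* down the expensive path.

```python
import heapq

# Illustrative graph: S->A->B->G costs 3, S->C->D->G costs 6.
graph = {
    "S": [("A", 1), ("C", 2)],
    "A": [("B", 1)],
    "B": [("G", 1)],
    "C": [("D", 2)],
    "D": [("G", 2)],
    "G": [],
}

def astar(h, start="S", goal="G"):
    """Standard A*: terminate when the goal is popped from the frontier."""
    frontier = [(h[start], 0, start)]        # (f = g + h, g, node)
    best_g = {}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if node in best_g and best_g[node] <= g:
            continue
        best_g[node] = g
        for nxt, w in graph[node]:
            heapq.heappush(frontier, (g + w + h[nxt], g + w, nxt))
    return None

zero_h = dict.fromkeys(graph, 0)             # plain Dijkstra behavior
negative_h = {"S": 0, "A": 0, "B": 0, "C": -10, "D": -10, "G": -10}

print(astar(zero_h))      # 3: the optimal path through A and B
print(astar(negative_h))  # 6: lured through C and D
```

Note that the pathology here hinges on the heuristic being negative at the goal itself: the goal entry reached via C and D gets f = 6 - 10 = -4 and is popped before the cheaper frontier nodes ever get expanded. With h(goal) = 0, goal-pop termination would still return the optimal cost of 3.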
Of course, all that this demonstrates is that you can't freely use heuristics that return negative values without fear of consequences. It is entirely possible that a given heuristic for a given problem would happen to work out really well despite being negative. For your particular problem, it's unlikely that something like this is happening (and I find it really interesting that it works so well for your problem, and still want to think more about why that might be).
