Accessing suboptimal solution in Mosek+Cvxpy - cvxpy

We are using the Mosek solver via its CVXPY interface.
We deal with large-scale optimization problems on a regular basis and sometimes the runtime is very high, so we specify an upper limit on runtime using Mosek's mosek.dparam.optimizer_max_time parameter.
In those cases, the pain point is that we get no solution.
Is it possible to get the suboptimal/best found solution so far?

If Mosek did not find any feasible integer solution within the time limit then there is nothing to return, so you get nothing.
If Mosek found some feasible integer solution then CVXPY should return it with solution status s.OPTIMAL_INACCURATE, judging from a quick look at the code.
So the question is what the log output says and what happens at the end of the optimization when CVXPY processes the answer from the solver.
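For reference, here is a minimal sketch of how the time limit can be passed through CVXPY and how a possibly inaccurate solution would be inspected (the problem and the variable names are placeholders, not the original model):

import cvxpy as cp
import mosek

x = cp.Variable(10, integer=True)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - 3)))

# Hand the MOSEK time limit through CVXPY; after 60 seconds MOSEK stops
# and reports whatever incumbent it has found so far (if any).
prob.solve(solver=cp.MOSEK,
           mosek_params={mosek.dparam.optimizer_max_time: 60.0})

# "optimal_inaccurate" indicates a feasible but not provably optimal solution.
print(prob.status, prob.value)
if x.value is not None:
    print(x.value)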

Related

Can I get a solution using "timeout" when using Optimize.minimize()?

I'm trying to minimize a variable, but z3 takes too long to give me a solution.
I would like to know if it's possible to get a solution when the timeout gets triggered.
If yes, how can I do that?
Thanks in advance!
If by "solution" you mean the latest approximation of the optimal value, then you may be able to retrieve it, provided that the optimization algorithm being used finds any intermediate solution along the way. (Some optimization algorithms --like, e.g., maxres-- don't find any intermediate solution).
Example:
import z3
o = z3.Optimize()
o.add(...very hard problem...)
cf = z3.Int('cf')
o.add(cf == ...)
obj = o.minimize(cf)
o.set(timeout=...)  # timeout is given in milliseconds
res = o.check()
print(res)
print(obj.upper())
Even when res = unknown because of a timeout, the objective instance contains the latest approximation of the optimum value found by z3 before the timeout.
Unfortunately, I am not sure whether it is also possible to retrieve the corresponding sub-optimal model with o.model() (or any other method).
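For what it's worth, one way to probe this is to guard the model() call after the timeout (a sketch; whether a model is actually available when check() returns unknown depends on the z3 version and the optimization engine in use):

try:
    m = o.model()   # may raise Z3Exception if z3 kept no model
    print(m)
except z3.Z3Exception:
    print("no model available after the timeout")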
For OptiMathSAT, I show how to retrieve the latest approximation of the optimum value and the corresponding model in the unit-test timeout.py.

using z3 for ALLSAT

I'm using Z3 as a black box to find all possible combinations of some real-world objects with C# code like this:
while (solver.Check() == Status.SATISFIABLE)
{
    SATModel = solver.Model;
    ....
    // invert the Model
    ....
    solver.Assert(InvertedModel);
}
For most of my problems the program is working fine, but now I have a bigger problem, where there would be 8.5E+64 possible combinations without constraints.
I'm starting with some 6000 constraints.
What I observe is that the check action takes less than 0.02 seconds at the beginning and builds up slowly. After 100,000 found solutions it already takes 1 second per turn, and after 130,000 turns I measure 2 seconds.
Is there an easy way to improve the performance?
It's not unreasonable that the solver is taking longer and longer with each constraint. But to make sure it's not some sort of a memory-leak on the C# part, you should check that the time taken in your while loop is really in the Check part and not in the invert/assert part. If you determine z3 is the responsible party, perhaps filing it at https://github.com/Z3Prover/z3/issues might solicit a better answer from the developers.
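To illustrate the timing suggestion, here is a sketch of the same enumeration loop in z3's Python API with the two phases timed separately (the constraints are placeholders, and the blocking step assumes plain constant declarations):

import time
import z3

s = z3.Solver()
# s.add(...)  # the ~6000 real-world constraints go here

while True:
    t0 = time.perf_counter()
    res = s.check()
    t_check = time.perf_counter() - t0
    if res != z3.sat:
        break

    t0 = time.perf_counter()
    m = s.model()
    # Invert the model: block the current assignment of every constant.
    block = [d() != m[d] for d in m.decls() if d.arity() == 0]
    if not block:
        break
    s.add(z3.Or(block))
    t_block = time.perf_counter() - t0

    print(f"check: {t_check:.3f}s   invert/assert: {t_block:.3f}s")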

Hyperopt Exploration/Exploitation strategy

What settings does Hyperopt provide to adjust the balance between exploration and exploitation? There's something like "bandit" and "bandit_algo" in the code, but no explanation.
Could someone provide a code sample?
Thanks a lot for any help!
I just found partial(), a wrapper for the optimizer algorithm in hyperopt. It allows you to mix different strategies and thus balance exploration and exploitation:
partial returns the result of a randomly-chosen suggest function. For example, to search by sometimes using random search, sometimes anneal, and sometimes tpe, type:
from functools import partial
from hyperopt import fmin, mix, rand, anneal, tpe

fmin(...,
     algo=partial(mix.suggest,
                  p_suggest=[
                      (.1, rand.suggest),
                      (.2, anneal.suggest),
                      (.7, tpe.suggest),
                  ]),
     )
Parameter "p_suggest": list of (probability, suggest) pairs. Make a suggestion from one of the suggest functions, in proportion to its corresponding probability. sum(probabilities) must be [close to] 1.0.
If you want even sharper control of the algorithm's progression: you can use the fact that hyperopt's optimizer algorithms are stateless and that the Trials object can be passed to a new fmin call to continue the process. You can then call fmin with max_evals set to run one evaluation at a time and drive the process in a loop, modifying "trials" and the suggest algorithm between iterations, as in the sketch below.
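A minimal sketch of that pattern (the objective and the search space are placeholders, and the switching rule is just one example):

from hyperopt import fmin, hp, rand, tpe, Trials

def objective(x):
    # Placeholder objective; substitute the real one.
    return (x - 2.0) ** 2

space = hp.uniform('x', -10, 10)

trials = Trials()
for i in range(50):
    # Explore with random search for the first 10 evaluations,
    # then exploit with TPE; any per-iteration rule could go here.
    algo = rand.suggest if i < 10 else tpe.suggest
    best = fmin(objective, space, algo=algo,
                max_evals=len(trials.trials) + 1,   # run exactly one new evaluation
                trials=trials)

print(best)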
For the best bet, read the papers by Bergstra et al. 1, 2 and 3. I am not 100% clear on what bandit_algo is, except that one of the papers mentions it as an alternative method to Gaussian Processes and the Tree of Parzen Estimators - maybe you can use it in the same way as those two?
My guess is that if it is not documented, it may not be finished yet. You can try raising an issue on GitHub - the devs are fairly responsive from what I have seen.
EDIT: Looking at this paper, these bandit algorithms may be the base class that the others inherit from.

Subkernel memory control in Mathematica

I have a somewhat similar question as:
Mathematica running out of memory
I am interested in something like this:
ParallelTable[F[i], {i, 0, 14.9, 0.001}]
where F[i] is a complicated numerical integral (I haven't yet found an easy way to reproduce the problem without page-filling definitions for an integral).
My problem is that the subkernels blow up in memory and I have to stop the evaluation if I don't want the machine to start swapping.
But even after I have stopped the evaluation, the kernels won't free their occupied memory. I have tried
ClearSystemCache[]
I have even tried
ParallelEvaluate[ClearSystemCache[]]
but
ParallelEvaluate[MemoryInUse[]]
stays at
{823185944, 833146832, 812429208, 840150336, 850057024, 834441704,
847068768, 850424224}
It seems that memory control only works for the main kernel?
So far the only way out is to shut down all the kernels and launch them again.
I really hope there are some solutions out there...
Thanks a lot.
Memory control works for the kernel where control expressions involving such functions as MemoryConstrained, MemoryInUse, Clear, Unset, Remove, $HistoryLength, ClearSystemCache etc. are evaluated. It seems that in your case the source of the memory leaks is not due to Mathematica's internal caching mechanism (thanks for the link, BTW!).
Have you tried evaluating $HistoryLength = 0; in all subkernels (e.g. via ParallelEvaluate[$HistoryLength = 0;]) before using them for computations? If you have not yet, I strongly recommend trying it.
Since you are working with numerical integration functions, I also suggest trying to optimize how you use them. For example, if you do numerical integration with NDSolve and only need a limited set of calculated points (or even a single point), you should use the form NDSolve[eqns,y,{x,x_needed_min,x_needed_max}] (or even NDSolve[eqns,y,{x,x_max,x_max}]) instead of NDSolve[eqns,y,{x,x_min,x_max}] or NDSolve[eqns,y,{x,0,x_max}]. This can dramatically reduce memory usage in some cases! You can also use EventLocator for memory control.
I was (am?) having the exact same problem, almost word for word. I just had some good luck adding the following option to the problem integral:
Method-> {"GlobalAdaptive", "SymbolicProcessing"->False}
You can probably choose any other method if you'd like, but I had success with this within the last few minutes. Also, a lot of nasty inconsistencies I used to be getting are gone, and integration proceeds MUCH faster.

What's the point of basis path coverage?

The article at onjava seems to imply that basis path coverage is a sufficient substitute for full path coverage, due to some linear-independence/cyclomatic-complexity magic.
Using an example similar to the article:
public int returnInput(int x, boolean one, boolean two)
{
    int y = x;
    if (one)
    {
        y = x - 1;
    }
    if (two)
    {
        x = y;
    }
    return x;
}
with the basis set {FF,TF,FT}, the bug is not exposed. Only the untested TT path would expose it.
So, how is basis path coverage useful? It doesn't seem much better than branch coverage.
[Disclaimer: I've never heard of this technique before, it just looks interesting so I've done a few searches and here's what I think I've found out. Hopefully someone who knows what they're talking about will contribute too...]
I think it's supposed to be a better way of generating branch coverage tests, not a complete substitute for path coverage. There's a far longer document here which restates the goals a bit: http://www.westfallteam.com/sites/default/files/papers/Basis_Path_Testing_Paper.pdf
The onjava article says "the goal of basis path testing is to test all decision outcomes independently of one another. Testing the four basis paths achieves this goal, making the other paths extraneous"
I think "extraneous" here means, "unnecessary to the goal of basis path testing", not as one might assume, "a complete waste of everyone's time".
I think the point of testing branches independently, is to break those accidental correlations between the paths which work, and the paths you test, that occur with terrifying frequency when I write both the code and an arbitrary set of branch coverage tests myself. There's no magic in the linear independence, it's just a systematic way of generating branch coverage, which discourages the tester from making the same assumptions as the programmer about correlation between branch choices.
So you're right, basis path testing misses your bug, and in general misses 2^(N-1)-N bugs, where N is the cyclomatic complexity. It just aims not to miss the 2^(N-1)-N paths most likely to be buggy, as letting the coder choose N paths to test typically does ;-)
Path coverage is no better than any other coverage metric - it is just a metric that shows how much of the 'code' has been tested. The fact that you can achieve 100% branch coverage with the (TF,FT) set of test cases as well as with (TT,FF) means it is down to luck whether you catch the bug, if your exit criteria tell you to stop once 100% coverage is reached.
Coverage should not be the tester's focus - finding bugs should be. A test case is just a way to demonstrate a bug, just as coverage is a proxy for how much of that bug-finding activity has been done. As with all other white-box methods, striving for maximum coverage at minimum cost requires actually understanding the code, so that you could even report a defect without a test case. The test case is just good for regression and as documentation of the defect.
For a tester, coverage is just a hint at how much has been done - only experience can really tell how much is enough. Since that is difficult to express in numerical values, we use other methods, i.e. coverage statistics.
Not sure if this makes sense to you - judging by the date you published your question, you are probably long gone...
My recollection from McCabe's work on this exact subject is: you generate the basis paths systematically, changing one condition at a time, and only changing the last condition, until you can't change any new conditions.
Suppose we start with FF, which is the shortest path. Following the algorithm, we change the last if in the chain, yielding FT. We've covered the second if now, meaning: if there was a bug in the second if, surely our two tests were paying attention to what happened when the second if statement suddenly started executing, otherwise our tests aren't working or our code isn't verifiable. Both possibilities suggest our code needs reworking.
Having covered FT, we go back up one node in the path and change the first F to T. When building basis paths, we only change one condition at a time. So we are forced to leave the second if the same, yielding... TT!
We are left with these basis paths: {FF, FT, TT}, which address the issue you raised.
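To make that concrete, here is a small sketch (a Python transcription of the Java method above) that checks the derived basis set against the expectation that the input is returned unchanged - only the TT case trips the assertion:

def return_input(x, one, two):
    # Transcription of the Java example above.
    y = x
    if one:
        y = x - 1
    if two:
        x = y
    return x

# The basis set derived above; the TT path exposes the bug,
# while the original {FF, TF, FT} set would pass silently.
for one, two in [(False, False), (False, True), (True, True)]:
    assert return_input(5, one, two) == 5, f"bug exposed on path ({one}, {two})"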
But wait, you say, what if the bug occurs in the TF case?? The answer is: we should have already noticed it between two of the other three tests. Think about it:
The second if already had its chance to demonstrate its effect on the code, independently of any other changes to the program's execution, through the FF and FT tests.
The first if had its chance to demonstrate its independent effect going from FT to TT.
We could have started with the TT case (the longest path). We would have arrived at slightly different basis paths, but they would still exercise each if statement independently.
Notice in your simple example, there is no co-linearity in the conditions of the if statements. Co-linearity cripples basis path generation.
In short: basis path testing, done systematically, avoids the problems you think it has. Basis path testing doesn't tell you how to write verifiable code. (TDD does that.) More to the point, path testing doesn't tell you which assertions you need to make. That's your job as the human.
Source: this is my research area, but I read McCabe's paper on this exact subject a few years back: http://mccabe.com/pdf/mccabe-nist235r.pdf
