I'm trying to solve a satisfiability problem over integers using z3py where one of my constraints requires that the value of each variable in an array be greater than the median of the last eleven variables. Right now, I'm encoding the constraint like:
PbLe( ((v[i] <= v[i-11],1),...(v[i] <= v[i-1],1)), 6)
And this seems to be causing the solver to use a tremendous amount of memory. I assume the PbLe is getting converted to disjunctive normal form on the backend or something similar and causing a large blowup in the size of the expression.
Is there a more efficient way to express a 'greater than a rolling median' for values in an array in z3?
I am attempting to model an allocation problem via (quasi)convex optimization: given a matrix of unknowns X containing the amount of a certain product at a certain moment and its value C, I want to maximize the resulting income cp.sum(cp.multiply(X, C)).
Among close/equivalent solutions, I want to have the fewest different product types in inventory, so I add the penalty - cp.sum(cp.maximum(0, cp.sign(X))), which counts the number of non-null entries in X.
According to cvxpy, both functions are quasiconcave (the first is affine, and the second quasilinear), but when I combine them linearly, cp.sum(cp.multiply(X, C)) - cp.sum(cp.maximum(0, cp.sign(X))), the resulting problem has UNKNOWN curvature, and cvxpy refuses to solve it since it is not 'DQCP'.
The problem is simplified: there are additional parameters and weights, and in the constraints X is bound to be non-negative and to satisfy space availability, but this simple version reproduces the unexpected behavior.
Is the bug in cvxpy or in my math?
The difference of two quasiconcave functions is not necessarily quasiconcave, so this operation is not permitted in CVXPY.
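A small numeric counterexample makes this concrete. The sketch below uses a scalar stand-in for the objective; the names income, penalty and objective and the value c = 1 are illustrative assumptions, not part of the question's model. It checks the defining inequality of quasiconcavity, f(midpoint) >= min(f(a), f(b)), on one pair of points:

```python
def income(x, c=1.0):
    # Affine term: value times quantity (c = 1.0 is an assumed value).
    return c * x

def penalty(x):
    # 1 when the quantity is positive, 0 otherwise; a scalar stand-in
    # for maximum(0, sign(x)) applied to a single entry.
    return 1.0 if x > 0 else 0.0

def objective(x):
    return income(x) - penalty(x)

# Quasiconcavity requires f(mid) >= min(f(a), f(b)) for every a, b.
a, b = 0.0, 1.0
mid = 0.5 * (a + b)
violates = objective(mid) < min(objective(a), objective(b))

print(objective(a), objective(mid), objective(b))  # 0.0 -0.5 0.0
print(violates)                                    # True
```

Each term on its own passes the test, but the combination dips below both endpoints at the midpoint, so it cannot be quasiconcave.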
I am using neo4j to calculate some statistics on a data set. For that I am often using sum on a floating point value. I am getting different results depending on the circumstances. For example, a query that does this:
...
WITH foo
ORDER BY foo.fooId
RETURN SUM(foo.Weight)
returns a different result than the query that simply does the sum:
...
RETURN SUM(foo.Weight)
The differences are minuscule (293.07724195098984 vs 293.07724195099007), but they are enough to make simple equality checks fail. Another example: a different instance of the database, loaded with the same data using the same loading process, can produce the same issue (the dbs might not be 1:1, as the load order of some relations might be different). I took the raw values that neo4j sums (by simply removing the SUM()) and verified that they are the same in all cases (different dbs and ordered/not ordered).
What are my options here? I don't mind losing some precision (I already tried to cut down the precision from 15 to 12 decimal places but that did not seem to work), but I need the results to match up.
Because of rounding errors, floating-point addition is not associative: in general, (a+b)+c != a+(b+c).
The result of every operation is rounded to fit the floating-point format, so (a+b)+c is computed as round(round(a+b)+c), while a+(b+c) is computed as round(a+round(b+c)).
As an obvious illustration, consider the expression 2^-100 + 1 - 1. Evaluated as (2^-100 + 1) - 1, it returns 0, because 1 + 2^-100 needs more precision than an IEEE 754 float or double offers and is rounded to 1.0. Evaluated as 2^-100 + (1 - 1), it correctly returns 2^-100, which both floats and doubles can represent.
This is a trivial example, but these rounding errors may exist after every operation and explain why floating point operations are not associative.
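The 2^-100 illustration can be reproduced directly. This is a minimal sketch in Python, whose float type is an IEEE 754 double; the variable names are only for illustration:

```python
tiny = 2.0 ** -100

left_first = (tiny + 1.0) - 1.0    # tiny is absorbed when added to 1.0
right_first = tiny + (1.0 - 1.0)   # 1.0 - 1.0 is exactly 0.0, so tiny survives

print(left_first)             # 0.0
print(right_first == tiny)    # True
print(left_first == right_first)  # False
```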
Databases generally do not return data in a guaranteed order. Depending on the actual order, the additions are performed in a different sequence, and that explains the behaviour you are seeing.
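The effect of row order can be shown with three numbers. The sketch below is Python run on the raw values outside the database, not a neo4j feature. It uses an explicit left-to-right loop (naive_sum is a hypothetical helper) because recent Python versions make the built-in sum more accurate for floats. math.fsum is shown as one possible way to get a total that does not depend on order, assuming the raw values can be summed on the client side:

```python
import math

def naive_sum(xs):
    # Plain left-to-right accumulation, rounding after every addition.
    total = 0.0
    for x in xs:
        total += x
    return total

values = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]   # same numbers, different order

print(naive_sum(values))      # 0.0: the 1.0 is lost when added to 1e16
print(naive_sum(reordered))   # 1.0

# math.fsum returns the correctly rounded exact sum, so order is irrelevant.
print(math.fsum(values), math.fsum(reordered))   # 1.0 1.0
```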
In general, for this reason, it is not a good idea to do equality comparisons on floats. The usual advice is to replace a==b with a test that abs(a-b) is "sufficiently" small.
What counts as "sufficiently" small may depend on your algorithm. Floats carry about 6-7 significant decimal digits and doubles about 15-16 (and I think doubles are what your DB uses). Depending on the number of computations, the last 1 to 3 digits may be affected.
The best is probably to use
abs(a-b)<relative-error*max(abs(a),abs(b))
where relative-error must be adjusted to your problem. Something around 10^-13 is probably a reasonable start, but you must experiment, as rounding errors depend on the number of computations, on the dispersion of the values and on what you consider "equal" for your problem.
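As a minimal sketch of that test in Python: nearly_equal is a hypothetical helper, and its default of 1e-13 is just the starting value suggested above. It uses <= rather than < so that two exact zeros still compare equal. The two sums quoted in the question serve as input, and the standard library's math.isclose is shown for comparison:

```python
import math

def nearly_equal(a, b, rel_err=1e-13):
    # Relative test: the tolerance scales with the larger magnitude.
    return abs(a - b) <= rel_err * max(abs(a), abs(b))

x, y = 293.07724195098984, 293.07724195099007

print(x == y)                              # False
print(nearly_equal(x, y))                  # True
print(math.isclose(x, y, rel_tol=1e-13))   # True
```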
Look at this site for a discussion on comparison methods. And read What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg that discusses, among others, these problems.
This question is related to my previous question
Is it possible to get a legit range info when using a SMT constraint with Z3
So it seems that "efficiently" finding the maximum range info is not feasible, given typical 32-bit vectors and so on. On the other hand, I am wondering whether it is feasible to find a "sub-maximum" range, which would hopefully be cheaper to compute. We may also want a "safe" guarantee: every element in the sub-maximum range must satisfy the constraint, although other solutions outside the range may satisfy it as well.
I am currently exploring whether a model counting technique could make sense in this setting. Any thoughts would be appreciated very much. Thanks.
General case
This is not just a question of efficiency. Consider a problem where you have two variables a and b, and a single constraint:
a != b
What's the range of b? (maximum or otherwise?)
You can say all values are legitimate. But that would be wrong, as obviously the choice of a impacts the choice of b. The more variables you have around, the more complicated the problem will become. I don't think the problem is even well defined in this case, so searching for a solution (efficient or otherwise) doesn't make much sense.
Single variable assumption
Having said that, I think you can come up with a solution if you assume there's precisely one variable in the system. (Or, alternatively, if you fix all the other variables to some predefined constants.) If you're willing to go down this path, then you can implement a binary search algorithm to find a reasonably sized range by simply proving the quantified formula
Exists([b], And(b >= minBound, b <= maxBound, Not(constraints)))
Once you get unsat for this, you have your range. So long as you get sat, you can adjust your minBound/maxBound to search within smaller ranges. In the worst case, this can turn into a linear walk, but you can "cut-down" this search by making sure you go down a significant size in each step. That could be a parameter to the whole search, depending on how large you want your intervals to be. It'll have to be a choice between trying to find a maximal range, and how long you want to spend in this search. Of course, if you cut-down too much, you can miss a big interval, but that's the cost of efficiency.
Example1 (Good case) There's a single constraint that says b != 5. Then your search will be quick and depending on which branch you'll go, you'll either find [0, 4] or [6, 255] assuming 8-bit words.
Example2 (Bad case) There's a single constraint that says b is even. Then your search will exhibit worst-case behavior, and if your "cut-down" size is 1, you'll possibly iterate 255 times before you settle down on [0, 0]; assuming z3 gives you the maximum odd number in each call.
I hope that illustrates the point. In general, though, I'd assume you'd be closer to the "good case" for practical applications and even if your cut-down size is minimal you can most likely converge in a few iterations. Of course, this entirely depends on your problem domain, but I'd expect it to hold for software analysis in general.
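The search loop can be sketched without a solver. In the Python below, find_violation is a hypothetical oracle standing in for the z3 query on And(b >= minBound, b <= maxBound, Not(constraints)): it returns a violating value, or None for the "unsat" case. Splitting the range at the counterexample and keeping the larger side is one possible reading of "adjust your minBound/maxBound", not the only one. The brute-force oracle exists only to make the sketch runnable:

```python
def find_safe_range(lo, hi, find_violation):
    """Shrink [lo, hi] until no value inside violates the constraint."""
    while lo <= hi:
        bad = find_violation(lo, hi)
        if bad is None:
            return (lo, hi)            # "unsat": every value in range is safe
        # Exclude the violating value and keep the larger remaining side.
        if bad - lo >= hi - bad:
            hi = bad - 1
        else:
            lo = bad + 1
    return None                        # no safe value exists


def brute_force_oracle(constraint):
    # Illustration only: scans downward, like a solver that happens to
    # return the largest counterexample on each call.
    def find_violation(lo, hi):
        for b in range(hi, lo - 1, -1):
            if not constraint(b):
                return b
        return None
    return find_violation


# Example 1 (good case): b != 5 over 8-bit values.
print(find_safe_range(0, 255, brute_force_oracle(lambda b: b != 5)))      # (6, 255)
# Example 2 (bad case): b is even; the range shrinks one step at a time.
print(find_safe_range(0, 255, brute_force_oracle(lambda b: b % 2 == 0)))  # (0, 0)
```

With a real solver, each call to find_violation would be one satisfiability check, so the bad case costs one solver call per excluded value.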
I start with a simple Maxima question, the answer to which may provide the answer to the actual problem I'm grappling with.
Related Simple Question:
How can I get maxima to calculate:
bfloat((1+%i)^0.3);
Might there be an option variable that can be set so that this evaluates to a complex number?
Actual Question:
I'm evaluating approximations for numerical time integration in finite element methods. For this purpose I'm using spectral analysis, which requires calculating the eigenvalues of a 4 x 4 matrix. This matrix "cav" is also calculated within Maxima, using some of its algebra capabilities but substituting numerical values, so the matrix is entirely numerical, i.e. it contains no variables. I've calculated the eigenvalues with Mathematica, and it returns 4 real eigenvalues. However, Maxima produces horrendously complicated expressions for this case, which it apparently does not "know" how to simplify, even numerically as "bigfloat". Perhaps this problem arises because Maxima first approximates the matrix "cav" by rational numbers (i.e. fractions) and then tries to solve the problem exactly, instead of simply using numerical "bigfloat" computations throughout. Is there a way I can change this?
Note that if you only change the input value of gzv to say 0.5 it works fine, and returns numerical values of complex eigenvalues.
I include the code below. Note that all of the code up until "cav:subst(vs,ca)$" is just for the definition of the matrix cav and seems to work fine. It is in the few statements thereafter that it fails to calculate numerical values for the eigenvalues.
v1:v0+ (1-gg)*a0+gg*a1$
d1:d0+v0+(1/2-gb)*a0+gb*a1$
obf:a1+(1+ga)*(w^2*d1 + 2*gz*w*(d1-d0)) -
ga *(w^2*d0 + 2*gz*w*(d0-g0))$
obf:expand(obf)$
cd:subst([a1=1,d0=0,v0=0,a0=0,g0=0],obf)$
fd:subst([a1=0,d0=1,v0=0,a0=0,g0=0],obf)$
fv:subst([a1=0,d0=0,v0=1,a0=0,g0=0],obf)$
fa:subst([a1=0,d0=0,v0=0,a0=1,g0=0],obf)$
fg:subst([a1=0,d0=0,v0=0,a0=0,g0=1],obf)$
f:[fd,fv,fa,fg]$
cad1:expand(cd*[1,1,1/2-gb,0] - gb*f)$
cad2:expand(cd*[0,1,1-gg,0] - gg*f)$
cad3:expand(-f)$
cad4:[cd,0,0,0]$
cad:matrix(cad1,cad2,cad3,cad4)$
gav:-0.05$
ggv:1/2-gav$
gbv:(ggv+1/2)^2/4$
gzv:1.1$
dt:0.01$
wv:bfloat(dt*2*%pi)$
vs:[ga=gav,gg=ggv,gb=gbv,gz=gzv,w=wv]$
cav:subst(vs,ca)$
cav:bfloat(cav)$
evam:eigenvalues(cav)$
evam:bfloat(evam)$
eva:evam[1]$
The main problem here is that Maxima tries pretty hard to make computations exact, and it's hard to tell it to ease up and allow inexact results.
Is there a mistake in the code you posted above? You have cav:subst(vs,ca) but ca is not defined. Is that supposed to be cav:subst(vs,cad) ?
For the short problem, usually rectform can simplify complex expressions to something more usable:
(%i58) rectform (bfloat((1+%i)^0.3));
`rat' replaced 1.0B0 by 1/1 = 1.0B0
(%o58) 2.59023849130283b-1 %i + 1.078911979230303b0
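As an independent cross-check of that value (in Python rather than Maxima, and in ordinary double precision, not bigfloat), the principal value of (1+%i)^0.3 can be computed from the polar form: raise the modulus to the power 0.3 and multiply the angle by 0.3. The variable names are illustrative:

```python
import cmath

z = 1 + 1j
r, theta = cmath.polar(z)      # r = sqrt(2), theta = pi/4

# Principal branch of z**0.3: scale the modulus, scale the angle.
w = cmath.rect(r ** 0.3, 0.3 * theta)

print(w)           # approximately 1.0789 + 0.2590j
print(z ** 0.3)    # the built-in power agrees up to rounding
```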
About the long problem, if fixed-precision (i.e. ordinary floats, not bigfloats) is acceptable to you, then you can use the LAPACK function dgeev to compute eigenvalues and/or eigenvectors.
(%i51) load (lapack);
<bunch of messages here>
(%o51) /usr/share/maxima/5.39.0/share/lapack/lapack.mac
(%i52) dgeev (cav);
(%o52) [[- 0.02759949957202372, 0.06804641655485913, 0.997993508502892, 0.928429191717788], false, false]
If you really need variable precision, I don't know what to try. In principle it's possible to rework the LAPACK code to work with variable-precision floats, but that's a substantial task and I'm not sure about the details.
Good morning all,
I'm having some issues with floating point math, and have gotten totally lost in ".to_f"'s, "*100"'s and ".0"'s!
I was hoping someone could help me with my specific problem, and also explain exactly why their solution works so that I understand this for next time.
My program needs to do two things:
Sum a list of decimals, determine if they sum to exactly 1.0
Determine a difference between 1.0 and a sum of numbers - set the value of a variable to the exact difference to make the sum equal 1.0.
For example:
[0.28, 0.55, 0.17] -> should sum to 1.0, however I keep getting 1.xxxxxx. I am implementing the sum in the following fashion:
sum = array.inject(0.0){|sum,x| sum+ (x*100)} / 100
The reason I need this functionality is that I'm reading in a set of decimals that come from Excel. They are not 100% precise (they are lacking some decimal places), so the sum usually comes out as 0.999999xxxxx or 1.000xxxxx. For example, I will get values like the following:
0.568887955,0.070564759,0.360547286
To fix this, I am ok taking the sum of the first n-1 numbers, and then changing the final number slightly so that all of the numbers together sum to 1.0 (must meet validation using the equation above, or whatever I end up with). I'm currently implementing this as follows:
sum = 0.0
array.each do |item|
sum += item * 100.0
end
array[i] = (100 - sum.round)/100.0
I know I could do this with inject, but was trying to play with it to see what works. I think this is generally working (from inspecting the output), but it doesn't always meet the validation sum above. So if need be I can adjust this one as well. Note that I only need two decimal precision in these numbers - i.e. 0.56 not 0.5623225. I can either round them down at time of presentation, or during this calculation... It doesn't matter to me.
Thank you VERY MUCH for your help!
If accuracy is important to you, you should not be using floating point values, which, by definition, are not accurate. Ruby has some precision data types for doing arithmetic where accuracy is important. They are, off the top of my head, BigDecimal, Rational and Complex, depending on what you actually need to calculate.
It seems that in your case, what you're looking for is BigDecimal, which stores a decimal number exactly, digit by digit in base 10, with as many digits as the value needs (in contrast to a binary floating point, which can only approximate most decimal fractions, such as 0.1).
When you read from Excel and deliberately cast those strings like "0.9987" to floating points, you're immediately losing the accurate value that is contained in the string.
require "bigdecimal"
BigDecimal("0.9987")
That value is precise. It is 0.9987. Not 0.998732109, or anything close to it, but 0.9987. You may use all the usual arithmetic operations on it. Provided you don't mix floating points into the arithmetic operations, the return values will remain precise.
If your array contains the raw strings you got from Excel (i.e. you haven't #to_f'd them), then this will give you a BigDecimal that is the difference between the sum of them and 1.
1 - array.map{|v| BigDecimal(v)}.reduce(:+)
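For readers following along outside Ruby, the same idea can be sketched with Python's decimal module, which plays the role BigDecimal plays above. The sample strings are the values quoted in the question, and last_value_to_reach_one is a hypothetical helper name:

```python
from decimal import Decimal

raw = ["0.568887955", "0.070564759", "0.360547286"]   # strings, never floats
values = [Decimal(s) for s in raw]

total = sum(values, Decimal(0))
print(total == 1)    # True: these decimal additions are exact

def last_value_to_reach_one(first_values):
    # The final entry is whatever makes the whole list sum to exactly 1.
    return Decimal(1) - sum(first_values, Decimal(0))

fixed = values[:-1] + [last_value_to_reach_one(values[:-1])]
print(sum(fixed, Decimal(0)) == 1)    # True
```

The key design choice is the same as in the Ruby version: build the exact decimal from the original string, and keep floats out of the arithmetic entirely.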
Either:
continue using floats and round(2) your totals: 12.341.round(2) # => 12.34
use integers (i.e. cents instead of dollars)
use BigDecimal and you won't need to round after summing them, as long as you start with BigDecimal with only two decimals.
I think that algorithms have a great deal more to do with accuracy and precision than a choice of IEEE floating point over another representation.
People used to do some fine calculations while still dealing with accuracy and precision issues. They'd do it by managing the algorithms they'd use and understanding how to represent functions more deeply. I think that you might be making a mistake by throwing aside that better understanding and assuming that another representation is the solution.
For example, no polynomial representation of a function will deal with an asymptote or singularity properly.
Don't discard floating point so quickly. It could be that being smarter about the way you use it will do just fine.