Asymmetric behavior in ctx-solver-simplify - z3

I'm seeing rather surprising behavior from ctx-solver-simplify, where the order of parameters to z3.And() seems to matter, using the latest version of Z3 from the master branch of https://git01.codeplex.com/z3 (89c1785b):
#!/usr/bin/python
import z3
a, b = z3.Bools('a b')
p = z3.Not(a)
q = z3.Or(a, b)
for e in z3.And(p, q), z3.And(q, p):
    print e, '->', z3.Tactic('ctx-solver-simplify')(e)
produces:
And(Not(a), Or(a, b)) -> [[Not(a), Or(a, b)]]
And(Or(a, b), Not(a)) -> [[b, Not(a)]]
Is this a bug in Z3?

No, this is not a bug. The tactic ctx-solver-simplify is very expensive and inherently asymmetric: the order in which the sub-formulas are visited affects the final result. This tactic is implemented in the file src/smt/tactic/ctx_solver_simplify_tactic.cpp. The code is quite readable. Note that it uses an SMT solver (m_solver) and makes several calls to m_solver.check() while traversing the input formula. Each one of these calls can be quite expensive. For particular problem domains, we can implement a version of this tactic that is even more expensive and avoids the asymmetry described in your question.
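If a deterministic result matters more to you than the exact simplification, one workaround (a sketch, not a built-in feature; the helper name canonical_and is made up for illustration) is to canonicalize the argument order before applying the tactic:
import z3

def canonical_and(*conjuncts):
    # Sort the conjuncts by their printed form so the tactic always
    # visits them in the same order.
    return z3.And(*sorted(conjuncts, key=str))

a, b = z3.Bools('a b')
p, q = z3.Not(a), z3.Or(a, b)
t = z3.Tactic('ctx-solver-simplify')
# Both calls build the same AST, so the tactic returns the same result.
print t(canonical_and(p, q))
print t(canonical_and(q, p))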
EDIT:
You may also consider the tactic ctx-simplify; it is cheaper than ctx-solver-simplify and it is symmetric. The tactic ctx-simplify essentially applies rules such as:
A \/ F[A] ==> A \/ F[false]
x != c \/ F[x] ==> x != c \/ F[c]
Where F[A] is a formula that may contain A. It is cheaper than ctx-solver-simplify because it does not invoke an SMT solver while it is traversing the formula.
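For instance, with F[A] = Not(A) /\ b, the first rule rewrites a \/ (Not(a) /\ b) into a \/ (Not(false) /\ b), which then simplifies to a \/ b. Here is an example using this tactic: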
from z3 import *
a, b = Bools('a b')
p = Not(a)
q = Or(a, b)
c = Bool('c')
t = Then('simplify', 'propagate-values', 'ctx-simplify')
for e in Or(c, And(p, q)), Or(c, And(q, p)):
    print e, '->', t(e)
Regarding human-readability, this was never a goal when implementing any tactic. Please tell us if the tactic above is not good enough for your purposes.

I think it will be better to write a custom tactic because there are other tradeoffs when you essentially ask for canonicity. Z3 does not have any tactics that convert formulas into a canonical form, so if you want something that always produces the same answer for formulas that are ground equivalent, you will have to write a new normalizer that ensures this.

The ctx_solver_simplify_tactic furthermore makes some tradeoffs when simplifying formulas: it avoids simplifying the same sub-formula multiple times. If it did, it could increase the size of the resulting formula significantly (exponentially).

How to represent a symbolic summation in Z3?

I'm trying to represent the sum of integers from an integer a to an integer b, of the form
\sum_{i=a}^b i
Of course there is a closed-form solution for this version, but in general I'd like to sum over expressions parameterized by i. So far I've tried to define a function symSum and describe its behavior using universal quantifiers:
from z3 import *
s = Solver()
symSum = Function('symSum', IntSort(), IntSort(), IntSort())
a = Int('a')
b = Int('b')
s.add(ForAll([a,b],If(a > b,symSum(a,b) == 0,symSum(a,b) == a + symSum(a+1,b))))
x = Int('x')
s.add(x == symSum(1,5))
print(s.check())
print(s.model())
I have not gotten this code to terminate (I have only allowed it to run for a couple minutes at max though). Is this outside the capabilities of Z3?
EDIT:
Looking into this a bit more I was able to use recursive functions to define this!
from z3 import *
ctx = Context()
symSum = RecFunction('symSum', IntSort(ctx), IntSort(ctx), IntSort(ctx))
a = Int('a',ctx)
b = Int('b',ctx)
RecAddDefinition(symSum, [a,b], If(a > b, 0, a + symSum(a+1,b)))
x = Int('x',ctx)
s = Solver(ctx=ctx)
s.add(symSum(1,5) == x)
print(s.check())
print(s.model())
Yes; this is beyond the current capabilities of SMT solvers.
Imagine how you'd prove something like this by hand: you'd have to do induction on the natural numbers. SMT solvers do not perform induction, at least not out of the box. You can coax them to do so with a lot of helper lemmas and careful guiding via what's known as patterns, but it's not something they're designed for, or even good at. At least not for the time being.
See this answer for more details regarding a relevant question: Is it possible to prove this defined function is an involution in z3?
Having said that, recent versions of SMTLib do include capabilities for defining recursive functions. (See https://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf, Section 4.2.3.) z3 and other solvers have some support for the new syntax, though they can't really prove any interesting properties of these functions. I suspect some level of (user-guided) induction will eventually become part of the toolbox offered by these solvers, but that is not the case for the time being.
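As an illustration (a sketch only; the stated property and the expected outcome are my assumptions, not something z3 promises), one can assert the negation of the closed form for symSum over symbolic bounds and watch the solver struggle, since refuting it for all a and b requires induction:
from z3 import *

ctx = Context()
symSum = RecFunction('symSum', IntSort(ctx), IntSort(ctx), IntSort(ctx))
a = Int('a', ctx)
b = Int('b', ctx)
RecAddDefinition(symSum, [a, b], If(a > b, 0, a + symSum(a + 1, b)))

s = Solver(ctx=ctx)
# A counterexample to the closed form 2*sum = (a+b)*(b-a+1) would be sat;
# proving there is none requires induction, so expect unknown or a
# very long-running check rather than unsat.
s.add(a <= b, 2 * symSum(a, b) != (a + b) * (b - a + 1))
print(s.check())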

Clarification SMT types; bug in variable truncation?; allowing Bool*Int

I have the following questions:
A. If I have a variable x, which I declare (case 1) to be Bool, (case 2) to be Int with the assertion x=0 or x=1, or (case 3) to be Int with the assertion 0<=x<=1: what happens in Z3, or more generally in SMT, in each of these cases? Is (case 1) desirable, since things happen at the SAT and not the SMT level? Do you have some reference paper here?
B. I have the following statement in Python (for using Z3 Python API)
tmp.append(sum([self.a[i * self.nrVM + k] * componentsRequirements[i][1]
                for i in range(self.nrComp)]) <= self.MemProv[k])
where a is declared as a Z3 Int variable (0 or 1), componentsRequirements is a Python variable which is to be treated as a float, and self.MemProv is declared as a Z3 Real variable.
The strange thing is that for float values of componentsRequirements, e.g. 0.5, the constraint built by the solver uses 0 and not 0.5. For example, in:
(assert (<= (to_real (+ 0 (* 0 C1_VM2)
(* 0 C2_VM2)
(* 2 C3_VM2)
(* 2 C4_VM2)
(* 3 C5_VM2)
(* 1 C6_VM2)
(* 0 C7_VM2)
(* 2 C8_VM2)
(* 1 C9_VM2)
(* 2 C10_VM2)))
StorageProv2))
the 0 above should actually be 0.5. When changing a to a Real variable, floats are handled correctly, but this conflicts with the intended semantics of a. I tried to use commutativity between a and componentsRequirements, but with no success.
C. If I'd like to use x as type Bool and multiply it by an Int, I of course get an error in Z3 (Bool*Real/Int is not allowed). Is there a way to overcome this problem while keeping the types of both multiplicands? An example in this sense is the above (a - Bool, componentsRequirements - Real/Int):
a[i * self.nrVM + k] * componentsRequirements[i][1]
Thanks
Regarding A
Patrick gave a nice explanation for this one. In practice, you really have to try and see what happens. Different problems might exhibit different behavior due to when/how heuristics kick in. I don't think there's a general rule of thumb here. My personal preference would be to keep booleans as booleans and convert as necessary, but that's mostly from a programming/maintenance perspective, not an efficiency one.
Regarding B
Your "division" getting truncated problem is most likely the same as this: Retrieve a value in Z3Py yields unexpected result
You can either be explicit and write 1.0/2.0 etc.; or use:
from __future__ import division
at the top of your program.
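A minimal sketch of the Python 2 behavior in question:
from __future__ import division  # must be the first statement in the module

# Without the future import, Python 2 truncates integer division,
# which is how a coefficient like 1/2 silently becomes 0.
print(1 / 2)      # 0.5 with the future import, 0 without it
print(1.0 / 2.0)  # 0.5 either way: explicit floats are never truncated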
Regarding C
SMTLib is a strongly typed language; you cannot multiply by a bool. You have to convert it to a number first. Something like (* (ite b 1 0) x), where b is your boolean. Or you can write a custom function that does this for you and call that instead; something like:
(define-fun mul_bool_int ((b Bool) (i Int)) Int (ite b i 0))
(assert (= 0 (mul_bool_int false 5)))
(assert (= 5 (mul_bool_int true 5)))
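If you are working from the Python API rather than SMTLib, the same trick presumably looks like this (a sketch; If is z3py's counterpart of ite):
from z3 import *

b = Bool('b')
i = Int('i')
# Convert the Bool to an Int with an if-then-else before multiplying.
product = If(b, i, 0)

s = Solver()
s.add(b, i == 5, product == 5)
print(s.check())  # sat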
I honestly can't answer for z3 specifically, but I want to state my opinion on point A).
In general, the constraint x=0 v x=1 is abstracted into t1 v t2, where t1 is x=0 and t2 is x=1, at the SAT engine level.
Thus, the SAT engine might try to assign both t1 and t2 to true during the construction of a satisfiable truth assignment for the input formula. It is important to note that this is a contradiction in the theory of LAR, but the SAT engine is not capable of such reasoning. Therefore, the SAT engine might continue its search for a while after taking this decision.
When the LAR Solver is finally invoked, it will detect the LAR-unsatisfiability of the given (possibly partial) truth assignment. As a consequence, it (should) hand the clause not t1 or not t2 to the SAT engine as a learned clause, so as to block the bad assignment of values from being generated again.
If there are many such bad assignments, it might take multiple calls to the LAR Solver to generate all the blocking clauses needed to rectify the SAT truth assignment, possibly up to one for each of those bad combinations.
Upon receiving the conflict clause, the SAT engine will have to backjump to the place where it made the bad assignment and do the right decision. Obviously, not all the work done by the SAT engine since it made the bad assignment of values gets wasted: any useful clause that was learned in the meanwhile will certainly be re-used. However, this can still result in a noticeable performance hit on some formulas.
Compare all of this back-and-forth computation being wasted with simply declaring x as a Bool: now the SAT engine knows that it can only assign one of two values to it, and won't ever assign x and not x at the same time.
In this specific situation, one might mitigate this problem by providing the necessary blocking clauses right into the input formula. However, it is not trivial to conclude that this is always the best practice: explicitly listing all known blocking clauses might result in an explosion of the formula size in some situations, and into an even worse degradation of performance in practice. The best advice is to try different approaches and perform an experimental evaluation before settling for any particular encoding of a problem.
Disclaimer: it is possible that some SMT solvers are smarter than others, and automatically generate appropriate blocking clauses for 0/1 variables upon parsing the formula, or avoid this situation altogether through other means (e.g., afaik, model-based SMT solvers shouldn't incur the same problem).
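For concreteness, here is how the three encodings from question A can be written in z3py (a sketch; which one performs best is problem-dependent, as discussed above):
from z3 import *

x1 = Bool('x1')               # case 1: a genuine Bool, handled by the SAT engine
x2 = Int('x2')
case2 = Or(x2 == 0, x2 == 1)  # case 2: a disjunction of equalities
x3 = Int('x3')
case3 = And(0 <= x3, x3 <= 1) # case 3: bounds, handled by the theory solver

s = Solver()
s.add(case2, case3)
print(s.check())  # sat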

Erlang implementing an amb operator

On Wikipedia it says that using call/cc you can implement the amb operator for nondeterministic choice. My question is: how would you implement the amb operator in a language in which the only support for continuations is to write in continuation-passing style, like Erlang?
If you can encode the constraints for what constitutes a successful solution or choice as guards, list comprehensions can be used to generate solutions. For example, the list comprehension documentation shows an example of solving Pythagorean triples, which is a problem frequently solved using amb (see for example exercise 4.35 of SICP, 2nd edition). Here's the more efficient solution, pyth1/1, shown on the list comprehensions page:
pyth1(N) ->
    [ {A,B,C} ||
        A <- lists:seq(1,N-2),
        B <- lists:seq(A+1,N-1),
        C <- lists:seq(B+1,N),
        A+B+C =< N,
        A*A+B*B == C*C
    ].
One important aspect of amb is efficiently searching the solution space, which is done here by generating possible values for A, B, and C with lists:seq/2 and then constraining and testing those values with guards. Note that the page also shows a less efficient solution named pyth/1 where A, B, and C are all generated identically using lists:seq(1,N); that approach generates all permutations but is slower than pyth1/1 (for example, on my machine, pyth(50) is 5-6x slower than pyth1(50)).
If your constraints can't be expressed as guards, you can use pattern matching and try/catch to deal with failing solutions. For example, here's the same algorithm in pyth/1 rewritten as regular functions triples/1 and the recursive triples/5:
-module(pyth).
-export([triples/1]).

triples(N) ->
    triples(1,1,1,N,[]).

triples(N,N,N,N,Acc) ->
    lists:reverse(Acc);
triples(N,N,C,N,Acc) ->
    triples(1,1,C+1,N,Acc);
triples(N,B,C,N,Acc) ->
    triples(1,B+1,C,N,Acc);
triples(A,B,C,N,Acc) ->
    NewAcc = try
                 true = A+B+C =< N,
                 true = A*A+B*B == C*C,
                 [{A,B,C}|Acc]
             catch
                 error:{badmatch,false} ->
                     Acc
             end,
    triples(A+1,B,C,N,NewAcc).
We're using pattern matching for two purposes:
In the function heads, to control values of A, B and C with respect to N and to know when we're finished
In the body of the final clause of triples/5, to assert that conditions A+B+C =< N and A*A+B*B == C*C match true
If both conditions match true in the final clause of triples/5, we insert the solution into our accumulator list, but if either fails to match, we catch the badmatch error and keep the original accumulator value.
Calling triples/1 yields the same result as the list comprehension approaches used in pyth/1 and pyth1/1, but it's also half the speed of pyth/1. Even so, with this approach any constraint could be encoded as a normal function and tested for success within the try/catch expression.
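For comparison, the same generate-and-test idea, with the constraints applied as filters, reads like this in Python (just an illustration of the technique, not part of the Erlang answer):
def pyth1(n):
    # Generate candidates in increasing order and prune them with the
    # same constraints used as guards in the Erlang version above.
    return [(a, b, c)
            for a in range(1, n - 1)
            for b in range(a + 1, n)
            for c in range(b + 1, n + 1)
            if a + b + c <= n and a * a + b * b == c * c]

print(pyth1(20))  # [(3, 4, 5)]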

How to create a lazy-seq generating, anonymous recursive function in Clojure?

Edit: I discovered a partial answer to my own question in the process of writing this, but I think it can easily be improved upon so I will post it anyway. Maybe there's a better solution out there?
I am looking for an easy way to define recursive functions in a let form without resorting to letfn. This is probably an unreasonable request, but the reason I am looking for this technique is that I have a mix of data and recursive functions that depend on each other in a way that requires a lot of nested let and letfn statements.
I wanted to write the recursive functions that generate lazy sequences like this (using the Fibonacci sequence as an example):
(let [fibs (lazy-cat [0 1] (map + fibs (rest fibs)))]
  (take 10 fibs))
But it seems that in Clojure fibs cannot use its own symbol during binding. The obvious way around it is using letfn:
(letfn [(fibo [] (lazy-cat [0 1] (map + (fibo) (rest (fibo)))))]
  (take 10 (fibo)))
But as I said earlier this leads to a lot of cumbersome nesting and alternating let and letfn.
To do this without letfn and using just let, I started by writing something that uses what I think is the U-combinator (just heard of the concept today):
(let [fibs (fn [fi] (lazy-cat [0 1] (map + (fi fi) (rest (fi fi)))))]
  (take 10 (fibs fibs)))
But how do I get rid of the redundancy of (fi fi)?
It was at this point that I discovered the answer to my own question, after an hour of struggling and incrementally adding bits to the combinator Q.
(let [Q (fn [r] ((fn [f] (f f)) (fn [y] (r (fn [] (y y))))))
      fibs (Q (fn [fi] (lazy-cat [0 1] (map + (fi) (rest (fi))))))]
  (take 10 fibs))
What is this Q combinator called that I am using to define a recursive sequence? It looks like the Y combinator with no argument x. Is it the same?
(defn Y [r]
  ((fn [f] (f f))
   (fn [y] (r (fn [x] ((y y) x))))))
Is there another function in clojure.core or clojure.contrib that provides the functionality of Y or Q? I can't imagine what I just did was idiomatic...
letrec
I have written a letrec macro for Clojure recently, here's a Gist of it. It acts like Scheme's letrec (if you happen to know that), meaning that it's a cross between let and letfn: you can bind a set of names to mutually recursive values, without the need for those values to be functions (lazy sequences are ok too), as long as it is possible to evaluate the head of each item without referring to the others (that's Haskell -- or perhaps type-theoretic -- parlance; "head" here might stand e.g. for the lazy sequence object itself, with -- crucially! -- no forcing involved).
You can use it to write things like
(letrec [fibs (lazy-cat [0 1] (map + fibs (rest fibs)))]
  fibs)
which is normally only possible at top level. See the Gist for more examples.
As pointed out in the question text, the above could be replaced with
(letfn [(fibs [] (lazy-cat [0 1] (map + (fibs) (rest (fibs)))))]
  (fibs))
for the same result in exponential time; the letrec version has linear complexity (as does a top-level (def fibs (lazy-cat [0 1] (map + fibs (rest fibs)))) form).
iterate
Self-recursive seqs can often be constructed with iterate -- namely when a fixed range of look-behind suffices to compute any given element. See clojure.contrib.lazy-seqs for an example of how to compute fibs with iterate.
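For readers unfamiliar with iterate, here is the same fixed look-behind idea sketched in Python: the state is only the last two elements, so the sequence never needs to refer to itself.
from itertools import islice

def fibs():
    # iterate-style: repeatedly apply the step (a, b) -> (b, a + b)
    # to a fixed-size window of the last two elements.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

print(list(islice(fibs(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]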
clojure.contrib.seq
c.c.seq provides an interesting function called rec-seq, enabling things like
(take 10 (cseq/rec-seq fibs (map + fibs (rest fibs))))
It has the limitation of only allowing one to construct a single self-recursive sequence, but it might be possible to lift some implementation ideas from its source to enable more diverse scenarios. If a single self-recursive sequence not defined at top level is what you're after, this has to be the idiomatic solution.
combinators
As for combinators such as those displayed in the question text, it is important to note that they are hampered by the lack of TCO (tail call optimisation) on the JVM (and thus in Clojure, which elects to use the JVM's calling conventions directly for top performance).
top level
There's also the option of putting the mutually recursive "things" at top level, possibly in their own namespace. This doesn't work so great if those "things" need to be parameterised somehow, but namespaces can be created dynamically if need be (see clojure.contrib.with-ns for implementation ideas).
final comments
I'll readily admit that the letrec thing is far from idiomatic Clojure and I'd avoid using it in production code if anything else would do (and since there's always the top level option...). However, it is (IMO!) nice to play with and it appears to work well enough. I'm personally interested in finding out how much can be accomplished without letrec and to what degree a letrec macro makes things easier / cleaner... I haven't formed an opinion on that yet. So, here it is. Once again, for the single self-recursive seq case, iterate or contrib might be the best way to go.
fn takes an optional name argument with that name bound to the function in its body. Using this feature, you could write fibs as:
(def fibs ((fn generator [a b] (lazy-seq (cons a (generator b (+ a b))))) 0 1))

refactoring boolean equation

Let's say you have a Boolean rule/expression like so
(A OR B) AND (D OR E) AND F
You want to convert it into as many AND only expressions as possible, like so
A AND D AND F
A AND E AND F
B AND D AND F
B AND E AND F
You are just reducing the OR's so it becomes
(A AND D AND F) OR (A AND E AND F) OR (...)
Is there a property in Boolean algebra that would do this?
Take a look at De Morgan's theorem. The link points to a document relating to electronic gates, but the theory remains the same.
It says that any logical binary expression remains unchanged if we
Change all variables to their complements.
Change all AND operations to ORs.
Change all OR operations to ANDs.
Take the complement of the entire expression.
(quoting from the above linked document)
Your example is exploiting the distributivity of AND over OR, as shown here.
All you need to do is apply that law successively. For example, using x*(y+z) = (x*y)+(x*z) (where * denotes AND and + denotes OR):
0. Start with (A + B) * (D + E) * F.
1. Applying the law to the first two brackets results in ((A+B)*D)+((A+B)*E).
2. Applying it within each bracket results in (A*D+B*D)+(A*E+B*E).
3. So now you have ((A*D+B*D)+(A*E+B*E))*F.
4. Applying the law again results in (A*D+B*D)*F+(A*E+B*E)*F.
5. Applying it one more time results in A*D*F+B*D*F+A*E*F+B*E*F, QED.
You may be interested in reading about Karnaugh maps. They are a tool for simplifying boolean expressions, but you could use them to determine all of the individual expressions as well. I'm not sure how you might generalize this into an algorithm you could write a program for though.
You might be interested in Conjunctive Normal form or its brother, Disjunctive normal form.
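What you describe is exactly conversion to DNF. If you want it done mechanically rather than by hand, one option (assuming the Python library sympy is available) is its to_dnf helper:
from sympy import symbols
from sympy.logic.boolalg import to_dnf

A, B, D, E, F = symbols('A B D E F')
expr = (A | B) & (D | E) & F
# Distributes AND over OR, yielding the AND-only clauses joined by OR:
# (A & D & F) | (A & E & F) | (B & D & F) | (B & E & F)
print(to_dnf(expr))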
As far as I know, Boolean algebra cannot be built with only AND and OR operations; with only these two operations you cannot express NOT. You can, however, convert any expression to a functionally complete set of Boolean operations. Here are some complete sets:
AND and NOT
OR and NOT
Assuming you can use the NOT operation, you can rewrite any Boolean expression with only ANDs or only ORs. In your case:
(A OR B) AND (D OR E) AND F
I tend to use engineering shorthand for the above and write:
AND as a product (. or nothing);
OR as a sum (+); and
NOT as a single quote (').
So:
(A+B)(D+E)F
The parallel with arithmetic is actually quite useful for factoring terms.
By De Morgan's Law:
(A+B) => (A'B')'
So you can rewrite your expression as:
(A+B)(D+E)F
(A'B')'(D'E')'F
