I'm using Z3 to construct a bounded model-checker. I'm running into a strange performance problem when trying to implement a completeness test. The completeness test has to make sure that all states that every path contains each state at most once. If there are still paths which fulfill this property, Z3 is quick with an answer, however in the case where all paths have been considered, Z3 seems to be exponentially slow.
I've reduced my test-case to the following:
; Set the problem size (length of path)
(define-fun sz () Int 5)
; Used to define valid states
(define-fun limit ((s Int)) Bool
(and (>= s 0)
(<= s sz)))
; Constructs a path of a given length
(define-fun path-of-len ((path (Array Int Int)) (len Int)) Bool
(forall ((i Int))
(=> (and (>= i 0)
(< i len))
(limit (select path i)))))
; Asserts that a given path only contains unique states
(define-fun loop-free ((path (Array Int Int)) (len Int)) Bool
(forall ((i Int) (j Int))
(=> (and (>= i 0)
(>= j 0)
(< i len)
(< j len)
(not (= i j)))
(not (= (select path i) (select path j))))))
; Construct a unique path of a given length
(define-fun path ((path (Array Int Int)) (len Int)) Bool
(and (path-of-len path len)
(loop-free path len)))
; Declare a concrete path
(declare-const tpath (Array Int Int))
; Assert that the path is loop free
(assert (path tpath (+ sz 2)))
(check-sat)
On my computer this results in the following running times (depending on path length):
3: 0.057s
4: 0.561s
5: 42.602s
6: >15m (aborted)
If I switch from Int to bitvectors of size 64, the performance gets a little better, but still seems exponential:
3: 0.035s
4: 0.053s
5: 0.061s
6: 0.106s
7: 0.467s
8: 1.809s
9: 2m49.074s
Strangely enough, for a length of 10 it only takes 2m34.197s.
If I switch to bitvectors of smaller size, the performance gets a little better, but is still exponential.
So my question is: Is this to be expected? Is there a better way to formulate this "loop-free" constraint?
Your formula is essentially encoding the “pigeon-hole” problem.
You have sz+1 holes (the values 0, 1, …, sz) and sz+2 pigeons (the array cells (select tpath 0), …, (select tpath (+ sz 1))).
You first quantifier is stating that each pigeon should be put in one of the holes.
The second is stating that two different pigeons should not be in the same hole.
There is no polynomial size resolution proof for the “pigeon-hole” problem.
So, the exponential growth in run time is expected.
Note that any SAT solver based on resolution, DPLL, or CDCL will perform badly on the pigeon-hole problem..
You get better performance when using bit-vectors because Z3 reduces them to propositional logic and case analysis is much more efficient at that level.
BTW, you are using quantifiers for encoding parametric problems.
That is an elegant solution, but it is not the most efficient approach for Z3.
For Z3, in general, it is better to assert the 'expanded' quantifier free problem.
However, for the problem described in your question, it will not make a big difference, since you will still experience the exponential growth.
Just like what Leonardo said, since the pigeon-hole problem is exponential in nature, the performance will go bad eventually whatsoever. The only thing you could probably do is to postpone the time the performance goes bad. Since you have tried bit-vector, my suggestions are try converting the problem into quantifier-free problem as suggested by Leonardo, given that the problem size is pre-defined, and try using some tactics.
Related
I noticed that if I create my own array type that stores bitvectors and assert the first array update axiom, simple assertions afterwards fail to find a solution (my example below neither returns sat nor unsat but just keeps running):
(declare-sort MyArray)
; Indices into the array
(declare-sort Id)
; Returns the value in the array located at the specified index
(declare-fun index
(MyArray Id)
(_ BitVec 8))
; Updates the array so that the provided value is stored at the specified index
(declare-fun upd
(MyArray Id (_ BitVec 8))
MyArray)
; First array update axiom
(assert (forall ((a MyArray) (i Id) (v (_ BitVec 8)))
(=
(index (upd a i v) i)
v)))
(declare-const x Int)
(declare-const y Int)
(echo "")
(echo "Sanity check, should be sat:")
(assert (= x y))
(check-sat)
However, if I instead specify that my array stores a custom sort z3 finds a solution very quickly:
(declare-sort MyArray)
; Indices into the array
(declare-sort Id)
; Values stored in the array
(declare-sort Elem)
; Returns the value in the array located at the specified index
(declare-fun index
(MyArray Id)
Elem)
; Updates the array so that the provided value is stored at the specified index
(declare-fun upd
(MyArray Id Elem)
MyArray)
; First array update axiom
(assert (forall ((a MyArray) (i Id) (v Elem))
(=
(index (upd a i v) i)
v)))
(declare-const x Int)
(declare-const y Int)
(echo "")
(echo "Sanity check, should be sat:")
(assert (= x y))
(check-sat)
Does anyone know why this is the case? It's possible that z3 gets caught in some kind of instantiation loop (since the upd function both takes and returns MyArray sort), but I'm surprised that it only seems to get tripped up with bitvectors as the elements. Is this related to Nikolaj's answer that the quantifier elimination tactic is currently fairly simplistic when it comes to bit-vectors?
I'm using bitvectors because my problem ultimately involves some bitvector operations (especially bvxor). Is it better just to define my own operations and essentially recreate part of the theory of bitvectors? Or is there a better way to go about this (than mixing quantifiers, bitvectors, and part of the theory of arrays)? I'm really just interested in operations on bytes so all my bitvectors are of length 8.
I don't think there's a good reason why z3 is not terminating on the first program you gave here. Running with z3 -v:10 suggests it gets into some sort of an unproductive loop, that the second version avoids. I think you should report this at https://github.com/Z3Prover/z3/issues. Even though it's not strictly speaking a "bug," it's surprising behavior and the developers might want to look at it. (Please report back what you find.)
Regarding your second question: Do not reinvent what z3 already has support for! Use the internal arrays; they have a custom decision procedure and it has gone years of tuning. The introduction of quantified axioms will no doubt create more work than is necessary. Does something go wrong if you use internal arrays? Why would you not use them anyhow? Only look at your own axiomatization, if the internal built-in one isn't working well. (Even then, I'd first check with developers to see why.)
I am trying to model a small programming language in SMT-LIB 2.
My intent is to express some program analysis problems and solve them with Z3.
I think I am misunderstanding the forall statement though.
Here is a snippet of my code.
; barriers.smt2
(declare-datatype Barrier ((barrier (proc Int) (rank Int) (group Int) (complete-time Int))))
; barriers in the same group complete at the same time
(assert
(forall ((b1 Barrier) (b2 Barrier))
(=> (= (group b1) (group b2))
(= (complete-time b1) (complete-time b2)))))
(check-sat)
When I run z3 -smt2 barriers.smt2 I get unsat as the result.
I am thinking that an instance of my analysis problem would be a series of forall assertions like the above and a series of const declarations with assertions that describe the input program.
(declare-const b00 Barrier)
(assert (= (proc b00) 0))
(assert (= (rank b00) 0))
...
But apparently I am using the forall expression incorrectly because I expected z3 to decide that there was a satisfying model for that assertion. What am I missing?
When you declare a datatype like this:
(declare-datatype Barrier
((barrier (proc Int)
(rank Int)
(group Int)
(complete-time Int))))
you are generating a universe that is "freely" generated. That's just a fancy word for saying there is a value for Barrier for each possible element in the cartesian product Int x Int x Int x Int.
Later on, when you say:
(assert
(forall ((b1 Barrier) (b2 Barrier))
(=> (= (group b1) (group b2))
(= (complete-time b1) (complete-time b2)))))
you are making an assertion about all possible values of b1 and b2, and you are saying that if groups are the same then completion times must be the same. But remember that datatypes are freely generated so z3 tells you unsat, meaning that your assertion is clearly violated by picking up proper values of b1 and b2 from that cartesian product, which have plenty of inhabitant pairs that violate this assertion.
What you were trying to say, of course, was: "I just want you to pay attention to those elements that satisfy this property. I don't care about the others." But that's not what you said. To do so, simply turn your assertion to a function:
(define-fun groupCompletesTogether ((b1 Barrier) (b2 Barrier)) Bool
(=> (= (group b1) (group b2))
(= (complete-time b1) (complete-time b2))))
then, use it as the hypothesis of your implications. Here's a silly example:
(declare-const b00 Barrier)
(declare-const b01 Barrier)
(assert (=> (groupCompletesTogether b00 b01)
(> (rank b00) (rank b01))))
(check-sat)
(get-model)
This prints:
sat
(model
(define-fun b01 () Barrier
(barrier 3 0 2437 1797))
(define-fun b00 () Barrier
(barrier 2 1 1236 1796))
)
This isn't a particularly interesting model, but it is correct nonetheless. I hope this explains the issue and sets you on the right path to model. You can use that predicate in conjunction with other facts as well, and I suspect in a sat scenario, that's really what you want. So, you can say:
(assert (distinct b00 b01))
(assert (and (= (group b00) (group b01))
(groupCompletesTogether b00 b01)
(> (rank b00) (rank b01))))
and you'd get the following model:
sat
(model
(define-fun b01 () Barrier
(barrier 3 2436 0 1236))
(define-fun b00 () Barrier
(barrier 2 2437 0 1236))
)
which is now getting more interesting!
In general, while SMTLib does support quantifiers, you should try to stay away from them as much as possible as it renders the logic semi-decidable. And in general, you only want to write quantified axioms like you did for uninterpreted constants. (That is, introduce a new function/constant, let it go uninterpreted, but do assert a universally quantified axiom that it should satisfy.) This can let you model a bunch of interesting functions, though quantifiers can make the solver respond unknown, so they are best avoided if you can.
[Side note: As a rule of thumb, When you write a quantified axiom over a freely-generated datatype (like your Barrier), it'll either be trivially true or will never be satisfied because the universe literally will contain everything that can be constructed in that way. Think of it like a datatype in Haskell/ML etc.; where it's nothing but a container of all possible values.]
For what it is worth I was able to move forward by using sorts and uninterpreted functions instead of data types.
(declare-sort Barrier 0)
(declare-fun proc (Barrier) Int)
(declare-fun rank (Barrier) Int)
(declare-fun group (Barrier) Int)
(declare-fun complete-time (Barrier) Int)
Then the forall assertion is sat. I would still appreciate an explanation of why this change made a difference.
I was working with z3 with the following example.
f=Function('f',IntSort(),IntSort())
n=Int('n')
c=Int('c')
s=Solver()
s.add(c>=0)
s.add(f(0)==0)
s.add(ForAll([n],Implies(n>=0, f(n+1)==f(n)+10/(n-c))))
The last equation is inconsistent (since n=c would make it indeterminate). But, Z3 cannot detect this kind of inconsistencies. Is there any way in which Z3 can be made to detect it, or any other tool that can detect it?
As far as I can tell, your assertion that the last equation is inconsistent does not match the documentation of the SMT-LIB standard. The page Theories: Reals says:
Since in SMT-LIB logic all function symbols are interpreted
as total functions, terms of the form (/ t 0) are meaningful in
every instance of Reals. However, the declaration imposes no
constraints on their value. This means in particular that
for every instance theory T and
for every value v (as defined in the :values attribute) and
closed term t of sort Real,
there is a model of T that satisfies (= v (/ t 0)).
Similarly, the page Theories: Ints says:
See note in the Reals theory declaration about terms of the form
(/ t 0).
The same observation applies here to terms of the form (div t 0) and
(mod t 0).
Therefore, it stands to reason to believe that no SMT-LIB compliant tool would ever print unsat for the given formula.
Z3 does not check for division by zero because, as Patrick Trentin mentioned, the semantics of division by zero according to SMT-LIB are that it returns an unknown value.
You can manually ask Z3 to check for division by zero, to ensure that you never depend division by zero. (This is important, for example, if you are modeling a language where division by zero has a different semantics from SMT-LIB.)
For your example, this would look as follows:
(declare-fun f (Int) Int)
(declare-const c Int)
(assert (>= c 0))
(assert (= (f 0) 0))
; check for division by zero
(push)
(declare-const n Int)
(assert (>= n 0))
(assert (= (- n c) 0))
(check-sat) ; reports sat, meaning division by zero is possible
(get-model) ; an example model where division by zero would occur
(pop)
;; Supposing the check had passed (returned unsat) instead, we could
;; continue, safely knowing that division by zero could not happen in
;; the following.
(assert (forall ((n Int))
(=> (>= n 0)
(= (f (+ n 1))
(+ (f n) (/ 10 (- n c)))))))
Using smtlib I would like to make something like modulo using QF_UFNRA. This disables me from using mod, to_int, to_real an such things.
In the end I want to get the fractional part of z in the following code:
(set-logic QF_UFNRA)
(declare-fun z () Real)
(declare-fun z1 () Real)
(define-fun zval_1 ((x Real)) Real
x
)
(declare-fun zval (Real) Real)
(assert (= z 1.5));
(assert (=> (and (<= 0.0 z) (< z 1.0)) (= (zval z) (zval_1 z))))
(assert (=> (>= z 1.0) (= (zval z) (zval (- z 1.0)))))
(assert (= z1 (zval z)))
Of course, as I am asking this question here, implies, that it didn't work out.
Has anybody got an idea how to get the fractional part of z into z1 using logic QF_UFNRA?
This is a great question. Unfortunately, what you want to do is not possible in general if you restrict yourself to QF_UFNRA.
If you could encode such functionality, then you can decide arbitrary Diophantine equations. You would simply cast a given Diophantine equation over reals, compute the "fraction" of the real solution with this alleged method, and assert that the fraction is 0. Since reals are decidable, this would give you a decision procedure for Diophantine equations, accomplishing the impossible. (This is known as Hilbert's 10th problem.)
So, as innocent as the task looks, it is actually not doable. But that doesn't mean you cannot encode this with some extensions, and possibly have the solver successfully decide instances of it.
If you allow quantifiers and recursive functions
If you allow yourself quantifiers and recursive-functions, then you can write:
(set-logic UFNRA)
(define-fun-rec frac ((x Real)) Real (ite (< x 1) x (frac (- x 1))))
(declare-fun res () Real)
(assert (= (frac 1.5) res))
(check-sat)
(get-value (res))
To which z3 responds:
sat
((res (/ 1.0 2.0)))
Note that we used the UFNRA logic allowing quantification, which is required here implicitly due to the use of the define-fun-rec construct. (See the SMTLib manual for details.) This is essentially what you tried to encode in your question, but instead using the recursive-function-definition facilities instead of implicit encoding. There are several caveats in using recursive functions in SMTLib however: In particular, you can write functions that render your system inconsistent rather easily. See Section 4.2.3 of http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.5-draft.pdf for details.
If you can use QF_UFNIRA
If you move to QF_UFNIRA, i.e., allow mixing reals and integers, the encoding is easy:
(set-logic QF_UFNIRA)
(declare-fun z () Real)
(declare-fun zF () Real)
(declare-fun zI () Int)
(assert (= z (+ zF zI)))
(assert (<= 0 zF))
(assert (< zF 1))
(assert (= z 1.5))
(check-sat)
(get-value (zF zI))
z3 responds:
sat
((zF (/ 1.0 2.0))
(zI 1))
(You might have to be careful about the computation of zI when z < 0, but the idea is the same.)
Note that just because the encoding is easy doesn't mean z3 will always be able to answer the query successfully. Due to mixing of Real's and Integer's, the problem remains undecidable as discussed before. If you have other constraints on z, z3 might very well respond unknown to this encoding. In this particular case, it happens to be simple enough so z3 is able to find a model.
If you have sin and pi:
This is more of a thought experiment than a real alternative. If SMTLib allowed for sin and pi, then you can check whether sin (zI * pi) is 0, for a suitably constrained zI. Any satisfying model to this query would ensure that zI is integer. You can then use this value to extract the fractional part by subtracting zI from z.
But this is futile as SMTLib neither allows for sin nor pi. And for good reason: Decidability would be lost. Having said that, maybe some brave soul can design a logic that supported sin, pi, etc., and successfully answered your query correctly, while returning unknown when the problem becomes too hard for the solver. This is already the case for nonlinear arithmetic and the QF_UFNIRA fragment: The solver may give up in general, but the heuristics it employs might solve problems of practical interest.
Restriction to Rationals
As a theoretical aside, it turns out that if you restrict yourself to rationals only (instead of actual reals) then you can indeed write a first-order formula to recognize integers. The encoding is not for the faint of heart, however: http://math.mit.edu/~poonen/papers/ae.pdf. Furthermore, since the encoding involves quantifiers, it's probably quite unlikely that SMT solvers will do well with a formulation based on this idea.
[Incidentally, I should extend thanks to my work colleagues; this question made for a great lunch-time conversation!]
I have a question about Quantifiers.
Suppose that I have an array and I want to calculate array index 0, 1 and 2 for this array -
(declare-const cpuA (Array Int Int))
(assert (or (= (select cpuA 0) 0) (= (select cpuA 0) 1)))
(assert (or (= (select cpuA 1) 0) (= (select cpuA 1) 1)))
(assert (or (= (select cpuA 2) 0) (= (select cpuA 2) 1)))
Or otherwise I can specify the same using forall construct as -
(assert (forall ((x Int)) (=> (and (>= x 0) (<= x 2)) (or (= (select cpuA x) 0) (= (select cpuA x) 1)))))
Now I would like to understand the difference between two of them.
The first method executes quickly and gives a simple and readable model.
In contrast the code size with second option is very less, but the program takes time to execute. And also the solution is complex.
I would like to use the second method as my code will become smaller.
However, I want to find a readable simple model.
Quantifier reasoning is usually very expensive. In your example, the quantified formula is equivalent to the three assertions you provided.
However, that is not how Z3 decides/solves your formula. Z3 solves your formula using a technique called Model-Based Quantifier Instantiation (MBQI).
This technique can decide many fragments (see http://rise4fun.com/Z3/tutorial/guide). It is mainly effective on the fragments described in this guide.
It supports uninterpreted functions, arithmetic and bit-vector theories. It also has limited support for arrays and datatypes.
This is sufficient for solving your example. The model produced by Z3 seems more complicated because the same engine is used to decide more complicated fragments.
The model should be seem as a small functional program. You can find more information on how this approach works in the following articles:
Complete instantiation for quantified SMT formulas
Efficiently Solving Quantified Bit-Vector Formula
Note that, array theory is mainly useful for representing/modeling unbounded or big arrays. That is, the actual size of the array is not known or is too big. By big, I mean the number of array accesses (i.e., selects) in your formula is much smaller than the actual size of the array. We should ask ourselves : "do we really need arrays for modeling/solving problem X?". You may consider the following alternatives:
(Uninterpreted) functions instead of arrays. Your example can be encoded also as:
(declare-fun cpuA (Int) Int)
(assert (or (= (cpuA 0) 0) (= (cpuA 0) 1)))
(assert (or (= (cpuA 1) 0) (= (cpuA 1) 1)))
(assert (or (= (cpuA 2) 0) (= (cpuA 2) 1)))
Programmatic API. We've seen many examples where arrays (and functions) are used to provide a compact encoding. A compact and elegant encoding is not necessarily easier to solve. Actually, it is usually the other way around. You can achieve the best of both worlds (performance and compactness) using a programmatic API for Z3. In the following link, I encoded your example using one "variable" for each position of the "array". Macros/functions are used to encode constraints such as: an expression is a 0 or 1.
http://rise4fun.com/Z3Py/JF