Fully evaluated results in Z3? - z3

Z3 often gives back models defined in terms of a bunch of intermediate functions. For example, it's common to see the following (pardon my improper syntax):
(define-const myArray (Array Bool Int) (_ as-array f))
(define-fun f ((x Bool)) Int (f!10 (k!26 x)))
... And so on.
I'd like to be able to get a result back that I can take in my program (calling Z3 using library bindings) and both print the results, and parse them into a function I can actually run. This is much easier if I can get my model functions as single, straight line programs that I can run, instead of as multiple functions defined in terms of each other.
Is this possible? I'm dealing only with finite domain functions, if that helps.

We will be updating the model construction in a future release to compress away intermediate functions when possible. There are, however, cases where this can cause an exponential overhead because the same auxiliary function can be reused in several contexts. For those models, it doesn't make sense to expand the auxiliary functions. So users are still going to be forced to deal with such functions if they want to post-process the models.
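Until that happens, finite-domain functions can be flattened on the client side by tabulating them. The following is a minimal sketch in Python, assuming the model has already been read out of the bindings into plain dictionaries. The names f!10 and k!26 come from the question; the table entries, the MODEL layout, and the helper names are hypothetical.

```python
from itertools import product

# Hypothetical model: each auxiliary function is a finite table plus a default
# ("else") value. The entries are made up for illustration.
MODEL = {
    "k!26": {"table": {(True,): 1, (False,): 0}, "else": 0},
    "f!10": {"table": {(1,): 42}, "else": 7},
}

def apply_table(name, args):
    """Look up args in the named table, falling back to its default value."""
    entry = MODEL[name]
    return entry["table"].get(tuple(args), entry["else"])

def f(x):
    """f(x) = f!10(k!26(x)), mirroring the definition in the question."""
    return apply_table("f!10", [apply_table("k!26", [x])])

def flatten(func, domains):
    """Tabulate func over the product of its finite argument domains."""
    return {args: func(*args) for args in product(*domains)}

flat_f = flatten(f, [[False, True]])
print(flat_f)
```

The resulting table no longer mentions any auxiliary function. Its size is the product of the domain sizes, so this only pays off for small domains, which is the same blow-up the answer warns about.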

Related

Curried functions and function application in Z3

I'm working on a language very similar to STLC that I'm converting to Z3 propositions, hence a few (sub)questions about how Z3 treats (uninterpreted) functions:
Functions are naturally curried in my language and as I'm converting the terms of my language recursively, I'd like to be able to build the corresponding Z3 AST recursively as well. That is, when I have a term f x y I'd like to first apply f to x and then apply that to y. Is there a way to do this? The API I've found so far (Z3_mk_func_decl/Z3_mk_app) seems to require me to collect all arguments first and apply them all at once.
Is there a reasonable way to represent something like (if b then f else g) x?
In both cases, I'm totally fine with functions being uninterpreted and restricting the reasoning to things like "b = True /\ f x = 0 => (if b then f else g) x = 0 holds".
SMTLib (as described in http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf) is a many-sorted first-order logic. All functions (uninterpreted or not) must be applied to all of their arguments, and you cannot have any form of currying. Also, you cannot do higher-order if-then-else, i.e., the branches of an if-then-else have to be first-order values. (However, they can be arrays, and you can imagine "faking" functions with arrays. But that's beside the point.)
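Since every application has to be saturated, a recursive translation can still walk curried terms and collect the arguments along the application spine before emitting anything. Here is a minimal sketch in Python; the Var and App classes and the helper names are hypothetical, and arity checking against the declared function is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: object  # the term being applied
    arg: object  # its single (curried) argument

def uncurry(term):
    """Split a curried application into (head, [arg1, arg2, ...])."""
    args = []
    while isinstance(term, App):
        args.append(term.arg)
        term = term.fun
    args.reverse()
    return term, args

def to_smt(term):
    """Render a curried term as a saturated SMT-LIB application."""
    head, args = uncurry(term)
    if not isinstance(head, Var):
        raise ValueError("head of an application must be a function symbol")
    if not args:
        return head.name
    return "(" + " ".join([head.name] + [to_smt(a) for a in args]) + ")"

term = App(App(Var("f"), Var("x")), Var("y"))  # f x y
print(to_smt(term))
```

A partial application such as f x alone would be emitted as (f x) here, so a real translation would need to compare the number of collected arguments with the declared arity.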
It should be noted that the next iteration of SMTLib (v3) will be based on a higher-order logic, at which point features like the ones you're asking about might become available. See: http://smtlib.cs.uiowa.edu/version3.shtml. Of course, this is still a proposal and it'll take a while before it's settled and actual solvers start implementing it faithfully. Eventually that'll happen, but I wouldn't expect it in the very near term.
Aside: Since you mentioned STLC (simply-typed-lambda-calculus), I presume you might be familiar with functional languages like Haskell. If that's the case, you might want to look into using SBV: https://hackage.haskell.org/package/sbv. It provides a framework for doing some of these things by carefully translating them away behind the scenes. Here's an example:
Prelude Data.SBV> sat $ \b -> (ite b (uninterpret "f") (uninterpret "g")) (0::SInteger) .== (0::SInteger)
Satisfiable. Model:
s0 = True :: Bool
f :: Integer -> Integer
f _ = 0
g :: Integer -> Integer
g _ = 2
Here we created two functions and used the ite construct to "merge" them; and got the solver to return us a model. Behind the scenes, SBV will fully saturate these applications and let you "pretend" you're programming in a higher-order sense, just like in STLC or Haskell. Of course, the devil is in the details and there are limitations to the approach, but modeling STLC in Haskell is a classic pastime for many people and doing it symbolically using SBV can be a fun exercise.
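One way to picture this saturation is to push the application into the branches of the conditional, so that (if b then f else g) x becomes a first-order if-then-else over f x and g x. The sketch below is in Python and only illustrates the idea; it is not SBV's implementation, and the term encoding and the apply_ite helper are hypothetical.

```python
def apply_ite(fun, args):
    """Apply a possibly conditional function to args, giving a first-order term.

    fun is either a function name (str) or a tuple
    ("ite", cond, then_fun, else_fun); args is a list of rendered terms.
    """
    if isinstance(fun, str):
        return "(" + " ".join([fun] + args) + ")"
    tag, cond, then_fun, else_fun = fun
    if tag != "ite":
        raise ValueError("unknown function form: " + repr(tag))
    return "(ite {} {} {})".format(
        cond, apply_ite(then_fun, args), apply_ite(else_fun, args))

print(apply_ite(("ite", "b", "f", "g"), ["x"]))
```

Nested conditionals are handled by the recursion, at the cost of duplicating the argument terms in every branch.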

Using a Closure instead of a Global Variable

This question is a continuation of the comments at Using Local Special Variables, regarding how best to avoid global variables. As I understand it, global variables are problematic mainly because they have the potential to interfere with referential transparency. Transparency is violated if an expression changes a global value using information outside its calling context (eg, a previous value of the global variable itself, or any other external values). In these cases evaluating the expression may have different results at different times, either in the value(s) returned or in side effects. (However, it seems not all global updates are problematic, since some updates may not depend on any external information--eg, resetting a global counter to 0). The normal global approach for a deeply embedded counter might look like:
* (defparameter *x* 0)
*X*
* (defun foo ()
    (incf *x*))
FOO
* (defun bar ()
    (foo))
BAR
* (bar)
1
* *x*
1
This would seem to violate referential transparency because (incf *x*) depends on the external (global) value of *x* to do its work. The following is an attempt to maintain both functionality and referential transparency by eliminating the global variable, but I'm not convinced that it really does:
* (let ((x 0))
    (defun inc-x () (incf x))
    (defun reset-x () (setf x 0))
    (defun get-x () x))
GET-X
* (defun bar ()
    (inc-x))
BAR
* (defun foo ()
    (bar))
FOO
* (get-x)
0
* (foo)
1
* (get-x)
1
The global variable is now gone, but it still seems like the expression (inc-x) has a (latent) side effect, and it will return different (but unused) values each time it is called. Does this confirm that using a closure on the variable in question does not solve the transparency problem?
global variables are problematic mainly because they have the potential to interfere with referential transparency
If one wants to create a global configuration value, a global variable in Common Lisp is just fine.
Often it's desirable to package a bunch of configuration state and then it may be better to put that into an object.
There is no general requirement for procedures to be referentially transparent.
It's useful to guide software design by software engineering principles, but often easy debugging and maintenance is more important than strict principles.
(let ((x 0))
  (defun inc-x () (incf x))
  (defun reset-x () (setf x 0))
  (defun get-x () x))
In practice, the code above:
is difficult to inspect
has problematic effects when the code is reloaded
prevents the file compiler from recognizing the functions as top-level definitions
creates a whole API just for managing a single variable
Referential transparency means that if you bind some variable x to an expression e, you can replace all occurrences of x by e without changing the outcome. For example:
(let ((e (* pi 2)))
  (list (cos e) (sin e)))
The above could be written:
(list (cos (* pi 2))
      (sin (* pi 2)))
The resulting value is equivalent to the first one for some useful definition of equivalence (here equalp, but you could choose another one). Contrast this with:
(let ((e (random 1.0)))
  (list e e))
Here, each call to random gives a different result (statistically), so the behaviour differs depending on whether you reuse one result multiple times or generate a new one on each call: (list (random 1.0) (random 1.0)) is not equivalent to the form above.
Special variables are like additional arguments to functions, they can influence the outcome of a result simply by being bound to different values. Consider *default-pathname-defaults*, which is used to build pathnames.
In fact, for a given binding of that variable, each call to (merge-pathnames "foo") returns the same result. The result changes only if you use the same expression in different dynamic contexts, which is no different from calling a function with different arguments.
The main difficulty is that the special variable is hidden, i.e. you might not know that it influences some expressions, and that's why you need them documented and limited in number.
What breaks referential transparency is the presence of side-effects, whether you are using lexical or special variables. In both cases, a place is modified as part of the execution of the function, which means that you need to consider when and how often you call it.
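The point does not depend on Common Lisp. Below is a small sketch in Python (not the asker's code; all names are hypothetical) showing that a counter hidden in a closure still returns different values for the same call expression, whereas a version that takes the state as an argument does not.

```python
def make_counter():
    """Return (inc_x, get_x) closing over a private counter x."""
    x = 0

    def inc_x():
        nonlocal x
        x += 1  # side effect on the closed-over variable
        return x

    def get_x():
        return x

    return inc_x, get_x

def inc(x):
    """Pure alternative: the state is an explicit argument and result."""
    return x + 1

inc_x, get_x = make_counter()
first = inc_x()
second = inc_x()
print(first, second, get_x())  # the same expression inc_x() gave two values
print(inc(0), inc(0))          # inc(0) can always be replaced by its value
```

The closure limits who can modify x, but calling inc_x() is still not interchangeable with its value.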
You could get better suggestions if you explained a bit more about how your code is organized. You said that you have many special variables due to prototyping, but in the refactoring you want to do, it seems as if you want to keep the prototype code mostly untouched. Maybe there is a way to pack things in a nice modular way, but we can't help without knowing more about why you need many special variables, etc.
That code isn't referentially transparent. It is an improvement from special variables though.
The code you put would be a functional nonce if you dropped the reset-x.
My answer to your previous question had general guidelines about special variables. For your specific case, perhaps they are worth it? I could see the case for using special variables as a nonce, for example, where it is probably silly to pass them around.
Common Lisp has so many facilities for dealing with global information, so there is rarely a need for having lots of global variables. You could define an *env* alist to store your values in, or put them in a hash table, or put them into symbol plists, or package them in a closure to pass around, or do something else, or use CLOS.
Where is the side effect in the second example? The x inside the let isn't accessible from the outside.
Here's another closure example, with top-level functions, and a counter explicitly inside it.
(defun repeater (n)
  (let ((counter -1))
    (lambda ()
      (if (< counter n)
          (incf counter)
          (setf counter 0)))))
(defparameter *my-repeater* (repeater 3))
;; *MY-REPEATER*
(funcall *my-repeater*)
0
(funcall *my-repeater*)
1
https://lispcookbook.github.io/cl-cookbook/functions.html#closures

Modelling generic datatypes for Z3 and or SMT(v2.6)

I would like to model the behaviour of generic datatypes in SMT v2.6. I am using Z3 as constraint solver. I modelled, based on the official example, a generic list as parameterised datatype in the following way:
(declare-datatypes (T) ((MyList nelem (cons (hd T) (tl MyList)))))
I would like the list to be generic with respect to the datatype. Later on, I would like to declare constants the following way:
(declare-const x (MyList Int))
(declare-const y (MyList Real))
However, now I would like to define functions on the generic datatype MyList (e.g., a length operation, empty operation, ...) so that they are re-usable for all T's. Do you have an idea how I could achieve this? I did try something like:
(declare-sort K)
(define-fun isEmpty ((in (MyList K))) Bool
(= in nelem)
)
but this gives me an error message; for this example to work Z3 would need to do some type-inference, I suppose.
It would be great if you could give me a hint.
SMT-Lib does not allow polymorphic user-defined functions. Section 4.1.5 of http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf states:
Well-sortedness checks, required for commands that use sorts or terms,
are always done with respect to the current signature. It is an error
to declare or define a symbol that is already in the current
signature. This implies in particular that, contrary to theory
function symbols, user-defined function symbols cannot be overloaded.
Which is further expanded in Footnote-29:
The motivation for not overloading user-defined symbols is to simplify
their processing by a solver. This restriction is significant only for
users who want to extend the signature of the theory used by a script
with a new polymorphic function symbol—i.e., one whose rank would
contain parametric sorts if it was a theory symbol. For instance,
users who want to declare a “reverse” function on arbitrary lists,
must define a different reverse function symbol for each (concrete)
list sort used in the script. This restriction might be removed in
future versions.
So, as you suspected, you cannot define "polymorphic" functions at the user level. But as the footnote indicates, this restriction might be removed in the future, something that will most likely happen as SMT-solvers are more widely deployed. Exactly when that might happen, however, is anyone's guess.
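Until then, the per-sort copies that the footnote describes can be generated mechanically. The following Python sketch emits one isEmpty definition per concrete element sort. The datatype declaration is written in the v2.6 syntax as I understand it and was not checked against Z3 here; the isEmpty_<sort> naming scheme and the helper names are hypothetical, and only simple sort names such as Int or Real are handled.

```python
# Parametric list declaration in SMT-LIB v2.6 style (assumed syntax).
DATATYPE = ("(declare-datatypes ((MyList 1)) "
            "((par (T) ((nelem) (cons (hd T) (tl (MyList T)))))))")

def is_empty_for(sort):
    """Emit a monomorphic isEmpty for lists of the given simple sort name."""
    list_sort = "(MyList {})".format(sort)
    return ("(define-fun isEmpty_{s} ((in {ls})) Bool\n"
            "  (= in (as nelem {ls})))").format(s=sort, ls=list_sort)

def script(sorts):
    """Datatype declaration followed by one isEmpty per element sort."""
    return "\n".join([DATATYPE] + [is_empty_for(s) for s in sorts])

out = script(["Int", "Real"])
print(out)
```

The (as nelem (MyList Int)) annotation gives the solver the concrete sort of the empty list, which is the information it could not infer in the attempted polymorphic definition.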

How incremental solving works in Z3?

I have a question regarding how Z3 incrementally solves problems. After reading through some answers here, I found the following:
There are two ways to use Z3 for incremental solving: one is push/pop (stack) mode, the other is using assumptions (see "Soft/Hard constraints in Z3").
In stack mode, z3 will forget all learned lemmas in global (am I right?) scope even after one local "pop" (see "Efficiency of constraint strengthening in SMT solvers").
In assumptions mode (I don't know the name, that is the name that comes to my mind), z3 will not simplify some formulas, e.g. value propagation (see "z3 behaviour changing on request for unsat core").
I did some comparison (you are welcome to ask for the formulas, they are just too large to put on the rise4fun), but here are my observations: On some formulas, including quantifiers, the assumptions mode is faster. On some formulas with lots of boolean variables (assumptions variables), stack mode is faster than assumptions mode.
Are they implemented for specific purposes? How does incremental solving work in Z3?
Yes, there are essentially two incremental modes.
Stack based: using push(), pop() you create a local context that follows a stack discipline. Assertions added under a push() are removed after a matching pop(). Furthermore, any lemmas that are derived under a push are removed. Use push()/pop() to emulate freezing a state and adding additional constraints over the frozen state, then return to the frozen state. It has the advantage that any additional memory overhead (such as learned lemmas) built up within the scope of a push() is released. The working assumption is that learned lemmas under a push would not be useful any longer.
Assumption based: using additional assumption literals passed to check()/check_sat() you can (1) extract unsatisfiable cores over the assumption literals, (2) gain local incrementality without garbage collecting lemmas that get derived independently of the assumptions. In other words, if Z3 learns a lemma that does not contain any of the assumption literals, it does not expect to garbage collect it. To use assumption literals effectively, you have to add them to formulas too, so the tradeoff is that clauses used with assumptions contain some amount of bloat. For example, if you want to locally assume some formula (<= x y), then you add a clause (=> p (<= x y)) and assume p when calling check_sat(). Note that the original assumption was a unit, and Z3 propagates units efficiently. In the formulation that uses assumption literals, it is no longer a unit at the base level of search, which incurs some extra overhead: units become binary clauses, binary clauses become ternary clauses, etc.
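To make the two encodings concrete, here is a small Python sketch that writes out the SMT-LIB commands each mode would send for the same local constraint. It only builds text and does not call Z3. The guard names p0, p1, ... and the helper names are hypothetical, and check-sat-assuming is used as the script-level counterpart of passing assumptions to check().

```python
def stack_mode(base, local):
    """Assert local formulas inside a push/pop scope."""
    lines = list(base)
    lines.append("(push 1)")
    lines += ["(assert {})".format(f) for f in local]
    lines += ["(check-sat)", "(pop 1)"]  # pop discards the local assertions
    return lines

def assumption_mode(base, local):
    """Guard each local formula with a fresh Boolean and assume the guards."""
    lines = list(base)
    guards = []
    for i, f in enumerate(local):
        guard = "p{}".format(i)  # hypothetical fresh name, may clash in real scripts
        guards.append(guard)
        lines.append("(declare-const {} Bool)".format(guard))
        lines.append("(assert (=> {} {}))".format(guard, f))  # unit becomes binary
    lines.append("(check-sat-assuming ({}))".format(" ".join(guards)))
    return lines

base = ["(declare-const x Int)", "(declare-const y Int)"]
local = ["(<= x y)"]
print("\n".join(stack_mode(base, local)))
print("\n".join(assumption_mode(base, local)))
```

In the second script the guarded implications stay asserted after the check; they simply have no effect in later checks that do not assume the guard.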
The differentiation between push/pop functionality holds for Z3's default SMT engine. This is the engine most formulas will be using. Z3 contains a portfolio of engines. For example, for pure bit-vector problems, Z3 may end up using the SAT-based engine. Incrementality in the SAT-based engine is implemented differently from the default engine. Here incrementality is implemented using assumption literals. Any assertion you add within the scope of a push is asserted as an implication (=> scope_literals formula). check_sat() within such a scope will have to deal with assumption literals. On the flip side, any consequence (lemma) that does not depend on the current scope is not garbage collected on pop().
In optimization mode, when you assert optimization objectives, or when you use the optimization objects over the API, you can also invoke push/pop. Likewise with fixedpoints. For these two features, push/pop are essentially for user-convenience. There is no internal incrementality. The reason is that these two modes use substantial pre-processing that is super non-incremental.

Z3: A better way to model?

I've two SMT problem instances. The first is here:
http://gist.github.com/1232766
Z3 returns a model for this problem in about 2 minutes on my not-so-great machine, which is great. I also have this one:
http://gist.github.com/1232769
I've run Z3 overnight on this problem without it completing. If you check the contents of these files, you'll see that the second one is identical to the first, except it has an extra assertion to "reject" the model returned by the first instance. (You can do a "diff" between them to see what I mean.) I happen to know that this problem has multiple satisfying models, and I'm trying to use Z3 to find all satisfying models.
I understand that this might be completely expected, but I was curious to know why the second one is a much tougher problem for Z3 compared to the first. Is there a better way to formulate the second problem so Z3 will have an easier time?
Thanks..
It is hard to give you a precise answer without knowing more about your application.
As you suggested, modeling plays a big role in the logic you are using: AUFBV.
The strategy used by Z3 also has a big impact on the overall performance.
Z3 comes equipped with several builtin strategies. It has many parameters that can be used to influence the search.
Z3 also has a strategy specification language. This is a new feature. I’m not advertising it because it is a work in progress, and the language will most likely change in the next versions.
You can access more information about the strategy language by executing the commands:
(help check-sat-using)
(help-strategy)
That being said, there is a builtin strategy in Z3 that seems to be effective on your problem.
It is the strategy used for the logic UFBV. Your problem uses arrays, but they can be avoided by transforming table0 into a function with two arguments:
(declare-fun table0 ((_ BitVec 64) (_ BitVec 64)) (_ BitVec 8))
And replacing every term of the form (select (table0 s65) t) with (table0 s65 t) where t is an arbitrary term.
Finally, you must also add the command (set-logic UFBV) at the beginning of the file. With this setting, I managed to generate 4 different models for your query.
I didn’t try to generate more than that. Each call consumed approx 75 secs.
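For a large generated file, the term replacement can be automated. Below is a rough Python sketch that parses one s-expression, rewrites every (select (table0 a) t) into (table0 a t), and prints it back. The parser is deliberately minimal and hypothetical: it ignores comments, string literals, and |quoted| symbols, and the declaration of table0 still has to be replaced by hand as shown above.

```python
import re

def parse(text):
    """Parse one s-expression into nested lists of atom strings."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        return tok

    return read()

def rewrite(expr):
    """Turn (select (table0 a) t) into (table0 a t), bottom-up."""
    if not isinstance(expr, list):
        return expr
    expr = [rewrite(e) for e in expr]
    if (len(expr) == 3 and expr[0] == "select"
            and isinstance(expr[1], list) and len(expr[1]) == 2
            and expr[1][0] == "table0"):
        return ["table0", expr[1][1], expr[2]]
    return expr

def show(expr):
    """Print nested lists back as an s-expression."""
    if isinstance(expr, list):
        return "(" + " ".join(show(e) for e in expr) + ")"
    return expr

src = "(assert (= (select (table0 s65) (bvadd t #x01)) #x00))"
print(show(rewrite(parse(src))))
```

Selects on other arrays are left alone, since only terms whose array argument is an application of table0 match the pattern.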
