Simulate the semantics of x86 opcode 'bsf' in Z3

I am working on an assembly program analysis task using Z3, and I am stuck on simulating the semantics of the x86 opcode bsf.
bsf operand1 operand2 searches the source operand (operand1) for the least significant set bit (1 bit) and stores that bit's index in operand2.
Its semantics can be written in C-like pseudocode as:
if(operand1 == 0) {
    ZF = 1;
    operand2 = Undefined;
}
else {
    ZF = 0;
    Temporary = 0;
    while(Bit(operand1, Temporary) == 0) {
        Temporary = Temporary + 1;
    }
    operand2 = Temporary;
}
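For reference, here is a minimal concrete (non-symbolic) sketch of the same semantics in Python. It assumes a 32-bit operand by default and follows Intel's documented behavior that ZF is set when the source is zero. The name bsf_concrete and the use of None for the undefined destination are illustrative choices, not part of any library.

```python
def bsf_concrete(src, width=32):
    """Return (ZF, dest) for a concrete source value.

    dest is None when the source is zero, standing in for "undefined".
    """
    src &= (1 << width) - 1          # keep only the low `width` bits
    if src == 0:
        return 1, None               # ZF = 1, destination undefined
    index = 0
    while (src >> index) & 1 == 0:   # scan upward from bit 0
        index += 1
    return 0, index                  # ZF = 0, index of the lowest set bit


print(bsf_concrete(12))  # 12 = 0b1100, prints (0, 2)
```

A concrete model like this is handy as an oracle when checking a symbolic encoding against a handful of known inputs.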
Suppose each operand (e.g., a register) holds a symbolic expression. I am trying to simulate the above semantics in Z3Py. The code I wrote looks something like this (simplified):
def aux_bsf(x): # x is operand1
    if simplify(x == 0):
        raise Exception("undefined in aux_bsf")
    else:
        n = x.size()
        for i in range(n):
            b = Extract(i, i, x)
            if simplify(b == 1):
                return BitVecVal(i, 32)
        raise Exception("undefined in bsf")
However, I find that simplify(x == 0) (e.g., when x is BitVecVal(13, 32) + BitVec("symbol1", 32)) is always treated as True. In other words, I always end up in the first exception!
Am I doing anything wrong here?
====================================================
OK, so I think what I need is something like:
def aux_bsf(x):
    def aux(x, i):
        if i == 31:
            return 31
        else:
            return If(Extract(i, i, x) == 1, i, aux(x, i+1))
    return aux(x, 0)

simplify(x == 0) returns a Z3 expression, not a Python True/False. Python treats an expression reference as a truthy (non-zero) value and therefore takes the first branch. Unless x is a bit-vector constant, simplification does not produce a definite value. The same issue applies to simplify(b == 1).
You could instead encode such functions as a relation between operand1 and operand2, along the lines of:
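def aux_bsf(s, x, y):
    # x is the source operand, y the result, s the solver
    for k in range(x.size()):
        s.add(Implies(lsb(k, x), y == k))
def lsb(k, x):
    # holds iff bit k of x is set and all lower bits are zero
    first0 = True
    if k > 0:
        first0 = Extract(k-1, 0, x) == 0
    return And(Extract(k, k, x) == 1, first0)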
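The relational encoding only pins down the result if exactly one of the lsb(k, x) guards holds for each nonzero x. One way to convince yourself is to evaluate the same predicate on concrete integers. The sketch below is plain Python rather than Z3Py; lsb_concrete, matching_indices and the 8-bit width are assumptions made for illustration.

```python
WIDTH = 8  # assumed small width so every value can be enumerated


def lsb_concrete(k, x):
    """True iff bit k of x is set and bits k-1..0 are all zero."""
    low_bits_zero = (x & ((1 << k) - 1)) == 0
    return ((x >> k) & 1) == 1 and low_bits_zero


def matching_indices(x):
    """All k in range(WIDTH) whose guard holds for x."""
    return [k for k in range(WIDTH) if lsb_concrete(k, x)]


# Every nonzero value satisfies exactly one guard. Zero satisfies none,
# which is why the result stays unconstrained in that case.
for x in range(1, 1 << WIDTH):
    assert len(matching_indices(x)) == 1
assert matching_indices(0) == []
```

Because the guards are mutually exclusive, the implications never conflict, and for x == 0 the solver is free to pick any value for the result.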
You can also use uninterpreted functions for the cases where aux_bsf is under-specified.
For example:
def aux_bsf(x):
    bv = x.sort()
    # uninterpreted function standing in for the undefined result when x == 0
    bsf_undef = Function('bsf-undef', bv, bv)
    result = bsf_undef(x)
    for k in reverse(range(bv.size())):
        result = If(Extract(k, k, x) == 1, BitVecVal(k, bv), result)
    return result
def reverse(xs):
    ....

Related

Max and Min of a set of variables in z3py

I have a problem where I want to limit the range of a real variable between the maximum and minimum value of another set of real variables.
s = Solver()
y = Real('y')
Z = RealVector('z', 10)
s.add(And(y >= min(Z), y <= max(Z)))
Is there a way to do this in z3py?
You can use Axel's solution; though that one requires you to create an extra variable and also asserts more constraints than needed. Moreover, it doesn't let you use min and max as simple functions. It might be easier to just program this in a functional way, like this:
# Return minimum of a vector; error if empty
def min(vs):
    m = vs[0]
    for v in vs[1:]:
        m = If(v < m, v, m)
    return m
# Return maximum of a vector; error if empty
def max(vs):
    m = vs[0]
    for v in vs[1:]:
        m = If(v > m, v, m)
    return m
Another difference is that in the functional style we throw an error if the vector is empty. In the other style, the result will essentially be unconstrained. (i.e., min/max can take any value.) You should consider which semantics is right for your application, in case the vector you're passing might be empty. (At the least, you should change it so it prints out a nicer error message. Currently it'll throw an IndexError: list index out of range error if given an empty vector.)
Now you can say:
s = Solver()
y = Real('y')
Z = RealVector('z', 10)
s.add(And(y >= min(Z), y <= max(Z)))
print (s.check())
print (s.model())
This prints:
sat
[z__7 = -1,
z__0 = -7/2,
z__4 = -5/2,
z__5 = -2,
z__3 = -9/2,
z__2 = -4,
z__8 = -1/2,
y = 0,
z__9 = 0,
z__6 = -3/2,
z__1 = -3]
You could benefit from Hakan Kjellerstrand's collection of useful z3py definitions:
from z3 import *
# Functions written by Hakan Kjellerstrand
# http://hakank.org/z3/
# The following can be used by importing http://www.hakank.org/z3/z3_utils_hakank.py
# v is the maximum value of x
def maximum(sol, v, x):
    sol.add(Or([v == x[i] for i in range(len(x))])) # v is an element in x
    for i in range(len(x)):
        sol.add(v >= x[i]) # and it's the greatest
# v is the minimum value of x
def minimum(sol, v, x):
    sol.add(Or([v == x[i] for i in range(len(x))])) # v is an element in x
    for i in range(len(x)):
        sol.add(v <= x[i]) # and it's the smallest
s = Solver()
y = Real('y')
zMin = Real('zMin')
zMax = Real('zMax')
Z = RealVector('z', 10)
minimum(s, zMin, Z)
maximum(s, zMax, Z)
s.add(And(y >= zMin, y <= zMax))
print(s.check())
print(s.model())

Reducing an integer set in z3 over addition

I'm experimenting with (and failing at) reducing sets in z3 over operations like addition. The idea is eventually to prove stuff about arbitrary reductions over reasonably-sized fixed-sized sets.
The first of the two examples below seems like it should yield unsat, but it doesn't. The second does work, but I would prefer not to use it as it requires incrementally fiddling with the model.
def test_reduce():
    LIM = 5
    VARS = 10
    poss = [Int('i%d'%x) for x in range(VARS)]
    i = Int('i')
    s = Solver()
    arr = Array('arr', IntSort(), BoolSort())
    s.add(arr == Lambda(i, And(i < LIM, i >= 0)))
    a = arr
    for x in range(len(poss)):
        s.add(Implies(a != EmptySet(IntSort()), arr[poss[x]]))
        a = SetDel(a, poss[x])
    def final_stmt(l):
        if len(l) == 0: return 0
        return If(Not(arr[l[0]]), 0, l[0] + (0 if len(l) == 1 else final_stmt(l[1:])))
    sm = final_stmt(poss)
    s.push()
    s.add(sm == 1)
    assert s.check() == unsat
Interestingly, the example below works much better, but I'm not sure why...
def test_reduce_with_loop_model():
    s = Solver()
    i = Int('i')
    arr = Array('arr', IntSort(), BoolSort())
    LIM = 1000
    s.add(arr == Lambda(i, And(i < LIM, i >= 0)))
    sm = 0
    f = Int(str(uuid4()))
    while True:
        s.push()
        s.add(arr[f])
        chk = s.check()
        if chk == unsat:
            s.pop()
            break
        tmp = s.model()[f]
        sm = sm + tmp
        s.pop()
        s.add(f != tmp)
    s.push()
    s.add(sm == sum(range(LIM)))
    assert s.check() == sat
    s.pop()
    s.push()
    s.add(sm == 11)
    assert s.check() == unsat
Note that your call to:
f = Int(str(uuid4()))
is inside the loop in the first case and outside the loop in the second case. So the second case works on a single variable and thus converges quickly, while the first keeps creating variables and poses a much harder problem for z3. It's not surprising that these two behave significantly differently, as they encode entirely different constraints.
As a general note, reducing an array of elements with an operation is just not going to be an easy problem for z3. First, you have to assume an upper bound on the elements. And if that's the case, then why bother with Lambda or Array at all? Simply create a Python list of that many variables, and ignore the array logic completely. That is:
elts = [Int("s%d"%i) for i in range(100)]
And then to access the elements of your 'array', simply use Python accessor notation elts[12].
Note that this only works if all your accesses are with a constant integer; i.e., your index cannot be symbolic. But if you're looking for proving reduction properties, that should suffice; and would be much more efficient.

Finding a prime with Miller Rabin

I have what I believe is a proper implementation of the Miller-Rabin algorithm in Lua, and I am trying to get a consistent result for prime numbers. My implementation seems to work only half of the time, although similar code in Python works 100% of the time. Could someone point me in the right direction?
--decompose n-1 as (2^s)*d
local function decompose(negOne)
    exponent, remainder = 0, negOne
    while (remainder%2) == 0 do
        exponent = exponent+1
        remainder = remainder/2
    end
    assert((2^exponent)*remainder == negOne and ((remainder%2) == 1), "Error setting up s and d value")
    return exponent, remainder
end
local function isNotWitness(n, possibleWitness, exponent, remainder)
    witness = (possibleWitness^remainder)%n
    if (witness == 1) or (witness == n-1) then
        return false
    end
    for _=0, exponent do
        witness = (witness^2)%n
        if witness == (n-1) then
            return false
        end
    end
    return true
end
--using miller-rabin primality testing
--n the integer to be tested, k the accuracy of the test
local function isProbablyPrime(n, accuracy)
    if n <= 3 then
        return n == 2 or n == 3
    end
    if (n%2) == 0 then
        return false
    end
    exponent, remainder = decompose(n-1)
    --checks if it is composite
    for i=0, accuracy do
        math.randomseed(os.time())
        witness = math.random(2, n - 2)
        if isNotWitness(n, witness, exponent, remainder) then
            return false
        end
    end
    --probably prime
    return true
end
if isProbablyPrime(31, 30) then
    print("prime")
else
    print("nope")
end
Python has arbitrary-length integers. Lua doesn't.
The problem is in witness = (possibleWitness^remainder)%n.
Lua cannot compute the exact result of 29^15 % 31 directly, because 29^15 is far larger than 2^53, the limit up to which a double represents every integer exactly.
A workaround that works for n < sqrt(2^53) is:
witness = mulmod(possibleWitness, remainder, n)
where
local function mulmod(a, e, m)
    -- computes (a^e) % m by repeated squaring (modular exponentiation, despite the name)
    local result = 1
    while e > 0 do
        if e % 2 == 1 then
            result = result * a % m
            e = e - 1
        end
        e = e / 2
        a = a * a % m
    end
    return result
end
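The same square-and-multiply idea can be sketched in Python to make the size bound visible. This is only an illustration: powmod is a hypothetical name, and Python's built-in pow(a, e, m) already does this job. Every product formed below has both factors smaller than m, so it stays below m * m, which is why the Lua version is exact as long as n < sqrt(2^53).

```python
def powmod(a, e, m):
    """Compute (a ** e) % m by repeated squaring, for m > 1 and e >= 0."""
    result = 1
    a %= m                         # keep the base below m from the start
    while e > 0:
        if e % 2 == 1:             # current low bit of the exponent is set
            result = result * a % m
        e //= 2                    # move on to the next exponent bit
        a = a * a % m              # square the base for that bit
    return result


print(powmod(29, 15, 31))  # prints 30, i.e. n - 1 for n = 31
```

For n = 31 and witness 29 the result is n - 1, so the first check in isNotWitness succeeds once the exponentiation is exact.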

Z3 real arithmetics and data types theories integrating not that well

This is related to the question I asked before at Z3 SMT 2.0 vs Z3 py implementation
I implemented the full algebra for positive reals with infinity and the solver is misbehaving.
I get unknown on this simple instance, even though the commented-out constraint gives an actual solution.
# Data type declaration
MyR = Datatype('MyR')
MyR.declare('inf')
MyR.declare('num',('re',RealSort()))
MyR = MyR.create()
inf = MyR.inf
num = MyR.num
re = MyR.re
# Functions declaration
#sum
def msum(a, b):
    return If(a == inf, a, If(b == inf, b, num(re(a) + re(b))))
#greater or equal
def mgeq(a, b):
    return If(a == inf, True, If(b == inf, False, re(a) >= re(b)))
#greater than
def mgt(a, b):
    return If(a == inf, b!=inf, If(b == inf, False, re(a) > re(b)))
#multiplication inf*0=0 inf*inf=inf num*num normal
def mmul(a, b):
    return If(a == inf, If(b==num(0),b,a), If(b == inf, If(a==num(0),a,b), num(re(a)*re(b))))
s0,s1,s2 = Consts('s0 s1 s2', MyR)
# Constraints add to solver
constraints =[
    s2==mmul(s0,s1),
    s0!=inf,
    s1!=inf
]
#constraints =[s2==mmul(s0,s1),s0==num(1),s1==num(2)]
sol1= Solver()
sol1.add(constraints)
set_option(rational_to_decimal=True)
if sol1.check()==sat:
    m = sol1.model()
    print(m)
else:
    print(sol1.check())
I don't know whether this is surprising or expected. Is there a way to make it work?
Your problem is nonlinear. The new (and complete) nonlinear arithmetic solver (nlsat) in Z3 is not integrated with other theories such as algebraic datatypes. See the posts:
getting unsat core using Z3_solver_get_unsat_core
Non-linear arithmetic and uninterpreted functions
This is a limitation in the current version. Future versions will address this issue.
In the meantime, you can work around the problem by using a different encoding. If you use only real arithmetic and Booleans, the problem will be in the scope of nlsat. One possibility is to encode MyR as a Python pair: a Z3 Boolean expression and a Z3 Real expression.
Here is a partial encoding. I did not encode all operators. The example is also available online at http://rise4fun.com/Z3Py/EJLq
from z3 import *
# Encoding MyR as pair (Z3 Boolean expression, Z3 Real expression)
# We use a class to be able to overload +, *, <, ==
class MyRClass:
    def __init__(self, inf, val):
        self.inf = inf
        self.val = val
    def __add__(self, other):
        other = _to_MyR(other)
        return MyRClass(Or(self.inf, other.inf), self.val + other.val)
    def __radd__(self, other):
        return self.__add__(other)
    def __mul__(self, other):
        other = _to_MyR(other)
        return MyRClass(Or(self.inf, other.inf), self.val * other.val)
    def __rmul__(self, other):
        return self.__mul__(other)
    def __eq__(self, other):
        other = _to_MyR(other)
        return Or(And(self.inf, other.inf),
                  And(Not(self.inf), Not(other.inf), self.val == other.val))
    def __ne__(self, other):
        return Not(self.__eq__(other))
    def __lt__(self, other):
        other = _to_MyR(other)
        return And(Not(self.inf),
                   Or(other.inf, self.val < other.val))
def MyR(name):
    # A MyR variable is encoded as a pair of variables name.inf and name.val
    return MyRClass(Bool('%s.inf' % name), Real('%s.val' % name))
def MyRVal(v):
    return MyRClass(BoolVal(False), RealVal(v))
def Inf():
    return MyRClass(BoolVal(True), RealVal(0))
def _to_MyR(v):
    if isinstance(v, MyRClass):
        return v
    elif isinstance(v, ArithRef):
        return MyRClass(BoolVal(False), v)
    else:
        return MyRVal(v)
s0 = MyR('s0')
s1 = MyR('s1')
s2 = MyR('s2')
sol = Solver()
sol.add( s2 == s0*s1,
         s0 != Inf(),
         s1 != Inf(),
         s0 == s1,
         s2 == 2,
       )
print(sol)
print(sol.check())
print(sol.model())

How can I do mod without a mod operator?

This scripting language doesn't have a % or Mod(). I do have a Fix() that chops off the decimal part of a number. I only need positive results, so don't get too robust.
Will
// mod = a % b
c = Fix(a / b)
mod = a - b * c
do? I'm assuming you can at least divide here. All bets are off on negative numbers.
a mod n = a - (n * Fix(a/n))
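A quick sanity check of the truncation formula, sketched in Python with math.trunc playing the role of Fix(). That mapping is an assumption: all that is known about Fix() here is that it chops off the decimal part. Only non-negative a and positive b are covered, matching the "positive results only" requirement; mod_via_fix is a made-up name.

```python
import math


def mod_via_fix(a, b):
    """Remainder of a / b for a >= 0 and b > 0, using truncation instead of %."""
    quotient = math.trunc(a / b)   # stands in for Fix(a / b)
    return a - b * quotient


# Compare against Python's own % on a small range of positive inputs.
for a in range(0, 200):
    for b in range(1, 20):
        assert mod_via_fix(a, b) == a % b
```

The check passes for these small values; with very large operands the floating-point division could lose precision, so the formula is best kept to modest ranges.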
For posterity, BrightScript now has a modulo operator, it looks like this:
c = a mod b
If someone arrives later, here are some more actual algorithms (with errors, so read carefully):
https://eprint.iacr.org/2014/755.pdf
There are two main kinds of reduction formulae: Barrett and Montgomery. The eprint paper repeats both in different versions (algorithms 1-3) and gives an "improved" version in algorithm 4.
Overview
Here is an overview of algorithm 4:
1.) Compute "A*B" and store the whole product in "C"; C and the modulus $p$ are the inputs to the algorithm.
2.) Compute the bit-length of $p$; say the function "Width(p)" returns exactly that value.
3.) Split the input $C$ into N "blocks" of size "Width(p)" and store each in G, starting with the least significant block in G[0] and ending with the most significant block in G[N-1]. (The paper's description of this step is really faulty.)
4.) Start the while loop:
Set N=N-1 (to reach the last element)
precompute $b:=2^{Width(p)} \bmod p$
while N>0 do:
    T = G[N]
    for(i=0; i<Width(p); i++) do: // Note: the counter itself doesn't matter, it only limits the loop
        T = T << 1 // left shift by 1 bit
        while is_set( bit( T, Width(p) ) ) do // the overflow bit (index Width(p)) of T is 1
            unset( bit( T, Width(p) ) ) // clear that overflow bit
            T += b
        endwhile
    endfor
    G[N-1] += T
    while is_set( bit( G[N-1], Width(p) ) ) do
        unset( bit( G[N-1], Width(p) ) )
        G[N-1] += b
    endwhile
    N -= 1
endwhile
That does a lot. Now we only need to repeatedly reduce G[0]:
while G[0] >= p do
    G[0] -= p
endwhile
return G[0] // = C mod p
The other three algorithms are well defined, but this one lacks some information or presents it really wrongly. But it works for any size ;)
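The steps above can be sketched in Python as follows. This is one reading of the overview, not the paper's algorithm verbatim. The name reduce_blockwise is hypothetical, and b is taken as 2^Width(p) - p, which is congruent to 2^Width(p) mod p and avoids needing a mod operator for the precomputation.

```python
def reduce_blockwise(c, p):
    """Compute c mod p for c >= 0, p >= 1 using shifts, adds and subtracts."""
    w = p.bit_length()             # Width(p)
    top = 1 << w                   # value of the overflow bit (index w)
    b = top - p                    # congruent to 2^w modulo p
    blocks = []                    # blocks[0] is the least significant block
    while c:
        blocks.append(c & (top - 1))
        c >>= w
    if not blocks:
        return 0
    for n in range(len(blocks) - 1, 0, -1):
        t = blocks[n]
        for _ in range(w):         # multiply t by 2^w, one bit at a time
            t <<= 1
            while t & top:         # fold the overflow bit back in
                t = (t ^ top) + b
        blocks[n - 1] += t
        while blocks[n - 1] & top:
            blocks[n - 1] = (blocks[n - 1] ^ top) + b
    r = blocks[0]
    while r >= p:                  # final correction by subtraction
        r -= p
    return r


print(reduce_blockwise(100, 7))  # prints 2
```

Python's % is used only in checks outside the function, so the routine itself relies on nothing beyond shifts, bit masks, addition and subtraction.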
What language is it?
A basic algorithm might be:
hold the modulo in a variable (modulo);
hold the target number in a variable (target);
initialize modulus variable to 0;
while (target > 0) {
    if (target >= modulo) {
        target -= modulo;
    }
    else {
        modulus = target;
        break;
    }
}
This may not work for you performance-wise, but:
while (num >= mod_limit)
    num = num - mod_limit
In javascript:
function modulo(num1, num2) {
    if (num2 === 0 || isNaN(num1) || isNaN(num2)) {
        return NaN;
    }
    if (num1 === 0) {
        return 0;
    }
    var remainderIsPositive = num1 >= 0;
    num1 = Math.abs(num1);
    num2 = Math.abs(num2);
    while (num1 >= num2) {
        num1 -= num2;
    }
    return remainderIsPositive ? num1 : 0 - num1;
}

Resources