I have a finite set of pairs of type (int a, int b). The exact values of the pairs are explicitly present in the knowledge base. For example, it could be represented by a function (int a, int b) -> (bool exists) which is fully defined on a finite domain.
I would like to write a function f with signature (int b) -> (int count), giving the number of pairs whose second member is the specified b value. I would like to do this in z3 python, though it would also be useful to know how to do it in the z3 language.
For example, my pairs could be:
(0, 0)
(0, 1)
(1, 1)
(1, 2)
(2, 1)
then f(0) = 1, f(1) = 3, f(2) = 1
This is a bit of an odd thing to do in z3: If the exact values of the pairs are in your knowledge base, then why do you need an SMT solver? You can just search and count using your regular programming techniques, whichever language you are in.
But perhaps you have some other constraints that come into play, and want a generic answer. Here's how one would code this problem in z3py:
from z3 import *

pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]

# Symbolic count of the pairs whose second member equals snd
def count(snd):
    return sum([If(snd == p[1], 1, 0) for p in pairs])

s = Solver()
searchFor = Int('searchFor')
result = Int('result')

# searchFor ranges over the values that occur as some pair's second member
s.add(Or(*[searchFor == d[1] for d in pairs]))
s.add(result == count(searchFor))

# Enumerate all models, blocking each value of searchFor once it is found
while s.check() == sat:
    m = s.model()
    print("f(" + str(m[searchFor]) + ") = " + str(m[result]))
    s.add(searchFor != m[searchFor])
When run, this prints:
f(0) = 1
f(1) = 3
f(2) = 1
as you predicted.
Again: if your pairs are exactly known (i.e., they are concrete numbers), don't use z3 for this problem; simply write a program to count as needed. If the database values, however, are not necessarily concrete but have other constraints, then the above would be the way to go.
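For instance, the concrete counting needs no solver at all; in plain Python it is just:

from collections import Counter

pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]
counts = Counter(b for _, b in pairs)  # count occurrences of each second member
print(counts[0], counts[1], counts[2])  # prints: 1 3 1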
To find out how this is coded in SMTLib (the native language z3 speaks), you can insert print(s.sexpr()) in the program before the while loop starts. That's one way. Of course, if you were writing this by hand, you might want to code it differently in SMTLib; but I'd strongly recommend sticking to higher-level languages instead of SMTLib as it tends to be hard to read/write for anyone except machines.
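For reference, a hand-written SMTLib sketch of the same constraints might look like the following (my own rendering for illustration, not the exact output of s.sexpr()):

(declare-const searchFor Int)
(declare-const result Int)
; searchFor ranges over the values occurring in the pairs
(assert (or (= searchFor 0) (= searchFor 1) (= searchFor 2)))
; one ite per pair, testing the pair's second member
(assert (= result (+ (ite (= searchFor 0) 1 0)
                     (ite (= searchFor 1) 1 0)
                     (ite (= searchFor 1) 1 0)
                     (ite (= searchFor 2) 1 0)
                     (ite (= searchFor 1) 1 0))))
(check-sat)
(get-model)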
I am new to the Z3 solver in Python. I am trying to define a list and confine all my outputs to that list for a simple operation like xor.
My code:
b = Solver()
ls = [1, 2, 3, 4, 5]  # my list
s1 = BitVec('s1', 32)
s2 = BitVec('s2', 32)
x = b.check(s1 ^ s2 == 1, s1 in ls, s2 in ls)  # s1 and s2 belong to the list; however, this is not the correct way
if x == sat: print(b.model().eval)
The check function doesn't work like that.
Can anyone please help me in figuring how to do this in a different way?
Ans: s1 = 2, s2 = 3, since 2 xor 3 = 1 and both 2 and 3 belong to ls = [1, 2, 3, 4, 5].
The easiest way to do this would be to define a function that checks whether a given argument is in a provided list. Something like:
from z3 import *

# True iff x equals one of the concrete values in lst
def oneOf(x, lst):
    return Or([x == i for i in lst])

s1 = BitVec('s1', 32)
s2 = BitVec('s2', 32)

s = Solver()
ls = [1, 2, 3, 4, 5]
s.add(oneOf(s1, ls))
s.add(oneOf(s2, ls))
s.add(s1 ^ s2 == 1)

print(s.check())
print(s.model())
When I run this, I get:
sat
[s2 = 2, s1 = 3]
which I believe is what you're after.
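If you want every satisfying pair rather than just one, you can block each model and re-check, using the same loop as in the counting example above:

while s.check() == sat:
    m = s.model()
    print(m[s1], m[s2])
    s.add(Or(s1 != m[s1], s2 != m[s2]))  # rule out this particular assignment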
I am currently learning Racket (just for fun) and I stumbled upon this example:
(define doubles
  (stream-cons
   1
   (stream-map
    (lambda (x)
      (begin
        (display "map applied to: ")
        (display x)
        (newline)
        (* x 2)))
    doubles)))
It produces 1 2 4 8 16 ...
I do not quite understand how it works.
So it creates 1 as the first element; when I call (stream-ref doubles 1) it creates the second element, which is obviously 2.
Then I call (stream-ref doubles 2), which should force the creation of the third element, so it calls stream-map on a stream which already has 2 elements – (1 2) – so it should then produce (2 4) and append this result to the stream.
Why is stream-map always applied to the last created element? How does it work?
Thank you for your help!
This is a standard trick that makes it possible for lazy streams to be defined in terms of their previous element. Consider a stream as an infinite sequence of values:
s = x0, x1, x2, ...
Now, when you map over a stream, you provide a function and produce a new stream with the function applied to each element of the stream:
map(f, s) = f(x0), f(x1), f(x2), ...
But what happens when a stream is defined in terms of a mapping over itself? Well, if we have a stream s = 1, map(f, s), we can expand that definition:
s = 1, map(f, s)
= 1, f(x0), f(x1), f(x2), ...
Now, when we actually go to evaluate the second element of the stream, f(x0), then x0 is clearly 1, since we defined the first element of the stream to be 1. But when we go to evaluate the third element of the stream, f(x1), we need to know x1. Fortunately, we just evaluated x1, since it is f(x0)! This means we can “unfold” the sequence one element at a time, where each element is defined in terms of the previous one:
f(x) = x * 2
s = 1, map(f, s)
= 1, f(x0), f(x1), f(x2), ...
= 1, f(1), f(x1), f(x2), ...
= 1, 2, f(x1), f(x2), ...
= 1, 2, f(2), f(x2), ...
= 1, 2, 4, f(x2), ...
= 1, 2, 4, f(4), ...
= 1, 2, 4, 8, ...
This knot-tying works because streams are evaluated lazily, so each value is computed on-demand, left-to-right. Therefore, each previous element has been computed by the time the subsequent one is demanded, and the self-reference doesn’t cause any problems.
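For comparison, here is a minimal Python sketch of the same knot-tying trick (an illustration of the idea only, not how Racket actually implements streams): a stream is a head plus a memoized thunk for its tail, and doubles is defined in terms of itself exactly as above.

class Stream:
    # A lazy stream: a head plus a memoized thunk that produces the tail.
    def __init__(self, head, tail_thunk):
        self.head = head
        self._tail_thunk = tail_thunk
        self._tail = None

    def tail(self):
        if self._tail is None:           # force the thunk at most once
            self._tail = self._tail_thunk()
        return self._tail

def smap(f, s):
    # Map lazily: apply f to the head now, defer the rest of the stream
    return Stream(f(s.head), lambda: smap(f, s.tail()))

# The self-reference is safe because smap only runs when the thunk is
# forced, by which time doubles is fully defined.
doubles = Stream(1, lambda: smap(lambda x: x * 2, doubles))

s = doubles
for _ in range(5):
    print(s.head)  # prints 1 2 4 8 16, one per line
    s = s.tail()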
I'm new to Spark and am trying to find a way to integrate information from one RDD into another, but their structures don't lend themselves to a standard join.
I have one RDD of this format:
[{a:a1, b:b1, c:[1,2,3,4], d:d1},
{a:a2, b:b2, c:[5,6,7,8], d:d2}]
and another of this format:
[{1:x1}, {2:x2}, {3:x3}, {4:x4}, {5:x5}, {6:x6}, {7:x7}, {8:x8}]
I want to match the values in the second RDD to their keys in the first RDD (which appear in the list stored under the c key). I know how to manipulate them once they're there, so I'm not too concerned about the final output, but I'd maybe like to see something like this:
[{a:a1, b:b1, c:[1,2,3,4],c0: [x1,x2,x3,x4], d:d1},
{a:a2, b:b2, c:[5,6,7,8],c0: [x5,x6,x7,x8], d:d2}]
or this:
[{a:a1, b:b1, c:[(1,x1),(2,x2),(3,x3),(4,x4)], d:d1},
{a:a2, b:b2, c:[(5,x5),(6,x6),(7,x7),(8,x8)], d:d2}]
or anything else that matches the keys in the second RDD with the values in the first. I considered turning the second RDD into a dictionary, which I know how to work with, but I think my data is too large for that.
Thank you so much, I really appreciate it.
A join after a flatMap, or a plain cartesian, makes too many shuffles.
One possible solution is to use cartesian after a groupBy with a HashPartitioner.
(Sorry, this is Scala code.)
val rdd0: RDD[(String, String, Seq[Int], String)]  // (a, b, c, d)
val rdd1: RDD[(Int, String)]                       // (key, x)

val partitioner = new HashPartitioner(rdd0.partitions.size)

// here is the point! bucket rdd1 into roughly rdd0.partitions.size groups,
// so the cartesian pairs each row of rdd0 with whole buckets of lookups
val grouped = rdd1.groupBy(partitioner.getPartition(_))

val result = rdd0.cartesian(grouped).map { case (left, (_, right)) =>
  // look up every id in the "c" field within this bucket; misses are dropped
  val map = right.toMap
  (left._1, left._2, left._4) -> left._3.flatMap(v => map.get(v).map(v -> _))
}.groupByKey().map { case (key, value) =>
  // reassemble the (id, x) pairs gathered from all buckets
  (key._1, key._2, value.flatten.toSeq, key._3)
}
I will assume that rdd1 is the input containing {a:a1, b:b1, c:[1,2,3,4], d:d1} and rdd2 has tuples [(1, x1), (2, x2), (3, x3), (4, x4), (5, x5), (6, x6), (7, x7), (8, x8)]. I will also assume that all values in the "c" field in rdd1 can be found in rdd2. If not, you need to change some of the code below.
I sometimes have to solve this type of problem. If rdd2 is small enough, I typically do a map-side join, where I first broadcast the object and then do a simple lookup.
# Map-side join: look up each key of line['c'] in the broadcast dictionary
def augment_rdd1(line, lookup):
    c0 = []
    for key in line['c']:
        c0.append(lookup.value[key])
    return c0

lookup = sc.broadcast(dict(rdd2.collect()))
output = rdd1.map(lambda line: (line, augment_rdd1(line, lookup)))
If rdd2 is too large to be broadcast, what I normally do is use a flatMap to map every row of rdd1 to as many rows as there are elements in the "c" field; e.g., {a:a1, b:b1, c:[1,2,3,4], d:d1} would be mapped to
(1, {a:a1, b:b1, c:[1,2,3,4], d:d1})
(2, {a:a1, b:b1, c:[1,2,3,4], d:d1})
(3, {a:a1, b:b1, c:[1,2,3,4], d:d1})
(4, {a:a1, b:b1, c:[1,2,3,4], d:d1})
The flatMap is:
flat_rdd1 = rdd1.flatMap(lambda line: [(key, line) for key in line['c']])
Then, I would join with rdd2 to get an RDD which has:
({a:a1, b:b1, c:[1,2,3,4], d:d1}, x1)
({a:a1, b:b1, c:[1,2,3,4], d:d1}, x2)
({a:a1, b:b1, c:[1,2,3,4], d:d1}, x3)
({a:a1, b:b1, c:[1,2,3,4], d:d1}, x4)
The join is then direct, since rdd2 already consists of (key, value) tuples (if your rows are the single-entry dicts from the question, flatten them first with rdd2.flatMap(lambda line: line.items())):
joined_rdd = flat_rdd1.join(rdd2).map(lambda x: x[1])
Finally, all you need to do is a groupByKey to get ({a:a1, b:b1, c:[1,2,3,4], d:d1}, [x1, x2, x3, x4]):
result = joined_rdd.groupByKey()
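One caveat, assuming your rows really are Python dicts as in the question: dicts (and lists) are not hashable, so Spark cannot hash the record itself as a groupByKey key, and the line above will fail on such rows. A minimal sketch of one workaround is to key by a hashable projection of the record; the record_key helper and the choice of fields below are hypothetical:

# Hypothetical sketch: key by the scalar fields (a tuple is hashable),
# carry the full record along, then reattach the grouped x-values as c0.
def record_key(line):
    return (line['a'], line['b'], line['d'])

keyed = joined_rdd.map(lambda rec: (record_key(rec[0]), rec))

def rebuild(group):
    records = list(group[1])
    line = dict(records[0][0])            # copy one instance of the record
    line['c0'] = [x for _, x in records]  # the matched x-values
    return line

result = keyed.groupByKey().map(rebuild)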