Different behaviour between Join and SelectMany after replacing one of the sets

I hope that someone can shed some light on the (to me) unexpected behavioral difference between two queries that should produce equal results.
A small program can be worth a thousand words, so here goes:
static void Main(string[] args)
{
    var l1 = new List<int> { 1, 2, 3 };
    var l2 = new List<int> { 2, 3, 4 };
    var q1 = // or var q1 = l1.Join(l2, i => i, j => j, (i, j) => i);
        from i in l1
        join j in l2
        on i equals j
        select i;
    var q2 = // or var q2 = l1.SelectMany(i => l2.Where(j => i == j));
        from i in l1
        from j in l2
        where i == j
        select i;
    var a1 = q1.ToList(); // 2 and 3, as expected
    var a2 = q2.ToList(); // 2 and 3, as expected
    l2.Remove(2);
    var b1 = q1.ToList(); // only 3, as expected
    var b2 = q2.ToList(); // only 3, as expected
    // Now here goes: let's replace l2 altogether.
    // Afterwards, I expected q1 to deliver the same result as q2...
    l2 = new List<int> { 2, 3, 4 };
    var c1 = q1.ToList(); // only 3? Still using the previous reference to l2?
    var c2 = q2.ToList(); // 2 and 3, as expected
}
Now I know that Join internally uses a lookup class to optimize performance, and without too much knowledge, my guess is that the combination of that with captured variables might cause this behavior, but to say I really understand it, no :-)
Is this an example of what Joel calls "a leaky abstraction" ?
Cheers,
Bart

You're actually nearly there, given your query expansions in the comments:
var q1 = l1.Join(l2, i => i, j => j, (i, j) => i);
var q2 = l1.SelectMany(i => l2.Where(j => i == j));
Look at where l2 is used in each case. In the Join case, the value of l2 is passed into the method immediately. (Remember that the value is a reference to the list though... changing the contents of the list isn't the same as changing the value of l2.) Changing the value of l2 later doesn't affect what the query returned by the Join method remembers.
Now look at SelectMany: l2 is only used in the lambda expression... so it's a captured variable. That means that whenever the lambda expression is evaluated, the value of l2 at that moment in time is used... so it will reflect any changes to the value.
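The same distinction can be reproduced outside C#. The following is only a rough Python analogy, not how LINQ is implemented: the hypothetical LazyJoin class stands in for the object returned by Join (it stores the reference it was handed at construction time), while the lambda stands in for the SelectMany query (it looks up the variable l2 each time it runs).

```python
class LazyJoin:
    """Hypothetical stand-in for the query object returned by Join."""

    def __init__(self, outer, inner):
        # The *values* of the arguments (references to lists) are stored now.
        self.outer = outer
        self.inner = inner

    def to_list(self):
        # Evaluation is deferred, but it uses the references stored above.
        lookup = set(self.inner)
        return [i for i in self.outer if i in lookup]


def run_demo():
    l1 = [1, 2, 3]
    l2 = [2, 3, 4]

    q1 = LazyJoin(l1, l2)  # l2's current value is passed in immediately
    q2 = lambda: [i for i in l1 for j in l2 if i == j]  # l2 is a captured variable

    a = (q1.to_list(), q2())

    l2.remove(2)           # mutates the list both queries can see
    b = (q1.to_list(), q2())

    l2 = [2, 3, 4]         # rebinds the variable; q1 still holds the old list
    c = (q1.to_list(), q2())

    return {"a": a, "b": b, "c": c}


if __name__ == "__main__":
    for step, (first, second) in run_demo().items():
        print(step, first, second)
```

Mutating the list is visible to both queries because they share the same object. Rebinding the variable is visible only to the closure, which matches the c1/c2 difference in the question.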


Z3 - how to count matches?

I have a finite set of pairs of type (int a, int b). The exact values of the pairs are explicitly present in the knowledge base. For example it could be represented by a function (int a, int b) -> (bool exists) which is fully defined on a finite domain.
I would like to write a function f with signature (int b) -> (int count), representing the number of pairs containing the specified b value as their second member. I would like to do this in z3 python, though it would also be useful to know how to do this in the z3 language.
For example, my pairs could be:
(0, 0)
(0, 1)
(1, 1)
(1, 2)
(2, 1)
then f(0) = 1, f(1) = 3, f(2) = 1
This is a bit of an odd thing to do in z3: If the exact values of the pairs are in your knowledge base, then why do you need an SMT solver? You can just search and count using your regular programming techniques, whichever language you are in.
But perhaps you have some other constraints that come into play, and want a generic answer. Here's how one would code this problem in z3py:
from z3 import *

pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]

def count(snd):
    # Symbolic count of the pairs whose second member equals snd.
    return sum([If(snd == p[1], 1, 0) for p in pairs])

s = Solver()
searchFor = Int('searchFor')
result = Int('result')
# Restrict searchFor to the second members that actually occur.
s.add(Or(*[searchFor == d[1] for d in pairs]))
s.add(result == count(searchFor))

while s.check() == sat:
    m = s.model()
    print("f(" + str(m[searchFor]) + ") = " + str(m[result]))
    # Block this value so the next check() yields a different one.
    s.add(searchFor != m[searchFor])
When run, this prints:
f(0) = 1
f(1) = 3
f(2) = 1
as you predicted.
Again, if your pairs are exactly known (i.e., they are concrete numbers), don't use z3 for this problem: simply write a program to count as needed. If the database values, however, are not necessarily concrete but have other constraints, then the above would be the way to go.
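For the fully concrete case, "a program to count as needed" can be very small. A minimal sketch in plain Python, using the pairs from the question (the helper name count_by_second is made up for illustration):

```python
from collections import Counter

pairs = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)]


def count_by_second(pairs):
    """Map each second member b to the number of pairs (a, b) containing it."""
    return Counter(b for _, b in pairs)


counts = count_by_second(pairs)


def f(b):
    # Counter returns 0 for values that never occur as a second member.
    return counts[b]


if __name__ == "__main__":
    for b in sorted(counts):
        print(f"f({b}) = {f(b)}")
```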
To find out how this is coded in SMTLib (the native language z3 speaks), you can insert print(s.sexpr()) in the program before the while loop starts. That's one way. Of course, if you were writing this by hand, you might want to code it differently in SMTLib; but I'd strongly recommend sticking to higher-level languages instead of SMTLib as it tends to be hard to read/write for anyone except machines.

How can I generate a unique, predictable, repeatable, non sequential alphanumeric identifier?

I have to generate identifiers composed of four alphanumerical characters, e.g. B41F.
I have the following requirements:
1) Each identifier must be unique (there is no central location to look up existing identifiers).
2) The identifier must not be obviously sequential (e.g. 1A01, 1A02).
3) It must be predictable.
4) It must be repeatable using solely the identifier index (on two different environments, the Nth identifier generated, which has index N, must be the same).
The problem is generic to any language. My implementation will be done in dart.
I think this could be done with a PRNG and some LUT, but I could not find any implementation or pseudo-code that respects requirement 4) without replaying the whole sequence. Also, some PRNG implementations have a random component that is not guaranteed to be repeatable across library updates.
How can I achieve this? I'm looking for pseudo-code, code or hints.
You should not use a PRNG when identifiers must be unique. RNGs do not promise uniqueness. Some might have a long period before they repeat, but that's at their full bit-range, reducing it to a smaller number may cause conflicts earlier.
Your identifiers are really just numbers in base 36, so you need something like shuffle(index).toRadixString(36) to generate it.
The tricky bit is the shuffle function, which must be a permutation of the numbers 0..36^4-1: one which looks random (non-sequential), but can be computed (efficiently?) for any input.
Since 36^4 is not a power of 2, most of the easy bit-shuffles likely won't work.
If you can live with 32^4 numbers only (2^20 ~ 1M) it might be easier.
Then you can also choose to drop O, I, 0 and 1 from the result, which might make it easier to read.
In that case, I'd do something primitive (not cryptographically secure at all), like:
// Represent 20-bit numbers
String represent(int index) {
  RangeError.checkValueInInterval(index, 0, 0xFFFFF, "index");
  var digits = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";
  return "${digits[(index >> 15) & 31]}${digits[(index >> 10) & 31]}"
      "${digits[(index >> 5) & 31]}${digits[index & 31]}";
}

// Completely naive number shuffler for 20-bit numbers.
// All numbers made up on the spot.
int shuffle(int index) {
  RangeError.checkValueInInterval(index, 0, 0xFFFFF, "index");
  index ^= 0x35712;
  index ^= index << 15;
  index ^= index << 4;
  index ^= index << 12;
  index ^= index << 7;
  index ^= index << 17;
  return index & 0xFFFFF; // 20 bit only.
}
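Uniqueness depends on shuffle being a permutation of the 20-bit range. Each xor-with-left-shift step is invertible on the low 20 bits, so it should be, and this is cheap to check by brute force. Below is a Python port of the two functions above, as a sketch for experimentation rather than a replacement for the Dart code. The is_permutation helper is an addition that is not part of the original answer.

```python
DIGITS = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ"  # 32 symbols, without O, I, 0 and 1
MASK = 0xFFFFF                               # 20 bits


def represent(index):
    """Render a 20-bit number as four characters, five bits per character."""
    if not 0 <= index <= MASK:
        raise ValueError("index out of range")
    return "".join(DIGITS[(index >> shift) & 31] for shift in (15, 10, 5, 0))


def shuffle(index):
    """Port of the naive 20-bit shuffler; constants copied from the Dart code."""
    if not 0 <= index <= MASK:
        raise ValueError("index out of range")
    index ^= 0x35712
    for shift in (15, 4, 12, 7, 17):
        index ^= index << shift
    return index & MASK


def is_permutation():
    """Brute-force check that every 20-bit value is produced exactly once."""
    seen = bytearray(MASK + 1)
    for i in range(MASK + 1):
        seen[shuffle(i)] = 1
    # If two inputs collided, some slot would never have been marked.
    return all(seen)


if __name__ == "__main__":
    print(is_permutation())
    print([represent(shuffle(i)) for i in range(5)])
```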
If you really want the full 36^4 range to be used, I'd probably do something like the shuffle, but in base-six arithmetic (36^4 = 6^8, so every index is exactly eight base-6 digits). Maybe:
// Pads to four characters so small values still give e.g. "000A".
String represent(int index) =>
    RangeError.checkValueInInterval(index, 0, 1679615, "index")
        .toRadixString(36)
        .toUpperCase()
        .padLeft(4, "0");

int shuffle(int index) {
  RangeError.checkValueInInterval(index, 0, 1679615, "index");
  const seed = [1, 4, 2, 5, 0, 3, 1, 4]; // seed.
  // Split the index into eight base-6 digits, least significant first.
  var digits = List<int>.filled(8, 0);
  for (var i = 0; i < 8; i++) {
    digits[i] = index.remainder(6);
    index = index ~/ 6;
  }
  // Adds source (times `times`) into digits, shifted up by `shift` places.
  void shiftAdd(List<int> source, int shift, int times) {
    for (var n = digits.length - 1 - shift; n >= 0; n--) {
      digits[shift + n] = (digits[shift + n] + source[n] * times).remainder(6);
    }
  }

  shiftAdd(seed, 0, 1);
  shiftAdd(digits, 3, 2);
  shiftAdd(digits, 5, 1);
  shiftAdd(digits, 2, 5);
  // Reassemble the digits into a single number.
  var result = 0;
  for (var i = digits.length - 1; i >= 0; i--) {
    result = result * 6 + digits[i];
  }
  return result;
}
Again, this is something I made up on the spot, it "shuffles", but does not promise anything about the properties of the result, other than that they don't look sequential.
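A different, simpler way to get a permutation of the full 0..36^4-1 range is an affine map modulo 36^4. This is not the base-six shuffle above but a swapped-in technique, shown here in Python as a sketch. The constants A and B are hypothetical values picked for illustration; the only hard requirement is that A shares no factor with 36^4 (so it must be divisible by neither 2 nor 3). The stride between consecutive outputs is constant, so this hides structure less well than a multi-round shuffle. It needs Python 3.8+ for the modular inverse via pow.

```python
import math

N = 36 ** 4        # 1,679,616 possible four-character identifiers
A = 1_000_003      # hypothetical multiplier, must be coprime to N
B = 271_828        # hypothetical offset
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# x -> (A*x + B) mod N is a bijection exactly when gcd(A, N) == 1.
assert math.gcd(A, N) == 1
A_INVERSE = pow(A, -1, N)


def shuffle(index):
    if not 0 <= index < N:
        raise ValueError("index out of range")
    return (A * index + B) % N


def unshuffle(value):
    """Inverse of shuffle; its existence is what guarantees uniqueness."""
    return ((value - B) * A_INVERSE) % N


def represent(value):
    """Render a number below 36**4 as four base-36 characters, zero padded."""
    chars = []
    for _ in range(4):
        value, digit = divmod(value, 36)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def identifier(index):
    return represent(shuffle(index))


if __name__ == "__main__":
    print([identifier(i) for i in range(5)])
```

Because the Nth identifier depends only on N and the fixed constants, it can be computed directly on any machine without replaying a sequence.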

Z3py how to solve a problem with many possible path (k out of n potential actions, order matters) efficiently

I am trying to solve a problem that consists of n actions (n >= 8). A path consists of k (k == 4 for now) actions. I would like to check whether there exists any path that satisfies the set of constraints I defined.
I have made two attempts to solve this problem:
Attempt 1: Brute force, try all permutations
Attempt 2: Code a path selection matrix M [k x n], such that each row contains one and only one element greater than 0, and all other elements equal to 0.
For instance if k == 2, n == 2, M = [[0.9, 0], [0, 0.7]] represents perform action 1 first, then action 2.
Then my state transition was coded as:
S1 = a2(a1(S0, M[1][1]), M[1][2]) = a2(a1(S0, 0.9), 0)
S2 = a2(a1(S1, M[2][1]), M[2][2]) = a2(a1(S1, 0), 0.7)
Note: I made sure that S == a(S,0), so that in each step only one action is executed.
Then constraints were checked on S2
I was hoping this to be faster than the permutation way of doing it. Unfortunately, this turns out to be slower. Just wondering if there is any better way to solve this problem?
Code:
_path = [[Real(f'step_{_i}_action_{_j}') for _j in range(len(actions))] for _i in range(number_of_steps)]
_states: List[State] = [self.s0]
for _i in range(number_of_steps):
    _new_state = copy.deepcopy(_states[-1])
    for _a, _p in zip(actions, _path[_i]):
        self.solver.add(_a.constraints(_states[-1], _p))
        _new_state = _a.execute(_new_state, _p)
    _states.append(_new_state)
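To make the selection-matrix encoding of Attempt 2 concrete, here is a sketch in plain Python with concrete numbers instead of z3 variables. The two actions (add and scale) and the scalar state are hypothetical stand-ins; the only property taken from the question is that an action applied with parameter 0 leaves the state unchanged. The last helper shows how many ordered paths Attempt 1 has to enumerate.

```python
import math


def add(state, p):
    return state + p          # add(state, 0) == state


def scale(state, p):
    return state * (1 + p)    # scale(state, 0) == state


def is_selection_matrix(m):
    """Each row must have exactly one entry > 0 and all other entries == 0."""
    return all(
        all(p >= 0 for p in row) and sum(1 for p in row if p > 0) == 1
        for row in m
    )


def apply_path(s0, m, actions):
    """Apply every action in every step; zero parameters act as no-ops."""
    state = s0
    for row in m:
        for action, p in zip(actions, row):
            state = action(state, p)
    return state


def brute_force_path_count(n, k):
    """Number of ordered selections of k distinct actions out of n."""
    return math.perm(n, k)


if __name__ == "__main__":
    M = [[0.9, 0], [0, 0.7]]  # the example matrix from the question
    print(is_selection_matrix(M))
    print(apply_path(1.0, M, [add, scale]))
    print(brute_force_path_count(8, 4))
```

Note that this encoding applies all n actions in each of the k steps, so a symbolic version builds k * n nested terms, most of them no-ops.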

Ruby - How to check only the first return value of a method?

Right now I have this Ruby method that returns 3 different numbers:
# Find integers s and t such that gcd(a,b) = s*a + t*b
# pre: a,b >= 0
# post: return gcd(a,b), s, t
def egcd(a, b)
  # let A, B = a, b
  s, t, u, v = 1, 0, 0, 1
  while 0 < b
    # loop invariant: a = sA + tB and b = uA + vB and gcd(a,b) = gcd(A,B)
    q = a / b
    a, b, s, t, u, v = b, (a%b), u, v, (s-u*q), (t-v*q)
  end
  [a, s, t]
end
I want to only check the first return value, a.
if egcd(ARGV[3].to_i, 128) != 1
This statement does not work because the method returns 3 values and I only want to check whether the first one is != 1. I'm fairly new to Ruby; does anyone know of a way to accomplish this? Thanks in advance!
Getting the first value of an array can be done in a few ways:
if egcd(ARGV[3].to_i, 128).first != 1
or
if egcd(ARGV[3].to_i, 128)[0] != 1
If you're only using the first value, I'd suggest re-writing your program to be a little more intuitive. I'd also consider re-writing this piece of code entirely as it doesn't read nicely at all.
Since your return value is an array, check to see if the first value of the array is not 1.
if egcd(ARGV[3].to_i, 128).first != 1
or
unless egcd(ARGV[3].to_i, 128).first == 1
You can do it this way:
if egcd(ARGV[3].to_i, 128)[0] != 1
...since the return is actually an array.
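For comparison, the same pattern looks much the same in Python, where the function returns a tuple and the caller either indexes it or unpacks it. This is a direct port of the egcd method from the question, shown only as an illustration of picking out the first return value:

```python
def egcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g == s*a + t*b, for a, b >= 0."""
    s, t, u, v = 1, 0, 0, 1
    while b > 0:
        q = a // b
        a, b, s, t, u, v = b, a % b, u, v, s - u * q, t - v * q
    return a, s, t


def is_coprime_to_128(n):
    # Only the first element of the returned tuple is inspected.
    return egcd(n, 128)[0] == 1


if __name__ == "__main__":
    g, _, _ = egcd(7, 128)   # unpacking, discarding s and t
    print(g, is_coprime_to_128(7), is_coprime_to_128(64))
```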

Julia - Preallocating for sparse matrices

I was reading about preallocation from Performance Tips and it has this example:
function xinc!(ret::AbstractVector{T}, x::T) where T
    ret[1] = x
    ret[2] = x+1
    ret[3] = x+2
    nothing
end

function loopinc_prealloc()
    # Julia 1.x requires `undef` to allocate an uninitialized array.
    ret = Vector{Int}(undef, 3)
    y = 0
    for i = 1:10^7
        xinc!(ret, i)
        y += ret[2]
    end
    y
end
I see that the example is trying to change ret which is preallocated. However, when I tried the following:
function addSparse!(sp1, sp2)
    sp1 = 2*sp2
    nothing
end

function loopinc_prealloc()
    sp1 = spzeros(3, 3)
    y = 0
    for i = 1:10^7
        sp2 = sparse([1, 2], [1, 2], [2 * i, 2 * i], 3, 3)
        addSparse!(sp1, sp2)
        y += sp1[1,1]
    end
    y
end
I don't think sp1 is updated by addSparse!. In the example from Julia, function xinc! modifies ret one by one. How can I do the same to a sparse matrix?
In my actual code, I need to update a big sparse matrix in a loop. To save memory, it makes sense for me to preallocate.
The issue is not that the matrix is sparse. The issue is that the assignment operator = rebinds the name sp1 to a new object (with value 2*sp2) rather than updating the existing sp1 matrix. Because sp1 is a local name inside addSparse!, the caller's matrix is left untouched. Consider the example from Performance Tips: ret[1] = x does not reassign ret, it just modifies its elements.
Use the .= operator instead to overwrite all the elements of a container in place, for example sp1 .= 2 .* sp2 inside addSparse!.
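The rebinding-versus-mutation distinction is not specific to Julia. As a loose analogy only (plain Python lists instead of sparse matrices, with made-up function names), the first function below behaves like sp1 = 2*sp2 and the second like an element-wise in-place update:

```python
def rebind(dst, src):
    # Assigning to the parameter name creates a new list and points the
    # local name dst at it; the caller's list is not touched.
    dst = [2 * x for x in src]


def overwrite(dst, src):
    # Writing through indices modifies the caller's existing list in place.
    for i, x in enumerate(src):
        dst[i] = 2 * x


if __name__ == "__main__":
    buf = [0, 0, 0]
    rebind(buf, [1, 2, 3])
    print(buf)   # the preallocated buffer is unchanged
    overwrite(buf, [1, 2, 3])
    print(buf)   # the preallocated buffer now holds the doubled values
```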
