Hash function to map combinations of 5 to 7 cards

Referring to the original problem: Optimizing hand-evaluation algorithm for Poker-Monte-Carlo-Simulation
I have a list of 5 to 7 cards and want to store their value in a hash table, which should be an array of 32-bit integers accessed directly with the hash function's value as the index.
Given the large number of possible combinations in a 52-card deck, I don't want to waste too much memory.
Numbers:
7-card-combinations: 133784560
6-card-combinations: 20358520
5-card-combinations: 2598960
Total: 156.742.040 possible combinations
Storing 157 million 32-bit integer values costs about 627 MB (roughly 598 MiB). So I would like to avoid increasing this number by reserving memory in an array for values that aren't needed.
So the question is: What could a hash function look like that maps each possible, non-duplicated combination of cards to a consecutive value between 0 and 156.742.039, or at least comes close to it?

Paul Senzee has a great post on this for 7 cards (deleted link as it is broken and now points to a NSFW site).
His code is basically a bunch of pre-computed tables and then one function to look up the array index for a given 7-card hand (represented as a 64-bit number with the lowest 52 bits signifying cards):
inline unsigned index52c7(unsigned __int64 x)
{
    const unsigned short *a = (const unsigned short *)&x;
    unsigned A = a[3], B = a[2], C = a[1], D = a[0],
             bcA = _bitcount[A], bcB = _bitcount[B], bcC = _bitcount[C], bcD = _bitcount[D],
             mulA = _choose48x[7 - bcA], mulB = _choose32x[7 - (bcA + bcB)], mulC = _choose16x[bcD];
    return _offsets52c[bcA] + _table4[A] * mulA +
           _offsets48c[ (bcA << 4) + bcB] + _table [B] * mulB +
           _offsets32c[((bcA + bcB) << 4) + bcC] + _table [C] * mulC +
           _table [D];
}
In short, it's a bunch of lookups and bitwise operations powered by pre-computed lookup tables based on perfect hashing.
If you go back and look at this website, you can get the perfect hash code that Senzee used to create the 7-card hash and repeat the process for 5- and 6-card tables (essentially creating a new index52c7.h for each). You might be able to smash all 3 into one table, but I haven't tried that.
All told that should be ~628 MB (4 bytes * 157 M entries). Or, if you want to split it up, you can map it to 16-bit numbers (since I believe most poker hand evaluators only need 7,462 unique hand scores) and then have a separate map from those 7,462 hand scores to whatever hand categories you want. That would be 314 MB.
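The size estimates above can be reproduced with a few lines of arithmetic. This is only a sketch; `table_bytes` is a hypothetical helper name, and the entry widths (4 and 2 bytes) are the two options discussed in this answer.

```python
from math import comb

# Number of distinct k-card hands from a 52-card deck, for k = 5, 6, 7.
counts = {k: comb(52, k) for k in (5, 6, 7)}
total = sum(counts.values())


def table_bytes(entries, bytes_per_entry):
    """Size in bytes of a flat array with one slot per hand (hypothetical helper)."""
    return entries * bytes_per_entry


for width in (4, 2):
    size = table_bytes(total, width)
    print(f"{width} bytes/entry: {size:,} bytes "
          f"= {size / 1e6:.0f} MB = {size / 2**20:.0f} MiB")
```

Note that MB here means 10^6 bytes; the same table measured in MiB (2^20 bytes) comes out somewhat smaller.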

Here's a different answer based on the colex function concept. It works with bitsets that are sorted in descending order. Here's a Python implementation (both recursive so you can see the logic and iterative). The main concept is that, given a bitset, you can always calculate how many bitsets there are with the same number of set bits but less than (in either the lexicographical or mathematical sense) your given bitset. I got the idea from this paper on hand isomorphisms.
from math import factorial
def n_choose_k(n, k):
    return 0 if n < k else factorial(n) // (factorial(k) * factorial(n - k))
def indexset_recursive(bitset, lowest_bit=0):
    """Return number of bitsets with same number of set bits but less than
    given bitset.
    Args:
      bitset (sequence) - Sequence of set bits in descending order.
      lowest_bit (int) - Name of the lowest bit. Default = 0.
    >>> indexset_recursive([51, 50, 49, 48, 47, 46, 45])
    133784559
    >>> indexset_recursive([52, 51, 50, 49, 48, 47, 46], lowest_bit=1)
    133784559
    >>> indexset_recursive([6, 5, 4, 3, 2, 1, 0])
    0
    >>> indexset_recursive([7, 6, 5, 4, 3, 2, 1], lowest_bit=1)
    0
    """
    m = len(bitset)
    first = bitset[0] - lowest_bit
    if m == 1:
        return first
    else:
        t = n_choose_k(first, m)
        return t + indexset_recursive(bitset[1:], lowest_bit)
def indexset(bitset, lowest_bit=0):
    """Return number of bitsets with same number of set bits but less than
    given bitset.
    Args:
      bitset (sequence) - Sequence of set bits in descending order.
      lowest_bit (int) - Name of the lowest bit. Default = 0.
    >>> indexset([51, 50, 49, 48, 47, 46, 45])
    133784559
    >>> indexset([52, 51, 50, 49, 48, 47, 46], lowest_bit=1)
    133784559
    >>> indexset([6, 5, 4, 3, 2, 1, 0])
    0
    >>> indexset([7, 6, 5, 4, 3, 2, 1], lowest_bit=1)
    0
    """
    m = len(bitset)
    g = enumerate(bitset)
    return sum(n_choose_k(bit - lowest_bit, m - i) for i, bit in g)
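To cover 5-, 6- and 7-card hands in one array, the colex rank can be shifted by a per-size offset. The sketch below assumes cards are numbered 0 to 51 and that the 5-card block comes first, then 6, then 7; `OFFSETS`, `colex_rank` and `hand_index` are names made up for this illustration, not part of the code above.

```python
from math import comb

# Assumed layout: [5-card hands][6-card hands][7-card hands].
OFFSETS = {5: 0, 6: comb(52, 5), 7: comb(52, 5) + comb(52, 6)}
TABLE_SIZE = OFFSETS[7] + comb(52, 7)


def colex_rank(cards):
    """Number of same-size card sets that precede `cards` in colex order."""
    ordered = sorted(cards, reverse=True)  # descending, as in indexset
    m = len(ordered)
    # math.comb(n, k) is 0 when k > n, which is what the ranking needs.
    return sum(comb(card, m - i) for i, card in enumerate(ordered))


def hand_index(cards):
    """Map a hand of 5 to 7 distinct cards (0..51) to a dense array index."""
    cards = list(cards)
    if len(cards) not in OFFSETS or len(set(cards)) != len(cards):
        raise ValueError("expected 5 to 7 distinct cards")
    if not all(0 <= c < 52 for c in cards):
        raise ValueError("cards must be numbered 0..51")
    return OFFSETS[len(cards)] + colex_rank(cards)
```

With this layout every index from 0 to `TABLE_SIZE - 1` is used exactly once, so no slots are wasted. The cost is a handful of binomial lookups per hand, which would normally be precomputed in a small table rather than recomputed.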


Lua loop shuffle list

I have a problem with my script: when I loop through my list, the output comes out in a seemingly random, shuffled order.
Minimal code:
list = {
  numbers = {
    number1 = 1,
    number2 = 2,
    number3 = 3,
    number4 = 4,
    number5 = 5,
    number6 = 6,
    number7 = 7,
  }
}
for k, numbers in pairs(list) do
  for k, number in pairs(numbers) do
    print(number)
  end
end
Output:
5
7
2
3
4
6
1
The only fix I have figured out is to remove the keys number1 to number7 and just enter the numbers.
Lua tables do not have an order.
In addition to that you're using pairs which internally uses next.
From the Lua manual:
The order in which the indices are enumerated is not specified, even
for numeric indices. (To traverse a table in numerical order, use a
numerical for.)
In your case the keys have a numeric component so you could simply create them in a numeric loop.
local numbers = {
  number1 = 1,
  number2 = 2,
  number3 = 3,
  number4 = 4,
  number5 = 5,
  number6 = 6,
  number7 = 7,
}
for i = 1, 7 do
  print(numbers["number"..i])
end
For other non-numeric keys you would have to use a second table that lists the keys in an ordered sequence:
local numbers = { bob = 1, bill = 3, john = 2}
local orderedKeys = { "bob", "john", "bill"}
for k,v in ipairs(orderedKeys) do
  print(numbers[v])
end
A numeric loop will always work for any integer keys.
local numbers = {
  [0] = 0,
  [5] = 5,
  [3] = 3,
  [1] = 0,
}
for i = 0, 5 do
  if numbers[i] then
    print(numbers[i])
  end
end
Read through this carefully:
A table with exactly one border is called a sequence. For instance,
the table {10, 20, 30, 40, 50} is a sequence, as it has only one
border (5). The table {10, 20, 30, nil, 50} has two borders (3 and 5),
and therefore it is not a sequence. (The nil at index 4 is called a
hole.) The table {nil, 20, 30, nil, nil, 60, nil} has three borders
(0, 3, and 6) and three holes (at indices 1, 4, and 5), so it is not a
sequence, too. The table {} is a sequence with border 0. Note that
non-natural keys do not interfere with whether a table is a sequence.
Things like ipairs, the length operator #, table.sort, table.concat and others only work with sequences.
Keys that do not contribute to the sequence are ignored by those functions. You can only loop over all keys of a table with next or pairs respectively. But then order is not guaranteed.

If you can combine 3+ arbitrarily sized integers and still be able to deconstruct it back

Say you have 3 integers:
13105
705016
13
I'm wondering if you could combine these into one integer in any way, so that you can still get back to the original 3 integers.
var startingSet = [ 13105, 705016, 13 ]
var combined = combineIntoOneInteger(startingSet)
// 15158958589285958925895292589 perhaps, I have no idea.
var originalIntegers = deconstructInteger(combined, 3)
// [ 13105, 705016, 13 ]
function combineIntoOneInteger(integers) {
// some sort of hashing-like function...
}
function deconstructInteger(integer, arraySize) {
// perhaps pass it some other parameters
// like how many to deconstruct to, or other params.
}
It doesn't need to technically be an "integer". It is just a string using only the integer characters, though perhaps I might want to use the hex characters instead. But I ask in terms of integers because underneath I do have integers of a bounded size that will be used to construct the combined object.
Some other notes....
The combined value should be unique, so no matter what values you combine, you will always get a different result. That is, there are absolutely no conflicts. Or if that's not possible, perhaps an explanation why and a potential workaround.
The mathematical "set" containing all possible outputs can be composed of different amounts of components. That is to say, you might have the output/combined set containing [ 100, 200, 300, 400 ] but the input set is these 4 arrays: [ [ 1, 2, 3 ], [ 5 ], [ 91010, 132 ], [ 500, 600, 700 ] ]. That is, the input arrays can be of wildly different lengths and wildly different sized integers.
One way to accomplish this more generically is to just use a "separator" character, which makes it super easy. So it would be like 13105:705016:13. But this is cheating, I want it to only use the characters in the integer set (or perhaps the hex set, or some other arbitrary set, but for this case just the integer set or hex).
Another idea for a potential way to accomplish this is to somehow hide a separator in there by doing some hashing or permutation jiu jitsu so that [ 13105, 705016, 13 ] becomes some integer-looking thing like 95918155193915183, where 155 and 5 are some separator-like interpolator values based on the preceding input or some other tricks. A simpler approach to this would be like saying "anything following three zeroes 000, like 410001414, means it's a new integer". So basically 000 is a separator. But this specifically is ugly and brittle. Maybe it could get more tricky and work though, like "if the value is odd and followed by a multiple of 3 of itself, then it's a separator" sort of thing. But I can see that also having brittle edge cases.
But basically, given a set of integers n (or strings of integer characters), how to convert that into a single integer (or single integer-charactered string), and then convert it back into the original set of integers n.
Sure, there are lots of ways to do this.
To start with, it's only necessary to have a reversible function which combines two values into one. (For it to be reversible, there must be another function which takes the output value and recreates the two input values.)
Let's call the function which combines two values combine and the reverse function separate. Then we have:
separate(combine(a, b)) == [a, b]
for any values a and b. That means that combine(a, b) == combine(c, d)
can only be true if both a == c and b == d; in other words, every pair of inputs produces a different output.
Encoding arbitrary vectors
Once we have that function, we can encode arbitrary-length input vectors. The simplest case is when we know in advance what the length of the vector is. For example, we could define:
combine3 = (a, b, c) => combine(combine(a, b), c)
combine4 = (a, b, c, d) => combine(combine(combine(a, b), c), d)
and so on. To reverse that computation, we only have to repeatedly call separate the correct number of times, each time keeping the second returned value. For example, if we previously had computed:
m = combine4(a, b, c, d)
we could get the four input values back as follows:
c3, d = separate(m)
c2, c = separate(c3)
a, b = separate(c2)
But your question asks for a way to combine an arbitrary number of values. To do that, we just need to do one final combine, which mixes in the number of values. That lets us get the original vector back out: first, we call separate to get the value count back out, and then we call separate enough times to extract each successive input value.
combine_n = v => combine(v.reduce(combine), v.length)
function separate_n(m) {
  let [r, n] = separate(m)
  let a = Array(n)
  for (let i = n - 1; i > 0; --i) [r, a[i]] = separate(r);
  a[0] = r;
  return a;
}
Note that the above two functions do not work on the empty vector, which should code to 0. Adding the correct checks for this case is left as an exercise. Also note the warning towards the bottom of this answer, about integer overflow.
A simple combine function: diagonalization
With that done, let's look at how to implement combine. There are actually many solutions, but one pretty simple one is to use the diagonalization function:
diag(a, b) = (a + b)(a + b + 1) / 2 + a
This basically assigns positions in the infinite square by tracing successive diagonals:
        <-- b -->
   0  1  3  6 10 15 21 ...
^  2  4  7 11 16 22 ...
|  5  8 12 17 23 ...
a  9 13 18 24 ...
| 14 19 25 ...
v 20 26 ...
  27 ...
(In an earlier version of this answer, I had reversed a and b, but this version seems to have slightly more intuitive output values.)
Note that the top row, where a == 0, is exactly the triangular numbers, which is not surprising because the already enumerated positions are the top left triangle of the square.
To reverse the transformation, we start by solving the equation which defines the triangular numbers, m = s(s + 1)/2, which is the same as
0 = s² + s - 2m
whose solution can be found using the standard quadratic formula, resulting in:
s = floor((-1 + sqrt(1 + 8 * m)) / 2)
(s here is the original a+b; that is, the index of the diagonal.)
I should explain the call to floor which snuck in there. s will only be precisely an integer on the top row of the square, where a is 0. But, of course, a will usually not be 0, and m will usually be a little more than the triangular number we're looking for, so when we solve for s, we'll get some fractional value. Floor just discards the fractional part, so the result is the diagonal index.
Now we just have to recover a and b, which is straight-forward:
a = m - combine(0, s)
b = s - a
So we now have the definitions of combine and separate:
let combine = (a, b) => (a + b) * (a + b + 1) / 2 + a
function separate(m) {
  let s = Math.floor((-1 + Math.sqrt(1 + 8 * m)) / 2);
  let a = m - combine(0, s);
  let b = s - a;
  return [a, b];
}
One cool feature of this particular encoding is that every non-negative integer corresponds to a distinct vector. Many other encoding schemes do not have this property; for them, the possible return values of combine_n are only a proper subset of the non-negative integers.
Example encodings
For reference, here are the first 30 encoded values, and the vectors they represent:
> for (let i = 1; i <= 30; ++i) console.log(i, separate_n(i));
1 [ 0 ]
2 [ 1 ]
3 [ 0, 0 ]
4 [ 1 ]
5 [ 2 ]
6 [ 0, 0, 0 ]
7 [ 0, 1 ]
8 [ 2 ]
9 [ 3 ]
10 [ 0, 0, 0, 0 ]
11 [ 0, 0, 1 ]
12 [ 1, 0 ]
13 [ 3 ]
14 [ 4 ]
15 [ 0, 0, 0, 0, 0 ]
16 [ 0, 0, 0, 1 ]
17 [ 0, 1, 0 ]
18 [ 0, 2 ]
19 [ 4 ]
20 [ 5 ]
21 [ 0, 0, 0, 0, 0, 0 ]
22 [ 0, 0, 0, 0, 1 ]
23 [ 0, 0, 1, 0 ]
24 [ 0, 0, 2 ]
25 [ 1, 1 ]
26 [ 5 ]
27 [ 6 ]
28 [ 0, 0, 0, 0, 0, 0, 0 ]
29 [ 0, 0, 0, 0, 0, 1 ]
30 [ 0, 0, 0, 1, 0 ]
Warning!
Observe that all of the unencoded values are pretty small. The encoded value is similar in size to the concatenation of all the input values, and so it grows pretty rapidly; you have to be careful not to exceed JavaScript's limit on exact integer computation. Once the encoded value exceeds this limit (2^53) it will no longer be possible to reverse the encoding. If your input vectors are long and/or the encoded values are large, you'll need to find some kind of bignum support in order to do precise integer computations.
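As one way around the overflow problem, here is a sketch of the same diagonal encoding in Python, whose integers are arbitrary precision. It assumes non-negative integer inputs and a non-empty vector, and it swaps the floating-point square root for `math.isqrt` so the inverse stays exact for large values; the function names mirror the JavaScript ones above.

```python
from functools import reduce
from math import isqrt


def combine(a, b):
    """Diagonal pairing of two non-negative integers."""
    s = a + b
    return s * (s + 1) // 2 + a


def separate(m):
    """Inverse of combine, using an exact integer square root."""
    s = (isqrt(8 * m + 1) - 1) // 2  # index of the diagonal
    a = m - s * (s + 1) // 2
    return a, s - a


def combine_n(values):
    """Encode a non-empty list; the length is mixed in last."""
    return combine(reduce(combine, values), len(values))


def separate_n(m):
    """Decode a value produced by combine_n."""
    r, n = separate(m)
    out = [0] * n
    for i in range(n - 1, 0, -1):
        r, out[i] = separate(r)
    out[0] = r
    return out


encoded = combine_n([13105, 705016, 13])
print(encoded, separate_n(encoded))
```

The encoded value for the three integers from the question is already far beyond 2^53, which is why the exact integer arithmetic matters here.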
Alternative combine functions
Another possible implementation of combine is:
let combine = (a, b) => 2**a * 3**b
In fact, using powers of primes, we could dispense with the combine_n sequence, and just produce the combination directly:
combine(a, b, c, d, e,...) = 2^a 3^b 5^c 7^d 11^e...
(That assumes that the encoded values are strictly positive; if they could be 0, we'd have no way of knowing how long the sequence was because the encoded value does not distinguish between a vector and the same vector with a 0 appended. But that's not a big issue, because if we needed to deal with 0s, we would just add one to all used exponents:
combine(a, b, c, d, e,...) = 2^(a+1) 3^(b+1) 5^(c+1) 7^(d+1) 11^(e+1)...)
That is certainly correct and it's very elegant in a theoretical sense. It's the solution which you will find in theoretical CS textbooks because it is much easier to prove uniqueness and reversibility. However, in the real world it is really not practical. Reversing the combination depends on finding the prime factors of the encoded value, and the encoded values are truly enormous, well out of the range of easily representable numbers.
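For illustration only, the prime-power scheme with the "add one to every exponent" adjustment might look like this in Python. The names `primes`, `encode_primes` and `decode_primes` are invented for the sketch, and the prime generator is a deliberately naive trial-division one.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, 7, ... by naive trial division."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def encode_primes(values):
    """Product of p_i ** (v_i + 1) over the first len(values) primes."""
    result = 1
    for p, v in zip(primes(), values):
        result *= p ** (v + 1)
    return result


def decode_primes(m):
    """Recover the exponents (minus one) by dividing out successive primes."""
    values = []
    for p in primes():
        if m == 1:
            break
        exponent = 0
        while m % p == 0:
            m //= p
            exponent += 1
        if exponent == 0:
            raise ValueError("not a valid encoding")
        values.append(exponent - 1)
    return values
```

Even a short vector shows the impracticality: encoding the first value of the question's example alone means computing 2 to the power 13106.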
Another possibility is precisely the one you mention in the question: simply put a separator between successive values. One simple way to do this is to rewrite the values to encode in base 9 (or base 15) and then increment all the digit values, so that the digit 0 is not present in any encoded value. Then we can put 0s between the encoded values and read the result in base 10 (or base 16).
Neither of these solutions has the property that every non-negative integer is the encoding of some vector. (The second one almost has that property, and it's a useful exercise to figure out which integers are not possible encodings, and then fix the encoding algorithm to avoid that problem.)
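The separator idea can be sketched as follows, again in Python and under the assumption of a non-empty list of non-negative integers. `encode_sep`, `decode_sep` and the two digit helpers are hypothetical names; the scheme is the base-9 variant described above, with every digit incremented so that 0 is free to act as the separator.

```python
def to_shifted_base9(n):
    """Write n in base 9, then add 1 to every digit (digits become 1..9)."""
    digits = []
    while True:
        n, d = divmod(n, 9)
        digits.append(str(d + 1))
        if n == 0:
            break
    return "".join(reversed(digits))


def from_shifted_base9(chunk):
    """Undo to_shifted_base9 for one zero-free digit string."""
    n = 0
    for ch in chunk:
        n = n * 9 + (int(ch) - 1)
    return n


def encode_sep(values):
    """Join the shifted base-9 chunks with '0' and read the result in base 10."""
    return int("0".join(to_shifted_base9(v) for v in values))


def decode_sep(m):
    """Split the decimal digits on '0' and decode each chunk."""
    return [from_shifted_base9(chunk) for chunk in str(m).split("0")]
```

The encoded number has roughly as many digits as the inputs written out with separators, so it grows far more slowly than the prime-power encoding.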

Sage: Polynomial ring over finite field - inverting a polynomial non-prime

I'm trying to recreate the wiki's example procedure, available here:
https://en.wikipedia.org/wiki/NTRUEncrypt
I've run into an issue while attempting to invert the polynomials.
The SAGE code below seems to be working fine for the given p=3, which is a prime number.
However, the representation of the polynomial in the field generated by q=32 ends up wrong, because it behaves as if the modulus was 2.
Here's the code in play:
F = PolynomialRing(GF(32),'a')
a = F.gen()
Ring = F.quotient(a^11 - 1, 'x')
x = Ring.gen()
pollist = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
fq = Ring(pollist)
print(fq)
print(fq^(-1))
The Ring is described as follows:
Univariate Quotient Polynomial Ring in x over Finite Field in z5 of size 2^5 with modulus a^11 + 1
And the result:
x^10 + x^9 + x^6 + x^4 + x^2 + x + 1
x^5 + x + 1
I've tried to replace the Finite Field with IntegerModRing(32), but the inversion ends up demanding a field, as implied by the message:
NotImplementedError: The base ring (=Ring of integers modulo 32) is not a field
Any suggestions as to how I could obtain the correct inverse of f (mod q) would be greatly appreciated.
GF(32) is the finite field with 32 elements, not the integers modulo 32. You must use Zmod(32) (or IntegerModRing(32), as you suggested) instead.
As you point out, Sage psychotically bans you from computing inverses in ℤ/32ℤ[a]/(a¹¹-1) because that is not a field, and not even a factorial ring. It can, however, compute those inverses when they exist, only you must ask more kindly:
sage: F.<a> = Zmod(32)[]
sage: fq = F([-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1])
sage: print(fq)
31*a^10 + a^9 + a^6 + 31*a^4 + a^2 + a + 31
sage: print(fq.inverse_mod(a^11 - 1))
16*a^8 + 4*a^7 + 10*a^5 + 28*a^4 + 9*a^3 + 13*a^2 + 21*a + 1
Not ideal, admittedly.
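Whatever tool produces the inverse, it can be checked independently by multiplying it with f in ℤ/32ℤ[a]/(a¹¹-1) and confirming that the product is 1. The Python sketch below does that with a plain cyclic convolution. The list `fq_wikipedia` holds the coefficients of f_q as I recall them from the Wikipedia example (an assumption worth re-checking against the article), and `fq_printed` is the polynomial printed in the session above; the helper names are made up.

```python
N, Q = 11, 32  # parameters of the Wikipedia NTRUEncrypt example


def cyclic_mul(f, g, n=N, q=Q):
    """Multiply coefficient lists (lowest degree first) modulo a^n - 1 and q."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = (i + j) % n  # a^n wraps around to 1
            out[k] = (out[k] + fi * gj) % q
    return out


def is_inverse(f, g, n=N, q=Q):
    """True when f * g is the constant polynomial 1 in the quotient ring."""
    return cyclic_mul(f, g, n, q) == [1] + [0] * (n - 1)


f = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]

# f_q as listed in the Wikipedia example (recalled; assumption).
fq_wikipedia = [5, 9, 6, 16, 4, 15, 16, 22, 20, 18, 30]
# Polynomial printed by inverse_mod in the session above.
fq_printed = [1, 21, 13, 9, 28, 10, 0, 4, 16]

print(is_inverse(f, fq_wikipedia))
print(is_inverse(f, fq_printed))
```

By my arithmetic only the first candidate passes this check, so the printed inverse_mod output is worth verifying this way before relying on it.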

Codility: Passing cars in Lua

I'm currently practicing programming problems and out of interest, I'm trying a few Codility exercises in Lua. I've been stuck on the Passing Cars problem for a while.
Problem:
A non-empty zero-indexed array A consisting of N integers is given. The consecutive elements of array A represent consecutive cars on a road.
Array A contains only 0s and/or 1s:
0 represents a car traveling east,
1 represents a car traveling west.
The goal is to count passing cars. We say that a pair of cars (P, Q), where 0 ≤ P < Q < N, is passing when P is traveling to the east and Q is traveling to the west.
For example, consider array A such that:
A[0] = 0
A[1] = 1
A[2] = 0
A[3] = 1
A[4] = 1
We have five pairs of passing cars: (0, 1), (0, 3), (0, 4), (2, 3), (2, 4).
Write a function:
function solution(A)
that, given a non-empty zero-indexed array A of N integers, returns the number of pairs of passing cars.
The function should return −1 if the number of pairs of passing cars exceeds 1,000,000,000.
For example, given:
A[0] = 0
A[1] = 1
A[2] = 0
A[3] = 1
A[4] = 1
the function should return 5, as explained above.
Assume that:
N is an integer within the range [1..100,000];
each element of array A is an integer that can have one of the following values: 0, 1.
Complexity:
expected worst-case time complexity is O(N);
expected worst-case space complexity is O(1), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
My attempt in Lua keeps failing but I can't seem to find the issue.
local function solution(A)
  local zeroes = 0
  local pairs = 0
  for i = 1, #A do
    if A[i] == 0 then
      zeroes = zeroes + 1
    else
      pairs = pairs + zeroes
      if pairs > 1e9 then
        return -1
      end
    end
  end
  return pairs
end
In terms of time-space complexity constraints, I think it should pass so I can't seem to find the issue. What am I doing wrong? Any advice or tips to make my code more efficient would be appreciated.
FYI: I keep getting a result of 2 when the desired example result is 5.
The problem statement says A is 0-based, so if the table is built with keys 0 to 4 and the loop starts at 1, the first car is skipped and the output is 2 instead of 5. 0-based tables should be avoided in Lua: they go against convention and lead to a lot of off-by-one errors, and for i=1,#A do will not do what you want.
function solution1based(A)
  local zeroes = 0
  local pairs = 0
  for i = 1, #A do
    if A[i] == 0 then
      zeroes = zeroes + 1
    else
      pairs = pairs + zeroes
      if pairs > 1e9 then
        return -1
      end
    end
  end
  return pairs
end
print(solution1based{0, 1, 0, 1, 1}) -- prints 5 as you wanted
function solution0based(A)
  local zeroes = 0
  local pairs = 0
  for i = 0, #A do
    if A[i] == 0 then
      zeroes = zeroes + 1
    else
      pairs = pairs + zeroes
      if pairs > 1e9 then
        return -1
      end
    end
  end
  return pairs
end
print(solution0based{[0]=0, [1]=1, [2]=0, [3]=1, [4]=1}) -- prints 5
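For comparison, here is the same counting logic sketched in Python, where lists are 0-based like the array in the problem statement, so the indexing question does not arise. `count_passing_cars` and its `limit` parameter are names chosen for this sketch.

```python
def count_passing_cars(cars, limit=1_000_000_000):
    """Count (east, west) pairs where the eastbound car comes first.

    Returns -1 as soon as the count exceeds `limit`.
    """
    eastbound_seen = 0
    passing = 0
    for car in cars:
        if car == 0:
            eastbound_seen += 1
        else:
            # This westbound car passes every eastbound car before it.
            passing += eastbound_seen
            if passing > limit:
                return -1
    return passing


print(count_passing_cars([0, 1, 0, 1, 1]))
```

Iterating over the elements directly, rather than over an index range, avoids having to decide where the range starts at all.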

generating series of number 0,3,5,8,10,13,15,18

I want to generate a series of numbers in a loop.
My series should contain numbers like 0,3,5,8,10,13,15,18 and so on.
I tried taking the remainder and adding 2 and 3, but it didn't work out.
Can anyone please help me generate this series?
You can just use an increment which toggles between 3 and 2, e.g.
for (i = 0, inc = 3; i < 1000; i += inc, inc = 5 - inc)
{
    printf("%d\n", i);
}
It looks like the sequence starts at zero and uses alternating increments of 3 and 2. There are several ways of implementing this, but perhaps the simplest one would be iterating in increments of 5 (i.e. 3+2) and printing two numbers: the position and the position plus three.
Here is some pseudocode:
i = 0
REPEAT N times :
    PRINT i
    PRINT i + 3
    i += 5
The iteration i=0 will print 0 and 3
The iteration i=5 will print 5 and 8
The iteration i=10 will print 10 and 13
The iteration i=15 will print 15 and 18
... and so on
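Both ideas translate directly into runnable code. The Python sketch below shows the toggling increment and, as an extra assumption of mine that is not in the answers above, a closed form: the n-th term is 5n/2 rounded up, which in integer arithmetic is (5 * n + 1) // 2. The function names are hypothetical.

```python
def series_toggle(count):
    """First `count` terms, alternating increments of 3 and 2."""
    values, value, step = [], 0, 3
    for _ in range(count):
        values.append(value)
        value += step
        step = 5 - step  # 3 -> 2 -> 3 -> ...
    return values


def series_closed_form(count):
    """Same terms from the closed form ceil(5 * n / 2)."""
    return [(5 * n + 1) // 2 for n in range(count)]


print(series_toggle(8))
print(series_closed_form(8))
```

The closed form is handy when a single term is needed without generating the ones before it.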
I was pulled in with the tag generate-series, which is a powerful PostgreSQL function. This may have been tagged by mistake (?) but it just so happens that there would be an elegant solution:
SELECT ceil(generate_series(0, 1000, 25) / 10.0)::int;
generate_series() returns 0, 25, 50, 75, ... (with integer arguments it produces only integers)
division by 10.0 produces numeric data: 0, 2.5, 5, 7.5, ...
ceil() rounds up to your desired result.
The final cast to integer (::int) is optional.
