I'm looking to generate a table of random values, but want to make sure that none of those values are repeated within the table.
So my basic table generation looks like this:
numbers = {}
for i = 1, 5 do
  table.insert(numbers, math.random(20))
end
So that will work in populating a table with 5 random values between 1 and 20. However, making sure none of those values repeat is where I'm stuck.
One approach would be to shuffle an array of numbers and then take the first n numbers. The wrong way to go about shuffling an array is to maintain a list of previously generated random numbers, checking each newly generated random number against that list before adding it to the final array. Such a solution is O(n^2) in time complexity when the check iterates over the array; this will be painful for large arrays, or for small arrays when many must be created. Lua tables are really hash tables with constant-time lookup, so you could avoid the linear scan by recording used values as keys, except that many random numbers may still need to be tried before a suitable one (one that has not already been used) is found. This can be a real problem near the end of an array of many random numbers: when you want 1000 random numbers from 1000 candidates and have filled all but the last slot, how many random tries (and how many iterations over the 999 numbers already selected) will it take to find the only number (42, of course) that is still available?
The right way to go about shuffling is to use a shuffling algorithm. The Fisher-Yates shuffle is a common solution to this problem. The idea is that you start at one end of an array and swap each element with a randomly chosen element at the same or a later position, until the entire array has been shuffled. This solution is O(n) in time complexity, and thus much less wasteful of computational resources.
Here is an implementation in Lua:
function shuffle (arr)
  for i = 1, #arr - 1 do
    local j = math.random(i, #arr)
    arr[i], arr[j] = arr[j], arr[i]
  end
end
Testing in the REPL:
> t = { 1, 2, 3, 4, 5, 6 }
> table.inspect(t)
1 = 1
2 = 2
3 = 3
4 = 4
5 = 5
6 = 6
> shuffle(t)
> table.inspect(t)
1 = 4
2 = 5
3 = 1
4 = 6
5 = 2
6 = 3
This can easily be extended to create lists of random numbers:
function shuffled_numbers (n)
  local numbers = {}
  for i = 1, n do
    numbers[i] = i
  end
  shuffle(numbers)
  return numbers
end
REPL interaction:
> s = shuffled_numbers(10)
> table.inspect(s)
1 = 9
2 = 5
3 = 3
4 = 4
5 = 7
6 = 6
7 = 2
8 = 10
9 = 8
10 = 1
If you want to see what is happening during the shuffle, add a print statement in the shuffle function:
function shuffle (arr)
  for i = 1, #arr - 1 do
    local j = math.random(i, #arr)
    print(string.format("%d (%d) <--> %d (select %d)", i, arr[i], j, arr[j]))
    arr[i], arr[j] = arr[j], arr[i]
  end
end
Now you can see the swaps as they occur if you recall that in the above implementation of shuffled_numbers the array { 1, 2, ..., n } is the starting point of the shuffle. Note that sometimes a number is swapped with itself, which is to say that the number in the current unselected position is a valid choice, too. Also note that the last number is automatically the correct selection, since it is the only number that has not yet been randomly selected:
> s = shuffled_numbers(10)
1 (1) <--> 5 (select 5)
2 (2) <--> 10 (select 10)
3 (3) <--> 5 (select 1)
4 (4) <--> 9 (select 9)
5 (3) <--> 8 (select 8)
6 (6) <--> 9 (select 4)
7 (7) <--> 8 (select 3)
8 (7) <--> 10 (select 2)
9 (6) <--> 9 (select 6)
> table.inspect(s)
1 = 5
2 = 10
3 = 1
4 = 9
5 = 8
6 = 4
7 = 3
8 = 2
9 = 6
10 = 7
Obtaining a selection of 5 random numbers between 1 and 20 is easy enough to accomplish using the shuffle function; one of the virtues of this approach is that the shuffling operation has been abstracted to an O(n) procedure which can shuffle any array, numeric or otherwise. The function that calls shuffle is responsible for supplying the input and returning the results.
A simple solution for more flexibility in the range of random numbers returned:
-- Take the first N numbers from a shuffled range [A, B].
function shuffled_range_take (n, a, b)
  local numbers = {}
  for i = a, b do
    -- Store at 1-based positions so that #numbers and shuffle
    -- also work when a is not 1.
    numbers[i - a + 1] = i
  end
  shuffle(numbers)
  return { table.unpack(numbers, 1, n) }
  -- table.unpack won't work when n is very large (too many results to unpack).
  -- You could instead use this for arbitrarily large n:
  -- local take = {}
  -- for i = 1, n do
  --   take[i] = numbers[i]
  -- end
  -- return take
end
REPL interaction creating a table containing 5 random values between 1 and 20:
> s = shuffled_range_take(5, 1, 20)
> table.inspect(s)
1 = 1
2 = 10
3 = 4
4 = 8
5 = 20
However, there is a disadvantage to the shuffle method in some circumstances. When the number of elements needed is small compared with the number of available elements, the above solution must shuffle a large array to obtain comparatively few random elements. The shuffle is O(n) in the number of elements available, while the memoization method is roughly O(n) in the number of elements chosen. A memoization method like that of @AlexanderMashin performs poorly when the goal is to create an array of 20 random numbers between 1 and 20, because the final numbers may need to be drawn many times before suitable ones are found. But when only 5 random numbers between 1 and 20 are needed, this problem with duplicate choices is less of an issue. In that case memoization seems to perform better than the shuffle, up to about 10 numbers needed from 20 available. When more than 10 numbers are needed from 20, the shuffle begins to perform better. This break-even point is different for larger numbers of elements to choose from; for 1000 available elements, parity is reached at about 700 chosen. When performance is critical, testing is the only way to determine the best solution.
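The cost of the duplicate draws can be estimated with a little arithmetic. The following is a rough sketch in Python (not Lua), assuming every draw is uniform and independent; the function name expected_draws is only illustrative. Once k distinct values have been chosen out of `available`, a fresh draw is new with probability (available - k) / available, so the next value takes available / (available - k) draws on average.

```python
def expected_draws(needed, available):
    """Average number of random draws the retry method makes
    to collect `needed` distinct values out of `available`."""
    if needed > available:
        raise ValueError("cannot pick more distinct values than are available")
    # After k distinct values are chosen, a draw is new with
    # probability (available - k) / available.
    return sum(available / (available - k) for k in range(needed))


print(round(expected_draws(5, 20), 2))   # 5 of 20: close to 5 draws
print(round(expected_draws(20, 20), 2))  # 20 of 20: far more than 20 draws
```

This counts random draws only, not the cost of each duplicate check, so it hints at where the retry method starts to hurt but does not replace timing the real code.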
numbers = {}
local i = 1
while i <= 5 do
  local n = 0                    -- how many times rand already occurs in numbers
  local rand = math.random(20)
  for x = 1, #numbers do
    if numbers[x] == rand then
      n = n + 1
    end
  end
  if n == 0 then                 -- not seen before: keep it and move on
    table.insert(numbers, rand)
    i = i + 1
  end
end
The method I used for this process was to use a for loop to scan each of the elements in the table and increase the variable n if one of them was equal to the random value drawn. If n is different from 0, the value is not inserted in the table and the variable i is not incremented (I had to use a while loop to be able to control i).
If you want to print each of the elements in the table to check the values, you can use this:
for i = 1, #numbers do
  print(numbers[i])
end
I suggest an alternative method based on the fact that it is easy to make sets in Lua: they are just tables with true values.
-- needed is how many random numbers in the table are needed,
-- maximum is the largest value that may be drawn (values are in 1..maximum).
local function fill_table( needed, maximum )
  -- Without this check the repeat loop below would never end.
  assert( needed <= maximum, "needed must not exceed maximum" )
  math.randomseed ( os.time () ) -- reseed the random number generator
  local numbers = {}
  local used = {} -- which numbers are already used
  for i = 1, needed do
    local random
    repeat
      random = math.random( maximum )
    until not used[random]
    used[random] = true
    numbers[i] = random
  end
  return numbers
end
Make a table holding the 20 values (using a for/do/end loop), then run the following as many times as you need:
rand_number=table.remove(tablename, math.random(1,#tablename))
This way rand_number never holds the same value twice. I use this as a simulation for a "Lottozahlengenerator" (German for lottery number generator) or for playing random video/music clips where duplicates are unwanted.
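A minimal sketch of this draw-and-remove idea, written in Python rather than Lua so that it runs standalone. The names draw_unique and pool are illustrative and not part of the answer above; `pool.pop(index)` plays the role of `table.remove`.

```python
import random


def draw_unique(count, maximum):
    """Draw `count` distinct values from 1..maximum by removing each
    drawn value from a pool, so it cannot be drawn again."""
    if count > maximum:
        raise ValueError("cannot draw more values than the pool holds")
    pool = list(range(1, maximum + 1))      # the table with `maximum` entries
    drawn = []
    for _ in range(count):
        index = random.randrange(len(pool)) # random position in the shrinking pool
        drawn.append(pool.pop(index))       # remove it and keep the value
    return drawn


print(draw_unique(5, 20))
```

Like `table.remove`, `pop` from the middle of a list shifts the later elements, so each draw costs time proportional to the pool size; for a pool of 20 values that is negligible.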
I need to iterate through a 1D array and add all of the elements together to find the total. I must use a PERFORM ... VARYING statement; this is what I have come up with so far.
perform 100-read-input-file
varying emp-rec-calls(ws-emp-total)
from 1 by ws-emp-total
until (ws-eof-flag = 'Y'
OR ws-array-counter > ws-array-max)
add emp-rec-calls(ws-emp-total) to ws-total-temp
The code for 100-read-input-file is simply
read input-file at end move 'y' to found-eof.
The problem I am currently getting is "Subscript out of range:" on this line: "perform 100-read-input-file". All help is appreciated, thanks!
Let's analyze the code you provided:
perform 100-read-input-file
varying emp-rec-calls(ws-emp-total)
from 1 by ws-emp-total
until (ws-eof-flag = 'Y'
OR ws-array-counter > ws-array-max)
add emp-rec-calls(ws-emp-total) to ws-total-temp
This loop doesn't really make any sense. You are saying: perform this loop, varying occurrence X of the array EMP-REC-CALLS from 1 by X, until a flag that never gets set within the loop is equal to 'Y' OR a counter you are not incrementing is greater than the array size.
I think you are trying to achieve something like this:
PERFORM VARYING WS-ARRAY-COUNTER
        FROM 1 BY 1
        UNTIL WS-ARRAY-COUNTER > WS-ARRAY-MAX
    ADD EMP-REC-CALLS(WS-ARRAY-COUNTER) TO WS-TOTAL-TEMP
END-PERFORM
This will vary the counter WS-ARRAY-COUNTER by 1 every iteration of the loop (starting at 1) until that counter is greater than the max defined.
I have a dataset that has three variables which indicate a category of event at three time points (dispatch, beginning, end). I want to establish the number of cases where (a) the category is the same for all three time points (b) those which have changed at time point 2 (beginning) and (c) those which have changed at time point 3 (end).
Can anyone recommend some syntax or a starting point?
To measure a change (non-equivalence) against T0 (time zero, or in your case dispatch), wouldn't you simply check for equivalence between the respective variables?
DATA LIST FREE /ID T0 T1 T2.
BEGIN DATA.
1 1 1 1.
2 1 1 0.
3 1 0 1.
4 0 1 1.
5 1 0 0.
6 0 1 0.
7 0 0 1.
8 0 0 0.
END DATA.
COMPUTE ChangeT1=T0<>T1.
COMPUTE ChangeT2=T0<>T2.
To check that all the values are the same across all three variables, you could just use the following (this also works if you have string variables; with numeric variables you could do it differently, for example by looking at the standard deviation across the three):
COMPUTE CheckNoChange=T0=T1 & T0=T2.
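To get the three counts the question asks for, the flags only need to be totalled. The following is a small sketch in Python (not SPSS syntax) over the eight example rows above; the variable names are illustrative, and "changed" means different from T0, as in ChangeT1 and ChangeT2.

```python
# Example rows from the DATA LIST above: (ID, T0, T1, T2).
rows = [
    (1, 1, 1, 1),
    (2, 1, 1, 0),
    (3, 1, 0, 1),
    (4, 0, 1, 1),
    (5, 1, 0, 0),
    (6, 0, 1, 0),
    (7, 0, 0, 1),
    (8, 0, 0, 0),
]

# (a) same category at all three time points
no_change = sum(1 for _, t0, t1, t2 in rows if t0 == t1 and t0 == t2)
# (b) category at time point 2 differs from dispatch
changed_t1 = sum(1 for _, t0, t1, _t2 in rows if t0 != t1)
# (c) category at time point 3 differs from dispatch
changed_t2 = sum(1 for _, t0, _t1, t2 in rows if t0 != t2)

print(no_change, changed_t1, changed_t2)  # prints: 2 4 4
```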
I've created a function that resizes an array and sets new entries to 0, but can also decrease the size of the array in 2 different ways:
1. Simply setting the n property to the new size (this is why the length operator cannot be used).
2. Setting all values after the new size to nil up to 2*size to force a rehash.
local function resize(array, elements, free)
  local size = array.n
  if elements < size then -- Decrease Size
    array.n = elements
    if free then
      size = math.max(size, #array) -- In case of multiple resizes
      local base = elements + 1
      for idx = base, 2*size do -- Force a rehash -> free extra unneeded memory
        array[idx] = nil
      end
    end
  elseif elements > size then -- Increase Size
    array.n = elements
    for idx = size + 1, elements do
      array[idx] = 0
    end
  end
end
How I tested it:
local mem = {n=0};
resize(mem, 50000)
print(mem.n, #mem) -- 50000 50000
print(collectgarbage("count")) -- relatively large number
resize(mem, 10000, true)
print(mem.n, #mem) -- 10000 10000
print(collectgarbage("count")) -- smaller number
resize(mem, 20, true)
print(mem.n, #mem) -- 20 20
print(collectgarbage("count")) -- same number as above, but it should be a smaller number
However when I don't pass true as the third argument to the second call of resize (so it doesn't force a rehash on the second call), the third call does end up rehashing it.
Am I missing something? I'm expecting the third one to also rehash after the second one has.
Here is a clearer picture of what the table usually looks like before and after the resizes:
table: 0x15bd3d0 n: 0 #: 0 narr: 0 nrec: 1
table: 0x15bd3d0 n: 50000 #: 50000 narr: 65536 nrec: 1
table: 0x15bd3d0 n: 10000 #: 10000 narr: 16384 nrec: 2
table: 0x15bd3d0 n: 20 #: 20 narr: 16384 nrec: 2
And here is what happens:
During the resize to 50000 elements, the table is rehashed several times, and at the end it contains exactly one hash part slot for the n field and enough array part slots for the integer keys.
During the shrinking to 10000 elements, you first assign nil to the integer keys 10001 to 65536, and then from 65537 to 100000. The first group of assignments will never cause a rehash, because you assign to existing fields. This has to do with the guarantees for the next function. The second group of assignments will cause rehashes, but since you are assigning nils, Lua will realize at some point that the array part of the table is more than half empty (see the comment at the beginning of ltable.c). Lua will then shrink the array part to a reasonable size and use a second hash slot for the new key. But since you are assigning nils, that second hash slot is never occupied, and Lua is free to re-use it for all the remaining assignments (and it often but not always does). You wouldn't notice a rehash at this point anyway, because you will always end up with the 16384 array slots and 2 hash slots (one for n, one for the new element to be assigned).
The shrinking to 20 elements just continues this way, with the exception that a second hash slot is already available. So you might never get a rehash (and the array size stays larger than necessary), but if you do (Lua for some reason doesn't like the one free hash slot), you'll see the number of array slots drop to a reasonable level.
This is what it looks like when you do get a rehash during the second shrinking:
table: 0x11c43d0 n: 0 #: 0 narr: 0 nrec: 1
table: 0x11c43d0 n: 50000 #: 50000 narr: 65536 nrec: 1
table: 0x11c43d0 n: 10000 #: 10000 narr: 16384 nrec: 2
table: 0x11c43d0 n: 20 #: 20 narr: 32 nrec: 2
If you want to repeat my experiments, the git HEAD version of lua-getsize (original version here) now also returns the number of slots in the array/hash parts of a table.
I'm trying to read the MemberRef coded index (MemberRefParent). Before I do that I need to know its size. According to ECMA-335 Section II.24.2.6, if I understood it correctly, the size of the coded index is calculated like so:
my pseudo code
m = max_rows(t0..tn-1); // number of rows of the table that has the most rows
if (m < 2^(16 - log(n))) { // log(n): number of tag bits, i.e. log base 2 of n, rounded up
    // size is 2
} else {
    // size is 4
}
When I tested the code on a CLI file, I got an error. I must have missed something; I hope someone can help me find where I went wrong.
From section II.24.2.6, ECMA-335
If e is a coded index that points into table ti out of n possible
tables t0, …tn-1, then it is stored as e << (log n) | tag{ t0, …tn-1}[
ti] using 2 bytes if the maximum number of rows of tables t0, …tn-1,
is less than 2^(16 – (log n)), and using 4 bytes otherwise.
-
MemberRefParent: 3 bits to encode tag
Table      Tag
TypeDef    0
TypeRef    1
ModuleRef  2
MethodDef  3
TypeSpec   4
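Read with the 3 tag bits listed for MemberRefParent, the quoted rule puts the 2-byte limit at 2^(16 - 3) = 8192 rows. A minimal sketch of that reading in Python follows; the function names, the MEMBER_REF_PARENT list and the row counts in the usage lines are illustrative assumptions, not taken from the specification or from any real file.

```python
# Tag values for MemberRefParent, as listed above (tag = list position).
MEMBER_REF_PARENT = ["TypeDef", "TypeRef", "ModuleRef", "MethodDef", "TypeSpec"]
MEMBER_REF_PARENT_TAG_BITS = 3


def coded_index_size(row_counts, tag_bits):
    """Bytes used to store the coded index: 2 if the largest of the
    referenced tables has fewer than 2^(16 - tag_bits) rows, else 4."""
    return 2 if max(row_counts) < (1 << (16 - tag_bits)) else 4


def decode_coded_index(value, tag_bits):
    """Split a stored value (row << tag_bits | tag) into (tag, row)."""
    tag = value & ((1 << tag_bits) - 1)
    row = value >> tag_bits
    return tag, row


# Hypothetical row counts for the five tables, in tag order.
print(coded_index_size([120, 300, 1, 950, 40], MEMBER_REF_PARENT_TAG_BITS))
tag, row = decode_coded_index((5 << 3) | 1, MEMBER_REF_PARENT_TAG_BITS)
print(MEMBER_REF_PARENT[tag], row)
```

The number of tag bits is passed in explicitly rather than derived from the table count, so that a mistake in computing log n cannot silently change the size.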