Simulate parallelism in Lua - lua

I am trying to use Lua to prototype some parallel algorithms. By this I mean writing code in pure Lua, testing it, debugging it, etc. Then, when I am confident that it works, I can translate it to a true multithreaded library, or even to another language (e.g. an OpenCL kernel). Obviously I am not concerned in any way with the performance of the prototype code.
I thought of using coroutines that yield at each line, with some boilerplate to randomly select the next "thread" to run. For example:
local function parallel_simulation(...)
local function_list = {...}
local coroutine_list = {}
local thread_number = #function_list
for i = 1, thread_number do
coroutine_list[i] = coroutine.create(function_list[i])
end
while 0 < thread_number do
local current = math.random(1, thread_number)
local worker = coroutine_list[current]
coroutine.resume(worker)
if 'dead' == coroutine.status(worker) then
thread_number = thread_number - 1
table.remove(coroutine_list, current)
end
end
end
----------------------------------------------------------
-- Usage example
local Y = coroutine.yield
local max = 3
local counter = 0
local retry = 99
local function increment()
Y() local c = counter
Y() while max > c do
Y() c = counter
Y() c = c + 1
Y() counter = c
Y() end
end
for i=1,retry do
counter = 0
parallel_simulation(increment, increment)
if max ~= counter then
print('Test SUCCESS ! A non-thread-safe algorithm was identified .', i, counter)
return
end
end
error('Test FAIL ! The non-thread-safe algorithm was not identified .')
This is just an idea; any solution involving pure Lua is welcome! What makes me very uncomfortable with this solution is all those Y() calls. Is there any way to avoid them? (debug.sethook does not allow yielding...)
EDIT 1 - More meaningful example was provided
EDIT 2 - Hopefully, I clarified what I am trying to accomplish

A simple alternative to putting Y() in front of each line is to use gsub and load:
Y = coroutine.yield
max = 3
counter = 0
code = [[
function increment()
local c = counter
while max > c do
c = counter
c = c + 1
counter = c
end
end]]
code = code:gsub("\n ", "\n Y() ") -- replace two spaces in find/replace with whatever tab character(s) you use
assert(load(code))()
local retry = 99
-- rest of code here
(Use load or loadstring depending on your Lua version)
Note that the variable declarations Y/max/counter must be global or else the loaded function will not have access to them. Similarly, the function in code must be global or else increment will not exist outside the loaded code.
Such a solution assumes that all instructions on each line are atomic/thread-safe, of course.
An improvement I would recommend making to parallel_simulation is to add some way of changing how the next thread is chosen. For example, perhaps the error will only be revealed if one of the threads is early in its execution while another one is almost done. That state could theoretically be reached through sufficient random trials, but an argument that lets you adjust which threads are more likely to be chosen next (e.g. using weights) should make it much more likely.
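One way the weighting idea could look is sketched below. This is only an illustration: weighted_simulation, pick_weighted and the weight_of callback are hypothetical names that do not appear in the question, and weight_of(id, steps) is assumed to return a non-negative number for the thread with that id after it has been resumed steps times.

```lua
-- Sketch only: pick_weighted, weighted_simulation and weight_of are
-- hypothetical names, not part of the question's code.

-- Pick an index with probability proportional to its weight.
function pick_weighted(weights)
  local total = 0
  for _, w in ipairs(weights) do total = total + w end
  local r = math.random() * total -- uniform in [0, total)
  for i, w in ipairs(weights) do
    r = r - w
    if r < 0 then return i end
  end
  return #weights -- reached only if every weight is 0, or through rounding
end

-- Like parallel_simulation, but the next thread is drawn using weights.
function weighted_simulation(weight_of, ...)
  local workers = {}
  for i, f in ipairs({...}) do
    workers[i] = { co = coroutine.create(f), id = i, steps = 0 }
  end
  while #workers > 0 do
    local weights = {}
    for i, w in ipairs(workers) do
      weights[i] = weight_of(w.id, w.steps)
    end
    local current = pick_weighted(weights)
    local worker = workers[current]
    assert(coroutine.resume(worker.co)) -- re-raise errors from the thread
    worker.steps = worker.steps + 1
    if coroutine.status(worker.co) == 'dead' then
      table.remove(workers, current)
    end
  end
end
```

For instance, a weight_of that returns a large value for thread 1 until it has taken many steps would tend to let thread 1 run almost to completion before thread 2 starts, which is the kind of interleaving described above.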

Related

lua in a table to get another element of the same table

When creating an element in the table, I need to use another element that I created before in the same table. please help me with this.
local table = {
distance = 30.0,
last_distance = table.distance-10.0
}
I want to do the above operation but I can't, I think I need to use self or setmetatable but I don't know how to do it. and please don't give me answers like first create a value outside and then use it in the table, I don't want to do that.
Basic life advice
First of all: Don't call your table table. That will shadow the global table library. Call it t, tab, tabl, Table, table_, or actually give it a useful name, but don't call it table, or there'll be a big surprise when you try to access any table.* methods. Ideally, your linter should warn you about this.
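A small sketch of both effects, assuming nothing beyond standard Lua scoping rules (the function names try_self_reference and shadowed_insert are made up for the illustration): inside the constructor the name table still refers to the global library, and after the declaration the local hides that library.

```lua
-- Sketch: both function names are hypothetical.

-- The question's snippet, wrapped in pcall so the error can be inspected.
function try_self_reference()
  return pcall(function()
    local table = {
      distance = 30.0,
      -- The local `table` is not in scope yet, so this `table` is still the
      -- global library, and table.distance is nil.
      last_distance = table.distance - 10.0,
    }
    return table
  end)
end

-- After the declaration, the local shadows the library.
function shadowed_insert()
  local table = { distance = 30.0 }
  return table.insert -- nil: this is a field lookup on the local table
end

local ok, err = try_self_reference()
print(ok, err)             -- ok is false; err describes arithmetic on a nil value
print(shadowed_insert())   -- nil
```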
Implementing it using hacks
Table constructors are equivalent to creating a table on the stack - there is no named local variable self or the like. There may, however, be a hidden local variable accessible using debug.getlocal:
$ lua
Lua 5.4.4 Copyright (C) 1994-2022 Lua.org, PUC-Rio
> function getlocals()
>> local i = 1; repeat local k, v = debug.getlocal(2, i); i = i + 1; print(k, v) until not k
>> t = {a = getlocals(), b = ()}
stdin:3: unexpected symbol near ')'
> function getlocals()
local i = 1; repeat local k, v = debug.getlocal(2, i); i = i + 1; print(k, v) until not k end
> t = {a = getlocals(), b = 2}
(temporary) table: 0x55e9181302d0
nil nil
> t
table: 0x55e9181302d0
Indeed, from basic testing it appears that this is even the first local inside the table constructor! However, it isn't quite as easy:
> local a = 1; local b = a; t = {a = getlocals(), b = 2}; print(b)
a 1
b 1
(temporary) table: 0x55e918130160
nil nil
1
Using extensive hacks, you might be able to write something that returns the currently constructed table most of the time (probably relying on the fact that it will usually be the last local). The following works:
function lastlocal()
local i = 0
local last
::next:: -- you could (and perhaps should) use a loop instead
i = i + 1
local k, v = debug.getlocal(2, i)
if v then
last = v
goto next
end
return last
end
from my basic testing, this works fine to obtain the table currently being constructed:
> function lastlocal()
local i = 0
local last
::next:: -- you could (and perhaps should) use a loop instead
i = i + 1
local k, v = debug.getlocal(2, i)
if v then
last = v
goto next
end
return last
end
> t = {a = 1, b = lastlocal().a}
> t.a
1
> t.b
1
Why you should not implement this using hacks
With all of this in mind: Don't ever do this. The only purpose of the above is to take the idea ad absurdum. There are multiple reasons why this is horribly unreliable:
The order of execution of table constructor assignments is undefined. An optimizing interpreter like LuaJIT (and the PUC Lua implementation just as well) is free to reorder {a = 1, b = 2} to {b = 2, a = 1}.
Likewise, how table constructors are implemented internally is entirely undefined. There is no guarantee that the local variable actually exists and is the last one.
It is horribly inefficient and relies on the debug library for something other than debugging.
What's a metatable?
Metatables serve an entirely different purpose; you could dynamically generate derived fields like last_distance using them, but you can't use them to reference a table using a table constructor. Here's a basic example:
local t = {distance = 30}
setmetatable(t, {__index = function(self, k)
if k ~= "last_distance" then return nil end
return t.distance - 10 -- calculate `last_distance` & return it
end})
print(t.last_distance) -- 20
t.distance = 10
print(t.last_distance) -- 0
Back to the question
When creating an element in the table, I need to use another element that I created before in the same table.
The proper way to do this is to either (1) create a value outside of the table
local distance = 30
local last_distance = distance - 10
local tab = {distance = distance, last_distance = last_distance}
Perfectly readable, perfectly fine.
Or (2) first create a table with some properties, then add derived properties:
local tab = {distance = 30}
tab.last_distance = tab.distance - 10
as readable, as fine.
Both will be highly efficient; only micro-optimizations would be debatable (could (1) choose a better layout for the hashes by choosing the right insertion order? does it pre-allocate the right size (likely yes)? does (2) incur a penalty since it indexes tab to obtain tab.distance?), but none of this will likely ever matter.
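If the concern is keeping the derivation next to the table, approach (2) can be wrapped in a small constructor function. This is just a sketch; new_distances is a hypothetical name, and the offset of 10 is taken from the question.

```lua
-- Sketch: new_distances is a hypothetical helper name.
function new_distances(distance)
  local tab = { distance = distance }
  tab.last_distance = tab.distance - 10 -- derived from the field set above
  return tab
end

local tab = new_distances(30.0)
print(tab.distance, tab.last_distance)
```

The call site then stays a single expression, while the table is still built in two ordinary steps inside the function.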
I want to do the above operation but I can't, I think I need to use self or setmetatable but I don't know how to do it.
I have shown you:
How you can do it using egregious hacks and why you shouldn't.
How you can do something similar (derived attributes) using a metamethod.
and please don't give me answers like first create a value outside and then use it in the table, I don't want to do that.
This is the correct, idiomatic way to do this in Lua though. Your restriction seems arbitrary.

I can't figure out how to write a simple algorithm to get the sum of two numbers

I am trying to find a solution to the problem "Two Sum", if you recognize it, and I've run into a problem I cannot figure out (Lua).
Code:
num = {2,7,11,15}
target = 9
current = 0
repeat
createNum1 = tonumber(num[math.random(1,#num)])
createNum2 = tonumber(num[math.random(1,#num)])
current = createNum1 + createNum2
until current == target
print(table.find(num,createNum1), table.find(num,createNum2))
Error:
lua5.3: HelloWorld.lua:9: attempt to call a nil value (field 'find')
stack traceback:
HelloWorld.lua:9: in main chunk
[C]: in ?
Thank you!
Lua has no table.find function in its very small standard library; just take a look at the reference manual.
You could implement your own table.find function, but that would just be monkey-patching an overall broken algorithm. There is no need to use a probabilistic algorithm that probably runs in at least quadratic time if there only is one pair of numbers that adds up to the desired number. Instead, you should leverage Lua's tables - associative arrays - here. First build an index of [number] = last index:
local num = {2,7,11,15}
local target = 9
local idx = {}
for i, n in ipairs(num) do idx[n] = i end
then loop over the numbers; given a number m you just need to look for target - m in your idx lookup:
for i, n in ipairs(num) do local j = idx[target - n]; if j then print(i, j) break end end
if you want to exit early - sometimes without building the full idx table - you can fuse the two loops:
local idx = {}
for i, n in ipairs(num) do
local j = idx[target - n]
if j then
print(j, i)
break
end
idx[n] = i
end
other solutions exist (e.g. using sorting, which requires no auxiliary space), but this one is elegant in that it runs in O(n) time & O(n) space to produce a solution and leverages Lua's builtin data structures.
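For reuse, the fused loop can be wrapped in a function that returns the two indices instead of printing them. This is a sketch; the name two_sum and the choice to return nil when no pair exists are assumptions, not part of the original answer.

```lua
-- Sketch: two_sum is a hypothetical wrapper around the fused loop.
-- Returns the indices j < i with nums[j] + nums[i] == target, or nil.
function two_sum(nums, target)
  local seen = {} -- value -> index of an earlier element
  for i, n in ipairs(nums) do
    local j = seen[target - n]
    if j then
      return j, i
    end
    seen[n] = i
  end
  return nil
end

print(two_sum({2, 7, 11, 15}, 9)) --> 1	2
```

Because each number is only looked up among the elements before it, an element is never paired with itself, so {3, 3} with target 6 yields 1 and 2.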

How can I remove specific items from a table in Lua? [duplicate]

This question is similar to How can I safely iterate a lua table while keys are being removed but distinctly different.
Summary
Given a Lua array (table with keys that are sequential integers starting at 1), what's the best way to iterate through this array and delete some of the entries as they are seen?
Real World Example
I have an array of timestamped entries in a Lua array table. Entries are always added to the end of the array (using table.insert).
local timestampedEvents = {}
function addEvent( data )
table.insert( timestampedEvents, {getCurrentTime(),data} )
end
I need to occasionally run through this table (in order) and process-and-remove certain entries:
function processEventsBefore( timestamp )
for i,stamp in ipairs( timestampedEvents ) do
if stamp[1] <= timestamp then
processEventData( stamp[2] )
table.remove( timestampedEvents, i )
end
end
end
Unfortunately, the approach above breaks iteration, skipping over some entries. Is there any better (less typing, but still safe) way to do this than manually walking the indices:
function processEventsBefore( timestamp )
local i = 1
while i <= #timestampedEvents do -- warning: do not cache the table length
local stamp = timestampedEvents[i]
if stamp[1] <= timestamp then
processEventData( stamp[2] )
table.remove( timestampedEvents, i )
else
i = i + 1
end
end
end
the general case of iterating over an array and removing random items from the middle while continuing to iterate
If you're iterating front-to-back, when you remove element N, the next element in your iteration (N+1) gets shifted down into that position. If you increment your iteration variable (as ipairs does), you'll skip that element. There are two ways we can deal with this.
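To see the skipping concretely, here is a minimal sketch (remove_during_ipairs is a made-up name for the illustration) that removes during a front-to-back ipairs loop and leaves one unwanted element behind:

```lua
-- Sketch: remove_during_ipairs is a hypothetical name; it shows the bug,
-- it is not a recommended way to remove elements.
function remove_during_ipairs(t, unwanted)
  for i, v in ipairs(t) do
    if v == unwanted then
      table.remove(t, i) -- everything after i moves down one slot
    end
  end
  return t
end

local t = remove_during_ipairs({ 'a', 'x', 'x', 'b' }, 'x')
print(table.concat(t, ',')) --> a,x,b  (the second 'x' was skipped)
```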
Using this sample data:
input = { 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p' }
remove = { f=true, g=true, j=true, n=true, o=true, p=true }
We can remove input elements during iteration by:
Iterating from back to front.
for i=#input,1,-1 do
if remove[input[i]] then
table.remove(input, i)
end
end
Controlling the loop variable manually, so we can skip incrementing it when removing an element:
local i=1
while i <= #input do
if remove[input[i]] then
table.remove(input, i)
else
i = i + 1
end
end
For non-array tables, you iterate using next or pairs (which is implemented in terms of next) and set items you want removed to nil.
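A minimal sketch of that case follows; remove_if and should_remove are hypothetical names. Assigning nil to an existing field during a pairs traversal is allowed, whereas adding new keys during the traversal is not.

```lua
-- Sketch: remove_if and should_remove are hypothetical names.
function remove_if(t, should_remove)
  for k, v in pairs(t) do
    if should_remove(k, v) then
      t[k] = nil -- clearing an existing field is safe while traversing
    end
  end
  return t
end

local t = remove_if({ a = 1, b = 2, c = 3 }, function(k, v) return v % 2 == 1 end)
print(t.a, t.b, t.c) --> nil	2	nil
```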
Note that table.remove shifts all following elements every time it's called, so the total cost is quadratic for N removals. If you're removing a lot of elements, you should shift the items yourself as in LHF or Mitch's answer.
Efficiency!
WARNING: Do NOT use table.remove(). That function causes all of the subsequent (following) array indices to be re-indexed every time you call it to remove an array entry. It is therefore MUCH faster to just "compact/re-index" the table in a SINGLE passthrough OURSELVES instead!
The best technique is simple: Count upwards (i) through all array entries, while keeping track of the position we should put the next "kept" value into (j). Anything that's not kept (or which is moved from i to j) is set to nil which tells Lua that we've erased that value.
I'm sharing this, since I really don't like the other answers on this page (as of Oct 2018). They're either wrong, bug-ridden, overly simplistic or overly complicated, and most are ultra-slow. So I implemented an efficient, clean, super-fast one-pass algorithm instead. With a SINGLE loop.
Here's a fully commented example (there's a shorter, non-tutorial version at the end of this post):
function ArrayShow(t)
for i=1,#t do
print('total:'..#t, 'i:'..i, 'v:'..t[i]);
end
end
function ArrayRemove(t, fnKeep)
print('before:');
ArrayShow(t);
print('---');
local j, n = 1, #t;
for i=1,n do
print('i:'..i, 'j:'..j);
if (fnKeep(t, i, j)) then
if (i ~= j) then
print('keeping:'..i, 'moving to:'..j);
-- Keep i's value, move it to j's pos.
t[j] = t[i];
t[i] = nil;
else
-- Keep i's value, already at j's pos.
print('keeping:'..i, 'already at:'..j);
end
j = j + 1;
else
t[i] = nil;
end
end
print('---');
print('after:');
ArrayShow(t);
return t;
end
local t = {
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'
};
ArrayRemove(t, function(t, i, j)
-- Return true to keep the value, or false to discard it.
local v = t[i];
return (v == 'a' or v == 'b' or v == 'f' or v == 'h');
end);
Output, showing its logic along the way, how it's moving things around, etc...
before:
total:9 i:1 v:a
total:9 i:2 v:b
total:9 i:3 v:c
total:9 i:4 v:d
total:9 i:5 v:e
total:9 i:6 v:f
total:9 i:7 v:g
total:9 i:8 v:h
total:9 i:9 v:i
---
i:1 j:1
keeping:1 already at:1
i:2 j:2
keeping:2 already at:2
i:3 j:3
i:4 j:3
i:5 j:3
i:6 j:3
keeping:6 moving to:3
i:7 j:4
i:8 j:4
keeping:8 moving to:4
i:9 j:5
---
after:
total:4 i:1 v:a
total:4 i:2 v:b
total:4 i:3 v:f
total:4 i:4 v:h
Finally, here's the function for use in your own code, without all of the tutorial-printing... and with just a few minimal comments to explain the final algorithm:
function ArrayRemove(t, fnKeep)
local j, n = 1, #t;
for i=1,n do
if (fnKeep(t, i, j)) then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
t[j] = t[i];
t[i] = nil;
end
j = j + 1; -- Increment position of where we'll place the next kept value.
else
t[i] = nil;
end
end
return t;
end
That's it!
And if you don't want to use the whole "re-usable callback/function" design, you can simply copy the inner code of ArrayRemove() into your project, and change the line if (fnKeep(t, i, j)) then to if (t[i] ~= 'deleteme') then (the condition must be true for the values you want to keep)... That way you get rid of the function call/callback overhead too, and speed things up even more!
Personally, I use the re-usable callback system, since it still massively beats table.remove() by factors of 100-1000+ times faster.
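As a sketch of what the inlined variant might look like (remove_marked is a hypothetical name, and 'deleteme' is the marker value used in the text), note that the test has to be true for values that are kept:

```lua
-- Sketch: remove_marked is a hypothetical name for ArrayRemove with the
-- callback replaced by a direct comparison.
function remove_marked(t)
  local j, n = 1, #t
  for i = 1, n do
    if t[i] ~= 'deleteme' then -- true means "keep this value"
      if i ~= j then
        t[j] = t[i] -- move the kept value down to the next free slot
        t[i] = nil
      end
      j = j + 1
    else
      t[i] = nil
    end
  end
  return t
end

local t = remove_marked({ 'a', 'deleteme', 'b', 'deleteme' })
print(#t, t[1], t[2]) --> 2	a	b
```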
Bonus (Advanced Users): Regular users can skip reading this bonus section. It describes how to sync multiple related tables. Note that the 3rd parameter to fnKeep(t, i, j), the j, is a bonus parameter which allows your keep-function to know what index the value will be stored at whenever fnKeep answers true (to keep that value).
Example usage: Let's say you have two "linked" tables, where one is table['Mitch'] = 1; table['Rick'] = 2; (a hash-table for quick array index lookups via named strings) and the other is array[{Mitch Data...}, {Rick Data...}] (an array with numerical indices, where Mitch's data is at pos 1 and Rick's data is at pos 2, exactly as described in the hash-table). Now you decide to loop through the array and remove Mitch Data, which thereby moves Rick Data from position 2 to position 1 instead...
Your fnKeep(t, i, j) function can then easily use the j info to update the hash-table pointers to ensure they always point at the correct array offsets:
local hData = {['Mitch'] = 1, ['Rick'] = 2};
local aData = {
{['name'] = 'Mitch', ['age'] = 33}, -- [1]
{['name'] = 'Rick', ['age'] = 45}, -- [2]
};
ArrayRemove(aData, function(t, i, j)
local v = t[i];
if (v['name'] == 'Rick') then -- Keep "Rick".
if (i ~= j) then -- i and j differing means its data offset will be moved if kept.
hData[v['name']] = j; -- Point Rick's hash table entry at its new array location.
end
return true; -- Keep.
else
hData[v['name']] = nil; -- Delete this name from the lookup hash-table.
return false; -- Remove from array.
end
end);
Thereby removing 'Mitch' from both the lookup hash-table and the array, and moving the 'Rick' hash-table entry to point to 1 (that's the value of j) where its array data is being moved to (since i and j differed, meaning the data was being moved).
This kind of algorithm allows your related tables to stay in perfect sync, always pointing at the correct data position thanks to the j parameter.
It's just an advanced bonus for those who need that feature. Most people can simply ignore the j parameter in their fnKeep() functions!
Well, that's all, folks!
Enjoy! :-)
Benchmarks (aka "Let's have a good laugh...")
I decided to benchmark this algorithm against the standard "loop backwards and use table.remove()" method which 99.9% of all Lua users are using.
To do this test, I used the following test.lua file: https://pastebin.com/aCAdNXVh
Each algorithm being tested is given 10 test-arrays, containing 2 million items per array (a total of 20 million items per algorithm-test). The items in all arrays are identical (to ensure total fairness in testing): Every 5th item is the number "13" (which will be deleted), and all other items are the number "100" (which will be kept).
Well... my ArrayRemove() algorithm's test concluded in 2.8 seconds (to process the 20 million items). I'm now waiting for the table.remove() test to finish... It's been a few minutes so far and I am getting bored........ Update: Still waiting... Update: I am hungry... Update: Hello... today?! Update: Zzz... Update: Still waiting... Update: ............ Update: Okay, the table.remove() code (which is the method that most Lua users are using) is going to take a few days. I'll update the day it finishes.
Note to self: I began running the test at ~04:55 GMT on November 1st, 2018. My ArrayRemove() algorithm finished in 2.8 seconds... The built-in Lua table.remove() algorithm is still running as of now... I'll update this post later... ;-)
Update: It is now 14:55 GMT on November 1st, 2018, and the table.remove() algorithm has STILL NOT FINISHED. I'm going to abort that part of the test, because Lua has been using 100% of my CPU for the past 10 hours, and I need my computer now. And it's hot enough to make coffee on the laptop's aluminum case...
Here's the result:
Processing 10 arrays with 2 million items (20 million items total):
My ArrayRemove() function: 2.8 seconds.
Normal Lua table.remove(): I decided to quit the test after 10 hours of 100% CPU usage by Lua. Because I need to use my laptop now! ;-)
Here's the stack trace when I pressed Ctrl-C... which confirms what Lua function my CPU has been working on for the last 10 hours, haha:
[ mitch] elapsed time: 2.802
^Clua: test.lua:4: interrupted!
stack traceback:
[C]: in function 'table.remove'
test.lua:4: in function 'test_tableremove'
test.lua:43: in function 'time_func'
test.lua:50: in main chunk
[C]: in ?
If I had let the table.remove() test run to its completion, it may take a few days... Anyone who doesn't mind wasting a ton of electricity is welcome to re-run this test (file is above at pastebin) and let us all know how long it took.
Why is table.remove() so insanely slow? Simply because every call to that function has to repeatedly re-index every table item that exists after the one we told it to remove! So to delete the 1st item in a 2 million item array, it must move the indices of ALL other 2 million items down by 1 slot to fill the gap caused by the deletion. And then... when you remove another item.. it has to yet again move ALL other 2 million items... It does this over and over...
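As a rough worked example of that cost, the sketch below counts the element moves needed to empty an n-item array by always removing index 1. The helper name shifts_for_front_removals is made up, and the function only does the arithmetic; it does not measure table.remove itself. The total comes to n(n-1)/2 moves, which is why the time grows much faster than the array size.

```lua
-- Sketch: shifts_for_front_removals is a hypothetical helper that counts
-- moves on paper rather than timing anything.
function shifts_for_front_removals(n)
  local shifts = 0
  for len = n, 1, -1 do
    -- removing index 1 from an array of length len moves len-1 items down
    shifts = shifts + (len - 1)
  end
  return shifts
end

print(shifts_for_front_removals(1000)) --> 499500
```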
You should never, EVER use table.remove()! Its performance penalty grows rapidly. Here's an example with smaller array sizes, to demonstrate this:
10 arrays of 1,000 items (10k items total): ArrayRemove(): 0.001 seconds, table.remove(): 0.018 seconds (18x slower).
10 arrays of 10,000 items (100k items total): ArrayRemove(): 0.014 seconds, table.remove(): 1.573 seconds (112.4x slower).
10 arrays of 100,000 items (1m items total): ArrayRemove(): 0.142 seconds, table.remove(): 3 minutes, 48 seconds (1605.6x slower).
10 arrays of 2,000,000 items (20m items total): ArrayRemove(): 2.802 seconds, table.remove(): I decided to abort the test after 10 hours, so we may never know how long it takes. ;-) But at the current timepoint (not even finished), it's taken 12847.9x longer than ArrayRemove()... But the final table.remove() result, if I had let it finish, would probably be around 30-40 thousand times slower.
As you can see, table.remove()'s growth in time is not linear (because if it was, then our 1 million item test would have only taken 10x as long as the 0.1 million (100k) test, but instead we see 1.573s vs 3m48s!). So we cannot take a lower test (such as 10k items) and simply multiply it to 10 million items to know how long the test that I aborted would have taken... So if anyone is truly curious about the final result, you'll have to run the test yourselves and post a comment after a few days when table.remove() finishes...
But what we can do at this point, with the benchmarks we have so far, is say table.remove() sucks! ;-)
There's no reason to ever call that function. EVER. Because if you want to delete items from a table, just use t['something'] = nil;. If you want to delete items from an array (a table with numeric indices), use ArrayRemove().
By the way, the tests above were all executed using Lua 5.3.4, since that's the standard runtime most people use. I decided to do a quick run of the main "20 million items" test using LuaJIT 2.0.5 (JIT: ON CMOV SSE2 SSE3 SSE4.1 fold cse dce fwd dse narrow loop abc sink fuse), which is a faster runtime than the standard Lua. The result for 20 million items with ArrayRemove() was: 2.802 seconds in Lua, and 0.092 seconds in LuaJIT. Which means that if your code/project runs on LuaJIT, you can expect even faster performance from my algorithm! :-)
I also re-ran the "100k items" test one final time using LuaJIT, so that we can see how table.remove() performs in LuaJIT instead, and to see if it's any better than regular Lua:
[LUAJIT] 10 arrays of 100,000 items (1m items total): ArrayRemove(): 0.005 seconds, table.remove(): 20.783 seconds (4156.6x slower than ArrayRemove()... but this LuaJIT result is actually a WORSE ratio than regular Lua, whose table.remove() was "only" 1605.6x slower than my algorithm for the same test... So if you're using LuaJIT, the performance ratio is even more in favor of my algorithm!)
Lastly, you may wonder "would table.remove() be faster if we only want to delete one item, since it's a native function?". If you use LuaJIT, the answer to that question is: No. In LuaJIT, ArrayRemove() is faster than table.remove() even for removing ONE ITEM. And who isn't using LuaJIT? With LuaJIT, all Lua code speeds up by easily around 30x compared to regular Lua. Here's the result: [mitch] elapsed time (deleting 1 items): 0.008, [table.remove] elapsed time (deleting 1 items): 0.011. Here's the pastebin for the "just delete 1-6 items" test: https://pastebin.com/wfM7cXtU (with full test results listed at the end of the file).
TL;DR: Don't use table.remove() anywhere, for any reason whatsoever!
Hope you enjoy ArrayRemove()... and have fun, everyone! :-)
I'd avoid table.remove and traverse the array once setting the unwanted entries to nil then traverse the array again compacting it if necessary.
Here's the code I have in mind, using the example from Mud's answer:
local input = { 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p' }
local remove = { f=true, g=true, j=true, n=true, o=true, p=true }
local n=#input
for i=1,n do
if remove[input[i]] then
input[i]=nil
end
end
local j=0
for i=1,n do
if input[i]~=nil then
j=j+1
input[j]=input[i]
end
end
for i=j+1,n do
input[i]=nil
end
Try this function:
function ripairs(t)
-- Try not to use break when using this function;
-- it may cause the array to be left with empty slots
local ci = 0
local remove = function()
t[ci] = nil
end
return function(t, i)
--print("I", table.concat(array, ','))
i = i+1
ci = i
local v = t[i]
if v == nil then
local rj = 0
for ri = 1, i-1 do
if t[ri] ~= nil then
rj = rj+1
t[rj] = t[ri]
--print("R", table.concat(array, ','))
end
end
for ri = rj+1, i do
t[ri] = nil
end
return
end
return i, v, remove
end, t, ci
end
It doesn't use table.remove, so it should have O(N) complexity. You could move the remove function into the for-generator to remove the need for an upvalue, but that would mean a new closure for every element... and it isn't a practical issue.
Example usage:
function math.isprime(n)
for i = 2, n^(1/2) do
if (n % i) == 0 then
return false
end
end
return true
end
array = {}
for i = 1, 500 do array[i] = i+10 end
print("S", table.concat(array, ','))
for i, v, remove in ripairs(array) do
if not math.isprime(v) then
remove()
end
end
print("E", table.concat(array, ','))
Be careful not to use break (or otherwise exit prematurely from the loop) as it will leave the array with nil elements.
If you want break to mean "abort" (as in, nothing is removed), you could do this:
function rtipairs(t, skip_marked)
local ci = 0
local tbr = {} -- "to be removed"
local remove = function(i)
tbr[i or ci] = true
end
return function(t, i)
--print("I", table.concat(array, ','))
local v
repeat
i = i+1
v = t[i]
until not v or not (skip_marked and tbr[i])
ci = i
if v == nil then
local rj = 0
for ri = 1, i-1 do
if not tbr[ri] then
rj = rj+1
t[rj] = t[ri]
--print("R", table.concat(array, ','))
end
end
for ri = rj+1, i do
t[ri] = nil
end
return
end
return i, v, remove
end, t, ci
end
This has the advantage of being able to cancel the entire loop with no elements being removed, as well as provide the option to skip over elements already marked as "to be removed". The disadvantage is the overhead of a new table.
I hope these are helpful to you.
You might consider using a priority queue instead of a sorted array.
A priority queue will efficiently compact itself as you remove entries in order.
For an example of a priority queue implementation, see this mailing list thread: http://lua-users.org/lists/lua-l/2007-07/msg00482.html
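For readers who just want the shape of the idea, here is a minimal binary min-heap sketch keyed on the timestamp stored at entry[1], matching the {time, data} entries from the question. It is not the implementation from the linked thread; Heap and process_events_before are hypothetical names.

```lua
-- Sketch: Heap and process_events_before are hypothetical names.
-- Entries are {timestamp, data}; the smallest timestamp is always at index 1.
Heap = {}
Heap.__index = Heap

function Heap.new()
  return setmetatable({ n = 0 }, Heap)
end

function Heap:push(entry)
  local i = self.n + 1
  self.n = i
  self[i] = entry
  while i > 1 do -- sift up until the parent is not larger
    local parent = math.floor(i / 2)
    if self[parent][1] <= self[i][1] then break end
    self[parent], self[i] = self[i], self[parent]
    i = parent
  end
end

function Heap:peek()
  return self[1]
end

function Heap:pop()
  local top = self[1]
  if not top then return nil end
  self[1] = self[self.n] -- move the last entry to the root
  self[self.n] = nil
  self.n = self.n - 1
  local i = 1
  while true do -- sift down until both children are not smaller
    local l, r, smallest = 2 * i, 2 * i + 1, i
    if l <= self.n and self[l][1] < self[smallest][1] then smallest = l end
    if r <= self.n and self[r][1] < self[smallest][1] then smallest = r end
    if smallest == i then break end
    self[i], self[smallest] = self[smallest], self[i]
    i = smallest
  end
  return top
end

-- Process and remove every event whose timestamp is <= the given one.
function process_events_before(heap, timestamp, handle)
  while heap:peek() and heap:peek()[1] <= timestamp do
    handle(heap:pop()[2])
  end
end
```

Each push and pop touches only one path through the heap, so removing the earliest events never shifts the rest of the entries.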
Simple..
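values = {'a', 'b', 'c', 'd', 'e', 'f'}
rem_key = {}
-- remove_value(v) stands for your own test of whether v should be removed
for i,v in ipairs(values) do
  if remove_value(v) then
    table.insert(rem_key, i)
  end
end
-- remove from the highest index down, so earlier removals don't shift the later indices
for k = #rem_key, 1, -1 do
  table.remove(values, rem_key[k])
end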
I recommend against using table.remove, for performance reasons (which may be more or less relevant to your particular case).
Here's what that type of loop generally looks like for me:
local mylist_size = #mylist
local i = 1
while i <= mylist_size do
local value = mylist[i]
if value == 123 then
mylist[i] = mylist[mylist_size]
mylist[mylist_size] = nil
mylist_size = mylist_size - 1
else
i = i + 1
end
end
Note This is fast BUT with two caveats:
It is faster if you need to remove relatively few elements. (It does practically no work for elements that should be kept).
It will leave the array UNSORTED. Sometimes you don't care about having a sorted array, and in that case this is a useful "shortcut".
If you want to preserve the order of the elements, or if you expect to not keep most of the elements, then look into Mitch's solution. Here is a rough comparison between mine and his. I ran it on https://www.lua.org/cgi-bin/demo and most results were similar to this:
[ srekel] elapsed time: 0.020
[ mitch] elapsed time: 0.040
[ srekel] elapsed time: 0.020
[ mitch] elapsed time: 0.040
Of course, remember that it varies depending on your particular data.
Here is the code for the test:
function test_srekel(mylist)
local mylist_size = #mylist
local i = 1
while i <= mylist_size do
local value = mylist[i]
if value == 13 then
mylist[i] = mylist[mylist_size]
mylist[mylist_size] = nil
mylist_size = mylist_size - 1
else
i = i + 1
end
end
end -- func
function test_mitch(mylist)
local j, n = 1, #mylist;
for i=1,n do
local value = mylist[i]
if value ~= 13 then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
mylist[j] = mylist[i];
mylist[i] = nil;
end
j = j + 1; -- Increment position of where we'll place the next kept value.
else
mylist[i] = nil;
end
end
end
function build_tables()
local tables = {}
for i=1, 10 do
tables[i] = {}
for j=1, 100000 do
tables[i][j] = j % 15373
end
end
return tables
end
function time_func(func, name)
local tables = build_tables()
time0 = os.clock()
for i=1, #tables do
func(tables[i])
end
time1 = os.clock()
print(string.format("[%10s] elapsed time: %.3f\n", name, time1 - time0))
end
time_func(test_srekel, "srekel")
time_func(test_mitch, "mitch")
time_func(test_srekel, "srekel")
time_func(test_mitch, "mitch")
You can use a functor to check for elements that need to be removed. The additional gain is that it completes in O(n), because it doesn't use table.remove
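function table.iremove_if(t, f)
  local n = #t
  local j = 0 -- number of items removed so far
  local i = 1 -- position the next kept item is written to
  while i + j <= n do
    local v = t[i + j] -- next unread item
    if f(i + j, v) then
      j = j + 1 -- drop it: the read position moves on, the write position stays
    else
      t[i] = v -- keep it (a no-op while j == 0)
      i = i + 1
    end
  end
  for k = i, n do
    t[k] = nil -- clear the leftover tail
  end
  return j > 0 and j or nil -- The number of deleted items, nil if 0
end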
Usage:
table.iremove_if(myList, function(i,v) return v.name == name end)
In your case:
table.iremove_if(timestampedEvents, function(_,stamp)
if (stamp[1] <= timestamp) then
processEventData(stamp[2])
return true
end
end)
This is basically restating the other solutions in non-functional style; I find this much easier to follow (and harder to get wrong):
for i=#array,1,-1 do
local element=array[i]
local remove = false
-- your code here
if remove then
array[i] = array[#array]
array[#array] = nil
end
end
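Here is the same loop with the placeholder filled in, as a sketch (swap_remove_if and should_remove are hypothetical names). It also makes visible that this approach does not preserve the order of the remaining elements:

```lua
-- Sketch: swap_remove_if and should_remove are hypothetical names.
function swap_remove_if(array, should_remove)
  for i = #array, 1, -1 do
    if should_remove(array[i]) then
      array[i] = array[#array] -- overwrite with the current last element
      array[#array] = nil      -- then drop the last slot
    end
  end
  return array
end

local result = swap_remove_if({ 1, 2, 3, 4, 5, 6 }, function(v) return v % 2 == 0 end)
print(table.concat(result, ',')) --> 1,5,3
```

Iterating from the back matters here: the element swapped into position i always comes from a higher index, which has already been checked and kept.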
It occurs to me that, for my special case where I only ever shift entries from the front of the queue, I can do this far more simply via:
function processEventsBefore( timestamp )
while timestampedEvents[1] and timestampedEvents[1][1] <= timestamp do
processEventData( timestampedEvents[1][2] )
table.remove( timestampedEvents, 1 )
end
end
However, I'll not accept this as the answer because it does not handle the general case of iterating over an array and removing random items from the middle while continuing to iterate.
First, definitely read @MitchMcCabers’s post detailing the evils of table.remove().
Now I’m no Lua whiz, but I tried to combine his approach with @MartinRudat’s, using an assist from an array-detection approach modified from @PiFace’s answer here.
The result, according to my tests, successfully removes an element from either a key-value table or an array.
I hope it’s right, it works for me so far!
--helper function needed for remove(...)
--a table counts as an array if its keys are exactly 1..n with no gaps
function isarray(tableT)
  local count = 0
  for _ in pairs(tableT) do
    count = count + 1
  end
  for i = 1, count do
    if tableT[i] == nil then
      return false
    end
  end
  return count > 0
end
function remove(targetTable, removeMe)
--check if this is an array
if isarray(targetTable) then
--flag for when a table needs to squish in to fill cleared space
local shouldMoveDown = false
--iterate over table in order
for i = 1, #targetTable do
--check if the value is found
if targetTable[i] == removeMe then
--if so, set flag to start collapsing the table to write over it
shouldMoveDown = true
end
--if collapsing needs to happen...
if shouldMoveDown then
--check if we're not at the end
if i ~= #targetTable then
--if not, copy the next value over this one
targetTable[i] = targetTable[i+1]
else
--if so, delete the last value
targetTable[#targetTable] = nil
end
end
end
else
--loop over elements
for k, v in pairs(targetTable) do
--check for thing to remove
if (v == removeMe) then
--if found, nil it
targetTable[k] = nil
break
end
end
end
return targetTable, removeMe;
end
Efficiency! Even more! )
Regarding Mitch's variant. It has some waste assignments to nil, here is optimized version with the same idea:
function ArrayRemove(t, fnKeep)
local j, n = 1, #t;
for i=1,n do
if (fnKeep(t, i, j)) then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
t[j] = t[i];
end
j = j + 1; -- Increment position of where we'll place the next kept value.
end
end
table.move(t,n+1,n+n-j+1,j);
--for i=j,n do t[i]=nil end
return t;
end
And here is even more optimized version with block moving
For larger arrays and larger keeped blocks
function ArrayRemove(t, fnKeep)
local i, j, n = 1, 1, #t;
while i <= n do
if (fnKeep(t, i, j)) then
local k = i
repeat
i = i + 1;
until i>n or not fnKeep(t, i, j+i-k)
--if (k ~= j) then
table.move(t,k,i-1,j);
--end
j = j + i - k;
end
i = i + 1;
end
table.move(t,n+1,n+n-j+1,j);
return t;
end
if (k ~= j) is not needed as it is executed many times but "true" after first remove. I think table.move() handles index checks anyway.table.move(t,n+1,n+n-j+1,j) is equivalent to "for i=j,n do t[i]=nil end".I'm new to lua and don't know if where is efficient value replication function. Here we would replicate nil n-j+1 times.
And regarding table.remove(). I think it should utilize table.move() that moves elements in one operation. Kind of memcpy in C. So maybe it's not so bad afterall.#MitchMcMabers, can you update your benchmarks? Did you use lua >= 5.3?

A unique environment per script in Lua 5.3

I would like to be able to have a chunk of Lua code (a "script") that could be shared among enemy types in a game but where each instance of a script gets a unique execution environment. To illustrate my problem, this is my first attempt at what a script might look like:
time_since_last_shoot = 0
tick = function(entity_id, dt)
time_since_last_shoot = time_since_last_shoot + dt
if time_since_last_shoot > 10 then
enemy = find_closest_enemy(entity_id)
shoot(entity_id, enemy)
time_since_last_shoot = 0
end
end
But that fails since I'd be sharing the global time_since_last_shoot variable among all my enemies. So then I tried this:
spawn = function(entity)
entity.time_since_last_shoot = 0;
end
tick = function(entity, dt)
entity.time_since_last_shoot = entity.time_since_last_shoot + dt
if entity.time_since_last_shoot > 10 then
enemy = find_closest_enemy(entity)
shoot(entity, enemy)
entity.time_since_last_shoot = 0
end
end
For each entity I would create a unique table, pass it as the first argument whenever I call the spawn and tick functions, and somehow map that table back to an id at runtime. That could work, but I have a couple of concerns.
First, it's error prone. A script could still accidentally create global state, which could lead to hard-to-debug problems later in the same script or even in others.
And second, since the spawn and tick functions are themselves global, I'll still run into issues when I go to create a second type of enemy which tries to use the same interface. I suppose I could solve that with some kind of naming convention, but surely there's a better way to handle that.
I did find this question which seems to be asking the same thing, but the accepted answer is light on specifics and refers to a lua_setfenv function that isn't present in Lua 5.3. It seems that it was replaced by _ENV. Unfortunately, I'm not familiar enough with Lua to fully understand and/or translate the concept.
[edit] A third attempt based on the suggestion of @hugomg:
-- baddie.lua
-- baddie_prototype is created by the host (see the C++ below) before this file runs
baddie_prototype.spawn = function(self)
  self.time_since_last_shoot = 0
end
baddie_prototype.tick = function(self, dt)
  self.time_since_last_shoot = self.time_since_last_shoot + dt
  if self.time_since_last_shoot > 10 then
    local enemy = find_closest_enemy(self)
    shoot(self, enemy)
    self.time_since_last_shoot = 0
  end
end
And in C++ (using sol2):
// In game startup
sol::state lua;
sol::table global_entities = lua.create_named_table("global_entities");
// For each type of entity
sol::table baddie_prototype = lua.create_named_table("baddie_prototype");
lua.script_file("baddie.lua");
std::function<void(sol::table, float)> tick = baddie_prototype.get<sol::function>("tick");
// When spawning a new instance of the enemy type
sol::table baddie_instance = global_entities.create("baddie_instance");
baddie_instance["entity_handle"] = new_unique_handle();
// During update
tick(baddie_instance, 0.1f);
This works how I expected and I like the interface but I'm not sure if it follows the path of least surprise for someone who might be more familiar with Lua than I. Namely, my use of the implicit self parameter and my distinction between prototype/instance. Do I have the right idea or have I done something weird?
For your first issue (accidentally creating globals), you can rely on a linter like luacheck or a module that prevents you from creating globals like strict.lua from Penlight.
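As a rough illustration of what a strict-style guard does (a sketch only, not Penlight's actual implementation), the environment below raises an error on any assignment to an undeclared name. The helper name strict_env and the chunk name "demo" are made up for the example, and passing an environment to load assumes Lua 5.2 or later:

```lua
-- Hypothetical helper: an environment that reads from the real globals
-- but refuses to create new ones.
local function strict_env()
  return setmetatable({}, {
    __index = _G,
    __newindex = function(_, key)
      error("assignment to undeclared global '" .. tostring(key) .. "'", 2)
    end,
  })
end

-- The chunk tries to create a global by accident.
local chunk = assert(load("oops = 1", "demo", "t", strict_env()))
local ok, err = pcall(chunk)
print(ok)  --> false
print(err) -- the message names the offending variable
```

A linter catches the same mistake before the script ever runs, so the two approaches complement each other.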
And then, why not just make things local? I mean both time_since_last_shoot and tick. This leverages closures, one of the most useful features of Lua. If you want different tick functions, each with its own variables, you can do something like this:
local function new_tick()
local time_since_last_shoot = 0
return function(entity_id, dt)
time_since_last_shoot = time_since_last_shoot + dt
if time_since_last_shoot > 10 then
local enemy = find_closest_enemy(entity_id)
shoot(entity_id, enemy)
time_since_last_shoot = 0
end
end
end
local tick_1 = new_tick()
local tick_2 = new_tick()
Of course, you could also use the environment for this, but here I think local variables and closure are a better solution to the problem.
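A stripped-down check that each closure really owns its own state (new_timer is an illustrative name, and the game calls are left out so that the snippet runs standalone):

```lua
local function new_timer()
  local elapsed = 0            -- private to the closure returned below
  return function(dt)
    elapsed = elapsed + dt
    return elapsed
  end
end

local timer_1 = new_timer()
local timer_2 = new_timer()
timer_1(5)
timer_1(5)
print(timer_1(0), timer_2(1)) --> 10   1
```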
The way _ENV works in 5.3 is that global variables are syntactic sugar for accessing fields of the _ENV variable. For example, a program that does
local x = 10
y = 20
print(x + y)
is equivalent to
local x = 10
_ENV.y = 20
_ENV.print(x + _ENV.y)
By default, _ENV is the global table, so global variables behave the way you would expect. However, if you create a local variable (or function parameter) named _ENV, then within that variable's scope any free variable name refers to a field of this new environment instead of the usual global table. For example, the following program prints 10:
local _ENV = {
x = 10,
print=print
}
-- the following line is equivalent to
-- _ENV.print(_ENV.x)
print(x)
In your program, one way to use this technique would be to add an extra parameter to your functions for the environment:
tick = function(_ENV, entity, dt)
-- ...
end
Then any global variable used inside the function is really a field of the _ENV parameter rather than a true global.
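If the goal is one environment per script instance rather than per function call, the same mechanism is reachable through the fourth argument of load (loadfile accepts an environment too) in Lua 5.2 and later. The following is only a sketch under assumptions: new_instance and script_source are made-up names, and the script is a string so the example is self-contained.

```lua
-- Script source shared by all instances.
local script_source = [[
time_since_last_shoot = 0
function tick(dt)
  time_since_last_shoot = time_since_last_shoot + dt
  return time_since_last_shoot
end
]]

-- Hypothetical helper: run the chunk once per instance, each time with a
-- fresh environment. Reads fall back to the real globals; writes stay in env.
local function new_instance(source)
  local env = setmetatable({}, { __index = _G })
  local chunk = assert(load(source, "script", "t", env))
  chunk()
  return env
end

local a = new_instance(script_source)
local b = new_instance(script_source)
a.tick(4)
a.tick(3)
b.tick(1)
print(a.time_since_last_shoot, b.time_since_last_shoot) --> 7   1
```

The script keeps its "global" style, yet nothing it assigns leaks into the real global table or into another instance.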
That said, I'm not sure _ENV is the best tool to solve your problem. For your first problem, accidentally creating globals, a simpler solution would be to use a linter that warns you when you assign to an undeclared global variable. As for the second problem, you could just put the spawn and tick functions in a table instead of having them be global.
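One way the "functions in a table" idea might look, with the table doubling as a prototype for instances through __index. This is a sketch: baddie.new and the stubbed find_closest_enemy and shoot are assumptions made so the example runs on its own, not part of the game's real API.

```lua
-- Hypothetical stand-ins for the game's API.
local shots = {}
local function find_closest_enemy(entity) return "enemy-of-" .. entity.name end
local function shoot(entity, enemy) shots[#shots + 1] = entity.name .. "->" .. enemy end

-- The "script": every function lives in one table, nothing is global.
local baddie = {}
baddie.__index = baddie

function baddie.new(name)
  return setmetatable({ name = name, time_since_last_shoot = 0 }, baddie)
end

function baddie:tick(dt)
  self.time_since_last_shoot = self.time_since_last_shoot + dt
  if self.time_since_last_shoot > 10 then
    shoot(self, find_closest_enemy(self))
    self.time_since_last_shoot = 0
  end
end

local a, b = baddie.new("a"), baddie.new("b")
a:tick(6)
a:tick(6)   -- a crosses the 10 second threshold and fires
b:tick(6)   -- b has its own timer and does not
print(#shots, shots[1]) --> 1   a->enemy-of-a
```

A second enemy type would simply get its own table, so the shared tick name no longer collides.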

Safely remove items from an array table while iterating

This question is similar to "How can I safely iterate a lua table while keys are being removed", but distinctly different.
Summary
Given a Lua array (table with keys that are sequential integers starting at 1), what's the best way to iterate through this array and delete some of the entries as they are seen?
Real World Example
I have an array of timestamped entries in a Lua array table. Entries are always added to the end of the array (using table.insert).
local timestampedEvents = {}
function addEvent( data )
table.insert( timestampedEvents, {getCurrentTime(),data} )
end
I need to occasionally run through this table (in order) and process-and-remove certain entries:
function processEventsBefore( timestamp )
for i,stamp in ipairs( timestampedEvents ) do
if stamp[1] <= timestamp then
processEventData( stamp[2] )
table.remove( timestampedEvents, i )
end
end
end
Unfortunately, the approach above breaks iteration, skipping over some entries. Is there any better (less typing, but still safe) way to do this than manually walking the indices:
function processEventsBefore( timestamp )
local i = 1
while i <= #timestampedEvents do -- warning: do not cache the table length
local stamp = timestampedEvents[i]
if stamp[1] <= timestamp then
processEventData( stamp[2] )
table.remove( timestampedEvents, i )
else
i = i + 1
end
end
end
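A tiny reproduction of the skipping, using made-up data: two adjacent matches are enough to make ipairs miss one, because the second match slides into the slot that was just visited.

```lua
local t = { 1, 2, 2, 3 }
for i, v in ipairs(t) do
  if v == 2 then
    table.remove(t, i)  -- the second 2 shifts into index i and is never examined
  end
end
print(table.concat(t, ",")) --> 1,2,3
```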
"...the general case of iterating over an array and removing random items from the middle while continuing to iterate"
If you're iterating front-to-back, when you remove element N, the next element in your iteration (N+1) gets shifted down into that position. If you increment your iteration variable (as ipairs does), you'll skip that element. There are two ways we can deal with this.
Using this sample data:
input = { 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p' }
remove = { f=true, g=true, j=true, n=true, o=true, p=true }
We can remove input elements during iteration by:
Iterating from back to front.
for i=#input,1,-1 do
if remove[input[i]] then
table.remove(input, i)
end
end
Controlling the loop variable manually, so we can skip incrementing it when removing an element:
local i=1
while i <= #input do
if remove[input[i]] then
table.remove(input, i)
else
i = i + 1
end
end
For non-array tables, you iterate using next or pairs (which is implemented in terms of next) and set items you want removed to nil.
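A minimal sketch of that non-array case, with invented sample data. Assigning nil to an existing field while traversing with pairs is permitted; only adding new keys during the traversal is not.

```lua
local scores = { alice = 3, bob = 0, carol = 7, dave = 0 }
for name, score in pairs(scores) do
  if score == 0 then
    scores[name] = nil  -- clearing an existing field is allowed during traversal
  end
end

local remaining = 0
for _ in pairs(scores) do remaining = remaining + 1 end
print(remaining, scores.bob, scores.dave) --> 2   nil   nil
```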
Note that table.remove shifts all following elements every time it's called, so N removals cost O(N²) element moves in the worst case (quadratic rather than linear). If you're removing a lot of elements, you should shift the items yourself as in LHF's or Mitch's answer.
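To get a feel for how the shifting adds up, here is a back-of-the-envelope model rather than a measurement. It assumes each table.remove(t, i) moves every element after i down by one slot, and it counts the worst case of always removing from the front; the function name is made up for the sketch.

```lua
-- Model only: counts element moves, does not touch a real table.
local function shifts_for_front_removals(n)
  local shifts = 0
  for len = n, 1, -1 do
    shifts = shifts + (len - 1)  -- removing index 1 from an array of length len
  end
  return shifts
end

print(shifts_for_front_removals(1000))  --> 499500
print(shifts_for_front_removals(10000)) --> 49995000
```

Ten times as many elements means roughly a hundred times as many moves, which is the n(n-1)/2 growth you would expect.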
Efficiency!
WARNING: Do NOT use table.remove(). That function causes all of the subsequent (following) array indices to be re-indexed every time you call it to remove an array entry. It is therefore MUCH faster to just "compact/re-index" the table in a SINGLE passthrough OURSELVES instead!
The best technique is simple: Count upwards (i) through all array entries, while keeping track of the position we should put the next "kept" value into (j). Anything that's not kept (or which is moved from i to j) is set to nil which tells Lua that we've erased that value.
I'm sharing this, since I really don't like the other answers on this page (as of Oct 2018). They're either wrong, bug-ridden, overly simplistic or overly complicated, and most are ultra-slow. So I implemented an efficient, clean, super-fast one-pass algorithm instead. With a SINGLE loop.
Here's a fully commented example (there's a shorter, non-tutorial version at the end of this post):
function ArrayShow(t)
for i=1,#t do
print('total:'..#t, 'i:'..i, 'v:'..t[i]);
end
end
function ArrayRemove(t, fnKeep)
print('before:');
ArrayShow(t);
print('---');
local j, n = 1, #t;
for i=1,n do
print('i:'..i, 'j:'..j);
if (fnKeep(t, i, j)) then
if (i ~= j) then
print('keeping:'..i, 'moving to:'..j);
-- Keep i's value, move it to j's pos.
t[j] = t[i];
t[i] = nil;
else
-- Keep i's value, already at j's pos.
print('keeping:'..i, 'already at:'..j);
end
j = j + 1;
else
t[i] = nil;
end
end
print('---');
print('after:');
ArrayShow(t);
return t;
end
local t = {
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'
};
ArrayRemove(t, function(t, i, j)
-- Return true to keep the value, or false to discard it.
local v = t[i];
return (v == 'a' or v == 'b' or v == 'f' or v == 'h');
end);
Output, showing its logic along the way, how it's moving things around, etc...
before:
total:9 i:1 v:a
total:9 i:2 v:b
total:9 i:3 v:c
total:9 i:4 v:d
total:9 i:5 v:e
total:9 i:6 v:f
total:9 i:7 v:g
total:9 i:8 v:h
total:9 i:9 v:i
---
i:1 j:1
keeping:1 already at:1
i:2 j:2
keeping:2 already at:2
i:3 j:3
i:4 j:3
i:5 j:3
i:6 j:3
keeping:6 moving to:3
i:7 j:4
i:8 j:4
keeping:8 moving to:4
i:9 j:5
---
after:
total:4 i:1 v:a
total:4 i:2 v:b
total:4 i:3 v:f
total:4 i:4 v:h
Finally, here's the function for use in your own code, without all of the tutorial-printing... and with just a few minimal comments to explain the final algorithm:
function ArrayRemove(t, fnKeep)
local j, n = 1, #t;
for i=1,n do
if (fnKeep(t, i, j)) then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
t[j] = t[i];
t[i] = nil;
end
j = j + 1; -- Increment position of where we'll place the next kept value.
else
t[i] = nil;
end
end
return t;
end
That's it!
And if you don't want to use the whole "re-usable callback/function" design, you can simply copy the inner code of ArrayRemove() into your project, and change the line if (fnKeep(t, i, j)) then to if (t[i] ~= 'deleteme') then (note the ~=, because that branch is the one that keeps the value)... That way you get rid of the function call/callback overhead too, and speed things up even more!
Personally, I use the re-usable callback system, since it still massively beats table.remove() by factors of 100-1000+ times faster.
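For reference, this is roughly what the inlined, callback-free variant could look like. It is a sketch with invented sample values. Since the branch taken is the "keep" branch, the test has to be t[i] ~= 'deleteme'.

```lua
local t = { 'keep1', 'deleteme', 'keep2', 'deleteme', 'keep3' }

local j, n = 1, #t
for i = 1, n do
  if t[i] ~= 'deleteme' then
    -- Kept value: move it down to slot j unless it is already there.
    if i ~= j then
      t[j] = t[i]
      t[i] = nil
    end
    j = j + 1
  else
    t[i] = nil
  end
end

print(table.concat(t, ',')) --> keep1,keep2,keep3
```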
Bonus (Advanced Users): Regular users can skip reading this bonus section. It describes how to sync multiple related tables. Note that the 3rd parameter to fnKeep(t, i, j), the j, is a bonus parameter which allows your keep-function to know what index the value will be stored at whenever fnKeep answers true (to keep that value).
Example usage: Let's say you have two "linked" tables, where one is table['Mitch'] = 1; table['Rick'] = 2; (a hash-table for quick array index lookups via named strings) and the other is array[{Mitch Data...}, {Rick Data...}] (an array with numerical indices, where Mitch's data is at pos 1 and Rick's data is at pos 2, exactly as described in the hash-table). Now you decide to loop through the array and remove Mitch Data, which thereby moves Rick Data from position 2 to position 1 instead...
Your fnKeep(t, i, j) function can then easily use the j info to update the hash-table pointers to ensure they always point at the correct array offsets:
local hData = {['Mitch'] = 1, ['Rick'] = 2};
local aData = {
{['name'] = 'Mitch', ['age'] = 33}, -- [1]
{['name'] = 'Rick', ['age'] = 45}, -- [2]
};
ArrayRemove(aData, function(t, i, j)
local v = t[i];
if (v['name'] == 'Rick') then -- Keep "Rick".
if (i ~= j) then -- i and j differing means its data offset will be moved if kept.
hData[v['name']] = j; -- Point Rick's hash table entry at its new array location.
end
return true; -- Keep.
else
hData[v['name']] = nil; -- Delete this name from the lookup hash-table.
return false; -- Remove from array.
end
end);
Thereby removing 'Mitch' from both the lookup hash-table and the array, and moving the 'Rick' hash-table entry to point to 1 (that's the value of j) where its array data is being moved to (since i and j differed, meaning the data was being moved).
This kind of algorithm allows your related tables to stay in perfect sync, always pointing at the correct data position thanks to the j parameter.
It's just an advanced bonus for those who need that feature. Most people can simply ignore the j parameter in their fnKeep() functions!
Well, that's all, folks!
Enjoy! :-)
Benchmarks (aka "Let's have a good laugh...")
I decided to benchmark this algorithm against the standard "loop backwards and use table.remove()" method which 99.9% of all Lua users are using.
To do this test, I used the following test.lua file: https://pastebin.com/aCAdNXVh
Each algorithm being tested is given 10 test-arrays, containing 2 million items per array (a total of 20 million items per algorithm-test). The items in all arrays are identical (to ensure total fairness in testing): Every 5th item is the number "13" (which will be deleted), and all other items are the number "100" (which will be kept).
Well... my ArrayRemove() algorithm's test concluded in 2.8 seconds (to process the 20 million items). I'm now waiting for the table.remove() test to finish... It's been a few minutes so far and I am getting bored........ Update: Still waiting... Update: I am hungry... Update: Hello... today?! Update: Zzz... Update: Still waiting... Update: ............ Update: Okay, the table.remove() code (which is the method that most Lua users are using) is going to take a few days. I'll update the day it finishes.
Note to self: I began running the test at ~04:55 GMT on November 1st, 2018. My ArrayRemove() algorithm finished in 2.8 seconds... The built-in Lua table.remove() algorithm is still running as of now... I'll update this post later... ;-)
Update: It is now 14:55 GMT on November 1st, 2018, and the table.remove() algorithm has STILL NOT FINISHED. I'm going to abort that part of the test, because Lua has been using 100% of my CPU for the past 10 hours, and I need my computer now. And it's hot enough to make coffee on the laptop's aluminum case...
Here's the result:
Processing 10 arrays with 2 million items (20 million items total):
My ArrayRemove() function: 2.8 seconds.
Normal Lua table.remove(): I decided to quit the test after 10 hours of 100% CPU usage by Lua. Because I need to use my laptop now! ;-)
Here's the stack trace when I pressed Ctrl-C... which confirms what Lua function my CPU has been working on for the last 10 hours, haha:
[ mitch] elapsed time: 2.802
^Clua: test.lua:4: interrupted!
stack traceback:
[C]: in function 'table.remove'
test.lua:4: in function 'test_tableremove'
test.lua:43: in function 'time_func'
test.lua:50: in main chunk
[C]: in ?
If I had let the table.remove() test run to completion, it might have taken a few days... Anyone who doesn't mind wasting a ton of electricity is welcome to re-run this test (file is above at pastebin) and let us all know how long it took.
Why is table.remove() so insanely slow? Simply because every call to that function has to repeatedly re-index every table item that exists after the one we told it to remove! So to delete the 1st item in a 2 million item array, it must move the indices of ALL other 2 million items down by 1 slot to fill the gap caused by the deletion. And then... when you remove another item.. it has to yet again move ALL other 2 million items... It does this over and over...
You should never, EVER use table.remove()! Its performance penalty grows rapidly. Here's an example with smaller array sizes, to demonstrate this:
10 arrays of 1,000 items (10k items total): ArrayRemove(): 0.001 seconds, table.remove(): 0.018 seconds (18x slower).
10 arrays of 10,000 items (100k items total): ArrayRemove(): 0.014 seconds, table.remove(): 1.573 seconds (112.4x slower).
10 arrays of 100,000 items (1m items total): ArrayRemove(): 0.142 seconds, table.remove(): 3 minutes, 48 seconds (1605.6x slower).
10 arrays of 2,000,000 items (20m items total): ArrayRemove(): 2.802 seconds, table.remove(): I decided to abort the test after 10 hours, so we may never know how long it takes. ;-) But at the current timepoint (not even finished), it's taken 12847.9x longer than ArrayRemove()... But the final table.remove() result, if I had let it finish, would probably be around 30-40 thousand times slower.
As you can see, table.remove()'s growth in time is not linear (because if it was, then our 1 million item test would have only taken 10x as long as the 0.1 million (100k) test, but instead we see 1.573s vs 3m48s!). So we cannot take a lower test (such as 10k items) and simply multiply it to 10 million items to know how long the test that I aborted would have taken... So if anyone is truly curious about the final result, you'll have to run the test yourselves and post a comment after a few days when table.remove() finishes...
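The growth can be read straight off the timings quoted above. The snippet below only does that arithmetic. The final estimate is an assumption on top of it (cost growing with the square of the array size), not a measurement.

```lua
-- table.remove() timings quoted above: { items per array, seconds }
local timings = {
  { 1000,   0.018 },
  { 10000,  1.573 },
  { 100000, 228 },   -- 3 minutes 48 seconds
}

local ratios = {}
for k = 2, #timings do
  ratios[#ratios + 1] = timings[k][2] / timings[k - 1][2]
  print(string.format("%.0fx more items -> %.1fx more time",
    timings[k][1] / timings[k - 1][1], ratios[#ratios]))
end
--> 10x more items -> 87.4x more time
--> 10x more items -> 144.9x more time

-- Assumption: quadratic growth from the 100,000-item run to 2,000,000 items.
local estimate = 228 * (2000000 / 100000) ^ 2
print(string.format("%.1f hours", estimate / 3600)) --> 25.3 hours
```

Each tenfold increase in size costs on the order of a hundred times more, which is what a quadratic algorithm predicts. Under that assumption the aborted run would have needed about a day, in the same ballpark as the "30-40 thousand times slower" guess.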
But what we can do at this point, with the benchmarks we have so far, is say table.remove() sucks! ;-)
There's no reason to ever call that function. EVER. Because if you want to delete items from a table, just use t['something'] = nil;. If you want to delete items from an array (a table with numeric indices), use ArrayRemove().
By the way, the tests above were all executed using Lua 5.3.4, since that's the standard runtime most people use. I decided to do a quick run of the main "20 million items" test using LuaJIT 2.0.5 (JIT: ON CMOV SSE2 SSE3 SSE4.1 fold cse dce fwd dse narrow loop abc sink fuse), which is a faster runtime than the standard Lua. The result for 20 million items with ArrayRemove() was: 2.802 seconds in Lua, and 0.092 seconds in LuaJIT. Which means that if your code/project runs on LuaJIT, you can expect even faster performance from my algorithm! :-)
I also re-ran the "100k items" test one final time using LuaJIT, so that we can see how table.remove() performs in LuaJIT instead, and to see if it's any better than regular Lua:
[LUAJIT] 10 arrays of 100,000 items (1m items total): ArrayRemove(): 0.005 seconds, table.remove(): 20.783 seconds (4156.6x slower than ArrayRemove()... but this LuaJIT result is actually a WORSE ratio than regular Lua, whose table.remove() was "only" 1605.6x slower than my algorithm for the same test... So if you're using LuaJIT, the performance ratio is even more in favor of my algorithm!)
Lastly, you may wonder "would table.remove() be faster if we only want to delete one item, since it's a native function?". If you use LuaJIT, the answer to that question is: No. In LuaJIT, ArrayRemove() is faster than table.remove() even for removing ONE ITEM. And who isn't using LuaJIT? With LuaJIT, all Lua code speeds up by easily around 30x compared to regular Lua. Here's the result: [mitch] elapsed time (deleting 1 items): 0.008, [table.remove] elapsed time (deleting 1 items): 0.011. Here's the pastebin for the "just delete 1-6 items" test: https://pastebin.com/wfM7cXtU (with full test results listed at the end of the file).
TL;DR: Don't use table.remove() anywhere, for any reason whatsoever!
Hope you enjoy ArrayRemove()... and have fun, everyone! :-)
I'd avoid table.remove: traverse the array once, setting the unwanted entries to nil, then traverse it again, compacting it if necessary.
Here's the code I have in mind, using the example from Mud's answer:
local input = { 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p' }
local remove = { f=true, g=true, j=true, n=true, o=true, p=true }
local n=#input
for i=1,n do
if remove[input[i]] then
input[i]=nil
end
end
local j=0
for i=1,n do
if input[i]~=nil then
j=j+1
input[j]=input[i]
end
end
for i=j+1,n do
input[i]=nil
end
Try this function:
function ripairs(t)
-- Try not to use break when using this function;
-- it may cause the array to be left with empty slots
local ci = 0
local remove = function()
t[ci] = nil
end
return function(t, i)
--print("I", table.concat(array, ','))
i = i+1
ci = i
local v = t[i]
if v == nil then
local rj = 0
for ri = 1, i-1 do
if t[ri] ~= nil then
rj = rj+1
t[rj] = t[ri]
--print("R", table.concat(array, ','))
end
end
for ri = rj+1, i do
t[ri] = nil
end
return
end
return i, v, remove
end, t, ci
end
It doesn't use table.remove, so it should have O(N) complexity. You could move the remove function into the for-generator to remove the need for an upvalue, but that would mean a new closure for every element... and it isn't a practical issue.
Example usage:
function math.isprime(n)
for i = 2, n^(1/2) do
if (n % i) == 0 then
return false
end
end
return true
end
array = {}
for i = 1, 500 do array[i] = i+10 end
print("S", table.concat(array, ','))
for i, v, remove in ripairs(array) do
if not math.isprime(v) then
remove()
end
end
print("E", table.concat(array, ','))
Be careful not to use break (or otherwise exit prematurely from the loop) as it will leave the array with nil elements.
If you want break to mean "abort" (as in, nothing is removed), you could do this:
function rtipairs(t, skip_marked)
local ci = 0
local tbr = {} -- "to be removed"
local remove = function(i)
tbr[i or ci] = true
end
return function(t, i)
--print("I", table.concat(array, ','))
local v
repeat
i = i+1
v = t[i]
until not v or not (skip_marked and tbr[i])
ci = i
if v == nil then
local rj = 0
for ri = 1, i-1 do
if not tbr[ri] then
rj = rj+1
t[rj] = t[ri]
--print("R", table.concat(array, ','))
end
end
for ri = rj+1, i do
t[ri] = nil
end
return
end
return i, v, remove
end, t, ci
end
This has the advantage of being able to cancel the entire loop with no elements being removed, as well as provide the option to skip over elements already marked as "to be removed". The disadvantage is the overhead of a new table.
I hope these are helpful to you.
You might consider using a priority queue instead of a sorted array.
A priority queue will efficiently compact itself as you remove entries in order.
For an example of a priority queue implementation, see this mailing list thread: http://lua-users.org/lists/lua-l/2007-07/msg00482.html
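For readers who just want the shape of the idea, here is a minimal binary min-heap keyed by timestamp. It is a sketch written for this page, not the implementation from the linked thread, and the names Heap, push, peek and pop are illustrative.

```lua
local Heap = {}
Heap.__index = Heap

function Heap.new()
  return setmetatable({ n = 0 }, Heap)
end

function Heap:push(time, data)
  local i = self.n + 1
  self.n = i
  self[i] = { time, data }
  -- Sift up until the parent is not later than the new entry.
  while i > 1 do
    local parent = math.floor(i / 2)
    if self[parent][1] <= self[i][1] then break end
    self[parent], self[i] = self[i], self[parent]
    i = parent
  end
end

function Heap:peek()
  return self[1]
end

function Heap:pop()
  local n = self.n
  if n == 0 then return nil end
  local top = self[1]
  self[1] = self[n]
  self[n] = nil
  n = n - 1
  self.n = n
  -- Sift down: swap with the earliest child while one is earlier.
  local i = 1
  while true do
    local left, right, smallest = 2 * i, 2 * i + 1, i
    if left <= n and self[left][1] < self[smallest][1] then smallest = left end
    if right <= n and self[right][1] < self[smallest][1] then smallest = right end
    if smallest == i then break end
    self[smallest], self[i] = self[i], self[smallest]
    i = smallest
  end
  return top
end

-- Usage: process every event stamped at or before time 3.
local q = Heap.new()
q:push(5, 'e'); q:push(1, 'a'); q:push(3, 'c')
local out = {}
while q:peek() and q:peek()[1] <= 3 do
  out[#out + 1] = q:pop()[2]
end
print(table.concat(out, ',')) --> a,c
```

Each push and pop is O(log n), and events no longer need to arrive in timestamp order.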
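Simple..
values = {'a', 'b', 'c', 'd', 'e', 'f'}
rem_key = {}
-- first pass: remember the indices to delete (remove_value is your own test)
for i, v in ipairs(values) do
  if remove_value(v) then
    table.insert(rem_key, i)
  end
end
-- second pass: delete from the highest index down, so that earlier removals
-- do not shift the positions that are still waiting to be removed
for k = #rem_key, 1, -1 do
  table.remove(values, rem_key[k])
end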
I recommend against using table.remove, for performance reasons (which may be more or less relevant to your particular case).
Here's what that type of loop generally looks like for me:
local mylist_size = #mylist
local i = 1
while i <= mylist_size do
local value = mylist[i]
if value == 123 then
mylist[i] = mylist[mylist_size]
mylist[mylist_size] = nil
mylist_size = mylist_size - 1
else
i = i + 1
end
end
Note: This is fast, BUT with two caveats:
It is faster if you need to remove relatively few elements. (It does practically no work for elements that should be kept).
It will leave the array UNSORTED. Sometimes you don't care about having a sorted array, and in that case this is a useful "shortcut".
If you want to preserve the order of the elements, or if you expect to not keep most of the elements, then look into Mitch's solution. Here is a rough comparison between mine and his. I ran it on https://www.lua.org/cgi-bin/demo and most results were similar to this:
[ srekel] elapsed time: 0.020
[ mitch] elapsed time: 0.040
[ srekel] elapsed time: 0.020
[ mitch] elapsed time: 0.040
Of course, remember that it varies depending on your particular data.
Here is the code for the test:
function test_srekel(mylist)
local mylist_size = #mylist
local i = 1
while i <= mylist_size do
local value = mylist[i]
if value == 13 then
mylist[i] = mylist[mylist_size]
mylist[mylist_size] = nil
mylist_size = mylist_size - 1
else
i = i + 1
end
end
end -- func
function test_mitch(mylist)
local j, n = 1, #mylist;
for i=1,n do
local value = mylist[i]
if value ~= 13 then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
mylist[j] = mylist[i];
mylist[i] = nil;
end
j = j + 1; -- Increment position of where we'll place the next kept value.
else
mylist[i] = nil;
end
end
end
function build_tables()
local tables = {}
for i=1, 10 do
tables[i] = {}
for j=1, 100000 do
tables[i][j] = j % 15373
end
end
return tables
end
function time_func(func, name)
local tables = build_tables()
time0 = os.clock()
for i=1, #tables do
func(tables[i])
end
time1 = os.clock()
print(string.format("[%10s] elapsed time: %.3f\n", name, time1 - time0))
end
time_func(test_srekel, "srekel")
time_func(test_mitch, "mitch")
time_func(test_srekel, "srekel")
time_func(test_mitch, "mitch")
You can use a functor to check for elements that need to be removed. The additional gain is that it completes in O(n), because it doesn't use table.remove.
function table.iremove_if(t, f)
  local n = #t      -- cache the length: # is unreliable while the array has holes
  local kept = 0
  for i = 1, n do
    local v = t[i]
    if not f(i, v) then
      -- keep this value: pack it into the next free slot
      kept = kept + 1
      t[kept] = v
    end
  end
  -- clear the leftover tail
  for i = kept + 1, n do
    t[i] = nil
  end
  local removed = n - kept
  return removed > 0 and removed or nil -- The number of deleted items, nil if 0
end
Usage:
table.iremove_if(myList, function(i,v) return v.name == name end)
In your case:
table.iremove_if(timestampedEvents, function(_,stamp)
if (stamp[1] <= timestamp) then
processEventData(stamp[2])
return true
end
end)
This is basically restating the other solutions in non-functional style; I find this much easier to follow (and harder to get wrong):
for i=#array,1,-1 do
local element=array[i]
local remove = false
-- your code here
if remove then
array[i] = array[#array]
array[#array] = nil
end
end
It occurs to me that—for my special case, where I only ever shift entries from the front of the queue—I can do this far more simply via:
function processEventsBefore( timestamp )
while timestampedEvents[1] and timestampedEvents[1][1] <= timestamp do
processEventData( timestampedEvents[1][2] )
table.remove( timestampedEvents, 1 )
end
end
However, I'll not accept this as the answer because it does not handle the general case of iterating over an array and removing random items from the middle while continuing to iterate.
First, definitely read @MitchMcCabers’s post detailing the evils of table.remove().
Now I’m no Lua whiz, but I tried to combine his approach with @MartinRudat’s, using an assist from an array-detection approach modified from @PiFace’s answer here.
The result, according to my tests, successfully removes an element from either a key-value table or an array.
I hope it’s right, it works for me so far!
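--helper function needed for remove(...)
--a table counts as an array when its keys are exactly the integers 1..n
function isarray(tableT)
  local count, maxKey = 0, 0
  for k in pairs(tableT) do
    --any non-integer or non-positive key means this is not a plain array
    if type(k) ~= "number" or k < 1 or k % 1 ~= 0 then
      return false
    end
    count = count + 1
    if k > maxKey then maxKey = k end
  end
  --no gaps: the largest key equals the number of keys
  return count > 0 and maxKey == count
end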
function remove(targetTable, removeMe)
--check if this is an array
if isarray(targetTable) then
--flag for when a table needs to squish in to fill cleared space
local shouldMoveDown = false
--iterate over table in order
for i = 1, #targetTable do
--check if the value is found
if targetTable[i] == removeMe then
--if so, set flag to start collapsing the table to write over it
shouldMoveDown = true
end
--if collapsing needs to happen...
if shouldMoveDown then
--check if we're not at the end
if i ~= #targetTable then
--if not, copy the next value over this one
targetTable[i] = targetTable[i+1]
else
--if so, delete the last value
targetTable[#targetTable] = nil
end
end
end
else
--loop over elements
for k, v in pairs(targetTable) do
--check for thing to remove
if (v == removeMe) then
--if found, nil it
targetTable[k] = nil
break
end
end
end
return targetTable, removeMe;
end
Efficiency! Even more! :)
Regarding Mitch's variant: it has some wasted assignments to nil. Here is an optimized version of the same idea (table.move requires Lua 5.3 or later):
function ArrayRemove(t, fnKeep)
local j, n = 1, #t;
for i=1,n do
if (fnKeep(t, i, j)) then
-- Move i's kept value to j's position, if it's not already there.
if (i ~= j) then
t[j] = t[i];
end
j = j + 1; -- Increment position of where we'll place the next kept value.
end
end
table.move(t,n+1,n+n-j+1,j);
--for i=j,n do t[i]=nil end
return t;
end
And here is an even more optimized version with block moving, for larger arrays and larger kept blocks:
function ArrayRemove(t, fnKeep)
local i, j, n = 1, 1, #t;
while i <= n do
if (fnKeep(t, i, j)) then
local k = i
repeat
i = i + 1;
until i>n or not fnKeep(t, i, j+i-k)
--if (k ~= j) then
table.move(t,k,i-1,j);
--end
j = j + i - k;
end
i = i + 1;
end
table.move(t,n+1,n+n-j+1,j);
return t;
end
The if (k ~= j) check is not needed: it is executed many times, but is always true after the first removal, and I think table.move() handles the index checks anyway. table.move(t,n+1,n+n-j+1,j) is equivalent to "for i=j,n do t[i]=nil end". I'm new to Lua and don't know whether there is an efficient value-replication function; here we would replicate nil n-j+1 times.
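A quick way to convince yourself of that equivalence on Lua 5.3 or later (sample values invented for the check): the source range lies entirely past the end of the array, so every value copied is nil.

```lua
local t = { 'a', 'b', 'c', 'd', 'e' }
local n, j = #t, 3   -- pretend two values were kept, so j points at slot 3

-- Copies t[6..8], which are all nil, onto t[3..5].
table.move(t, n + 1, n + n - j + 1, j)

print(#t, t[3], t[4], t[5]) --> 2   nil   nil   nil
```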
And regarding table.remove(): I think it should utilize table.move(), which moves elements in one operation, kind of like memcpy in C. So maybe it's not so bad after all. @MitchMcMabers, can you update your benchmarks? Did you use Lua >= 5.3?
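For illustration, a single-element removal written on top of table.move might look like the sketch below. remove_at is a hypothetical helper and needs Lua 5.3 or later. However the move is implemented internally, each call still has to touch every element after pos, so calling it in a loop keeps the same overall growth as table.remove.

```lua
-- Hypothetical helper: remove t[pos] by shifting the tail down one slot.
local function remove_at(t, pos)
  local n = #t
  local removed = t[pos]
  table.move(t, pos + 1, n, pos)  -- no-op when pos == n
  t[n] = nil                      -- drop the now-duplicated last slot
  return removed
end

local t = { 'a', 'b', 'c', 'd' }
local removed = remove_at(t, 2)
print(removed, table.concat(t, ',')) --> b   a,c,d
```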

Resources