I have a memory leak issue about the usage of lua table, the code is below:
function workerProc()
-- a table holds some objects (userdata, the __gc is implememted correctly)
local objs = {createObj(), createObj(), ...}
while isWorking() do
-- ...
local query = {unpack(objs)}
repeat
-- ...
table.remove(query, queryIndex)
until #query == 0
sleep(1000)
end
end
the table objs is initialized with some userdata objects and these objects are always available in the while loop so no gc will performed on these objs. In the while loop the query table is initialize with all the elements from objs (use unpack function). While running the script I found that the memory keeps increasing but when I comment out local query = {unpack(objs)} it disappears.
I don't think this piece of code have memory leak problem cause the query var is local and it should be unavailable after each iteration of while loop, but the fact is. Anybody know why the memory is swallowed by that table?
Judging from your code example, the likely explanation for what you are seeing is perhaps the gc doesn't get a chance to perform a full collection cycle while inside the loop.
You can force a collection right after the inner loop using collectgarbage() and see if that resolves the memory issue:
while isWorking() do
-- ..
local query = {unpack(objs)}
repeat
-- ..
table.remove(query, queryIndex)
until #query == 0
collectgarbage()
sleep(1000)
end
Another possibility is to move local query outside the loop and create the table once instead of creating a new table on every iteration in the outter loop.
Related
I try to make efficiently a copy of a lua table. I have written the following function copyTable() that works well (see below). But I imagined I could have something more efficient using the "passing by value" mechanism of the functions. I made a few tests to explore this mechanism :
function nop(x)
return x
end
function noop(x)
x={}
return x
end
function nooop(x)
x[#x+1]=4
return x
end
function copyTable(datatable)
local tblRes={}
if type(datatable)=="table" then
for k,v in pairs(datatable) do tblRes[k]=copyTable(v) end
else
tblRes=datatable
end
return tblRes
end
tab={1,2,3}
print(tab) -->table: 0x1d387e0 tab={1,2,3}
print(nop(tab)) -->table: 0x1d387e0 tab={1,2,3}
print(noop(tab)) -->table: 0x1e76f90 tab={1,2,3}
print(nooop(tab)) -->table: 0x1d387e0 tab={1,2,3,4}
print(tab) -->table: 0x1d387e0 tab={1,2,3,4}
print(copyTable(tab)) -->table: 0x1d388d0
We can see that the reference to the table is transferred unchanged through the functions (when I just read it or add things) except within noop() where I try a radical modification of the existing.
I read Bas Bossink and the answer made by Michael Anderson in this Q/A. Regarding the passing or tables as arguments, they emphasized the difference between "arguments passed by ref" and "arguments passed by values and tables are references" with examples where this difference appears.
But what does that mean precisely ? Do we have a copy of the reference, but what difference does that make with a passing through ref since the data pointed and therefore manipulated is still the same, not copied ? Is the mechanism in noop() specific when we try to affect nil to the table, specific to avoid the deletion of the table or in which cases does it trigger (we can see with nooop() that it is not always the case when the table is modified) ?
My question : how the mechanism of passing tables really works ? Is there a way to make a more efficient way to copy the data of a table without the burden of my copyTable ?
The rules of argument passing in Lua is similarly to C: everything is pass by value, but tables and userdata are passed around as pointers. Passing a copy of a reference does not appear so different in usage, but it is completely different than passing by reference.
For example, you brought this part up specifically.
function noop(x)
x={}
return x
end
print(noop(tab)) -->table: 0x1e76f90 tab={1, 2, 3}
You are assigning the value for the new table[1] into variable x (x now holds a new pointer value). You didn't mutate the original table, the tab variable still holds the pointer value to the original table. When you return from noop you are passing back the value of the new table, which is empty. Variables hold values, and a pointer is a value, not a reference.
Edit:
Missed your other question. No, if you want to deep-copy a table, a function similar to what you wrote is the only way. Deep copies are very slow when tables get large. To avoid performance issues, you might use a mechanism like "rewind tables", which keep track of changes made to them so they can be undone at later points in time (very useful in recursive with backtrack contexts). Or if you just need to keep users from screwing with table internals, write a "freezable" trait.
[1] Imagine the {} syntax is a function that constructs a new table and returns a pointer to the new table.
If you are sure that those 3 assumptions (A) are valid for "tab" (the table being copied):
There are no table keys
t1 = {}
tab = {}
tab[t1] = value
There are no repeated table values
t1 = {}
tab = {}
tab.a = t1
tab.b = t1
-- or
-- tab.a.b...x = t1
There are no recursive tables:
tab = {}
tab.a = tab
-- or
-- tab.a.b...x = tab
Then the code you provided is the smallest and almost as efficient as possible.
If A1 doesn't hold (i.e. you have table keys), then you must change your code to:
function copyTable(datatable)
local tblRes={}
if type(datatable)=="table" then
for k,v in pairs(datatable) do
tblRes[copyTable(k)] = copyTable(v)
end
else
tblRes=datatable
end
return tblRes
end
If A2 doesn't hold (i.e. you have repeated table values), then you could change your code to:
function copyTable(datatable, cache)
cache = cache or {}
local tblRes={}
if type(datatable)=="table" then
if cache[datatable] then return cache[datatable]
for k,v in pairs(datatable) do
tblRes[copyTable(k, cache)] = copyTable(v, cache)
end
cache[datatable] = tblRes
else
tblRes=datatable
end
return tblRes
end
This approach only pays off, though, if you have lots of repeated large tables. So, it is a matter of evaluating which version is faster for your actual production scenario.
If A3 doesn't hold (i.e. you have recursive tables), then your code (and both adjustments above) will enter an infinite recursive loop and eventually throw a stack overflow.
The simplest way to handle that is keeping a backtrack and throwing an error if table recursion happens:
function copyTable(datatable, cache, parents)
cache = cache or {}
parents = parents or {}
local tblRes={}
if type(datatable)=="table" then
if cache[datatable] then return cache[datatable]
assert(not parents[datatable])
parents[datatable] = true
for k,v in pairs(datatable) do
tblRes[copyTable(k, cache, parents)]
= copyTable(v, cache, parents)
end
parents[datatable] = false
cache[datatable] = tblRes
else
tblRes=datatable
end
return tblRes
end
My solution for a deepcopy function which handles recursive tables, preserving the original structure may be found here: https://gist.github.com/cpeosphoros/0aa286c6b39c1e452d9aa15d7537ac95
Right now, I've been making my for loops like this
local i
for i = 1, 10 do
--stuff
end
Since I thought you should try to keep things local for better performance and reduce risk from errors.
However, I've noticed that it is common to simply use.
for i = 1, 10 do
--stuff
end
Is using local preferred, or is omitting it pretty harmless?
(EDIT) There's no difference between the code samples you gave. However, be aware that the variable you defined with local i is not the same variable that is used within your for i = 1, 10 do loop. When the loop exits, the original value of i remains unchanged (that is, i == nil).
siffiejoe points out that loop control/counter variables are never accessible outside of the loop, even if the same variable name was defined in advance. Any references to the variable within the loop will use the loop value. Any references outside of the loop will use the original, or non-loop value.
For this reason, it's safe to reuse an existing variable name in a for statement without corrupting the original. If you want to access the counter variable after the loop, you can define an extra variable beforehand and update it at within the loop as follows (siffiejoe's example):
local j
for i = 1, 10 do
j = i
--[[ stuff ]]
end
print(j) -- j stores the last value of i before the loop exits
Documentation: Numeric for
Short answer: don't add local i before the for loop, because it's useless and confusing.
The for loop starts a new block, the control variable (i here) is already local to this block. Adding the local i is similar to:
local i
do
local i = 0
-- do something
end
Note that the i inside the block is a totally new variable, which shadows the i outside. When the block is over, the i outside the block gets its life back, but has no knowledge of what happened inside the block because the two variables have no relationship except having the same name.
The for loop control-structure in Lua have some particularities, it's not exactly the equivalent of a C for loop.
One of these particularities is that its index variable is controlled by the loop. The index var is itself local and modifying it inside the loop has in fact no effect. Example:
for i=1,5 do
io.write(("i: %d; "):format(i))
i = i + 10 -- this has no effect, as 'for' is in control of the index var.
end
Result:
i: 1; i: 2; i: 3; i: 4; i: 5;
I have encountered several places where people call collectgarbage() twice to finalize all unused objects.
Why is that? Why isn't a single call enough? Why not three calls?
When I try the following code (on Lua 5.2), the object gets finalized (meaning: its __gc gets called) with just a single call to collectgarbage:
do
local x = setmetatable({},{
__gc = function() print("works") end
})
end
collectgarbage()
os.exit()
Does this mean one call is enough?
It's explained in Programming in Lua 3rd edition ยง17.6 Finalizers. In short, it's because of resurrection.
A finalizer is a function associated with an object that is called when that object is about to be collected. Lua implements finalizers with __gc metamethod.
The problem is, when the finalizers are called, the object must be alive in some cases. PiL explains it with this example:
A = {x = "this is A"}
B = {f = A}
setmetatable(B, {__gc = function (o) print(o.f.x) end})
A, B = nil
collectgarbage() --> this is A
The finalizer for B accesses A, so A cannot be collected before the finalization of B. Lua must resurrect both B and A before running that finalizer.
Resurrection is the reason of calling collectgarbage twice:
Because of resurrection, objects with finalizers are collected in two phases.
The first time the collector detects that an object with a finalizer is not reachable, the collector resurrects the object and queues it to be finalized. Once its finalizer runs, Lua marks the object as finalized. The next time the collector detects that the object is not reachable, it deletes the object. If you want to ensure that all garbage in your program has been actually released, you must call collectgarbage twice; the second call will delete the objects that were finalized during the first call.
I am trying to "Skip" a variable, by either never declaring it or just having it garbage collected immediately, but I don't know if it's possible.
Example:
function TestFunc()
return 1, 2
end
function SecondFunction()
local nodeclare, var = TestFunc()
end
Basically what I wanted was for "nodeclare" to not even exist. So if I did print(nodeclare, var) it would do nil, 2.
The same thing would be if I was doing a pairs loop and I didn't need to use the keyvalue.
Is there some special thing I can put as the variable name for this to happen? If say I was doing a pairs loop over 100 values, would that even have a signifigant impact?
First of all, variables are not garbage collected, objects are. In this case, there's nothing to garbage collect.
However, let's say that TestFunc was creating objects (say, tables):
function TestFunc()
return {1}, {2}
end
function SecondFunction()
local nodeclare, var = TestFunc()
end
Now nodeclare is referencing a table returned by TestFunc. That's an object, allocated on the heap, that we don't want hanging around forever.
That object will eventually be collected if there is nothing left referring to it. In your case, as soon as SecondFunction returns, the local nodeclare goes out of scope and goes away. As long as there's nothing else referencing that table, the table will be collected (during next collection cycle).
You can avoid declaring nodeclare entirely by skipping the first return value of TestFunc like this:
local var = select(2, TestFunc())
However, when you're talking about a temporary local variable, as in your example, you normally just create the temporary variable then ignore it. This avoids the overhead of the call to select. Sometimes you use a variable name that indicates it's trash:
local _, var = TestFunc()
If say I was doing a pairs loop over 100 values, would that even have a signifigant impact?
None whatsoever. You're just continually overwriting the value of a local variable.
What impact do you mean exactly? Memory? Performance?
According to the Programming in Lua book, you can sort of skip the second return value, but not ignore the first and use the second:
x,y = foo2() -- x='a', y='b'
x = foo2() -- x='a', 'b' is discarded
x,y,z = 10,foo2() -- x=10, y='a', z='b'
I am making a game using Lua and I need to use Breadth-first search to implement a fast path-finding algorithm which finds the shortest path between the enemy AI and the player.
I will have up to 3 enemies using this algorithm at once and map is a 2-dimensional tile-based maze. I have already implemented collision detection so now all that is left to do is get a way for the enemies to find the shortest path to the player in a way that can be done quickly and and preferably about 80-90 times per second per enemy.
When I previously implemented breadth-first search, I used a queue in C++. From what I have read about Lua, stack implementations using tables are very efficient since the elements don't need to be shifted after push() (AKA table.insert) and pop() (table.remove) operations. However, I have theorized that queues with large sizes can be very inefficient in Lua, because if you want to insert/remove something from index 1 of a table, all other elements in the table must be shifted either upward or downward in the table.
Thus, I would like to ask some simple questions for those experienced in Lua. What is the simplest and/or fastest implementation of a queue in this language? Is it even possible to have a fast queue in Lua or is it just generally accepted that Lua will always be forced to "shift" all elements for a queue operation such as pop() or append()?
EDIT: I understand that the underlying implementation of "shifting elements" in Lua tables is written in C and is therefore quite optimized. I just would like to know if I have any better options for queues before I begin to write a simple table implementation.
A fast implementation of queue(actually double queue) in Lua is done by the book Programming in Lua:
List = {}
function List.new ()
return {first = 0, last = -1}
end
It can insert or remove an element at both ends in constant time, the key is to store both first and last.
function List.pushleft (list, value)
local first = list.first - 1
list.first = first
list[first] = value
end
function List.pushright (list, value)
local last = list.last + 1
list.last = last
list[last] = value
end
function List.popleft (list)
local first = list.first
if first > list.last then error("list is empty") end
local value = list[first]
list[first] = nil -- to allow garbage collection
list.first = first + 1
return value
end
function List.popright (list)
local last = list.last
if list.first > last then error("list is empty") end
local value = list[last]
list[last] = nil -- to allow garbage collection
list.last = last - 1
return value
end