Distinguish function vs closure - lua

Lua will write the code of a function out as bytes using string.dump, but warns that this does not work if there are any upvalues. Various snippets online describe hacking around this with debug. It looks like 'closed over variables' are called 'upvalues', which seems clear enough. Code is not data etc.
I'd like to serialise functions and don't need them to have any upvalues. The serialised function can take a table as an argument that gets serialised separately.
How do I detect attempts to pass closures to string.dump, before calling the broken result later?
Current thought is debug.getupvalue at index 1 and treat nil as meaning function, as opposed to closure, but I'd rather not call into the debug interface if there's an alternative.
Thanks!

Even with debug library it's very difficult to say whether a function has a non-trivial upvalue.
"Non-trivial" means "upvalue except _ENV".
When debug info is stripped from your program, all upvalues look almost the same :-)
local function test()
local function f1()
-- usual function without upvalues (except _ENV for accessing globals)
print("Hello")
end
local upv = {}
local function f2()
-- this function does have an upvalue
upv[#upv+1] = ""
end
-- display first upvalues
print(debug.getupvalue (f1, 1))
print(debug.getupvalue (f2, 1))
end
test()
local test_stripped = load(string.dump(test, true))
test_stripped()
Output:
_ENV table: 00000242bf521a80 -- test f1
upv table: 00000242bf529490 -- test f2
(no name) table: 00000242bf521a80 -- test_stripped f1
(no name) table: 00000242bf528e90 -- test_stripped f2
The first two lines of the output are printed by test(), the last two lines by test_stripped().
As you see, inside test_stripped functions f1 and f2 are almost undistinguishable.

Related

What is "object = {...}" in lua good for?

I recently read about lua and addons for the game "World of Warcraft". Since the interface language for addons is lua and I want to learn a new language, I thought this was a good idea.
But there is this one thing I can't get to know. In almost every addon there is this line on the top which looks for me like a constructor that creates a object on which member I can have access to. This line goes something like this:
object = {...}
I know that if a function returns several values (which is IMHO one huge plus for lua) and I don't want to store them seperatly in several values, I can just write
myArray = {SomeFunction()}
where myArray is now a table that contains the values and I can access the values by indexing it (myArray[4]). Since the elements are not explicitly typed because only the values themselfe hold their type, this is fine for lua. I also know that "..." can be used for a parameter array in a function for the case that the function does not know how many parameter it gets when called (like String[] args in java). But what in gods name is this "curly bracket - dot, dot, dot - curly bracket" used for???
You've already said all there is to it in your question:
{...} is really just a combination of the two behaviors you described: It creates a table containing all the arguments, so
function foo(a, b, ...)
return {...}
end
foo(1, 2, 3, 4, 5) --> {3, 4, 5}
Basically, ... is just a normal expression, just like a function call that returns multiple values. The following two expressions work in the exact same way:
local a, b, c = ...
local d, e, f = some_function()
Keep in mind though that this has some performance implications, so maybe don't use it in a function that gets called like 1000 times a second ;)
EDIT:
Note that this really doesn't apply just to "functions". Functions are actually more of a syntax feature than anything else. Under the hood, Lua only knows of chunks, which are what both functions and .lua files get turned into. So, if you run a Lua script, the entire script gets turned into a chunk and is therefore no different than a function.
In terms of code, the difference is that with a function you can specify names for its arguments outside of its code, whereas with a file you're already at the outermost level of code; there's no "outside" a file.
Luckily, all Lua files, when they're loaded as a chunk, are automatically variadic, meaning they get the ... to access their argument list.
When you call a file like lua script.lua foo bar, inside script.lua, ... will actually contain the two arguments "foo" and "bar", so that's also a convenient way to access arguments when using Lua for standalone scripts.
In your example, it's actually quite similar. Most likely, somewhere else your script gets loaded with load(), which returns a function that you can call—and, you guessed it, pass arguments to.
Imagine the following situation:
function foo(a, b)
print(b)
print(a)
end
foo('hello', 'world')
This is almost equivalent to
function foo(...)
local a, b = ...
print(b)
print(a)
end
foo('hello', 'world')
Which is 100% (Except maybe in performance) equivalent to
-- Note that [[ string ]] is just a convenient syntax for multiline "strings"
foo = load([[
local a, b = ...
print(b)
print(a)
]])
foo('hello', 'world')
From the Lua 5.1 Reference manual then {...} means the arguments passed to the program. In your case those are probably the arguments passed from the game to the addon.
You can see references to this in this question and this thread.
Put the following text at the start of the file:
local args = {...}
for __, arg in ipairs(args) do
print(arg)
end
And it reveals that:
args[1] is the name of the addon
args[2] is a (empty) table passed by reference to all files in the same addon
Information inserted to args[2] is therefore available to different files.

Strange behavior caused by debug.getinfo(1, "n").name

I learned how to get the function name inside a function by using debug.getinfo(1, "n").name.
Using this feature, I found out the strange behavior in Lua.
Here's my code:
function myFunc()
local name = debug.getinfo(1, "n").name
return name
end
function foo()
return myFunc()
end
function boo()
local name = myFunc()
return name
end
print(foo())
print(boo())
Result:
nil
myFunc
As you can see, the function foo() and boo() calls the same function myFunc() but they return different results.
If I replace debug.getinfo(1, "n").name with other string, they return the same results as expected but I don't understand the unexpected behavior caused by using the debug.getinfo().
Is it possible to correct myFunc() function so calling both foo() and boo() functions return the same result?
Expected result:
myFunc
myFunc
In Lua, any return statement of the form return <expression_yielding_a_function>(...) is a "tail call". Tail calls essentially don't exist in the call stack, so they take up no additional space or resources. The function you call effectively gets erased from the debug information.
Is it possible to correct myFunc() function so calling both foo() and boo() functions return the same result?
Um... yes, but before I tell you how, allow me to try to convince you not to do this.
As previously mentioned, tail calls are part of the Lua language. The removal of tail calls from the stack is not an "optimization" any more than it is an "optimization" for a for loop to exit when you use break. It is a part of Lua's grammar, and Lua programmers have just as much a right to expect a tail call to be a tail call as they have the right to expect break to exit loops.
Lua, as a language, specifically states that this:
local function recursive(...)
--some terminating condition
return recursive(modified_args)
end
will never, ever, run out of stack space. It will be just as stack space efficient as performing a loop. This is a part of the Lua language, just as much a part of it as the behavior of for and while.
If a user wants to call your function via a tail call, that is their right as the user of a language that makes tail calls a thing. Denying users of a language the right to use the features of that language is rude.
So don't do it.
Furthermore, your code suggests that you are attempting to rely on functions having names. That you're doing something significant and meaningful with those names.
Well, Lua is not Python; Lua functions do not have to have names, period. As such, you should not write code that meaningfully relies upon the name of a function. For debugging or logging purposes, fine. But you should not break user expectations just for debugging and logging. So if the user made a tail call, just accept that's what the user wanted and that your debugging/logging will suffer slightly.
OK, so, do we agree that you shouldn't do this? That Lua users have the right to tail calls, and you don't have the right to deny them? That Lua functions are not named and you shouldn't write code that requires them to maintain a name? OK?
What follows is terrible code that you should never use! (in Lua 5.3):
function bypass_tail_call(Func)
local function tail_call_bypass(...)
local rets = table.pack(Func(...))
return table.unpack(rets, rets.n)
end
return tail_call_bypass
end
Then, simply replace your real function with the return of the bypass:
function myFunc()
local name = debug.getinfo(1, "n").name
return name
end
myFunc = bypass_tail_call(myFunc)
Note that the bypass function has to build an array to hold the return values, then unpack them into the final return statement. This obviously requires additional memory allocations that don't have to happen in regular code.
So there's another reason not to do this.
You can run your code through luac -l -p
...
function <stdin:6,8> (4 instructions at 0x555f561592a0)
0 params, 2 slots, 1 upvalue, 0 locals, 1 constant, 0 functions
1 [7] GETTABUP 0 0 -1 ; _ENV "myFunc"
2 [7] TAILCALL 0 1 0
3 [7] RETURN 0 0
4 [8] RETURN 0 1
function <stdin:10,13> (4 instructions at 0x555f561593b0)
0 params, 2 slots, 1 upvalue, 1 local, 1 constant, 0 functions
1 [11] GETTABUP 0 0 -1 ; _ENV "myFunc"
2 [11] CALL 0 1 2
3 [12] RETURN 0 2
4 [13] RETURN 0 1
Those are the two function that are of interest to us: foo and boo
As you can see, when boo calls myFunc, it's just a normal CALL, so nothing interesting there.
foo, however, does something called a tail call. That is, the return value of foo is the return value of myFunc.
What makes this kind of call special is that there is no need for the program to jump back into foo; once foo calls myFunc it can just hand over the keys and say "You know what to do"; myFunc then returns its results directly to where foo was called. This has two advantages:
The stack frame of foo can be cleaned up before myFunc is called
once myFunc returns, it doesn't need two jumps to return to the main thread; only one
Both of those are insignificant in examples like yours, but once you have a chain of lots and lots of tail calls, it becomes significant.
The downside of this is that, once the stack of foo gets cleaned up, Lua also forgets all the debugging information associated with it; it only remembers that myFunc was called as a tail call, but not from where.
An interesting side note, is that boo is almost also a tail call. If Lua didn't have multiple return values, it'd be exactly identical to foo, and a smarter compiler like LuaJIT might compile it to a tail call. PUC Lua won't though, since it needs a literal return some_function() to recognize the tail call.
The difference is that boo only returns the first value returned by myFunc, and while in your example, there will only ever be one, the interpreter can't make that assumption (LuaJIT might make that assumption during JIT compilation, but that's beyond my understanding)
Also note that, technically, the word tail call just describes a function A directly returning the return value of another function B.
It often gets used interchangeably with tail call optimization, which is what the compiler does when it re-uses the stack frame and turns the function call into a jump.
Strictly speaking, C (for example) has tail calls, but it has no tail call optimization, meaning something like
int recursive(n) { return recursive(n+1); }
is valid C code, but will eventually cause a stack overflow, while in Lua
local function recursive(n) return recursive(n+1) end
will just run forever. Both are tail calls, but only the second gets optimized.
EDIT: As always with C, some compilers may, on their own, implement tail call optimization, so don't go around telling everyone that "C never ever does it"; it's just not a requried part of the language, while in Lua it's actually defined in the language specification, so it's not Lua until it has TCO.
This is a result of tail call optimisation, which Lua does.
In this case, Lua translates the function call into a "goto" statement, and does not use any extra stack frame to perform the tail call.
You can add traceback statement to check it:
function myFunc()
local name = debug.getinfo(1, "n").name
print(debug.traceback("Stack trace"))
return name
end
Tail call optimisation happens in Lua when you return with a function call:
-- Optimized
function good1()
return test()
end
-- Optimized
function good2()
return test(foo(), bar(5 + baz()))
end
-- Not optimised
function bad1()
return test() + 1
end
-- Not optimised
function bad2()
return test()[2] + foo()
end
You can refer to the following links for more information:
- Programming in Lua - 6.3: Proper Tail Calls
- What is tail call optimisation? - Stack Overflow

Dumping lua func with arguments

Is it possibly to dump the Lua table including function arguments?
If so, how can I do it?
I have managed to dump tables and function with addresses, but I haven't been able to figure out a way to get function args, i have tried different methods, but no luck.
So I want to receive them truth dumping tables and function plus args of the function.
Output should be something like this: Function JumpHigh(Player, height)
I don't know if it is even possible, but would be very handy.
Table only stores values.
If there's function stored in a table, then it's just a function body, there's no arguments. If arguments were applied, table would store only final result of that call.
Maybe you're talking about closure - function returned from other function, capturing some arguments from a top level function in a lexical closure? Then see debug.getupvalue() function to check closure's content.
Is this something what you're asking?
local function do_some_action(x,y)
return function()
print(x,y)
end
end
local t = {
func = do_some_action(123,478)
}
-- only function value printed
print "Table content:"
for k,v in pairs(t) do
print(k,v)
end
-- list function's upvalues, where captured arguments may be stored
print "Function's upvalues"
local i = 0
repeat
i = i + 1
local name, val = debug.getupvalue(t.func, i)
if name then
print(name, val)
end
until not name
Note that upvalues stored is not necessary an argument to a toplevel function. It might be some local variable, storing precomputed value for the inner function.
Also note that if script was precompiled into Lua bytecode with stripping debug info, then you won't get upvalues' names, those will be empty.

In Lua, is there a function that given a function, it returns its name as a string?

Sorry if this is too obvious, but I am a total newcomer to lua, and I can't find it in the reference.
Is there a NAME_OF_FUNCTION function in Lua, that given a function gives me its name so that I can index a table with it? Reason I want this is that I want to do something like this:
local M = {}
local function export(...)
for x in ...
M[NAME_OF_FUNCTION(x)] = x
end
end
local function fun1(...)
...
end
local function fun2(...)
...
end
.
.
.
export(fun1, fun2, ...)
return M
There simply is no such function. I guess there is no such function, as functions are first class citizens. So a function is just a value like any other, referenced to by variable. Hence the NAME_OF_FUNCTION function wouldn't be very useful, as the same function can have many variable pointing to it, or none.
You could emulate one for global functions, or functions in a table by looping through the table (arbitrary or _G), checking if the value equals x. If so you have found the function name.
a=function() print"fun a" end
b=function() print"fun b" end
t={
a=a,
c=b
}
function NameOfFunctionIn(fun,t) --returns the name of a function pointed to by fun in table t
for k,v in pairs(t) do
if v==fun then return k end
end
end
print(NameOfFunctionIn(a,t)) -- prints a, in t
print(NameOfFunctionIn(b,t)) -- prints c
print(NameOfFunctionIn(b,_G)) -- prints b, because b in the global table is b. Kind of a NOOP here really.
Another approach would be to wrap functions in a table, and have a metatable set up that calls the function, like this:
fun1={
fun=function(self,...)
print("Hello from "..self.name)
print("Arguments received:")
for k,v in pairs{...} do print(k,v) end
end,
name="fun1"
}
fun_mt={
__call=function(t,...)
t.fun(t,...)
end,
__tostring=function(t)
return t.name
end
}
setmetatable(fun1,fun_mt)
fun1('foo')
print(fun1) -- or print(tostring(fun1))
This will be a bit slower than using bare functions because of the metatable lookup. And it will not prevent anyone from changing the name of the function in the state, changing the name of the function in the table containing it, changing the function, etc etc, so it's not tamper proof. You could also strip the tables of just by indexing like fun1.fun which might be good if you export it as a module, but you loose the naming and other tricks you could put into the metatable.
Technically this is possible, here's an implementation of the export() function:
function export(...)
local env = getfenv(2);
local funcs = {...};
for i=1, select("#", ...) do
local func = funcs[i];
for local_index = 1, math.huge do
local local_name, local_value = debug.getlocal(2, local_index);
if not local_name then
break;
end
if local_value == func then
env[local_name] = local_value;
break;
end
end
end
return env;
end
It uses the debug API, would require some changes for Lua 5.2, and finally I don't necessarily endorse it as a good way to write modules, I'm just answering the question quite literally.
Try this:
http://pgl.yoyo.org/luai/i/tostring
tostring( x ) should hopefully be what you are looking for
If I am not wrong (and I probably will, because I actually never programmed in Lua, just read a bunch of papers and articles), internally there is already a table with function names (like locals and globals in Python), so you should be able to perform a reverse-lookup to see what key matches a function reference.
Anyway, just speculating.
But the fact is that looking at your code, you already know the name of the functions, so you are free to construct the table. If you want to be less error prone, it would be easier to use the name of the function to get the function reference (with eval or something like that) than the other way around.

trying to print _G doesn't work

According to the documentation _G "holds the global environment". I wanted to see what's inside it so I wrote the following code to print _G but it doesn't work:
function f(x)
return 2*x
end
a=3
b="hello world"
print("_G has "..#_G.." elements")
for k,v in pairs(_G) do
print(k)
print(_G[k])
print("G["..k.."]=".._G[k])
end
Error:
_G has 0 elements
a
3
G[a]=3
string
table: 003C8448
lua: try_G.lua:10: attempt to concatenate field '?' (a table value)
stack traceback:
try_G.lua:10: in main chunk
[C]: ?
>Exit code: 1
You could also use the table.foreach(t,f) function. It iterates over a table t, calling the function f with each key and value pair. Use with print to get a quick view:
table.foreach(_G,print)
This is really handy at the interactive prompt as it is reasonably succinct and easy enough to type.
C:\Users\Ross>lua
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
> table.foreach(_G,print)
string table: 005CE3D0
xpcall function: 00717E80
package table: 005CE088
tostring function: 00717DE0
print function: 00711CB8
os table: 005CE358
unpack function: 00717E40
require function: 00718360
getfenv function: 00711B58
setmetatable function: 00717DA0
next function: 00711C38
assert function: 00711A38
tonumber function: 00717DC0
io table: 005CE218
rawequal function: 00711CF8
collectgarbage function: 00711A78
getmetatable function: 00711B98
module function: 00718320
rawset function: 00711D58
math table: 005CE448
debug table: 005CE498
pcall function: 00711C78
table table: 005CE128
newproxy function: 00711E10
type function: 00717E00
coroutine table: 005CDFE8
_G table: 00713EC8
select function: 00711D98
gcinfo function: 00711B18
pairs function: 00711F98
rawget function: 00711D18
loadstring function: 00711C18
ipairs function: 00711F68
_VERSION Lua 5.1
dofile function: 00711A98
setfenv function: 00717D60
load function: 00711BD8
error function: 00711AD8
loadfile function: 00711BB8
>
Update: Unfortunately, as Alexander Gladysh reminds me, the table.foreach function was deprecated in Lua 5.1, and a quick check of the current beta release of 5.2 shows that it has been removed in Lua 5.2. It is easy to write the same loop in terms of pairs:
for k,v in pairs(_G) do print(k,v) end
which should give the same output as table.foreach(_G,print) would. The key feature that I'm leaning on here is that print is defined to call tostring() on each argument you pass, and tostring() is defined to return some sort of sensible string for every kind of value, even those like functions that don't have a good representation as a string. The details will differ on each platform, but the default implementation of tostring() includes the address of the table or function in its string result, allowing you to at least recognize that _G.os and _G.io are distinct tables.
For more human-friendly table printing, there are a lot of solutions, ranging from examples in PiL to several persistent data libraries. Personally, I like the pl.pretty.write() function provided by steve donavan's penlight library.
Your code works exactly as expected- it loops through _G and attempts to print the contents. Unfortunately, _G contains many tables which cannot be concatenated into a string. The code fails because _G["_G"] = _G. That means that when the interpreter comes to
print("G["..k.."]=".._G[k])
then k is "_G" and _G[k] is _G, and you attempt to concatenate a table- which the interpreter can't do, so it dies on you. There are numerous other tables in _G which would also cause this failure.
To follow up on DeadMG, change your
print("G["..k.."]=".._G[k])
to
print("G["..k.."]=",_G[k])
and you should be fine.
Here is the final code using DeadMG's solution:
function f(x)
return 2*x
end
a=3
b="hello world"
print("_G has "..#_G.." elements")
for k,v in pairs(_G) do
if k~="_G" then
if type(v)=="string" or type(v)=="number" then
print("G["..k.."]="..v)
else
print("G["..k.."]=("..type(v)..")")
end
end
end

Resources