Declaring variables and scope question for Lua - lua

I'm lead dev for Bitfighter, and we're using Lua as a scripting language to allow players to program their own custom robot ships.
In Lua, you need not declare variables, and all variables default to global scope, unless declared otherwise. This leads to some problems. Take the following snippet, for example:
loc = bot:getLoc()
items = bot:findItems(ShipType) -- Find a Ship
minDist = 999999
found = false
for indx, item in ipairs(items) do
    local d = loc:distSquared(item:getLoc())
    if(d < minDist) then
        closestItem = item
        minDist = d
    end
end
if(closestItem ~= nil) then
    firingAngle = getFiringSolution(closestItem)
end
In this snippet, if findItems() returns no candidates, then closestItem will still refer to whatever ship it found the last time around, and in the intervening time, that ship could have been killed. If the ship is killed, it no longer exists, and getFiringSolution() will fail.
Did you spot the problem? Well, neither will my users. It's subtle, but with dramatic effect.
One solution would be to require that all variables be declared, and for all variables to default to local scope. While that change would not make it impossible for programmers to refer to objects that no longer exist, it would make it more difficult to do so inadvertently.
Is there any way to tell Lua to default all vars to local scope, and/or to require that they be declared? I know some other languages (e.g. Perl) have this option available.
Thanks!
Lots of good answers here, thanks!
I've decided to go with a slightly modified version of the Lua 'strict' module. This seems to get me where I want to go, and I'll hack it a little to improve the messages and make them more appropriate for my particular context.

There is no option to set this behavior, but there is a module 'strict' (strict.lua, shipped in the etc/ directory of the Lua 5.1 source distribution) which does exactly that by setting a metatable on the global table.
Usage:
require 'strict'
For more in-depth info and other solutions: http://lua-users.org/wiki/DetectingUndefinedVariables, but I recommend 'strict'.

Sorta.
In Lua, globals notionally live in the globals table _G (the reality is a bit more complex, but from the Lua side there's no way to tell AFAIK). As with all other Lua tables, it's possible to attach a __newindex metatable to _G that controls how variables are added to it. Let this __newindex handler do whatever you want to do when someone creates a global: throw an error, permit it but print a warning, etc.
To meddle with _G, it's simplest and cleanest to use setfenv. See the documentation.
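The __newindex idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual 'strict' module: make_strict and declare are hypothetical names, and the sketch runs against a scratch table so that it is safe to try out. Passing _G instead would apply the same checks to real globals.

```lua
-- Hypothetical helper (not part of Lua or of the 'strict' module):
-- makes `env` raise an error on any access to a name that was never declared.
-- Returns a `declare` function used to introduce names explicitly.
local function make_strict(env)
  local declared = {}
  -- names that already exist count as declared
  for name in pairs(env) do declared[name] = true end
  setmetatable(env, {
    __newindex = function(t, name, value)
      if not declared[name] then
        error("assignment to undeclared variable '" .. tostring(name) .. "'", 2)
      end
      rawset(t, name, value)
    end,
    __index = function(_, name)
      if not declared[name] then
        error("read of undeclared variable '" .. tostring(name) .. "'", 2)
      end
      return nil
    end,
  })
  return function(name, value)
    declared[name] = true
    rawset(env, name, value)
  end
end

-- Demonstrated on a scratch table; passing _G would guard real globals.
local env = {}
local declare = make_strict(env)
declare("closestItem")          -- declared, currently nil
env.closestItem = "ship #1"     -- fine: the name was declared
-- A misspelled name is caught instead of silently creating a new variable.
local ok, err = pcall(function() env.closetItem = "typo" end)
print(ok, err)
```

Whether to raise an error, log a warning, or silently permit the assignment is a policy choice that lives entirely inside the two handlers.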

"Local by default is wrong". Please see
http://lua-users.org/wiki/LocalByDefault
http://lua-users.org/wiki/LuaScopingDiscussion
You need to use some kind of global environment protection. There are some static tools to do that (not too mature), but the most common solution is to use runtime protection, based on __index and __newindex in _G's metatable.
Shameless plug: this page may also be useful:
http://code.google.com/p/lua-alchemy/wiki/LuaGlobalEnvironmentProtection
Note that while it discusses Lua embedded into swf, the described technique (see sources) does work for generic Lua. We use something along these lines in our production code at work.

Actually, the extra global variable with the stale reference to the ship will be sufficient to keep the GC from discarding the object. So it could be detected at run time by noticing that the ship is now "dead" and refusing to do anything with it. It still isn't the right ship, but at least you don't crash.
One thing you can do is to keep user scripts in a sandbox, probably a sandbox per script. With the right manipulation of either the sandbox's environment table or its metatable, you can arrange to discard all or most global variables from the sandbox before (or just after) calling the user's code.
Cleaning up the sandbox after calls would have the advantage of discarding extra references to things that shouldn't hang around. This could be done by keeping a whitelist of fields that are allowed to remain in the environment, and deleting all the rest.
For example, the following implements a sandboxed call to a user-supplied function with an environment containing only white-listed names behind a fresh scratch table supplied for each call.
-- table of globals that will be available to user scripts
local user_G = {
    print=_G.print,
    math=_G.math,
    -- ...
}
-- metatable for user sandbox
local env_mt = { __index=user_G }
-- call the function in a sandbox with an environment in which new global
-- variables can be created and modified but they will be discarded when the
-- user code completes. (setfenv and unpack are Lua 5.1 functions.)
function doUserCode(user_code, ...)
    local env = setmetatable({}, env_mt) -- create a fresh user environment with RO globals
    setfenv(user_code, env) -- hang it on the user code
    local results = {pcall(user_code, ...)}
    setfenv(user_code,{}) -- drop the environment so its globals can be collected
    return unpack(results)
end
This could be extended to make the global table read-only by pushing it back behind one more metatable access if you wanted.
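One way to read "one more metatable access" is sketched below, under the assumption that an empty proxy table stands in front of the white-list. The readonly helper is a hypothetical name, not part of the code above. Note that nested library tables such as math need their own proxy, otherwise user code could still overwrite math.floor through the shared table.

```lua
-- Hypothetical helper: wrap a table in an empty proxy whose metatable
-- forwards reads and rejects writes.
local function readonly(t)
  return setmetatable({}, {
    __index = t,
    __newindex = function(_, key)
      error("attempt to modify read-only field '" .. tostring(key) .. "'", 2)
    end,
    __metatable = "locked",  -- protects the metatable itself from being replaced
  })
end

-- White-listed names, with the nested math table wrapped as well.
local user_G = readonly({
  print = print,
  math = readonly(math),
})

-- Per-call scratch environment: reads fall through to user_G,
-- writes land in the scratch table only.
local env = setmetatable({}, { __index = user_G })
env.counter = 1                                          -- allowed: stays in env
local ok = pcall(function() env.math.floor = nil end)    -- rejected by the proxy
print(ok)  -- false
```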
Note that a complete sandbox solution would also consider what to do about user code that accidentally (or maliciously) executes an infinite (or merely very long) loop or other operation. General solutions for this are an occasional topic of discussion on the Lua list, but good solutions are difficult.

Related

Is there an easy way to get all global variables defined in a Lua code file?

Everybody knows that variables in Lua, if not explicitly defined as "local", will be global. This will sometimes cause problems, like overriding library functions, or unexpectedly providing a value for another global variable with the same name. So it would be very helpful if there were a way to find all global variables that are defined in a single Lua code file.
However, I failed to find any clue on this seemingly quite popular problem. The best answer I can get online is using _G to print all global variables in the environment, which isn't of much help. I'm currently coding Lua in IntelliJ IDEA with EmmyLua, a powerful tool that can show global variables in a special style, and it can easily trace a global variable to its definition; but when the code becomes quite long, this will not help much either.
So basically, I just want to get a list of global variables defined in a given Lua code file. Either with a tool or with a wonderful function. If it can make things easier, we may presume the code file is a module. If it can further print the definition locations for these global variables, that's even better. Can somebody help me?
Lua doesn't have a way to tell when or where a global was introduced.
In the special case that the value is a function, debug.getinfo may be able to help by telling you where the function is defined (which is often but not always the same place where the function is made global).
You can capture the needed information at the time the global is introduced. This can be done by setting a metatable with a __newindex method on the global table. This method will be called when a new global is introduced (but not when an existing global is overridden). In this method, you can figure out where the caller came from with debug.getinfo. Also beware, if any of your other code is trying to use a metatable on the global environment, you must play nicely with it. (It can only have one metatable.)
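A minimal sketch of that capture step, assuming nothing else has claimed the metatable. The names origins and track_new_keys are illustrative only, and the demonstration uses a scratch table; calling track_new_keys(_G) would watch real globals instead.

```lua
-- Records where each new key of `env` was first assigned.
-- `origins` and `track_new_keys` are illustrative names.
local origins = {}

local function track_new_keys(env)
  return setmetatable(env, {
    __newindex = function(t, key, value)
      -- level 1 is this handler, level 2 is the code doing the assignment
      local info = debug.getinfo(2, "Sl")
      origins[key] = info.short_src .. ":" .. info.currentline
      rawset(t, key, value)
    end,
  })
end

-- Demonstrated on a scratch table; track_new_keys(_G) would watch real globals.
local env = track_new_keys({})
env.score = 10      -- new key: its origin is recorded
local first = origins.score
env.score = 20      -- existing key: __newindex is not called again
print(first, origins.score == first)
```

The second assignment leaves the record untouched, which is the "new globals only, not overrides" limitation mentioned above.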
You can also avoid using the global table. One in-between way of doing this is to override the environment. In Lua 5.2 and Lua 5.3, this is done by declaring a local table called _ENV -- all accesses to the global table will instead access this table. (Actually, global accesses always use _ENV and _ENV is _G by default.) You can make this mostly invisible by giving this _ENV a metatable that forwards accesses to _G (or whatever other environment). The difference here is that __newindex will still be called even if a binding exists in _G, so this method can detect overrides.
Using _ENV, though is inherently local to a scope (e.g. each file needs to override it). Such a hook could be installed globally as well though. If you load your modules manually with the load function (unlikely), you can just supply a custom _ENV as an argument. If you use require, it is possible to get a hold of the loaded file before it is executed by overriding (or monkey patching) the Lua searcher in package.searchers[2]. This is the built-in function that require calls to find the file in your filesystem and then load it. The return value is the loaded function which require then runs. So, after it is loaded but before it is returned back to require, you could use debug.setupvalue to override the default _ENV value (if any).
Example code (only lightly tested):
local global_info = {}
local default_searcher2 = package.searchers[2]
package.searchers[2] = function(...)
    -- a searcher returns the loader plus an extra value that require
    -- passes on to the loader, so forward both
    local result, extra = default_searcher2(...)
    local parent_environment = _G
    local my_env = setmetatable({}, {
        __index = parent_environment,
        __newindex = function(self, k, v)
            local new_info = debug.getinfo(2)
            -- keeping rich data like this could be a memory leak
            -- if some globals are assigned repeatedly, but that
            -- may still be okay in a debugging scenario
            local history = global_info[k]
            if history == nil then
                history = {}
                global_info[k] = history
            end
            table.insert(history, {info = new_info, value = v})
            parent_environment[k] = v
        end,
    })
    if type(result) == "function" then
        -- upvalue 1 of a freshly loaded chunk is its _ENV
        debug.setupvalue(result, 1, my_env)
    end
    return result, extra
end
function gethistory(name)
    local history = global_info[name]
    if history == nil then
        print('"' .. name .. '" has never been defined...')
    else
        print('History for "' .. name .. '":')
        for _, record in ipairs(history) do
            print(record.info.short_src .. ": " .. record.info.currentline)
        end
    end
end
Note that the hook here will only apply to files required after this code has been run, and basically only applies to Lua files (not C libs) that get included via the built-in require. It doesn't set a metatable on the global environment, so there is no conflict there, but it could be circumvented if files access _G directly (or e.g. set up access to _G instead of _ENV in their own _ENV tables). Such things can also be accounted for, but it can be a rabbit hole depending on how "invisible" you need this patch to be.
In Lua 5.1, instead of _ENV, you have setfenv which I believe can be used to similar effect.
Also note that all the methods I'm outlining can only detect global accesses that actually get executed at runtime.
Yes. Local versus global is a binding issue, which is primarily established at compile-time. Setting a variable is, of course, well-determined at compile-time.
Lua provides the luac compiler, which takes the argument -l for list.
In Lua 5.1, there is the opcode SETGLOBAL. A column indicates the line number of the statement and the comment indicates the name of the global.
In 5.2 and later, there is the opcode SETTABUP. A column indicates the line number of the statement and the comment indicates the name of the table and key. "Globals" are in the table referenced by the _ENV upvalue.
So, you can easily find the line number of any statement that sets a global variable with the tools Lua provides.
BTW—under many module systems, a module script would not set any global variables.

What does this piece of lua code from awesome wm do?

Take a look at this code:
local urgent = {}
local capi =
{
    client = client,
}
local client
do
    client = setmetatable({}, {
        __index = function(_, k)
            client = require("awful.client")
            return client[k]
        end,
        __newindex = error -- Just to be sure in case anything ever does this
    })
end
I'm having trouble understanding what it does. It's from the awesome-wm project. These are the things I'm having trouble understanding:
client = client in the declaration of capi
setmetatable stuff inside do-end
client = client in the declaration of capi
This defines which portion of the capi is available in this file's scope. If you look at the client.lua file, you will see that the capi defined in it has client, mouse, screen, and awesome.
For each item defined in the capi table there is a corresponding .c file. These files define objects such as client. urgent.lua has visibility of that object, most likely because it is a global variable; that is why we can write client = client, where the second client refers to the global variable.
Here is an example of 2 files:
main.lua
bar = "Hello World!"
local foo = require('foo')
print(foo.bar)
foo.lua
local foo = {
    bar = bar
}
return foo
The print function in main.lua will result in Hello World!
setmetatable stuff inside do-end
Here, by wrapping the setmetatable call in a do-end block, the code executes in a restricted scope. This is normally done to contain the block's local variables so that they do not persist after the code's execution.
That said, that is not the purpose of this block, as the block has no local variables. As I see it, the block is there simply to show that the object being modified is the local variable client and not the global variable client.
Additionally, the metatable here is used to prevent circular dependency loops. This is mentioned in comments in some of the places where similar code appears in the project, such as client.lua, where local screen is defined.
@Nifim's answer is excellent. I just want to add more context on why this code exists, in its proper historical context. Before Lua 5.2, the module system was different. There was a magic module() function defined in the core Lua library. When you made a module, you had to first make local versions of all the global variables you needed before calling module(), because after that call the code would run in its own global environment. "capi" stands for "Core API" or "C (language) API" depending on the weather. If Awesome were written today with all the knowledge we now have, there would not be a public "C language" API; those objects would always be hidden in the private section to increase flexibility. Right now, setting "c.my_own_property" does a couple of round trips between capi.client and awful.client just to accommodate all the legacy constraints.
Now, the metatable magic is a Lua pattern called meta-lazy-loading. Because urgent is a submodule of awful.client, it cannot directly import awful.client without causing a circular dependency. Over time, as Awesome APIs became better defined, more and more refactorings were made, and they often introduced weird dependencies to maintain some degree of backward compatibility. In the best universe, we would have disregarded all users' configs and just re-engineered the whole code to avoid these circular dependencies. However, every time we do that, all users of the said APIs wake up one morning and cannot log in to their computers anymore. So this kind of workaround exists to prevent such events, in return for some weird code and maintenance burden.
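The lazy-loading pattern described above can be tried in isolation with a sketch like the following. The module name "demo.client" and the helper lazy_require are made up for the illustration; the module is registered through package.preload so the example does not need extra files. Unlike the awesome snippet, which rebinds its local to the real module after the first access, this version keeps going through the proxy.

```lua
-- Stand-in for a module that cannot be required at load time
-- (registered via package.preload so the sketch is self-contained).
local load_count = 0
package.preload["demo.client"] = function()
  load_count = load_count + 1
  return { focus = function() return "focused" end }
end

-- Hypothetical helper: returns a proxy that requires `name` on first field access.
local function lazy_require(name)
  local real
  return setmetatable({}, {
    __index = function(_, key)
      real = real or require(name)
      return real[key]
    end,
    __newindex = function()
      error("lazy proxy for '" .. name .. "' is read-only", 2)
    end,
  })
end

local client = lazy_require("demo.client")
print(load_count)        -- 0: nothing has been loaded yet
print(client.focus())    -- focused (first access triggers the require)
print(load_count)        -- 1
```

Because the require happens only when a field is first read, two modules can hold proxies to each other at load time without either one needing the other to be fully loaded.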

Modify local variable from another script in Lua

I'm trying to make a mod for the game Don't Starve Together, which makes use of Lua. For this reason, I can't modify their source variables/files.
In order to try to modify the world generation, I need to access a local table that was instantiated in another file (the file is called "levels.lua"). The variable name is "levellist". Is there a way to access the variable so that I can add certain elements to the table?
Namely, I want to add {"task_set", "cave_custom"} to levellist[DST_CAVE].overrides.
If someone could help or even just tell me if this is possible or not, that would be great. Thanks!
What you are trying to do simply doesn't make sense. A local variable is accessible only from the scope it was defined in and its nested scopes. There is no normal way to change it from a different scope, let alone an entirely different script.
If you want variables that all your scripts use, use globals.
Of course you can't get to local variables (i.e. "pointers") used by another function, save for obscure debug methods that are rarely exposed to a user sandbox, but you don't need to. You do not want to modify the local variable itself (i.e. make it point to another table, for example); you want to get to some table and modify a value inside it. So you just need to find any place where that table is exposed to you in any way.
You should edit the relevant content into your question, because it is a PITA to Alt-Tab back and forth to your files. According to the structure from comments/chat, AddLevel(LEVELTYPE.SURVIVAL, ...) inserts an entry into the levellist[LEVELTYPE.SURVIVAL] table. If you check levels.lua, you can also see that it returns a table with sandbox_levels assigned exactly to this.
So:
local levels = require "levels"
print(levels.sandbox_levels)
-- Will print "table: SOMENUMBERS" - i.e. address of levellist[LEVELTYPE.SURVIVAL]
You can now iterate over it with for idx = 1, #levels.sandbox_levels or ipairs and find the entry belonging to "DST_CAVE". I can't tell what the field with the ID will be called or how it will be structured, because the data is preprocessed by the function Level before being inserted, and you did not include that function in the files you posted.
As others have suggested, this may not be your best strategy.
But depending on your environment, it may be possible to abuse some more esoteric features of the runtime to let you indirectly modify values that aren't "yours". Have a look at debug.sethook and setfenv.
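For completeness, here is a rough sketch of the "obscure debug methods" route, assuming the debug library is available (whether the game's mod sandbox exposes it is not known here) and that the other file's chunk was not stripped of debug names. The do-block below is only a stand-in for another file; find_upvalue is a hypothetical helper, and the real levels.lua may look quite different.

```lua
-- Hypothetical helper: scan a function's upvalues for one with the given name.
local function find_upvalue(fn, wanted)
  local i = 1
  while true do
    local name, value = debug.getupvalue(fn, i)
    if name == nil then return nil end        -- ran out of upvalues
    if name == wanted then return value, i end
    i = i + 1
  end
end

-- Stand-in for another file: `levellist` is local there, but an exposed
-- function captures it as an upvalue.
local AddLevel
do
  local levellist = { cave = {} }
  AddLevel = function(kind, level)
    table.insert(levellist[kind], level)
  end
end

-- From "outside": reach the table through the function that closes over it.
local list = find_upvalue(AddLevel, "levellist")
table.insert(list.cave, { "task_set", "cave_custom" })
print(#list.cave)  -- 1
```

This only works when some reachable function actually closes over the table, which is the same "find a place where it is exposed" requirement as before, just through a less polite door.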

Why is redefining standard variables as locals needed in awesome wm modules?

Every awesome wm module starts by redefining standard variables as locals. Something like this:
local table = table
local string = string
local tostring = tostring
What does it do? All the code still works fine after deleting these lines.
It's purely an optimization thing. Local variables are faster to read/write than global variables. This is in part because globals are hash table lookups (e.g. foo => _G["foo"]) and locals are VM register lookups. So it's not uncommon for modules that are going to be using a global a lot to alias it via a local variable.
For your code, unless you know something is going to be called a ton and is going to be a bottleneck, I wouldn't bother with this technique. Lua isn't C. You're trading performance for brevity and clarity. Don't trade it back until you know you have to.
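To make the difference concrete, here is a small sketch of the two styles side by side. The function names are made up for the illustration, and no timing numbers are claimed; both versions compute the same result and differ only in how floor is looked up.

```lua
-- Looks up the global `math`, then its field `floor`, on every iteration.
local function sum_floors_global(values)
  local total = 0
  for i = 1, #values do
    total = total + math.floor(values[i])
  end
  return total
end

-- File-level alias, as in the awesome modules. Inside the function it is
-- read as an upvalue, with no hash lookups in the loop.
local floor = math.floor
local function sum_floors_local(values)
  local total = 0
  for i = 1, #values do
    total = total + floor(values[i])
  end
  return total
end

local data = { 1.5, 2.25, 3.75 }
print(sum_floors_global(data), sum_floors_local(data))  -- both print 6
```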
"What does it do" is already answered.
For "why is it done": Back before awesome supported Lua 5.2 (without deprecated functions), all modules used Lua's module() function to set themselves up. This meant that values from the global environment became inaccessible and this "local trick" actually was necessary.

hot swap code in lua

I've heard mumblings around the internets about being able to hot-swap code in Lua similar to how it's done in Java, Erlang, Lisp, etc. However, 30 minutes of googling for it has turned up nothing. Has anyone read anything substantial about this? Anyone have any experience doing it? Does it work in LuaJIT or only in the reference VM?
I'm more interested in the technique as a shortcut in development/debugging than an upgrade path in a live environment.
Lua, and most scripting languages for that matter, do not support the most generalized form of "hot swapping" as you define it. That is, you cannot guaranteeably change a file on disk and have any changes in it propagate itself into an executing program.
However, Lua, and most scripting languages for that matter, are perfectly capable of controlled forms of hot swapping. Global functions are global functions. Modules simply load global functions (if you use them that way). So if a module loads global functions, you can reload the module again if it is changed, and those global function references will change to the newly loaded functions.
However, Lua, and most scripting languages for that matter, makes no guarantees about this. All that's happening is the changing of global state data. If someone copied an old function into a local variable, they can still access it. If your module uses local state data, the new version of the module cannot access the old module's state. If a module creates some kind of object that has member functions, unless those members are fetched from globals, these objects will always refer to the old functions, not the new ones. And so forth.
Also, Lua is not thread safe; you can't just interrupt a lua_State at some point and try to load a module again. So you would have to set up some specific point in time for it to check stuff out and reload changed files.
So you can do it, but it isn't "supported" in the sense that it can just happen. You have to work for it, and you have to be careful about how you write things and what you put in local vs. global functions.
As Nicol said, the language itself doesn't do it for you.
If you want to implement something like this yourself though, it's not that hard, the only thing "preventing" you is any "leftover" references (which will still point to the old code) and the fact require caches its return value in package.loaded.
The way I'd do it is by dividing your code into 3 modules:
the reloading logic at entry point (main.lua)
any data you want to preserve across reloads (data.lua)
the actual code to reload (payload.lua), making sure you don't keep any references to that (which is sometimes not possible when you e.g. have to give callbacks to some library; see below).
-- main.lua:
local PL = require("payload")
local D = require("data")
function reload(module)
    package.loaded[module]=nil -- this makes `require` forget about its cache
    return require(module)
end
PL.setX(5)
PL.setY(10)
PL.printX()
PL.printY()
-- .... somehow detect you want to reload:
print "reloading"
PL = reload("payload") -- make sure you don't keep references to PL elsewhere, e.g. as a function upvalue!
PL.printX()
PL.printY()
-- data.lua:
return {} -- this is a pretty dumb module, it's literally just a table stored in `package.loaded.data` to make sure everyone gets the same instance when requiring it.
-- payload.lua:
local D = require("data")
local y = 0
return {
    setX = function(nx) D.x = nx end, -- using the data module is preserved
    setY = function(ny) y = ny end, -- using a local is reset upon reload
    printX = function() print("x:",D.x) end,
    printY = function() print("y:", y) end
}
output:
x: 5
y: 10
reloading
x: 5
y: 0
You could flesh out that logic a bit better by having a "registry module" that keeps track of all the requiring/reloading for you and abstracts away any access into modules (thus allowing you to replace the references). Using the __index metamethod on that registry, you could make it pretty much transparent without having to call ugly getters all over the place. This also means you can supply "one-liner" callbacks that then actually just tail-call through the registry, if any third-party library needs that.
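A very small version of such a registry might look like the sketch below. It is only one possible reading of the idea: registry, reload and the module name "demo.payload" are illustrative, and the payload is registered through package.preload so that the example runs without separate files.

```lua
-- Stand-in payload module; each (re)load produces a new "version".
local version = 0
package.preload["demo.payload"] = function()
  version = version + 1
  local v = version
  return { describe = function() return "payload v" .. v end }
end

-- The registry stores nothing itself: every field access goes through
-- `require`, so it always sees whatever is currently in package.loaded.
local registry = setmetatable({}, {
  __index = function(_, name) return require(name) end,
})

local function reload(name)
  package.loaded[name] = nil   -- make `require` forget its cached value
  return require(name)
end

-- A "one-liner" callback that holds no direct reference to the payload.
local callback = function() return registry["demo.payload"].describe() end

print(callback())        -- payload v1
reload("demo.payload")
print(callback())        -- payload v2
```

Since the callback resolves the module at call time, handing it to a third-party library does not pin the old code in memory the way a direct function reference would.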

Resources