sanitizing a Lua table input - lua

Let's say I want a Lua table that will be provided from a third party, not totally reliable, from a file or other IO source.
I get the table as a string, like "{['valid'] = 10}" and I can load it as
externalTable = loadstring("return " .. txtTable)()
But this opens a breach to code injection, ie.: txtTable = os.execute('rm -rf /')
So I did this sanitizing function:
function safeLoadTable(txtTable)
txtTable = tostring(txtTable)
if (string.find(txtTable, "(", 1, true))
then return nil end
local _start = string.find(txtTable, "{", 1, true)
local _end = string.find(string.reverse(txtTable), "}", 1, true)
if (_start == nil or _end == nil)
then return nil end
txtTable = string.sub(txtTable, _start, #txtTable - _end + 1)
print("cropped to ", txtTable)
local pFunc = loadstring("return " .. txtTable)
if (pFunc) then
local _, aTable = pcall(pFunc)
return aTable
end
end
In the worst case it should return nil.
Can this be considered safe against a "regular bad-intentioned person" :)

You could run the unsafe code in a sandbox.
Here is how a simple sandbox could look in Lua 5.1 (error handling omitted for brevity):
local script = [[os.execute("rm -rf /")]]
local env = { print=print, table=table, string=string }
local f, err = loadstring(script)
if err then
-- handle syntax error
end
setfenv(f, env)
local status, err = pcall(f)
if not status then
-- handle runtime error
end
In Lua 5.2 you can load the script into it's own environment using the load function.
The result would be a runtime error returned from pcall:
attempt to index global 'os' (a nil value)
EDIT
As Lorenzo Donati pointed out in the comments this is not a complete solution to stop rogue scripts. It essentially allows you to white-list functions and tables that are approved for user scripts.
For more info about handling rogue scripts I would suggest this SO question:
Embedded Lua - timing out rogue scripts (e.g. infinite loop) - an example anyone?

I don't think it is safe. Try this:
print(safeLoadTable [[{ foo = (function() print"yahoo" end)() } ]])
EDIT
or this, for more fun:
print(safeLoadTable [[{ foo = (function() print(os.getenv "PATH") end)() } ]])
I won't suggest the alternative of replacing that os.getenv with os.execute, though. :-)
The problem is not easy to solve. Code injection avoidance is not at all simple in this case because you are executing a piece of Lua code when doing that loadstring. No simple string matching technique is really safe. The only secure way would be to implement a parser for a subset of the Lua table syntax and use that parser on the string.
BTW, even Lua team stripped off the bytecode verifier from Lua 5.2 since they discovered that it was amenable to attacks, and bytecode is a far simpler language than Lua source code.

I created sandbox.lua for exactly this purpose. It'll handle both insecure stuff as well as DOS-type attacks, assuming that your environment has access to the debug facility.
https://github.com/kikito/sandbox.lua
Note that for now it is Lua 5.1-compatible only.

Running in sandbox isn't safe, inspecting source code is not very simple. An idea: inspect bytecode!
Emmm, actually that's not very simple either, but here is a lazy implementation: http://codepad.org/mGqQ0Y8q

Related

Use curried lua function as custom complete list in neovim

I'm trying to provide a curried lua function as method to call for autocomplete a command. However, it sometimes does nothing and sometimes fails with a memory segmentation error.
This is a minimal reproduction example:
Given this init.lua
vim.cmd(string.format([[set rtp=%s]], vim.fn.getcwd()))
local custom_complete = [[customlist,v:lua.require'split_later'.split_later('a b')]]
local options = { complete = custom_complete, desc = "Test to break", nargs = 1, force = true }
vim.api.nvim_create_user_command("Test", "echo <args>", options)
And this file called split_later under the correct lua/ subfolder:
local M = {}
function M.split_later(str)
print("called split")
return function()
return vim.fn.split(str)
end
end
return M
The idea is to have a generic utility that is able to produce auto-complete functions from lua, but it is not working as expected.
I found a workaround using a lookup table that stores the command name and then uses it, but I don't like it that much.

redis lua script vs. single calls

I have the following setup:
2 different datastructures: Sets, Strings
They are in different namespaces *:collections:*, *:resources:*
The client doesn't know about this and I try to get both namespaces every time.
Based on exists I decide which datastructure to get finally.
all calls to redis are done asynchronous (vert.x redis-mod)
Now I have to decide if I execute this as lua script or as single commands.
The lua script I came up with:
local path = KEYS[1]
local resourcesPrefix = ARGV[1]
local collectionsPrefix = ARGV[2]
if redis.call('exists',resourcesPrefix..path) == 1 then
return redis.call('get',resourcesPrefix..path)
elseif redis.call('exists',collectionsPrefix..path) == 1 then
return redis.call('smembers',collectionsPrefix..path)
else
return "notFound"
end
Are there any pros and cons for single calls or lua script?
Yes, LUA script is a best solution in case of EVALSHA call:
You are working woth redis asynchronous. So LUA helps you to reduce number of code and code readability.
LUA case is faster becouse of reduce network communication.
I think you can write your code with just 2 commands. You do not need exists in your code.
local path = KEYS[1]
local resourcesPrefix = ARGV[1]
local collectionsPrefix = ARGV[2]
local ret
set ret = redis.call('get',resourcesPrefix..path)
if ret then
return ret
end
set ret = redis.call('smembers',collectionsPrefix..path)
if ret then
return ret
end
return "notFound"
It looks like a good use of Redis LUA scripting to me. The execution time will be short (long scripts are to be avoided as they are blocking). It avoids doing multiple calls, so reduces the total network communication time. So I think it's a better solution than single calls, if you make many of these calls. Especially if you use EVALSHA to have the script cached in Redis.

Globals are bad, does this increase performance in any way?

I'm working in LuaJIT and have all my libraries and whatnot stored inside "foo", like this:
foo = {}; -- The only global variable
foo.print = {};
foo.print.say = function(msg) print(msg) end;
foo.print.say("test")
Now I was wondering, would using metatables and keeping all libraries local help at all? Or would it not matter. What I thought of is this:
foo = {};
local libraries = {};
setmetatable(foo, {
__index = function(t, key)
return libraries[key];
end
});
-- A function to create a new library.
function foo.NewLibrary(name)
libraries[name] = {};
return libraries[name];
end;
local printLib = foo.NewLibrary("print");
printLib.say = function(msg) print(msg) end;
-- Other file:
foo.print.say("test")
I don't really have the tools to benchmark this right now, but would keeping the actual contents of the libraries in a local table increase performance at all? Even the slightest?
I hope I made mysef clear on this, basically all I want to know is: Performance-wise is the second method better?
If someone could link/give a detailed explaination on how global variables are processed in Lua which could explain this that would be great too.
don't really have the tools to benchmark this right now
Sure you do.
local start = os.clock()
for i=1,100000 do -- adjust iterations to taste
-- the thing you want to test
end
print(os.clock() - start)
With performance, you pretty much always want to benchmark.
would keeping the actual contents of the libraries in a local table increase performance at all?
Compared to the first version of the code? Theoretically no.
Your first example (stripping out the unnecessary cruft):
foo = {}
foo.print = {}
function foo.print.say(msg)
print(msg)
end
To get to your print function requires three table lookups:
index _ENV with "foo"
index foo table with "print"
index foo.print table with "say".
Your second example:
local libraries = {}
libraries.print = {}
function libraries.print.say(msg)
print(msg)
end
foo = {}
setmetatable(foo, {
__index = function(t, key)
return libraries[key];
end
});
To get to your print function now requires five table lookups along with other additional work:
index _ENV with "foo"
index foo table with "print"
Lua finds the result is nil, checks to see if foo has a metatable, finds one
index metatable with "__index"
check to see if the result is is is table or function, Lua find it's a function so it calls it with the key
index libraries with "print"
index the print table with "say"
Some of this extra work is done in the C code, so it's going to be faster than if this was all implemented in Lua, but it's definitely going to take more time.
Benchmarking using the loop I showed above, the first version is roughly twice as fast as the second in vanilla Lua. In LuaJIT, both are exactly the same speed. Obviously the difference gets optimized away at runtime in LuaJIT (which is pretty impressive). Just goes to show how important benchmarking is.
Side note: Lua allows you to supply a table for __index, which will result in a lookup equivalent to your code:
setmetatable(foo, { __index = function(t, key) return libraries[key] end } )
So you could just write:
setmetatable(foo, { __index = libraries })
This also happens to be a lot faster.
Here is how I write my modules:
-- foo.lua
local MyLib = {}
function MyLib.foo()
...
end
return MyLib
-- bar.lua
local MyLib = require("foo.lua")
MyLib.foo()
Note that the return MyLib is not in a function. require captures this return value and uses it as the library. This way, there are no globals.

Recreating setfenv() in Lua 5.2

How can I recreate the functionality of setfenv in Lua 5.2? I'm having some trouble understanding exactly how you are supposed to use the new _ENV environment variable.
In Lua 5.1 you can use setfenv to sandbox any function quite easily.
--# Lua 5.1
print('_G', _G) -- address of _G
local foo = function()
print('env', _G) -- address of sandbox _G
bar = 1
end
-- create a simple sandbox
local env = { print = print }
env._G = env
-- set the environment and call the function
setfenv(foo, env)
foo()
-- we should have global in our environment table but not in _G
print(bar, env.bar)
Running this example shows an output:
_G table: 0x62d6b0
env table: 0x635d00
nil 1
I would like to recreate this simple example in Lua 5.2. Below is my attempt, but it does not work like the above example.
--# Lua 5.2
local function setfenv(f, env)
local _ENV = env or {} -- create the _ENV upvalue
return function(...)
print('upvalue', _ENV) -- address of _ENV upvalue
return f(...)
end
end
local foo = function()
print('_ENV', _ENV) -- address of function _ENV
bar = 1
end
-- create a simple sandbox
local env = { print = print }
env._G = env
-- set the environment and call the function
foo_env = setfenv(foo, env)
foo_env()
-- we should have global in our envoirnment table but not in _G
print(bar, env.bar)
Running this example shows the output:
upvalue table: 0x637e90
_ENV table: 0x6305f0
1 nil
I am aware of several other questions on this subject, but they mostly seem to be dealing with loading dynamic code (files or string) which work quite well using the new load function provided in Lua 5.2. Here I am specifically asking for a solution to run arbitrary functions in a sandbox. I would like to do this without using the debug library. According to the Lua documentation we should not have to rely on it.
You cannot change the environment of a function without using the debug library from Lua in Lua 5.2. Once a function has been created, that is the environment it has. The only way to modify this environment is by modifying its first upvalue, which requires the debug library.
The general idea with environments in Lua 5.2 is that the environment should be considered immutable outside of trickery (ie: the debug library). You create a function in an environment; once created there, that's the environment it has. Forever.
This is how environments were often used in Lua 5.1, but it was easy and sanctioned to modify the environment of anything with a casual function call. And if your Lua interpreter removed setfenv (to prevent users from breaking the sandbox), then the user code can't set the environment for their own functions internally. So the outside world gets a sandbox, but the inside world can't have a sandbox within the sandbox.
The Lua 5.2 mechanism makes it harder to modify the environment post function-creation, but it does allow you to set the environment during creation. Which lets you sandbox inside the sandbox.
So what you really want is to just rearrange your code like this:
local foo;
do
local _ENV = { print = print }
function foo()
print('env', _ENV)
bar = 1
end
end
foo is now sandboxed. And now, it's much harder for someone to break the sandbox.
As you can imagine, this has caused some contention among Lua developers.
It's a bit expensive, but if it's that important to you...
Why not use string.dump, and re-load the function into the right environment?
function setfenv(f, env)
return load(string.dump(f), nil, nil, env)
end
function foo()
herp(derp)
end
setfenv(foo, {herp = print, derp = "Hello, world!"})()
To recreate setfenv/getfenv in Lua 5.2 you can do the following:
if not setfenv then -- Lua 5.2
-- based on http://lua-users.org/lists/lua-l/2010-06/msg00314.html
-- this assumes f is a function
local function findenv(f)
local level = 1
repeat
local name, value = debug.getupvalue(f, level)
if name == '_ENV' then return level, value end
level = level + 1
until name == nil
return nil end
getfenv = function (f) return(select(2, findenv(f)) or _G) end
setfenv = function (f, t)
local level = findenv(f)
if level then debug.setupvalue(f, level, t) end
return f end
end
RPFeltz's answer (load(string.dump(f)...)) is a clever one and may work for you, but it doesn't deal with functions that have upvalues (other than _ENV).
There is also compat-env module that implements Lua 5.1 functions in Lua 5.2 and vice versa.
In Lua5.2 a sandboxeable function needs to specify that itself. One simple pattern you can use is have it receive _ENV as an argument
function(_ENV)
...
end
Or wrap it inside something that defines the env
local mk_func(_ENV)
return function()
...
end
end
local f = mk_func({print = print})
However, this explicit use of _ENV is less useful for sandboxing, since you can't always assume the other function will cooperate by having an _ENV variable. In that case, it depends on what you do. If you just want to load code from some other file then functions such as load and loadfile usually receive an optional environment parameter that you can use for sandboxing. Additionally, if the code you are trying to load is in string format you can use string manipulation to add _ENV variables yourself (say, by wrapping a function with an env parameter around it)
local code = 'return function(_ENV) return ' .. their_code .. 'end'
Finally, if you really need dynamic function environment manipulation, you can use the debug library to change the function's internal upvalue for _ENV. While using the debug library is not usually encouraged, I think it is acceptable if all the other alternatives didn't apply (I feel that in this case changing the function's environment is deep voodoo magic already so using the debug library is not much worse)

lua how require works

I'm using a graphics library that lets you program in Lua. I have a need for the A* pathfinding library so I found one online. It's just 1 lua file that does the pathfinding and 1 example file. In the example file it uses the object like:
-- Loading the library
local Astar = require 'Astar'
Astar(map,1) -- Inits the library, sets the OBST_VALUE to 1
I run the script and everything works. So now I add the Astar.lua file to the path location where my graphics engine is running and do the same thing and I get the error on the Astar(map, 1) line:
"attempt to call local 'AStar' (a number value)
Any ideas why I would be getting that error when I'm doing the same thing as the example that comes with this AStar lib?
Here is a little of the AStar file
-- The Astar class
local Astar = {}
setmetatable(Astar, {__call = function(self,...) return self:init(...) end})
Astar.__index = Astar
-- Loads the map, sets the unwalkable value, inits pathfinding
function Astar:init(map,obstvalue)
self.map = map
self.OBST_VALUE = obstvalue or 1
self.cList = {}
self.oList = {}
self.initialNode = false
self.finalNode = false
self.currentNode = false
self.path = {}
self.mapSizeX = #self.map[1]
self.mapSizeY = #self.map
end
So note that when I run this from my graphics engine it's returning 1, but when run from the example that it came with it's returning a table, which is what it should be returning. So not sure why it would only be returning 1.
How is Astar getting added to the package.loaded table for the example script, as opposed to your code?
QUICK LUA SYNTACTIC SUGAR REVIEW:
func 'string' is equivalent to func('string')
tabl.ident is equivalent to tabl['ident']
When you run a script using require('Astar'), this is what it does:
checks if package.loaded['Astar'] is a non-nil value.
If it is, it returns this value. Otherwise it continues down this list.
Runs through filenames of the patterns listed in package.path (and package.cpath), with '?' replaced with 'Astar', until it finds the first file matching the pattern.
Sets package.loaded['Astar'] to true.
Runs the module script (found via path search above- for the sake of this example we'll assume it's not a C module) with 'Astar' as an argument (accessible as ... in the module script).
If the script returns a value, this value is placed into package.loaded['Astar'].
The contents of package.loaded['Astar'] are returned.
Note that the script can load the package into package.loaded['Astar'] as part of its execution and return nothing.
As somebody noted in the comments above, your problem may come from loading the module using 'AStar' instead of 'Astar'. It's possible that Lua is loading this script using this string (since, on the case-insensitive Windows, a search for a file named "AStar.lua" will open a file called "Astar.lua"), but the script isn't operating with that (by using a hard-coded "Astar" instead of the "AStar" Lua is loading the script under).
You need to add return Astar at the end of Astar.lua.

Resources