redis lua script vs. single calls - lua

I have the following setup:
2 different datastructures: Sets, Strings
They are in different namespaces *:collections:*, *:resources:*
The client doesn't know about this and I try to get both namespaces every time.
Based on exists I decide which datastructure to get finally.
all calls to redis are done asynchronous (vert.x redis-mod)
Now I have to decide if I execute this as lua script or as single commands.
The lua script I came up with:
local path = KEYS[1]
local resourcesPrefix = ARGV[1]
local collectionsPrefix = ARGV[2]
if redis.call('exists',resourcesPrefix..path) == 1 then
return redis.call('get',resourcesPrefix..path)
elseif redis.call('exists',collectionsPrefix..path) == 1 then
return redis.call('smembers',collectionsPrefix..path)
else
return "notFound"
end
Are there any pros and cons for single calls or lua script?

Yes, LUA script is a best solution in case of EVALSHA call:
You are working woth redis asynchronous. So LUA helps you to reduce number of code and code readability.
LUA case is faster becouse of reduce network communication.
I think you can write your code with just 2 commands. You do not need exists in your code.
local path = KEYS[1]
local resourcesPrefix = ARGV[1]
local collectionsPrefix = ARGV[2]
local ret
set ret = redis.call('get',resourcesPrefix..path)
if ret then
return ret
end
set ret = redis.call('smembers',collectionsPrefix..path)
if ret then
return ret
end
return "notFound"

It looks like a good use of Redis LUA scripting to me. The execution time will be short (long scripts are to be avoided as they are blocking). It avoids doing multiple calls, so reduces the total network communication time. So I think it's a better solution than single calls, if you make many of these calls. Especially if you use EVALSHA to have the script cached in Redis.

Related

How do I unit test Lua modules without OO?

In Lua I can write a simple module like so
local database = require 'database'
local M = {}
function M:GetData()
return database:GetData()
end
return M
Which when required, will load once, and all future versions will load the same copy.
If I wanted to take an object-oriented approach I could do something like:
local M = {}
M.__index = M
function M:GetData()
return self.database:GetData()
end
return function(database)
local newM = setmetatable({}, M)
newM.database = database
return newM
end
Where M is only loaded once, and each copy of newM just holds its own data and uses the methods of the original M.
When it comes to testing, with the OO approach I can just pass in a fake version of 'database' and check it gets called, but with the first approach I can't.
So my question is how can I make the first approach support DI/testing without making it class-like?
My thought was to wrap it in a closure something like this:
local mClosure = function(database)
local M = {}
function M:GetData()
return database:GetData()
end
return M
end
return mClosure
but then every time it is called it will create a new copy of M, so it will lose the benefits of both of the previous approaches.
That's clearly a use case for the Lua debug library. With that you can just modify the upvalues of your function and inject dependancies. Also consider that you can use require for this; just require your database module once, create small table that collects data and then redirects to the original module and put it in package.loaded so the next time you require it, the require call returns the modified version of the module. The OO approach is how you would do this kind of thing in a language like Ruby, but in Lua we have way nicer ways of tapping into a module or function without it being specifically designed for that purpose.
local real_db = require 'db'
local fake_db = setmetatable({}, {__index=db})
function fake_db.exec(query) print('running query: '..query) end -- dummy function
function fake_db.something(...) print('doing something'); real_db.something(...) end
package.loaded.db = fake_db
require 'my_tests' -- this in turn requires 'db', but gets the fake one
package.loaded.db = real_db
-- After this point, `require 'db'` will return the original module

Jedis/Redis SocketTimeout exception on Lua scripts

We are using lua scripts to perform batch deletes of data on updates to our DB. Jedis executes the lua script using a pipeline.
local result = redis.call('lrange',key,0,12470)
for i,k in ipairs(result) do
redis.call('del',k)
redis.call('ltrim',key,1,k)
end
try (Jedis jedis = jedisPool.getResource()) {
Pipeline pipeline = jedis.pipelined();
long len = jedis.llen(table);
String script = String.format(DELETE_LUA_SCRIPT, table, len);
LOGGER.info(script);
pipeline.eval(script);
pipeline.sync();
} catch (JedisConnectionException e) {
LOGGER.info(e.getMessage());
}
For large ranges we notice that the lua scripts slow down and we get SocketTimeOutExceptions.
running redis-cli slowlog displays only the lua scripts that have taken too long to execute.
Is there a better way to do this? is my lua script blocking?
When I use just pipeline to do the batch deletes, the slowlog also returns slow queries.
try (Jedis jedis = jedisPool.getResource()) {
Pipeline pipeline = jedis.pipelined();
long len = jedis.llen(table);
List<String> queriesContainingTable = jedis.lrange(table,0,len);
if(queriesContainingTable.size() > 0) {
for (String query: queriesContainingTable) {
pipeline.del(query);
pipeline.lrem(table,1,query);
}
pipeline.sync();
}
} catch (JedisConnectionException e) {
LOGGER.info("CACHE INVALIDATE FAIL:"+e.getMessage());
}
slowlog is capable of storing top 128 slowlogs alone (can be changed in redis.conf slowlog-max-len 128). So your 1st model of using LUA script is surely a blocking one.
If you delete such a number (12470) one by one it is surely a blocking one as it take more time to complete. Out of the 2 models 2nd one is fine for me (using pipeline), because you avoid the iteration all you do is hitting del query for n times.
You can use del of multiple keys for every 100 or 1000 (whichever you feel as optimal after a small testing). You can group them to a pipeline altogether.
Or if you can do the same without atomicity, you can delete every 100 or 1000 keys at once in a loop, so that it wouldn't be a blocking call.
Try out with different combinations take the metrics and go with the optimized one.

How to run multiple instances of a lua script from C

I have the following lua script :
mydata={}
function update(val)
mydata["x"] = val
if (val == 10)
-- Call C-Api(1)
else
--Register callback with C when free or some event happens
register_callback(callme)
end
function callme()
end
Basically I would like to have two instances of this script running in my C program/process with out having to create a new LUA state per script. And I want to call the function update() with val = 10 from the one instance and call the function update() with val = 20 from another instance. From the second instance it registers a callback function and just waits to be called.
Basically the script file is kind of a RULE that i am trying to achieve.
Several events on the system can trigger this rule or script file. I would like to process this rule as per the event that triggered it. There could be multiple events triggering this script to run at the same time. So I need to have multiple instances of this script running differentiated by the kind of event that triggered it.
To Summarize, I want each caller to have separate individual instance of mydata
I would like to achieve something like this. I read some where that we should be able to run multiple instances of a lua script with out having to create a new lua instance by loading a new environment before loading the script
But I am not able to find the exact details.
Could some body help ?
While I'm still not sure what exactly you are trying to achieve, but if you want to have two instances of the same function that keep the data they use private, you just need to create a closure and return an anonymous function that your C code will use.
Something like this should work:
function createenv(callme)
local mydata={}
return function (val) -- return anonymous function
mydata["x"] = val
if (val == 10)
-- Call C-Api(1)
else
--Register callback with C when free or some event happens
register_callback(callme)
end
end
end
Now in one part of your (C or Lua) code you can do:
local update = createenv(function() --[[do whatever]] end)
update(10)
And then in another part you can do:
local update = createenv(function() --[[do something else]] end)
update(20)
And they should have nothing in common between each other. Note that they still shared the same Lua state, but their instances of mydata will be independent of each other.

sanitizing a Lua table input

Let's say I want a Lua table that will be provided from a third party, not totally reliable, from a file or other IO source.
I get the table as a string, like "{['valid'] = 10}" and I can load it as
externalTable = loadstring("return " .. txtTable)()
But this opens a breach to code injection, ie.: txtTable = os.execute('rm -rf /')
So I did this sanitizing function:
function safeLoadTable(txtTable)
txtTable = tostring(txtTable)
if (string.find(txtTable, "(", 1, true))
then return nil end
local _start = string.find(txtTable, "{", 1, true)
local _end = string.find(string.reverse(txtTable), "}", 1, true)
if (_start == nil or _end == nil)
then return nil end
txtTable = string.sub(txtTable, _start, #txtTable - _end + 1)
print("cropped to ", txtTable)
local pFunc = loadstring("return " .. txtTable)
if (pFunc) then
local _, aTable = pcall(pFunc)
return aTable
end
end
In the worst case it should return nil.
Can this be considered safe against a "regular bad-intentioned person" :)
You could run the unsafe code in a sandbox.
Here is how a simple sandbox could look in Lua 5.1 (error handling omitted for brevity):
local script = [[os.execute("rm -rf /")]]
local env = { print=print, table=table, string=string }
local f, err = loadstring(script)
if err then
-- handle syntax error
end
setfenv(f, env)
local status, err = pcall(f)
if not status then
-- handle runtime error
end
In Lua 5.2 you can load the script into it's own environment using the load function.
The result would be a runtime error returned from pcall:
attempt to index global 'os' (a nil value)
EDIT
As Lorenzo Donati pointed out in the comments this is not a complete solution to stop rogue scripts. It essentially allows you to white-list functions and tables that are approved for user scripts.
For more info about handling rogue scripts I would suggest this SO question:
Embedded Lua - timing out rogue scripts (e.g. infinite loop) - an example anyone?
I don't think it is safe. Try this:
print(safeLoadTable [[{ foo = (function() print"yahoo" end)() } ]])
EDIT
or this, for more fun:
print(safeLoadTable [[{ foo = (function() print(os.getenv "PATH") end)() } ]])
I won't suggest the alternative of replacing that os.getenv with os.execute, though. :-)
The problem is not easy to solve. Code injection avoidance is not at all simple in this case because you are executing a piece of Lua code when doing that loadstring. No simple string matching technique is really safe. The only secure way would be to implement a parser for a subset of the Lua table syntax and use that parser on the string.
BTW, even Lua team stripped off the bytecode verifier from Lua 5.2 since they discovered that it was amenable to attacks, and bytecode is a far simpler language than Lua source code.
I created sandbox.lua for exactly this purpose. It'll handle both insecure stuff as well as DOS-type attacks, assuming that your environment has access to the debug facility.
https://github.com/kikito/sandbox.lua
Note that for now it is Lua 5.1-compatible only.
Running in sandbox isn't safe, inspecting source code is not very simple. An idea: inspect bytecode!
Emmm, actually that's not very simple either, but here is a lazy implementation: http://codepad.org/mGqQ0Y8q

Recreating setfenv() in Lua 5.2

How can I recreate the functionality of setfenv in Lua 5.2? I'm having some trouble understanding exactly how you are supposed to use the new _ENV environment variable.
In Lua 5.1 you can use setfenv to sandbox any function quite easily.
--# Lua 5.1
print('_G', _G) -- address of _G
local foo = function()
print('env', _G) -- address of sandbox _G
bar = 1
end
-- create a simple sandbox
local env = { print = print }
env._G = env
-- set the environment and call the function
setfenv(foo, env)
foo()
-- we should have global in our environment table but not in _G
print(bar, env.bar)
Running this example shows an output:
_G table: 0x62d6b0
env table: 0x635d00
nil 1
I would like to recreate this simple example in Lua 5.2. Below is my attempt, but it does not work like the above example.
--# Lua 5.2
local function setfenv(f, env)
local _ENV = env or {} -- create the _ENV upvalue
return function(...)
print('upvalue', _ENV) -- address of _ENV upvalue
return f(...)
end
end
local foo = function()
print('_ENV', _ENV) -- address of function _ENV
bar = 1
end
-- create a simple sandbox
local env = { print = print }
env._G = env
-- set the environment and call the function
foo_env = setfenv(foo, env)
foo_env()
-- we should have global in our envoirnment table but not in _G
print(bar, env.bar)
Running this example shows the output:
upvalue table: 0x637e90
_ENV table: 0x6305f0
1 nil
I am aware of several other questions on this subject, but they mostly seem to be dealing with loading dynamic code (files or string) which work quite well using the new load function provided in Lua 5.2. Here I am specifically asking for a solution to run arbitrary functions in a sandbox. I would like to do this without using the debug library. According to the Lua documentation we should not have to rely on it.
You cannot change the environment of a function without using the debug library from Lua in Lua 5.2. Once a function has been created, that is the environment it has. The only way to modify this environment is by modifying its first upvalue, which requires the debug library.
The general idea with environments in Lua 5.2 is that the environment should be considered immutable outside of trickery (ie: the debug library). You create a function in an environment; once created there, that's the environment it has. Forever.
This is how environments were often used in Lua 5.1, but it was easy and sanctioned to modify the environment of anything with a casual function call. And if your Lua interpreter removed setfenv (to prevent users from breaking the sandbox), then the user code can't set the environment for their own functions internally. So the outside world gets a sandbox, but the inside world can't have a sandbox within the sandbox.
The Lua 5.2 mechanism makes it harder to modify the environment post function-creation, but it does allow you to set the environment during creation. Which lets you sandbox inside the sandbox.
So what you really want is to just rearrange your code like this:
local foo;
do
local _ENV = { print = print }
function foo()
print('env', _ENV)
bar = 1
end
end
foo is now sandboxed. And now, it's much harder for someone to break the sandbox.
As you can imagine, this has caused some contention among Lua developers.
It's a bit expensive, but if it's that important to you...
Why not use string.dump, and re-load the function into the right environment?
function setfenv(f, env)
return load(string.dump(f), nil, nil, env)
end
function foo()
herp(derp)
end
setfenv(foo, {herp = print, derp = "Hello, world!"})()
To recreate setfenv/getfenv in Lua 5.2 you can do the following:
if not setfenv then -- Lua 5.2
-- based on http://lua-users.org/lists/lua-l/2010-06/msg00314.html
-- this assumes f is a function
local function findenv(f)
local level = 1
repeat
local name, value = debug.getupvalue(f, level)
if name == '_ENV' then return level, value end
level = level + 1
until name == nil
return nil end
getfenv = function (f) return(select(2, findenv(f)) or _G) end
setfenv = function (f, t)
local level = findenv(f)
if level then debug.setupvalue(f, level, t) end
return f end
end
RPFeltz's answer (load(string.dump(f)...)) is a clever one and may work for you, but it doesn't deal with functions that have upvalues (other than _ENV).
There is also compat-env module that implements Lua 5.1 functions in Lua 5.2 and vice versa.
In Lua5.2 a sandboxeable function needs to specify that itself. One simple pattern you can use is have it receive _ENV as an argument
function(_ENV)
...
end
Or wrap it inside something that defines the env
local mk_func(_ENV)
return function()
...
end
end
local f = mk_func({print = print})
However, this explicit use of _ENV is less useful for sandboxing, since you can't always assume the other function will cooperate by having an _ENV variable. In that case, it depends on what you do. If you just want to load code from some other file then functions such as load and loadfile usually receive an optional environment parameter that you can use for sandboxing. Additionally, if the code you are trying to load is in string format you can use string manipulation to add _ENV variables yourself (say, by wrapping a function with an env parameter around it)
local code = 'return function(_ENV) return ' .. their_code .. 'end'
Finally, if you really need dynamic function environment manipulation, you can use the debug library to change the function's internal upvalue for _ENV. While using the debug library is not usually encouraged, I think it is acceptable if all the other alternatives didn't apply (I feel that in this case changing the function's environment is deep voodoo magic already so using the debug library is not much worse)

Resources