Wrap an iterator to iterate over only some of the values - lua

I have a Lua iterator function that I don't have control over:
for x, y, z in otherfunc(stuff) do
...
end
I want to write a generic Lua wrapper to this function that skips over some of the values it returns, hiding this implementation detail from my users:
for x, y, z in myfunc(stuff) do
-- not every x/y/z triplet will appear here
end
Though I happen to have three return values here, how can I handle this generically for any number of return values?

local function check_and_return(my_state, x, ...)
if x == nil or my_state[1](x, ...) then
return x, ...
else
return check_and_return(my_state, my_state[2](my_state[3], x))
end
end
local function my_generator(my_state, prev_x)
return check_and_return(my_state, my_state[2](my_state[3], prev_x))
end
function subiterate(selector, generator, state, init_x)
-- the iterator is stateless
-- no closures created at loop initialization
-- one table is created at loop initialization
-- no tables created on every step
return my_generator, {selector, generator, state}, init_x
end
Usage example:
local function everythird(i,_)
return (i-1)%3==2
end
for i, n in subiterate(everythird, ipairs{'a','b','c','d','e','f'}) do
print(i, n)
end
--> 3 c
--> 6 f

For the specific case of a known iterator function with specific number of return values, and a hard-coded selection criteria, I used this:
function myfunc(...)
local generator, constant, x,y,z in otherfunc(...)
return function()
repeat x,y,z = generator(constant,x,y,z)
until (not x) or x+y+z > 7
if x then return x,y,z end
end
end
The above will call the otherfunc() iterator repeatedly, but only yield x/y/z values when the sum of them is greater than 7.
More generically, here's a function that takes a selection function and an iterator, handling an arbitrary number of return values from the iterator and returning them only when the selector function returns a truthy value:
function subiterate(selector, generator, constant, ...)
local values = {...}
return function()
repeat values = table.pack(generator(constant, table.unpack(values)))
until not values[1] or selector(table.unpack(values))
return table.unpack(values)
end
end
local function everythird(i,_)
return (i-1)%3==2
end
for i, n in subiterate(everythird, ipairs{'a','b','c','d','e','f'}) do
print(i, n)
end
--> 3 c
--> 6 f
I make no guarantees about the speed of the generic function, given all the pack() and unpack() going on. But it does work. I welcome any more efficient way to handle arbitrary numbers of return values for an iterator.

Related

What is the "type signature" of pairs() in Lua?

Looking at the chapter 7.1 – Iterators and Closures from "Programming in Lua" it seems like the the for foo in bar loop takes requires bar to be of type (using Java typesto express it) Supplier<Tuple> and the the for-in will keep calling bar until it returns nil.
So for something like:
for k,v in pairs( tables ) do
print( 'key: '..k..', value: '..v )
end
that implies pairs has a type of Function<Table,Supplier<Tuple>>.
I want to create a function that behaves like pairs except it skips tuples where the first argument starts with an underscore (ie _).
local function mypairs( list )
local --[[ Supplier<Tuple> ]] pairIterator = pairs( list )
return --[[ Supplier<Tuple> ]] function ()
while true do
local key, value = pairIterator()
if key == nil then
return nil
elseif key:sub(1,1) ~= '_' then
return key, value
end
end
end
end
however it doesn't work since
--[[should be: Supplier<Table>]] pairIterator = pairs({ c=3; b=2; a=1 })
when I call it
pairIterator()
it returns
stdin:1: bad argument #1 to 'pairIterator' (table expected, got no value)
stack traceback:
[C]: in function 'pairIterator'
stdin:1: in main chunk
[C]: in ?
but
pairIterator({ c=3; b=2; a=1 })
returns
Lua>pairIterator({ c=3; b=2; a=1 })
c 3
Your basic problem is that you're using Java logic on Lua problems. Java and Lua are different languages with different constructs, and it's important to recognize that.
pairs does not have a return value; it has multiple return values. This is a concept that Java completely lacks. A Tuple is a single value that can store and manipulate multiple values. A Lua function can return multiple values. This is syntactically and semantically distinct from returning a table containing multiple values.
The iterator-based for statement takes multiple values as its input, not a table or container of multiple values. Specifically, it stores 3 values: an iterator function, a state value (which you use to preserve state between calls), and an initial value.
So, if you want to mimic pairs's behavior, you need to be able to store and manipulate its multiple return values.
Your first step is to store what pairs actually returns:
local f, s, var = pairs(list)
You are creating a new iterator function. So you need to return that, but you also need to return the s and var that pairs returns. Your return statement needs to look like this:
return function (s, var)
--[[Contents discussed below]]
end, s, var --Returning what `pairs` would return.
Now, inside your function, you need to call f with s and var. This function will return the key/value pair. And you need to process them correctly:
return function (s, var)
repeat
local key, value = f(s, var)
if(type(key) ~= "string") then
--Non-string types can't have an `_` in them.
--And no need to special-case `nil`.
return key, value
elseif(key:sub(1, 1) ~= '_') then
return key, value
end
until true
end, s, var --Returning what `pairs` would return.
pairs() returns three separate values:
a function to call with parameters (table, key) that returns a key and value
the table you passed to it
the first 'key' value to pass to the function (nil for pairs(), 0 for ipairs())
So something like this:
for k,v in pairs({a=1, b=13, c=169}) do print(k, v) end
Can be done like this:
local f,t,k = pairs({a=1, b=13, c=169})
local v
print('first k: '..tostring(k))
k,v = f(t, k)
while k ~= nil do
print('k: '..tostring(k)..', v: '..tostring(v))
k,v = f(t, k)
end
Results:
first k: nil
k: c, v: 169
k: b, v: 13
k: a, v: 1
And you don't have to take an argument, this has manual if statements for each value:
function mypairs()
-- the function returned should take the table and an index, and
-- return the next value you expect AND the next index to pass to
-- get the value after. return nil and nil to end
local myfunc = function(t, val)
if val == 0 then return 1, 'first' end
if val == 1 then return 2, 'second' end
if val == 2 then return 3, 'third' end
return nil, nil
end
-- returns a function, the table, and the first value to pass
-- to the function
return myfunc, nil, 0
end
for i,v in mypairs() do
print('i: '..tostring(i)..', v: '..tostring(v))
end
-- output:
-- i: 1, v: first
-- i: 2, v: second
-- i: 3, v: third
For your mypairs(list) you can just keep calling the function returned from pairs as long as the key has an underscore to get the next value:
local function mypairs( list )
local f,t,k = pairs(list)
return function(t,k)
local a,b = f(t, k)
while type(a) == 'string' and a:sub(1,1) == '_' do a,b = f(t,a) end
return a,b
end, t, k
end
local list = {a=5, _b=11, c = 13, _d=69}
for k,v in mypairs(list) do print(k, v) end
-- output:
-- c 13
-- a 5
The docs you link to have an iterator that only returns one value and pairs() returns 2, but you could return more if you want. The for ... in ... construct will only execute the body if the first value is non-nil. Here's a version that also returns the keys that were skipped, the body isn't executed if you don't end up with an actual value though so you might not see all the _ keys:
local function mypairs( list )
local f,t,k = pairs(list)
return function(t,k)
local skipped = {}
local a,b = f(t, k)
while type(a) == 'string' and a:sub(1,1) == '_' do
table.insert(skipped, a)
a,b = f(t,a)
end
return a,b,skipped
end, t, k
end
local list = {a=5, _b=11, c = 13, _d=69}
for k,v,skipped in mypairs(list) do
for i,s in ipairs(skipped) do
print('Skipped: '..s)
end
print(k, v)
end

Is there any reason not to unpack simple values in Lua

Let's say I do unpack(4) or unpack("hello world"). Are there any unexpected behaviors to this?
Reason is something like this:
function a(bool)
if bool then
return {1, 2}, "foo"
else
return 1, "foo"
end
end
function b(x, z)
end
function b(x, y, z)
end
i, j = a(???)
b(unpack(i), j) -- is this ok?
unpack(4) will cause an error
attempt to get length of a number value
unpack("hello world") will return
nil nil nil nil nil nil nil nil nil nil nil
so that's not very useful as well.
unpack is meant for unpacking tables. If you'd work with a recent version of Lua you would notice that it is now table.unpack()
Other issues with your code:
Lua does not support overloading functions. Functions are variables.
You write:
function b(x, z)
end
function b(x, y, z)
end
The first definition is lost once the second definition is processed.
If you use the another notation it will be more clear.
Your code is equivalent to
b = function (x, z)
end
b = function (x, y, z)
end
and I think you will agree that after
b = 3
b = 4
b will be 4. Same principle...
You can amend the unpack standard function to obtain the desired behavior:
local old_unpack = table.unpack or unpack
local function new_unpack(list, ...)
if type(list) ~= "table" then
list = {list}
end
return old_unpack(list, ...)
end
table.unpack = new_unpack
unpack = new_unpack
-- Usage:
print(unpack(4))
print(unpack("hello world"))
print(unpack(nil)) -- ops! nothing is printed!

When sethook is set to an empty function, is it considerable performance hit?

I'm writing small profiling library for my lua code based on hooks, because I cannot use any of the existing ones (company policies).
I'm considering if it makes sense to allow always working on-demand profiling for all of my scripts just by setting variable to true, eg.
function hook(event)
if prof_enabled then
do_stuff()
end
end
--(in main context)
debug.sethook(hook, "cr")
So the question is, should I expect a significant performance hit if prof_enabled = false and hook is always set? I'm not expecting exact answer, but rather some insights (maybe, for example lua interpreter will optimize it anyway?)
I know that the best solution would be to only set the hook when it's required, but I cannot do that here.
There are ways of doing that without hook functions. One way is to replace functions of interest with a "generated" function that does profiling things (like count number of calls) and then calls the "real" function. Like this:
-- set up profiling
local profile = {}
-- stub function to profile an existing function
local function make_profile_func (fname, f)
profile [fname] = { func = f, count = 0 }
return function (...)
profile [fname].count = profile [fname].count + 1
local results = { f (...) } -- call original function
return unpack (results) -- return results
end -- function
end -- make_profile_func
function pairsByKeys (t, f)
local a = {}
-- build temporary table of the keys
for n in pairs (t) do
table.insert (a, n)
end
table.sort (a, f) -- sort using supplied function, if any
local i = 0 -- iterator variable
return function () -- iterator function
i = i + 1
return a[i], t[a[i]]
end -- iterator function
end -- pairsByKeys
-- show profile, called by alias
-- non-profiled functions
local npfunctions = {
string_format = string.format,
string_rep = string.rep,
string_gsub = string.gsub,
table_insert = table.insert,
table_sort = table.sort,
pairs = pairs,
}
function show_profile (name, line, wildcards)
print (npfunctions.string_rep ("-", 20), "Function profile - alpha order",
npfunctions.string_rep ("-", 20))
print ""
print (npfunctions.string_format ("%25s %8s", "Function", "Count"))
print ""
for k, v in pairsByKeys (profile) do
if v.count > 0 then
print (npfunctions.string_format ("%25s %8i", k, v.count))
end -- if
end -- for
print ""
local t = {}
for k, v in npfunctions.pairs (profile) do
if v.count > 0 then
npfunctions.table_insert (t, k)
end -- if used
end -- for
npfunctions.table_sort (t, function (a, b) return profile [a].count > profile [b].count end )
print (npfunctions.string_rep ("-", 20), "Function profile - count order",
npfunctions.string_rep ("-", 20))
print ""
print (npfunctions.string_format ("%25s %8s", "Function", "Count"))
print ""
for _, k in ipairs (t) do
print (npfunctions.string_format ("%25s %8i", k, profile [k].count))
end -- for
print ""
end -- show_profile
-- replace string functions by profiling stub function
for k, f in pairs (string) do
if type (f) == "function" then
string [k] = make_profile_func (k, f)
end -- if
end -- for
-- test
for i = 1, 10 do
string.gsub ("aaaa", "a", "b")
end -- for
for i = 1, 20 do
string.match ("foo", "f")
end -- for
-- display results
show_profile ()
In this particular test I called string.gsub 10 times and string.match 20 times. Now the output is:
-------------------- Function profile - alpha order --------------------
Function Count
gsub 10
match 20
-------------------- Function profile - count order --------------------
Function Count
match 20
gsub 10
You could do other things, like time functions, if you had a high-precision timer.
I adapted this from a post on the MUSHclient forum where I also got precise timings.
What make_profile_func does is keep a copy of the original function pointer in an upvalue, and return a function that adds one to a call count, calls the original function, and returns its results. The beauty of this is that you can omit the overhead by simply not replacing the functions with their generated counterparts.
In other words, omit these lines and there will be no overhead:
for k, f in pairs (string) do
if type (f) == "function" then
string [k] = make_profile_func (k, f)
end -- if
end -- for
There is a bit of fiddling around in the code to use non-profiled versions of the functions that are used in generating the profiles (otherwise you get a slightly misleading reading).

what is actual implementation of lua __pairs?

Does anybody know actual implementation of lua 5.2. metamethod __pairs? In other words, how do I implement __pairs as a metamethod in a metatable so that it works exactly same with pairs()?
I need to override __pairs and want to skip some dummy variables that I add in a table.
The following would use the metatable meta to explicitly provide pairs default behavior:
function meta.__pairs(t)
return next, t, nil
end
Now, for skipping specific elements, we must replace the returned next:
function meta.__pairs(t)
return function(t, k)
local v
repeat
k, v = next(t, k)
until k == nil or theseok(t, k, v)
return k, v
end, t, nil
end
For reference: Lua 5.2 manual, pairs
The code below skips some entries. Adapt as needed.
local m={
January=31, February=28, March=31, April=30, May=31, June=30,
July=31, August=31, September=30, October=31, November=30, December=31,
}
setmetatable(m,{__pairs=
function (t)
local k=nil
return
function ()
local v
repeat k,v=next(t,k) until v==31 or k==nil
return k,v
end
end})
for k,v in pairs(m) do print(k,v) end

In Lua, how to get the tail of an array without copying it?

I'm wokring with Lua 5.2, and for the sake of this question, assume that the tables are used exclusively as arrays.
Here's a function that returns the tail of an array (the array minus its first element):
function tail(t)
if # t <= 1 then
return nil
end
local newtable = {}
for i, v in ipairs(t) do
if i > 1 then
table.insert(newtable, v)
end
end
return newtable
end
For instance:
prompt> table.concat(tail({10, 23, 8}), ", ")
23, 8
However this is achieved by returning a new copy of the table. Is there a way to avoid the creation of a new table?
I am looking for the equivalent of C's returning a pointer to the next element (t++). Is it possible?
As already explained, this is normally impossible.
However, using metatables, you could implement a tail function that performs what you want without copying all the data, by referencing the original table. The following works for most operations in Lua 5.2, but for example not for table.concat:
function tail(t)
return setmetatable({}, {
__index = function(_, k) return t[k+1] end,
__newindex = function(_, k, v) t[k+1] = v end,
__len = function(_) return #t-1 end,
__ipairs = function(_) return
function(_, i)
if i+1==#t then return nil end
return i+1, t[i+2] end,
t, 0 end,
__pairs = function(t) return ipairs(t) end,
})
end
This is the nicest way I know to implement tail(). It makes one new table, but I don't think that's avoidable.
function tail(list)
return { select(2, unpack(list)) }
end
Nicol is correct that you can't reference a slice of an array, but there is an easier/shorter way to do what you want to do:
function tail(t)
local function helper(head, ...) return #{...} > 0 and {...} or nil end
return helper((table.unpack or unpack)(t))
end
print(table.concat(tail({10, 23, 8}), ", ")) will then print 23,8.
(added table.unpack or unpack to make it also work with Lua 5.2)
I am looking for the equivalent of C's returning a pointer to the next element (t++). Is it possible?
No. The only possible reason you could want this is performance. Such a feature is only found in low-level programming languages. Lua is a scripting language: performance is not such a priority that this would be implemented.
Just make another table like you're doing, or use table.remove to modify the original. Whichever works best for you. Remember: the big, important objects like tables and userdata are all stored by reference in Lua, not by value.
prapin's suggestion, to use metatables to present a view of the sequence, is roughly the way I'd do it. An abstraction that might help is defining a metatable for segments, which can be an 0-ary function that returns a pair of a table and an offset index - we are only using functions here to represent tuples. We can then define a metatable that makes this function behave like a table:
do
local tail_mt = {
__index = function(f, k) local t, i=f(); return t[k+i] end,
__newindex = function(f, k, v) local t,i=f(); t[k+1] = v end,
__len = function(f) local t,i=f(); return #t-i end,
__ipairs = function(f)
local t,i = f ()
return
function (_, j)
if i+j>=#t then
return nil
else
return j+1, t[i+j+1]
end
end, nil, 0
end,
}
tail_mt.__pairs = tail_mt.__ipairs -- prapin collapsed this functionality, so I do too
function tail (t)
if type(t) == "table" then
return setmetatable ( function () return t, 1 end, tail_mt )
elseif type(t) == "function" then
local t1, i = t ()
return setmetatable ( function () return t1, i+1 end, tail_mt )
end
end
end
With __index and __newindex metamethods, you can write code such as f[2]=f[1]+1.
Although this (untested) code doesn't endlessly create one-off metatables, it is probably less efficient than prapin's, since it will be calling thunks (0-ary functions) to get at their contents. But if you might be interested in extending the functionality, say by having more general views on the sequence, I think this is a bit more flexible.

Resources