store results from a nested foreach loop in a FBM class - foreach

I am working with a big shared memory matrix of 1.3e6x1.3e6 in a foreach loop. I create that matrix with FBM function of bigstatsr package.
I need the results of the loop in the FBM class object to not run out of RAM memory.
This is what I want to do without FBM class object.
library(doParallel)
library(foreach)
library("doFuture")
cl=makeCluster(2)
registerDoParallel(cl
)
registerDoFuture()
plan(multicore)
results=foreach(a=1:4,.combine='cbind') %dopar% {
a=a-1
foreach(b=1:2,.combine='c') %dopar% {
return(10*a + b)
}
}
And this is how I try it
library(bigstatsr)
results=FBM(4,4,init=0)
saveinFBM=function(x,j){results[,j]=x}
foreach(a=1:4,.combine='savinFBM') %dopar% {
a=a-1
foreach(b=1:2,.combine='c') %dopar% {
return(10*a + b)
}
}
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'savinFBM' of mode 'function' was not found
PS: Could anybody add the tag "dofuture"?

If I understand correctly what you want to do, a faster alternative is using outer(1:2, 1:4, function(b, a) 10 * (a - 1) + b).
If you want to fill an FBM with this function, you can do:
library(bigstatsr)
X <- FBM(200, 400)
big_apply(X, a.FUN = function(X, ind) {
X[, ind] <- outer(rows_along(X), ind, function(b, a) 10 * (a - 1) + b)
NULL
})
Usually, using parallelism won't help when you write data on disk (what you do when you fill X[, ind]), but it you really want to try, you can use ncores = nb_cores() as additional argument of big_apply().

Related

Handling error with regressions inside a parallel foreach loop

Hi I am having issues regarding a foreach loop where in every iteration I estimate a regression on a subset of the data with a different list of controls on several outcomes. The problem is that for some outcomes in some countries I only have missing values and therefore the regression function returns an error message. I would like to be able to run the loop, get the output with NAs or a string saying "Error" for example instead of the coefficient table. I tried several things but they don't quite work with the .combine = rbind option and if I use .combine = c I get a very messy output. Thanks in advance for any help.
reg <- function(y, d, c){
if (missing(c))
feols(as.formula(paste0(y, "~ 0 + treatment")), data = d)
else {
feols(as.formula(paste0(y, "~ 0 + treatment + ", c)), data = d)
}
}
# Here we set up the parallelization to run the code on the server
n.cores <- 9 #parallel::detectCores() - 1
#create the cluster
my.cluster <- parallel::makeCluster(
n.cores,
type = "PSOCK"
)
# print(my.cluster)
#register it to be used by %dopar%
doParallel::registerDoParallel(cl = my.cluster)
# #check if it is registered (optional)
# foreach::getDoParRegistered()
# #how many workers are available? (optional)
# foreach::getDoParWorkers()
# Here is the cycle to parallel regress each outcome on the global treatment
# variable for each RCT with strata control
tables <- foreach(
n = 1:9, .combine = rbind, .packages = c('data.table', 'fixest'),
.errorhandling = "pass"
) %dopar% {
dt_target <- dt[country == n]
c <- controls[n]
est <- lapply(outcomes, function(x) reg(y = x, d = dt_target, c))
table <- etable(est, drop = "!treatment", cluster = "uid", fitstat = "n")
table
}

Destructured iteration over variadic arguments like a tuple sequence in D

Let's say I want to process a variadic function which alternately gets passed start and end values of 1 or more intervals and it should return a range of random values in those intervals. You can imagine the input to be a flattened sequence of tuples, all tuple elements spread over one single range.
import std.meta; //variadic template predicates
import std.traits : isFloatingPoint;
import std.range;
auto randomIntervals(T = U[0], U...)(U intervals)
if (U.length/2 > 0 && isFloatingPoint!T && NoDuplicates!U.length == 1) {
import std.random : uniform01;
T[U.length/2] randomValues;
// split and iterate over subranges of size 2
foreach(i, T start, T end; intervals.chunks(2)) { //= intervals.slide(2,2)
randomValues[i] = uniform01 * (end - start) + start,
}
return randomValues.dup;
}
The example is not important, I only use it for explanation. The chunk size could be any finite positive size_t, not only 2 and changing the chunk size should only require changing the number of loop-variables in the foreach loop.
In this form above it will not compile since it would only expect one argument (a range) to the foreach loop. What I would like is something which rather automatically uses or infers a sliding-window as a tuple, derived from the number of given loop-variables, and fills the additional variables with next elements of the range/array + allows for an additional index, optionally. According to the documentation a range of tuples allows destructuring of the tuple elements in place into foreach-loop-variables so the first thing, I thought about, is turning a range into a sequence of tuples but didn't find a convenience function for this.
Is there a simple way to loop over destructured subranges (with such a simplicity as shown in my example code) together with the index? Or is there a (standard library) function which does this job of splitting a range into enumerated tuples of equal size? How to easily turn the range of subranges into a range of tuples?
Is it possible with std.algorithm.iteration.map in this case (EDIT: with a simple function argument to map and without accessing tuple elements)?
EDIT: I want to ignore the last chunk which doesn't fit into the entire tuple. It just is not iterated over.
EDIT: It's not, that I couldn't program this myself, I only hope for a simple notation because this use case of looping over multiple elements is quite useful. If there is something like a "spread" or "rest" operator in D like in JavaScript, please let me know!
Thank you.
(Added as a separate answer because it's significantly different from my previous answer, and wouldn't fit in a comment)
After reading your comments and the discussion on the answers thus far, it seems to me what you seek is something like the below staticChunks function:
unittest {
import std.range : enumerate;
size_t index = 0;
foreach (i, a, b, c; [1,2,3,1,2,3].staticChunks!3.enumerate) {
assert(a == 1);
assert(b == 2);
assert(c == 3);
assert(i == index);
++index;
}
}
import std.range : isInputRange;
auto staticChunks(size_t n, R)(R r) if (isInputRange!R) {
import std.range : chunks;
import std.algorithm : map, filter;
return r.chunks(n).filter!(a => a.length == n).map!(a => a.tuplify!n);
}
auto tuplify(size_t n, R)(R r) if (isInputRange!R) {
import std.meta : Repeat;
import std.range : ElementType;
import std.typecons : Tuple;
import std.array : front, popFront, empty;
Tuple!(Repeat!(n, ElementType!R)) result;
static foreach (i; 0..n) {
result[i] = r.front;
r.popFront();
}
assert(r.empty);
return result;
}
Note that this also deals with the last chunk being a different size, if only by silently throwing it away. If this behavior is undesirable, remove the filter, and deal with it inside tuplify (or don't, and watch the exceptions roll in).
chunks and slide return Ranges, not tuples. Their last element can contain less than the specified size, whereas tuples have a fixed compile time size.
If you need destructuring, you have to implement your own chunks/slide that return tuples. To explicitly add an index to the tuple, use enumerate. Here is an example:
import std.typecons, std.stdio, std.range;
Tuple!(int, int)[] pairs(){
return [
tuple(1, 3),
tuple(2, 4),
tuple(3, 5)
];
}
void main(){
foreach(size_t i, int start, int end; pairs.enumerate){
writeln(i, ' ', start, ' ', end);
}
}
Edit:
As BioTronic said using map is also possible:
foreach(i, start, end; intervals
.chunks(2)
.map!(a => tuple(a[0], a[1]))
.enumerate){
Your question has me a little confused, so I'm sorry if I've misunderstood. What you're basically asking is if foreach(a, b; [1,2,3,4].chunks(2)) could work, right?
The simple solution here is to, as you say, map from chunk to tuple:
import std.typecons : tuple;
import std.algorithm : map;
import std.range : chunks;
import std.stdio : writeln;
unittest {
pragma(msg, typeof([1,2].chunks(2).front));
foreach(a, b; [1,2,3,4].chunks(2).map!(a => tuple(a[0], a[1]))) {
writeln(a, ", ", b);
}
}
At the same time with BioTronic, I tried to code some own solution to this problem (tested on DMD). My solution works for slices (BUT NOT fixed-size arrays) and avoids a call to filter:
import std.range : chunks, isInputRange, enumerate;
import std.range : isRandomAccessRange; //changed from "hasSlicing" to "isRandomAccessRange" thanks to BioTronics
import std.traits : isIterable;
/** turns chunks into tuples */
template byTuples(size_t N, M)
if (isRandomAccessRange!M) { //EDITED
import std.meta : Repeat;
import std.typecons : Tuple;
import std.traits : ForeachType;
alias VariableGroup = Tuple!(Repeat!(N, ForeachType!M)); //Tuple of N repititions of M's Foreach-iterated Type
/** turns N consecutive array elements into a Variable Group */
auto toTuple (Chunk)(Chunk subArray) #nogc #safe pure nothrow
if (isInputRange!Chunk) { //Chunk must be indexable
VariableGroup nextLoopVariables; //fill the tuple with static foreach loop
static foreach(index; 0 .. N) {
static if ( isRandomAccessRange!Chunk ) { // add cases for other ranges here
nextLoopVariables[index] = subArray[index];
} else {
nextLoopVariables[index] = subArray.popFront();
}
}
return nextLoopVariables;
}
/** returns a range of VariableGroups */
auto byTuples(M array) #safe pure nothrow {
import std.algorithm.iteration : map;
static if(!isInputRange!M) {
static assert(0, "Cannot call map() on fixed-size array.");
// auto varGroups = array[].chunks(N); //fixed-size arrays aren't slices by default and cannot be treated like ranges
//WARNING! invoking "map" on a chunk range from fixed-size array will fail and access wrong memory with no warning or exception despite #safe!
} else {
auto varGroups = array.chunks(N);
}
//remove last group if incomplete
if (varGroups.back.length < N) varGroups.popBack();
//NOTE! I don't know why but `map!toTuple` DOES NOT COMPILE! And will cause a template compilation mess.
return varGroups.map!(chunk => toTuple(chunk)); //don't know if it uses GC
}
}
void main() {
testArrayToTuples([1, 3, 2, 4, 5, 7, 9]);
}
// Order of template parameters is relevant.
// You must define parameters implicitly at first to be associated with a template specialization
void testArrayToTuples(U : V[], V)(U arr) {
double[] randomNumbers = new double[arr.length / 2];
// generate random numbers
foreach(i, double x, double y; byTuples!2(arr).enumerate ) { //cannot use UFCS with "byTuples"
import std.random : uniform01;
randomNumbers[i] = (uniform01 * (y - x) + x);
}
foreach(n; randomNumbers) { //'n' apparently works despite shadowing a template parameter
import std.stdio : writeln;
writeln(n);
}
}
Using elementwise operations with the slice operator would not work here because uniform01 in uniform01 * (ends[] - starts[]) + starts[] would only be called once and not multiple times.
EDIT: I also tested some online compilers for D for this code and it's weird that they behave differently for the same code. For compilation of D I can recommend
https://run.dlang.io/ (I would be very surprised if this one wouldn't work)
https://www.mycompiler.io/new/d (but a bit slow)
https://ideone.com (it works but it makes your code public! Don't use with protected code.)
but those didn't work for me:
https://tio.run/#d2 (didn't finish compilation in one case, otherwise wrong results on execution even when using dynamic array for the test)
https://www.tutorialspoint.com/compile_d_online.php (doesn't compile the static foreach)

What does this code mean (if v then return v end)?

So I have this piece of code and it is this:
do
local function index(n,m)
return n*(n+1)//2 + m
end
local binomtable = {}
function binom3(n,m)
if n<0 or m<0 or m>n then return 0 end
if n=0 or m=0 or m=n then return 1 end
local i = index(n,m)
local v = binomtable[i]
if v then return v end
v = binom3(n-1,m-1) + binom3(n-1,m)
binomtable[i] = v
return v
end
end
and I would like to know what
if v then return v end
means.
Thank you!
The short answer is that if v then return v end returns the value v if it is truthy, i.e., if it is neither false nor nil. Otherwise the function continues by calculating a value for v, storing that value in binomtable, and finally returning it. The more interesting question is, why is the function doing all of this?
In the posted code, binom3 is a recursive function. With the recursive calls v = binom3(n-1,m-1) + binom3(n-1,m) there will be a lot of duplication of effort, meaning a lot of wasted space and time. Consider:
binom3(4, 2)
--> binom3(3, 1) + binom3(3, 2)
--> binom3(2, 0) + binom3(2, 1) + binom3(2, 1) + binom3(2, 2)
--> 1 + binom3(1, 0) + binom3(1, 1) + binom3(1, 0) + binom3(1, 1) + 1
Note how in the second reduction there are two identical terms:
binom3(2, 1) + binom3(2, 1)
There is no reason to calculate the term binom3(2, 1) twice, and doing so means that the pair of terms:
binom3(1, 0) + binom3(1, 1)
also must be calculated twice, as seen in the third reduction. It would be smart to calculate binom3(2, 1) only once, and to save the result for later use in the larger calculation. When m and n are larger and the number of calculations explodes exponentially this becomes a very important issue for performance both in the amount of memory required and in the amount of time required.
The posted code is using memoization to improve performance. When a calculation is made, it is stored in the table binomtable. Before any calculation is made, binomtable is consulted. First, v is set to the value of binomtable[i]; if this value is any truthy value (any integer is a truthy in Lua), then that value is simply returned without the need for recursive calculation. Otherwise, if nil is returned (i.e., no value has yet been stored for the calculation), the function continues with a recursive calculation. After completing the calculation, the new value is stored in binomtable for use the next time it is needed. This strategy saves a lot of wasted computational effort, and can make a huge difference in the performance of such recursive algorithms.
For your specific question of what
if v then return v end
means, is that if v, a variable, is not nil or false it is to return the value of the v variable and stop executing that function.
--Similar
function myfunc(input)
local MyVar = "I am a string and am not nil!"
if MyVar then
return "hi"
else
return "hello"
end
print("I am not seen because I am unreachable code!")
end
if this function was called it would always return "hi" instead of "hello" because MyVar is true, because it has a value. Also the print function below that will never get called because it stops executing the function after a return is called.
Now for your codes case it is checking a table to see if it has an entry at a certain index and if it does it returns the value.

Does cloning an iterator copy the entire underlying vector?

I would like to iterate over a vector several times:
let my_vector = vec![1, 2, 3, 4, 5];
let mut out_vector = vec![];
for i in my_vector {
for j in my_vector {
out_vector.push(i * j + i + j);
}
}
The j-loop has a "value used here after move" error. I know that I can place an & before the two my_vectors and borrow the vectors, but it is nice to have more than one way to do things. I would like a little insight as well.
Alternatively, I can write the following:
let i_vec = vec![1, 2, 3, 4, 5, 6];
let iterator = i_vec.iter();
let mut out_vec = vec![];
for i in iterator.clone() {
for j in iterator.clone() {
out_vec.push(i * j + i + j);
}
}
I looked at What's the most efficient way to reuse an iterator in Rust?:
Iterators in general are Clone-able if all their "pieces" are Clone-able.
Is the actual heap allocated data a "piece" of the iterator or is it the memory address that points to the heap data the aforementioned piece?
Cloning a slice iterator (this is the type of iterator you get when calling iter() on a Vec or an array) does not copy the underlying data. Both iterators still point to data stored in the original vector, so the clone operation is cheap.
The same is likely true for clonable iterators on other types.
In your case, instead of calling i_vec.iter() and then cloning it, you can also call i_vec.iter() multiple times:
for i in i_vec.iter() {
for j in i_vec.iter() {
which gives the same result but is probably more readable.

Hiding a Lua metatable and only exposing an object's attributes

How do you create a Lua object that only exposes its attributes and not its methods? For example:
local obj = {
attr1 = 1,
attr2 = 2,
print = function(...)
print("obj print: ", ...)
end,
}
Produces:
> for k,v in pairs(obj) do print(k, v) end
attr1 1
attr2 2
print function: 0x7ffe1240a310
Also, is it possible to not use the colon syntax for OOP in Lua? I don't need inheritance, polymorphism, only encapsulation and privacy.
I started out with the above question and after chasing down the rabbit hole, I was surprised by the limited number of examples, lack of examples for the various metamethods (i.e. __ipairs, __pairs, __len), and how few Lua 5.2 resources there were on the subject.
Lua can do OOP, but IMO the way that OOP is prescribed is a disservice to the language and community (i.e. in such a way as to support polymorphism, multiple inheritance, etc). There are very few reasons to use most of Lua's OOP features for most problems. It doesn't necessarily mean there's a fork in the road either (e.g. in order to support polymorphism there's nothing that says you have to use the colon syntax - you can fold the literature's described techniques in to the closure-based OOP method).
I appreciate that there are lots of ways to do OOP in Lua, but it's irritating to have there be different syntax for object attributes versus object methods (e.g. obj.attr1 vs obj:getAttr() vs obj.method() vs obj:method()). I want a single, unified API to communicate internally and externally. To that end, PiL 16.4's section on Privacy is a fantastic start, but it's an incomplete example that I hope to remedy with this answer.
The following example code:
emulates a class's namespace MyObject = {} and saves the object constructor as MyObject.new()
hides all of the details of the objects inner workings so that a user of an object only sees a pure table (see setmetatable() and __metatable)
uses closures for information hiding (see Lua Pil 16.4 and Object Benchmark Tests)
prevents modification of the object (see __newindex)
allows for methods to be intercepted (see __index)
lets you get a list of all of the functions and attributes (see the 'key' attribute in __index)
looks, acts, walks, and talks like a normal Lua table (see __pairs, __len, __ipairs)
looks like a string when it needs to (see __tostring)
works with Lua 5.2
Here's the code to construct a new MyObject (this could be a standalone function, it doesn't need to be stored in the MyObject table - there is absolutely nothing that ties obj once its created back to MyObject.new(), this is only done for familiarity and out of convention):
MyObject = {}
MyObject.new = function(name)
local objectName = name
-- A table of the attributes we want exposed
local attrs = {
attr1 = 123,
}
-- A table of the object's methods (note the comma on "end,")
local methods = {
method1 = function()
print("\tmethod1")
end,
print = function(...)
print("MyObject.print(): ", ...)
end,
-- Support the less than desirable colon syntax
printOOP = function(self, ...)
print("MyObject:printOOP(): ", ...)
end,
}
-- Another style for adding methods to the object (I prefer the former
-- because it's easier to copy/paste function()'s around)
function methods.addAttr(k, v)
attrs[k] = v
print("\taddAttr: adding a new attr: " .. k .. "=\"" .. v .. "\"")
end
-- The metatable used to customize the behavior of the table returned by new()
local mt = {
-- Look up nonexistent keys in the attrs table. Create a special case for the 'keys' index
__index = function(t, k)
v = rawget(attrs, k)
if v then
print("INFO: Successfully found a value for key \"" .. k .. "\"")
return v
end
-- 'keys' is a union of the methods and attrs
if k == 'keys' then
local ks = {}
for k,v in next, attrs, nil do
ks[k] = 'attr'
end
for k,v in next, methods, nil do
ks[k] = 'func'
end
return ks
else
print("WARN: Looking up nonexistant key \"" .. k .. "\"")
end
end,
__ipairs = function()
local function iter(a, i)
i = i + 1
local v = a[i]
if v then
return i, v
end
end
return iter, attrs, 0
end,
__len = function(t)
local count = 0
for _ in pairs(attrs) do count = count + 1 end
return count
end,
__metatable = {},
__newindex = function(t, k, v)
if rawget(attrs, k) then
print("INFO: Successfully set " .. k .. "=\"" .. v .. "\"")
rawset(attrs, k, v)
else
print("ERROR: Ignoring new key/value pair " .. k .. "=\"" .. v .. "\"")
end
end,
__pairs = function(t, k, v) return next, attrs, nil end,
__tostring = function(t) return objectName .. "[" .. tostring(#t) .. "]" end,
}
setmetatable(methods, mt)
return methods
end
And now the usage:
-- Create the object
local obj = MyObject.new("my object's name")
print("Iterating over all indexes in obj:")
for k,v in pairs(obj) do print('', k, v) end
print()
print("obj has a visibly empty metatable because of the empty __metatable:")
for k,v in pairs(getmetatable(obj)) do print('', k, v) end
print()
print("Accessing a valid attribute")
obj.print(obj.attr1)
obj.attr1 = 72
obj.print(obj.attr1)
print()
print("Accessing and setting unknown indexes:")
print(obj.asdf)
obj.qwer = 123
print(obj.qwer)
print()
print("Use the print and printOOP methods:")
obj.print("Length: " .. #obj)
obj:printOOP("Length: " .. #obj) -- Despite being a PITA, this nasty calling convention is still supported
print("Iterate over all 'keys':")
for k,v in pairs(obj.keys) do print('', k, v) end
print()
print("Number of attributes: " .. #obj)
obj.addAttr("goosfraba", "Satoshi Nakamoto")
print("Number of attributes: " .. #obj)
print()
print("Iterate over all keys a second time:")
for k,v in pairs(obj.keys) do print('', k, v) end
print()
obj.addAttr(1, "value 1 for ipairs to iterate over")
obj.addAttr(2, "value 2 for ipairs to iterate over")
obj.addAttr(3, "value 3 for ipairs to iterate over")
obj.print("ipairs:")
for k,v in ipairs(obj) do print(k, v) end
print("Number of attributes: " .. #obj)
print("The object as a string:", obj)
Which produces the expected - and poorly formatted - output:
Iterating over all indexes in obj:
attr1 123
obj has a visibly empty metatable because of the empty __metatable:
Accessing a valid attribute
INFO: Successfully found a value for key "attr1"
MyObject.print(): 123
INFO: Successfully set attr1="72"
INFO: Successfully found a value for key "attr1"
MyObject.print(): 72
Accessing and setting unknown indexes:
WARN: Looking up nonexistant key "asdf"
nil
ERROR: Ignoring new key/value pair qwer="123"
WARN: Looking up nonexistant key "qwer"
nil
Use the print and printOOP methods:
MyObject.print(): Length: 1
MyObject.printOOP(): Length: 1
Iterate over all 'keys':
addAttr func
method1 func
print func
attr1 attr
printOOP func
Number of attributes: 1
addAttr: adding a new attr: goosfraba="Satoshi Nakamoto"
Number of attributes: 2
Iterate over all keys a second time:
addAttr func
method1 func
print func
printOOP func
goosfraba attr
attr1 attr
addAttr: adding a new attr: 1="value 1 for ipairs to iterate over"
addAttr: adding a new attr: 2="value 2 for ipairs to iterate over"
addAttr: adding a new attr: 3="value 3 for ipairs to iterate over"
MyObject.print(): ipairs:
1 value 1 for ipairs to iterate over
2 value 2 for ipairs to iterate over
3 value 3 for ipairs to iterate over
Number of attributes: 5
The object as a string: my object's name[5]
Using OOP + closures is very convenient when embedding Lua as a facade or documenting an API.
Lua OOP can also be very, very clean and elegant (this is subjective, but there aren't any rules with this style - you always use a . to access either an attribute or a method)
Having an object behave exactly like a table is VERY, VERY useful for scripting and interrogating the state of a program
Is extremely useful when operating in a sandbox
This style does consume slightly more memory per object, but for most situations this isn't a concern. Factoring out the metatable for reuse would address this, though the example code above doesn't.
A final thought. Lua OOP is actually very nice once you dismiss most of the examples in the literature. I'm not saying the literature is bad, btw (that couldn't be further from the truth!), but the set of sample examples in PiL and other online resources lead you to using only the colon syntax (i.e. the first argument to all functions is self instead of using a closure or upvalue to retain a reference to self).
Hopefully this is a useful, more complete example.
Update (2013-10-08): There is one notable drawback to the closure-based OOP style detailed above (I still think the style is worth the overhead, but I digress): each instance must have its own closure. While this is obvious in the above lua version, this becomes slightly problematic when dealing with things on the C-side.
Assume we're talking about the above closure style from the C-side from here on out. The common case on the C side is to create a userdata via lua_newuserdata() object and attach a metatable to the userdata via lua_setmetatable(). On face value this doesn't appear like a problem until you realize that methods in your metatable require an upvalue of the userdata.
using FuncArray = std::vector<const ::luaL_Reg>;
static const FuncArray funcs = {
{ "__tostring", LI_MyType__tostring },
};
int LC_MyType_newInstance(lua_State* L) {
auto userdata = static_cast<MyType*>(lua_newuserdata(L, sizeof(MyType)));
new(userdata) MyType();
// Create the metatable
lua_createtable(L, 0, funcs.size()); // |userdata|table|
lua_pushvalue(L, -2); // |userdata|table|userdata|
luaL_setfuncs(L, funcs.data(), 1); // |userdata|table|
lua_setmetatable(L, -2); // |userdata|
return 1;
}
int LI_MyType__tostring(lua_State* L) {
// NOTE: Blindly assume that upvalue 1 is my userdata
const auto n = lua_upvalueindex(1);
lua_pushvalue(L, n); // |userdata|
auto myTypeInst = static_cast<MyType*>(lua_touserdata(L, -1));
lua_pushstring(L, myTypeInst->str()); // |userdata|string|
return 1; // |userdata|string|
}
Note how the table created with lua_createtable() didn't get associated with a metatable name the same as if you would have registered the metatable with luaL_getmetatable()? This is 100% a-okay because these values are completely inaccessible outside of the closure, but it does mean that luaL_getmetatable() can't be used to look up a particular userdata's type. Similarly, this also means that luaL_checkudata() and luaL_testudata() are also off limits.
The bottom line is that upvalues (such as userdata above) are associated with function calls (e.g. LI_MyType__tostring) and are not associated with the userdata itself. As of now, I'm not aware of a way in which you can associate an upvalue with a value such that it becomes possible to share a metatable across instances.
UPDATE (2013-10-14) I'm including a small example below that uses a registered metatable (luaL_newmetatable()) and also lua_setuservalue()/lua_getuservalue() for a userdata's "attributes and methods". Also adding random comments that have been the source of bugs/hotness that I've had to hunt down in the past. Also threw in a C++11 trick to help with __index.
namespace {
using FuncArray = std::vector<const ::luaL_Reg>;
static const std::string MYTYPE_INSTANCE_METAMETHODS{"goozfraba"}; // I use a UUID here
static const FuncArray MyType_Instnace_Metamethods = {
{ "__tostring", MyType_InstanceMethod__tostring },
{ "__index", MyType_InstanceMethod__index },
{ nullptr, nullptr }, // reserve space for __metatable
{ nullptr, nullptr } // sentinel
};
static const FuncArray MyType_Instnace_methods = {
{ "fooAttr", MyType_InstanceMethod_fooAttr },
{ "barMethod", MyType_InstanceMethod_barMethod },
{ nullptr, nullptr } // sentinel
};
// Must be kept alpha sorted
static const std::vector<const std::string> MyType_Instance___attrWhitelist = {
"fooAttr",
};
static int MyType_ClassMethod_newInstance(lua_State* L) {
// You can also use an empty allocation as a placeholder userdata object
// (e.g. lua_newuserdata(L, 0);)
auto userdata = static_cast<MyType*>(lua_newuserdata(L, sizeof(MyType)));
new(userdata) MyType(); // Placement new() FTW
// Use luaL_newmetatable() since all metamethods receive userdata as 1st arg
if (luaL_newmetatable(L, MYTYPE_INSTANCE_METAMETHODS.c_str())) { // |userdata|metatable|
luaL_setfuncs(L, MyType_Instnace_Metamethods.data(), 0); // |userdata|metatable|
// Prevent examining the object: getmetatable(MyType.new()) == empty table
lua_pushliteral(L, "__metatable"); // |userdata|metatable|literal|
lua_createtable(L, 0, 0); // |userdata|metatable|literal|table|
lua_rawset(L, -3); // |userdata|metatable|
}
lua_setmetatable(L, -2); // |userdata|
// Create the attribute/method table and populate with one upvalue, the userdata
lua_createtable(L, 0, funcs.size()); // |userdata|table|
lua_pushvalue(L, -2); // |userdata|table|userdata|
luaL_setfuncs(L, funcs.data(), 1); // |userdata|table|
// Set an attribute that can only be accessed via object's fooAttr, stored in key "fooAttribute"
lua_pushliteral(L, "foo's value is hidden in the attribute table"); // |userdata|table|literal|
lua_setfield(L, -2, "fooAttribute"); // |userdata|table|
// Make the attribute table the uservalue for the userdata
lua_setuserdata(L, -2); // |userdata|
return 1;
}
static int MyType_InstanceMethod__tostring(lua_State* L) {
// Since we're using closures, we can assume userdata is the first value on the stack.
// You can't make this assumption when using metatables, only closures.
luaL_checkudata(L, 1, MYTYPE_INSTANCE_METAMETHODS.c_str()); // Test anyway
auto myTypeInst = static_cast<MyType*>(lua_touserdata(L, 1));
lua_pushstring(L, myTypeInst->str()); // |userdata|string|
return 1; // |userdata|string|
}
static int MyType_InstanceMethod__index(lua_State* L) {
lua_getuservalue(L, -2); // |userdata|key|attrTable|
lua_pushvalue(L, -2); // |userdata|key|attrTable|key|
lua_rawget(L, -2); // |userdata|key|attrTable|value|
if (lua_isnil(L, -1)) { // |userdata|key|attrTable|value?|
return 1; // |userdata|key|attrTable|nil|
}
// Call cfunctions when whitelisted, otherwise the caller has to call the
// function.
if (lua_type(L, -1) == LUA_TFUNCTION) {
std::size_t keyLen = 0;
const char* keyCp = ::lua_tolstring(L, -3, &keyLen);
std::string key(keyCp, keyLen);
if (std::binary_search(MyType_Instance___attrWhitelist.cbegin(),
MyType_Instance___attrWhitelist.cend(), key))
{
lua_call(L, 0, 1);
}
}
return 1;
}
static int MyType_InstanceMethod_fooAttr(lua_State* L) {
// Push the uservalue on to the stack from fooAttr's closure (upvalue 1)
lua_pushvalue(L, lua_upvalueindex(1)); // |userdata|
lua_getuservalue(L, -1); // |userdata|attrTable|
// I haven't benchmarked whether lua_pushliteral() + lua_rawget()
// is faster than lua_getfield() - (two lua interpreter locks vs one lock + test for
// metamethods).
lua_pushliteral(L, "fooAttribute"); // |userdata|attrTable|literal|
lua_rawget(L, -2); // |userdata|attrTable|value|
return 1;
}
static int MyType_InstanceMethod_barMethod(lua_State* L) {
// Push the uservalue on to the stack from barMethod's closure (upvalue 1)
lua_pushvalue(L, lua_upvalueindex(1)); // |userdata|
lua_getuservalue(L, -1); // |userdata|attrTable|
// Push a string to finish the example, not using userdata or attrTable this time
lua_pushliteral(L, "bar() was called!"); // |userdata|attrTable|literal|
return 1;
}
} // unnamed-namespace
The lua script side of things looks something like:
t = MyType.new()
print(typue(t)) --> "userdata"
print(t.foo) --> "foo's value is hidden in the attribute table"
print(t.bar) --> "function: 0x7fb560c07df0"
print(t.bar()) --> "bar() was called!"
how do you create a lua object that only exposes its attributes and not its methods?
If you don't expose methods in any way, you can't call them, right? Judging from your example, it sounds like what you really want is a way to iterate through the attributes of an object without seeing methods, which is fair.
The simplest approach is just to use a metatable, which puts the methods in a separate table:
-- create Point class
Point = {}
Point.__index = Point
function Point:report() print(self.x, self.y) end
-- create instance of Point
pt = setmetatable({x=10, y=20}, Point)
-- call method
pt:report() --> 10 20
-- iterate attributes
for k,v in pairs(pt) do print(k,v) end --> x 10 y 20
is it possible to not use the colon syntax for OOP in Lua?
You can use closures instead, but then pairs is going to see your methods.
function Point(x, y)
local self = { x=x, y=y}
function pt.report() print(self.x, self.y) end
return self
end
pt = Point(10,20)
pt.report() --> 10 20
for k,v in pairs(pt) do print(k,v) end --> x 10 y 20 report function: 7772112
You can fix the latter problem by just writing an iterator that shows only attributes:
function nextattribute(t, k)
local v
repeat
k,v = next(t, k)
if type(v) ~= 'function' then return k,v end
until k == nil
end
function attributes (t)
return nextattribute, t, nil
end
for k,v in attributes(pt) do print(k,v) end --> x 10 y 20
I don't need inheritance, polymorphism
You get polymorphism for free in Lua, without or without classes. If your zoo has a Lion, Zebra, Giraffe each of which can Eat() and want to pass them to the same Feed(animal) routine, in a statically typed OO languages you'd need to put Eat() in a common base class (e.g. Animal). Lua is dynamically typed and your Feed routine can be passed any object at all. All that matters is that the object you pass it has an Eat method.
This is sometimes called "duck typing": if it quacks like a duck and swims like a duck, it's a duck. As far as our Feed(animal) routine is concerned, if it Eats like an animal, it's an animal.
only encapsulation and privacy.
Then I think exposing data members while hiding methods is the opposite of what you want to do.

Resources