metamethods shadowing problems with luaL_ref key - lua

I have an empty table, whose __newindex and __index metamethods are implemented from the C side. The table is going to be used as an array (t[1]=3, print(t[2])...), with C catching all the accesses.
Now, I want to use luaL_ref to add a reference of another object into this table, just to prevent the second from being thrown away by gc. But I think that the returned reference could shadow the "virtual" indexes that I'm going to use with this table:
For example, I expect t[1]=3 to call __newindex, but if luaL_ref returned 1 then my table would really have an element at key 1, and __newindex wouldn't be called for that key anymore.
I know that luaL_ref is guaranteed to return a key not already used in the table, but since the table is empty (so that my metamethods are always called), I think it actually can return low values, which I'm likely to use.
Are there flaws in this reasoning? If not, how can I work around this?

I would advise not using luaL_ref at all. At least, not on the empty table you're putting your metatable on. Maybe you should reference the object in the metatable itself, or in some other internal table that you store in the registry.
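The shadowing concern from the question can be reproduced in plain Lua. This is only a sketch: rawset stands in for what luaL_ref would do from C when it stores a value at a low integer key, and the anchor field on the metatable is a hypothetical name for the "keep it in the metatable" suggestion.

```lua
local log = {}
local proxy = setmetatable({}, {
  __newindex = function(t, k, v) log[#log + 1] = k end, -- record, store nothing
  __index = function(t, k) return "virtual" end,
})

proxy[1] = 3          -- key 1 is absent, so __newindex runs
rawset(proxy, 2, {})  -- stand-in for luaL_ref placing a value at key 2
proxy[2] = 3          -- key 2 really exists now: plain assignment, no metamethod

print(#log, log[1])        --> 1   1
print(proxy[1], proxy[2])  --> virtual   3

-- Anchoring the extra object on the metatable keeps the proxy itself empty.
local anchored = {}
getmetatable(proxy).anchor = anchored -- "anchor" is a hypothetical field name
```

Because the proxy never gains real keys this way, every access keeps going through the metamethods.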

Related

Why does assigning a table as a value to another table cause problems?

How come we can't intuitively copy tables around in Lua like so:
a = {
a = {},
b = {},
}
b = {}
b = a.b
I've run into some weird bugs doing this. If I use a table clone function like the following, it will work fine, I just don't understand why having to use a clone function is needed/best practice in the first place.
It's hard to describe the bug I've run into when trying to do the first method, but basically, if I try to add additional key-values inside the a.b part of b = a.b, then the additional key-values don't always become what I set them to.
function deepCopy(object)
local lookup_table = {}
local function _copy(object)
if type(object) ~= "table" then
return object
elseif lookup_table[object] then
return lookup_table[object]
end
local new_table = {}
lookup_table[object] = new_table
for index, value in pairs(object) do
new_table[_copy(index)] = _copy(value)
end
return setmetatable(new_table, getmetatable(object))
end
return _copy(object)
end
and then doing the following removes any bugs
b = deepCopy(a.b)
In Lua, a table is a value, and each distinct table has a distinct value. The value of a table is used to identify its contents, but the contents of a table are not conceptually the value of the table. That is, to access the contents of a table, you need the table's value, but the table's value is not the same thing as its contents.
The table's value can be stored in any variable. And again, that value is used to identify that table and to access that table's contents, but that is not the same thing as the value logically being the table's contents.
Consider the following:
tbl1 = { 1, 2, 3 }
tbl2 = tbl1
tbl3 = { 1, 2, 3 }
The value of tbl1 and tbl2 is the same; this means that they both refer to the same table and thus you can access the contents of that table through either variable. So tbl1[2] and tbl2[2] don't simply return 2; they both access the same table.
tbl3 is not the same table as tbl1. They might have contents which are logically identical, but as far as Lua is concerned, they are different tables. Manipulating the contents of the table stored in tbl3 will not affect anyone looking at the tables stored in tbl1 or tbl2.
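A short sketch of the same three variables makes the difference observable: a write through tbl2 is visible through tbl1, while tbl3 is unaffected.

```lua
local tbl1 = { 1, 2, 3 }
local tbl2 = tbl1          -- same table, second name
local tbl3 = { 1, 2, 3 }   -- different table, equal contents

tbl2[2] = "changed"

print(tbl1[2])                     --> changed
print(tbl3[2])                     --> 2
print(tbl1 == tbl2, tbl1 == tbl3)  --> true   false
```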
So, why does storing a table into a variable not copy the table's contents? Several reasons.
Deep copies are expensive. If all copies were deep, you wouldn't even be able to execute a simple return {1, 2, 3} without performing a copy. A pointless copy, because there are no other variables that can talk to that table (since it was created in-situ). Why waste performance? Same goes for passing a table as a parameter to a function or any number of other things.
Deep-copying-only prevents useful things like accessing the same table from different locations. If every table copy was deep, how could you have something as simple as a local copy of a module table? You couldn't have a table "member function" return a table internal to an object that you can then use to manipulate data in that object, because that return would have to copy the table. And thus, the table object would only be mutable through direct member functions.
Deep copying is a useful tool. But it isn't the default because it shouldn't be. Most cases of copying tables don't need it, and users need a way to access a table from multiple locations.
There is no standard function or mechanism for deep copying either. The reason for that is simple: there are many ways to do a deep copy, ranging from the simple to the complex. A naive recursive deep copy, for example, breaks on a table that stores (recursively) itself:
me = { a = 4, other = {} }
me.other.me = me
That is 100% valid, and a copy function that does not remember which tables it has already copied will recurse forever on it. Your deepCopy handles this case through its lookup_table, but that bookkeeping costs time and memory on every copy. Most users don't need a deepCopy that can handle recursive objects.
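One way to cope with the self-referencing table is to remember every table that has already been copied, which is the role lookup_table plays in the question's deepCopy. The version below is a minimal sketch of that idea; the seen parameter is a name chosen here for illustration.

```lua
local function deepCopy(object, seen)
  if type(object) ~= "table" then
    return object
  end
  seen = seen or {}
  if seen[object] then
    return seen[object]       -- already copied: reuse, do not recurse again
  end
  local copy = {}
  seen[object] = copy         -- register before descending into children
  for k, v in pairs(object) do
    copy[deepCopy(k, seen)] = deepCopy(v, seen)
  end
  return setmetatable(copy, getmetatable(object))
end

local me = { a = 4, other = {} }
me.other.me = me

local clone = deepCopy(me)
print(clone ~= me, clone.other.me == clone)  --> true   true
```

The cycle in the copy points at the copy, not back at the original.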
If Lua's standard library had a deep copy function, then either it would handle every such case (and thus be expensive) or it would be a simpler one which could break on any number of corner cases (having multiple references to the same table in a table, etc).
So it's best to make any potential user of a deep copy sit down and decide exactly which cases they want to handle and which they do not.
Variables hold references, not entire tables.
It's far more efficient to copy a reference than an entire table.
A function call effectively assigns the arguments to that function's parameters, so if assignment did a full copy, it would be impossible to write a function that modifies a table.
Usually, when we assign a table to something, we either (a) don't plan on modifying the table, or (b) explicitly intend to use at least one of the variables to modify the underlying table. See the previous point on functions. This means that doing a full copy by default would be a waste of resources.
My advice is to only copy tables when you really need to, and prefer a shallow copy unless you really need a deep copy. In fact, when I need to copy tables, I usually write a specialized copy function so I don't copy any more than I need to.
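For comparison, a shallow copy is only a few lines. In this sketch (shallowCopy is a name chosen here), top-level fields become independent while nested tables stay shared:

```lua
local function shallowCopy(t)
  local copy = {}
  for k, v in pairs(t) do
    copy[k] = v   -- nested tables are copied by reference
  end
  return copy
end

local a = { n = 1, inner = { x = 1 } }
local b = shallowCopy(a)
b.n = 2          -- affects only b
b.inner.x = 99   -- inner is shared, so a sees this too

print(a.n, a.inner.x)  --> 1   99
```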

What is the difference between lua_getmetatable and luaL_getmetatable

Lua API has a function lua_getmetatable which will fetch the table with metafunctions if the value has one.
Lua auxiliary library (which is part of lua API) has another function luaL_getmetatable which is a macro that will fetch a value from LUA_REGISTRYINDEX.
But another function from this library with a similar name, luaL_getmetafield, does a completely different thing: it looks for a field in the metatable that lua_getmetatable would return for the object.
Why are there two different locations?
When is each metatable used?
lua_getmetatable gets the metatable associated with the given object. This is a fundamental feature; if this function didn't exist, there would be no way to access the metatable for a given object.
luaL_getmetatable is part of a convention for giving types to userdata (C objects that can be accessed from Lua) or classes of tables. In this convention you add tables to the registry with luaL_newmetatable, and then use these tables to represent the metatables for different userdata/table types (when you need them you can read them from the registry and set them with luaL_setmetatable).
This is a convenience feature only, and you do not need to follow this convention if you don't want to. Everything will still work if you place the metatables somewhere that isn't in the registry and bind them to your userdata with lua_setmetatable. That said, if the luaL_*metatable functions didn't exist, where would you put the tables that you were using to represent the different userdata/table types, and how would you find them again when you needed them a second time? You could definitely solve this problem in a different way, but why not use the pre-built convention if it works for you?
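The two lookups can be mimicked in plain Lua as an analogy. Everything here is hypothetical scaffolding: registry is an ordinary table standing in for the C registry, and newmetatable and getmetatableByName only imitate luaL_newmetatable and luaL_getmetatable; they are not the real API.

```lua
local registry = {}  -- stand-in for the C registry

-- Imitates luaL_newmetatable: create and store a metatable under a type name.
local function newmetatable(name)
  if registry[name] then
    return registry[name], false  -- name already taken
  end
  local mt = { __name = name }
  registry[name] = mt
  return mt, true
end

-- Imitates luaL_getmetatable: look a metatable up by its type name.
local function getmetatableByName(name)
  return registry[name]
end

local mt = newmetatable("MyLib.Point")
local p = setmetatable({ x = 1, y = 2 }, mt)

-- getmetatable(p) asks the object; getmetatableByName asks the registry.
print(getmetatable(p) == getmetatableByName("MyLib.Point"))  --> true
print(getmetatableByName("MyLib.Missing"))                   --> nil
```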

Accessing Type Metatables Lua

It's obvious that getmetatable can access the metatables of several types:
getmetatable("")
getmetatable({})
getmetatable(newproxy(true))
However it appears as though you cannot get the metatable of other types (functions aside). There appears to be no way to access the metatable of numbers, booleans, or nil.
I was also wondering if one would be able to access a metatable of the entire table type. To be able to do something like this:
({}) + ({})
strings, numbers, nil, functions and lightuserdata have a single metatable for the whole type. tables and full userdata have a metatable for each instance.
from the docs:
Tables and full userdata have individual metatables (although multiple tables and userdata can share their metatables). Values of all other types share one single metatable per type; that is, there is one single metatable for all numbers, one for all strings, etc.
there's no 'table type metatable', just like there's no 'metatable for this string'
the string type has the 'string' table as metatable by default; but you can set the metatable for other types using the debug.setmetatable() function.
Numbers, Booleans and nil have no metatable by default (hence getmetatable returning nil). You can give them one with debug.setmetatable though.
There is no common table metatable. (and same for userdata (at least of the heavy variety))
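A small sketch of these defaults, assuming a stock interpreter with the debug library available. Note that debug.setmetatable on a number changes the metatable for all numbers in the program, so the example restores the default afterwards.

```lua
-- Strings share one metatable whose __index is the string library.
print(getmetatable("").__index == string)  --> true

-- Numbers, booleans and nil have no metatable by default.
print(getmetatable(1), getmetatable(true), getmetatable(nil))  --> nil   nil   nil

-- Give every number a metatable that exposes the math library as methods.
debug.setmetatable(0, { __index = math })
local absViaMethod = (-3):abs()   -- resolved as math.abs(-3)
print(absViaMethod)               --> 3

-- Restore the default so the rest of the program is unaffected.
debug.setmetatable(0, nil)
```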

programming in lua, objects

Sample code:
function Account:new (o)
o = o or {} -- create object if user does not provide one
setmetatable(o, self)
self.__index = self
return o
end
taken from:
http://www.lua.org/pil/16.1.html
What is the purpose of the:
self.__index = self
line? And why is it executed every time an object is created?
As others have said, self (the Account table) is used as a metatable assigned to the objects created using new. Simplifying slightly (more info available at the links provided) when a field is not found in 'o', it goes to the 'Account' table because o's metatable says to go to Account (this is what __index does).
It does not, however, need to be executed every time an object is created. You could just as easily stick this somewhere:
Account.__index = Account
and it would work as well.
The somewhat longer story is that if an object o has a metatable, and that metatable has the __index field set, then a failed field lookup on o will use __index to find the field (__index can be a table or function). If o has the field set, you do not go to its metatable's __index function to get the information. Again, though, I encourage you to read more at the links provided above.
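As a sketch of that variant, with __index set once outside new (the deposit method and the balance default are added here only to have something to look up):

```lua
local Account = { balance = 0 }
Account.__index = Account   -- set once, not on every call to new

function Account:new(o)
  o = o or {}
  return setmetatable(o, self)
end

function Account:deposit(v)
  self.balance = self.balance + v
end

local a = Account:new()
a:deposit(100)  -- deposit is found via __index; balance is then stored on a

print(a.balance, Account.balance, rawget(a, "deposit"))  --> 100   0   nil
```

The method lives only in Account; the instance gets its own balance on first write.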
The Lua documentation is somewhat vague on this detail and many of the answers here either echo the Lua docs or don't thoroughly explain this confusing tidbit.
The line self.__index = self is present purely for the benefit of the newly-created object, o; it has no meaningful or functional impact on Account.
The __index field only has special meaning within the context of metatables; therefore self.__index is just a plain old regular field for Account. However, when Account is used as a metatable for o, the __index field "becomes" a metamethod for o. (So what's a field for Account is a metamethod for o.)
When you take the two statements in combination ...
(1) setmetatable(o, self)
(2) self.__index = self
... you're using Account as the metatable for o on line (1) and setting the __index metamethod for o to Account on line (2). (On line (2), you're also setting the "plain old field" __index in Account to Account.) So the useful aspect of self.__index = self isn't the setting of the __index field for Account, but rather the setting of the __index metamethod for o.
The following is functionally equivalent:
setmetatable(o, self)
getmetatable(o).__index = self
Lua is not an object-oriented language, but it has all of the facilities for writing object-oriented code. However, it is done in a prototyping fashion a la JavaScript. Instead of explicitly creating classes, a prototype object is created and then cloned to create new instances.
The __index metamethod is invoked to perform key lookups on read accesses to a table when the key is not already present in the table. So, self.__index = self essentially allows for inheritance of all methods and fields of the Account "class" by the new "instance" that is created in the o = o or {} and setmetatable(o, self) lines.
See also:
Metatables
Metamethods Tutorial
Inheritance Tutorial
They're used to re-direct table accesses (local y = table[key]) which are also used in method calls. In the above line, object o will have any attempts to access keys re-directed to the current object self, effortlessly inheriting all member functions. And possibly data variables too, depending on what exactly that __index is and how it works.
Creating objects (which are simply Tables) is quite different with Lua.
The basic idea is to create a regular table containing attributes(functions and values) that are common to all instances. This table, I'll call CAT for Common Attributes Table.
If you reference an attribute in a table and Lua can't find this attribute, there is a way to tell Lua where else to look for the attribute. We want Lua to look in the CAT for common attributes. Metatables answer that need. More on how that works later.
We also need methods in the CAT to be able to use instance values. Self answers that need. When you call a table function (method) this way: tableName:methodName(), Lua automatically places a reference to the table object as the first parameter. The name of this parameter is self. Even though the method is located in the CAT, self will refer to the particular calling object instance table.
Say we have a CAT called Car.
metaCar = { __index = Car }
-- this table will be used as the metatable for all instances of Car
-- Lua will look in Car for attributes it can't find in the instance
For example:
-- instance table is called mustang
-- setmetatable(mustang, metaCar)
Here is a general purpose function that creates new instance objects and sets the metatable for it. If the CAT has a constructor function (init), it gets executed as well.
function newObj(metatable)
  local obj = {} -- create new empty instance object
  setmetatable(obj, metatable) -- connect the metatable to it
  if obj.init then -- if the CAT has an init method, execute it
    obj:init()
  end
  return obj
end
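A usage sketch for the pieces above. The init and accelerate methods and the speed field are invented here for illustration; only Car, metaCar, mustang and newObj come from the text.

```lua
local Car = {}  -- the CAT: attributes shared by all instances

function Car:init()
  self.speed = 0               -- stored on the instance, not on Car
end

function Car:accelerate(delta)
  self.speed = self.speed + delta
end

local metaCar = { __index = Car }

local function newObj(metatable)
  local obj = setmetatable({}, metatable)
  if obj.init then             -- found in Car through __index
    obj:init()
  end
  return obj
end

local mustang = newObj(metaCar)
mustang:accelerate(30)

print(mustang.speed, rawget(Car, "speed"))  --> 30   nil
```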
Note that setmetatable(o, self) only sets Account as the metatable of o (otherwise nil by default). That's the first step of a prototype binding, but it's NOT enough to make the functions of Account searchable from o!
To make o search for methods on Account, the metatable object (Account) has to include an __index event with the value pointing to itself, which contains the prototype methods.
So it has to be done in 2 steps:
create a metatable value with __index event
setmetatable to the target table.
As said in the original book, this is "a small optimization": you would usually need another table created as the metatable of o. But in this case, the code re-uses the Account table as both metatable and prototype object.

How to remove a lua table entry by its key?

I have a lua table that I use as a hashmap, ie with string keys :
local map = { foo = 1, bar = 2 }
I would like to "pop" an element of this table identified by its key. There is a table.remove() method, but it only takes the index of the element to remove (ie a number) and not a generic key. I would like to be able to do table.remove(map, 'foo') and here is how I implemented it :
function table.removekey(table, key)
local element = table[key]
table[key] = nil
return element
end
Is there a better way to do that ?
No, setting the key's value to nil is the accepted way of removing an item in the hashmap portion of a table. What you're doing is standard. However, I'd recommend not overriding table.remove() - for the array portion of a table, the default table.remove() functionality includes renumbering the indices, which your override would not do. If you do want to add your function to the table function set, then I'd probably name it something like table.removekey() or some such.
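The difference between the two operations is easy to see side by side. In this sketch removekey is kept as a local function rather than added to the table library:

```lua
local function removekey(t, key)
  local element = t[key]
  t[key] = nil
  return element
end

local map = { foo = 1, bar = 2 }
print(removekey(map, "foo"), map.foo)  --> 1   nil

-- table.remove also shifts the following elements down by one.
local list = { "a", "b", "c" }
print(table.remove(list, 1), list[1], #list)  --> a   b   2
```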
TLDR
(because you're only damn trying to remove a thing from a map, how hard can that be)
Setting a key to nil (e.g. t[k] = nil) is not only accepted and standard, but it's the only way of removing the entry from the table's "content" in general. And that makes sense. Also, array portion and hashmap portion are implementation details and shouldn't ever have been mentioned in this Q/A.
Understanding Lua's tables
(and why you can't remove from an infinite)
Lua tables don't literally have a concept of "removing an entry from a table" and they hardly have a concept of "inserting an entry into a table". This is different from many other data structures in other programming languages.
In Lua, tables are modelling a perfect associative structure of infinite size.
Tables in Lua are inherently mutable and there's only one fundamental way to construct a new table: using the {} expression. Constructing a table with initial content (e.g. t = {10, 20, ["foo"] = 123, 30} or anything alike) is actually syntactic sugar equivalent to first constructing a new table (e.g. t = {}) and then setting the "initial" entries one by one (e.g. t[1] = 10, t[2] = 20, t["foo"] = 123, t[3] = 30). The details of how the de-sugaring works don't help with understanding the discussed matter, so I will be avoiding the table construction sugar in this answer.
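The equivalence for that example can be checked directly. This sketch builds the table both ways and compares the entries:

```lua
local sugared = { 10, 20, ["foo"] = 123, 30 }

local manual = {}
manual[1] = 10
manual[2] = 20
manual["foo"] = 123
manual[3] = 30

-- Every entry of one table is present in the other with the same value.
local same = true
for k, v in pairs(sugared) do
  if manual[k] ~= v then same = false end
end
for k, v in pairs(manual) do
  if sugared[k] ~= v then same = false end
end

print(same, sugared[3], sugared.foo)  --> true   30   123
```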
In Lua, a freshly-constructed table initially associates all possible values with nil. That means that for a table t = {}, t[2] will evaluate to nil, t[100] will evaluate to nil, t["foo"] will evaluate to nil, t[{}] will evaluate to nil, etc.
After construction, you can mutate the table by setting a value at some key. Then that key will be now associated with that value. For example, after setting t["foo"] = "bar", the key "foo" will now be associated with the value "bar". In consequence, t["foo"] will now evaluate to "bar".
Setting nil at some key will associate that key to nil. For example, after setting t["foo"] = nil, "foo" will (again) be associated with nil. In consequence, t["foo"] will (again) evaluate to nil.
While any key can be associated to nil (and initially all possible keys are), such entries (key/value pairs) are not considered a part of the table (i.e. aren't considered part of the table content).
Functions pairs and ipairs (and multiple others) operate on the table's content, i.e. the set of associations in which the value isn't nil. The number of such associations is always finite.
Having everything said above in mind, associating a key with nil will probably do everything you could expect when saying "removing an entry from a table", because t[k] will evaluate to nil (like it did after constructing the table) and functions like pairs and ipairs will ignore such entries, as entries (associations) with value nil aren't considered "a part of the table".
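A brief sketch of that behaviour: after a key is set back to nil, both indexing and pairs treat it as if it had never been set. The countEntries helper is a name introduced here.

```lua
local function countEntries(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

local t = {}
print(t.foo, countEntries(t))  --> nil   0

t.foo = "bar"
t.baz = 1
print(t.foo, countEntries(t))  --> bar   2

t.foo = nil                    -- "remove" the entry
print(t.foo, countEntries(t))  --> nil   1
```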
Sequences
(if tables weren't already tricky)
In this answer, I'm talking about tables in general, i.e. without any assumption about their keys. In Lua, tables with a particular set of keys can be called a sequence, and you can use table.remove to remove an integer key from such table. But, first, this function is effectively undefined for non-sequences (i.e. tables in general) and, second, there's no reason to assume that it's more than a util, i.e. something that could be directly implemented in Lua using primitive operations.
Which tables are or aren't a sequence is another hairy topic and I won't get into details here.
Referencing the reference
(I really didn't make up all that)
Everything said above is based on the official language reference manual. The interesting parts are mostly chapter 2.1 – Values and Types...
Tables can be heterogeneous; that is, they can contain values of all types (except nil). Any key with value nil is not considered part of the table. Conversely, any key that is not part of a table has an associated value nil.
This part is not worded perfectly. First, I find the phrase "tables can be heterogeneous" confusing. It's the only use of this term in the reference and the "can be" part makes it non-obvious whether "being heterogeneous" is a possible property of a table, or whether tables are defined that way. The second sentence makes the first explanation more reasonable, because if "any key with value nil is not considered part of the table", then it means that "a key with value nil" is not a contradiction. Also, the specification of the rawset function, which (indirectly) gives semantics to the t[k] = v syntax, in the 6.1 – Basic Functions chapter says...
Sets the real value of table[index] to value, without invoking any metamethod. table must be a table, index any value different from nil and NaN, and value any Lua value.
As nil values are not special-cased here, that means that you can set t[k] to nil. The only way to understand that is to accept that from now on, that key will be "a key with value nil", and in consequence "will not be considered part of the table" (pairs will ignore it, etc.), and as "any key that is not part of a table has an associated value nil", t[k] will evaluate to nil.
The whole reference also doesn't mention "removing" a key or an entry from tables in any other place.
Another perspective on tables
(if you hate infinities)
While I personally find the perspective from the reference elegant, I also understand that the fact that it's different from other popular models might make it more difficult to reason about.
I believe that the following perspective is effectively equivalent to the previous one.
You can think that...
{} returns an empty table.
t[k] evaluates to v if t contains key k, and nil otherwise
Setting t[k] = v inserts a new entry (k, v) to the table if it doesn't contain key k, updates such entry if t already contains key k, and finally, as a special case, removes the entry for the key k if v is nil
The content of the table (i.e. what's considered "a part of the table") is the set of all entries from the table
In this model, tables aren't capable of "containing" nil values.
This is not how the language reference defines things, but to the best of my understanding, such model is observably equivalent.
Don't talk implementation details
(unless you're sure that that's what you mean)
The so-called "hashmap portion" of the table (which supplements the so-called "array portion") is an implementation detail, as is the array portion itself. Talking about them, unless we are discussing performance or explaining specific undefined or implementation-defined behaviors, is in my opinion confusing in the best case and plain wrong in the worst.
For example, in a table constructed like this... t = {}, t[1] = 10, t[2] = 20, t[3] = 30, the array portion will (probably!) be [10, 20, 30] and setting t[2] = nil will "remove" the entry (2, 20) "from the array part", possibly also resizing it or moving 3 -> 30 to the hashmap part. I'm not really sure. I'm just saying this to prove that discussing implementation details is not what we want to do here.
