Performance of accessing a table via direct reference vs an ipairs loop - Lua

I'm modding a game. I'd like to optimize my code if possible for a frequently called function. The function will look into a dictionary table (consisting of an estimated 10-100 entries). I'm considering two patterns: a) direct reference and b) lookup with ipairs:
PATTERN A
tableA = { ["moduleName.propertyName"] = { some stuff } } -- the key is a string with a dot inside, hence the brackets and quotation marks
result = tableA["moduleName.propertyName"]
PATTERN B
function lookup(type)
    local result
    for i, obj in ipairs(tableB) do
        if obj.type == type then
            result = obj
            break
        end
    end
    return result
end
***
tableB = {
    [1] = {
        type = "moduleName.propertyName",
        ... some stuff ...
    }
}
result = lookup("moduleName.propertyName")
Which pattern should be faster on average? I'd expect the 'native' referencing to be faster (it is certainly much neater), but maybe this is a silly assumption? I'm able to sort tableB (to some extent) in order of lookup frequency, whereas (as I understand it) tableA will have an effectively random internal order in Lua, even if I declare the keys in order.

A lookup table will always be faster than searching a table every time.
For 100 elements that's one indexing operation compared to up to 100 loop cycles, iterator calls, conditional statements...
It is questionable, though, whether you would notice a difference in your application with so few elements.
So if you build that data structure for this purpose only, go with a lookup table right away.
If you already have this data structure for other purposes and you just want to look something up once, traverse the table with a loop.
If you already have this structure and you need to look values up more than once, build a lookup table for that purpose.
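If you want to check this on your own data, here is a minimal micro-benchmark sketch, assuming stock Lua with os.clock; the table size, key strings, and iteration count are placeholders to adjust for your mod:
local N = 100 -- assumed entry count; try 10 and 100

-- build both structures with the same data
local tableA, tableB = {}, {}
for i = 1, N do
    local key = "moduleName.property" .. i
    tableA[key] = { id = i }
    tableB[i] = { type = key, id = i }
end

local function lookup(t)
    for _, obj in ipairs(tableB) do
        if obj.type == t then return obj end
    end
end

local key = "moduleName.property" .. N -- worst case: the last element
local iterations = 1e6

local start = os.clock()
for _ = 1, iterations do local r = tableA[key] end
print("hash index:", os.clock() - start)

start = os.clock()
for _ = 1, iterations do local r = lookup(key) end
print("ipairs scan:", os.clock() - start)
For keys near the front of tableB the gap shrinks, which is exactly the frequency-sorting effect mentioned in the question; for the worst-case key the hash index should win by a wide margin.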

Related

Order of table initialization in Lua

I know that there is no guarantee concerning table element order when iterating over all table elements using pairs(). The table elements could be returned in any order.
But what about initializing a table, e.g. consider the following code:
function func(x)
    print(x)
    return x
end
t = {func(0), x = func(1), y = func(2), [0] = func(3), func(4), [1000] = func(5)}
The test shows that func() is called in the order the table elements are initialized, but is this guaranteed? I can't seem to find anything about this in the Lua reference manual, yet I'm sure there must be some official explanation.
The order of evaluation in table constructors and in function arguments is not defined, to leave room for optimization by the compiler.
You shouldn't rely on the order of evaluation anyway.
This is not specific to Lua; most languages don't specify the order of evaluation in lists of expressions.
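If the side effects must happen in a fixed order, the safe alternative is to build the table with sequential assignments instead of a single constructor. A minimal sketch, reusing the question's func:
local t = {}
t[1] = func(0) -- each statement is a sequence point,
t.x = func(1)  -- so these calls run strictly top to bottom
t.y = func(2)
t[0] = func(3)
t[2] = func(4)
t[1000] = func(5)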

Lua 5.0 - iterations over tables ignore duplicate keys even though values are different

Reading the injected comments in the Code Snippet should give enough context.
--| Table |--
QuestData = {
    ["QuestName"] = {
        ["Quest Descrip"] = {8, 1686192712},
        ["Quest Descrip"] = {32, 1686193248},
        ["Quest Descrip"] = {0, 2965579272},
    },
}
--| Code Snippet |--
--| gets QuestName then does below |--
if QuestName then
    -- (k = QuestName) and (v = the 3 entries below it in the table)
    for k, v in pairs(QuestData) do
        -- make sure the QuestName obtained by the external function matches what is in the table before continuing
        if strlower(k) == strlower(QuestName) then
            local index = 0
            -- count the "Quest Descrip" key/value pairs
            for kk, vv in pairs(v) do
                index = index + 1
            end
            if index == 1 then
                for kk, vv in pairs(v) do
                    -- send the 10-digit hash number to the function
                    Quest:Function(vv[2])
                end
            end
        end
    end
end
The issue I'm running into is that Lua will only pick up one of the numbers and ignore the rest. I need all the possible hash numbers regardless of duplicates. The QuestData table ("database") has well over 10,000 entries. I'm not going to go through all of them and remove the duplicates. Besides, the duplicates are there because the same quest can be picked up in more than one location in the game. It's not a duplicate quest but it has a different hash number.
Keys are always unique. That is the whole point of a key: it points to a single value, and you can't have several keys with the same name pointing to different values. That's how Lua tables are defined.
It is like wanting two variables with the same name but different contents; it does not make sense.
The table type implements associative arrays. [...]
Like global variables, table fields evaluate to nil if they are not initialized. Also like global variables, you can assign nil to a table field to delete it. That is not a coincidence: Lua stores global variables in ordinary tables.
Quote from Lua Tables
Hashing in Lua
Based on the comments, I'm updating the answer to give some idea about hashing.
Hashing is usually something you do yourself in low-level languages like C. In Lua, associative arrays are already hashed in the background, so hashing keys yourself is overkill (especially with SHA or the like).
Instead of the linked lists commonly used in C, you can simply construct more levels of tables to handle collisions (there is nothing "better" in Lua). From your question it is really not clear what your data looks like and what you actually want, but if you want it fancy, you can set up some metatables to make it more or less transparent.
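For the metatable idea, here is a minimal sketch of a table that transparently collects duplicate assignments into a list instead of overwriting; the names and behavior are illustrative assumptions, not the asker's code (ipairs is used instead of the # operator so the sketch also works on Lua 5.0):
-- multitable: assigning to an existing key appends rather than overwrites
local function multitable()
    local store = {} -- hidden storage; the proxy below stays empty
    return setmetatable({}, {
        __newindex = function(_, key, value)
            store[key] = store[key] or {}
            table.insert(store[key], value)
        end,
        __index = function(_, key)
            return store[key] -- the list of all values assigned to this key
        end,
    })
end

local quest = multitable()
quest["Quest Descrip"] = {8, 1686192712}
quest["Quest Descrip"] = {32, 1686193248}
quest["Quest Descrip"] = {0, 2965579272}
for _, v in ipairs(quest["Quest Descrip"]) do
    print(v[2]) -- prints all three hash numbers
end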
Basically you don't need more than this:
QuestData = {
    ["QuestName"] = {
        ["Quest Descrip"] = {
            {8, 1686192712},
            {32, 1686193248},
            {0, 2965579272},
        },
    },
}
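With that layout, collecting every hash number is a plain nested loop. A minimal sketch, reusing Quest:Function from the question's snippet (assumed to exist in the mod's environment):
for descrip, entries in pairs(QuestData["QuestName"]) do
    for _, entry in ipairs(entries) do
        Quest:Function(entry[2]) -- entry[2] is the 10-digit hash number
    end
end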
As Jakuje already mentioned, table keys are unique.
But you can store both as a table member like:
QuestData = {
    -- "QuestName" must be unique! Of course you can put it into a table member as well
    ["QuestName"] = {
        {hash = "Quest Descrip", values = {8, 1686192712}},
        {hash = "Quest Descrip", values = {32, 1686193248}},
        {hash = "Quest Descrip", values = {0, 2965579272}},
    }
}
I'm sure you can organize this in a better way. It looks like a rather confusing concept to me.
You've said you can't "rewrite the database", but the problem is the QuestData table doesn't hold what you think it holds.
Here's your table:
QuestData = {
    ["QuestName"] = {
        ["Quest Descrip"] = {8, 1686192712},
        ["Quest Descrip"] = {32, 1686193248},
        ["Quest Descrip"] = {0, 2965579272},
    },
}
But this is actually like writing...
QuestData["QuestName"]["Quest Descrip"] = {8, 1686192712}
QuestData["QuestName"]["Quest Descrip"] = {32, 1686193248}
QuestData["QuestName"]["Quest Descrip"] = {0, 2965579272}
So the second (and then the third) assignment overwrites the first. The problem is not that you can't access the table, but that the table no longer contains the values.
You need to find a different way of representing your data.

Potential problems with storing the second table of a two-dimensional array in the index

In the past I have found myself using a table as both the index and the value of a table when order was irrelevant.
Since every table is a unique value, tables are safe to use as indices, and with that I already have all the information I want to use later in the program. However, I have not seen any similar Lua code yet, and I have only used this in test programs, so I'm worried I might run into unforeseen/unexpected problems with this method.
example:
a = {1,2,3,4,5} -- some test values
b = {2,nil,4,nil,1}
c = {3,nil,nil,nil,2}
d = {4,nil,1,nil,3}
e = {5,1,2,3,4}
tab = {a,b,c,d,e}
t = {}
for i, v in pairs(tab) do
    t[v] = 0
end
for iv in pairs(t) do -- outputs in a different order almost every time
    print(iv[1], iv[2], iv[3], iv[4], iv[5]) -- could be a list of data you have to traverse entirely anyway
end
io.read()
Now I can store some additional information in t[v], but if I don't have any, is there perhaps some Lua type that is smaller?
Edit:
Does this go well with the use of weak tables?
Note:
Standard 2D table: table[key1] = innerTable, then table[key1][key2] <-- contains stuff
This version: table[innerTable] = anything but nil <-- not reachable via table[key1][key2]; the data lives in the key itself: key1[key2] <-- contains stuff
It's fine to use a table as a key in another table.
However, note that different tables are different keys, even if the tables have the same contents.
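A minimal sketch of that pitfall (the table contents here are arbitrary):
local t = {}
local a = {1, 2, 3}
local b = {1, 2, 3} -- same contents, but a different table

t[a] = "first"
t[b] = "second"

print(t[a])         --> first
print(t[b])         --> second
print(a == b)       --> false: tables compare by identity, not by contents
print(t[{1, 2, 3}]) --> nil: a fresh table is yet another distinct key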

Mongoid MapReduce giving irregular results for recursive reduce function

I have an Item model which has a category attribute. I want the item counts grouped by category. I wrote a map-reduce for this functionality, and it was working fine. I recently wrote a script to create 5000 items, and now I realize my map-reduce only gives the result for the last 80 records. The following is the code for the map-reduce function:
map = %Q{
    function(){
        emit({}, {category: this.category});
    }
}
reduce = %Q{
    function(key, values){
        var category_count = {};
        values.forEach(function(value){
            if(category_count.hasOwnProperty(value.category))
                category_count[value.category]++;
            else
                category_count[value.category] = 1;
        });
        return category_count;
    }
}
Item.map_reduce(map, reduce).out(inline: true).first.try(:[], "value")
After researching a bit I discovered that MongoDB invokes the reduce function multiple times. How can I achieve the functionality I intended?
There is a rule you must follow when writing map-reduce code in MongoDB (a few rules, actually). One is that emit (which emits key/value pairs) must use the same format for the value as the one your reduce function will return.
If you emit(this.key, this.value) then reduce must return the exact same type that this.value has. If you emit({},1) then reduce must return a number. If you emit({},{category: this.category}) then reduce must return the document of format {category:"string"} (assuming category is a string).
So that clearly can't be what you want, since you want totals, so let's look at what reduce is returning and work out from that what you should be emitting.
It looks like at the end you want to accumulate a document where there is a keyname for each category and its value is a number representing the number of its occurrences. Something like:
{category_name1:total, category_name2:total}
If that's the case, then the correct map function would emit a document keyed by the category name, i.e. {<category name>: 1}, in which case your reduce will need to add up the numbers for each key corresponding to a category.
Here is what the map should look like:
map = function() {
    var category = {};
    category[this.category] = 1;
    emit({}, category);
}
And here is the correct corresponding reduce:
reduce = function(key, values) {
    var category_count = {};
    values.forEach(function(value){
        for (var cat in value) {
            if (!category_count.hasOwnProperty(cat)) category_count[cat] = 0;
            category_count[cat] += value[cat];
        }
    });
    return category_count;
}
Note that it satisfies two other requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case if there is only one document in your collection) and it will work correctly if the reduce function gets called multiple times (which is what's happening when you have more than 100 documents).
A more conventional way to do that would be to emit category name as key and the number as value. This simplifies map and reduce:
map = function() {
    emit(this.category, 1);
}
reduce = function(key, values) {
    var count = 0;
    values.forEach(function(val) {
        count += val;
    });
    return count;
}
This will sum the number of times each category appears. This also satisfies requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case for any category that only appears once) and it will work correctly if the reduce function gets called multiple times (which will happen if any category appears more than 100 times).
As others pointed out, aggregation framework makes the same exercise much simpler with:
db.collection.aggregate({$group:{_id:"$category",count:{$sum:1}}})
although that matches the format of the second mapReduce I showed, not your original format, which output category names as keys of a single document. In any case, the aggregation framework will generally be significantly faster than MapReduce.
I agree with Neil Lunn's comment.
What I can see from the provided info is that if you are on a MongoDB version greater than or equal to 2.2, you can use the aggregation framework instead of map-reduce:
db.items.aggregate([
    { $group: { _id: '$category', category_count: { $sum: 1 } } }
])
This is a lot simpler and more performant (see Map/Reduce vs. Aggregation Framework).

Lua table C API

I know of:
http://lua-users.org/wiki/SimpleLuaApiExample
It shows me how to build up a table one (key, value) pair at a time.
Suppose instead I want to build a gigantic table (say, a 1000-entry table where both keys and values are strings). Is there a fast way to do this in Lua, rather than several C API calls per entry:
push key
push value
rawset
What you have written is the fast way to solve this problem. Lua tables are brilliantly engineered, and fast enough that there is no need for some kind of bogus "hint" to say "I expect this table to grow to contain 1000 elements."
For string keys, you can use lua_setfield.
Unfortunately, for associative tables (string keys, non-consecutive-integer keys), no, there is not.
For array-type tables (where the regular 1...N integer indexing is being used), there are some performance-optimized functions, lua_rawgeti and lua_rawseti: http://www.lua.org/pil/27.1.html
You can use lua_createtable to create a table that already has the required number of slots pre-allocated; after that, however, there is no way to do it faster than pushing and setting each pair:
lua_createtable(L, 0, 1000);   /* pre-allocate 1000 hash slots */
int tableindex = lua_gettop(L);
for (int i = 0; i < 1000; i++) {
    lua_push...;               /* push the key */
    lua_push...;               /* push the value */
    lua_rawset(L, tableindex); /* t[key] = value, bypassing metamethods */
}
