Lua Torch7 array index notation - lua

I have an array named xTrain of size nData x 1.
I initialize it as
xTrain = torch.linspace(-1,1,nData)
To access the array, the author uses xTrain[{{i}}].
Can you please explain this notation? Why not simply xTrain[i]?
Please refer to the author's code on page 21 here: https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/lecture4.pdf
As an additional note:
xTrain=torch.linspace(-1,1,10)
when I do
th> print(xTrain[1])
-1
th> print(xTrain[{1}])
-1
th> print(xTrain[{{1}}])
-1
[torch.DoubleTensor of size 1]
Why does it also print [torch.DoubleTensor of size 1] in the third case? My guess is that in the first two cases it returns the scalar value at that location, and in the third case a DoubleTensor.

A good place to start is the Lua manual, specifically its sections on syntax and expressions. There you can see what {...} means in Lua:
{...} -- creates a list with all vararg parameters
So, in short, your {1} creates a list containing the single value 1. Wrapping it once more gives a list containing a list containing the single number 1.
If xTrain were a plain table, indexing it with a list would find nothing, because a freshly created list is never equal to an existing key. Lua supports metatables, however, so the key does not have to be used for a direct lookup; it can instead be passed to a function that interprets the lists.
It is also worth reading about the Tensor class, which is what torch.linspace() returns. Indexing with the "array access" operator is explained in the section [Tensor] [{ dim1,dim2,... }] or [{ {dim1s,dim1e}, {dim2s,dim2e} }]. A plain number selects a single element, whereas a nested list selects a range and therefore returns a Tensor, which is why your third case prints [torch.DoubleTensor of size 1].
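To make the metatable point concrete, here is a minimal plain-Lua sketch. It is not Torch's implementation; new_mock_tensor is a hypothetical helper that only imitates the three indexing forms from the question, assuming a number (or {i}) selects one element and {{s}} or {{s, e}} selects a range.

```lua
-- Hypothetical mock, NOT Torch's code: a 1-D "tensor" backed by a plain table.
local function new_mock_tensor(values)
  local proxy = {}  -- stays empty, so every lookup goes through __index
  return setmetatable(proxy, {
    __index = function(_, key)
      if type(key) == "number" then
        return values[key]              -- x[i]: plain element
      end
      local first = key[1]
      if type(first) == "number" then
        return values[first]            -- x[{i}]: still a plain element
      end
      -- x[{{s}}] or x[{{s, e}}]: a range, so return a new container
      local s, e = first[1], first[2] or first[1]
      local slice = {}
      for i = s, e do slice[#slice + 1] = values[i] end
      return new_mock_tensor(slice)
    end,
  })
end

local x = new_mock_tensor({-1, 0, 1})
print(type(x[1]))      -- number
print(type(x[{1}]))    -- number
print(type(x[{{1}}]))  -- table (a one-element mock tensor)
```

The key passed to __index is whatever sits between the square brackets, so the function is free to treat numbers, lists, and nested lists differently.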

Related

What is "object = {...}" in lua good for?

I recently read about lua and addons for the game "World of Warcraft". Since the interface language for addons is lua and I want to learn a new language, I thought this was a good idea.
But there is one thing I can't figure out. In almost every addon there is a line at the top which looks to me like a constructor that creates an object whose members I can access. The line goes something like this:
object = {...}
I know that if a function returns several values (which is IMHO one huge plus for lua) and I don't want to store them separately in several variables, I can just write
myArray = {SomeFunction()}
where myArray is now a table that contains the values and I can access the values by indexing it (myArray[4]). Since the elements are not explicitly typed, because only the values themselves hold their type, this is fine for lua. I also know that "..." can be used for a parameter array in a function for the case that the function does not know how many parameters it gets when called (like String[] args in java). But what in God's name is this "curly bracket - dot, dot, dot - curly bracket" used for???
You've already said all there is to it in your question:
{...} is really just a combination of the two behaviors you described: It creates a table containing all the arguments, so
function foo(a, b, ...)
  return {...}
end
foo(1, 2, 3, 4, 5) --> {3, 4, 5}
Basically, ... is just a normal expression, just like a function call that returns multiple values. The following two expressions work in the exact same way:
local a, b, c = ...
local d, e, f = some_function()
Keep in mind though that this has some performance implications, so maybe don't use it in a function that gets called like 1000 times a second ;)
EDIT:
Note that this really doesn't apply just to "functions". Functions are actually more of a syntax feature than anything else. Under the hood, Lua only knows of chunks, which are what both functions and .lua files get turned into. So, if you run a Lua script, the entire script gets turned into a chunk and is therefore no different than a function.
In terms of code, the difference is that with a function you can specify names for its arguments outside of its code, whereas with a file you're already at the outermost level of code; there's no "outside" a file.
Luckily, all Lua files, when they're loaded as a chunk, are automatically variadic, meaning they get the ... to access their argument list.
When you call a file like lua script.lua foo bar, inside script.lua, ... will actually contain the two arguments "foo" and "bar", so that's also a convenient way to access arguments when using Lua for standalone scripts.
In your example, it's actually quite similar. Most likely, somewhere else your script gets loaded with load(), which returns a function that you can call and, you guessed it, pass arguments to.
Imagine the following situation:
function foo(a, b)
  print(b)
  print(a)
end
foo('hello', 'world')
This is almost equivalent to
function foo(...)
  local a, b = ...
  print(b)
  print(a)
end
foo('hello', 'world')
Which is 100% (Except maybe in performance) equivalent to
-- Note that [[ string ]] is just a convenient syntax for multiline "strings"
foo = load([[
local a, b = ...
print(b)
print(a)
]])
foo('hello', 'world')
According to the Lua 5.1 Reference Manual, {...} holds the arguments passed to the program. In your case those are probably the arguments passed from the game to the addon.
You can see references to this in this question and this thread.
Put the following text at the start of the file:
local args = {...}
for __, arg in ipairs(args) do
  print(arg)
end
And it reveals that:
args[1] is the name of the addon
args[2] is an (initially empty) table passed by reference to all files in the same addon
Information inserted into args[2] is therefore available to the other files.
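The sharing mechanism can be imitated outside the game. The sketch below is only an illustration under assumptions: run_addon_file is a hypothetical stand-in for whatever loader the game uses, and the two source strings play the role of two files of the same addon.

```lua
-- Sketch only: run_addon_file is a hypothetical stand-in for the game's loader.
local load_chunk = loadstring or load  -- Lua 5.1 uses loadstring for strings

local function run_addon_file(source, addon_name, shared)
  local chunk = assert(load_chunk(source))
  return chunk(addon_name, shared)     -- these become the chunk's "..."
end

local shared = {}  -- one table handed to every file of the same addon

-- "File 1" stores something in the shared table.
run_addon_file([[
  local name, ns = ...
  ns.greeting = "hello from " .. name
]], "MyAddon", shared)

-- "File 2" receives the same table and can read it.
run_addon_file([[
  local name, ns = ...
  print(ns.greeting)
]], "MyAddon", shared)
```

Because tables are passed by reference, both chunks see the same table object, which is all that is needed to share state between files.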

Lua length operator (#) with nil values

After reading this topic and after experimenting a bit, I am trying to understand how the Lua length operator works when a table contains nil values.
Before I started to investigate, I thought that the length was simply the number of consecutive non-nil elements, starting at index 1:
print(#{nil}) -- 0
print(#{"o"}) -- 1
print(#{"o",nil}) -- 1
print(#{"o","o"}) -- 2
print(#{"o","o",nil}) -- 2
That looks pretty simple, right?
But my headache started when I accidentally added an element after a nil-terminated table:
print(#{"o",nil,"o"})
My guess was that it should probably print 1 because it would stop counting when the first nil is found. Or maybe it should print 2 if the length operator is greedy enough to look for non-nil elements after the first nil. But the above code prints 3.
So I’ve run several other tests to see what happens:
-- nil before the end
print(#{nil,"o"}) -- 2
print(#{nil,"o","o"}) -- 3
print(#{"o",nil,"o"}) -- 3
-- several nil elements
print(#{"o",nil,nil}) -- 1
print(#{nil,"o",nil}) -- 0
print(#{nil,nil,"o"}) -- 3
I should mention that repl.it currently uses Lua 5.1.5 which is rather old, but if you test with the Lua demo, which currently uses Lua 5.3.5, you’ll get the same results.
By looking at those results and by looking at this answer, I assume that:
if the last element is not nil, the length operator returns the full size of the table, including nil entries if any
if the last element is nil, it counts the number of consecutive non-nil and stops counting at the first nil
Are those assumptions correct?
Can we predict a 100% well-defined behavior when a table contains one or several nil values?
The Lua documentation states that the length of a table is only defined if the table is a sequence. Does that mean that the length operator has undefined behavior for non-sequences?
Apart from the length operator, can nil values cause any trouble in a table?
We can predict some behaviour, but it is not standardised, and as such you should never rely on it. It's quite possible that the behaviour may change within this major version of Lua.
Should you ever need to fill a table with nil values, I suggest wrapping the table and replacing holes with a unique placeholder value (e.g. NIL={}; if v==nil then t[k]=NIL end; this is quite cheap to test against and safe).
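A minimal sketch of the placeholder idea follows. The helper names pack_value and unpack_value are made up for illustration; only the NIL sentinel comes from the suggestion above.

```lua
local NIL = {}  -- unique sentinel; no other value compares equal to it

-- Hypothetical helpers: translate nil to the sentinel and back.
local function pack_value(v)
  if v == nil then return NIL end
  return v
end

local function unpack_value(v)
  if v == NIL then return nil end
  return v
end

local list = {}
list[1] = pack_value("o")
list[2] = pack_value(nil)  -- stored as NIL, so there is no hole
list[3] = pack_value("o")

print(#list)                  -- 3: the table is a proper sequence
print(unpack_value(list[2]))  -- nil
```

Since every slot from 1 to 3 holds a non-nil value, the table is a sequence and # is well defined.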
That said...
As there is even a difference in the result of # depending on how the table is defined, you'll have to distinguish between statically defined (constant) tables and dynamically defined (mutated) tables.
Static table definitions:
#{nil,nil,nil,nil,nil, 1} -- 6
#{3, 2, nil, 1} -- 4
#{nil,nil,nil, 1, 1,nil} -- 0
#{nil,nil, 1, 1, 1,nil} -- 5
#{nil, 1, 1, 1, 1,nil} -- 5
#{nil,nil,nil,nil, 1,nil} -- 0
#{nil,nil, 1,nil, 1,nil,nil} -- 5
#{nil,nil,nil, 1,nil,nil, 1,nil} -- 4
Using this kind of definition, as long as the last value is non-nil, you will get a length equal to the position of the last value. If the last value is nil, Lua starts a (non-linear) search from the tail until it finds the first non-nil value.
Dynamic data definition
local x={}; x[5]=1;print(#x) -- 0
local x={}; x[1]=1;x[2]=1;x[3]=1;x[5]=1;print(#x) -- 3
local x={}; x[1]=1;x[2]=1;x[4]=1;x[5]=1;print(#x) -- 5
#{[5]=1} -- 0
local x={nil,nil,nil,1};x[5]=1;print(#x) -- 0
As soon as the table was changed once, the operator works the other way (that includes static definitions with []). If the first element is nil, # always returns 0, but if not it starts a search that I did not investigate further (I guess you can check the sources, though I don't think it's a standard binary search), until it finds a nil value that is preceded by a non-nil value.
As said before, relying on this behaviour is not a good idea, and invites lots of issues down the road. Though if you want to make a nasty unmaintainable program to mess with a colleague, that's a sure way to do it.
When a table is a sequence (all numeric keys start at 1 and there are no nil gaps), # is defined to be precisely the count of those elements.
For non-sequence tables, it is a bit more complicated. Lua 5.2 seems to leave the result as undefined. For 5.1 and 5.3, the result of the operation is a border.
A border in a table is any positive index that contains a non-nil value followed by nil, or 0 if the first element is nil. # is defined to return any value that satisfies these conditions.
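The border definition is easy to check by hand. The following sketch uses a hypothetical helper, is_border, that tests the definition directly; it does not look at how # is implemented.

```lua
-- Hypothetical helper: true if n is a border of t according to the definition above.
local function is_border(t, n)
  if n == 0 then return t[1] == nil end
  return t[n] ~= nil and t[n + 1] == nil
end

local t = {}
t[1] = "o"
t[3] = "o"  -- t[2] stays nil, so t is not a sequence

print(is_border(t, 1))  -- true  (t[1] is set, t[2] is nil)
print(is_border(t, 2))  -- false (t[2] is nil)
print(is_border(t, 3))  -- true  (t[3] is set, t[4] is nil)
print(is_border(t, 0))  -- false (t[1] is not nil)
```

This table has two borders, 1 and 3, so under the definition # may return either of them.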
Looking at it from another perspective, since tables contain an "array" part and a "map" part, Lua has no way of knowing where the "map" indices start. For example, you can create a table with 1000 values and then set the first 999 of them to nil; that could leave you with a table of "size" 1000. However, you can also start with an empty table and set the 1000th element, having a table of "size" 0 but still structurally equivalent to the first one. The result of # is then simply the first valid value the internal algorithm finds.
The length operator produces undefined behaviour for tables that aren't sequences (i.e. tables with nil elements in the middle of the array). This means that even if the Lua implementation always behaves in a certain way, you shouldn't rely on that behaviour, as it may change in future versions of Lua, or in different implementations like LuaJIT.
You can use nils in tables - there is nothing wrong with that - just don't use the length operator on a table which might have nils before non-nil values.
The post you linked to contains more details about how the actual algorithm works. It mentions counting elements with a "binsearch", i.e. a binary search. This is not the same as just counting the elements one by one - if there are nils in the table, then depending on their exact position, the binary search algorithm may treat them as the end of the table, or may just ignore them.
To sum up, the algorithm is harder to predict than you were assuming, and even though it is technically possible to predict what will happen in any given case, you shouldn't rely on that behaviour as it is liable to change.

How can I filter a Redis SET by its concatenated values according to another SET?

I have a filter optimization problem in Redis.
I have a Redis SET which keeps the doc and pos pairs of a type in a corpus.
example:
smembers type_in_docs.1
result: doc.pos pairs
array (size=216627)
0 => string '2805.2339' (length=9)
1 => string '2410.14208' (length=10)
2 => string '3516.1810' (length=9)
...
Another Redis set, which I create live according to the user's choices,
contains the selected docs.
smembers filteredDocs
I want to filter the doc.pos pairs in the "type_in_docs" set according to the doc IDs the user chose.
If I had not stored concatenated values in the set, this would be easy with SINTER.
So I implemented a PHP filter, shown below.
It works but needs optimization.
With a big doc.pos set it takes too much time (noticeably beyond about 150000 members).
$concordance = $this->redis->smembers('types_in_docs.'.$typeID);
$filteredDocs = $this->redis->smembers('filteredDocs');
$filtered = array_filter($concordance, function($pairs) use ($filteredDocs) {
    // Keep the pair when its doc part (the text before the first '.') is a selected doc.
    return in_array(substr($pairs, 0, strpos($pairs, '.')), $filteredDocs);
});
I tried a sorted set with the docId as the score,
but couldn't find an intersect or filter option for score values.
I have been looking for a Redis-based solution using keys, sets, or a Lua script to cut the time,
but have found nothing.
How can I filter Redis sets with concatenated values?
Thanks for your help.
Your code is slow primarily because you're moving a lot of data from Redis to your PHP filter. The general goal here should be to perform as much of the filtering as possible on the server. To do that you'd need to pay some sort of price in CPU & RAM.
There are many ways to do this, here's one:
Ensure you're using Redis v2.8.9 or above.
To allow looking up by doc only, efficiently, keep your doc.pos pairs as is but store them in Sorted Sets with score = 0. With your example:
ZADD type_in_docs.1 0 2805.2339 0 2410.14208 0 3516.1810
This will allow you to mimic SISMEMBER for doc in the set with:
ZRANGEBYLEX type_in_docs.1 "[<docID>." "(<docID>.\xff"
You can now just call SMEMBERS on the (usually) smaller filteredDocs set and then call ZRANGEBYLEX for each of its members for immediate gains.
If you want to do better, in extreme cases (i.e. large filteredDocs, small type_in_docs) you should do the reverse.
If you want to do even better, use Lua to wrap up the filtering logic - something like:
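-- usage: redis-cli --eval filter_doc_pos.lua <filter set keyname> <type pairs keyname>
-- returns: list of matching doc.pos pairs
local r = {}
for _, fv in pairs(redis.call("SMEMBERS", KEYS[1])) do
  -- Match only members that start with "<doc>."; the trailing dot keeps
  -- doc 28 from also matching 2805.2339. "\255" is byte 0xFF, written in
  -- decimal because Lua 5.1 (embedded in Redis) has no "\xff" escape.
  local t = redis.call("ZRANGEBYLEX", KEYS[2], "[" .. fv .. ".", "(" .. fv .. ".\255")
  for _, tv in pairs(t) do
    r[#r+1] = tv
  end
end
return r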
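The range lookup can be tried without a Redis server. The sketch below is an assumption-laden imitation: lex_range is a hypothetical helper that applies the same lower and upper bounds to an in-memory list using ordinary byte-wise string comparison (which is what Lua does in the default C locale). It also shows why it is safer to include the "." separator in the prefix.

```lua
-- Hypothetical in-memory imitation of a ZRANGEBYLEX prefix range; no Redis involved.
local function lex_range(members, prefix)
  local lo, hi = prefix, prefix .. "\255"  -- "[prefix" and "(prefix\xff"
  local out = {}
  for _, m in ipairs(members) do
    if m >= lo and m < hi then out[#out + 1] = m end
  end
  return out
end

local members = {"2410.14208", "2805.2339", "28051.7", "3516.1810"}

-- Without the dot, doc 2805 also picks up the pair that belongs to doc 28051.
print(table.concat(lex_range(members, "2805"), " "))   -- 2805.2339 28051.7
-- With the dot, only pairs of doc 2805 match.
print(table.concat(lex_range(members, "2805."), " "))  -- 2805.2339
```

In Redis the sorted set keeps members ordered, so the server can jump to the start of the range instead of scanning every member as this sketch does.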

Lua: understanding table array part and hash part

In section 4, Tables, of The Implementation of Lua 5.0 there is an example:
local t = {100, 200, 300, x = 9.3}
So we have t[4] == nil. If I write t[0] = 0, this will go to the hash part.
If I write t[5] = 500, where will it go? The array part or the hash part?
I would be eager to hear the answer for the Lua 5.1, Lua 5.2 and LuaJIT 2 implementations, if there is a difference.
Contiguous integer keys starting from 1 always go in the array part.
Keys that are not positive integers always go in the hash part.
Other than that, it is unspecified, so you cannot predict where t[5] will be stored according to the spec (and it may or may not move between the two, for example if you create then delete t[4].)
LuaJIT 2 is slightly different - it will also store t[0] in the array part.
If you need it to be predictable (which is probably a design smell), stick to pure-array tables (contiguous integer keys starting from 1; if you want to leave a gap, use a value of false instead of nil) or pure hash tables (avoid non-negative integer keys).
Quoting from Implementation of Lua 5.0
The array part tries to store the values corresponding to integer keys from 1 to some limit n. Values corresponding to non-integer keys or to integer keys outside the array range are stored in the hash part.
The index of the array part starts from 1, that's why t[0] = 0 will go to hash part.
The computed size of the array part is the largest n such that at least half the slots between 1 and n are in use (to avoid wasting space with sparse arrays) and there is at least one used slot between n/2+1 and n (to avoid a size n when n/2 would do).
According to this rule, in the example table:
local t = {100, 200, 300, x = 9.3}
The array part, which holds 3 elements, may have a size of 3, 4 or 5. (EDIT: the size should be 4, see @dualed's comment.)
Assume that the array has a size of 4. When writing t[5] = 500, the array part can no longer hold the element t[5]. What if the array part is resized to 8? With a size of 8, the array part holds 4 elements, which is equal to (so, not less than) half of the array size. And the range from n/2+1 to n, which in this case is 5 to 8, contains one element: t[5]. So an array size of 8 satisfies the requirement. In this case, t[5] will go to the array part.
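The quoted rule can be turned into a small calculation. The sketch below is not Lua's internal code; computed_array_size is a hypothetical helper that applies the rule as worded in the quote, with the added assumption that candidate sizes are powers of two (consistent with the resize from 4 to 8 discussed above). Real implementations may differ in details, so treat the results as an illustration of the rule, not a guarantee about any interpreter.

```lua
-- Hypothetical helper applying the quoted sizing rule to a table's integer keys.
-- Assumption: candidate sizes are powers of two.
local function computed_array_size(t)
  local used, max_key = {}, 0
  for k in pairs(t) do
    if type(k) == "number" and k >= 1 and k == math.floor(k) then
      used[k] = true
      if k > max_key then max_key = k end
    end
  end
  local best, n = 0, 1
  -- "At least half in use" cannot hold once n exceeds twice the largest key.
  while n <= 2 * max_key do
    local count, upper_half_used = 0, false
    for i = 1, n do
      if used[i] then
        count = count + 1
        if i > n / 2 then upper_half_used = true end
      end
    end
    if count * 2 >= n and upper_half_used then best = n end
    n = n * 2
  end
  return best
end

local t = {100, 200, 300, x = 9.3}
print(computed_array_size(t))  -- 4
t[5] = 500
print(computed_array_size(t))  -- 8
```

Under these assumptions the rule gives 4 for the original table and 8 once t[5] is set, which matches the reasoning in the paragraph above.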

What's the difference between table.insert(t, i) and t[#t+1] = i?

In Lua, there seem to be two ways of appending an element to an array:
table.insert(t, i)
and
t[#t+1] = i
Which should I use, and why?
Which to use is a matter of preference and circumstance: as the # length operator was introduced in version 5.1, t[#t+1] = i will not work in Lua 5.0, whereas table.insert has been present since 5.0 and will work in both. On the other hand, t[#t+1] = i uses exclusively language-level operators, whereas table.insert involves a function (which has a slight amount of overhead to look up and call and depends on the table module in the environment).
In the second edition of Programming in Lua (an update of the Lua 5.0-oriented first edition), Roberto Ierusalimschy (the designer of Lua) states that he prefers t[#t+1] = i, as it's more visible.
Also, depending on your use case, the answer may be "neither". See the manual entry on the behavior of the length operator:
If the array has "holes" (that is, nil values between other non-nil values), then #t can be any of the indices that directly precedes a nil value (that is, it may consider any such nil value as the end of the array).
As such, if you're dealing with an array with holes, using either one (table.insert uses the length operator) may "append" your value to a lower index in the array than you want. How you define the size of your array in this scenario is up to you, and, again, depends on preference and circumstance: you can use table.maxn (disappearing in 5.2 but trivial to write), you can keep an n field in the table and update it when necessary, you can wrap the table in a metatable, or you could use another solution that better fits your situation (in a loop, a local tsize in the scope immediately outside the loop will often suffice).
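One of the options mentioned above, keeping an n field, could look like the following sketch. The helper names new_list and append are made up for illustration.

```lua
-- Hypothetical helpers: a list that tracks its own length in field n.
local function new_list()
  return { n = 0 }
end

local function append(list, value)
  list.n = list.n + 1
  list[list.n] = value  -- storing nil simply leaves the slot empty
end

local list = new_list()
append(list, "a")
append(list, nil)  -- a hole; #list would no longer be reliable
append(list, "c")

print(list.n)   -- 3
print(list[3])  -- c
```

The position of each appended value depends only on n, so holes no longer influence where the next element lands.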
The following is slightly on the amusing side but possibly with a grain of aesthetics. Even though there are obvious reasons that mytable:operation() is not supplied like mystring:operation(), one can easily roll one's own variant, and get a third notation if desired.
Table = {}
Table.__index = table
function Table.new()
  local t = {}
  setmetatable(t, Table)
  return t
end
mytable = Table.new()
mytable:insert('Hello')
mytable:insert('World')
for _, s in ipairs(mytable) do
  print(s)
end
insert can insert at an arbitrary position (as its name states); it only defaults to #t + 1, whereas t[#t + 1] = i will always append to the end of the table. See section 5.5 in the Lua manual.
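A short sketch of the difference, using only the standard table.insert and the length operator on a table without holes:

```lua
local t = {"b", "c"}

table.insert(t, 1, "a")  -- position given: insert at the front, the rest shifts up
table.insert(t, "d")     -- no position given: defaults to #t + 1, i.e. append
t[#t + 1] = "e"          -- always appends

print(table.concat(t, ","))  -- a,b,c,d,e
```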
The '#' operator only counts the array-style (integer-keyed) entries of a table.
t = {1, 2 ,3 ,4, 5, x=1, y=2}
In the code above,
print(#t) --> prints 5, not 7
So the '#' operator is not always usable.
If you want to use '#', first check what kind of keys the table has.
The insert function can be used with a value of any type, but it is slower than using '#' directly.
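If the number of all entries is needed, including the non-integer keys, it has to be counted by hand. A minimal sketch, with count_keys as a hypothetical helper name:

```lua
-- Hypothetical helper: counts every key in the table, not just the array-style ones.
local function count_keys(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

local t = {1, 2, 3, 4, 5, x = 1, y = 2}
print(#t)             -- 5
print(count_keys(t))  -- 7
```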

Resources