Lua more than one locals in one line - lua

Assuming we have the following code:
local x = 1
local x, y = 2, 3
I know x will become 2 after the second line, however, does the local on the that line create a new x, or use the one before?

They will be two different local values: the first one will be shadowed and not accessible as the second one is created with the same name in the same block. Here is the information that luac -l -l (Lua 5.3) shows for this script:
main <local.lua:0,0> (4 instructions at 00697ae8)
0+ params, 3 slots, 1 upvalue, 3 locals, 3 constants, 0 functions
1 [1] LOADK 0 -1 ; 1
2 [2] LOADK 1 -2 ; 2
3 [2] LOADK 2 -3 ; 3
4 [2] RETURN 0 1
constants (3) for 00697ae8:
1 1
2 2
3 3
locals (3) for 00697ae8:
0 x 2 5
1 x 4 5
2 y 4 5
upvalues (1) for 00697ae8:
0 _ENV 1 0
The locals section shows three variables with two x that have the same end-of-scope location.

Related

Can i avoid 2 lookups by using _G?

I've read that if you use some functions like table.insert, lua will first try to lookup the variable in the local scope, then in global scope. Can i bypass the local lookup by using _G.table.insert instead?
Here's the output of luac -l:
without _G
main <main.lua:0,0> (7 instructions at 0x55581ae50c60)
0+ params, 4 slots, 1 upvalue, 1 local, 3 constants, 0 functions
1 [1] NEWTABLE 0 0 0
2 [2] GETTABUP 1 0 -1 ; _ENV "table"
3 [2] GETTABLE 1 1 -2 ; "insert"
4 [2] MOVE 2 0
5 [2] LOADK 3 -3 ; 5
6 [2] CALL 1 3 1
7 [2] RETURN 0 1
with _G
main <main.lua:0,0> (8 instructions at 0x5562b3d6dc60)
0+ params, 4 slots, 1 upvalue, 1 local, 4 constants, 0 functions
1 [1] NEWTABLE 0 0 0
2 [2] GETTABUP 1 0 -1 ; _ENV "_G"
3 [2] GETTABLE 1 1 -2 ; "table"
4 [2] GETTABLE 1 1 -3 ; "insert"
5 [2] MOVE 2 0
6 [2] LOADK 3 -4 ; 5
7 [2] CALL 1 3 1
8 [2] RETURN 0 1
I'm not sure what the numbers mean.
_G is not reserved and it's probably even worse with it, as i can see from the compiler output. I think doing it with _G is even slower.

Where are class names stored in a machine learning dataset in Python?

I'm learning machine learning using the iris dataset on Python 3.6 with sklearn, and I don't understand where the class names that are being retrieved are stored. In Iris, there are 3 classes, and each class contains 50 observations. You can use several commands to print the classes, and their associated numerical values:
print(iris.target)
print(iris.target_names)
This will result in the output:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
['setosa' 'versicolor' 'virginica']
So as can be seen, the classes are Setosa, Versicolor, and Virginica. What I don't understand is where these class names are being stored, or how they're called upon within the model. If you use the shape command on the data, or target, the result is (150,4) and (150,) meaning there is 150 observations and 4 rows in the data, and 150 rows in the target. I am just not able to bridge the gap with my mind as to where this is coming from, however.
What I don't understand is where the class names are supposed to be stored. If I made a brand new dataset for pokemon types and had ice, fire, water, flying, where could I store these types? Would they be required to be numerical as well, like iris, with 0,1,2,3?
Sklearn uses a custom type of object to store its datasets, exactly so that they can store metadata along with the raw data.
If you load the iris dataset
In [2]: from sklearn import datasets
In [3]: iris = datasets.load_iris()
You can inspect the type of object with type:
In [4]: type(iris)
Out[4]: sklearn.utils.Bunch
You can look at the attributes inside the object with dir:
In [5]: dir(iris)
Out[5]: ['DESCR', 'data', 'feature_names', 'target', 'target_names']
And then use . notation to take a look at the attributes themselves:
In [6]: type(iris.data)
Out[6]: numpy.ndarray
In [7]: type(iris.target)
Out[7]: numpy.ndarray
In [8]: type(iris.feature_names)
Out[8]: list
If you want to mimic this for your own datasets, you will have to define your own custom object type to mimic this structure. That would involve defining your own class.

Does the Lua compiler optimize local vars?

Is the current Lua compiler smart enough to optimize away local variables that are used for clarity?
local top = x - y
local bottom = x + y
someCall(top, bottom)
Or does inlining things by hand run faster?
someCall(x - y, x + y)
Since Lua often compiles source code into byte code on the fly, it is designed to be a fast single-pass compiler. It does do some constant folding, but other than that there are not many optimizations. You can usually check what the compiler does by executing luac -l -l -p file.lua and looking at the generated (disassembled) byte code.
In your case the Lua code
function a( x, y )
local top = x - y
local bottom = x + y
someCall(top, bottom)
end
function b( x, y )
someCall(x - y, x + y)
end
results int the following byte code listing when run through luac5.3 -l -l -p file.lua (some irrelevant parts skipped):
function <file.lua:1,5> (7 instructions at 0xcd7d30)
2 params, 7 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
1 [2] SUB 2 0 1
2 [3] ADD 3 0 1
3 [4] GETTABUP 4 0 -1 ; _ENV "someCall"
4 [4] MOVE 5 2
5 [4] MOVE 6 3
6 [4] CALL 4 3 1
7 [5] RETURN 0 1
constants (1) for 0xcd7d30:
1 "someCall"
locals (4) for 0xcd7d30:
0 x 1 8
1 y 1 8
2 top 2 8
3 bottom 3 8
upvalues (1) for 0xcd7d30:
0 _ENV 0 0
function <file.lua:7,9> (5 instructions at 0xcd7f10)
2 params, 5 slots, 1 upvalue, 2 locals, 1 constant, 0 functions
1 [8] GETTABUP 2 0 -1 ; _ENV "someCall"
2 [8] SUB 3 0 1
3 [8] ADD 4 0 1
4 [8] CALL 2 3 1
5 [9] RETURN 0 1
constants (1) for 0xcd7f10:
1 "someCall"
locals (2) for 0xcd7f10:
0 x 1 6
1 y 1 6
upvalues (1) for 0xcd7f10:
0 _ENV 0 0
As you can see, the first variant (the a function) has two additional MOVE instructions, and two additional locals.
If you are interested in the details of the opcodes, you can check the comments for the OpCode enum in lopcodes.h.
E.g. the opcode format for OP_ADD is:
OP_ADD,/* A B C R(A) := RK(B) + RK(C) */
So the 2 [3] ADD 3 0 1 from above takes the values from registers 0 and 1 (the locals x and y in this case), adds them together, and stores the result in register 3. It is the second opcode in this function and the corresponding source code is on line 3.

Difference between variable.functionName and variable["functionName"]

I know that you can get variables and call functions both by using the name directly
variable.functionName
or using the name as a string
variable["functionName"] or variable[functionNameString]
Now my question is:
Is there any resulting difference in these different ways or are they completely interchangeable?
I am mostly interested about performance here, but any enlightenment is welcome.
The PUC-Rio Lua 5.1 byte code for
print(variable.functionName)
print(variable["functionName"])
print(variable[functionNameString])
is
main <var.lua:0,0> (14 instructions, 56 bytes at 0xafe530)
0+ params, 3 slots, 0 upvalues, 0 locals, 4 constants, 0 functions
1 [1] GETGLOBAL 0 -1 ; print
2 [1] GETGLOBAL 1 -2 ; variable
3 [1] GETTABLE 1 1 -3 ; "functionName"
4 [1] CALL 0 2 1
5 [2] GETGLOBAL 0 -1 ; print
6 [2] GETGLOBAL 1 -2 ; variable
7 [2] GETTABLE 1 1 -3 ; "functionName"
8 [2] CALL 0 2 1
9 [3] GETGLOBAL 0 -1 ; print
10 [3] GETGLOBAL 1 -2 ; variable
11 [3] GETGLOBAL 2 -4 ; functionNameString
12 [3] GETTABLE 1 1 2
13 [3] CALL 0 2 1
14 [3] RETURN 0 1
As you can see the first two lines generate exactly the same byte code (and thus take the same amount of time), while the third line has an additional (global) variable access.
The first line only works since "functionName" is a valid Lua identifier and not a reserved word. Lines 2 and 3 don't have restrictions about the format of the string key.
They are the same. From the manual:
... To represent records, Lua uses the field name as an index. The language supports this representation by providing a.name as syntactic sugar for a["name"].

Quick way to determine the number of k-length paths from A to B in a dense complete graph

Given a complete dense graph (over 250.000 nodes) , what is the quickest way to determine the number of k-length paths from node A to B ?
I understand this is an old post, but I had the exact same question and could not find the answer.
I like to think of this problem as a "permutation without repetition", as the order of the nodes visited matters (permutation) and we aren't backtracking (no repetitions). The number of permutations without repetition is: n!/(n-r)!
For a complete graph with N nodes, there are N - 2 remaining nodes to choose from when creating a path between a given A and B. To create a path of length K, K-1 nodes must be chosen from the remaining nodes after A and B are excluded. Therefore, in this context, n = N - 2, and r = k - 1.
Plugging into the above formula yields:
(N-2)!/(N-K-1)!
Example: for N = 5, with nodes 0,1,2,3,4 the following paths are possible from 0 to 1:
0 1
0 2 1
0 2 3 1
0 2 3 4 1
0 2 4 1
0 2 4 3 1
0 3 1
0 3 2 1
0 3 2 4 1
0 3 4 1
0 3 4 2 1
0 4 1
0 4 2 1
0 4 2 3 1
0 4 3 1
0 4 3 2 1
This yields 1 path of length 1, 3 paths of length 2, 6 paths of length 3, and 6 paths of length 4.
This appears to work for any N>=2 and K<=N-1.
You can use basically dynamic programming: For each node Y and path length k, you can compute the number of paths from A to Y of length k if you know the number of paths from A to X of path length k-1 for all nodes X. Total complexity is O(KV), where K is the total path length you are trying to compute for and V is the number of vertices.

Resources