Can I avoid 2 lookups by using _G? - lua

I've read that when you use a function like table.insert, Lua first tries to look up the variable in the local scope, then in the global scope. Can I bypass the local lookup by using _G.table.insert instead?
Here's the output of luac -l:
without _G
main <main.lua:0,0> (7 instructions at 0x55581ae50c60)
0+ params, 4 slots, 1 upvalue, 1 local, 3 constants, 0 functions
1 [1] NEWTABLE 0 0 0
2 [2] GETTABUP 1 0 -1 ; _ENV "table"
3 [2] GETTABLE 1 1 -2 ; "insert"
4 [2] MOVE 2 0
5 [2] LOADK 3 -3 ; 5
6 [2] CALL 1 3 1
7 [2] RETURN 0 1
with _G
main <main.lua:0,0> (8 instructions at 0x5562b3d6dc60)
0+ params, 4 slots, 1 upvalue, 1 local, 4 constants, 0 functions
1 [1] NEWTABLE 0 0 0
2 [2] GETTABUP 1 0 -1 ; _ENV "_G"
3 [2] GETTABLE 1 1 -2 ; "table"
4 [2] GETTABLE 1 1 -3 ; "insert"
5 [2] MOVE 2 0
6 [2] LOADK 3 -4 ; 5
7 [2] CALL 1 3 1
8 [2] RETURN 0 1
I'm not sure what the numbers mean.

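_G is not a reserved word; it is an ordinary global variable that happens to hold the globals table, so it has to be looked up like any other global. There is also no local lookup to skip at run time: the compiler already knows that table is not a local, so it compiles it directly into a lookup in _ENV. As the compiler output shows, the _G version needs one extra GETTABLE (8 instructions instead of 7), so I think doing it with _G is even slower.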
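If the goal is fewer lookups per call, the usual idiom goes the other way: copy the function into a local once and call the local. A minimal sketch, not taken from the question; the names insert and t are only for illustration:

```lua
-- Look up table.insert once and keep the function in a local (a register).
local insert = table.insert

local t = {}
for i = 1, 3 do
  insert(t, i * 10)  -- no _ENV or table lookup inside the loop
end

print(#t, t[1], t[3])
```

Whether this is worth doing is something to measure; outside of hot loops the difference is usually negligible.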

Related

Where are class names stored in a machine learning dataset in Python?

I'm learning machine learning using the iris dataset on Python 3.6 with sklearn, and I don't understand where the class names that are being retrieved are stored. In Iris, there are 3 classes, and each class contains 50 observations. You can use several commands to print the classes, and their associated numerical values:
print(iris.target)
print(iris.target_names)
This will result in the output:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
['setosa' 'versicolor' 'virginica']
So as can be seen, the classes are Setosa, Versicolor, and Virginica. What I don't understand is where these class names are being stored, or how they're called upon within the model. If you use the shape command on the data, or target, the result is (150, 4) and (150,), meaning there are 150 rows (observations) and 4 columns in the data, and 150 entries in the target. I am just not able to bridge the gap in my mind as to where this is coming from, however.
What I don't understand is where the class names are supposed to be stored. If I made a brand new dataset for pokemon types and had ice, fire, water, flying, where could I store these types? Would they be required to be numerical as well, like iris, with 0,1,2,3?
Sklearn uses a custom type of object to store its datasets, precisely so that it can store metadata along with the raw data.
If you load the iris dataset
In [2]: from sklearn import datasets
In [3]: iris = datasets.load_iris()
You can inspect the type of object with type:
In [4]: type(iris)
Out[4]: sklearn.utils.Bunch
You can look at the attributes inside the object with dir:
In [5]: dir(iris)
Out[5]: ['DESCR', 'data', 'feature_names', 'target', 'target_names']
And then use . notation to take a look at the attributes themselves:
In [6]: type(iris.data)
Out[6]: numpy.ndarray
In [7]: type(iris.target)
Out[7]: numpy.ndarray
In [8]: type(iris.feature_names)
Out[8]: list
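The class names are stored in iris.target_names, and each integer in iris.target is an index into that array (0 is 'setosa', 1 is 'versicolor', 2 is 'virginica'). If you want to mimic this structure for your own datasets, you will have to define your own custom object type, that is, your own class.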

Is there a more efficient way to write this if (or) statement?

I'm looking to make sure that it is/isn't possible to write this statement more efficiently in Lua:
if (value == 1 or value == 2) then
something like this for an example (nonworking I assume):
if (value == (1 or 2)) then
or
if value == (1;2) then
Let's look at the produced bytecode as a proxy for speed. (Microbenchmarks are not reliable. Caching, pipelining, branch prediction, … can have really weird effects that can make code that should be slower in principle perform better in practice in the context where you're actually using it. Bytecode size also isn't a very good indicator (same problems apply), but at least it's easy to produce, deterministic, and easy to interpret.)
(To follow along, throw your test files at luac -p -l, which will only parse (not write a compiled file) and list the resulting bytecode as a side-effect. If you want to understand the bytecode, have a look at the unofficial bytecode reference as initially created by Kein-Hong Man and kindly updated by Dibyendu Majumdar. But you don't have to.)
If value is a global variable, you'll get this:
1 [1] GETTABUP 0 0 -1 ; _ENV "value"
2 [1] EQ 1 0 -2 ; - 1 (fall through to next comparison)
3 [1] JMP 0 3 ; to 7 (true branch)
4 [1] GETTABUP 0 0 -1 ; _ENV "value"
5 [1] EQ 0 0 -3 ; - 2 (fall through into true branch)
6 [1] JMP 0 1 ; to 8 (beyond true branch)
Translating back into "pseudo-Lua", this is roughly
local r0 = _ENV["value"]
if r0 == 1 then goto true_branch end
local r0 = _ENV["value"]
if r0 ~= 2 then goto fin end
::true_branch::
-- stuff here
::fin::
If value is local (or a function argument) in the function where you use this, then you'll get something like this instead:
1 [1] EQ 1 0 -1 ; - 1
2 [1] JMP 0 2 ; to 5
3 [1] EQ 0 0 -2 ; - 2
4 [1] JMP 0 1 ; to 6
or roughly
if r0 == 1 then goto true_branch end
if r0 ~= 2 then goto fin end
::true_branch::
-- stuff here
::fin::
"Much" better!
So if value is a global variable (or an upvalue), doing
local value = value
if value == 1 or value == 2 then
-- stuff
end
will give you
1 [1] GETTABUP 0 0 -1 ; _ENV "value"
2 [1] EQ 1 0 -2 ; - 1
3 [1] JMP 0 2 ; to 6
4 [1] EQ 0 0 -3 ; - 2
5 [1] JMP 0 1 ; to 7
or
local r0 = _ENV["value"]
if r0 == 1 then goto true_branch end
if r0 == 2 then goto true_branch end
goto fin
::true_branch::
-- stuff here
::fin::
which saves one lookup. (While microbenchmarks will show a clear difference, in practice you'll almost never notice it. If you're doing a deeper lookup, as in if foo.bar.baz == 1 or foo.bar.baz == 2 then, it makes sense to store the value in a local first, and it will probably also increase readability.)
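For the deeper-lookup case, caching in a local might look like the sketch below. The table foo and the variables baz and matched are made up for illustration; only the pattern comes from the answer.

```lua
local foo = { bar = { baz = 2 } }  -- hypothetical nested table

-- Resolve foo.bar.baz once instead of once per comparison.
local baz = foo.bar.baz

local matched = false
if baz == 1 or baz == 2 then
  matched = true
end

print(matched)
```

Note that the local holds a snapshot: if foo.bar.baz changes later, baz keeps the old value.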

Lua more than one locals in one line

Assuming we have the following code:
local x = 1
local x, y = 2, 3
I know x will become 2 after the second line. However, does the local on that line create a new x, or use the one before?
They will be two different local values: the first one will be shadowed and not accessible as the second one is created with the same name in the same block. Here is the information that luac -l -l (Lua 5.3) shows for this script:
main <local.lua:0,0> (4 instructions at 00697ae8)
0+ params, 3 slots, 1 upvalue, 3 locals, 3 constants, 0 functions
1 [1] LOADK 0 -1 ; 1
2 [2] LOADK 1 -2 ; 2
3 [2] LOADK 2 -3 ; 3
4 [2] RETURN 0 1
constants (3) for 00697ae8:
1 1
2 2
3 3
locals (3) for 00697ae8:
0 x 2 5
1 x 4 5
2 y 4 5
upvalues (1) for 00697ae8:
0 _ENV 1 0
The locals section shows three variables, two of them named x. The two x entries start at different instructions (2 and 4) but share the same end-of-scope location (5).
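One way to observe the two separate variables from inside Lua is to capture the first x in a closure before it is shadowed. This is only an illustrative sketch; the helper first_x is not part of the question.

```lua
local x = 1

-- This closure captures the first x, before the second declaration.
local function first_x()
  return x
end

-- A new x is declared here; it shadows the first one instead of reusing it.
local x, y = 2, 3

-- The first x still holds 1, the new x holds 2, and y holds 3.
print(first_x(), x, y)
```

If the second line reused the existing variable, first_x() would return 2 as well.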

Does the Lua compiler optimize local vars?

Is the current Lua compiler smart enough to optimize away local variables that are used for clarity?
local top = x - y
local bottom = x + y
someCall(top, bottom)
Or does inlining things by hand run faster?
someCall(x - y, x + y)
Since Lua often compiles source code into byte code on the fly, it is designed to be a fast single-pass compiler. It does do some constant folding, but other than that there are not many optimizations. You can usually check what the compiler does by executing luac -l -l -p file.lua and looking at the generated (disassembled) byte code.
In your case the Lua code
function a( x, y )
local top = x - y
local bottom = x + y
someCall(top, bottom)
end
function b( x, y )
someCall(x - y, x + y)
end
results in the following byte code listing when run through luac5.3 -l -l -p file.lua (some irrelevant parts skipped):
function <file.lua:1,5> (7 instructions at 0xcd7d30)
2 params, 7 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
1 [2] SUB 2 0 1
2 [3] ADD 3 0 1
3 [4] GETTABUP 4 0 -1 ; _ENV "someCall"
4 [4] MOVE 5 2
5 [4] MOVE 6 3
6 [4] CALL 4 3 1
7 [5] RETURN 0 1
constants (1) for 0xcd7d30:
1 "someCall"
locals (4) for 0xcd7d30:
0 x 1 8
1 y 1 8
2 top 2 8
3 bottom 3 8
upvalues (1) for 0xcd7d30:
0 _ENV 0 0
function <file.lua:7,9> (5 instructions at 0xcd7f10)
2 params, 5 slots, 1 upvalue, 2 locals, 1 constant, 0 functions
1 [8] GETTABUP 2 0 -1 ; _ENV "someCall"
2 [8] SUB 3 0 1
3 [8] ADD 4 0 1
4 [8] CALL 2 3 1
5 [9] RETURN 0 1
constants (1) for 0xcd7f10:
1 "someCall"
locals (2) for 0xcd7f10:
0 x 1 6
1 y 1 6
upvalues (1) for 0xcd7f10:
0 _ENV 0 0
As you can see, the first variant (the a function) has two additional MOVE instructions, and two additional locals.
If you are interested in the details of the opcodes, you can check the comments for the OpCode enum in lopcodes.h.
E.g. the opcode format for OP_ADD is:
OP_ADD,/* A B C R(A) := RK(B) + RK(C) */
So the 2 [3] ADD 3 0 1 from above takes the values from registers 0 and 1 (the locals x and y in this case), adds them together, and stores the result in register 3. It is the second opcode in this function and the corresponding source code is on line 3.

Difference between variable.functionName and variable["functionName"]

I know that you can get variables and call functions both by using the name directly
variable.functionName
or using the name as a string
variable["functionName"] or variable[functionNameString]
Now my question is:
Is there any resulting difference in these different ways or are they completely interchangeable?
I am mostly interested in performance here, but any enlightenment is welcome.
The PUC-Rio Lua 5.1 byte code for
print(variable.functionName)
print(variable["functionName"])
print(variable[functionNameString])
is
main <var.lua:0,0> (14 instructions, 56 bytes at 0xafe530)
0+ params, 3 slots, 0 upvalues, 0 locals, 4 constants, 0 functions
1 [1] GETGLOBAL 0 -1 ; print
2 [1] GETGLOBAL 1 -2 ; variable
3 [1] GETTABLE 1 1 -3 ; "functionName"
4 [1] CALL 0 2 1
5 [2] GETGLOBAL 0 -1 ; print
6 [2] GETGLOBAL 1 -2 ; variable
7 [2] GETTABLE 1 1 -3 ; "functionName"
8 [2] CALL 0 2 1
9 [3] GETGLOBAL 0 -1 ; print
10 [3] GETGLOBAL 1 -2 ; variable
11 [3] GETGLOBAL 2 -4 ; functionNameString
12 [3] GETTABLE 1 1 2
13 [3] CALL 0 2 1
14 [3] RETURN 0 1
As you can see the first two lines generate exactly the same byte code (and thus take the same amount of time), while the third line has an additional (global) variable access.
The first line only works since "functionName" is a valid Lua identifier and not a reserved word. Lines 2 and 3 don't have restrictions about the format of the string key.
They are the same. From the manual:
... To represent records, Lua uses the field name as an index. The language supports this representation by providing a.name as syntactic sugar for a["name"].
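A small sketch of the equivalence and of the identifier restriction mentioned above. The table contents are invented for illustration; the variable names follow the question.

```lua
local variable = {
  functionName = function() return "called" end,
}
local functionNameString = "functionName"

-- All three forms fetch the same function value.
local same_literal = variable.functionName == variable["functionName"]
local same_dynamic = variable.functionName == variable[functionNameString]

-- Keys that are reserved words or not valid identifiers need the bracket form.
variable["end"] = 1
variable["two words"] = 2

print(same_literal, same_dynamic, variable["end"], variable["two words"])
```

Writing variable.end would be a syntax error, which is why the bracket form is the only option for such keys.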
