In Lua, when you have a function in a table, what is the difference (if any) between assigning to a global variable within the function and storing the variable as an entry in the table? The variable is x in the example below.
i.e.
dog={x=33,
func=function(self)
self.x=self.x*self.x
end
}
cat={func=function()
x=33
x=x*x
end
}
In dog I can use the properties of self to call the function with dog:func() instead of dog.func(dog). Beyond that, is there anything performance-wise to consider when choosing between the two? The examples behave a bit differently when called in a loop, but apart from that?
Well, I heard that the two first rules about optimization are "Don't do it!" and "Don't do it yet!".
There is an official document describing some ways to optimize Lua code, and I recommend it. Its most important rule is to prefer local variables to global variables, because accessing globals is slower than accessing locals (about 30% slower, according to that document).
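As a minimal sketch of that rule, a global such as math.sin can be cached in a local before a hot loop. The function names below are made up for illustration, and the actual speedup depends on the Lua version and the workload, so measure before relying on it.

```lua
-- Looks up the global `math` and its field `sin` on every iteration.
local function sumSinesGlobal(n)
  local total = 0
  for i = 1, n do
    total = total + math.sin(i)
  end
  return total
end

-- Caches the function in a local once; the loop no longer touches globals.
local sin = math.sin
local function sumSinesLocal(n)
  local total = 0
  for i = 1, n do
    total = total + sin(i)
  end
  return total
end

-- Both versions compute the same value; only the lookup cost differs.
print(sumSinesGlobal(1000) == sumSinesLocal(1000))  -- true
```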
The first thing we can do with the previous code is to compile it and inspect the bytecode instructions to understand what happens at execution time. I stored the first function in "test-1.lua" and the second one in "test-2.lua".
> cat test-1.lua
dog={x=33,
func=function(self)
self.x=self.x*self.x
end
}
function TEST ()
dog:func()
end
> luac54 -l -s test-1.lua
#
#(part of output omitted for clarity)
#
# Function: dog.func
#
function <test-1.lua:2,4> (6 instructions at 0000000000768740)
1 param, 3 slots, 0 upvalues, 1 local, 1 constant, 0 functions
1 [3] GETFIELD 1 0 0 ; "x"
2 [3] GETFIELD 2 0 0 ; "x"
3 [3] MUL 1 1 2
4 [3] MMBIN 1 2 8 ; __mul
5 [3] SETFIELD 0 0 1 ; "x"
6 [4] RETURN0
#
# Function: TEST (function to call dog.func)
#
function <test-1.lua:7,9> (4 instructions at 00000000000a8a90)
0 params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
1 [8] GETTABUP 0 0 0 ; _ENV "dog"
2 [8] SELF 0 0 1k ; "func"
3 [8] CALL 0 2 1 ; 1 in 0 out
4 [9] RETURN0
So, if we want to execute TEST 10 times, we will need to execute at least 10*(4+6) bytecode instructions, that is, 100 bytecode instructions.
> cat test-2.lua
cat={func=function()
x=x*x
end
}
x=33
function TEST ()
cat.func()
end
> luac54 -l -s test-2.lua
#
#(part of output omitted for clarity)
#
# Function: cat.func
#
function <test-2.lua:1,3> (6 instructions at 00000000001b87f0)
0 params, 2 slots, 1 upvalue, 0 locals, 1 constant, 0 functions
1 [2] GETTABUP 0 0 0 ; _ENV "x"
2 [2] GETTABUP 1 0 0 ; _ENV "x"
3 [2] MUL 0 0 1
4 [2] MMBIN 0 1 8 ; __mul
5 [2] SETTABUP 0 0 0 ; _ENV "x"
6 [3] RETURN0
#
# Function: TEST (function to call cat.func)
#
function <test-2.lua:8,10> (4 instructions at 00000000001b8a80)
0 params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
1 [9] GETTABUP 0 0 0 ; _ENV "cat"
2 [9] GETFIELD 0 0 1 ; "func"
3 [9] CALL 0 1 1 ; 0 in 0 out
4 [10] RETURN0
So, if we want to execute TEST 10 times, we will need to execute at least 10*(4+6) bytecode instructions, that is, 100 bytecode instructions... which is exactly the same as the first version!
Obviously, not all bytecode instructions take the same time to execute. Some instructions spend much more time in the C runtime than others. Adding two integers is likely much faster than allocating a new table and initializing some fields. At this point, we can try a dirty-and-pointless microbenchmark to get an idea.
One might copy and paste this code in a Lua interpreter:
> cat dirty-and-pointess-benchmark.lua
dog={x=33,
func=function(self)
self.x=self.x*self.x
end
}
cat={func=function()
x=x*x
end
}
x=33
function StartMeasure ()
StartTime = os.clock()
end
function StopMeasure (TestName)
local Duration = os.clock() - StartTime
print(string.format("%s: %f sec", TestName, Duration))
end
function DoTest1 (Count)
for Index = 1, Count do
dog:func()
end
end
function DoTest2 (Count)
for Index = 1, Count do
cat.func()
end
end
COUNT = 5000000000
StartMeasure()
DoTest1(COUNT)
StopMeasure("VERSION_1")
StartMeasure()
DoTest2(COUNT)
StopMeasure("VERSION_2")
This code gives these results on my computer:
VERSION_1: 246.816000 sec
VERSION_2: 250.412000 sec
The difference is probably negligible for most programs. We should always try to spend more time writing correct programs and less time doing micro-benchmarks.
The two code snippets do very different things. dog.func sets self.x to the square of its previous value, so the value keeps growing from call to call. cat.func sets the global x to 1089 on every call. You can't really compare performance between two things whose functionality is so different.
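To make the difference concrete, here is a small sketch that calls each of the question's functions twice and prints the resulting values (the tables are declared local here only to keep the example self-contained):

```lua
local dog = {
  x = 33,
  func = function(self)
    self.x = self.x * self.x
  end,
}

local cat = {
  func = function()
    x = 33        -- global, reset on every call
    x = x * x
  end,
}

dog:func(); dog:func()
cat.func(); cat.func()

print(dog.x)  -- 1185921 (33^4: the field keeps being squared)
print(x)      -- 1089    (33^2: the same result on every call)
```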
First of all you should change
cat={func=function()
x=33
x=x*x
end
}
to
x=33
cat={func=function()
x=x*x
end
}
Now we have the same operations.
If I run both functions 10000 times, I end up with cat.func() a few percent slower than dog:func().
This is not surprising, as indexing locals is faster than indexing globals.
To speed up cat you could do something like this:
x=33
cat={func=function()
local _x = x
x = _x*_x
end
}
The fastest solution is probably
dog={x=33,
func=function(self)
local x = self.x
self.x = x*x
end
}
and you could even gain more speed by making your tables and x local.
Usually you don't win anything significant doing things like that.
Premature optimization is a big no-no and you should ask yourself what problem you're actually trying to solve.
It also doesn't make sense to squeeze the last percent out of your code if you do not even know enough Lua to write a simple benchmark for your code... just a thought.
Problem (Tested on Lua 5.3 and 5.4):
a = -9223372036854775807 - 1 ==> -9223372036854775808 (lua_integer)
b = -9223372036854775808 ==> -9.2233720368548e+018 (lua_number)
Question:
Is it possible to get "-9223372036854775808" without modifying "luaconf.h" or writing "-9223372036854775807 - 1"?
When you write b = -9223372036854775808 in your program, the Lua parser treats this as "apply the negation operator to a positive integer constant". The positive constant is beyond the integer range, so it is treated as a float, the negation is applied to that float, and the final result is a float.
There are two solutions to get minimal integer:
Bitwise operators convert floats to integers (bitwise OR has lower priority than negation):
b = -9223372036854775808|0
Use the special constant from math library:
b = math.mininteger
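The following sketch checks each spelling with math.type. It assumes Lua 5.3 or later, since earlier versions have neither an integer subtype nor bitwise operators.

```lua
-- Assumes Lua 5.3+ (integer subtype, `|` operator, math.type).
local a = -9223372036854775807 - 1   -- integer arithmetic stays in range
local b = -9223372036854775808       -- constant overflows, becomes a float
local c = -9223372036854775808|0     -- bitwise OR converts back to integer
local d = math.mininteger

print(math.type(a))        -- integer
print(math.type(b))        -- float
print(math.type(c))        -- integer
print(a == c and c == d)   -- true
```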
P.S.
Please note that additional OR in the expression b = -9223372036854775808|0 does not make your program slower. Actually, all calculations (negation and OR) are done at compile time, and the bytecode contains only the final constant you need:
$ luac53 -l -l -p -
b = -9223372036854775808|0
main <stdin:0,0> (2 instructions at 0x244f780)
0+ params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
1 [1] SETTABUP 0 -1 -2 ; _ENV "b" -9223372036854775808
2 [1] RETURN 0 1
constants (2) for 0x244f780:
1 "b"
2 -9223372036854775808
locals (0) for 0x244f780:
upvalues (1) for 0x244f780:
0 _ENV 1 0
I'm attempting to read IR information from a NodeMCU running Lua 5.1.4 from a master build as of 8/19/2017.
I might be misunderstanding how GPIO works and I'm having a hard time finding examples that relate to what I'm doing.
pin = 4
pulse_prev_time = 0
irCallback = nil
function trgPulse(level, now)
gpio.trig(pin, level == gpio.HIGH and "down" or "up", trgPulse)
duration = now - pulse_prev_time
print(level, duration)
pulse_prev_time = now
end
function init(callback)
irCallback = callback
gpio.mode(pin, gpio.INT)
gpio.trig(pin, 'down', trgPulse)
end
-- example
print("Monitoring IR")
init(function (code)
print("omg i got something", code)
end)
I'm triggering the initial interrupt on low, and then alternating from low to high in trgPulse. In doing so I'd expect the levels to alternate from 1 to 0 in a perfect pattern. But the output shows otherwise:
1 519855430
1 1197
0 609
0 4192
0 2994
1 589
1 2994
1 1198
1 3593
0 4201
1 23357
0 608
0 5390
1 1188
1 4191
1 1198
0 3601
0 3594
1 25147
0 608
1 4781
0 2405
1 3584
0 4799
0 1798
1 1188
1 2994
So I'm clearly doing something wrong or fundamentally don't understand how GPIO works. If this is expected, why are the interrupts being called multiple times if the low/high levels didn't change? And if this does seem wrong, any ideas how to fix it?
I'm clearly doing something wrong or fundamentally don't understand how GPIO works
I suspect it's a bit of both, and the latter may be the cause of the former.
My explanation may not be 100% correct from a mechanical/electronic perspective (not my world) but it should be enough as far as writing software for GPIO goes. Switches tend to bounce between 0 and 1 until they eventually settle for one. A good article to read up on this is https://www.allaboutcircuits.com/technical-articles/switch-bounce-how-to-deal-with-it/. The effect can be addressed with hardware and/or software.
Doing it with software usually involves introducing some form of delay to skip the bouncing signals as you're only interested in the "settled state". I documented the NodeMCU Lua function I use for that at https://gist.github.com/marcelstoer/59563e791effa4acb65f
-- inspired by https://github.com/hackhitchin/esp8266-co-uk/blob/master/tutorials/introduction-to-gpio-api.md
-- and http://www.esp8266.com/viewtopic.php?f=24&t=4833&start=5#p29127
local pin = 4 --> GPIO2
function debounce (func)
local last = 0
local delay = 50000 -- 50ms * 1000 as tmr.now() has μs resolution
return function (...)
local now = tmr.now()
local delta = now - last
if delta < 0 then delta = delta + 2147483647 end; -- proposed because of delta rolling over, https://github.com/hackhitchin/esp8266-co-uk/issues/2
if delta < delay then return end;
last = now
return func(...)
end
end
function onChange ()
print('The pin value has changed to '..gpio.read(pin))
end
gpio.mode(pin, gpio.INT, gpio.PULLUP) -- see https://github.com/hackhitchin/esp8266-co-uk/pull/1
gpio.trig(pin, 'both', debounce(onChange))
Note: delay is an empirical value specific to the sensor/switch!
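To try the debounce logic without a NodeMCU board, here is a rough sketch of the same idea with the clock passed in as a parameter. The names debounceWith and fakeTime are hypothetical and exist only for this illustration; on the device you would keep using tmr.now() as above.

```lua
-- Hypothetical off-device variant: `clock` stands in for tmr.now().
local function debounceWith(clock, delay, func)
  local last = 0
  return function(...)
    local now = clock()
    local delta = now - last
    if delta < 0 then delta = delta + 2147483647 end  -- timer rollover
    if delta < delay then return end                  -- still bouncing, ignore
    last = now
    return func(...)
  end
end

-- Simulated timestamps in microseconds: three bursts of bouncing edges.
local fakeTime = 0
local accepted = {}
local handler = debounceWith(
  function() return fakeTime end,
  50000,
  function(t) accepted[#accepted + 1] = t end
)

for _, t in ipairs({100000, 100600, 101200, 160000, 160500, 220000}) do
  fakeTime = t
  handler(t)
end

print(table.concat(accepted, ", "))  -- 100000, 160000, 220000
```

Only the first edge of each burst reaches the callback; edges arriving less than 50 ms after an accepted one are dropped.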
In writing some one-off Lua code for an answer, I found myself code golfing to fit a function on a single line. While this code did not fit on one line...
foo=function(a,b) local c=bob; some_code_using_c; return c; end
...I realized that I could just make it fit by converting it to:
foo=function(a,b,c) c=bob; some_code_using_c; return c; end
Are there any performance or functional implications of using a function parameter to declare a function-local variable (assuming I know that a third argument will never be passed to the function) instead of using local? Do the two techniques ever behave differently?
Note: I included semicolons in the above for clarity of concept and to aid those who do not know Lua's handling of whitespace. I am aware that they are not necessary; if you follow the link above you will see that the actual code does not use them.
Edit: Based on @Oka's answer, I compared the bytecode generated by these two functions, in separate files:
function foo(a,b)
local c
return function() c=a+b+c end
end
function foo(a,b,c)
-- this line intentionally blank
return function() c=a+b+c end
end
Ignoring addresses, the byte code report is identical (except for the number of parameters listed for the function).
You can go ahead and look at the Lua bytecode generated by using luac -l -l -p my_file.lua, comparing instruction sets and register layouts.
On my machine:
function foo (a, b)
local c = a * b
return c + 2
end
function bar (a, b, c)
c = a * b
return c + 2
end
Produces:
function <f.lua:1,4> (4 instructions at 0x80048fe0)
2 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [2] MUL 2 0 1
2 [3] ADD 3 2 -1 ; - 2
3 [3] RETURN 3 2
4 [4] RETURN 0 1
constants (1) for 0x80048fe0:
1 2
locals (3) for 0x80048fe0:
0 a 1 5
1 b 1 5
2 c 2 5
upvalues (0) for 0x80048fe0:
function <f.lua:6,9> (4 instructions at 0x800492b8)
3 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [7] MUL 2 0 1
2 [8] ADD 3 2 -1 ; - 2
3 [8] RETURN 3 2
4 [9] RETURN 0 1
constants (1) for 0x800492b8:
1 2
locals (3) for 0x800492b8:
0 a 1 5
1 b 1 5
2 c 1 5
upvalues (0) for 0x800492b8:
Not very much difference, is there? If I'm not mistaken, there's just a slightly different declaration location specified for each c, and the difference in the params size, as one might expect.
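One place where the two forms can behave differently is when a caller does pass an extra argument and the function does not overwrite c before using it. A minimal sketch, with function names invented for the illustration:

```lua
-- `c` is a true local: an extra argument is silently discarded.
local function withLocal(a, b)
  local c
  return c
end

-- `c` is a parameter: an extra argument becomes its initial value.
local function withParam(a, b, c)
  return c
end

print(withLocal(1, 2, "extra"))  -- nil
print(withParam(1, 2, "extra"))  -- extra
```

If the function always assigns c before reading it, as in the one-liner from the question, this difference disappears.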
In the below code, could anyone please explain how b,a = a,b internally works?
-- Variable definition:
local a, b
-- Initialization
a = 10
b = 30
print("value of a:", a)
print("value of b:", b)
-- Swapping of variables
b, a = a, b
print("value of a:", a)
print("value of b:", b)
Consider the Lua script:
local a, b
a = 10
b = 30
b, a = a, b
Run luac -l on it and you'll get this:
1 [1] LOADNIL 0 1
2 [2] LOADK 0 -1 ; 10
3 [3] LOADK 1 -2 ; 30
4 [4] MOVE 2 0
5 [4] MOVE 0 1
6 [4] MOVE 1 2
7 [4] RETURN 0 1
These are the instructions of the Lua VM for the given script. The local variables are assigned to registers 0 and 1 and then register 2 is used for the swap, much like you'd do manually using a temporary variable. In fact, the Lua script below generates exactly the same VM code:
local a, b
a = 10
b = 30
local c=a; a=b; b=c
The only difference is that the compiler would reuse register 2 in the first case if the script was longer and more complex.
I assume that by internally you don't mean Lua C code?
Basically, in a multiple assignment Lua always evaluates all expressions on the right-hand side before performing the assignment.
So if you use your variables on both sides of the assignment, you can be sure that:
local x, y = 5, 10
x, y = doSomeCalc(y), doSomeCalc(x) --x and y retain their original values for both operations, before the assignment is made
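A runnable version of the same idea, with simple arithmetic standing in for doSomeCalc:

```lua
local x, y = 5, 10

-- Both right-hand expressions are evaluated with the old x and y,
-- and only then are the results assigned.
x, y = y * 2, x * 2

print(x, y)  -- 20   10
```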
I'm new to Lua and trying to get things sorted in my head. I tried this code:
function newCarousel(images)
local slideToImage = function()
print("ah!")
end
end
local testSlide = newCarousel(myImages)
testSlide.slideToImage()
Which gave me this error:
Attempt to index local "testSlide" (a nil value)...
Why is this?
Because newCarousel returns nothing, so testSlide is nil, so when you try to index it (testSlide.slideToImage is exactly equivalent to testSlide["slideToImage"]) you get an error.
I would recommend reading Programming in Lua. You may be able to work out the language's syntax, semantics, and idioms by trial and error, but it'll take you a lot longer.
If you want to be able to do testSlide.slideToImage() you have to modify newCarousel so that it returns a table with a function inside it. The simplest implementation is the following:
function newCarousel(images)
local t = {}
t.slideToImage = function()
print("ah!")
end
return t
end
You can even build t and return it on a single step; the following code is equivalent to the one above:
function newCarousel(images)
return {
slideToImage = function()
print("ah!")
end
}
end
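Put together with the calling code from the question, this might look as follows. The empty myImages table is a placeholder, since the question does not show what it contains.

```lua
local function newCarousel(images)
  return {
    slideToImage = function()
      print("ah!")
    end
  }
end

local myImages = {}  -- placeholder value, not from the question
local testSlide = newCarousel(myImages)

-- testSlide is now a table, so indexing it works.
testSlide.slideToImage()  -- prints "ah!"
```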
The code you've got now, as Mud stated, doesn't return anything. (This is not Scheme or Ruby or the like where the last expression is the return value.) Further, you seem to be thinking that newCarousel is an object. It isn't. It's a function. When you've finished calling newCarousel it's over. It's done its work, whatever that may be (which in your case is creating a local variable that is promptly dropped and returning nil).
Correct code for this would look more like:
function newCarousel(images)
return function()
print("ah!")
end
end
local testSlide = newCarousel(myImages)
testSlide()
Here I now have newCarousel creating an (anonymous) function and immediately returning it. This anonymous function is bound to testSlide so I can invoke it any time I like for as long as testSlide remains in scope.
It's instructive to look at the generated code when playing with Lua. First let's look at what luac churns out for your code:
main <junk.lua:0,0> (8 instructions, 32 bytes at 0xeb6540)
0+ params, 2 slots, 0 upvalues, 1 local, 3 constants, 1 function
1 [5] CLOSURE 0 0 ; 0xeb6720
2 [1] SETGLOBAL 0 -1 ; newCarousel
3 [7] GETGLOBAL 0 -1 ; newCarousel
4 [7] GETGLOBAL 1 -2 ; myImages
5 [7] CALL 0 2 2
6 [8] GETTABLE 1 0 -3 ; "slideToImage"
7 [8] CALL 1 1 1
8 [8] RETURN 0 1
function <junk.lua:1,5> (2 instructions, 8 bytes at 0xeb6720)
1 param, 2 slots, 0 upvalues, 2 locals, 0 constants, 1 function
1 [4] CLOSURE 1 0 ; 0xeb6980
2 [5] RETURN 0 1
function <junk.lua:2,4> (4 instructions, 16 bytes at 0xeb6980)
0 params, 2 slots, 0 upvalues, 0 locals, 2 constants, 0 functions
1 [3] GETGLOBAL 0 -1 ; print
2 [3] LOADK 1 -2 ; "ah!"
3 [3] CALL 0 2 1
4 [4] RETURN 0 1
In your code the mainline creates a closure, binds it to the name newCarousel, gets that value, gets the value of myImages and does a call. This corresponds to local testSlide = newCarousel(myImages). Next it gets the slideToImage value from the local table (testSlide). The problem here is that testSlide isn't a table, it's nil. This is where your error message is coming from. This isn't the only error, mind you, but it's the first one the runtime sees and is what's making everything choke. If you'd returned an actual function from newCarousel you'd get a different error. If, for example, I'd added the line return slideToImage to the newCarousel function, the error message would have been "attempt to index local 'testSlide' (a function value)".
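Both failure modes can be reproduced safely with pcall. This is only a sketch: the helper name indexSlide is made up, and the exact wording of the messages differs between Lua versions, so the code prints whatever the runtime reports.

```lua
-- Indexes whatever it is given, the same way the question's last line does.
local function indexSlide(value)
  local testSlide = value
  return testSlide.slideToImage
end

-- Case 1: newCarousel returned nothing, so testSlide is nil.
local okNil, errNil = pcall(indexSlide, nil)

-- Case 2: newCarousel returned the function itself instead of a table.
local okFn, errFn = pcall(indexSlide, function() end)

print(okNil, errNil)  -- false, plus a message mentioning "a nil value"
print(okFn, errFn)    -- false, plus a message mentioning "a function value"
```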