I've been writing some scripts for a game, the scripts are written in Lua. One of the requirements the game has is that the Update method in your lua script (which is called every frame) may take no longer than about 2-3 milliseconds to run, if it does the game just hangs.
I solved this problem with coroutines, all I have to do is call Multitasking.RunTask(SomeFunction) and then the task runs as a coroutine, I then have to scatter Multitasking.Yield() throughout my code, which checks how long the task has been running for, and if it's over 2 ms it pauses the task and resumes it next frame. This is ok, except that I have to scatter Multitasking.Yield() everywhere throughout my code, and it's a real mess.
Ideally, my code would automatically yield when it's been running too long. So, Is it possible to take a Lua function as an argument, and then execute it line by line (maybe interpreting Lua inside Lua, which I know is possible, but I doubt it's possible if all you have is a function pointer)? In this way I could automatically check the runtime and yield if necessary between every single line.
EDIT:: To be clear, I'm modding a game, that means I only have access to Lua. No C++ tricks allowed.
check lua_sethook in the Debug Interface.
I haven't actually tried this solution myself yet, so I don't know for sure how well it will work.
debug.sethook(coroutine.yield,"",10000);
I picked the number arbitrarily; it will have to be tweaked until it's roughly the time limit you need. Keep in mind that time spent in C functions etc will not increase the instruction count value, so a loop will reach this limit far faster than calls to long-running C functions. It may be viable to set a far lower value and instead provide a function that sees how much os.clock() or similar has increased.
Related
i'm currently trying to find a way to return a static value which would represent how much memory standart a function takes or its time of execution (as a static thread), I thought about using coroutines, however I cannot make any working prototypes, thanks for help in advance ! (:
The Lua function collectgarbage with the string "count" as an argument returns a number that reflects the amount of memory currently in use by the interpreter. Here is a link to an example and more information; I will reproduce the example here:
function memuse()
local i = math.modf(collectgarbage("count") * 1024)
return i
end
This function returns the amount of memory, in kilobytes, currently in use by Lua.
As for time, the simplest way is to call os.time(), which returns the current system time. Note, though, that this only returns the number of seconds to the nearest whole number. If you need greater precision, there are a few options: one, place a system call with io.popen to retrieve the current system time, which includes the non-integer portion; or two, implement some time-related functions in C/C++ and call them from Lua. I have used both options and the second one produces excellent accuracy, but for the sake of simplicity I will just show the first one.
-- Function called 'tick' to retrieve the current OS time.
function tick()
local fil = assert(io.popen("date +%s.%N"))
local str = fil:read("*all")
return tonumber(str)
end
Lua file handles--one of which is produced by the call to io.popen--have their own destructors so they need not be explicitly closed; however, you may want to call fil:close() for the sake of garbage collection and avoiding any open file-related errors.
If you want to pursue the second, more complicated option, I suggest creating a timer class in C++ that makes use of the chrono library to retrieve the system time.
I am not sure that these two functions are of relevance to you, but I hope they help.
I am trying to find out a way to increase the computation time of a function to 1 second without using the sleep function in xilinx microblaze, using the xilkernel.
Hence, may i know how many iterations do i need to do in a simple for loop to increase the computation time to 1 second?
You can't do this reliably and accurately. If you want do a bodge like this, you'll have to calibrate it yourself for your particular system as Microblaze is so configurable, there isn't one right answer. The bodgy way is:
Set up a GPIO peripheral, set one of the pins to '1', run a loop of 1000 iterations (make sure the compiler doesn't optimise it away!) set the pins to '0'. Hang a scope off that pin (you're doing work on embedded systems, you do have a scope, right?) and see how long it takes to run the loop.
But the right way to do it is to use a hardware timer peripheral. Even at a very simple level, you could clear the timer at the start of the function, then poll it at the end until it reaches whatever value corresponds to 1 second. This will still have some imperfections, but given that you haven't specified how close to 1 sec you need to be, it is probably adequate.
I have a cluster app that uses a distributed Redis back-end, with dynamically generated Lua scripts dispatched to the redis instances. The Lua component scripts can get fairly complex and have a significant runtime, and I'd like to be able to profile them to find the hot spots.
SLOWLOG is useful for telling me that my scripts are slow, and exactly how slow they are, but that's not my problem. I know how slow they are, I'd like to figure out which parts of them are slow.
The redis EVAL docs are clear that redis does not export any timekeeping functions to lua, which makes it seem like this might be a lost cause.
So, short a custom fork of Redis, is there any way to tell which parts of my Lua script are slower than others?
EDIT
I took Doug's suggestion and used debug.sethook - here's the hook routine I inserted at the top of my script:
redis.call('del', 'line_sample_count')
local function profile()
local line = debug.getinfo(2)['currentline']
redis.call('zincrby', 'line_sample_count', 1, line)
end
debug.sethook(profile, '', 100)
Then, to see the hottest 10 lines of my script:
ZREVRANGE line_sample_count 0 9 WITHSCORES
If your scripts are processing bound (not I/O bound), then you may be able to use the debug.sethook function with a count hook:
The count hook: is called after the interpreter executes every
count instructions. (This event only happens while Lua is executing a
Lua function.)
You'll have to build a profiler based on the counts you receive in your callback.
The PepperfishProfiler would be a good place to start. It uses os.clock which you don't have, but you could just use hook counts for a very crude approximation.
This is also covered in PiL 23.3 – Profiles
In standard Lua C, you can't. It's not a built-in function - it only returns seconds. So, there are two options available: You either write your own Lua extension DLL to return the time in msec, or:
You can do a basic benchmark using a millisecond-resolution time. You can access the current millisecond time with LuaSocket. Though this adds a dependency to your project, it's an effective way to do trivial benchmarking.
require "socket"
t = socket.gettime();
I've tried every possible fields but can not find the number of times functions are called.
Besides, I don't get Self and # Self. What do these two numbers mean?
There are several other ways to accomplish this. One is obviously to create a static hit counter and an NSLog that emits and increments a counter. This is intrusive though and I found a way to do this with lldb.
Set a breakpoint
Execute the program until you hit the breakpoint the first time and note the breakpoint number on the right hand side of the line you hit (e.g. "Thread 1: breakpoint 7.1", note the 7.1)
Context click on the breakpoint and choose "Edit Breakpoint"
Leave condition blank and choose "Add Action"
Choose "Debugger Command"
In the command box, enter "breakpoint list 7.1" (using the breakpoint number for your breakpoint from step 2). I believe you can use "info break " if you are using gdb.
Check Options "Automatically Continue after evaluating"
Continue
Now, instead of stopping, llvm will emit info about the breakpoint including the number of times it has been passed.
As for the discussion between Glenn and Mike on the previous answer, I'll describe a performance problem where function execution count was useful: I had a particular action in my app where performance degraded considerably with each execution of the action. The Instruments time profiler showed that each time the action was executed, a particular function was taking twice as long as the time before until quickly the app would hang if the action was performed repeatedly. With the count, I was able to determine that with each execution, the function was called twice as many times as it was during the previous execution. It was then pretty easy to look for the reason, which turned out to be that someone was re-registering for a notification in NotificationCenter on each event execution. This had the effect of doubling the number of response handler calls on each execution and thus doubling the "cost" of the function each time. Knowing that it was doubling because it was called twice as many times and not because the performance was just getting worse caused me to look at the calling sequence rather than for reasons the function itself could be degrading over time.
While it's interesting, knowing the number of times called doesn't have anything to do with how much time is spent in them. Which is what Time Profiler is all about. In fact, since it does sampling, it cannot answer how many times.
It seems you cannot use Time Profiler for counting function calls. This question seems to address potential methods for counting.
W/ respect to self and #self:
Self is "The number of times the symbol calls itself." according to the Apple Docs on the Time Profiler.
From the way the numbers look though, it seems self is the summed duration of samples that had this symbol at the bottom of its stack trace. That would make:
# self: the number of samples where this symbol was at the bottom of the stack trace
% self: the percent of self samples relative to total samples of currently displayed call tree
(eg - #self / total samples).
So this wouldn't tell you how many times a method was called. But it would give you an idea how much time is spent in a method or lower in the call tree.
NOTE: I too am unsure about the various 'self' meanings though. Would love to see someone answer this authoritatively. Arrived here searching for that...
IF your objective is to find out what you need to fix to make the program as fast as possible,
Number of calls and self time may be interesting but are irrelevant.
Look at my answer to this question, in particular points 6 and 8.
EDIT: To clarify the point further, suppose the following is the timeline of execution of the program. Some of that time (in this case about 50%) is spent in an activity that can be removed, if you know what it is, such as needless buried I/O, excessive calls to new, runaway notifications, or "insignificant" data validation. If a random-time sample is taken, it has a 50% chance of occurring in that activity, and an examination of the call stack and/or program variables shows that it is doing something that can be removed. Then, if 10 such samples are taken, the activity will be seen on roughly 5 of them, regardless of whether the activity occurs in a few large chunks of time, or many small ones. The activity may be a few lines of code in a function doing something unnecessary, or it may be something much more generalized. Regardless, you recognize it, fix it, and get roughly a factor of 2 speedup. Call counts and self time contribute nothing to this process.
I am running into the following issue while profiling an application under VC6. When I profile the application, the profiler is indicating that a simple getter method similar to the following is being called many hundreds of thousands of times:
int SomeClass::getId() const
{
return m_iId;
};
The problem is, this method is not called anywhere in the test app. When I change the code to the following:
int SomeClass::getId() const
{
std::cout << "Is this method REALLY being called?" << std::endl;
return m_iId;
};
The profiler never includes getId in the list of invoked functions. Comment out the cout and I'm right back to where I started, 130+ thousand calls! Just to be sure it wasn't some cached profiler data or corrupted function lookup table, I'm doing a clean and rebuild between each test. Still the same results!
Any ideas?
I'd guess that what's happening is that the compiler and/or the linker is 'coalescing' this very simple function to one or more other functions that are identical (the code generated for return m_iId is likely exactly the same as many other getters that happen to return a member that's at the same offset).
essentially, a bunch of different functions that happen to have identical machine code implementations are all resolved to the same address, confusing the profiler.
You may be able to stop this from happening (if this is the problem) by turning off optimizations.
I assume you are profiling because you want to find out if there are ways to make the program take less time, right? You're not just profiling because you like to see numbers.
There's a simple, old-fashioned, tried-and-true way to find performance problems. While the program is running, just hit the "pause" button and look at the call stack. Do this several times, like from 5 to 20 times. The bigger a problem is, the fewer samples you need to find it.
Some people ask if this isn't basically what profilers do, and the answer is only very few. Most profilers fall for one or more common myths, with the result that your speedup is limited because they don't find all problems:
Some programs are spending unnecessary time in "hotspots". When that is the case, you will see that the code at the "end" of the stack (where the program counter is) is doing needless work.
Some programs do more I/O than necessary. If so, you will see that they are in the process of doing that I/O.
Large programs are often slow because their call trees are needlessly bushy, and need pruning. If so, you will see the unnecessary function calls mid-stack.
Any code you see on some percentage of stacks will, if removed, save that percentage of execution time (more or less). You can't go wrong. Here's an example, over several iterations, of saving over 97%.