Way to write code "in the debugger" in Lua? - lua

I just played around a bit with Lua and tried the Koneki eclipse plugin, which is quite nice. Problem is that when I make changes in a function I'm debugging at the moment the changes do not become effective when saving the changes. So I'm forced to restart the application. Would be so nice if I could make changes in the debugger and they would become effective on the fly as for example with Smalltalk or to some extend as in hot code replacement in Java. Anybody has a clue whether this is possible?

It is possible to some degree with some limitations. I've been developing an IDE/debugger that provides this functionality. It gives you access to a remote console to execute commands in the context/environment of your running application. The IDE also supports live coding, which reloads modified code as you make changes to it; see demos here.
The main limitation is that you can't modify a currently running function (at least without changes to Lua VM). This means that the effect of your changes to the currently running function will only be seen after you exit and re-enter that function. It works well for environments that call the same function repeatedly (for example a game engine calling draw), but may not work in your case.
Another challenge is dealing with upvalues (values that are created outside of your function and are referenced inside it). There are methods to "read" current upvalues and re-create them when the (new) function is created, but it requires some code analysis to find what functions will be recreated to query them for upvalues, to get the current values, and then to create a new environment with those upvalue and assign proper values to them. My current implementation doesn't do this, which means you need to use global variables as a workaround.
There was also relevant discussion just the other day on the Lua mailing list.

Related

Running untrusted code in Lua via loadstring

I'm working with a modding api for a game, for those curious it's factorio but it's not really relevant, and the Lua environment is HEAVILY limited, blocking functions like setfenv, it's a 5.1 environment and I do have access to loadstring, pcall, etc. My question is how would you recommend running 'unsafe' code that is provided by a user and limiting what functions they can access without access to environment modification functions? (Preferably whitelist functions/values instead of blacklist, but I'll take whatever I can get)
In Lua 5.1 you need setfenv to create a secure sandbox (see this answer for a typical procedure). So if you don't have access to setfenv, then I don't think it can't be done.
Then again, if the environment you're working in has disabled setfenv and has put a wrapper around loadstring to avoid malicious bytecode loading (again, see the answer I linked) then you might be able to run the script without setting up a special environment for it. It really depends on the details of your current environment as to whether it's safe or not.
I apologize for a late answer (you've probably moved on by now) but it is possible to do this using the built in load function. You can supply a fourth argument to the function which is a custom environment and it returns a function. You can pass a function, a string, or possibly even a thread (I think) to load and get the result you want. I was also having this problem and I thought I'd answer it for future users.
Here is a link to the documentation on the lua site for load: https://www.lua.org/manual/5.2/manual.html#pdf-load
I have tested this to ensure it works properly in Factorio and it appears to work as intended.

Loading additional javascript code from firefox add-on content script

I'm writing something that I want to release as both a chrome extension and a firefox add-on.
The chrome extension is already available on github. I've factored my code into several modules using a module load format similar to what requirejs uses; I did this to separate the parts that were chrome-specific from the parts I hoped to re-use in the firefox add-on.
Specifically, I split up not only the backend work, but also the content scripts.
In chrome, when my content script needs to load another module, it sends a message to the background page saying "please load this module"; the script on the background page then does:
function onLoadLibrary(request, sender, sendResponse) {
var allFrames = request.allFrames || false;
chrome.tabs.executeScript(
sender.tab.id, {file: request.library.toLowerCase() + '.js',
allFrames: allFrames},
function () {
sendResponse({});
});
return true;
}
That is, I'm able to load additional javascript into the same sandbox as the content script that asked for that code. This is necessary to make module dependencies work.
In firefox, I can't figure out how to do this. I'll attach my initial content scripts through pageMods and by calling tab.attach from the "ready" event of tabs. That seems straightforward, but then if that content script needs to load more code I can't see how to do it.
There doesn't seem to be a way to access the sandbox my content script is running in from the main.js file so that I might inject more code into it. Even if I somehow kept a reference to the relevant tab instance (which only lets me inject into the top frame in any case), it appears that each new call to tab.attach puts injected code into a new sandbox. The object tab that's passed to my ready event handle isn't a real XUL tab that I could pass to require("tabs/util").getBrowserForTab; if it were, then I think I can follow through enough of the sdk code to create my own sandbox, though I'd worry about leaving accidental memory leaks behind.
I considered passing the code back to the content script through a "eval-this-code" message, but I really don't want to use eval in my extension because of security concerns; I also worry that using eval would make it difficult to impossible to get my firefox add-on approved for AMO. (Also, how would that interact when my add-on runs on sites with a Content Security Policy?)
The usage of traits to define the add-on API seems to close off access to objects such that I can't reach inside a Worker to get a reference to the sandbox my content script is executing in. At this point, it appears that I'd need to include nearly a full copy of the sdk in my add-on just to expose one method on WorkerSandbox.
Note: I'm using the Add-On sdk (the project formerly known as JetPack). I'm willing to use Components.utils.import if someone can tell me how to use that from inside an Add-On SDK-managed content script.
Content-scripts do not expose a public API to attach more scripts to a content-script sandbox after it was initialized. You should probably file an enhancement bug and state your use case, if there isn't one filed already (search first), and/or even come up with some patches yourself.
In cases where there is a DOM that your add-on own (widget), then it's just a matter of attaching another script tag.
For things like page-mods where there is no DOM you own, you're left with a couple of options, none of which is really satisfying. As you already found out yourself, the use of traits prohibits you from accessing "private" properties/methods.
Fork page-mod/tab/content-worker to provide the functionality you need. That would require creating your own copies of the modules and expose the necessary APIs to inject scripts into existing workers.
This is has a steep learning curve (but given that you already figured out details such as traits, should be doable for you), but more importantly hard to maintain as you need to make sure you keep up with the upstream. And AMO editors will not like you very much for it :p
On the plus side, you could try to get your stuff committed upstream, fixing this problem for everybody and become a hero to many authors using the Add-on SDK.
The eval method you propose. Not only is this eval a major source for security issues, but it also may be a performance killer, as right now IIRC evaled code will not use the JIT. And, of course, it will make us AMO editors cringe, even if used "correctly".
Do not use lazy loading at all, and specify all content scripts from the very beginning. This is what add-ons usually do (I'm almost inclined to say "always"). However, this conflicts with your current design, and depending on your add-on may pose a serious performance penalty for loading stuff you didn't really need in the end.
You could use the require mechanism to have most scripts as SDK module and not content-scripts. This is not always feasible, of course, e.g. when dealing with code that would normally modify the DOM in your content-script, but might work for some other stuff.
Replace page-mod, etc with your own Greasemonkey-like, enhanced API. This means lots of work, it is error-prone, security-sensitive and has to be maintained. So, it's not really a viable solution, IMO...
Components.utils.import does not help you. It isn't available to content-scripts anyway.

How would someone create a preemptive scheduler for the Lua VM?

I've been looking at lua and lvm.c. I'd very much like to implement an interface to allow me to control the VM interpreter state.
Cooperative multitasking from within lua would not work for me (user contributed code)
The debug hook gets me only about 50% of the way there, instruction execution limits, but it raises an exception which just crashes the running lua code - but I need to be able to tweak it even further.
I want to create a system where 10's of thousands of lua user scripts are running - individual threads would not work, and the execution limits would cause headache for beginning developers, I'm going to control execution speeds too. but ultimately
while true do
end
will execute forever, and I really don't care that it is.
Any ideas, help or other implementations that I could look at?
EDIT: This is not about sandboxing pretend I'm an expert in that field for this conversation
EDIT: I do not want to use an internally ran lua code coroutine based controller.
EDIT: I want to run one thread, and manage a large number of user contributed lua scripts, an external process level control mechansim would not scale at all.
You can search for Lua Sandbox implementations; for example, this wiki page and SO question provide some pointers. Note that most of the effort in sandboxing is focused on not allowing you to execute bad code, but not necessarily on preventing infinite loops. For better control you may need to combine Lua sandboxing with something like LXC or cpulimit. (not relevant based on the comments)
If you are looking for something Lua-based, lightweight, but not necessarily 100% foolproof, then you can try running your client code in a separate coroutine and set a debug hook on that coroutine that will be triggered every N-th line. In that hook you can check if the process you are running exceeded its quotes. You also need to take care of new coroutines started as those need to have their own hooks set (you either need to disable coroutine.create/wrap or to replace them with something that sets the debug hook you need).
The code in this case may look like:
local coro = coroutine.create(client_func)
debug.sethook(coro, debug_hook, "l", 1000) -- trigger hook on every 1000th line
It's not foolproof, because it may block on some IO operation and the debug hook will not help there.
[Edit based on updated question and comments]
Between "no lua code coroutine based controller" and "no external process control mechanism" I don't think you are left with much choice. It may be that your only option is to run one VM per user script and somehow give ticks to those VMs (there was a recent question on SO on this, but I can't find it). Before going this route, I would still try to do this with coroutines (which should scale to tens of thousands easily; Tir claims supporting 1M active users with coroutine-based architecture).
The mechanism would roughly look like this: you install the debug hook as I shown above and from that hook you yield back to your controller, which then decides what other coroutine (user script) to resume. I have this very mechanism working in the Lua debugger I've been developing (although it only does it for one client script). This doesn't protect you from IO calls that can block and for that you may still need to have a watchdog at the VM level to see if it's been blocked for longer than needed.
If you need to serialize and deserialize running code fragments that preserve upvalues and such, then Pluto is probably your only option.
Look at implementing lua_lock and lua_unlock.
http://www.lua.org/source/5.1/llimits.h.html#lua_lock
Take a look at lulu. It is lua VM written on lua. It's for Lua 5.1
For newer version you need to do some work. But it's then you really can make a schelduler.
Take a look at this,
https://github.com/amilamad/preemptive-task-scheduler-for-lua
I maintain this project. It,s a non blocking preemptive scheduler for running lua code. Suitable for long running game scripts.

Clone a lua state

Recently, I have encountered many difficulties when I was developing using C++ and Lua. My situation is: for some reason, there can be thousands of Lua-states in my C++ program. But these states should be same just after initialization. Of course, I can do luaL_loadlibs() and lua_loadfile() for each state, but that is pretty heavy(in fact, it takes a rather long time for me even just initial one state). So, I am wondering the following schema: What about keeping a separate Lua-state(the only state that has to be initialized) which is then cloned for other Lua-states, is that possible?
When I started with Lua, like you I once wrote a program with thousands of states, had the same problem and thoughts, until I realized I was doing it totally wrong :)
Lua has coroutines and threads, you need to use these features to do what you need. They can be a bit tricky at first but you should be able to understand them in a few days, it'll be well worth your time.
take a look to the following lua API call I think it is what you exactly need.
lua_State *lua_newthread (lua_State *L);
This creates a new thread, pushes it on the stack, and returns a pointer to a lua_State that represents this new thread. The new thread returned by this function shares with the original thread its global environment, but has an independent execution stack.
There is no explicit function to close or to destroy a thread. Threads are subject to garbage collection, like any Lua object.
Unfortunately, no.
You could try Pluto to serialize the whole state. It does work pretty well, but in most cases it costs roughly the same time as normal initialization.
I think it will be hard to do exactly what you're requesting here given that just copying the state would have internal references as well as potentially pointers to external data. One would need to reconstruct those internal references in order to not just have multiple states pointing to the clone source.
You could serialize out the state after one starts up and then load that into subsequent states. If initialization is really expensive, this might be worth it.
I think the closest thing to doing what you want that would be relatively easy would be to put the states in different processes by initializing one state and then forking, however your operating system supports it:
http://en.wikipedia.org/wiki/Fork_(operating_system)
If you want something available from within Lua, you could try something like this:
How do you construct a read-write pipe with lua?

How to implement a code coverage tool using Win32 Debugging API

I am trying to understand how to implement a Code Coverage tool using the Win32 Debugging API.
My thinking has been to utilize the Win32 Debugging API to launch a process in debug mode - and track what CPU instructions has been executed. After having tracked all CPU instructions I would then use the map file to map it to what source code lines were executed.
As far as I understand, there would be two ways of knowing what CPU instructions have been executing.
Would be to launch the process in debug mode - set all threads in single step mode and let the debugging app note all instructions that has been executed
Would be make a more intelligent approach where you would know a lot more about x86 instructions and basically replace the next branch instruction with a breakpoint. Then keeping track of the delta instructions between the two breakpoints.
Update - new suggested approaches inspired by Michael's response:
Start with the map file and insert breakpoints for the beginning of each line and let the debug framework be notified every time a breakpoint hits.
Start with the map file - binary instrumentation to insert a "hook" that get called at entry of each source line - avoiding the callback through the debugger framework.
Using a VM Technology - such as VMware to find out what instructions in a particular process was executed - I don't fully understand this approach...
Could someone validate one of the approaches above or maybe suggest an alternative - please note that the use case is line-by-line code coverage and not performance profiling - thus we need to know if each single source line is visited.
My primary goal (although no particular plan is in place...) would be to create a simple code coverage tool for Delphi primarily.
Thanks!
One approach is hooking all api calls and function calls to compare with table made from the source. Thus you discovers what is covered.
There is many api for hooking, one is Trappola API hooking
This could work - each single step event will create an exception and you could record the hit IP address in your map of executed code lines.
Unfortunately, I imagine this would be glacially slow. It'd be incredibly inefficient, as each single line of code results in 1000's of times more work, as an exception is generated, trapped, a message sent to your debugger, and then a round trip back after you record the hit. It might be better to try to set breakpoints instead for each covered line and clear them after they are hit. That'd be faster, but most likely still very slow.
The core problem is you're trying to use the debugger as a code coverage tool which it is not intended for. A quick search shows several code coverage tools for Delphi on the Internet.
I would suggest, in stead of hooking for each line of code, you can go for the each block. What I mean to say hook for block of codes. It will be faster and you can get the count of lines as well from the blocks count.

Resources