Note: There are some questions below that illustrate my thinking, but the only answer I'm looking for is the answer to the actual question, in the title. Not asking for "a book" here, or itemized responses to all of those.
I'm trying to start a coroutine from the C API, let it yield, and continue it later (possibly after executing resumes from other coroutines). This is a fairly simple use case, but the documentation for lua_resume() is extremely confusing:
int lua_resume (lua_State *L, lua_State *from, int nargs);
Starts and resumes a coroutine in the given thread L.
To start a coroutine, you push onto the thread stack the main function plus
any arguments; then you call lua_resume, with nargs being the number of
arguments. This call returns when the coroutine suspends or finishes its
execution. When it returns, the stack contains all values passed to
lua_yield, or all values returned by the body function. lua_resume returns
LUA_YIELD if the coroutine yields, LUA_OK if the coroutine finishes its
execution without errors, or an error code in case of errors (see
lua_pcall).
In case of errors, the stack is not unwound, so you can use the debug API
over it. The error message is on the top of the stack.
To resume a coroutine, you remove any results from the last lua_yield, put
on its stack only the values to be passed as results from yield, and then
call lua_resume.
The parameter from represents the coroutine that is resuming L. If there is
no such coroutine, this parameter can be NULL.
The meaning of "represents the coroutine that is resuming L" is extremely unclear here. When exactly is "from" NULL, which of L and from is the "thread stack", and what are the exact requirements for resuming a coroutine? Can the state L be modified between the initial lua_resume() and the second one, which actually resumes it? If so, how does the state know where and what function to resume? If not (i.e. one thread per lua_State), what is the proper way to create a new thread so that it shares an execution context (globals, environment, etc.) with the parent thread, and what is the proper way to call lua_resume() and unwind the call for each start/resume in this case? Could the 'from' argument be related to any of these things?
And finally, in either case - how can I get a full stack trace for debugging in an error handler (for example invoked on error from lua_pcall) which respects / is aware of calls across resumes? Does lua_getinfo() report correctly through resumes? Is that what the 'from' argument is for?
I'd really like a complete, working example of a coroutine start/resume from the C API that illustrates the use of that second argument. There's this example on gist, but it's years old and the API has changed since then - lua_resume() takes 3 arguments now, not 2...
Today I read an article about Go goroutines. It says that if there are too many recursive calls, the goroutine's stack space is extended. My understanding was that, while a program runs, each function call creates a new stack, and the system only pushes a pointer-like object (some machine code) that points to the newly created stack onto the top of the caller's stack. When the CPU loads this object, it saves the current context and jumps to the created stack. After the called function returns, the CPU writes the return value, which is in a register, back to the object. If my understanding is right, a recursive function call costs only a little space on the calling stack. After reading that article, I suspect my understanding is wrong. Maybe each function call pushes the whole code of the called function onto the caller's stack rather than a pointer-like object. If so, a function call would cost a lot of space on the calling stack. I searched for this on Google but found nothing. Can anyone help me?
Update: I found the answer https://www.geeksforgeeks.org/memory-layout-of-c-program/.
How does a function(A) defined inside another function(B), where function(B) is a registered process, access the mailbox of function(B)?
Can I define multiple functions in function(B), the function in which the registered process is defined, each with receive clauses that access messages sent to function(B) by other processes?
In the second paragraph you have answered the first one. The way you get messages (if A and B are in the same process) is with the receive clause. As long as they are in the same process, they access the same mailbox.
In function(B) you can certainly have any number of function calls with any number of receive clauses. Now, if you want them to have the same mailbox, they have to be in the same process, and therefore they will be executed in sequence. Also note that the `receive` clause suspends execution until something is received (or the timeout is reached, if one is defined).
So in this hypothetical scenario, these functions will be executed one after another, and each one will block the whole process until something is received (or the timeout is reached, if one is defined). Then execution continues.
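The point about one process, one mailbox, and sequential receives can be sketched outside Erlang. The following is only a Python analogy, not Erlang semantics: `queue.Queue` stands in for the mailbox, and `process_b`, `helper_a` and `helper_c` are hypothetical names. Unlike Erlang's selective `receive`, a queue is strictly first-in, first-out.

```python
import queue


def process_b(mailbox: queue.Queue) -> list:
    """Stand-in for function(B): one process body that owns one mailbox."""
    handled = []

    def helper_a():
        # Blocks until a message arrives, like a receive without a timeout.
        handled.append(("helper_a", mailbox.get()))

    def helper_c():
        # Blocks for at most 0.1 s, like a receive with an `after` clause.
        try:
            handled.append(("helper_c", mailbox.get(timeout=0.1)))
        except queue.Empty:
            handled.append(("helper_c", "timeout"))

    # The helpers run one after another and all read the same mailbox.
    helper_a()
    helper_c()
    helper_c()
    return handled


mailbox = queue.Queue()
mailbox.put("first")
mailbox.put("second")
result = process_b(mailbox)
print(result)
```

Both helpers see the same mailbox because they run inside the same body; the third receive finds the mailbox empty and times out.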
From the wikibook on F# there is a small section where it says:
What does let! do?
let! runs an async<'a> object on its own thread, then it immediately
releases the current thread back to the threadpool. When let! returns,
execution of the workflow will continue on the new thread, which may
or may not be the same thread that the workflow started out on.
I have not found anywhere else in books or on the web where this fact (highlighted in bold) is stated.
Is this true for all let!/do! regardless of what the async object contains (e.g. Thread.Sleep()) and how it is started (e.g. Async.Start)?
Looking in the F# source code on github, I wasn't able to find the place where a call to bind executes on a new (TP) thread. Where in the code is the magic happening?
Which part of that statement do you find surprising? That parts of a single async can execute on different threadpool threads, or that a threadpool thread is necessarily being released and obtained on each bind?
If it's the latter, then I agree - it sounds wrong. Looking at the code, there are only a few places where a new work item is being queued on the threadpool (namely, the few Async module functions that use queueAsync internally), and Async.SwitchToNewThread spawns a non-threadpool thread and runs the continuation there. A bind alone doesn't seem to be enough to switch threads.
The spirit of the statement however seems to be about the former - no guarantees are made that parts of an async block will run on the same thread. The exact thread that you run on should be treated as an implementation detail, and when you yield control and await some result, you can be pretty sure that you'll land on a different thread at least some of the time.
No. An async operation might execute synchronously on the current thread, or it might wind up completing on a different thread. It depends entirely on how the async API in question is implemented.
See Do the new C# 5.0 'async' and 'await' keywords use multiple cores? for a decent explanation. The implementation details of F# and C# async are different, but the overall principles are the same.
The builder that implements the F# async computation expression is here.
I'm having some trouble understanding how to use coroutines properly with luabind. There's a templated function:
template<class Ret>
Ret resume_function(object const& obj, ...)
Where (Ret) is supposed to contain the values passed to yield by Lua.
My current points of confusion are:
What happens if the function returns rather than calling yield? Does resume_function return the function's return value?
How are you supposed to use this function if you don't know ahead of time which (or how many) parameters will be passed to yield? For example, if there are multiple possible yielding functions the function may call.
What is the type of Ret if multiple values are passed to yield?
Am I just entirely mistaken as to how all this works? I envision something like this. On the Lua side:
local img = loadImage("foo.png")
loadImage would be a C++ function which requests the image to be loaded in a different thread and then calls lua_yield, and some time later luabind::resume_function gets called with img as a parameter.
Should I pass "foo.png" to yield as a parameter? To a different function before I call yield, and then never pass any values to yield? What's the right way to structure this? I'm obviously misunderstanding something here.
Where (Ret) is supposed to contain the values passed to yield by Lua.
Luabind only supports single return values, so it will return only the first value passed to coroutine.yield.
What happens if the function returns rather than calling yield? Does resume_function return the function's return value?
Yes, you get its return value.
How are you supposed to use this function if you don't know ahead of time which (or how many) parameters will be passed to yield? For example, if there are multiple possible yielding functions the function may call.
That's up to you; they're your functions. You have to develop conventions about what the yielding function(s) receive as parameters, and what the function resuming the coroutine provides.
What is the type of Ret if multiple values are passed to yield?
Whatever you want it to be. It's the template parameter. The number of parameters to a function has no bearing on the return values that the function provides.
Remember: Lua functions take any number of parameters and can return anything. All Luabind can do is pass along the parameters you give it and convert the return value from Lua functions into what you expect that return value to be. Luabind will do type-checking on the return value, of course. But it is your responsibility to make sure that the functions yielding/returning will return something that is convertible to the type the user provides for Ret.
loadImage would be a C++ function which requests the image to be loaded in a different thread and then calls lua_yield, and some time later luabind::resume_function gets called with img as a parameter.
If you're using Luabind, never call lua_yield directly. The proper way to yield in Luabind is to add an attribute to a function you register that will yield whenever you return from the function. The syntax is as follows:
module(L)
[
def("do_thing_that_takes_time", &do_thing_that_takes_time, yield)
];
That is, a C++ function that yields must always yield. This is a limitation of Luabind; with regular Lua, you can choose whether to yield or not as you see fit.
Also, don't forget that Lua coroutines are not the same thing as actual threads. They are not preemptive; they will only execute when you explicitly tell them to with coroutine.resume or an equivalent resume call.
Also, you should never run the same Lua instance from multiple C/C++ threads; Lua is not thread-safe within the same instance (which more or less means the same lua_State object).
What you seem to want to do is have Lua call some function in C++ that itself spawns a thread to do some process, then have the Lua code wait until that thread is complete and then receives its answer.
To do that, you need to give to the Lua script an object that represents the C++ thread. So your loadImage function should not be using coroutine logic; it should return an object that represents the C++ thread. The Lua script can ask the object if it has completed, and if it has, it can query data from it.
The place where coroutines can come into play here is if you don't want the Lua script to wait until this is finished. That is, you're calling the Lua script every so often, but if the C++ thread isn't done, then it should just return. In which case, you can do something like this:
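function loadImageAsCoroutine(imageFilename)
    local cppThread = cpp.loadImage(imageFilename)

    -- Coroutine body: yield until the C++ thread has finished,
    -- then return the image. The loop lets the caller resume it repeatedly.
    local function threadFunc(cppThread)
        while not cppThread:isFinished() do
            coroutine.yield()
        end
        return cppThread:GetImage()
    end

    local thread = coroutine.create(threadFunc)
    -- assert passes its arguments through: ok is true here, and data is
    -- the image if the coroutine already finished, or nil if it yielded.
    local ok, data = assert(coroutine.resume(thread, cppThread))
    if coroutine.status(thread) == "dead" then
        return data
    else
        return thread
    end
end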
This function returns a coroutine or the image data itself. The caller of this function should check the type; if the type is "thread", then the C++ thread hasn't finished yet. Otherwise, it is the image data.
The caller of this function can pump the coroutine however much they want with some equivalent of coroutine.resume (whether it's luabind::resume_function or whatever). Each time, check the return value. It will be nil if the C++ thread hasn't finished, and not nil otherwise.
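The pumping pattern described above can be sketched outside Lua as well. This is a Python analogy only, using a generator in place of a Lua coroutine and a `threading.Thread` in place of the C++ loader; `load_image_as_coroutine`, `pump`, `fake_load` and the `b"pixels"` payload are hypothetical names, not part of Luabind.

```python
import threading


def load_image_as_coroutine(worker: threading.Thread, result: dict):
    """Generator analogue of threadFunc: yield until the worker is done."""
    while worker.is_alive():
        yield  # hand control back to the caller, like coroutine.yield()
    return result["image"]


def pump(gen):
    """Resume the generator once and report (finished, value)."""
    try:
        next(gen)
        return False, None
    except StopIteration as stop:
        return True, stop.value


result = {}
release = threading.Event()


def fake_load():
    # Pretend to load an image; blocks until the test releases it.
    release.wait()
    result["image"] = b"pixels"


worker = threading.Thread(target=fake_load)
worker.start()
gen = load_image_as_coroutine(worker, result)

first = pump(gen)    # worker is still blocked, so the generator yields
release.set()
worker.join()
second = pump(gen)   # worker has finished, so the generator returns the data
print(first, second)
```

The caller decides how often to pump; each resume either reports "not finished" or delivers the data exactly once.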
One of the things that attracted me to Erlang in the first place is the Actor model; the idea that different processes run concurrently and interact via asynchronous messaging.
I'm just starting to get my teeth into OTP and in particular looking at gen_server. All the examples I've seen - and granted they are tutorial type examples - use handle_call() rather than handle_cast() to implement module behaviour.
I find that a little confusing. As far as I can tell, handle_call is a synchronous operation: the caller is blocked until the callee completes and returns. Which seems to run counter to the async message passing philosophy.
I'm about to start a new OTP application. This seems like a fundamental architectural decision so I want to be sure I understand before embarking.
My questions are:
In real practice do people tend to use handle_call rather than handle_cast?
If so, what's the scalability impact when multiple clients can call the same process/module?
Depends on your situation.
If you want to get a result, handle_call is really common. If you're not interested in the result of the call, use handle_cast. When handle_call is used, the caller will block, yes. This is okay most of the time. Let's take a look at an example.
If you have a web server that returns the contents of files to clients, you'll be able to handle multiple clients. Each client has to wait for the contents of the files to be read, so using handle_call in such a scenario would be perfectly fine (stupid example aside).
When you really need the behavior of sending a request, doing some other processing and then getting the reply later, typically two calls are used (for example, one cast and then one call to get the result) or normal message passing. But this is a fairly rare case.
Using handle_call will block the process for the duration of the call. This will lead to clients queuing up to get their replies and thus the whole thing will run in sequence.
If you want parallel code, you have to write parallel code. The only way to do that is to run multiple processes.
So, to summarize:
Using handle_call will block the caller and occupy the process called for the duration of the call.
If you want parallel activities to go on, you have to parallelize. The only way to do that is by starting more processes, and suddenly call vs cast is not such a big issue any more (in fact, it's more comfortable with call).
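As a rough illustration of the summary above, here is a minimal Python sketch of a single server thread with one mailbox. It is an analogy under stated assumptions, not OTP: the `Server` class and its `call`, `cast` and `stop` methods are hypothetical stand-ins for gen_server:call, gen_server:cast and stopping the process, and the "work" is just doubling a number.

```python
import queue
import threading


class Server:
    """One thread, one mailbox: requests are handled strictly in order."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.log = []
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        while True:
            kind, payload, reply_to = self.mailbox.get()
            if kind == "stop":
                break
            self.log.append(payload)        # the handle_call/handle_cast work
            if kind == "call":
                reply_to.put(payload * 2)   # the reply unblocks the caller

    def call(self, payload, timeout=5):
        reply_to = queue.Queue(maxsize=1)
        self.mailbox.put(("call", payload, reply_to))
        return reply_to.get(timeout=timeout)  # the caller blocks here

    def cast(self, payload):
        self.mailbox.put(("cast", payload, None))  # returns immediately

    def stop(self):
        self.mailbox.put(("stop", None, None))
        self.thread.join()


server = Server()
server.cast(1)           # fire and forget
answer = server.call(2)  # waits until the server has processed 1 and then 2
server.stop()
print(answer, server.log)
```

Either way the server works through its mailbox one message at a time; the only difference is whether the sender waits for a reply.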
Adam's answer is great, but I have one point to add:
Using handle_call will block the process for the duration of the call.
This is always true for the client that made the call. It took me a while to wrap my head around this, but it doesn't necessarily mean the gen_server also has to block while answering the handle_call.
In my case, I encountered this when I created a database handling gen_server and deliberately wrote a query that executed SELECT pg_sleep(10), which is PostgreSQL-speak for "sleep for 10 seconds", and was my way of testing for very expensive queries. My challenge: I don't want the database gen_server to sit there waiting for the database to finish!
My solution was to use gen_server:reply/2:
This function can be used by a gen_server to explicitly send a reply to a client that called call/2,3 or multi_call/2,3,4, when the reply cannot be defined in the return value of Module:handle_call/3.
In code:
-module(database_server).
-behaviour(gen_server).
-define(DB_TIMEOUT, 30000).
<snip>
get_very_expensive_document(DocumentId) ->
    gen_server:call(?MODULE, {get_very_expensive_document, DocumentId}, ?DB_TIMEOUT).
<snip>
handle_call({get_very_expensive_document, DocumentId}, From, State) ->
    %% Spawn a new process to perform the query. Give it From,
    %% which identifies the caller (a {Pid, Tag} tuple), so the
    %% spawned process can answer with gen_server:reply/2.
    proc_lib:spawn_link(?MODULE, query_get_very_expensive_document, [From, DocumentId]),
    %% This gen_server process couldn't care less about the query
    %% any more! It's up to the spawned process now.
    {noreply, State};
<snip>
query_get_very_expensive_document(From, DocumentId) ->
    %% Reference: http://www.erlang.org/doc/man/proc_lib.html#init_ack-1
    proc_lib:init_ack(ok),
    Result = query(pgsql_pool, "SELECT pg_sleep(10);", []),
    gen_server:reply(From, {return_query, ok, Result}).
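The effect of replying from a spawned process can be sketched in Python. This is an analogy under stated assumptions, not Erlang: `server_loop`, `request`, `slow` and `fast` are hypothetical names, threads stand in for Erlang processes, and a one-slot queue stands in for the caller waiting on its reply.

```python
import queue
import threading


def _run(work, reply_to):
    # Worker: do the expensive part, then reply directly to the caller.
    reply_to.put(work())


def server_loop(mailbox):
    while True:
        msg = mailbox.get()
        if msg is None:
            break
        work, reply_to = msg
        # Hand the request off and go straight back to the mailbox,
        # which is the role {noreply, State} plays in the Erlang code.
        threading.Thread(target=_run, args=(work, reply_to), daemon=True).start()


def request(mailbox, work):
    """Send a request; the caller later blocks on the returned reply queue."""
    reply_to = queue.Queue(maxsize=1)
    mailbox.put((work, reply_to))
    return reply_to


mailbox = queue.Queue()
server = threading.Thread(target=server_loop, args=(mailbox,), daemon=True)
server.start()

gate = threading.Event()


def slow():
    gate.wait()  # stands in for a very expensive query
    return "slow done"


def fast():
    return "fast done"


slow_reply = request(mailbox, slow)
fast_reply = request(mailbox, fast)
fast_result = fast_reply.get(timeout=5)  # answered while slow is still running
slow_pending = slow_reply.empty()
gate.set()
slow_result = slow_reply.get(timeout=5)
mailbox.put(None)
server.join()
print(fast_result, slow_pending, slow_result)
```

Each caller still waits for its own answer, but the slow request no longer delays the fast one, because the server never performs the work itself.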
IMO, in a concurrent world handle_call is generally a bad idea. Say we have process A (a gen_server) receiving some event (the user pressed a button) and then casting a message to process B (a gen_server) requesting heavy processing of this button press. Process B can spawn a sub-process C, which in turn casts a message back to A when ready (or to B, which then casts a message to A). During the processing time both A and B are ready to accept new requests. When A receives the cast message from C (or B), it, for example, displays the result to the user. Of course, it is possible that the second button press will be processed before the first, so A should probably accumulate results in the proper order. Blocking A and B through handle_call will make this system single-threaded (though it will solve the ordering problem).
In fact, spawning C is similar to handle_call; the difference is that C is highly specialized, processes just one message, and exits after that. B is supposed to have other functionality (e.g. limiting the number of workers, controlling timeouts); otherwise C could be spawned from A.
Edit: C is also asynchronous, so spawning C is not similar to handle_call (B is not blocked).
There are two ways to go with this. One is to switch to an event management approach. The other, which I am using, is to use cast as shown:
submit(ResourceId,Query) ->
    %%
    %% non blocking query submission
    %%
    Ref = make_ref(),
    From = {self(),Ref},
    gen_server:cast(ResourceId,{submit,From,Query}),
    {ok,Ref}.
And the cast/submit code is...
handle_cast({submit,{Pid,Ref},Query},State) ->
    Result = process_query(Query,State),
    gen_server:cast(Pid,{query_result,Ref,Result}),
    %% handle_cast must return a gen_server tuple, not the result of cast/2.
    {noreply,State};
The reference is used to track the query asynchronously.
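How the reference ties a result back to its query can be sketched in Python. This is an analogy only, with hypothetical names throughout: `submit` and `server_loop` mirror the Erlang functions loosely, a counter stands in for make_ref(), and upper-casing the query text stands in for process_query.

```python
import itertools
import queue
import threading

_refs = itertools.count(1)  # stand-in for make_ref()


def submit(server_mailbox, client_mailbox, query):
    """Non-blocking submission: returns a reference, not a result."""
    ref = next(_refs)
    server_mailbox.put((client_mailbox, ref, query))
    return ref


def server_loop(server_mailbox):
    while True:
        msg = server_mailbox.get()
        if msg is None:
            break
        client_mailbox, ref, query = msg
        # Send the result back tagged with the caller's reference.
        client_mailbox.put(("query_result", ref, query.upper()))


server_mailbox = queue.Queue()
client_mailbox = queue.Queue()
server = threading.Thread(target=server_loop, args=(server_mailbox,))
server.start()

ref_a = submit(server_mailbox, client_mailbox, "select a")
ref_b = submit(server_mailbox, client_mailbox, "select b")

# The client is free to do other work here, then collects results by ref.
results = {}
for _ in range(2):
    tag, ref, value = client_mailbox.get(timeout=5)
    results[ref] = value

server_mailbox.put(None)
server.join()
print(results[ref_a], results[ref_b])
```

Because every result carries its reference, the client can match answers to queries even if they arrive in a different order than they were submitted.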