I am new to Erlang.
I wonder if it is possible to interrupt a processor in Erlang. Assume we have processor x executing a function f1() that takes a long time to execute. I would like to find an efficient way to interrupt processor x so that it executes function f2() and, once f2() has finished, goes back to executing f1() from where it was interrupted.
One way of doing this (although not exactly what I want) is to let f1() be executed by a processor (name it f1_proc), while the creator of f1_proc waits for messages such as [interrupt, f1_terminated, etc.], where f2() is executed if interrupt is received.
However, this is not exactly what I want. What if f2() depends on f1()? In this case, f1() is paused, f2() is executed, and then f1() should resume from where it stopped. I know we can terminate a process, but can we pause one?
The answer to your question is no, this can't be done. There is no way to pause a process from the "outside" without any hook (e.g. receive clause) inside the process.
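To illustrate what such a hook could look like, here is a minimal sketch in which f1 is split into small steps and polls its mailbox between them. The module name, the go and interrupt messages, and the step counter are all assumptions made for the illustration; nothing here comes from the question's actual code.

```erlang
-module(interruptible).
-export([start/1, interrupt/2, demo/0]).

%% Hypothetical worker: waits for 'go', then counts from 0 to Steps.
start(Steps) ->
    Parent = self(),
    spawn(fun() ->
                  receive go -> ok end,
                  f1(Parent, 0, Steps)
          end).

%% Ask the worker to run F2 (a fun of one argument, the current step)
%% before it carries on with f1.
interrupt(Pid, F2) ->
    Pid ! {interrupt, F2},
    ok.

%% Done: report the final step count to the parent.
f1(Parent, Steps, Steps) ->
    Parent ! {f1_done, self(), Steps};
f1(Parent, N, Steps) ->
    %% The hook: a non-blocking check for an interrupt request.
    receive
        {interrupt, F2} -> F2(N)
    after 0 ->
        ok
    end,
    %% One step of the "long" computation, then continue from here.
    f1(Parent, N + 1, Steps).

%% The interrupt is queued before 'go', so f2 runs at step 0.
demo() ->
    Self = self(),
    Pid = start(1000),
    interrupt(Pid, fun(Step) -> Self ! {f2_ran_at, Step} end),
    Pid ! go,
    At = receive {f2_ran_at, S} -> S end,
    receive {f1_done, Pid, Total} -> {At, Total} end.
```

The worker is only interruptible at the points where it chooses to look at its mailbox, which is exactly the limitation described above.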
I think your question title (processor) is a bit misleading, considering you are trying to work with Erlang processes.
You should try working with Erlang's hibernate command.
Directly from the above doc link:
Puts the calling process into a wait state where its memory allocation
has been reduced as much as possible, which is useful if the process
does not expect to receive any messages in the near future.
Using timers and message passing between processes, you can enforce your workflow, i.e. pausing one process if it takes too much time while the other continues doing its work.
Though your use case is not so clear in the question, you can also have both (in fact, more) processes working in parallel without having to wait for one another, and get notified once a process has finished its job.
One way to do it is to simply start both functions in different processes. When f2() is dependent on a result from f1(), it receives a message with the needed data. When f1() is done calculating that data, it sends it to the f2() process.
If f2() reaches the receive clause too early, it will automatically pause and wait until the message arrives (hence letting f1() continue its work). If f1(), however, is done first, it will carry on with its other tasks until preempted automatically by the Erlang process scheduler.
You can also make f1() pause by letting it wait for a message from f2() as well. In that case, make sure that f1() waits AFTER it has sent its message to avoid deadlocks.
Example:
f1(F2Pid) ->
    Data = ...,
    F2Pid ! {f1data, Data},
    ... continue other tasks ....

f2() ->
    ... do some work ...,
    Data = receive
               {f1data, F1Data} -> F1Data
           end,
    ... do some work with Data ....

main() ->
    F2Pid = spawn_link(?MODULE, f2, []),
    f1(F2Pid).
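For the two-way variant described above, where f1() also waits for f2(), a runnable sketch might look like the following. The module name, the f2result message, the arithmetic used as stand-in work, and the 5 second timeout are assumptions for the illustration.

```erlang
-module(handshake).
-export([main/0, f2/1]).

%% f1 produces Data, sends it to f2, and only then waits for f2's answer.
%% Waiting AFTER sending is what avoids the deadlock.
f1(F2Pid) ->
    Data = lists:sum([1, 2, 3]),          % stand-in for the real work
    F2Pid ! {f1data, Data},
    receive
        {f2result, Result} -> Result
    after 5000 ->
        timeout
    end.

%% f2 blocks in receive until the data from f1 arrives, then replies.
f2(F1Pid) ->
    Data = receive
               {f1data, F1Data} -> F1Data
           end,
    F1Pid ! {f2result, Data * 2}.

main() ->
    F2Pid = spawn_link(?MODULE, f2, [self()]),
    f1(F2Pid).
```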
This message passing is fundamental to the Erlang programming model. You don't need to invent synchronisation or locks. Just receive a message and Erlang will make sure you get that message (and that message only).
I don't know how you are learning Erlang, but I recommend the book Erlang Programming by Cesarini & Thompson (O'Reilly). The book covers, in great detail and with good examples, all you need to know about message passing and concurrency.
Related
E.g. suppose I have a module that implements gen_server behavior, and it has
handle_call({foo, Foo}, _From, State) ->
    {reply, result(Foo), State};
I can reach this handler by doing gen_server:call(Server, {foo, Foo}) from some other process (I guess if a gen_server tries to gen_server:call itself, it will deadlock). But gen_server:call blocks on response (or timeout). What if I don't want to block on the response?
Imaginary use-case: Suppose I have 5 of these gen_servers, and a response from any 2 of them is enough for me. What I want to do is something like this:
OnResponse = fun(Response) ->
    % blah
end,
lists:foreach(
    fun(S) ->
        gen_server:async_call(S, {foo, Foo}, OnResponse)
    end,
    Servers),
Result = wait_for_two_responses(Timeout),
lol_i_dunno()
I know that gen_server has cast, but cast has no way to provide any response, so I don't think that that's what I want in this case. Also, seems like it should not be the gen_server's concern whether caller wants to handle response synchronously (using gen_server:call) or async (does not seem to exist?).
Also, the server is allowed to provide its response asynchronously by having handle_call return noreply and later calling gen_server:reply. So why not also support handling the response asynchronously on the other side? Or does that exist, and I'm just failing to find it?
gen_server:call is basically a sequence of
send a message to the server (with identifier)
wait for the response of that particular message
wrapped in a single function.
For your example you can decompose the behavior into 2 steps: a loop that uses gen_server:cast(Server,{Message,UniqueID,self()}) with all servers, and then a receive loop that waits for a minimum of 2 answers of the form {UniqueID,Answer}. But you must take care to empty your mailbox at some point. A better solution is to delegate this to a separate process, which will simply die once it has received the required number of answers:
[edit] make some correction in the code now it should work :o)
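%% Spawn a collector process; its pid doubles as the tag of the final answer.
%% get_n_answers/5 must be exported for spawn/3 to find it.
get_n_answers(Msg,ServerList,N) when N =< length(ServerList) ->
    spawn(?MODULE,get_n_answers,[Msg,ServerList,N,[],self()]).

%% All requests sent and N answers collected: report back to the caller.
get_n_answers(_Msg,[],0,Rep,Pid) ->
    Pid ! {self(),Rep};   % tag with the collector's own pid, which the caller holds as ID
%% All requests sent: wait for the next answer.
get_n_answers(Msg,[],N,Rep,Pid) ->
    NewRep = receive
                 Answ -> [Answ|Rep]
             end,
    get_n_answers(Msg,[],N-1,NewRep,Pid);
%% Servers left: send the request to the next one.
get_n_answers(Msg,[H|T],N,Rep,Pid) ->
    %gen_server:cast(H,{Msg,Pid}),
    H ! {Msg,self()},
    get_n_answers(Msg,T,N,Rep,Pid).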
and you can use it like this:
ID = get_n_answers(Msg,ServerList,2),
% insert some code here
Answer = receive
             {ID,A} -> A % tagged with ID so as not to catch another message in the mailbox
         end
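The first approach mentioned above (a cast carrying a unique ID and the caller's pid, followed by a receive loop in the caller) could be sketched as follows. The toy server, the module name cast_gather, the foo_result reply shape, and the use of make_ref() as the unique ID are assumptions for the illustration, not part of the answer.

```erlang
-module(cast_gather).
-behaviour(gen_server).
-export([start/0, ask_any/4]).
-export([init/1, handle_call/3, handle_cast/2]).

%% --- a toy server that answers a cast by sending a tagged message ---
start() ->
    gen_server:start(?MODULE, [], []).

init([]) ->
    {ok, nostate}.

handle_call(_Request, _From, State) ->
    {reply, unsupported, State}.

handle_cast({{foo, Foo}, Ref, From}, State) ->
    From ! {Ref, {foo_result, Foo}},
    {noreply, State}.

%% --- client side: cast to every server, then wait for N tagged answers ---
ask_any(Servers, Msg, N, Timeout) ->
    Ref = make_ref(),
    [gen_server:cast(S, {Msg, Ref, self()}) || S <- Servers],
    collect(Ref, N, Timeout, []).

collect(_Ref, 0, _Timeout, Acc) ->
    {ok, lists:reverse(Acc)};
collect(Ref, N, Timeout, Acc) ->
    receive
        {Ref, Answer} -> collect(Ref, N - 1, Timeout, [Answer | Acc])
    after Timeout ->
        %% Note: Timeout applies to each answer, not to the whole call.
        {error, timeout}
    end.
```

Answers that arrive after the first N stay in the caller's mailbox, which is the cleanup problem the delegated process avoids.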
You can easily implement that by sending each call in a separate process and waiting for responses from as many as required (in essence this is what async is about, isn't it? :-)
Have a look at this simple implementation of parallel call which is based on the async_call from rpc library in OTP.
This is how it works in plain English.
You need to make 5 calls so (in the parent process) you spawn 5 child Erlang processes.
Each process sends back to the parent process a tuple containing its PID and the result of the call.
The tuple can only be constructed and sent back once the desired call has completed.
In the parent process you loop through responses in the receive loop.
You can wait for all responses or just 2 or 3 out of the started 5.
The parent process (which spawns the worker processes) will eventually receive all responses (I mean those you want to ignore). You need a way to discard them if you don't want the message queue to grow infinitely. There are two options:
The parent process itself can be a transient process, created only for the call to spawn the other 5 child processes. Once the desired number of responses is collected, it can send the response back to the caller and die. Messages sent to the dead process will be discarded.
The parent process can continue receiving messages after it has received the desired amount of responses and simply discard them.
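The first option (a transient parent that collects the answers and then dies) could be sketched like this. The module name pcall, the function first_n/3, and passing the calls as zero-argument funs are assumptions for the illustration; this is not the implementation the answer links to.

```erlang
-module(pcall).
-export([first_n/3]).

%% Run every fun in Funs in its own process and return the first N results.
first_n(Funs, N, Timeout) when N =< length(Funs) ->
    Caller = self(),
    Ref = make_ref(),
    %% Transient collector: late answers are addressed to it and are
    %% dropped once it has exited.
    spawn(fun() ->
                  Collector = self(),
                  [spawn(fun() -> Collector ! {self(), F()} end) || F <- Funs],
                  Caller ! {Ref, collect(N, [])}
          end),
    receive
        {Ref, Results} -> {ok, Results}
    after Timeout ->
        {error, timeout}
    end.

%% Gather N {WorkerPid, Result} tuples in arrival order.
collect(0, Acc) ->
    lists:reverse(Acc);
collect(N, Acc) ->
    receive
        {_WorkerPid, Result} -> collect(N - 1, [Result | Acc])
    end.
```

For the gen_server case, each fun would be something like fun() -> gen_server:call(S, {foo, Foo}) end. The sketch does not handle workers that crash: if fewer than N workers ever reply, the collector waits forever.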
gen_server does not have a concept of async calls on the client side. It is not trivial to implement consistently, because gen_server:call is a combination of monitoring the server process, sending the request message, and waiting for either the answer, the monitor's DOWN message, or a timeout. If you do something like what you described, you will need to deal with DOWN messages from the server somehow, so a hypothetical async_call should return some key for yield and also an internal monitor reference, in case you are processing DOWN messages from other processes and do not want to mix them up with yield errors.
A not-so-good but possible alternative is to use rpc:async_call(Node, gen_server, call, [....])
But this approach has a limitation: the calling process will be a short-lived rex child, so if your gen_server uses the caller pid for anything other than sending it a reply, the logic will break.
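As a small sketch of the rpc route, the following uses rpc:async_call/4 on the local node and collects the result with rpc:nb_yield/2. The module name and the use of lists:sum/1 as a stand-in for gen_server:call are assumptions for the illustration.

```erlang
-module(rpc_async).
-export([demo/0]).

demo() ->
    %% rpc:async_call/4 takes the target node first; node() is the local one.
    %% lists:sum/1 stands in for the gen_server:call you would really make.
    Key = rpc:async_call(node(), lists, sum, [[1, 2, 3]]),
    %% ... other work could happen here ...
    %% rpc:nb_yield/2 waits at most the given number of milliseconds.
    case rpc:nb_yield(Key, 5000) of
        {value, Result} -> {ok, Result};
        timeout -> {error, timeout}
    end.
```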
gen_server:call to the process itself would surely block until timeout. To understand why, note that the gen_server framework combines its generic code with your specific callback code, and that gen_server:call boils down to the "Pid ! Msg" form.
Picture how that code runs: the process sits in a loop receiving messages. While control is inside a handler function, that receive loop is suspended. If the handler calls gen_server:call on its own process, the synchronous call waits for a response, but the response can only be produced once the handler returns and the process resumes receiving messages. The code is therefore deadlocked.
Suppose I have a process A that makes a call to a function in process B (procB:func().), and func() generates an error during execution. Process B would terminate, but what about process A? Consider the following in process A:
Case 1:
{ok, Reply} = procB:func().
Case 2:
procB:func().
Will process A terminate in both cases? Or just in case 1 because of mismatch? Please note that the two processes are not linked.
Thanks in advance!
There is no such thing as calling a function in another process. You can send a message to a process, and it may then choose to call a function based on the message content.
gen_servers work this way: you send a message to the gen_server, and it matches on the message and chooses whether to invoke its call/cast/info/terminate functions.
Assuming you are really talking about sending a message from A to B and B deciding to exit, it all comes down to whether process A is linked to or monitoring process B.
If you monitor B, you are sent a message saying that B went down, along with the reason.
If you are linked to B, I believe the rule is that you are killed if B died with a reason other than 'normal'.
A could also have set the trap_exit flag, which means that even if A is linked and B dies, A is not killed; instead the exit signal is delivered to A as a message, and you get to act on it (i.e. you may restart B, if you choose).
Learn You Some Erlang has a good tutorial on how this works.
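A minimal sketch of the monitor and trap_exit cases, assuming a process B that simply exits with the reason boom (the module and function names are made up for the illustration):

```erlang
-module(link_monitor).
-export([monitored/0, linked_trapping/0]).

%% A monitors B: when B dies, A receives a 'DOWN' message and lives on.
monitored() ->
    {Pid, Ref} = spawn_monitor(fun() -> exit(boom) end),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {b_died, Reason}
    after 1000 ->
        timeout
    end.

%% A is linked to B but traps exits: B's exit signal arrives as a message
%% instead of killing A.
linked_trapping() ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {b_died, Reason}
             after 1000 ->
                 timeout
             end,
    process_flag(trap_exit, Old),   % restore the previous setting
    Result.
```

Without a link or a monitor, as in the question, A is not told about B's death at all.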
You are not able to call a function in another process. That is the beauty of Erlang: all communication between processes is via message passing. People sometimes confuse modules with processes. I even wrote an article about it.
For example process A:
spawns process B
sends a message, for example the tuple {fun_to_call, Args, self()} (you need the self() so that B knows where to respond)
waits for reply using receive
Process B:
immediately after start waits for message
when receives message, does some computation and sends response back
This looks like a lot of boilerplate, so this exact pattern is abstracted in gen_server
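Spelled out by hand, the pattern above might look like the sketch below. The module name ask, the stop message, and the 5 second timeout are assumptions for the illustration; a real system would normally use gen_server instead.

```erlang
-module(ask).
-export([start/0, call/3, loop/0]).

%% Process A's side, step 1: spawn process B.
start() ->
    spawn(?MODULE, loop, []).

%% Process B: wait for a request, compute, send the response back, repeat.
loop() ->
    receive
        {FunToCall, Args, From} when is_function(FunToCall) ->
            From ! {self(), apply(FunToCall, Args)},
            loop();
        stop ->
            ok
    end.

%% Process A's side, steps 2 and 3: send the request, then wait for the reply.
call(Pid, FunToCall, Args) ->
    Pid ! {FunToCall, Args, self()},
    receive
        {Pid, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.
```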
When I need to create a process, I use the customary spawn BIF. But there is one more BIF, spawn_link, that is often used to do the same thing.
So basically when should one use spawn and spawn_link?
Doing spawn and then link manually is equivalent in operation to spawn_link, but it is not equivalent in time; in particular it is not atomic (as in, two independent operations, not a single, indivisible one). If you spawn a process and it dies in its initialization (whatever your start or init functions do) then it might die before the call to link completes, and the linked process will never get notified that the process died since it died before it was linked. Oops!
From Joe Armstrong's Programming Erlang Ch.13 "Why Spawning and Linking Must be an Atomic Operation":
Once upon a time Erlang had two primitives, spawn and link, and spawn_link(Mod, Func, Args) was defined like this:
spawn_link(Mod, Func, Args) ->
    Pid = spawn(Mod, Func, Args),
    link(Pid),
    Pid.
Then an obscure bug occurred. The spawned process died before the link statement was called, so the process died but no error signal was generated. This bug took a long time to find. To fix this, spawn_link was added as an atomic operation. Even simple-looking programs can be tricky when concurrency is involved.
spawn just spawns a new process.
spawn_link spawns a new process and automatically creates a link between the calling process and the new process. So, apart from the real spawn_link being atomic, it behaves like:
my_spawn_link(Module, Function, Args) ->
    Pid = spawn(Module, Function, Args),
    link(Pid),
    Pid.
I have a project which has lots of modules, each one has different running threads. I wrote a little script which goes through each one and safely reloads the code (for hot swaps):
reload_all() ->
    ?MODULE:reload_all(?MODULE_LIST).

reload_all([]) -> ok;
reload_all([T|C]) ->
    io:fwrite("Purging ~w\n",[T]),
    try_purge(T),
    {module,T} = code:load_file(T),
    ?MODULE:reload_all(C).

try_purge(T) -> try_purge(T,1).

try_purge(T,Wait) ->
    case code:soft_purge(T) of
        true -> ok;
        false ->
            io:fwrite("* Waiting ~w seconds for ~w module\n",[Wait,T]),
            timer:sleep(Wait*1000),
            try_purge(T,Wait+1)
    end.
It uses the soft_purge() function, which only purges the code if there are no threads running the "old" code that would be killed by the normal purge command. It waits in increasing intervals and keeps trying. I've designed the project so that the wait should never be more than a minute in total, but realistically it should always be more or less instant.
The problem I'm running into is that sometimes a module will have a bug causing it to block indefinitely for one reason or another, and my reload_all() script never completes. This is the desired behavior, it lets me know that something is wrong. The problem is that to track down the bug involves lots and lots of testing and analyzing of the code, which sometimes doesn't even work because the bug only shows up in the production environment and not in the testing one.
My question is: Is there a way to identify which threads are running the "old" code in a module, and see which function they are currently stuck in?
You can check whether the old or the new version of a module is in use with erlang:check_old_code/1 and erlang:check_process_code/2. See the Erlang manual for details.
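As a rough sketch of how those calls could be combined to answer the question, the function below lists every local process that still runs old code of a module, together with the function it is currently executing. The module and function names are assumptions for the illustration.

```erlang
-module(old_code).
-export([stuck_in/1]).

%% For every process that still executes old code of Module, return
%% {Pid, Info}, where Info is {current_function, {M, F, Arity}} or
%% undefined if the process died in the meantime.
stuck_in(Module) ->
    [{Pid, erlang:process_info(Pid, current_function)}
     || Pid <- erlang:processes(),
        erlang:check_process_code(Pid, Module) =:= true].
```

Calling old_code:stuck_in(T) from try_purge/2 before sleeping would print the offenders on each retry. Asking process_info for current_stacktrace instead of current_function should give more context about where a process is stuck.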
This could be a very basic question, but is Erlang capable of calling a method on another process and waiting for it to respond with some object type without sleeping threads?
Well, if you're waiting for an answer, the calling process will have to sleep eventually… But that's no big deal.
While processes are stuck on the receive loop, other processes can work. In fact, it's not uncommon to have thousands of processes just waiting for messages. And since Erlang processes are not true OS threads, they're very lightweight so the performance loss is minimal.
In fact, the way sleep is implemented looks like:
sleep(Milliseconds) ->
    receive
        % Intentionally left empty
    after Milliseconds -> ok
    end.
Yes, it is possible to peek into the mailbox if that is what you mean. Say we have sent a message to another process and now we want to see if the other process has sent something back to us. But we don't want to block on the receive:
receive
    Pattern -> Body;
    Pattern2 -> Body2
after 0 ->
    AfterBody
end
will try to match against the Pattern and Pattern2 in the mailbox. If none matches, it will immediately time out and go to AfterBody. This allows you to implement a non-blocking peek into the mailbox.
If the process is a gen_server the same thing can be had by playing with the internal state and the Timeout setting when a callback returns to the gen_server's control. You can set a Timeout of 0 to achieve this.
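The after 0 form can be wrapped into a small helper. The sketch below assumes that replies arrive as {Tag, Payload} tuples, which is a convention chosen for the illustration rather than anything the answer prescribes.

```erlang
-module(peek).
-export([poll/1]).

%% Non-blocking check for a message tagged with Tag.
%% Returns {ok, Payload} if one is already in the mailbox, otherwise empty.
poll(Tag) ->
    receive
        {Tag, Payload} -> {ok, Payload}
    after 0 ->
        empty
    end.
```

Messages that do not match the tag are left in the mailbox untouched.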
What I am getting from the question is that we are talking about Synchronous Message Passing. YES! Erlang can do this perfectly well; it's the most basic way of handling concurrency in Erlang. Consider the following:
rpc(Request, To) ->
    MySelf = self(),
    To ! {MySelf,Request},
    receive
        {To,Reply} -> Reply
    after timer:seconds(5) -> erlang:exit({error,timedout})
    end.
The code above shows a process sending a message to another and immediately going into waiting (for a reply) without having to sleep. If it does not get a reply within 5 seconds, it will exit.