Processes with unidirectional links - erlang

I need to create processes with unidirectional links. Say I have three linked processes
grandparent <- parent <- child
The processes do their job and die, but are blocked by the processes generated by them.
When the child dies, the parent and grandparent also die, but not the other way around.
The toy example would be processed that just create children and wait for them to die:
(init)
(wait) <- (init)
(wait) <- (wait) <- (init)
(wait) <- (wait) <- (die)
(wait) <- (die)
(die)
For the example not to be infinitely recursive, let's say that each process has only one job: on (init) it tosses a coin, on heads it spawns a child process, or on tails, it dies.
What is the nice way of doing this in Erlang/OTP?

Each parent should monitor its immediate children (for example, using spawn_monitor.
When a monitored process dies, the VM sends a {'DOWN', MonitorRef, process, Pid, Info} message to the monitoring process, then the monitoring process can just exit or throw an error.
E.g:
{Pid, MonitorRef} = spawn_monitor(?MODULE, child, []),
receive {'DOWN', MonitorRef, process, Pid, Info} -> exit(Info) end.

Related

idiomatic process synchronisation in Erlang

I am looking at how to code "map reduce" type scenarios directly in erlang. As a toy example, imagine I want to decide which of several files is the biggest. Those files might be anywhere on the internet, so getting each one might take some time; so I'd like to gather them in parallel. Once I have them all, I can compare their sizes.
My assumed approach is as follows:
A 'main' process to co-ordinate the work and determine which is biggest;
A 'worker' process for each file, which fetches the file and returns the size to the main process.
Here's a clunky but functioning example (using local files only, but it shows the intent):
-module(cmp).
-export([cmp/2]).
cmp(Fname1, Fname2) ->
Pid1 = fsize(Fname1),
Pid2 = fsize(Fname2),
{Size1, Size2} = collect(Pid1, Pid2),
if
Size1 > Size2 ->
io:format("The first file is bigger~n");
Size2 > Size1 ->
io:format("The second file is bigger~n");
true ->
io:format("The files are the same size~n")
end.
fsize(Fname) ->
Pid = spawn(?MODULE, fsize, [self(), Fname]),
Pid.
fsize(Sender, Fname) ->
Size = filelib:file_size(Fname),
Sender ! {self(), Fname, Size}.
collect(Pid1, Pid2) ->
receive
{Pida, Fnamea, Sizea} ->
io:format("Pid: ~p, Fname: ~p, Size: ~p~n", [Pida, Fnamea, Sizea])
end,
receive
{Pidb, Fnameb, Sizeb} ->
io:format("Pid: ~p, Fname: ~p, Size: ~p~n", [Pidb, Fnameb, Sizeb])
end,
if
Pida =:= Pid1 -> {Sizea, Sizeb};
Pida =:= Pid2 -> {Sizeb, Sizea}
end.
Specific Questions
Is the approach idiomatic? i.e. hiving off each 'long running' task into a separate process, then collecting results back in a 'master'?
Is there a library to handle the synchronisation mechanics? Specifically, the collect function in the example above?
Thanks.
--
Note: I know the collect function in particular is clunky; it could be generalised by e.g. storing the pids in a list, and looping until all had completed.
In my opinion it's best to learn from an example, so I had a look at how they do that in otp/rpc and based on that I implemented a bit shorter/simpler version of the parallel eval call.
call(M, F, ArgL, Timeout) ->
ReplyTo = self(),
Keys = [spawn(fun() -> ReplyTo ! {self(), promise_reply, M:F(A)} end) || A <- ArgL],
Yield = fun(Key) ->
receive
{Key, promise_reply, {error, _R} = E} -> E;
{Key, promise_reply, {'EXIT', {error, _R} = E}} -> E;
{Key, promise_reply, {'EXIT', R}} -> {error, R};
{Key, promise_reply, R} -> R
after Timeout -> {error, timeout}
end
end,
[Yield(Key) || Key <- Keys].
I am not a MapReduce expert but I did had some experience using this 3rd party mapreduce module. So I will try to answer your question based on my current knowledge.
First, your input should be arranged as pairs of keys and values in order to properly use the mapreduce model. In general, your master process should first start workers processes (or nodes). Each worker receives a map function and a pair of key and value, lets name it {K1,V1}. It then executes the map function with the key and value and emits a new pair of key and value {K2,V2}. The master process collects the results and waits for all workers to finish their jobs. After all workers are done, the master starts the reduce part on the pairs {K2,List[V2]} that were emited by the workers. This part can be executed in parallel or not, it used to combine all the results into a single output. Note that the List[V2] is because there can be more then one value that was emited by the workers for a single K2 key.
From the 3rd party module I mentioned above:
%% Input = [{K1, V1}]
%% Map(K1, V1, Emit) -> Emit a stream of {K2,V2} tuples
%% Reduce(K2, List[V2], Emit) -> Emit a stream of {K2,V2} tuples
%% Returns a Map[K2,List[V2]]
If we look into Erlangs' lists functions, the map part is actually equal for doing lists:map/2 and the reduce part is in some way similar to lists:foldl/3 or lists:foldr/3 and the combination between them are: lists:mapfoldl/3, lists:mapfoldr/3.
If you are using this pattern of mapreduce using sets of keys and values, there is no need for special synchronization if that is what you mean. You just need to wait for all workers to finish their job.
I suggest you to go over the 3rd party module I mentioned above. Take also a look at this example. As you can see, the only things you need to define are the Map and Reduce functions.

Purpose of `receive after 0` (also known as Selective Receives)

From the Learn You Some Erlang for Great Good!
Another special case is when the timeout is at 0:
flush() ->
receive
_ -> flush()
after 0 ->
ok
end
.
When that happens, the Erlang VM will try and find a message that fits
one of the available patterns. In the case above, anything matches. As
long as there are messages, the flush/0 function will recursively call
itself until the mailbox is empty. Once this is done, the after 0 ->
ok part of the code is executed and the function returns.
I don't understand purpose of after 0. After reading above text I thought it was like after infinity (waiting forever) but I changed a little the flush function:
flush2() ->
receive
_ -> timer:sleep(1000), io:format("aa~n"), flush()
after 0 ->
okss
end
.
flush3() ->
receive
_ -> io:format("aa~n"), flush()
after 0 ->
okss
end
.
In the first function it waits 1 second and in the second function it doesn't wait.
In both cases it doesn't display a text (aa~n).
So it doesn't work as after infinity.
If block between the receive and the after are not executed then above 2 codes can be simplified to:
flush4() ->
okss
.
What I am missing?
ps. I am on the Erlang R16B03-1, and author of the book was, as fair I remember, was on the Erlang R13.
Every process has a 'mailbox' -- message queue. Messages can be fetched by receive. if there is no messages in the queue. after part specifies how much time 'receive will wait for them. So after 0 -- means process checking (by receive ) if any messages in the queue and if queue is empty immediately continue to next instructions.
It can be used for instance if we want periodically check if any messages here and to do something (hopefully helpful) if there is no messages.
Consider after 0 to be finally.
Consider the use of after 0 to process receives with a priority: http://learnyousomeerlang.com/more-on-multiprocessing#selective-receives
May this different look on things enlighten you.
You can play with the following shell command to understand the effect of the after command:
4> L = fun(G) ->
4> receive
4> stop -> ok;
4> M -> io:format("received ~p~n",[M]), G(G)
4> after 0 ->
4> io:format("no message~n")
4> end
4> end.
#Fun<erl_eval.6.80484245>
5> F = fun() -> timer:sleep(10000),
5> io:format("end of wait for messages, go to receive block~n"),
5> L(L)end.
#Fun<erl_eval.20.80484245>
6> spawn(F).
<0.46.0>
end of wait for messages, go to receive block
no message
7> P1 = spawn(F).
<0.52.0>
8> P1 ! hello.
hello
end of wait for messages, go to receive block
received hello
no message
9> P2 ! hello, P2 ! stop.
* 1: variable 'P2' is unbound
8> P2 = spawn(F).
<0.56.0>
9> P2 ! hello, P2 ! stop.
stop
end of wait for messages, go to receive block
received hello
10>
If you do not intend to use a nested receive, rather than using "after" part, I think a better approach is to use "Unexpected ->" variable to handle all unmatched messages.

erlang's EXIT and DOWN signal

The following code is also from rabbitmq's supervisor2.erl. The code's function is to kill supervisor's children, by for every child:
monitor child send an trappable exit signal
start timer
if timer's arrive, send an untrappable exit signal (kill).
My question about EXIT and DOWN signal.
If the child doesn't trap the exit signal, the supervisor will receive 2 signal, first is exit signal, and then is DOWN signal, is it right? Is the signal sequence is strictly guaranteed?
If the child traps the exit signal, the supervisor will receive only 1 signal, just down signal, is it right?
terminate_simple_children(Child, Dynamics, SupName) ->
Pids = dict:fold(fun (Pid, _Args, Pids) ->
erlang:monitor(process, Pid),
unlink(Pid),
exit(Pid, child_exit_reason(Child)),
[Pid | Pids]
end, [], Dynamics),
TimeoutMsg = {timeout, make_ref()},
TRef = timeout_start(Child, TimeoutMsg),
{Replies, Timedout} =
lists:foldl(
fun (_Pid, {Replies, Timedout}) ->
{Reply, Timedout1} =
receive
TimeoutMsg ->
Remaining = Pids -- [P || {P, _} <- Replies],
[exit(P, kill) || P <- Remaining],
receive {'DOWN', _MRef, process, Pid, Reason} ->
{{error, Reason}, true}
end;
{'DOWN', _MRef, process, Pid, Reason} ->
{child_res(Child, Reason, Timedout), Timedout};
{'EXIT', Pid, Reason} -> %%<==== strict signal, first EXIT, then DOWN.
receive {'DOWN', _MRef, process, Pid, _} ->
{{error, Reason}, Timedout}
end
end,
{[{Pid, Reply} | Replies], Timedout1}
end, {[], false}, Pids),
timeout_stop(Child, TRef, TimeoutMsg, Timedout),
ReportError = shutdown_error_reporter(SupName),
[case Reply of
{_Pid, ok} -> ok;
{Pid, {error, R}} -> ReportError(R, Child#child{pid = Pid})
end || Reply <- Replies],
ok.
There are two things you are confusing here:
First, the child is the one trapping exits, but you are looking at the supervisor code. What the child does with exit signals does not directly affect the supervisor.
the kill exit signal cannot be trapped. It always kills the child.
The supervisor2 has a monitor on the child. This means it is guaranteed to get a 'DOWN' message and this code is concerned about getting that kind of message. If supervisor2 is also trapping exits, it will get the 'EXIT' message in addition.

spawn_link chaining other processes?

If process A spawn_link()'s process B and then process B spawn()'s process C, is the only way for process A to catch an error in process C if we replace "spawn()" for "spawn_link()" in process B?
I believe if this isn't replaced, process A will only know if process B dies?
When process B spawns process C, it basically forgets about it; in this case, if C is dying, process B will have no idea about it. If process B spawns process C using spawn_link, process C will be linked to B as child (C) - parent (B): if C dies, B will be notified and depending on the implementation, it can die (A will be notified) or survive further.

What is the purpose of the one-to-one supervisor in many of the cowboy examples

In the examples provided on the Cowboy github, and in some of the other examples I have found on-line, there is a one-to-one supervisor that does not seem to do anything. I even believe that I saw an example that had the following comment, "like a real supervisor doesn't do anything".
What is the purpose of the supervisor module that seems to be part of so many cowboy examples?
From the echo_get example:
%% Feel free to use, reuse and abuse the code in this file.
%% #private-module(echo_get_sup).
-behaviour(supervisor).
%% API.
-export([start_link/0]).
%% supervisor.
-export([init/1]).
%% API.
-spec start_link() -> {ok, pid()}.
start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, []).
%% supervisor.
init([]) ->
Procs = [],
{ok, {{one_for_one, 10, 10}, Procs}}.
From erlang application behavior documentation:
start is called when starting the application and should create the supervision tree by starting the top supervisor. It is expected to return the pid of the top supervisor and an optional term State, which defaults to []. This term is passed as-is to stop.
He has this dummy supervisor, so he can call it at the end of the start function here. I think it has no practical purpose except to satisfy this condition.
You don't need to specify all the children that a supervisor manages when it starts. You can add/start them dynamically using supervisor:start_child/2 and manage them using supevisor:restart_child/2, supervisor:terminate_child/2 and supervisor:delete_child/2. This means that even if a supervisor has no children from the beginning it does not mean it is just a dummy.
The comment that a real supervisor "doesn't do anything" most likely relates to the fact that a supervisor that can only supervisor processes and not do any work like worker processes. Or at least that's it should mean!

Resources