Processes exiting normally - erlang

Given two linked processes child and parent, how does process child detect that parent exits (terminates) normally?
I, as an absolute Erlang beginner, thought that a process, when it has nothing else to do, exited using exit(normal). This then signals all linked processes, where
the behaviour of processes that have trap_exit set to false is to ignore the signal, and
the behaviour of processes that have trap_exit set to true is to generate the message {'EXIT', pid, normal} where pid is the process id of the terminating process.
My reason for thinking this is Learn You Some Erlang for Great Good! and the Erlang documentation, which states the following.
A process is said to terminate normally, if the exit reason is the atom normal. A process with no more code to execute terminates normally.
Apparently that is wrong (?), because exit(normal) shows ** exception exit: normal in the command prompt and makes the code below work. Exiting because there is no more code to execute does not generate the exception and does not make my code work.
As an example, consider the following code.
-module(test).
-export([start/0, test/0]).

start() ->
    io:format("Parent (~p): started!\n", [self()]),
    P = spawn_link(?MODULE, test, []),
    io:format(
        "Parent (~p): child ~p spawned. Waiting for 5 seconds\n", [self(), P]),
    timer:sleep(5000),
    io:format("Parent (~p): dies out of boredom\n", [self()]),
    ok.

test() ->
    io:format("Child (~p): I'm... alive!\n", [self()]),
    process_flag(trap_exit, true),
    loop().

loop() ->
    receive
        Q = {'EXIT', _, _} ->
            io:format("Child process died together with parent (~p)\n", [Q]);
        Q ->
            io:format("Something else happened... (~p)\n", [Q])
    after
        2000 -> io:format("Child (~p): still alive...\n", [self()]), loop()
    end.
This produces output as follows.
(erlide#127.0.0.1)> test:start().
Parent (<0.145.0>): started!
Parent (<0.145.0>): child <0.176.0> spawned. Waiting for 5 seconds
Child (<0.176.0>): I'm... alive!
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Parent (<0.145.0>): dies out of boredom
ok
(erlide#127.0.0.1)10> Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
Child (<0.176.0>): still alive...
exit(pid(0,176,0),something).
Child process died together with parent ({'EXIT',<0.194.0>,something})
I had to manually execute the exit(pid(0,176,0),something) command to keep the child from staying alive forever.
Changing the ok. in start to exit(normal) makes the execution go like this:
(erlide#127.0.0.1)3> test:start().
Parent (<0.88.0>): started!
Parent (<0.88.0>): child <0.114.0> spawned. Waiting for 5 seconds
Child (<0.114.0>): I'm... alive!
Child (<0.114.0>): still alive...
Child (<0.114.0>): still alive...
Parent (<0.88.0>): dies out of boredom
Child process died together with parent ({'EXIT',<0.88.0>,normal})
** exception exit: normal
My concrete questions are the following.
How can I make the above code work as expected? That is, how can I make sure the child process dies together with the parent process, without changing the parent process?
Why does exit(normal) generate a ** exception exit: normal in the CLI? It is hard for me to think of an exception as something that is normal. What does the sentence in the Erlang documentation mean?
I think these must be extremely basic questions, but I can't seem to figure this out....
I am using Erlang 5.9.3.1 on Windows (x64).

The Erlang shell evaluates the commands you type in a separate worker process, and every command runs in that same worker. When your start function finishes, the worker is still alive; when you later kill it with exit/2, the shell reports that as a worker exception (because in the normal case the worker never dies).
So:
You should run start in a separate process, using spawn or spawn_link
The CLI logs every worker exit as an exception, normal exits included
P.S. Sorry for my English
P.P.S. spawn(fun() -> test:start() end). works as expected:
4> spawn(fun() -> test:start() end).
Parent (<0.41.0>): started!
<0.41.0>
Parent (<0.41.0>): child <0.42.0> spawned. Waiting for 5 seconds
Child (<0.42.0>): I'm... alive!
Child (<0.42.0>): still alive...
Child (<0.42.0>): still alive...
Parent (<0.41.0>): dies out of boredom
Child process died together with parent ({'EXIT',<0.41.0>,normal})

A comment to your question on @PetrKozorezov's answer. The shell is not behaving specially in any way. The shell worker process is just a normal process, so if any process to which it is linked crashes, then it will also crash. Another worker process will then be started. This is the normal Erlang way.
Your start/0 function just returns and does NOT terminate its process, it just outputs the "dying of boredom" message. That is why the loop keeps on going: it doesn't get an exit signal because no process has died.
When you change the start/0 function to end with exit(normal), you do terminate the shell process, so an exit signal is sent to the loop process, which then gets the {'EXIT',...,...} message and dies.
When @PetrKozorezov spawned your original start/0 function in a separate process, that process died after executing start/0 and sent a normal exit signal to the loop process, which caused it to die.
This is perfectly normal Erlang behaviour and style. You would normally not end a start function with an exit but leave it up to the caller to decide when to die.
One more small point: as the start function does a spawn_link you would normally call it start_link instead. A start function is assumed to just spawn a process. This is, of course, just a convention, but a common one, so you have not made an error.
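To tie this back to the first question: one way to get the expected behaviour without touching start/0 itself is to run it in its own throwaway process, as in @PetrKozorezov's P.P.S. A minimal sketch (the run/0 name is invented here and would need to be added to the export list):
%% Hypothetical helper: run the parent logic in its own process rather than
%% in the shell. When start/0 returns, this spawned process has no more code
%% to execute and terminates normally, so the linked child (which traps
%% exits) receives {'EXIT', ParentPid, normal} and shuts down.
run() ->
    spawn(fun start/0).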

Related

correct usage of erlang spawn_monitor

Still working through Joe's book, and having a hard time fully understanding monitors in general and spawn_monitor in particular. Here's the code I have; the exercise asks to write a function that starts a process whose job is to print a heartbeat every 5 seconds, and then a function to monitor the above process and restart it. I didn't get to the restart part, because my monitor fails to even detect the process keeling over.
% simple "working" loop
loop_5_print() ->
    receive
    after 5000 ->
        io:format("I'm still alive~n"),
        loop_5_print()
    end.

% function to spawn and register a named worker
create_reg_keep_alive(Name) when not is_atom(Name) ->
    {error, badargs};
create_reg_keep_alive(Name) ->
    Pid = spawn(ex, loop_5_print, []),
    register(Name, Pid),
    {Pid, Name}.

% a simple monitor loop
monitor_loop(AName) ->
    Pid = whereis(AName),
    io:format("monitoring PID ~p~n", [Pid]),
    receive
        {'DOWN', _Ref, process, Pid, Why} ->
            io:format("~p died because ~p~n", [AName, Why]),
            % add the restart logic
            monitor_loop(AName)
    end.

% function to bootstrap a monitor
my_monitor(AName) ->
    case whereis(AName) of
        undefined -> {error, no_such_registration};
        _Pid -> spawn_monitor(ex, monitor_loop, [AName])
    end.
And here's me playing with it:
39> c("ex.erl").
{ok,ex}
40> ex:create_reg_keep_alive(myjob).
{<0.147.0>,myjob}
I'm still alive
I'm still alive
41> ex:my_monitor(myjob).
monitoring PID <0.147.0>
{<0.149.0>,#Ref<0.230612052.2032402433.56637>}
I'm still alive
I'm still alive
42> exit(whereis(myjob), stop).
true
43>
It sure stopped the loop_5_print "worker" - but where's the line that the monitor was supposed to print? The only explanation I see is that the message emitted by a process quitting in this manner isn't of the pattern I am matching on inside monitor_loop's receive. But that's the only pattern introduced in the book in this chapter, so I'm not buying this explanation...
spawn_monitor is not what you want here. spawn_monitor spawns a process and immediately starts monitoring it. When the spawned process dies, the process that called spawn_monitor gets a message that the process is dead. You need to call erlang:monitor/2 from the process that you want to receive the DOWN messages in, with the second argument being the Pid to monitor.
Just add:
monitor(process, Pid),
after:
Pid = whereis(AName),
and it works:
1> c(ex).
{ok,ex}
2> ex:create_reg_keep_alive(myjob).
{<0.67.0>,myjob}
I'm still alive
I'm still alive
I'm still alive
3> ex:my_monitor(myjob).
monitoring PID <0.67.0>
{<0.69.0>,#Ref<0.2696002348.2586050567.188678>}
I'm still alive
I'm still alive
I'm still alive
4> exit(whereis(myjob), stop).
myjob died because stop
true
monitoring PID undefined
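For completeness, the patched monitor_loop/1 is just the code from the question with that one line added (restart logic still left as an exercise):
% a simple monitor loop, now actually monitoring the named process
monitor_loop(AName) ->
    Pid = whereis(AName),
    monitor(process, Pid),    % the added line
    io:format("monitoring PID ~p~n", [Pid]),
    receive
        {'DOWN', _Ref, process, Pid, Why} ->
            io:format("~p died because ~p~n", [AName, Why]),
            % add the restart logic
            monitor_loop(AName)
    end.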

Erlang: spawn a process and wait for termination without using `receive`

In Erlang, can I call some function f (BIF or not) whose job is to spawn a process, run the function argf I provided, and not "return" until argf has "returned", and do this without using a receive clause? (The reason for this is that f will be invoked in a gen_server, and I don't want to pollute the gen_server's mailbox.)
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F), but doesn't return until 10 seconds has passed`
%% ...
The only way to communicate between processes is message passing (of course, you could consider polling for a specific key in an ETS table or a file, but I don't like that).
If you use a spawn_monitor function in f/1 to start the F process and then have a receive block only matching the possible system messages from this monitor:
f(F) ->
    {_Pid, MonitorRef} = spawn_monitor(F),
    receive
        {_Tag, MonitorRef, _Type, _Object, _Info} -> ok
    end.
you will not mess up your gen_server's mailbox. The example is the minimal code; you can add a timeout (fixed or as a parameter), execute some code on normal or error completion, and so on.
You will not "pollute" the gen_server's mailbox if you spawn and wait for the message before you return from the call or cast. A more serious problem with this may be that you block the gen_server while you are waiting for the other process to terminate. A way around this is to not wait explicitly, but to return from the call/cast and then, when the completion message arrives, handle it in handle_info/2 and do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process, then you can delay returning the value to the original call until the handle_info that handles the process termination message.
Note that however you do it a gen_server:call has a timeout value, either implicit or explicit, and if no reply is returned it generates an error in the calling process.
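As a rough illustration of that handle_info approach (a sketch only; it assumes the server's State is a map, and the {run, F} request shape is invented): spawn and monitor the worker in handle_call/3, return {noreply, ...} so the gen_server stays responsive, and reply later with gen_server:reply/2 when the 'DOWN' message arrives.
%% Sketch: defer the reply until the monitored worker terminates.
handle_call({run, F}, From, State) ->
    {_Pid, Ref} = spawn_monitor(F),
    %% Remember who asked, keyed by the monitor reference.
    {noreply, State#{Ref => From}}.

handle_info({'DOWN', Ref, process, _Pid, Reason}, State) ->
    case maps:take(Ref, State) of
        {From, NewState} ->
            %% Answer the original gen_server:call caller now.
            gen_server:reply(From, {done, Reason}),
            {noreply, NewState};
        error ->
            {noreply, State}
    end.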
The main way to communicate with a process inside the Erlang VM is message passing, with the erlang:send/2 or erlang:send/3 functions (alias !). But you can "hack" Erlang and use several other ways of communicating between processes.
You can use erlang:link/1 to communicate the state of a process; it is mainly used for the case where your process dies, ends, or something goes wrong (an exception or a throw).
You can use erlang:monitor/2; this is similar to erlang:link/1, except that the message goes directly into the monitoring process's mailbox.
You can also hack Erlang and use some internal mechanism (shared ETS/DETS/Mnesia tables) or external methods (a database or other things like that). This is clearly not recommended and "destroys" the Erlang philosophy... but you can do it.
It seems your problem can be solved with the supervisor behaviour. A supervisor supports several strategies to control supervised processes:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify or create your own supervisor strategy from scratch, or base it on supervisor_bridge.
So, to summarize, you need a process that waits for one or more terminating processes. This behaviour is supported natively by OTP, but you can also create your own model. To do that, you need to share the status of every started process, using a cache or a database, or by passing it in when your process is spawned. Something like this:
Fun = fun
          MyFun(ParentProcess, {result, Data}) when is_pid(ParentProcess) ->
              ParentProcess ! {self(), Data};
          MyFun(ParentProcess, MyData) when is_pid(ParentProcess) ->
              % do something here that produces MyData2
              MyData2 = MyData,
              MyFun(ParentProcess, MyData2)
      end.
Parent = self().                             % capture the caller's pid...
spawn(fun() -> Fun(Parent, InitData) end).   % ...so the worker can reply to it
EDIT: I forgot to add an example without send/receive. Here I use an ETS table to store every result from the lambda function. The ETS table is passed in when we spawn the process. To get the result, we can select the data from this table. Note that the key of the row is the process id of the worker.
spawner(Ets, Fun, Args)
  when is_integer(Ets),
       is_function(Fun) ->
    spawn(fun() -> Fun(Ets, Args) end).

Fun = fun
          F(Ets, {result, Data}) ->
              ets:insert(Ets, {self(), Data});
          F(Ets, Data) ->
              % do something here
              Data2 = Data,
              F(Ets, Data2)
      end.

Link two process in Erlang?

To exchange data, it becomes important to link the processes first. The following code does the job of linking two processes.
start_link(Name) ->
    gen_fsm:start_link(?MODULE, [Name], []).
My question: which are the two processes being linked here?
In your example, the process that called start_link/1 and the process being started as (?MODULE, Name, Args).
It is a mistake to think that two processes need to be linked to exchange data. A link ties the fates of the two processes together: if one dies, the other dies, unless one of them is a system process (a "system process" here means one that is trapping exits). This probably isn't what you want. If you are trying to avoid a deadlock, or to do something other than just time out during synchronous messaging when the process you are sending a message to dies before responding, consider something like this:
ask(Proc, Request, Data, Timeout) ->
    Ref = monitor(process, Proc),
    Proc ! {self(), Ref, {ask, Request, Data}},
    receive
        {Ref, Res} ->
            demonitor(Ref, [flush]),
            Res;
        {'DOWN', Ref, process, Proc, Reason} ->
            some_cleanup_action(),
            {fail, Reason}
    after
        Timeout ->
            {fail, timeout}
    end.
If you are just trying to spawn a worker that needs to give you an answer, you might want to consider using spawn_monitor instead and using its {pid(), reference()} return as the message you're listening for in response.
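A rough sketch of that spawn_monitor approach (the run_task name and the result shapes are invented for illustration):
%% Sketch: run Fun in a monitored worker and wait for either its result or
%% its 'DOWN' message. The {Pid, Ref} pair returned by spawn_monitor/1 is
%% exactly what the receive clauses below match on.
run_task(Fun, Timeout) ->
    Caller = self(),
    {Pid, Ref} = spawn_monitor(fun() -> Caller ! {self(), Fun()} end),
    receive
        {Pid, Result} ->
            demonitor(Ref, [flush]),
            {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} ->
            {fail, Reason}
    after Timeout ->
        demonitor(Ref, [flush]),
        {fail, timeout}
    end.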
As I mentioned above, the process starting the link won't die if it is trapping exits, but you really want to avoid trapping exits in most cases. As a basic rule, use process_flag(trap_exit, true) as little as possible. Getting trap_exit happy everywhere will eventually have structural effects you don't intend, and it's one of the few things in Erlang that is difficult to refactor away from later.
The link is bidirectional, between the process which is calling the function start_link(Name) and the new process created by gen_fsm:start_link(?MODULE, [Name], []).
A called function is executed in the context of the calling process.
A new process is created by a spawn function. You should find it in the gen_fsm:start_link/3 code.
When a link is created, if one process exits for any reason other than normal, the linked process will also die, unless it has set process_flag(trap_exit, true), in which case it will instead receive the message {'EXIT',FromPid,Reason}, where FromPid is the pid of the process that died and Reason is the reason for its termination.

how to create a keep-alive process in Erlang

I'm currently reading Programming Erlang! At the end of Chapter 13, we want to create a keep-alive process;
the example looks like this:
on_exit(Pid, Fun) ->
    spawn(fun() ->
                  Ref = monitor(process, Pid),
                  receive
                      {'DOWN', Ref, process, Pid, Info} ->
                          Fun(Info)
                  end
          end).

keep_alive(Name, Fun) ->
    register(Name, Pid = spawn(Fun)),
    on_exit(Pid, fun(_Why) -> keep_alive(Name, Fun) end).
but the process may exit between register/2 and on_exit/2, so the monitor could fail; I changed keep_alive/2 like this:
keep_alive(Name, Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    register(Name, Pid),
    receive
        {'DOWN', Ref, process, Pid, _Info} ->
            keep_alive(Name, Fun)
    end.
There is also a bug here: the process may exit between spawn_monitor/1 and register/2. How can this be made to work reliably? Thanks.
I'm not sure that you have a problem that needs solving. Monitor/2 will succeed even if your process exits after register/2. Monitor/2 will send a 'DOWN' message whose Info component will be noproc. Per the documentation:
A 'DOWN' message will be sent to the monitoring process if Item dies, if Item does not exist, or if the connection is lost to the node which Item resides on. (see http://www.erlang.org/doc/man/erlang.html#monitor-2).
So, in your original code:
register associates Name with Pid
Pid dies
on_exit is called and monitor/2 is executed
monitor immediately sends a 'DOWN' message, which is received by the function spawned by on_exit
the Fun(Info) of the receive statement is executed, calling keep_alive/2
I think all is good.
So why didn't you want to use the Erlang supervisor behaviour? It provides useful functions for creating and restarting keep-alive processes.
See the example here: http://www.erlang.org/doc/design_principles/sup_princ.html
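A minimal one_for_one supervisor along the lines of that page could look roughly like this (my_sup and my_worker are invented names; my_worker:start_link/0 is assumed to exist):
-module(my_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Restart a dead child, allowing at most 5 restarts in 10 seconds.
    RestartStrategy = {one_for_one, 5, 10},
    Child = {my_worker,                    % child id
             {my_worker, start_link, []},  % {M, F, A} used to start it
             permanent,                    % always restart it
             5000,                         % shutdown timeout in ms
             worker,
             [my_worker]},
    {ok, {RestartStrategy, [Child]}}.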
In your second example, if the process exits before registration, register will fail with badarg. The easiest way to get around that would be to surround register with try ... catch and handle the error in the catch clause.
You can even do nothing in the catch, because even if registration fails, the 'DOWN' message will still be sent.
On the other hand, I wouldn't do that in a production system. If your worker fails that fast, it is very likely that the problem is in its initialisation code, and I would want to know that it failed to register and have it stop the system. Otherwise, it could fail and be respawned in an endless loop.
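Putting that together with the code from the question, the register call wrapped in try ... catch might look like this (a sketch; ignoring badarg is exactly the "do nothing" case described above):
keep_alive(Name, Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    %% The worker may already be dead here, in which case register/2 throws
    %% badarg; the 'DOWN' message below is delivered either way.
    try register(Name, Pid)
    catch
        error:badarg -> ok
    end,
    receive
        {'DOWN', Ref, process, Pid, _Info} ->
            keep_alive(Name, Fun)
    end.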

gen_server named timer_server caused timer module functions to not return

I created a supervisor that spawned a gen_server I called timer_server. One of the tasks of this timer_server is to manage registration and call timer:send_interval to send a message to a pid on a certain interval.
However, in the init of the gen_server, where I call timer:send_interval, I was getting a lockup. The documentation says the timer functions return immediately, so this was very troubling.
When I renamed my gen_server to record_timer_server this problem cleared up. My question is twofold then:
Why could I create a registered process timer_server, if there already was one when timer:start() was called by my application starting up?
Once started, why would this function not cause a badmatch finding the name, if it was calling in to my timer_server using the send_interval function?
I don't think code is necessary but I can update to add some if requested.
This can be recreated simply by doing the following, which hangs on the call to timer:send_interval:
1> register(timer_server, self()).
true
2> timer:send_interval(5000, self(), hello).
While this fails...
1> timer:send_interval(5000, self(), hello).
{ok,{interval,#Ref<0.0.0.32>}}
2> register(timer_server, self()).
** exited: {badarg,[{erlang,register,[timer_server,<0.30.0>]},
So, it seems that the first call to timer tries to start a process called timer_server, and hangs if you've taken this name first.
As to why it hangs, timer.erl does:
ensure_started() ->
    case whereis(timer_server) of
        undefined ->
            C = {timer_server, {?MODULE, start_link, []}, permanent, 1000,
                 worker, [?MODULE]},
            supervisor:start_child(kernel_safe_sup, C), % kernel_safe_sup
            ok;
        _ -> ok
    end.
which returns fine, followed by a gen_server:call to timer_server. Your process then gets stuck waiting for itself to respond.
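The deadlock itself is easy to reproduce in isolation. This is not actual timer.erl code, just a contrived sketch of the same pattern: a process registered under the name that a gen_server:call targets ends up waiting forever for a reply only it could send.
%% Contrived demonstration (invented names): the caller and the "server"
%% are the same process, so the request sits unread in the caller's own
%% mailbox and the reply never arrives.
deadlock_demo() ->
    true = register(not_really_a_server, self()),
    %% Blocks forever with an infinity timeout: we are waiting for
    %% not_really_a_server (ourselves) to answer.
    gen_server:call(not_really_a_server, hello, infinity).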
