Erlang pids from console

I saw this answer:
https://stackoverflow.com/a/4837832/1236509
with the following supervisor:
-module(root_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).

start_link() ->
    {ok, Pid} = supervisor:start_link({local, ?MODULE},
                                      ?MODULE, []),
    erlang:unlink(Pid),
    {ok, Pid}.

init(_Args) ->
    RestartStrategy = {simple_one_for_one, 10, 60},
    ChildSpec = {ch1, {ch1, start_link, []},
                 permanent, brutal_kill, worker, [ch1]},
    Children = [ChildSpec],
    {ok, {RestartStrategy, Children}}.
In the console one calls:
{ok, ChildPid1} = root_sup:start_link().
When the child pid changes, how would ChildPid1 get the new pid so that it can always be used to reach the correct process? I need a way to link to the part of the supervisor that creates the child.

I would not try to access the child by Pid but instead register/2 the child process under a name, so it's accessible regardless of the actual Pid.
Using the code from the answer you reference, a simple way of doing this is to add register(child, self()) to the init function of the child. This gives, for ch1.erl:
init(_Args) ->
    io:format("ch1 has started (~w)~n", [self()]),
    % register a name for this process
    register(child, self()),
    {ok, ch1State}.
This registers the pid of the child (self()) under the name child.
We can see it works:
1> root_sup:start_link().
{ok,<0.34.0>}
2> supervisor:start_child(root_sup, []).
ch1 has started (<0.36.0>)
{ok,<0.36.0>}
3> lists:filter(fun(X) -> X == child end, registered()).
[child]
We indeed have a process registered under the name child.
4> gen_server:cast(child, calc).
result 2+2=4
and it is the correct process, running the code from ch1.erl.
Let us crash this process by invoking the bad code:
5> gen_server:cast(child, calcbad).
result 1/0
ok
ch1 has started (<0.41.0>)
6>
=ERROR REPORT==== 28-Oct-2012::01:31:30 ===
** Generic server <0.36.0> terminating
** Last message in was {'$gen_cast',calcbad}
** When Server state == ch1State
** Reason for termination ==
** {'function not exported',
       [{ch1,terminate,
            [{badarith,
                 [{ch1,handle_cast,2,[{file,"ch1.erl"},{line,27}]},
                  {gen_server,handle_msg,5,
                      [{file,"gen_server.erl"},{line,607}]},
                  {proc_lib,init_p_do_apply,3,
                      [{file,"proc_lib.erl"},{line,227}]}]},
             ch1State],
            []},
        {gen_server,terminate,6,[{file,"gen_server.erl"},{line,722}]},
        {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
So the child, the process <0.36.0> crashed, and a new child <0.41.0> was started to assume the duties of the deceased <0.36.0>. Since this new process is registered under the same name, calc will work again:
6> gen_server:cast(child, calc).
result 2+2=4
ok
Note that this does not guarantee that gen_server:cast/2 always results in the execution of the corresponding code, because the child might have just been killed and the new process not yet started (more precisely, not yet registered).
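As an aside, a common alternative (my suggestion, not part of the referenced answer) is to let gen_server register the name atomically at start time, which removes the window between the process starting and the name being registered (though not the window while a dead child is being restarted). A minimal sketch for ch1:start_link/0, assuming only one such child exists at a time:

%% gen_server:start_link/4 registers the name {local, child}
%% atomically with process creation, so the name is never
%% missing while the process is alive.
start_link() ->
    gen_server:start_link({local, child}, ?MODULE, [], []).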
You might want to refer to the excellent Programming Erlang: Software for a Concurrent World by Joe Armstrong for more details about process registration, supervisors, OTP and more. Many details can also be found in the online documentation of OTP.

Related

Running statements in shell gives different output

Simple code:
-module(on_exit).
-export([on_exit/2, test/0]).

on_exit(Pid, Fun) ->
    spawn(fun() ->
                  Ref = erlang:monitor(process, Pid),
                  receive
                      {'DOWN', Ref, process, Pid, Why} ->
                          Fun(Why)
                  end
          end).

test() ->
    Fun1 = fun() -> receive Msg -> list_to_atom(Msg) end end,
    Pid1 = spawn(Fun1),
    Fun2 = fun(Why) -> io:format("~w died with error: ~w~n", [Pid1, Why]) end,
    _Pid2 = spawn(on_exit, on_exit, [Pid1, Fun2]),
    Pid1 ! hello.
In the shell:
1> c(on_exit).
{ok,on_exit}
2> on_exit:test().
<0.39.0> died with error: noproc
hello
3>
=ERROR REPORT==== 9-Apr-2017::05:16:54 ===
Error in process <0.39.0> with exit value: {badarg,[{erlang,list_to_atom,[hello],[]},{on_exit,'-test/0-fun-0-',0,[{file,"on_exit.erl"},{line,14}]}]}
Expected Output:
5> Pid1 ! hello.
<0.35.0> died with error: {badarg,[{erlang,list_to_atom,[hello],[]}]}
hello
6>
=ERROR REPORT==== 9-Apr-2017::05:15:47 ===
Error in process <0.35.0> with exit value: {badarg,[{erlang,list_to_atom,[hello],[]}]}
In fact, the expected output is what I see if I take each line in test() and paste it into the shell. Why do I get the noproc (no process) error when I run the same lines inside a function?
From the docs:
12.8 Monitors
An alternative to links are monitors. A process Pid1 can create a
monitor for Pid2 by calling the BIF erlang:monitor(process, Pid2). The
function returns a reference Ref.
If Pid2 terminates with exit reason Reason, a 'DOWN' message is sent
to Pid1:
{'DOWN', Ref, process, Pid2, Reason}
If Pid2 does not exist, the 'DOWN' message is sent immediately with
Reason set to noproc.
Your code contains a race condition: spawn is asynchronous, and the newly spawned process is not guaranteed to have done anything by the time spawn returns. You might therefore send the message and crash Pid1 before on_exit:on_exit/2 starts monitoring it, in which case the erlang:monitor/2 call immediately sends a noproc message to the caller:
1> Pid = spawn(fun() -> ok end).
<0.59.0>
2> erlang:monitor(process, Pid).
#Ref<0.0.1.106>
3> flush().
Shell got {'DOWN',#Ref<0.0.1.106>,process,<0.59.0>,noproc}
ok
The code works fine in the shell probably because the shell executes code more slowly (it is interpreted) than compiled code, but this behavior is not guaranteed. It is a classic race condition.
Erlang has a solution for this: erlang:spawn_monitor/1,3. These functions are guaranteed to attach the monitor atomically as the process is spawned. You'll have to rearrange your code a bit to use them instead of spawn + erlang:monitor/2.
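For illustration, a minimal sketch of such a rearrangement (the test2/0 function is hypothetical, not part of the original module), with the monitor held by the test process itself:

%% spawn_monitor/1 creates the process and the monitor in one
%% atomic step, so the 'DOWN' message always carries the real
%% exit reason instead of noproc.
test2() ->
    Fun1 = fun() -> receive Msg -> list_to_atom(Msg) end end,
    {Pid1, Ref} = spawn_monitor(Fun1),
    Pid1 ! hello,   % crashes Pid1 with badarg (hello is not a list)
    receive
        {'DOWN', Ref, process, Pid1, Why} ->
            io:format("~w died with error: ~w~n", [Pid1, Why])
    end.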

erlang otp child workers

I'm trying to get an OTP supervisor to start child workers which will (eventually) connect to remote servers. I used Rebar to create a template test application, and I'm trying to get the supervisor to fire off the function hi in module foo. It compiles OK and runs:
Eshell V5.8.5 (abort with ^G)
1> test_app:start(1,1).
{ok,<0.34.0>}
but when I try to start the worker it goes pear shaped with this error:
2> test_sup:start_foo().
{error,{badarg,{foo,{foo,start_link,[]},
permanent,5000,worker,
[foo]}}}
The problem seems similar to, but not the same as, this question: Erlang - Starting a child from the supervisor module
Any ideas?
test_app.erl
-module(test_app).
-behaviour(application).
-export([start/2, stop/1]).

start(_StartType, _StartArgs) ->
    test_sup:start_link().

stop(_State) ->
    ok.
test_sup.erl:
-module(test_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1, start_foo/0]).

-define(CHILD(I, Type), {I, {I, start_link, []}, permanent, 5000, Type, [I]}).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, {{one_for_one, 5, 10}, []}}.

start_foo() ->
    supervisor:check_childspecs(?CHILD(foo, worker)),
    supervisor:start_child(?MODULE, ?CHILD(foo, permanent)).
foo.erl:
-module(foo).
-export([hi/0]).

hi() ->
    io:format("worker ~n").
You check the childspec using the macro call ?CHILD(foo, worker), while you try to start the child using the macro call ?CHILD(foo, permanent). The second argument of the CHILD macro is the process type, which should be either worker or supervisor, so the first call is correct. The value permanent is a restart type, which the macro already sets, so the second call is wrong and you get a badarg error.
Note: it is quite common for library functions to generate badarg errors as well, not just built-in functions, and it is not always obvious why something is a badarg.
I think that Robert's answer is incomplete: after replacing permanent with worker you still get an error from supervisor:check_childspecs(?CHILD(foo, worker)), and at first I didn't know why.
[edit]
The problem of badarg comes from ... badarg :o)
check_childspecs expects a list of child specs, so the correct syntax is supervisor:check_childspecs([?CHILD(foo, worker)]), and then it works fine. The code below has been updated accordingly.
[end of edit]
But you will also get an error because the supervisor will try to call foo:start_link/0, which does not exist in the foo module. The following code prints an error but seems to work properly:
-module(foo).
-export([hi/0, start_link/0, loop/0]).

start_link() ->
    {ok, spawn_link(?MODULE, loop, [])}.

hi() ->
    io:format("worker ~n").

loop() ->
    receive
        _ -> ok
    end.
-module(test_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1, start_foo/0]).

-define(CHILD(I, Type), {I, {I, start_link, []}, permanent, 5000, Type, [I]}).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, {{one_for_one, 5, 10}, []}}.

start_foo() ->
    io:format("~p~n", [supervisor:check_childspecs([?CHILD(foo, worker)])]),
    supervisor:start_child(?MODULE, ?CHILD(foo, worker)).
[edit]
Answering David's comment:
In my code, loop/0 does not loop at all; in the receive block the process waits for any message, and as soon as it receives one the process dies, returning the value ok. So as long as the worker process does not receive any message it keeps living, which is nice when you are experimenting with supervisors :o).
By contrast, the hi/0 function simply prints 'worker' on the console and finishes. As the restart strategy of the supervisor is one_for_one, the max restart intensity is 5 and the child process is permanent, the supervisor would try to start the hi process 5 times, printing 'worker' five times on the console, and then give up and terminate itself with the error message ** exception error: shutdown.
Generally you should choose permanent for never-ending processes (the main server of an application, for example). For processes that normally die as soon as they have done their job, use temporary. I have never used transient, but I read that it should be used for processes that must complete a task before dying.
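To make the three restart types concrete, here is an illustrative list of child specs in the same tuple syntax as above (the module names main_server, one_shot and batch_job are hypothetical):

%% Only the restart field (third element) differs between these specs.
Children = [
    {main_server, {main_server, start_link, []},
     permanent, 5000, worker, [main_server]},   % always restarted
    {one_shot, {one_shot, start_link, []},
     temporary, 5000, worker, [one_shot]},      % never restarted
    {batch_job, {batch_job, start_link, []},
     transient, 5000, worker, [batch_job]}      % restarted only after an abnormal exit
].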

How to determine if a worker in a supervision tree is starting for the very first time or has been restarted

I have a simple supervisor configuration:
-module(my_supervisor).
-behaviour(supervisor).
-export([start_link/0, init/1]).

init(_Args) ->
    {ok, {{one_for_one, 5, 10},
          [
           {my_worker, {my_worker, start_link, []},
            permanent, 5000, worker, [my_worker]}
          ]}}.
And an even simpler worker:
-module(my_worker).
-export([start_link/0]).

start_link() ->
    %% ??? is this the first time the supervisor is starting me,
    %% or have I crashed and been restarted ???
So, is it even possible to determine whether this is the first time start_link is called by the supervisor, or whether the worker process crashed sometime in the past and is now being restarted?
For determining whether this is the first time start_link is called by the supervisor: you can use a ChildId parameter and pass it in from outside, as follows (for details, see the Erlang supervisor documentation [1]):
start_child(ChildId, Mod, Args) ->
    {ok, _} = supervisor:start_child(?SERVER,
                                     {ChildId, {Mod, start_link, Args},
                                      transient, ?MAX_WAIT, worker, [Mod]}),
    ok.
For determining whether the worker has been restarted: read the documentation for monitor and link. When a crash happens, the monitoring process receives a message (a sketch of this approach follows at the end of this answer). If you read the supervisor's source code, you will find that the supervisor itself uses link and monitor to detect crashes.
A third option is to trap exits and check the reason:
init([]) ->
    process_flag(trap_exit, true),
    ...

terminate(_Reason, _State) ->
    %% maybe crash here by checking the reason above
    ok.

handle_info({'EXIT', Self, Reason}, #state{self = Self} = State) ->
    error_logger:info_report([crash_now]),
    {stop, Reason, State};
[1]: http://www.erlang.org/doc/man/supervisor.html
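Here is the promised sketch of the monitor-based approach: a watcher that counts how many times a worker registered under Name has gone down (and hence been restarted by its supervisor). The watch/1 and wait_for/1 helpers are my own illustration, not from the answer above:

watch(Name) ->
    spawn(fun() -> watch_loop(Name, 0) end).

watch_loop(Name, Restarts) ->
    Pid = wait_for(Name),
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            error_logger:info_report([{worker_down, Reason},
                                      {restarts_so_far, Restarts + 1}]),
            watch_loop(Name, Restarts + 1)
    end.

%% poll until the supervisor has (re)started and registered the worker
wait_for(Name) ->
    case whereis(Name) of
        undefined -> timer:sleep(50), wait_for(Name);
        Pid -> Pid
    end.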

spawn_link not working?

I'm using spawn_link but I don't understand its behavior. Consider the following code:
-module(test).
-export([try_spawn_link/0]).

try_spawn_link() ->
    spawn(fun() ->
                  io:format("parent: ~p~n", [Parent = self()]),
                  Client = spawn_link(fun() ->
                                              io:format("child: ~p~n", [self()]),
                                              spawn_link_loop(Parent)
                                      end),
                  spawn_link_loop(Client)
          end).

spawn_link_loop(Peer) ->
    receive
        quit ->
            exit(normal);
        Any ->
            io:format("~p receives ~p~n", [self(), Any])
    end,
    spawn_link_loop(Peer).
From the Erlang documentation, a link is created between the calling process and the new process, atomically. However, I tested as follows and didn't notice the effect of the link.
1> test:try_spawn_link().
parent: <0.34.0>
<0.34.0>
child: <0.35.0>
2> is_process_alive(pid(0,34,0)).
true
3> is_process_alive(pid(0,35,0)).
true
4> pid(0,35,0) ! quit.
quit
5> is_process_alive(pid(0,35,0)).
false
6> is_process_alive(pid(0,34,0)).
true
Then in a fresh shell:
1> test:try_spawn_link().
parent: <0.34.0>
<0.34.0>
child: <0.35.0>
2> is_process_alive(pid(0,34,0)).
true
3> is_process_alive(pid(0,35,0)).
true
4> pid(0,34,0) ! quit.
quit
5> is_process_alive(pid(0,35,0)).
true
6> is_process_alive(pid(0,34,0)).
false
In my understanding, if one peer of the link exits, the other peer exits (or is notified to exit). But the results seem different from my understanding.
EDIT: thanks to the answers of legoscia and Pascal.
It is because you have chosen to use exit(normal). In this case the other process will not stop. If you use, for example, exit(killed), then you will get the behavior you are expecting.
You can use monitor to get informed about normal termination.
As described in the "Error handling" section of the Processes chapter of the Erlang reference manual, a linked process exiting causes its linked processes to exit only if the exit reason is not normal. That's why OTP extensively uses the shutdown exit reason.
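To make this concrete, a minimal sketch (a hypothetical module, not from the question) demonstrating both rules:

-module(link_demo).
-export([run/0]).

run() ->
    %% a linked process exiting with reason normal does not kill us...
    spawn_link(fun() -> exit(normal) end),
    %% ...but a monitor does report the normal termination:
    {_Pid, Ref} = spawn_monitor(fun() -> exit(normal) end),
    receive
        {'DOWN', Ref, process, _, Reason} ->
            io:format("monitor saw exit reason: ~p~n", [Reason]) % normal
    end,
    io:format("still alive~n"),
    %% a non-normal reason from a linked process kills us (we do not
    %% trap exits), so the last line is never reached:
    spawn_link(fun() -> exit(shutdown) end),
    timer:sleep(100),
    io:format("never reached~n").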

Added supervisor(s) for a gen_server, shutdown immediately?

EDIT: Below.
Why is my supervised gen_server shutting down so quickly?
I'll give the modules these organizational names to make the chain of command I want in my application clearer: first I'm starting with the "assembly_line_worker", then later I'll add the "marketing_specialist" to my supervision tree...
ceo_supervisor.erl
-module(ceo_supervisor).
-behaviour(supervisor).
-export([start_link/1]).
-export([init/1]).
start_link(State) ->
supervisor:start_link({local,?MODULE}, ?MODULE, [State]).
init([Args]) ->
RestartStrategy = {one_for_one, 10, 60},
ChildSpec= {assembly_line_worker_supervisor,
{assembly_line_worker_supervisor, start_link, [Args]},
permanent, infinity, supervisor, [assembly_line_worker_supervisor]},
{ok, {RestartStrategy, [ChildSpec]}}.
assembly_line_worker_supervisor.erl
-module(assembly_line_worker_supervisor).
-behaviour(supervisor).
-export([start_link/1]).
-export([init/1]). %% Internal
start_link(State) ->
supervisor:start_link({local, ?MODULE}, ?MODULE, [State]).
init([Args]) ->
RestartStrategy = {one_for_one, 10, 60},
ChildSpec = {assembly_line_worker, {assembly_line_worker, start_link, [Args]}, permanent,
infinity, worker, [assembly_line_worker]},
{ok, {RestartStrategy, [ChildSpec]}}.
assembly_line_worker.erl
-module(assembly_line_worker).
...
init([State]) ->
process_flag(trap_exit, true),
{ok, State}.
start_link(State) ->
gen_server:start_link({global, ?MODULE}, ?MODULE, [State], []).
handle_cast(...,State} ->
io:format("We're getting this message.~n",[]),
{noreply, State};
...
What's happening is that the assembly line worker does a few bits of work, like receiving a couple of messages sent just after the ceo_supervisor:start_link(#innovative_ideas{}) call, and then it shuts down. Any idea why? I know that the gen_server is receiving a few messages because it io:formats them to the console.
Thanks!
EDIT: I'm hosting this on Windows via erlsrv.exe and I found that when I start up my program via a function like so:
start() ->
    ceo_supervisor:start_link(#innovative_ideas{}),
    assembly_line_worker:ask_for_more_pay(), %% prints "I want more $$$" as expected
    ok.
...this function exiting immediately causes my supervisors / gen_servers to shut down. I would expect this, because all of this is linked via supervision to the original calling process, so when it exits, so should the children.
So I guess a better question would be, how can I allow my supervisors to keep running after going through all of the start up configuration? Is there an option other than wrapping all of this in an application? (Which doesn't sound too bad...)
Thanks for the probing questions! I learned more about supervisors that way.
batman
To get more information about what is happening, start SASL before you start your supervisor: application:start(sasl).
Another way to debug this would be to start the worker from your Erlang shell and send the sequence of messages that crashed the server. By the way: are you sure you need two levels of supervisors?
Some immediate comments:
In ceo_supervisor:init/1 your supervisor child spec should declare transient instead of permanent.
Run erl -boot start_sasl so you have the error log when something goes bad and you can get the crash report of a crasher.
If you run this in the shell and you make any mistake, your tree will be forcibly killed. This is because you linked it to the shell and the shell crashes upon errors, so you are dragging down your tree. Try something like:
Pid = spawn(fun() -> my_app:start() end).
so you have it split off. You can kill the app by sending an exit message to Pid.
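Since the question mentions wrapping everything in an application, here is a minimal sketch of that wrapper (the module name ceo_app and the application name are assumptions on my part; a matching ceo.app resource file is also needed):

%% ceo_app.erl -- hypothetical application callback module.
%% The supervisor is linked to the application master rather than
%% to the shell, so it survives the end of the shell command.
-module(ceo_app).
-behaviour(application).
-export([start/2, stop/1]).

start(_Type, _Args) ->
    %% #innovative_ideas{} is the record from the question's code
    ceo_supervisor:start_link(#innovative_ideas{}).

stop(_State) ->
    ok.

Start it with application:start(ceo) instead of calling the supervisor directly.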
