Erlang: Cannot start supervisor on another node - erlang

I have a simple supervisor that looks like this
-module(a_sup).
-behaviour(supervisor).

%% API
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_Args) ->
    RestartStrategy = {simple_one_for_one, 5, 3600},
    ChildSpec = {a_gen_server,
                 {a_gen_server, start_link, []},
                 permanent,
                 brutal_kill,
                 worker,
                 [a_gen_server]},
    {ok, {RestartStrategy, [ChildSpec]}}.
When I run this in the shell, it works perfectly fine. But now I want to run different instances of this supervisor on different nodes, called foo and bar (started as erl -sname foo and erl -sname bar, from a separate node called main, started as erl -sname main). This is how I try to start it: rpc:call('foo@My-MacBook-Pro', a_sup, start_link, []). But after replying with {ok, Pid} it immediately fails with this message:
{ok,<9098.117.0>}
=ERROR REPORT==== 7-Mar-2022::16:05:45.416820 ===
** Generic server a_sup terminating
** Last message in was {'EXIT',<9098.116.0>,
{#Ref<0.3172713737.1597505552.87599>,return,
{ok,<9098.117.0>}}}
** When Server state == {state,
{local,a_sup},
simple_one_for_one,
{[a_gen_server],
#{a_gen_server =>
{child,undefined,a_gen_server,
{a_gen_server,start_link,[]},
permanent,false,brutal_kill,worker,
[a_gen_server]}}},
{maps,#{}},
5,3600,[],0,never,a_sup,[]}
** Reason for termination ==
** {#Ref<0.3172713737.1597505552.87599>,return,{ok,<9098.117.0>}}
(main@Prachis-MacBook-Pro)2> =CRASH REPORT==== 7-Mar-2022::16:05:45.416861 ===
crasher:
initial call: supervisor:a_sup/1
pid: <9098.117.0>
registered_name: a_sup
exception exit: {#Ref<0.3172713737.1597505552.87599>,return,
{ok,<9098.117.0>}}
in function gen_server:decode_msg/9 (gen_server.erl, line 481)
ancestors: [<9098.116.0>]
message_queue_len: 0
messages: []
links: []
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 29
reductions: 425
neighbours:
From the message it looks like the call expects the supervisor to be a gen_server instead? When I start a gen_server on the node the same way, it works out just fine, but not with supervisors. I can't figure out whether something is different about starting a supervisor on a remote node, and if so, what we should do to fix the issue.

As per @JoséM's suggestion, the supervisor on the remote node is also linked to the ephemeral RPC process, so it dies as soon as that process exits. Since supervisor does not provide a non-linking start function, modifying start_link/0 to unlink from the caller solves the issue:
start_link() ->
    {ok, Pid} = supervisor:start_link({local, ?MODULE}, ?MODULE, []),
    unlink(Pid),
    {ok, Pid}.
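With that change, starting the supervisor from the main node might look like this (node name taken from the question; a sketch, not verified on your setup):

```erlang
%% Run on the main node. Because a_sup:start_link/0 now unlinks from its
%% caller, the supervisor survives the exit of the ephemeral rpc process.
start_remote_sup() ->
    {ok, Pid} = rpc:call('foo@My-MacBook-Pro', a_sup, start_link, []),
    {ok, Pid}.
```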

Where should poolboy be started? Erlang database connection pools

Please excuse my poor English!
I use poolboy as my database connection pool; I have read the README.md on GitHub: https://github.com/devinus/poolboy
But I still don't know where poolboy should be started. When I want it to start, I get an error: already_started.
My project's files: http://pastebin.com/zus6dGdz
I use cowboy as my HTTP server, but you can ignore it.
I start the program like this:
1. I use rebar to compile:
$ rebar clean & make
2. Then I use erl to run my program:
$ erl -pa ebin/ -pa deps/*/ebin -s start server_start
But I get the following errors:
=CRASH REPORT==== 3-Feb-2015::17:47:27 ===
crasher:
initial call: poolboy:init/1
pid: <0.171.0>
registered_name: []
exception exit: {{badmatch,{error,{already_started,<0.173.0>}}},
[{poolboy,new_worker,1,
[{file,"src/poolboy.erl"},{line,260}]},
{poolboy,prepopulate,3,
[{file,"src/poolboy.erl"},{line,281}]},
{poolboy,init,3,[{file,"src/poolboy.erl"},{line,143}]},
{gen_server,init_it,6,
[{file,"gen_server.erl"},{line,306}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,237}]}]}
in function gen_server:init_it/6 (gen_server.erl, line 330)
ancestors: [hello_erlang_sup,<0.66.0>]
messages: []
links: [<0.172.0>,<0.173.0>,<0.170.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 27
reductions: 205
neighbours:
neighbour: [{pid,<0.173.0>},
{registered_name,db_mongo_handler},
{initial_call,{db_mongo_handler,init,['Argument__1']}},
{current_function,{gen_server,loop,6}},
{ancestors,[<0.172.0>,mg_pool1,hello_erlang_sup,<0.66.0>]},
{messages,[]},
{links,[<0.172.0>,<0.174.0>,<0.171.0>]},
{dictionary,[]},
{trap_exit,false},
{status,waiting},
{heap_size,233},
{stack_size,9},
{reductions,86}]
Please help me solve the problem! Thanks!
You are starting a pool of 10 workers with the same registered name. When a process is registered with a name and another process tries to register with the same name, you get the error already_started.
In your example code, the worker module for poolboy is db_mongo_handler. Poolboy tries to start 10 workers by calling db_mongo_handler:start_link/1 which is implemented as
start_link(Args) ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, Args, []).
The first worker can start but when the second worker starts it crashes with already_started.
Normally the workers of a pool of many similar workers should not have a registered name. Instead, only the pool has a name and when you need a worker, you ask poolboy to deliver a pid() of one of the workers using poolboy:checkout(mg_pool1).
To fix the code, change gen_server:start_link({local, ?SERVER}, ?MODULE, Args, []) to gen_server:start_link(?MODULE, Args, []). Then it will not be registered with a name.
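Once the workers are anonymous, callers address the pool rather than a worker name. A minimal usage sketch (mg_pool1 is the pool name from the question; the request term is whatever your worker's handle_call expects):

```erlang
%% Check a worker out of the pool, call it, and always check it back in.
with_worker(Request) ->
    Worker = poolboy:checkout(mg_pool1),
    try
        gen_server:call(Worker, Request)
    after
        ok = poolboy:checkin(mg_pool1, Worker)
    end.
```

poolboy:transaction/2 wraps exactly this checkout/checkin pattern for you.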

Erlang application undef error (exited: {bad_return,)

I am trying to run a custom application but get multiple errors. I believe the main egs app fails because it starts the egs_patch app, which hits an undefined function. I can't figure out how to get this working; I have tried recompiling the code many times, following others with a similar problem, but nothing seems to work. cowboy:start_listener remains undefined.
This is the error I receive.
=CRASH REPORT==== 10-Apr-2013::21:02:00 ===
crasher:
initial call: application_master:init/4
pid: <0.106.0>
registered_name: []
exception exit: {bad_return,
{{egs_patch_app,start,[normal,[]]},
{'EXIT',
{undef,
[{cowboy,start_listener,
[{patch,11030},
10,cowboy_tcp_transport,
[{port,11030}],
egs_patch_protocol,[]],
[]},
{egs_patch_app,start_listeners,1,
[{file,"src/egs_patch_app.erl"},
{line,44}]},
{egs_patch_app,start,2,
[{file,"src/egs_patch_app.erl"},
{line,31}]},
{application_master,start_it_old,4,
[{file,"application_master.erl"},
{line,274}]}]}}}}
in function application_master:init/4 (application_master.erl, line 138)
ancestors: [<0.105.0>]
messages: [{'EXIT',<0.107.0>,normal}]
links: [<0.105.0>,<0.7.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 27
reductions: 124
neighbours:
=INFO REPORT==== 10-Apr-2013::21:02:00 ===
application: egs_patch
exited: {bad_return,
{{egs_patch_app,start,[normal,[]]},
{'EXIT',
{undef,
[{cowboy,start_listener,
[{patch,11030},
10,cowboy_tcp_transport,
[{port,11030}],
egs_patch_protocol,[]],
[]},
{egs_patch_app,start_listeners,1,
[{file,"src/egs_patch_app.erl"},{line,44}]},
{egs_patch_app,start,2,
[{file,"src/egs_patch_app.erl"},{line,31}]},
{application_master,start_it_old,4,
[{file,"application_master.erl"},
{line,274}]}]}}}}
type: temporary
=CRASH REPORT==== 10-Apr-2013::21:02:00 ===
crasher:
initial call: application_master:init/4
pid: <0.75.0>
registered_name: []
exception exit: {bad_return,
{{egs_app,start,[normal,[]]},
{'EXIT',
{undef,
[{cowboy,start_listener,
[{login,12030},
10,cowboy_ssl_transport,
[{port,12030},
{certfile,"priv/ssl/servercert.pem"},
{keyfile,"priv/ssl/serverkey.pem"},
{password,"alpha"}],
egs_login_protocol,[]],
[]},
{egs_app,start_login_listeners,1,
[{file,"src/egs_app.erl"},{line,55}]},
{egs_app,start,2,
[{file,"src/egs_app.erl"},{line,38}]},
{application_master,start_it_old,4,
[{file,"application_master.erl"},
{line,274}]}]}}}}
in function application_master:init/4 (application_master.erl, line 138)
ancestors: [<0.74.0>]
messages: [{'EXIT',<0.76.0>,normal}]
links: [<0.74.0>,<0.7.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 987
stack_size: 27
reductions: 185
neighbours:
=INFO REPORT==== 10-Apr-2013::21:02:00 ===
application: egs
exited: {bad_return,
{{egs_app,start,[normal,[]]},
{'EXIT',
{undef,
[{cowboy,start_listener,
[{login,12030},
10,cowboy_ssl_transport,
[{port,12030},
{certfile,"priv/ssl/servercert.pem"},
{keyfile,"priv/ssl/serverkey.pem"},
{password,"alpha"}],
egs_login_protocol,[]],
[]},
{egs_app,start_login_listeners,1,
[{file,"src/egs_app.erl"},{line,55}]},
{egs_app,start,2,
[{file,"src/egs_app.erl"},{line,38}]},
{application_master,start_it_old,4,
[{file,"application_master.erl"},
{line,274}]}]}}}}
type: temporary
Here are the files from which the errors originate.
egs_patch_app.erl
-module(egs_patch_app).
-behaviour(application).

%% API.
-export([start/2, stop/1]).

-type application_start_type()
    :: normal | {takeover, node()} | {failover, node()}.

%% API.
-spec start(application_start_type(), term()) -> {ok, pid()}.
start(_Type, _StartArgs) ->
    {ok, PatchPorts} = application:get_env(patch_ports),
    start_listeners(PatchPorts),
    egs_patch_sup:start_link().

-spec stop(term()) -> ok.
stop(_State) ->
    ok.

%% Internal.
-spec start_listeners([inet:ip_port()]) -> ok.
start_listeners([]) ->
    ok;
start_listeners([Port|Tail]) ->
    {ok, _Pid} = cowboy:start_listener({patch, Port}, 10,
        cowboy_tcp_transport, [{port, Port}],
        egs_patch_protocol, []),
    start_listeners(Tail).
egs_app.erl
-module(egs_app).
-behaviour(application).

%% API.
-export([start/2, stop/1]).

-include("/home/mattk/Desktop/egs-master/apps/egs/include/records.hrl").

-type application_start_type()
    :: normal | {takeover, node()} | {failover, node()}.

-define(SSL_OPTIONS, [{certfile, "priv/ssl/servercert.pem"},
                      {keyfile, "priv/ssl/serverkey.pem"},
                      {password, "alpha"}]).

%% API.
-spec start(application_start_type(), term()) -> {ok, pid()}.
start(_Type, _StartArgs) ->
    {ok, Pid} = egs_sup:start_link(),
    application:set_env(egs_patch, patch_ports, egs_conf:read(patch_ports)),
    application:start(egs_patch),
    start_login_listeners(egs_conf:read(login_ports)),
    {_ServerIP, GamePort} = egs_conf:read(game_server),
    {ok, _GamePid} = cowboy:start_listener({game, GamePort}, 10,
        cowboy_ssl_transport, [{port, GamePort}] ++ ?SSL_OPTIONS,
        egs_game_protocol, []),
    {ok, Pid}.

-spec stop(term()) -> ok.
stop(_State) ->
    ok.

%% Internal.
-spec start_login_listeners([inet:ip_port()]) -> ok.
start_login_listeners([]) ->
    ok;
start_login_listeners([Port|Tail]) ->
    {ok, _Pid} = cowboy:start_listener({login, Port}, 10,
        cowboy_ssl_transport, [{port, Port}] ++ ?SSL_OPTIONS,
        egs_login_protocol, []),
    start_login_listeners(Tail).
Here's our hint:
.....
{{egs_patch_app,start,[normal,[]]},
{'EXIT',
{undef,
[{cowboy,start_listener, .....
The tuple {egs_patch_app,start,[normal,[]]} tells us that the error occurred in egs_patch_app:start/2. The atom EXIT is the tag of a notification message sent when a process has exited, or the result of an expression like catch error(someerror). Now we get to the interesting part. undef means an attempt was made to call an undefined function. A function is undefined if its Name/Arity doesn't match any known function. In this case, the undefined function is cowboy:start_listener().
Once again, the problem is that Cowboy has evolved while egs has not. Major changes in the Cowboy API have made the two incompatible. Since the last change in egs was about a year ago (assuming you're using essen's branch), you could try reverting to an older Cowboy tag by changing the corresponding rebar.config line to something like this:
{cowboy, ".*", {git, "git://github.com/extend/cowboy.git", {tag, "0.6.0"}}}
Notice how "HEAD" changed to {tag, "0.6.0"}. The Cowboy reference may have to be changed in several applications (at least egs and egs_patch). You'll quite possibly need to clear your deps/ first.
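A hedged command sequence for refreshing the dependencies after pinning the tag (classic rebar commands; adjust names to your build):

```shell
# Drop the stale checkouts, then fetch and build the pinned Cowboy tag.
rm -rf deps/
rebar get-deps
rebar compile
```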
Erlang error messages can be difficult to parse, but as a general rule of thumb, you should be on the lookout for a few atoms:
case_clause, meaning no clause in a case expression matched.
function_clause, meaning no function clause matched the arguments.
undef, as noted above, meaning a call to an external (not local to module) function couldn't be resolved.
badarg, which is Erlang's "illegal argument" exception.
badarith, a sneaky bastard that sometimes shows up when you mistype a variable name as an atom in an arithmetic expression, such as 1/x instead of 1/X.
To learn more about Erlang's error handling mechanisms, read the docs.
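You can reproduce undef from any shell session; the arity is part of the lookup, so even a real function called with the wrong number of arguments fails the same way:

```erlang
1> lists:reverse(1, 2, 3).
** exception error: undefined function lists:reverse/3
2> nosuch_module:f().
** exception error: undefined function nosuch_module:f/0
```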

Erlang Dynamic supervisor start gen_server

I have root supervisor that create other supervisor:
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    RestartStrategy = {one_for_one, 5, 600},
    ListenerSup = {popd_listener_sup,
                   {popd_listener_sup, start_link, []},
                   permanent, 2000, supervisor, [popd_listener]},
    Children = [ListenerSup],
    {ok, {RestartStrategy, Children}}.
And I have a gen_server, a listener. How can I run this gen_server under the popd_listener_sup supervisor once the supervisor has been created?
Thank you.
Root supervisor
-module(root_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1, shutdown/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_Args) ->
    RestartStrategy = {one_for_one, 10, 60},
    ListenerSup = {popd_listener_sup,
                   {popd_listener_sup, start_link, []},
                   permanent, infinity, supervisor, [popd_listener_sup]},
    Children = [ListenerSup],
    {ok, {RestartStrategy, Children}}.

%% The supervisor can be shut down by calling exit(SupPid, shutdown)
%% or, if it is linked to its parent, by the parent calling exit/1.
shutdown() ->
    exit(whereis(?MODULE), shutdown).
    %% or exit(normal).
If the child process is another supervisor, Shutdown in child specification should be set to infinity to give the subtree ample time to shutdown, and Type should be set to supervisor, and that's what we did.
Child supervisor
-module(popd_listener_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_Args) ->
    RestartStrategy = {one_for_one, 10, 60},
    Listener = {ch1, {ch1, start_link, []},
                permanent, 2000, worker, [ch1]},
    Children = [Listener],
    {ok, {RestartStrategy, Children}}.
Here, in a child specification, we set value of Shutdown to 2000. An integer timeout value means that the supervisor will tell the child process to terminate by calling exit(Child,shutdown) and then wait for an exit signal with reason shutdown back from the child process.
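A child does not have to be listed in init/1; with the same child-spec shape it can also be attached at runtime via supervisor:start_child/2. A sketch (the ch2 id is hypothetical, and note that a worker registered under a fixed local name, like ch1 here, can only run once):

```erlang
%% Attach another listener to the already-running supervisor.
add_listener() ->
    ChildSpec = {ch2, {ch1, start_link, []},
                 permanent, 2000, worker, [ch1]},
    supervisor:start_child(popd_listener_sup, ChildSpec).
```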
Listener
-module(ch1).
-behaviour(gen_server).

%% Callback functions which should be exported
-export([init/1]).
-export([handle_call/3, handle_cast/2, terminate/2]).

%% user-defined interface functions
-export([start_link/0]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init(_Args) ->
    erlang:process_flag(trap_exit, true),
    io:format("ch1 has started (~w)~n", [self()]),
    %% If the initialization is successful, the function
    %% should return {ok,State}, {ok,State,Timeout} ..
    {ok, []}.

%% A minimal handle_call/3 so the module satisfies the gen_server
%% behaviour (the original omitted it, which triggers a compiler warning).
handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(calc, State) ->
    io:format("result 2+2=4~n"),
    {noreply, State};
handle_cast(calcbad, State) ->
    io:format("result 1/0~n"),
    1 / 0,
    {noreply, State}.

terminate(_Reason, _State) ->
    io:format("ch1: terminating.~n"),
    ok.
From Erlang/OTP documentation:
If the gen_server is part of a
supervision tree and is ordered by its
supervisor to terminate, the function
Module:terminate(Reason, State) will
be called with Reason=shutdown if
the following conditions apply:
the gen_server has been set to trap exit signals, and
the shutdown strategy as defined in the supervisor's child specification
is an integer timeout value, not
brutal_kill.
That's why we called erlang:process_flag(trap_exit, true) in Module:init(Args).
Sample run
Starting the root supervisor:
1> root_sup:start_link().
ch1 has started (<0.35.0>)
{ok,<0.33.0>}
The root supervisor starts and automatically launches its child processes, the child supervisor in our case. The child supervisor in turn starts its own child processes; we have only one child here, ch1.
Let's make ch1 evaluate normal code:
2> gen_server:cast(ch1, calc).
result 2+2=4
ok
Now some bad code:
3> gen_server:cast(ch1, calcbad).
result 1/0
ok
ch1: terminating.
=ERROR REPORT==== 31-Jan-2011::01:38:44 ===
** Generic server ch1 terminating
** Last message in was {'$gen_cast',calcbad}
** When Server state == []
** Reason for termination ==
** {badarith,[{ch1,handle_cast,2},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
ch1 has started (<0.39.0>)
4> exit(normal).
ch1: terminating.
** exception exit: normal
As you may see child process ch1 was restarted by the child supervisor popd_listener_sup (notice ch1 has started (<0.39.0>)).
Since our shell and the root supervisor are bidirectionally linked (start_link/0 calls supervisor:start_link, which links the supervisor to its caller), exit(normal) resulted in the root supervisor shutting down as well, but its child processes had some time to clean up.

gen_server and the run-time errors

I have a run-time error in the init part of a gen_server.
- init begins with process_flag(trap_exit, true)
- the gen_server is part of a supervision tree
I try to print the reason in the terminate callback, but the process seems to exit elsewhere.
- Why is terminate not called?
The application stops with shutdown as the reason.
- How and where do I catch the run-time error?
The terminate callback is normally called in this situation, precisely because you have trapped exits.
The only case where this does not happen is when the crash occurs in the init function. Then the responsibility falls to the supervisor, which usually terminates itself as a result. The error then crawls up the supervisor tree until it ends up terminating your whole application.
Usually, the supervisor will log a supervisor report with the context set to start_error. This is your hint that part of the supervision tree has problems you should handle. Check for this, because you may have the wrong assumption about where the error occurs.
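If you want to surface the reason yourself instead of relying on the supervisor report, one hedged option is to catch the error inside init/1 and turn it into a clean {stop, Reason} (a sketch; do_init/1 is a hypothetical helper holding the real initialization):

```erlang
init(Args) ->
    process_flag(trap_exit, true),
    try
        {ok, do_init(Args)}
    catch
        Class:Reason ->
            %% Log the failure ourselves, then stop cleanly instead of
            %% crashing out of init.
            error_logger:error_msg("init failed: ~p:~p~n", [Class, Reason]),
            {stop, Reason}
    end.
```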
EDITED FROM HERE
Your problem is that you don't know about SASL yet. Study it; here is an example of how to use it.
Hoisted code from your example:
First, the boilerplate needed to tell Erlang we have a gen_server.
-module(foo).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).
We hack the #state{} record so it can be used with your code
-record(state, { name, port, socket_listen }).
Basic start_linkage...
start_link() ->
    gen_server:start_link({local, foo}, ?MODULE, [], []).
Your init function, spawn problem included.
init([]) ->
    Port = 3252,
    Name = "foo",
Above we have hacked a bit for the sake of simplification...
    process_flag(trap_exit, true),
    erlang:error(blabla),
    Opts = [binary, {reuseaddr, true},
            {backlog, 5}, {packet, 0}, {active, false}, {nodelay, true}],
    case gen_tcp:listen(Port, Opts) of
        {ok, Socket_Listen} ->
            logger:fmsg("--> [~s,init] Socket_Listen crée = ~p",
                        [Name, Socket_Listen]),
            {ok, handle_accept(#state{socket_listen = Socket_Listen})};
        {error, Reason} ->
            logger:fmsg("--> [~s,init] Erreur, Raison =~p",
                        [Name, Reason]),
            {stop, Reason}
    end.
Hacks for missing functions....
handle_accept(_) ->
    #state{}.
The rest is just the basics..., so I omit them.
Now for foo_sup the supervisor for foo:
-module(foo_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

-define(SERVER, ?MODULE).
Basic start link...
start_link() ->
    supervisor:start_link({local, ?SERVER}, ?MODULE, []).
Basic ChildSpec. Get the foo child up and running...
init([]) ->
    FooCh = {foo, {foo, start_link, []},
             permanent, 2000, worker, [foo]},
    {ok, {{one_for_all, 0, 1}, [FooCh]}}.
Boot Erlang with SASL enabled:
jlouis#illithid:~$ erl -boot start_sasl
Erlang R14B02 (erts-5.8.3) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
=PROGRESS REPORT==== 9-Dec-2010::01:01:51 ===
[..]
Eshell V5.8.3 (abort with ^G)
Let us try to spawn the supervisor...
1> foo_sup:start_link().
And we get this:
=CRASH REPORT==== 9-Dec-2010::01:05:48 ===
crasher:
initial call: foo:init/1
pid: <0.58.0>
registered_name: []
exception exit: {blabla,[{foo,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
Above we see that we have a crash in foo:init/1 due to an exception blabla.
in function gen_server:init_it/6
ancestors: [foo_sup,<0.45.0>]
messages: []
links: [<0.57.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 233
stack_size: 24
reductions: 108
neighbours:
And now the supervisor gets to report about the problem!
=SUPERVISOR REPORT==== 9-Dec-2010::01:05:48 ===
Supervisor: {local,foo_sup}
Context: start_error
The context is exactly as I said it would be...
Reason: {blabla,[{foo,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
And with the expected reason.
Offender: [{pid,undefined},
{name,foo},
{mfargs,{foo,start_link,[]}},
{restart_type,permanent},
{shutdown,2000},
{child_type,worker}]

erlang OTP Supervisor crashing

I'm working through the Erlang documentation, trying to understand the basics of setting up an OTP gen_server and supervisor. Whenever my gen_server crashes, my supervisor crashes as well. In fact, whenever I have an error on the command line, my supervisor crashes.
I expect the gen_server to be restarted when it crashes. I expect command line errors to have no bearing whatsoever on my server components. My supervisor shouldn't be crashing at all.
The code I'm working with is a basic "echo server" that replies with whatever you send in, and a supervisor that will restart the echo_server 5 times per minute at most (one_for_one). My code:
echo_server.erl
-module(echo_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([echo/1, crash/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, echo_server}, echo_server, [], []).

%% public api
echo(Text) ->
    gen_server:call(echo_server, {echo, Text}).

crash() ->
    gen_server:call(echo_server, crash).

%% behaviours
init(_Args) ->
    {ok, none}.

handle_call(crash, _From, State) ->
    X = 1,
    {reply, X = 2, State};
handle_call({echo, Text}, _From, State) ->
    {reply, Text, State}.

handle_cast(_, State) ->
    {noreply, State}.
echo_sup.erl
-module(echo_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(echo_sup, []).

init(_Args) ->
    {ok, {{one_for_one, 5, 60},
          [{echo_server, {echo_server, start_link, []},
            permanent, brutal_kill, worker, [echo_server]}]}}.
Compiled using erlc *.erl, and here's a sample run:
Erlang R13B01 (erts-5.7.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-p
oll:false]
Eshell V5.7.2 (abort with ^G)
1> echo_sup:start_link().
{ok,<0.37.0>}
2> echo_server:echo("hi").
"hi"
3> echo_server:crash().
=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server echo_server terminating
** Last message in was crash
** When Server state == none
** Reason for termination ==
** {'function not exported',
[{echo_server,terminate,
[{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
none]},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server <0.37.0> terminating
** Last message in was {'EXIT',<0.35.0>,
{{{undef,
[{echo_server,terminate,
[{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
none]},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,[echo_server,crash]}},
[{gen_server,call,2},
{erl_eval,do_apply,5},
{shell,exprs,6},
{shell,eval_exprs,6},
{shell,eval_loop,3}]}}
** When Server state == {state,
{<0.37.0>,echo_sup},
one_for_one,
[{child,<0.41.0>,echo_server,
{echo_server,start_link,[]},
permanent,brutal_kill,worker,
[echo_server]}],
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[]}}},
5,60,
[{1273,79154,701110}],
echo_sup,[]}
** Reason for termination ==
** {{{undef,[{echo_server,terminate,
[{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
none]},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,[echo_server,crash]}},
[{gen_server,call,2},
{erl_eval,do_apply,5},
{shell,exprs,6},
{shell,eval_exprs,6},
{shell,eval_loop,3}]}
** exception exit: {{undef,
[{echo_server,terminate,
[{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
none]},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,[echo_server,crash]}}
in function gen_server:call/2
4> echo_server:echo("hi").
** exception exit: {noproc,{gen_server,call,[echo_server,{echo,"hi"}]}}
in function gen_server:call/2
5>
The problem with testing supervisors from the shell is that the supervisor process is linked to the shell process. When the gen_server crashes, the exception raised by gen_server:call/2 also crashes the shell; the shell's exit signal propagates over the link, takes the supervisor down with it, and the shell then gets restarted.
To avoid the problem, add something like this to the supervisor:
start_in_shell_for_testing() ->
    {ok, Pid} = supervisor:start_link(echo_sup, []),
    unlink(Pid).
I would suggest you debug/trace your application to check what's going on. It is very helpful for understanding how things work in OTP.
In your case, you might want to do the following.
Start the tracer:
dbg:tracer().
Trace all function calls for your supervisor and your gen_server:
dbg:p(all,c).
dbg:tpl(echo_server, x).
dbg:tpl(echo_sup, x).
Check which messages the processes are passing:
dbg:p(new, m).
See what's happening to your processes (crash, etc):
dbg:p(new, p).
For more information about tracing:
http://www.erlang.org/doc/man/dbg.html
http://aloiroberto.wordpress.com/2009/02/23/tracing-erlang-functions/
Hope this can help for this and future situations.
HINT: The gen_server behaviour is expecting the callback terminate/2 to be defined and exported ;)
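Concretely, the missing callback can be as small as this sketch (remember to add terminate/2 to the -export list as well):

```erlang
%% gen_server calls this on shutdown; without it, the
%% 'function not exported' error shown above occurs.
terminate(_Reason, _State) ->
    ok.
```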
UPDATE: After defining terminate/2, the reason for the crash is evident from the trace. This is how it looks:
We (75) call the crash/0 function. This is received by the gen_server (78).
(<0.75.0>) call echo_server:crash()
(<0.75.0>) <0.78.0> ! {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) << {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) call echo_server:handle_call(crash,{<0.75.0>,#Ref<0.0.0.358>},none)
Uh, problem on the handle call. We have a badmatch...
(<0.78.0>) exception_from {echo_server,handle_call,3} {error,{badmatch,2}}
The terminate function is called. The server exits and it gets unregistered.
(<0.78.0>) call echo_server:terminate({{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},none)
(<0.78.0>) returned from echo_server:terminate/2 -> ok
(<0.78.0>) exit {{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
(<0.78.0>) unregister echo_server
The Supervisor (77) receive the exit signal from the gen_server and it does its job:
(<0.77.0>) << {'EXIT',<0.78.0>,
{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) getting_unlinked <0.78.0>
(<0.75.0>) << {'DOWN',#Ref<0.0.0.358>,process,<0.78.0>,
{{badmatch,2},
[{echo_server,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) call echo_server:start_link()
Well, it tries... and then what Filippo described happens.
On the other hand, if the restart strategy has to be tested from the console, start the supervisor from the console and use pman to kill the worker process.
You will see that pman shows the same supervisor Pid but different worker Pids, depending on the MaxR and MaxT you have set in the restart strategy.
