Can you give some examples of when a process gets restarted by an Erlang supervisor? If a process dies, it will be restarted. But how does a process die?
Thanks.
Take what happens in the Erlang shell as an example. Consider the following sequence:
1> self().
<0.32.0>
2> A = 1.
1
3> self().
<0.32.0>
4> A = 2.
** exception error: no match of right hand side value 2
5> self().
<0.37.0>
1> The first command asks the shell to print its own Pid: <0.32.0>.
2> The next command binds the variable A to 1. It works because A was unbound.
3> A new request to the shell shows that its Pid has not changed.
4> Trying to match A against the integer 2 fails and raises an exception. In the background, the shell process dies and a supervisor restarts it immediately.
5> This can be verified by asking for the shell Pid again: it is now <0.37.0>.
6> When the shell died it lost all of its state, and it was restarted from scratch. During initialization, however, it reconnects to other processes that are in charge of keeping the session history and all the bound variables. This can be verified by asking for the value of A:
6> A.
1
7> or by asking for the history:
7> h().
1: self()
-> <0.32.0>
2: A = 1
-> 1
3: self()
-> <0.32.0>
4: A = 2
-> {'EXIT',{{badmatch,2},[{erl_eval,expr,3,[]}]}}
5: self()
-> <0.37.0>
6: A
Depending on the environment (hardware failure, loss of communication, bad parameters, bugs...), an Erlang process may die with an error reason. If it is managed in a supervision tree (or by your own monitoring), it can be restarted from scratch. It is the application's responsibility to give every process the means to recover the appropriate state.
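To make the exit reasons concrete, here is a minimal sketch that runs three small functions in monitored processes and collects the reason each one dies with. The module name death_demo and the helper exit_reason/1 are made up for this illustration; only spawn_monitor/1, exit/1 and erlang:error/1 are standard.

```erlang
-module(death_demo).
-export([reasons/0]).

%% Hypothetical helper: run Fun in a monitored process and
%% return the reason that process exited with.
exit_reason(Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.

reasons() ->
    %% 1. The function simply returns: the process exits with reason normal.
    Normal = exit_reason(fun() -> ok end),
    %% 2. The process calls exit/1 with a reason of its own choosing.
    Explicit = exit_reason(fun() -> exit(my_reason) end),
    %% 3. An uncaught run-time error: the reason is {Error, Stacktrace}.
    {Error, _Stacktrace} = exit_reason(fun() -> erlang:error(badarg) end),
    [Normal, Explicit, Error].
```

Calling death_demo:reasons() should return [normal, my_reason, badarg]. The third process also produces an error report on the console, because its error was not caught.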
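An Erlang process may also die with the reason normal, for example when a user closes a session (in the shell, you type q().). In this case the supervisor will not restart it, provided the child is declared transient or temporary; a permanent child is restarted whatever the exit reason.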
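As a rough sketch of both cases, the module below starts a supervisor with a single transient worker, makes the worker exit abnormally (it is restarted), and then makes it exit normally (it is not). The names restart_demo, start_worker/0 and the crash/stop messages are assumptions for the example, and the worker is a bare process rather than a proper OTP behaviour.

```erlang
-module(restart_demo).
-behaviour(supervisor).
-export([demo/0, start_worker/0, init/1]).

%% Supervisor callback: one transient worker,
%% at most 5 restarts within 10 seconds.
init([]) ->
    Worker = {worker, {?MODULE, start_worker, []},
              transient, 1000, worker, [?MODULE]},
    {ok, {{one_for_one, 5, 10}, [Worker]}}.

%% Called by the supervisor; the worker is linked to it.
start_worker() ->
    {ok, spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive
        crash -> exit(boom);   % abnormal exit
        stop  -> ok            % returning means exit reason normal
    end.

worker_pid(Sup) ->
    [{worker, Pid, _Type, _Mods}] = supervisor:which_children(Sup),
    Pid.

demo() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    Pid1 = worker_pid(Sup),
    Pid1 ! crash,
    timer:sleep(100),          % give the supervisor time to restart it
    Pid2 = worker_pid(Sup),
    Restarted = is_pid(Pid2) andalso Pid2 =/= Pid1,
    Pid2 ! stop,
    timer:sleep(100),
    AfterNormalExit = worker_pid(Sup),
    %% Clean up: detach from the supervisor, then kill it.
    unlink(Sup),
    exit(Sup, kill),
    {Restarted, AfterNormalExit}.
```

restart_demo:demo() is expected to return {true, undefined}: the crashed worker came back with a new Pid, while the normally terminated one left its child entry without a running process. The supervisor also logs a report for the abnormal exit.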
You will find a lot of valuable information on the web:
design principle
erlang.org supervisor
learn you some erlang : run time errors
learn you some erlang : errors and processes
learn you some erlang : supervisors
I understand that I can set a seq_trace in erlang to the current process that is executing. But how can I set it on another process from the shell, or remote shell like dbg tracing?
You can enable sequential tracing on another process using dbg. For example, let's say we have a module x with an exported call/2 function:
call(Pid, Msg) ->
    Pid ! {self(), Msg},
    receive
        {Pid, Reply} -> Reply
    end.
This function implements a simple call-response. Let's also say we have a module y that has a looping receiver function:
loop() ->
    receive
        {Pid, Msg} ->
            seq_trace:print({?MODULE, self(), Pid, Msg}),
            Pid ! {self(), {Msg, os:timestamp()}};
        _ -> ok
    end,
    ?MODULE:loop().
This function expects a message of the form sent by x:call/2, and when it receives one it prints a message into the sequential trace, if enabled, and then sends the original message back to the caller augmented with a timestamp. It ignores all other messages.
We also need a function to collect the sequential trace. The recursive systracer/1 function below just collects seq_trace tuples into a list, and produces the list of seq_trace messages when asked:
systracer(Acc) ->
    receive
        {seq_trace,_,_,_}=S ->
            systracer([S|Acc]);
        {seq_trace,_,_}=S ->
            systracer([S|Acc]);
        {dump, Pid} ->
            Pid ! lists:reverse(Acc),
            systracer([]);
        stop -> ok
    end.
Let's assume our systracer/1 function is exported from module x as well.
Let's use our Erlang shell to set this all up. First, let's spawn y:loop/0 and x:systracer/1:
1> Y = spawn(y,loop,[]).
<0.36.0>
2> S = spawn(x,systracer,[[]]).
<0.38.0>
3> seq_trace:set_system_tracer(S).
false
After spawning x:systracer/1 we set the process as the seq_trace system tracer. Now we need to start dbg:
4> dbg:tracer(), dbg:p(all,call).
{ok,[{matched,nonode@nohost,28}]}
These dbg calls are pretty standard, but feel free to vary them as needed, especially if you plan to use dbg tracing during your debug session as well.
In practice when you enable sequential tracing with dbg, you typically do so by keying on a particular argument to a function. This enables you to get a trace specific to a given function invocation without getting traces for all invocations of that function. Along these lines, we'll use dbg:tpl/3 to turn on sequential trace flags when x:call/2 is invoked with its second argument having the value of the atom trace. First, we use dbg:fun2ms/1 to create the appropriate match specification to enable the sequential tracing flags we want, then we'll apply the match spec with dbg:tpl/3:
5> Ms = dbg:fun2ms(fun([_,trace]) -> set_seq_token(send,true), set_seq_token('receive',true), set_seq_token(print,true) end).
[{['_',trace],
[],
[{set_seq_token,send,true},
{set_seq_token,'receive',true},
{set_seq_token,print,true}]}]
6> dbg:tpl(x,call,Ms).
{ok,[{matched,nonode@nohost,1},{saved,1}]}
Now we can call x:call/2 with the second argument trace to cause sequential tracing to occur. We make this call from a spawned process to avoid having shell I/O-related messages appearing in the resulting trace:
7> spawn(fun() -> x:call(Y, trace), x:call(Y, foo) end).
(<0.46.0>) call x:call(<0.36.0>,trace)
<0.46.0>
The first line of output comes from normal dbg tracing, since we specified dbg:p(all, call) earlier. To get the sequential trace results, we need to get a dump from our systracer/1 process:
8> S ! {dump, self()}.
{dump,<0.34.0>}
This sends all sequential trace messages collected so far to our shell process. We can use the shell flush() command to view them:
9> flush().
Shell got [{seq_trace,0,{send,{0,1},<0.47.0>,<0.36.0>,{<0.47.0>,trace}}},
{seq_trace,0,{'receive',{0,1},<0.47.0>,<0.36.0>,{<0.47.0>,trace}}},
{seq_trace,0,{print,{1,2},<0.36.0>,[],{y,<0.36.0>,<0.47.0>,trace}}},
{seq_trace,0,
{send,{1,3},
<0.36.0>,<0.47.0>,
{<0.36.0>,{trace,{1423,709096,206121}}}}},
{seq_trace,0,
{'receive',{1,3},
<0.36.0>,<0.47.0>,
{<0.36.0>,{trace,{1423,709096,206121}}}}},
{seq_trace,0,{send,{3,4},<0.47.0>,<0.36.0>,{<0.47.0>,foo}}},
{seq_trace,0,{'receive',{3,4},<0.47.0>,<0.36.0>,{<0.47.0>,foo}}},
{seq_trace,0,{print,{4,5},<0.36.0>,[],{y,<0.36.0>,<0.47.0>,foo}}},
{seq_trace,0,
{send,{4,6},
<0.36.0>,<0.47.0>,
{<0.36.0>,{foo,{1423,709096,206322}}}}},
{seq_trace,0,
{'receive',{4,6},
<0.36.0>,<0.47.0>,
{<0.36.0>,{foo,{1423,709096,206322}}}}}]
And sure enough, these are the sequential trace messages we expected to see. First, for the message containing the trace atom, we have the send from x:call/2 followed by the reception in y:loop/0 and the result of seq_trace:print/1, then the send from y:loop/0 back to the caller of x:call/2. Then, since x:call(Y,foo) is called in the same process, which means all the sequential tracing flags are still enabled, the first set of sequential trace messages is followed by a similar set for the x:call(Y,foo) invocation.
If we just call x:call(Y,foo) we can see we get no sequential trace messages:
10> spawn(fun() -> x:call(Y, foo) end).
<0.55.0>
11> S ! {dump, self()}.
{dump,<0.34.0>}
12> flush().
Shell got []
This is because our match spec enables sequential tracing only when the second argument to x:call/2 is the atom trace.
For more information, see the seq_trace and dbg man pages, and also read the match specification chapter of the Erlang Run-Time System Application (ERTS) User's Guide.
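For contrast, this is roughly what the direct approach mentioned in the question looks like, where a process sets the trace token on itself with seq_trace:set_token/2 instead of having dbg set it through a match spec. The module name seq_demo, the collect/1 collector and the echo process are invented for this sketch, and the short sleep is only a crude way to let trace messages reach the collector.

```erlang
-module(seq_demo).
-export([run/0]).

%% Hypothetical collector: accumulates seq_trace messages until dumped.
collect(Acc) ->
    receive
        {seq_trace, _Label, _Info} = S -> collect([S | Acc]);
        {dump, Pid} -> Pid ! {trace, lists:reverse(Acc)}
    end.

run() ->
    Self = self(),
    Tracer = spawn(fun() -> collect([]) end),
    OldTracer = seq_trace:set_system_tracer(Tracer),
    %% A one-shot echo process standing in for y:loop/0.
    Echo = spawn(fun() ->
                     receive {From, Msg} -> From ! {self(), Msg} end
                 end),
    %% Turn on the flags for the calling process only.
    seq_trace:set_token(send, true),
    seq_trace:set_token('receive', true),
    Echo ! {Self, hello},
    receive {Echo, hello} -> ok end,
    %% Clear the token so later messages are not traced.
    seq_trace:set_token([]),
    timer:sleep(50),
    seq_trace:set_system_tracer(OldTracer),
    Tracer ! {dump, Self},
    receive {trace, Events} -> Events end.
```

seq_demo:run() returns the collected list of {seq_trace, Label, Info} tuples. The difference from the dbg approach is that the traced process has to run this code itself, which is exactly what the match spec avoids.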
Edit: I changed the title of this question, as it wasn't useful in light of the n00b mistake I had made. The remainder is unchanged, and serves as a cautionary tale!
I am using Erlang/OTP version 17.4. Consider the following Erlang shell session, where I am experimenting with the trap_exit process flag as explained in "Learn You Some Erlang: Errors and Processes".
First, I set the trap_exit flag to convert exit signals in linked processes to regular messages:
Eshell V6.2 (abort with ^G)
1> process_flag(trap_exit, true).
false
Then I spawn a linked process and terminate it immediately with a call to exit/2:
2> exit(spawn_link(fun() -> timer:sleep(50000) end), kill).
true
Then I read the converted exit message:
3> receive X -> X end.
{'EXIT',<0.61.0>,killed}
All looking good so far, just like the book describes. Now, just for fun, I spawn_link and terminate another process:
4> exit(spawn_link(fun() -> timer:sleep(5000) end), kill).
true
And try to read the converted exit message:
5> receive X -> X end.
At this point the shell hangs. My question is why does the behaviour change on the second go around and where did the exit message go?
Your second receive X -> X end. already has X bound; it is attempting to receive a message exactly matching the one you already saw. Since the pid is going to be different, the message will never match. So it hangs, waiting for one that does match.
You need to f(X) first.
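Outside the shell the problem does not come up, because variables are fresh on every function call. A small sketch, with the module and function names (trap_demo, kill_linked/0) chosen just for this example:

```erlang
-module(trap_demo).
-export([kill_linked/0]).

%% Spawn a linked process, kill it, and return the exit reason
%% that arrives as a message because we are trapping exits.
kill_linked() ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> timer:sleep(50000) end),
    exit(Pid, kill),
    Reason = receive
                 %% Pid is bound here on purpose: we only want
                 %% the exit message of the process we just killed.
                 {'EXIT', Pid, R} -> R
             end,
    process_flag(trap_exit, Old),   % restore the previous setting
    Reason.
```

trap_demo:kill_linked() returns killed, and it keeps doing so however many times you call it, since Pid and R are new variables on each call.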
I have a strange behaviour in erlang with ets:select.
I run a correct select statement (4 and 5 below), then I make an error in my statement (6 below). When I then retry the same statement as in 4 and 5, it no longer works.
What is happening? Any idea?
Erlang R14B01 (erts-5.8.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Eshell V5.8.2 (abort with ^G)
1> Tab = ets:new(x, [private]).
16400
2> ets:insert(Tab, {c, "rhino"}).
true
3> ets:insert(Tab, {a, "lion"}).
true
4> ets:select(Tab,[{{'$1','$2'},[],['$1', '$2']}]).
["rhino","lion"]
5> ets:select(Tab,[{{'$1','$2'},[],['$1', '$2']}]).
["rhino","lion"]
6> ets:select(Tab,[{{'$1','$2'},[],['$1', '$2', '$3']}]).
** exception error: bad argument
in function ets:select/2
called as ets:select(16400,[{{'$1','$2'},[],['$1','$2','$3']}])
7> ets:select(Tab,[{{'$1','$2'},[],['$1', '$2']}]).
** exception error: bad argument
in function ets:select/2
called as ets:select(16400,[{{'$1','$2'},[],['$1','$2']}])
Has my ets table been destroyed? Is this a bug in ets?
Thank you.
The shell process has created the ETS table and is the owner of it. When the owner process dies the ETS table is automatically deleted.
So when you get an exception at 6, the shell process dies so the ETS table is deleted.
Making it private also means that no other process can access it, so even if the table had survived, the new shell would not be able to access it. In this case it is even worse, as the table has been deleted.
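The ownership rule can be checked without crashing the shell. In this sketch a helper process creates a public table and the caller watches what happens to it once that process exits; the module name ets_owner_demo and the message shapes are assumptions made for the example.

```erlang
-module(ets_owner_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% The spawned process creates the table, so it is the owner.
    {Owner, Ref} = spawn_monitor(fun() ->
        Tab0 = ets:new(x, [public]),
        Parent ! {tab, Tab0},
        receive stop -> ok end
    end),
    Tab = receive {tab, T} -> T end,
    true = ets:insert(Tab, {a, "lion"}),
    SizeBefore = ets:info(Tab, size),
    %% Let the owner exit and wait until it is really gone.
    Owner ! stop,
    receive {'DOWN', Ref, process, Owner, _Reason} -> ok end,
    %% ets:info/1 returns undefined for a table that no longer exists.
    {SizeBefore, ets:info(Tab)}.
```

ets_owner_demo:run() should return {1, undefined}: the row was there while the owner lived, and the table disappeared with it, even though the owner exited normally.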
(too big to leave as a comment to thanosQR's correct answer)
If you'd like the table to survive an exception in the shell, you can give it away to another process. For example:
1> Pid = spawn(fun () -> receive foo -> ok end end). % sit and wait for 'foo' message
<0.62.0>
2> Tab = ets:new(x, [public]). % Tab must be public if you plan to give it away and still have access
24593
3> ets:give_away(Tab, Pid, []).
true
4> ets:insert(Tab, {a,1}).
true
5> ets:tab2list(Tab).
[{a,1}]
6> 3=4.
** exception error: no match of right hand side value 4
7> ets:tab2list(Tab). % Tab survives exception
[{a,1}]
8> Pid ! foo. % cause owning process to exit
foo
9> ets:tab2list(Tab). % Tab is now gone
** exception error: bad argument
in function ets:match_object/2
called as ets:match_object(24593,'_')
in call from ets:tab2list/1 (ets.erl, line 323)
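A related option, not used in the session above, is the heir option of ets:new/2: the table is handed to a named heir process automatically when its owner dies, so no explicit give_away is needed. A minimal sketch, assuming a throwaway heir process and the made-up module name ets_heir_demo:

```erlang
-module(ets_heir_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% Long-lived process that inherits the table. It never reads the
    %% 'ETS-TRANSFER' message it gets; it only waits for stop.
    Heir = spawn(fun() -> receive stop -> ok end end),
    %% Short-lived owner that creates the table and then crashes.
    {Owner, Ref} = spawn_monitor(fun() ->
        Tab0 = ets:new(x, [public, {heir, Heir, no_data}]),
        ets:insert(Tab0, {a, 1}),
        Parent ! {tab, Tab0},
        exit(crashed)
    end),
    Tab = receive {tab, T} -> T end,
    receive {'DOWN', Ref, process, Owner, crashed} -> ok end,
    %% The table outlived its original owner and now belongs to Heir.
    Rows = ets:tab2list(Tab),
    InheritedByHeir = ets:info(Tab, owner) =:= Heir,
    Heir ! stop,                 % the table is deleted once Heir exits
    {Rows, InheritedByHeir}.
```

ets_heir_demo:run() should return {[{a,1}], true}. In the shell, the same idea means creating the table with {heir, Pid, Data} pointing at a process that does not die with the shell.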
I've been learning how to use ets, but one thing that has bothered me is that, occasionally*, ets:match throws a bad argument… And, from then on, all subsequent calls (even calls which previously worked) also throw a bad argument:
> ets:match(Tid, { [$r | '$1'] }, 1).
% this match works...
% Then, at some point, this comes up:
** exception error: bad argument
in function ets:match/3
called as ets:match(24589,{[114|'$1']},1)
% And from then on, matches stop working:
> ets:match(Tid, { [$r | '$1'] }, 1).
** exception error: bad argument
in function ets:match/3
called as ets:match(24589,{[114|'$1']},1)
Is there any way to "reset" the ets system so that I can query it (ie, from the shell) again?
*: I haven't been able to reproduce the problem… But it happens fairly often while I'm trying to do "other things".
Although I'm not 100% sure, this thread seems to answer your question. It appears that you're observing this behaviour in the shell. If so, two facts are interacting in a confusing way:
An ets table is deleted as soon as its owning process dies.
The Erlang shell dies whenever it receives an exception and is silently restarted.
So, when you get the first exception, the current shell process dies causing the ets table to be deleted, and then a new shell process is started for you. Now, when you try another ets:match, it fails because the table no longer exists.
Dale already told you what happens. You can confirm that by calling self() in the shell every now and then.
As a quick workaround you can spawn another process to create a public table for you. Then that table won't die along with your shell.
1> self().
<0.32.0> % shell's Pid
2> spawn(fun() -> ets:new(my_table, [named_table, public]), receive X -> ok end end).
<0.35.0> % the spawned process's Pid
3> ets:insert(my_table, {a, b}).
true
Now cause an exception and check that the table did indeed survive.
4> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
5> self().
<0.38.0> % shell's reborn, with a different Pid
6> ets:insert(my_table, {c, d}).
true
7> ets:tab2list(my_table).
[{c,d},{a,b}] % table did survive the shell restart
To delete the table, just send something to your spawned process:
8> pid(0,35,0) ! bye_bye.
bye_bye
9> ets:info(my_table).
undefined