Which one should I use exit, error or throw? - erlang

Could you tell me when to use throw, exit and error?
1> catch throw ({aaa}).
{aaa}
2> catch exit ({aaa}).
{'EXIT',{aaa}}
3> catch gen_server:call(aaa,{aaa}).
{'EXIT',{noproc,{gen_server,call,[aaa,{aaa}]}}}
4> catch exit("jaj").
{'EXIT',"jaj"}

There are 3 classes which can be caught with a try ... catch: throw, error and exit.
throw is generated using throw/1 and is intended to be used for non-local returns and does not generate an error unless it is not caught (when you get a nocatch error).
error is generated when the system detects an error. You can explicitly generate an error using error/1. The system also includes a stacktrace in the generated error value, for example {badarg,[...]}.
exit is generated using exit/1 and is intended to signal that this process is to die.
The difference between error/1 and exit/1 is not that great; it is more about intention, which the stacktrace generated by errors reinforces.
The difference between them is actually more noticeable when doing catch ...: when throw/1 is used then the catch just returns the thrown value, as is expected from a non-local return; when an error/1 is used then the catch returns {'EXIT',Reason} where Reason contains the stacktrace; while from exit/1 catch also returns {'EXIT',Reason} but Reason only contains the actual exit reason. try ... catch looks like it equates them, but they are/were very different.
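To make the three classes concrete, here is a minimal sketch using try ... catch with a Class:Reason pattern. The module name exc_classes and the helper classify/1 are made up for this illustration.

```erlang
-module(exc_classes).
-export([classify/1, demo/0]).

%% Run Fun and report how it finished.
%% Class is one of throw | error | exit.
classify(Fun) when is_function(Fun, 0) ->
    try Fun() of
        Value -> {ok, Value}
    catch
        Class:Reason -> {Class, Reason}
    end.

demo() ->
    [classify(fun() -> throw(aaa) end),      % {throw, aaa}
     classify(fun() -> error(badarg) end),   % {error, badarg}
     classify(fun() -> exit(aaa) end),       % {exit, aaa}
     classify(fun() -> 1 + 1 end)].          % {ok, 2}
```

With try ... catch the class is explicit in the pattern, whereas the old catch expression folds error and exit into {'EXIT', Reason}.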

[UPDATED]
I glossed over the important difference between throw and error, pointed out by Robert Virding. This edit is just for the record!
error is to be used where one would use throw in other languages. Your code detects an error in a running process and signals an exception with error/1. The same process catches it (possibly higher up in the stack), and the error is handled within that process. error always brings a stacktrace with it.
throw is to be used not to signal an error, but just to return a value from a deeply nested function.
Since it unwinds the stack, calling throw returns the thrown value to the place it was caught. As in the case of error, we're catching stuff that was thrown, only what was thrown wasn't an error but rather just a value passed up the stack. This is why throw does not bring with it a stacktrace.
As a contrived example, suppose we wanted to implement an exists function for lists (similar to what lists:any does) as an exercise, without doing the recursion ourselves and using just lists:foreach. throw could be used here:
exists(P, List) ->
    F = fun(X) ->
            case P(X) of
                true -> throw(true);
                Whatever -> Whatever
            end
        end,
    try lists:foreach(F, List) of
        ok -> false
    catch
        true -> true
    end.
A value thrown but not caught is treated as an error: a nocatch exception will be generated.
EXIT is to be signaled by a process when it 'gives up'. The parent process handles the EXIT, while the child process just dies. This is the Erlang let-it-crash philosophy.
So exit/1's EXIT is not to be caught within the same process, but left to the parent. error/1's errors are local to the process - i.e., a matter of what happens and how it is handled by the process itself; throw/1 is used for control flow across the stack.
[UPDATE]
This tutorial explains it well: http://learnyousomeerlang.com/errors-and-exceptions
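Note there is also an exit/2, which is called with the Pid of the process to send the exit signal to.
exit/1 takes no Pid: it ends the calling process, and the exit is seen by the processes linked to it (typically the parent).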
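A small sketch of exit/2 in use, assuming an idle worker that does not trap exits. The module name exit2_demo and the function stop_worker/0 are made up for this illustration.

```erlang
-module(exit2_demo).
-export([stop_worker/0]).

%% Spawn an idle worker, monitor it, and terminate it with exit/2.
%% Returns the reason reported by the monitor.
stop_worker() ->
    {Pid, Ref} = spawn_monitor(fun() -> receive after infinity -> ok end end),
    exit(Pid, shutdown),          % send an exit signal to Pid
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.
```

The caller keeps running; only the process identified by Pid receives the signal.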

I'm new to Erlang, but here's how I think about what these things are, their differences, what they're used for, etc.:
throw: a condition that should be handled locally (i.e. within the current process). E.g. the caller is looking for an element in a collection, but does not know if the collection actually contains such an element; then, the callee could throw if such an element is not present, and the caller detects absence by using try[/of]/catch. If the caller neglects to do this, then this gets turned into a nocatch error (explained below).
exit: The current process is done. E.g. it has simply finished (in that case, you'd pass normal, which is treated the same as the original function returning), or its operation was cancelled (e.g. it normally loops indefinitely but has just received a shut_down message).
error: the process has done something and/or reached a state that the programmer did not take into account (E.g. 1/0), believes is impossible (E.g. case ... of encounters a value that does not match any case), or some precondition is not met (E.g. input is nonempty). In this case, local recovery doesn't make sense. Therefore, neither throw nor exit is appropriate. Since this is unexpected, a stack trace is part of the Reason.
As you can see, the above list is in escalating order:
throw is for sane conditions that the caller is expected to handle. I.e. handling occurs within the current process.
exit is also sane, but should end the current process simply because the process is done.
error is insane. Something happened that can't reasonably be recovered from (usually a bug?), and local recovery would not be appropriate.
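The "sane condition the caller is expected to handle" case for throw might look like the sketch below. The module lookup_demo, both function names, and the not_found atom are assumptions chosen for the example.

```erlang
-module(lookup_demo).
-export([fetch/2, fetch_or_default/3]).

%% Hypothetical lookup in a list of {Key, Value} pairs
%% that throws not_found for a missing key.
fetch(Key, List) ->
    case lists:keyfind(Key, 1, List) of
        {Key, Value} -> Value;
        false -> throw(not_found)
    end.

%% The caller handles the expected condition locally.
fetch_or_default(Key, List, Default) ->
    try
        fetch(Key, List)
    catch
        throw:not_found -> Default
    end.
```

A caller that forgets the try would see the throw turn into a nocatch error, as described above.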
vs. other languages:
throw is analogous to the way checked exceptions are used in Java, whereas error is used in a manner more analogous to unchecked exceptions. Checked exceptions are exceptions you want the caller to handle. Java requires you to either wrap calls in try/catch or declare that your method throws such exceptions. Unchecked exceptions, in contrast, generally propagate to the outermost caller.
exit does not have a good analog in more "conventional" languages like Java, C++, Python, JavaScript, Ruby, etc. exit is vaguely like an uber-return: instead of returning at the end, you can return from the middle of a function, except you don't just return from the current function, you return from them ALL.
exit Example
serve_good_times() ->
    receive
        {top_of_the_mornin, Sender} ->
            Sender ! and_the_rest_of_the_day_to_yourself;
        {you_suck, Sender} ->
            Sender ! take_a_chill_pill;
        % More cases...
        shut_down ->
            exit(normal)
    end,
    serve_good_times().
Since serve_good_times calls itself after almost all messages, the programmer has decided that we don't want to repeat that call in every receive case. Therefore, she has put that call after the receive. But then, what if serve_good_times decides to stop calling itself? This is where exit comes to the rescue. Passing normal to exit causes the process to terminate just as though the last function call has returned.
As such, it's generally inappropriate to call exit in a general purpose library, like lists. It's none of the library's business whether the process should end; that should be decided by application code.
What About Abnormal exit?
This matters if another process (the "remote" process) is linked to the "local" process that calls exit (and process_flag(trap_exit, true) was not called): Just like the last function returning, exit(normal) does not cause the remote process to exit. But if the local process makes an exit(herp_derp) call, then the remote process also exits with Reason=herp_derp. Of course, if the remote process is linked to yet more processes, they also get an exit signal with Reason=herp_derp. Therefore, non-normal exits result in a chain reaction.
Let's take a look at this in action:
1> self().
<0.32.0>
2> spawn_link(fun() -> exit(normal) end).
<0.35.0>
3> self().
<0.32.0>
4>
4>
4> spawn_link(fun() -> exit(abnormal) end).
** exception exit: abnormal
5> self().
<0.39.0>
6>
The first process that we spawned did not cause the shell to exit (we can tell, because self returned the same pid before and after spawn_link). BUT the second process did cause the shell to exit (and the system replaced the shell process with a new one).
Of course, if the remote process uses process_flag(trap_exit, true) then it just gets a message, regardless of whether the local process passes normal or something else to exit. Setting this flag stops the chain reaction.
6> process_flag(trap_exit, true).
false
7> spawn_link(fun() -> exit(normal) end).
<0.43.0>
8> self().
<0.39.0>
9> flush().
Shell got {'EXIT',<0.43.0>,normal}
ok
10>
10>
10> spawn_link(fun() -> exit(abnormal) end).
<0.47.0>
11> self().
<0.39.0>
12> flush().
Shell got {'EXIT',<0.47.0>,abnormal}
Recall that I said that exit(normal) is treated like the original function returning:
13> spawn_link(fun() -> ok end).
<0.51.0>
14> flush().
Shell got {'EXIT',<0.51.0>,normal}
ok
15> self().
<0.39.0>
What do you know: the same thing happened as when exit(normal) was called. Wonderful!

Related

Erlang: spawn a process and wait for termination without using `receive`

In Erlang, can I call some function f (BIF or not) whose job is to spawn a process, run the function argf I provided, and not "return" until argf has "returned", all without using a receive clause? (The reason is that f will be invoked in a gen_server, and I don't want to pollute the gen_server's mailbox.)
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F), but doesn't return until 10 seconds has passed`
%% ...
The only way to communicate between processes is message passing (of course you could poll for a specific key in an ETS table or a file, but I don't like this).
If you use a spawn_monitor function in f/1 to start the F process and then have a receive block only matching the possible system messages from this monitor:
f(F) ->
    {_Pid, MonitorRef} = spawn_monitor(F),
    receive
        {_Tag, MonitorRef, _Type, _Object, _Info} -> ok
    end.
you will not clutter your gen_server's mailbox. The example is the minimum code; you can add a timeout (fixed or passed as a parameter), execute some code on normal or error completion, and so on.
You will not "pollute" the gen_server's mailbox if you spawn and wait for the message before you return from the call or cast. A more serious problem with this may be that you will block the gen_server while you are waiting for the other process to terminate. A way around this is to not wait explicitly but to return from the call/cast and then, when the completion message arrives, handle it in handle_info/2 and do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process then you can delay returning the value to the original call from the handle_info handling the process termination message.
Note that however you do it a gen_server:call has a timeout value, either implicit or explicit, and if no reply is returned it generates an error in the calling process.
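The delayed-reply idea can be sketched as follows. This is only an illustration under assumptions: the module name deferred_reply, the run/2 API, the 15 second call timeout, and the convention that the worker reports its result through its exit reason are all choices made for the example, not something from the answers above.

```erlang
-module(deferred_reply).
-behaviour(gen_server).
-export([start_link/0, run/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Run Fun in a worker process; only the caller blocks, not the server.
run(Server, Fun) ->
    gen_server:call(Server, {run, Fun}, 15000).

init([]) ->
    {ok, #{}}.   % state: monitor reference => waiting caller

handle_call({run, Fun}, From, Pending) ->
    %% The worker reports its result through its exit reason
    %% (a convention chosen for this sketch).
    {_Pid, Ref} = spawn_monitor(fun() -> exit({done, Fun()}) end),
    {noreply, Pending#{Ref => From}}.   % no reply yet

handle_cast(_Msg, Pending) ->
    {noreply, Pending}.

handle_info({'DOWN', Ref, process, _Pid, Reason}, Pending) ->
    case maps:take(Ref, Pending) of
        {From, Rest} ->
            Reply = case Reason of
                        {done, Result} -> {ok, Result};
                        Other -> {error, Other}
                    end,
            gen_server:reply(From, Reply),   % the delayed reply
            {noreply, Rest};
        error ->
            {noreply, Pending}
    end;
handle_info(_Info, Pending) ->
    {noreply, Pending}.
```

Because handle_call/3 returns {noreply, ...}, the server stays free to handle other requests while the worker runs, and the caller is still subject to the gen_server:call timeout.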
The main way to communicate with a process inside the Erlang VM is message passing with the erlang:send/2 or erlang:send/3 functions (alias !). But you can "hack" Erlang and communicate between processes in several other ways.
You can use erlang:link/1 to communicate the state of a process; it is mainly used to learn that a process is dying, has ended, or that something went wrong (an exception or a throw).
You can use erlang:monitor/2, which is similar to erlang:link/1 except that the notification goes directly into the process mailbox as a message.
You can also hack Erlang and use some internal mechanism (shared ETS/DETS/Mnesia tables) or external methods (a database or similar). This is clearly not recommended and "destroys" the Erlang philosophy... but you can do it.
It seems your problem can be solved with the supervisor behaviour. supervisor supports several strategies to control supervised processes:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify or create your own supervisor strategy from scratch or base on supervisor_bridge.
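For reference, a supervisor using one of the strategies above might be declared roughly like this. The names demo_sup and demo_worker are hypothetical; demo_worker is assumed to be a module exporting start_link/0, and the restart limits are arbitrary.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => one_for_one,  % restart only the failed child
                 intensity => 3,           % at most 3 restarts...
                 period => 10},            % ...within 10 seconds
    %% demo_worker is a hypothetical module exporting start_link/0.
    Child = #{id => demo_worker,
              start => {demo_worker, start_link, []},
              restart => permanent,
              type => worker},
    {ok, {SupFlags, [Child]}}.
```

Swapping the strategy value for one_for_all or rest_for_one changes which siblings are restarted along with the failed child.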
So, to summarize, you need a process that waits for one or more processes to terminate. This behaviour is supported natively by OTP, but you can also create your own model. For that, you need to share the status of every started process, using a cache or a database, or at the time your process is spawned. Something like this:
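Parent = self(),  % capture the parent's pid before spawning
Fun = fun
    MyFun(ParentProcess, {result, Data})
      when is_pid(ParentProcess) ->
        ParentProcess ! {self(), Data};
    MyFun(ParentProcess, MyData)
      when is_pid(ParentProcess) ->
        % do something; the work must eventually produce {result, Data}
        MyData2 = {result, MyData},
        MyFun(ParentProcess, MyData2)
end,
spawn(fun() -> Fun(Parent, InitData) end).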
EDIT: I forgot to add an example without send/receive. I use an ETS table to store every result from the lambda function. The table is passed in when we spawn the process. To get the result, we can select data from this table. Note that the key of the row is the process id of the spawned process.
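%% Ets must be a public table so that the spawned process may write to it.
%% No is_integer/1 guard: on OTP 20 and later table identifiers are references.
spawner(Ets, Fun, Args)
  when is_function(Fun, 2) ->
    spawn(fun() -> Fun(Ets, Args) end).

Fun = fun
    F(Ets, {result, Data}) ->
        ets:insert(Ets, {self(), Data});
    F(Ets, Data) ->
        % do something here, then hand over the result
        Data2 = {result, Data},
        F(Ets, Data2)
end.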

Link two process in Erlang?

To exchange data, it becomes important to link the processes first. The following code does the job of linking two processes.
start_link(Name) ->
    gen_fsm:start_link(?MODULE, [Name], []).
My Question : which are the two processes being linked here?
In your example, the process that called start_link/1 and the process being started as (?MODULE, Name, Args).
It is a mistake to think that two processes need to be linked to exchange data. A link ties together the fate of the two processes. If one dies, the other dies, unless a system process is the one that starts the link (a "system process" means one that is trapping exits). This probably isn't what you want. If you are trying to avoid a deadlock, or want to do something other than just time out during synchronous messaging when the process you are sending a message to dies before responding, consider something like this:
ask(Proc, Request, Data, Timeout) ->
    Ref = monitor(process, Proc),
    Proc ! {self(), Ref, {ask, Request, Data}},
    receive
        {Ref, Res} ->
            demonitor(Ref, [flush]),
            Res;
        {'DOWN', Ref, process, Proc, Reason} ->
            some_cleanup_action(),
            {fail, Reason}
    after
        Timeout ->
            demonitor(Ref, [flush]),  % drop the monitor so no stale 'DOWN' arrives later
            {fail, timeout}
    end.
If you are just trying to spawn a worker that needs to give you an answer, you might want to consider using spawn_monitor instead and using its {pid(), reference()} return as the message you're listening for in response.
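A rough sketch of the spawn_monitor variant follows. The module name worker_call and the {Pid, Result} reply shape are assumptions made for the example; the worker identifies its answer with its own pid, and the monitor covers the case where it dies first.

```erlang
-module(worker_call).
-export([run/2]).

%% Run Fun in a monitored worker and wait up to Timeout ms for its result.
run(Fun, Timeout) when is_function(Fun, 0) ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() -> Parent ! {self(), Fun()} end),
    receive
        {Pid, Result} ->
            demonitor(Ref, [flush]),
            {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} ->
            {fail, Reason}
    after Timeout ->
        demonitor(Ref, [flush]),
        exit(Pid, kill),          % give up on the worker
        {fail, timeout}
    end.
```

One caveat of this sketch: a worker that answers just as the timeout fires can leave a stray {Pid, Result} message in the caller's mailbox.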
As I mentioned above, the process starting the link won't die if it is trapping exits, but you really want to avoid trapping exits in most cases. As a basic rule, use process_flag(trap_exit, true) as little as possible. Getting trap_exit happy everywhere will eventually have structural effects you don't intend, and it's one of the few things in Erlang that is difficult to refactor away from later.
The link is bidirectional, between the process which is calling the function start_link(Name) and the new process created by gen_fsm:start_link(?MODULE, [Name], []).
A called function is executed in the context of the calling process.
A new process is created by a spawn function. You should find it in the gen_fsm:start_link/3 code.
Once a link is created, if one process exits for any reason other than normal, the linked process will die as well, unless it has set process_flag(trap_exit, true), in which case it will receive the message {'EXIT',FromPid,Reason}, where FromPid is the Pid of the process that died and Reason is the reason for termination.

error:spawn a process with only argument which is a recursive function that calls io:format

I use the EUnit module and include "eunit/include/eunit.hrl". In a test function I call spawn/1 with func/0 as the argument to spawn a new process, and the new process calls io:format/2. func/0 is a recursive function like this:
func() ->
    A = 2,
    io:format("#######~p~n", [A]),
    timer:sleep(1000),
    func().
Then
10> bt:test().
All 2 tests passed.
ok
11>
=ERROR REPORT==== 19-Jun-2013::19:50:54 ===
Error in process <0.122.0> with exit value: {terminated,[{io,format,[<0.121.0>,"
#######~p~n",[2]],[]},{bt,func,0,[{file,"bt.erl"},{line,6}]}]}
What's wrong and what should I do?
If I understand correctly, the problem is that you are spawning a process running a never-ending function func(), but when the EUnit process terminates it probably closes standard output.
This makes the process issuing io:format() to exit (raises an exception). Indeed the ERROR REPORT mentions exactly this function.
My suggestion is to review the need of spawning a function that never ends.
The way func() is written, it recurses infinitely. It's basically:
func() ->
    func().
That never returns (it keeps running) and is probably the reason for the termination by EUnit.

Erlang. Correct way to stop process

Good day, I have the following setup for my little service:
-module(mrtask_net).
-export([start/0, stop/0, listen/1]).
-define(SERVER, mrtask_net).

start() ->
    Pid = spawn_link(fun() -> ?MODULE:listen(4488) end),
    register(?SERVER, Pid),
    Pid.

stop() ->
    exit(?SERVER, ok).
....
And here is the repl excerpt:
(emacs@rover)83> mrtask_net:start().
<0.445.0>
(emacs@rover)84> mrtask_net:stop().
** exception error: bad argument
     in function  exit/2
        called as exit(mrtask_net,ok)
     in call from mrtask_net:stop/0
(emacs@rover)85>
As you see, stopping the process produces an error, although the process does stop.
What does this error mean, and how do I make this clean?
Not being an Erlang programmer, and going just from the documentation of exit, I'd say that exit requires a process id as its first argument, whereas you are passing an atom (?SERVER) to it.
Try
exit(whereis(?SERVER), ok).
instead (whereis returns the process id associated with a name).
You need to change the call to exit/2 as @MartinStettner has pointed out. The reason the process stops anyway is that you have started it with spawn_link. Your process is then linked to the shell process. When you called mrtask_net:stop(), the error caused the shell process to crash, which then caused your process to crash as they were linked. A new shell process is then automatically started so you can keep working with the shell. You generally do want to start your servers with spawn_link, but it can cause confusion when you are testing them from the shell and they just "happen" to die.
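Putting the two answers together, a stop function that resolves the name first and tolerates a missing process could look like this sketch. The module named_proc is made up for the illustration, and it deliberately uses spawn rather than spawn_link so that stopping the server from the shell does not take the shell down with it.

```erlang
-module(named_proc).
-export([start/0, stop/0]).
-define(SERVER, named_proc).

start() ->
    %% Placeholder server body: wait until told to stop.
    Pid = spawn(fun() -> receive stop -> ok end end),
    register(?SERVER, Pid),
    Pid.

stop() ->
    case whereis(?SERVER) of
        undefined ->
            {error, not_running};      % nothing registered under the name
        Pid ->
            exit(Pid, shutdown),       % exit/2 needs a pid, not an atom
            ok
    end.
```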
I would suggest you stick with OTP. It really gives you tons of advantages (I can hardly imagine a case where OTP doesn't help).
So, if you want to stop a process in OTP, you should do something like this for a gen_server:
% process1.erl
% In case you get the cast message {stopme, Message}
handle_cast({stopme, _Message}, State) ->
    % you will stop
    {stop, normal, State};
handle_cast(_Msg, State) ->
    % do your stuff here with the message
    {noreply, State}.

% process2.erl
% Here is the code to stop process1
gen_server:cast(Pid, {stopme, "It's time to stop!"}).
You can find more about it here: http://www.erlang.org/doc/man/gen_server.html

Query an Erlang process for its state?

A common pattern in Erlang is the recursive loop that maintains state:
loop(State) ->
    receive
        Msg ->
            NewState = whatever(Msg),
            loop(NewState)
    end.
Is there any way to query the state of a running process with a bif or tracing or something? Since crash messages say "...when state was..." and show the crashed process's state, I thought this would be easy, but I was disappointed that I haven't been able to find a bif to do this.
So, then, I figured using the dbg module's tracing would do it. Unfortunately, I believe because these loops are tail call optimized, dbg will only capture the first call to the function.
Any solution?
If your process is using OTP, it is enough to call sys:get_status(Pid).
The error message you mention is displayed by SASL. SASL is an error reporting daemon in OTP.
The state you are referring to in your example code is just an argument of a tail-recursive function. There is no way to extract it using anything except the tracing BIFs. I guess this would not be a proper solution in production code, since tracing is intended to be used only for debugging purposes.
The proper, industry-tested solution would be to make extensive use of OTP in your project. Then you can take full advantage of SASL error reporting, the rb module to collect these reports, sys to inspect the state of a running OTP-compatible process, proc_lib to make short-lived processes OTP-compliant, etc.
It turns out there's a better answer than all of these, if you're using OTP:
sys:get_state/1
Probably it didn't exist at the time.
It looks like you're making a problem out of nothing. erlang:process_info/1 gives enough information for debugging purposes. If you REALLY need the loop function's arguments, why don't you give them back to the caller in response to one of the special messages that you define yourself?
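A minimal sketch of that "special message" approach, assuming a message shape of {get_state, From, Ref}. The module name counter and the one-second timeout are made up for the example.

```erlang
-module(counter).
-export([start/0, get_state/1, loop/1]).

start() ->
    spawn(?MODULE, loop, [0]).

%% Ask the process for its current loop argument.
get_state(Pid) ->
    Ref = make_ref(),
    Pid ! {get_state, self(), Ref},
    receive
        {Ref, State} -> {ok, State}
    after 1000 ->
        timeout
    end.

loop(State) ->
    receive
        {get_state, From, Ref} ->
            From ! {Ref, State},      % hand the state back to the caller
            loop(State);
        _Msg ->
            loop(State + 1)           % count every other message
    end.
```

The unique reference lets the caller match only the reply to its own request, even if other messages are waiting in its mailbox.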
UPDATE:
Just to clarify terminology: the closest thing to the 'state of the process' at the language level is the process dictionary, the use of which is highly discouraged. It can be queried with erlang:process_info/1 or erlang:process_info/2.
What you actually need is to trace process's local functions calls along with their arguments:
-module(ping).
-export([start/0, send/1, loop/1]).

start() ->
    spawn(?MODULE, loop, [0]).

send(Pid) ->
    Pid ! {self(), ping},
    receive
        pong ->
            pong
    end.

loop(S) ->
    receive
        {Pid, ping} ->
            Pid ! pong,
            loop(S + 1)
    end.
Console:
Erlang (BEAM) emulator version 5.6.5 [source] [smp:2] [async-threads:0] [kernel-poll:false]
Eshell V5.6.5 (abort with ^G)
1> l(ping).
{module,ping}
2> erlang:trace(all, true, [call]).
23
3> erlang:trace_pattern({ping, '_', '_'}, true, [local]).
5
4> Pid = ping:start().
<0.36.0>
5> ping:send(Pid).
pong
6> flush().
Shell got {trace,<0.36.0>,call,{ping,loop,[0]}}
Shell got {trace,<0.36.0>,call,{ping,loop,[1]}}
ok
7>
{status,Pid,_,[_,_,_,_,[_,_,{data,[{_,State}]}]]} = sys:get_status(Pid).
That's what I use to get the state of a gen_server. (Tried to add it as a comment to the reply above, but couldn't get formatting right.)
As far as I know you can't get the arguments passed to a locally called function. I would love for someone to prove me wrong.
-module(loop).
-export([start/0, loop/1]).

start() ->
    spawn_link(fun () -> loop([]) end).

loop(State) ->
    receive
        Msg ->
            loop([Msg|State])
    end.
If we want to trace this module you do the following in the shell.
dbg:tracer().
dbg:p(new,[c]).
dbg:tpl(loop, []).
Using this tracing setting you get to see local calls (the 'l' in tpl means that local calls will be traced as well, not only global ones).
5> Pid = loop:start().
(<0.39.0>) call loop:'-start/0-fun-0-'/0
(<0.39.0>) call loop:loop/1
<0.39.0>
6> Pid ! foo.
(<0.39.0>) call loop:loop/1
foo
As you see, just the calls are included. No arguments in sight.
My recommendation is to base correctness in debugging and testing on the messages sent rather than state kept in processes. I.e. if you send the process a bunch of messages, assert that it does the right thing, not that it has a certain set of values.
But of course, you could also sprinkle some erlang:display(State) calls in your code temporarily. Poor man's debugging.
This is a "one-liner" that can be used in the shell.
sys:get_status(list_to_pid("<0.1012.0>")).
It helps you convert a pid string into a Pid.
