There is an locked door example about gen_fsm in the Elrang Otp System Documentation. I have a question about timeout. I will copy the code here first:
-module(code_lock).
-behaviour(gen_fsm).
-export([start_link/1]).
-export([button/1]).
-export([init/1, locked/2, open/2]).
start_link(Code) ->
gen_fsm:start_link({local, code_lock}, code_lock, lists:reverse(Code), []).
button(Digit) ->
gen_fsm:send_event(code_lock, {button, Digit}).
init(Code) ->
{ok, locked, {[], Code}}.
locked({button, Digit}, {SoFar, Code}) ->
case [Digit|SoFar] of
Code ->
do_unlock(),
{next_state, open, {[], Code}, 30000};
Incomplete when length(Incomplete)<length(Code) ->
{next_state, locked, {Incomplete, Code}};
_Wrong ->
{next_state, locked, {[], Code}}
end.
open(timeout, State) ->
do_lock(),
{next_state, locked, State}.
Here is the question: when the door is opened, if I press the button, the gen_fsm will have an {button, Digit} event at the state open. An error will occurs. But if I add these code after open function:
open(_Event, State) ->
{next_state, open, State}.
Then if I press the button in 30s, the timeout will not be occurs. The door will be opened forever. What should I do?
Thanks.
Update:
I know I could use send_event_after or something like that. But I don't think it is a good idea. Because the state you excepted to handle the message may be changed in a complex application.
For example, if I have a function to lock the door manually after the door opened in 30s. Then locked will handle the timeout message, which is not the excepted behaviour.
You could maintain the remaining timeout in StateData. To do this, add a third item to the tuple:
init(Code) ->
{ok, locked, {[], Code, infinity}}.
You'll need to change locked to set the initial value:
locked({button, Digit}, {SoFar, Code, _Until}) ->
case [Digit|SoFar] of
Code ->
do_unlock(),
Timeout = 30000,
Now = to_milliseconds(os:timestamp()),
Until = Now + Timeout,
{next_state, open, {[], Code, Until}, Timeout};
Incomplete when length(Incomplete)<length(Code) ->
{next_state, locked, {Incomplete, Code, infinity}};
_Wrong ->
{next_state, locked, {[], Code, infinity}}
end.
And, if a button is pressed while open, calculate the new timeout and go around again:
open({button, _Digit}, {_SoFar, _Code, Until} = State) ->
Now = to_milliseconds(os:timestamp()),
Timeout = Until - Now,
{next_state, open, State, Timeout};
You'll also need the following helper function:
to_milliseconds({Me, S, Mu}) ->
(Me * 1000 * 1000 * 1000) + (S * 1000) + (Mu div 1000).
You should be specifying a timeout at the open function "open(_Event, State)"
Since the next state is proceeded without timeout.. the door will remain open forever and no where a timeout occurs..
The newly defined function should be
open(_Event, State) ->
{next_state, open, State, 30000}. %% State should be re-initialized
Using the fsm timeout, it is not possible - as far as I know - to avoid the re-initialization of it:
If you don't specify a new timeout when you skip the event while the door is open, it will remain open forever, as you notice.
If you specify one, it will restart from the beginning.
If none of these solutions satisfy you, you can use an external process to create the timeout:
-module(code_lock).
-behaviour(gen_fsm).
-export([start_link/1]).
-export([button/1,stop/0]).
-export([init/1, locked/2, open/2,handle_event/3,terminate/3]).
start_link(Code) ->
gen_fsm:start_link({local, code_lock}, code_lock, lists:reverse(Code), []).
button(Digit) ->
gen_fsm:send_event(code_lock, {button, Digit}).
stop() ->
gen_fsm:send_all_state_event(code_lock, stop).
init(Code) ->
{ok, locked, {[], Code}}.
locked({button, Digit}, {SoFar, Code}) ->
case [Digit|SoFar] of
Code ->
do_unlock(),
timeout(10000,code_lock),
{next_state, open, {[], Code}};
Incomplete when length(Incomplete)<length(Code) ->
{next_state, locked, {Incomplete, Code}};
_Wrong ->
{next_state, locked, {[], Code}}
end.
open(timeout, State) ->
do_lock(),
{next_state, locked, State};
open(_, State) ->
{next_state, open, State}.
handle_event(stop, _StateName, StateData) ->
{stop, normal, StateData}.
terminate(normal, _StateName, _StateData) ->
ok.
do_lock() -> io:format("locking the door~n").
do_unlock() -> io:format("unlocking the door~n").
timeout(X,M) ->
spawn(fun () -> receive
after X -> gen_fsm:send_event(M,timeout)
end
end).
There are a bunch of functions in the module timer to do that, preferable to my custom example.
maybe a better usage of the Fsm timeout should be in the lock state:
wait for the first digit without timeout
a digit is entered and code is complete -> test it and continue without timeout (lock or open depending on code entered)
a digit is entered and code is not complete-> store it and continue with timeout
if an unexpected event occurs -> restart from begining without timeout
if timeout barks, restart from begining without timeout
EDIT:
to Bin Wang: what you say in your update is correct, but you cannot avoid to manage this situation. I don't know any built in function that cover your use case. To satisfy it you will need to manage the unexpected timeout message in the lock state, but to avoid multiple timeout running, you will need also to stop the current one before to go to lock state. Note that this does not prevent you to manage the timeout message in lock state, because there is a race between the message to stop the timer and the timeout itself. I wrote for one of my application a general purpose apply_after function that can be canceled, stopped and resumed:
applyAfter_link(T, F, A) ->
V3 = time_ms(),
spawn_link(fun () -> applyAfterp(T, F, A, V3) end).
applyAfterp(T, F, A, Time) ->
receive
cancel -> ok;
suspend when T =/= infinity ->
applyAfterp(infinity, F, A, T + Time - time_ms());
suspend ->
applyAfterp(T, F, A, Time);
resume when T == infinity ->
applyAfterp(Time, F, A, time_ms());
resume ->
Tms = time_ms(), applyAfterp(T + Time - Tms, F, A, Tms)
after T ->
%% io:format("apply after time ~p, function ~p, arg ~p , stored time ~p~n",[T,F,A,Time]),
catch F(A)
end.
time_us() ->
{M, S, U} = erlang:now(),
1000000 * (1000000 * M + S) + U.
time_ms() -> time_us() div 1000.
You will need to sore the Pid of the timeout process in the FSM state.
Related
Here is an example trace where I'm able to call erlang:monitor/2 on the same Pid:
1> Loop = fun F() -> F() end.
#Fun<erl_eval.30.99386804>
2> Pid = spawn(Loop).
<0.71.0>
3> erlang:monitor(process, Pid).
#Ref<0.2485499597.1470627842.126937>
4> erlang:monitor(process, Pid).
#Ref<0.2485499597.1470627842.126942>
5> erlang:monitor(process, Pid).
#Ref<0.2485499597.1470627842.126947>
The expressions returned by instruction #4 and #5 are different than #3, meaning that it is possible to create multiple monitor references between the current process and Pid. Is there a practical case where you would need or use multiple monitor references to the same process?
I would expect this to return the same reference (returning a new one would perhaps imply that the old one had failed/crashed), following the same logic that exists for link/1.
Imagine you use third party library which does this (basically what OTP *:call/* functions does):
call(Pid, Request) ->
call(Pid, Request, ?DEFAULT_TIMEOUT).
call(Pid, Request, Timeout) ->
MRef = erlang:monitor(process, Pid),
Pid ! {call, self(), MRef, Request},
receive
{answer, MRef, Result} ->
erlang:demonitor(Mref, [flush]),
{ok, Result};
{'DOWN', MRef, _, _, Info} ->
{error, Info}
after Timeout ->
erlang:demonitor(MRef, [flush]),
{error, timeout}
end.
and then you use it in your code where you would monitor the same process Pid and then call function call/2,3.
my_fun1(Service) ->
MRef = erlang:monitor(process, Service),
ok = check_if_service_runs(MRef),
my_fun2(Service),
mind_my_stuf(),
ok = check_if_service_runs(MRef),
erlang:demonitor(MRef, [flush]),
return_some_result().
check_if_service_runs(MRef) ->
receive
{'DOWN', MRef, _, _, Info} -> {down, Info}
after 0 -> ok
end.
my_fun2(S) -> my_fun3(S).
% and a many layers of other stuff and modules
my_fun3(S) -> call(S, hello).
What a nasty surprise it would be if erlang:monitor/2,3 would always return the same reference and if erlang:demonitor/1,2 would remove your previous monitor. It would be a source of ugly and unsolvable bugs. You should start to think that there are libraries, other processes, your code is part of a huge system and Erlang was made by experienced people who thought it through. Maintainability is key here.
Try to use OTP-style in project and got one OTP-interface question. What solution is more popular/beautiful?
What I have:
web-server with mochiweb
one process, what spawns many (1000-2000) children.
Children contain state (netflow-speed). Process proxies messages to children and create new children, if need.
In mochiweb I have one page with speed of all actors, how whey made:
nf_collector ! {get_abonents_speed, self()},
receive
{abonents_speed_count, AbonentsCount} ->
ok
end,
%% write http header, chunked
%% and while AbonentsCount != 0, receive speed and write http
This is not-opt style, how i can understand. Solutions:
In API synchronous function get all requests with speed and return list with all speeds. But I want write it to client at once.
One argument of API-function is callback:
nf_collector:get_all_speeds(fun (Speed) -> Resp:write_chunk(templater(Speed)) end)
Return iterator:
One of results of get_all_speeds will be function with receive-block. Every call of it will return {ok, Speed}, at the end it return {end}.
get_all_speeds() ->
nf_collector ! {get_abonents_speed, self()},
receive
{abonents_speed_count, AbonentsCount} ->
ok
end,
{ok, fun() ->
create_receive_fun(AbonentsCount)
end}.
create_receive_fun(0)->
{end};
create_receive_fun(Count)->
receive
{abonent_speed, Speed} ->
Speed
end,
{ok, Speed, create_receive_fun(Count-1)}.
Spawn your 'children' from a supervisor:
-module(ch_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_child/1]).
start_link() -> supervisor:start_link({local, ?MODULE}, ?MODULE, []).
init([]) -> {ok, {{simple_one_for_one}, [{ch, {ch, start_link, []}, transient, 1000, worker, [ch]}]}}.
start_child(Data) -> supervisor:start_child(?MODULE, [Data]).
Start them with ch_sup:start_child/1 (Data is whatever).
Implement your children as a gen_server:
-module(ch).
-behaviour(gen_server).
-record(?MODULE, {speed}).
...
get_speed(Pid, Timeout) ->
try
gen_server:call(Pid, get, Timeout)
catch
exit:{timeout, _} -> timeout;
exit:{noproc, _} -> died
end
.
...
handle_call(get, _From, St) -> {reply, {ok, St#?MODULE.speed}, St} end.
You can now use the supervisor to get the list of running children and query them, though you have to accept the possibility of a child dying between getting the list of children and calling them, and obviously a child could for some reason be alive but not respond, or respond with an error, etc.
The get_speed/2 function above returns either {ok, Speed} or died or timeout. It remains for you to filter appropriately according to your applications needs; easy with a list comprehension, here's a few.
Just the speeds:
[Speed || {ok, Speed} <- [ch:get_speed(Pid, 1000) || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)
]
]].
Pid and speed tuples:
[{Pid, Speed} || {Pid, {ok, Speed}} <-
[{Pid, ch:get_speed(Pid, 1000)} || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)]
]
].
All results, including timeouts and 'died' results for children that died before you got to them:
[{Pid, Any} || {Pid, Any} <-
[{Pid, ch:get_speed(Pid, 1000)} || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)]
]
].
In most situations you almost certainly don't want anything other than the speeds, because what are you going to do about deaths and timeouts? You want those that die to be respawned by the supervisor, so the problem is more or less fixed by the time you know about it, and timeouts, as with any fault, are a separate problem, to be dealt with in whatever way you see fit... There's no need to mix the fault fixing logic with the data retrieval logic though.
Now, the problem with all these, which I think you were getting at in your post, but I'm not quite sure, is that the timeout of 1000 is for each call, and each call is synchronous one after the other, so for 1000 children with a 1 second timeout, it could take 1000 seconds to produce no results. Making time timeout 1ms might be the answer, but to do it properly is a bit more complicated:
get_speeds() ->
ReceiverPid = self(),
Ref = make_ref(),
Pids = [Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)],
lists:foreach(
fun(Pid) -> spawn(
fun() -> ReceiverPid ! {Ref, ch:get_speed(Pid, 1000)} end
) end,
Pids),
receive_speeds(Ref, length(Pids), os_milliseconds(), 1000)
.
receive_speeds(_Ref, 0, _StartTime, _Timeout) ->
[];
receive_speeds(Ref, Remaining, StartTime, Timeout) ->
Time = os_milliseconds(),
TimeLeft = Timeout - Time + StartTime,
receive
{Ref, acc_timeout} ->
[];
{Ref, {ok, Speed}} ->
[Speed | receive_speeds(Ref, Remaining-1, StartTime, Timeout)];
{Ref, _} ->
receive_speeds(Ref, Remaining-1, StartTime, Timeout)
after TimeLeft ->
[]
end
.
os_milliseconds() ->
{OsMegSecs, OsSecs, OsMilSecs} = os:timestamp(),
round(OsMegSecs*1000000 + OsSecs + OsMilSecs/1000)
.
Here each call is spawned in a different process and the replies collected, until the 'master timeout' or they have all been received.
Code has largely been cut-n-pasted from various works I have lying round, and edited manually and by search replace, to anonymise it and remove surplus, so it's probably mostly compilable quality, but I don't promise I didn't break anything.
I'm making this call:
add(Login, Pass, Role) ->
gen_server:call(?SERVER, {add, Login, Pass, Role}).
and I expect it to match with:
handle_call(State, {add, Login, Pass, Role}) ->
io:format("add ~n"),
Db = State#state.db,
case lists:keyfind(Login, 1, Db) of
false->
io:format("add - reg new ~n"),
{reply, registered, State#state{db=[{Login, erlang:md5(Pass), Role, ""}|Db]}};
{Key, Result}->
{reply, invalid_params, Db}
end.
but it always goes to:
handle_call(_Request, _From, State) ->
io:format("undef ~n"),
Reply = ok,
{reply, Reply, State}.
What's wrong?
The behaviour seems valid,
handle_call has such spec:
-spec(handle_call(Request :: term(), From :: {pid(), Tag :: term()},
State :: #state{}) ->
{reply, Reply :: term(), NewState :: #state{}} |
{reply, Reply :: term(), NewState :: #state{}, timeout() | hibernate} |
{noreply, NewState :: #state{}} |
{noreply, NewState :: #state{}, timeout() | hibernate} |
{stop, Reason :: term(), Reply :: term(), NewState :: #state{}} |
{stop, Reason :: term(), NewState :: #state{}}).
If you can take a look here
http://erlang.org/doc/man/gen_server.html#Module:handle_call-3
Also, for otp default behaviours, it would be the best, as a start, to use templates. For gen_server eg https://gist.github.com/kevsmith/1211350
Cheers!
In a module using the gen_server behaviour, the handle_call callback function should take three arguments. However, you have defined two different functions, handle_call/2 and handle_call/3. (In Erlang, functions that have the same name but take different numbers of arguments are considered different functions.)
Since the gen_server module only looks for handle_call/3 and ignores handle_call/2, your "undef" function is always called.
To fix this, change the function to take an (ignored) second argument, and put the request first and the state last:
handle_call({add, Login, Pass, Role}, _From, State) ->
and change the end. to end; — . separates different functions, while ; separates different clauses of the same function.
Quis custodiet ipsos custodes? -- (Decimus Iunius Iuvenalis)
I have the following setup:
On one node ('one#erlang.enzo') a server process is running which has a watchdog running one another node ('two#erlang.enzo'). When the server starts up, it will start its watchdog on the remote node. When the server exits ungracefully, the watchdog starts the server again. When the watchdog exits, the server starts it again.
The server is started as part of the runlevel after the network is up.
The server also monitors the remote node and starts a watchdog as soon as it (i.e. the node) comes online. Now connection losses between server and watchdog can have two reasons: First, the network may go down; second, the node may crash or be killed.
My code seems to work, but I have the slight suspicion that the following is happening:
When the watchdog node is shut down (or killed or crashed) and is restarted, the server correctly restarts its watchdog.
But when the network fails and the watchdog node keeps running, the server starts a new watchdog when connection is reestablished and leaves one zombie watchdog behind.
My questions are:
(A) Do I create zombies?
(B) In the case of a network loss, how can the server check if the watchdog is still alive (and vice versa)?
(C) If B is possible, how can I reconnect the old server and the old watchdog?
(D) What other major (and minor) flaws do you, distinguished reader, spot in my setup?
EDIT: The die and kill_dog messages are for faking ungraceful exits and won't make it beyond debugging.
Here goes the code:
-module (watchdog).
-compile (export_all).
init () ->
io:format ("Watchdog: Starting # ~p.~n", [node () ] ),
process_flag (trap_exit, true),
loop ().
loop () ->
receive
die -> 1 / 0;
{'EXIT', _, normal} ->
io:format ("Watchdog: Server shut down.~n");
{'EXIT', _, _} ->
io:format ("Watchdog: Restarting server.~n"),
spawn ('one#erlang.enzo', server, start, [] );
_ -> loop ()
end.
-module (server).
-compile (export_all).
start () ->
io:format ("Server: Starting up.~n"),
register (server, spawn (fun init/0) ).
stop () ->
whereis (server) ! stop.
init () ->
process_flag (trap_exit, true),
monitor_node ('two#erlang.enzo', true),
loop (down, none).
loop (Status, Watchdog) ->
{NewStatus, NewWatchdog} = receive
die -> 1 / 0;
stop -> {stop, none};
kill_dog ->
Watchdog ! die,
{Status, Watchdog};
{nodedown, 'two#erlang.enzo'} ->
io:format ("Server: Watchdog node has gone down.~n"),
{down, Watchdog};
{'EXIT', Watchdog, noconnection} ->
{Status, Watchdog};
{'EXIT', Watchdog, Reason} ->
io:format ("Server: Watchdog has died of ~p.~n", [Reason] ),
{Status, spawn_link ('two#erlang.enzo', watchdog, init, [] ) };
_ -> {Status, Watchdog}
after 2000 ->
case Status of
down -> checkNode ();
up -> {up, Watchdog}
end
end,
case NewStatus of
stop -> ok;
_ -> loop (NewStatus, NewWatchdog)
end.
checkNode () ->
net_adm:world (),
case lists:any (fun (Node) -> Node =:= 'two#erlang.enzo' end, nodes () ) of
false ->
io:format ("Server: Watchdog node is still down.~n"),
{down, none};
true ->
io:format ("Server: Watchdog node has come online.~n"),
monitor_node ('two#erlang.enzo', true),
Watchdog = spawn_link ('two#erlang.enzo', watchdog, init, [] ),
{up, Watchdog}
end.
Using global module to register watchdog should prevent your concern:
watchdog.erl:
-module (watchdog).
-compile (export_all).
init () ->
io:format ("Watchdog: Starting # ~p.~n", [node () ] ),
process_flag (trap_exit, true),
global:register_name (watchdog, self ()),
loop ().
loop () ->
receive
die -> 1 / 0;
{'EXIT', _, normal} ->
io:format ("Watchdog: Server shut down.~n");
{'EXIT', _, _} ->
io:format ("Watchdog: Restarting server.~n"),
spawn ('one#erlang.enzo', server, start, [] );
_ -> loop ()
end.
server.erl:
checkNode () ->
net_adm:world (),
case lists:any (fun (Node) -> Node =:= 'two#erlang.enzo' end, nodes () ) of
false ->
io:format ("Server: Watchdog node is still down.~n"),
{down, none};
true ->
io:format ("Server: Watchdog node has come online.~n"),
global:sync (), %% not sure if this is necessary
case global:whereis_name (watchdog) of
undefined ->
io:format ("Watchdog process is dead"),
Watchdog = spawn_link ('two#erlang.enzo', watchdog, init, [] );
Watchdog ->
io:format ("Watchdog process is still alive")
end,
{up, Watchdog}
end.
Say I have some function fn1() in Erlang which returns {ok, Result} if the function was executed successfully and {error, "ErrorReason"} if there was an error.
Now in another function fn2() I call fn1() and I need to check the result of fn1 and proceed only if it is {ok, Result}.
I figured, I can do this using either case or try catch. But Efficiency is my main concern and I'd like to know which of the two methods below is more efficient:
try-catch Method
fn2() ->
try
{ok, Result} = fn1(),
%Do something with Result
ok
catch
throw:Term -> Term;
exit:Reason -> {exit, Reason};
error:Reason -> {error,{Reason,erlang:get_stacktrace()}}
end.
case Method
fn2() ->
Res = fn1(),
case Res of
{ok, Result} ->
%Do something with Result
ok;
{error, Reason} ->
Reason
end.
You really want to try and avoid try/catch like the plague. It is a very uncommon idiom in Erlang - really only used in a couple of special cases:
where you are checking user-supplied
input and you have no guarantees that
it will be 'correct'
where you have something which is
deeply nested and the cost of
unrolling it on an error condition
is too expensive
like mensia transactions
or in parser/lexer's
Try/catch is essential in languages like C++ where the application is unstable in the presence or errors, but Erlang is stable in those circumstances - the process crashes but doens't bring the system down.
You should programme the happy path, match return values and if the application deviates from what you expect then let it crash. The crash tells you you have a problem and tells you to fix it.
The problem with try/catch is that it can simply mask the problem, or even worse, move the eventual crash away from where it should happen (inside the expression you have wrapped) and make it appear elsewhere - where your programming logic expects it to have suceeded = which makes debugging much harder.
Programming the happy path and letting it crash is very disconcerting the first time you do it, it feels like going out with no clothes on, but actually you get used to it real quick :)
The case method will be more efficient, as it simply pattern matches, and does not involve building a call stack and stuff.
In both examples you are about to handle the "error" locally, so there is no point in the try catch.What you might see sometimes is something like:
fn2() ->
{ok, Result} = fn1(),
%Do stuff with Result
ok.
Here the intention is that you make fn2() throw a badmatch, if fn1() did not return ok. You let someone else "above" handle the problem. E.g. this might kill your process, and make your supervisor create a new one.
You should always measure to find things like this out.
Your code also does not do what you think it does.
-module(glurk).
-compile(export_all).
fn2() ->
try
{ok, Result} = fn1(),
%Do something with Result
ok
catch
throw:Term -> Term;
exit:Reason -> {exit, Reason};
error:Reason -> {error,{Reason,erlang:get_stacktrace()}}
end.
fn1() ->
{error, a}.
Try this out:
c(glurk).
./glurk.erl:6: Warning: variable 'Result' is unused
{ok,glurk}
16> glurk:fn2().
{error,{{badmatch,{error,a}},
[{glurk,fn2,0},
{erl_eval,do_apply,5},
{shell,exprs,6},
{shell,eval_exprs,6},
{shell,eval_loop,3}]}}
This is because fn1 did not raise an exception
it gebnerated a normal retyurn value {error, a} which
does not pattern match against {ok, Result}
The first version of your code works with a function that either returns a normal value
or raises an exception - you have to write it like this:
fn1(....) ->
...
%% success case
Val;
%% failure case
throw(...) | exit(...) | error(...)
You can't just pump the same function into fn1 and fn2.
If you had the case where the called function had to escape from a deep recursion
then the first method would be more efficient than the second - since you could
immediately exit from a deep recursion by saying throw(...).
So the answer depends upon the nature of the function that you are calling.
Code should always be optimised for beauty and not efficiency - since you have
to maintain the stuff - then it should only be optimised in the rare cases
where it is not fast enough. What needs to be optimised should be identified
by measuring the program (you will always be surprised here :-)
Me, I'd write
{ok,Result} = ...
Actually your first code has a more subtle error
fn2() ->
try
{ok, Result} = fn1(),
%Do something with Result
ok
catch
throw:Term -> Term;
exit:Reason -> {exit, Reason};
error:Reason -> {error,{Reason,erlang:get_stacktrace()}}
end.
Think about this. The caught error cases do not themselves handle the error
they just return tuples like {exit, Reason} or {error, Reason} this means that the
next layer up (ie the caller of fn2) will also have to mess around checking
error returns - if this is repeated at all levels the code will be a mess.
The "erlang" way is to have one try-catch at the top of the program and just terminate
abruptly with exit(Why) if an error occurs.
In fact often you should not even do this - you should link your process to another process
then the offending process will die and "the other processes will fix the error".
The exception propagates up the call stack and flies over to the linked processes
for treatment. So we have two types of processes - ones that have no inbuilt error handling
and processes that only do error handling.
In this case, irrespective of what is more efficient, you should definitely use the case alternative as it more succinctly describes what is going on. Your fn1() here return a value indicating if there is a successful value or an error. In the case version you directly match against this, while in the try version you match against the success value which will generate an error if an error was returned. It is unnecessarily convoluted and hides what is going on so it is bad programming style and should be avoided.
And as Gordon has already pointed out having a try there will catch more errors than you probably intend and may so mask other real errors which you should see.
The case will also be faster here, but the difference is probably small. The clarity, succinctness and good programming style is much more important!
The case will always be more efficient, but the difference is small and it's more important what makes more sense in your particular case. Especially, which approach produces more understandable code. See this benchmark:
%% Results:
%% 7> errors:run_normal(100).
%% {9,ok}
%% 8> errors:run_normal(1000).
%% {107,ok}
%% 9> errors:run_normal(10000).
%% {856,ok}
%% 10> errors:run_normal(1000, 10).
%% {263,ok}
%% 11> errors:run_wcatch(10000).
%% {2379,ok}
%% 12> errors:run_wcatch(1000, 10).
%% {401,ok}
%% 18> errors:run_normal_cplx(10000, 50).
%% {7910,ok}
%% 19> errors:run_wcatch_cplx(10000, 50).
%% {10222,ok}
-module(errors).
-compile(export_all).
run_normal(Iterations) ->
get_result(Iterations, fun() -> normal() end).
run_normal(Iterations, Level) ->
get_result(Iterations, fun() -> deepnormal(Level) end).
run_wcatch(Iterations) ->
get_result(Iterations, fun() -> wcatch() end).
run_wcatch(Iterations, Level) ->
get_result(Iterations, fun() -> deepwcatch(Level) end).
run_normal_cplx(Iterations) ->
get_result(Iterations, fun() -> normal_complex() end).
run_normal_cplx(Iterations, Level) ->
get_result(Iterations, fun() -> deepnormal_complex(Level) end).
run_wcatch_cplx(Iterations) ->
get_result(Iterations, fun() -> wcatch_complex() end).
run_wcatch_cplx(Iterations, Level) ->
get_result(Iterations, fun() -> deepwcatch_complex(Level) end).
%%------------------------------------------------------------------------------
get_result(Iterations, Fun) ->
timer:tc(fun() -> run(Iterations, Fun) end).
run(0, _Fun) ->
ok;
run(Iterations, Fun) ->
Fun(),
run(Iterations - 1, Fun).
%%------------------------------------------------------------------------------
normal() ->
case foo(atom) of
{ok, atom} -> ok;
{error, atom} -> ok
end.
normal_complex() ->
case foo_cplx() of
{ok, Res} -> Res;
{error, Res} -> Res
end.
deepnormal(Level) ->
case deepfoo(atom, Level) of
{ok, atom} -> ok;
{error, atom} -> ok
end.
deepnormal_complex(Level) ->
case deepfoo_cplx(Level) of
{ok, Res} -> Res;
{error, Res} -> Res
end.
wcatch() ->
try
{ok, atom} = foothrow(atom)
catch
throw:{error, atom} -> ok
end.
wcatch_complex() ->
try
{ok, _Res} = foothrow_cplx()
catch
throw:{error, Res} -> Res
end.
deepwcatch(Level) ->
try
{ok, atom} = deepfoothrow(atom, Level)
catch
throw:{error, atom} -> ok
end.
deepwcatch_complex(Level) ->
try
{ok, _Res} = deepfoothrow_cplx(Level)
catch
throw:{error, Res} -> Res
end.
%%------------------------------------------------------------------------------
foo(Arg) -> {error, Arg}.
foothrow(Arg) -> throw({error, Arg}).
deepfoo(Arg, 0) -> {error, Arg};
deepfoo(Arg, Level) -> deepfoo(Arg, Level - 1).
deepfoothrow(Arg, 0) -> throw({error, Arg});
deepfoothrow(Arg, Level) -> deepfoothrow(Arg, Level - 1).
foo_cplx() -> {error, {<<"Some">>, "Complex", data}}.
foothrow_cplx() -> throw({error, {<<"Some">>, "Complex", data}}).
deepfoo_cplx(0) -> {error, {<<"Some">>, "Complex", data}};
deepfoo_cplx(Level) -> deepfoo_cplx(Level - 1).
deepfoothrow_cplx(0) -> throw({error, {<<"Some">>, "Complex", data}});
deepfoothrow_cplx(Level) -> deepfoothrow_cplx(Level - 1).