Non-defensive programming in Erlang - erlang

The following lines appear in http://aosabook.org/en/riak.html, in the second paragraph of the section 15.1. An Abridged Introduction to Erlang:
"Calling the function with a negative number will result in a run time error, as none of the clauses match. Not handling this case is an example of non-defensive programming, a practice encouraged in Erlang."
Two questions: what is the idiomatic way to handle the resulting error in Erlang, and why would this be better than explicitly covering all cases, as in languages like OCaml or Haskell?

If you write no code for the error cases and let the system generate a run-time error, you get at least three advantages:
The code is smaller, easier to read, and focused on the function to implement.
In error cases the system raises an error that complies with the OTP standard, so you benefit for free from all the OTP mechanisms for handling the situation at the appropriate level.
You automatically avoid the "lasagna" error-detection syndrome, where many layers of code track the same error case.
Now you can concentrate on error management: where will you handle the errors? Erlang offers the classical way, with try and catch, and a more idiomatic way, with the OTP supervision tree and the link and monitor mechanisms.
In a few words, you control the consequences of a process crash (which processes crash with it, which processes are informed), and you have sophisticated means to restart processes.
It is important to keep in mind that in Erlang you generally start many small processes, each with a very limited role and responsibility; in that setting, letting them crash and restart really makes sense.
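As a rough illustration of the "classical" route, here is a minimal sketch. It assumes the function in question is a factorial with clauses only for 0 and for positive integers; the module name nd_fact and the wrapper try_factorial/1 are made up for this example.

```erlang
-module(nd_fact).
-export([factorial/1, try_factorial/1]).

%% Non-defensive: only the cases that make sense are written.
%% A negative argument matches no clause and raises error:function_clause.
factorial(0) -> 1;
factorial(N) when is_integer(N), N > 0 -> N * factorial(N - 1).

%% Hypothetical caller that knows what to do with the failure.
%% The error is caught here, at the boundary, not inside factorial/1.
try_factorial(N) ->
    try factorial(N) of
        Result -> {ok, Result}
    catch
        error:function_clause -> {error, bad_argument}
    end.
```

Note that factorial/1 itself contains no error handling at all; only the caller that has a sensible reaction to the failure wraps the call.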
I am a fan of the learnyousomeerlang web site, where you will find three chapters related to error management:
Errors and Exceptions
Errors and Processes
Who Supervises The Supervisors?

Related

Understanding supervisor duty in Erlang/Elixir

I wrote a new library called director.
It's a supervisor library.
One of its features is that you can give director a fun of arity 2, and director will call it on every crash of the process. The first argument is the crash reason and the second is the crash count. For example:
-module(director_test).
-behaviour(director).

-export([start_link/0, init/1]).

start_link() ->
    director:start_link(?MODULE, []).

init([]) ->
    ChildSpec = #{id => foo,
                  start => {m, f, args},
                  plan => [fun my_plan/2],
                  count => infinity},
    {ok, [ChildSpec]}.

my_plan(normal, Count) when Count rem 10 == 0 ->
    %% On every 10th exit with reason normal, director restarts
    %% the process after waiting 3000 milliseconds.
    {restart, 3000};
my_plan(normal, _Count) ->
    %% If the process exited with reason normal, director restarts it.
    restart;
my_plan(killed, _Count) ->
    %% If the process was killed, director deletes it from its children.
    delete;
my_plan(Reason, _Count) ->
    %% For other reasons, director crashes with reason {foo_crashed, Reason}.
    {stop, {foo_crashed, Reason}}.
I announced my library on Slack, and people there wondered why I would write a new supervisor this way!
Someone said, "I tend to not let the supervisor handle back-off".
In the end they did not give me clear information, and I think I need to learn more about supervisors, their duties, etc.
I think that a supervisor is a process that should know when to restart a child, when to delete a child, and when not to restart a child. Am I right?
Can you tell me some good features of the OTP supervisor that Director lacks? (List of director's features)
You are mixing the ideas of supervision and management.
Supervision is already a part of OTP. It is the basic idea that:
No process can ever possibly become an orphan
Crashes will be restarted or aborted, and this is an architectural decision made before internal logic is written.
Crashes can be logged externally (handled by a process other than whatever failed).
Error handling code, crash forensics, and so on never occur as part of supervision. Ever. (Complex logic leads to complex weirdness, and supervision needs to be simple, robust, and reliable.)
Management is something that may or may not be present in your system, so it is left up to you. It is the idea that you would have a single (usually named) process that guides the overall high-level task that your (supervised) workers are doing. Having a manager process gives you a single point of control for the overall effort being done -- which also means it is a single place you can tell that overall effort to start, stop, suspend itself, etc. and this is where you could add additional logic about selective restarts based on some crash condition.
Think of "supervision" as a low-level, system framework type idea. It is always the same in all programs just like opening a file or handling a network socket would be. Think of management as one discrete chunk of the actual problem your program needs to solve to accomplish its work.
Management may or may not be complex. Supervision must always be uniform and simple. Giving supervisors too much responsibility makes them difficult to understand and debug, and often leads to business problems: an overloaded supervisor can be a major problem in a system. Don't burden your supervisors with high-level management tasks.
I wrote an article about the "service -> worker pattern" in Erlang a while back. Hopefully it informs more than it confuses: https://zxq9.com/archives/1311
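For comparison, here is a minimal sketch of a plain OTP supervisor, which carries no per-crash logic at all. The module name plain_sup and the child module my_worker are hypothetical; the restart limits (5 restarts in 10 seconds) are arbitrary values chosen for illustration.

```erlang
-module(plain_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% The whole restart policy is declared up front as data.
%% The supervisor never inspects why a child exited.
init([]) ->
    SupFlags = #{strategy => one_for_one,
                 intensity => 5,     % at most 5 restarts ...
                 period => 10},      % ... within 10 seconds
    Child = #{id => my_worker,
              start => {my_worker, start_link, []},  % my_worker is hypothetical
              restart => permanent,
              shutdown => 5000,
              type => worker,
              modules => [my_worker]},
    {ok, {SupFlags, [Child]}}.
```

Anything more selective than this (back-off, reacting to specific reasons) would live in a separate manager process under this scheme, not in the supervisor.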
Please do not take this personally. You have asked for feedback and I'm trying to give it to you.
After quickly looking at the docs and the code, I think the main problems with your library are:
1. You are introducing complexity in an area where it is normally not needed. In the vast majority of Erlang programs you don't want to analyse why a process has crashed. Analysing it is error-prone, so the "normal" solution is just to restart the process. If you introduce any logic at this point, you probably introduce some errors too. Such a program is harder to reason about, and the advantages are disputable at best.
2. You are assuming that the exit reason is the reason why the process itself has exited. This is not necessarily true: the reason could have been propagated from its linked processes. If you really wanted to react to all possible exit reasons, you would have to compute a transitive closure over the exit reasons of the process, of all its children, of all their children, and so on. You would also have to change it whenever any of the components changes, which is a very bad, very error-prone approach. And the complexity introduced (see 1) explodes very badly.
3. You introduce some "introspection" logic outside the context where the internal logic should ideally be kept, i.e. some knowledge about the internal workings of the process is used outside of its module, in the director's plan. This breaks encapsulation somewhat. The "normal" supervisor knows only how to start the process; it doesn't need any more information about the process internals.
4. Last but not least: you are probably solving a non-existent problem. Instead of developing a whole new solution, you should clearly identify the problems of the existing solution and try to solve them very directly and minimally.

When to "let it crash" and when to defend the code in Erlang?

So, with the "let it crash" mantra, Erlang code is meant to be resistant to cruel real-world events like someone unexpectedly pulling the plug, hardware failure, and unstable network connections.
On the other hand, there is defensive programming.
Being new to Erlang, I wonder how to know when I want the process to just crash and when I want it to defend the flow with if, case..of, and type guards.
Say I have an authentication module, which returns true or false depending on whether authentication succeeded. Should it implement only the successful scenario and crash if the user authentication fails due to a wrong login/password?
What about other scenarios, such as a product not being found in the database, or search results being empty?
I suppose I can't ignore defensive constructs completely, since guards by their nature defend the "normal" flow of the app?
Is there a rule of thumb when to defend and when to crash?
As Fred Hebert says at http://ferd.ca/the-zen-of-erlang.html -
If I know how to handle an error, fine, I can do that for that
specific error. Otherwise, just let it crash!
I'd say that authentication errors, empty search results, etc., are expected errors that warrant an appropriate response to the user.
I don't think there is actually a rule of thumb in this case.
As I see it, whenever you know how to handle an expected error, handle it. In the case of authentication, I don't really think a failure is an actual error; it's normal behavior, so go ahead and write a few lines of code to handle this specific case.
In contrast, a network failure is something that might happen for a variety of reasons. Such failures are not expected as part of your code's normal behaviour, so in this case I would go with the "let it crash" philosophy.
Anyway, when going with let it crash, you of course still need to handle the case where the process has crashed (i.e. using links and monitors and restarting the process).
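The distinction can be sketched roughly as follows. The module name auth_sketch and the in-memory Users map (login to password, in plain text only to keep the example short) are assumptions made for illustration, not part of the question.

```erlang
-module(auth_sketch).
-export([authenticate/3]).

%% Users is assumed to be a map of Login => Password.
%% Wrong credentials are an expected outcome, so they get a normal return value.
authenticate(Login, Password, Users) when is_map(Users) ->
    case maps:find(Login, Users) of
        {ok, Password} -> true;    % stored password equals the given one
        {ok, _Other}   -> false;   % expected: wrong password
        error          -> false    % expected: unknown login
    end.
%% No clause for a Users value that is not a map: that would be a bug
%% elsewhere in the program, so the call simply crashes (function_clause).
```

The expected failures are ordinary results; only a genuinely unexpected situation is left to crash.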
Please check also this very good answer. And you may read more about it here and here.

In what cases would you use a process monitor instead of a try catch around a function

I know supervisors can monitor many processes, and OTP supervisors provide nice defaults, such as giving up on retries if there are too many errors in X time.
My question is: when do you personally choose try/catch over a process monitor?
The tools you are speaking of have two completely different use cases.
You should only run things in a process if they need to happen in parallel with something else. Therefore, I would argue that you should use try/catch when you need to catch errors in a situation where you are fine waiting for the result (i.e. a sequential program). You should use an external process when you need to run something in parallel, and you should monitor it if you are interested in exceptions happening in that process.
That being said, there are of course edge cases where externalizing an activity into a process makes some sense, such as special garbage-collection corner cases (it is sometimes easier or faster to garbage collect an activity by just killing its process).
Performance-wise, there are so many parameters involved (overhead of try/catch vs monitor, how often your code is run, etc.) that you'd have to benchmark for your case.
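A minimal sketch of the monitored variant, assuming a hypothetical helper run_monitored/2 in a made-up module mon_sketch. For brevity the caller simply waits for the outcome here; in a real parallel design it would carry on with other work and handle the 'DOWN' message when it arrives.

```erlang
-module(mon_sketch).
-export([run_monitored/2]).

%% Runs Fun in its own process and waits up to Timeout ms for it to end.
run_monitored(Fun, Timeout) when is_function(Fun, 0) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, normal} -> ok;
        {'DOWN', Ref, process, Pid, Reason} -> {crashed, Reason}
    after Timeout ->
        %% Give up: drop the monitor (and any pending 'DOWN') and kill the worker.
        erlang:demonitor(Ref, [flush]),
        exit(Pid, kill),
        timeout
    end.
```

Unlike try/catch, the failure here never unwinds the caller's stack; it arrives as an ordinary message.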
I prefer to use try/catch only in situations where throwing an error is normal behaviour, e.g. lookup_value/1 in gproc (it throws badarg if the key is not registered, while I'd expect and prefer to get undefined instead).
That's the Erlang philosophy as I understand it: you should program for the good case and not try to be too defensive, i.e. there are some cases where you should care, while in most cases you just let it crash.
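The same idea can be sketched with a standard-library call instead of gproc: ets:lookup_element/3 also fails with badarg when the key is missing. The module name lookup_sketch and the wrapper lookup_or_undefined/2 are made up for this example.

```erlang
-module(lookup_sketch).
-export([lookup_or_undefined/2]).

%% Turns the "key not found" badarg into a plain undefined.
%% Caveat: badarg is also raised when the table itself does not exist,
%% so this wrapper hides that case as well.
lookup_or_undefined(Tab, Key) ->
    try ets:lookup_element(Tab, Key, 2)
    catch
        error:badarg -> undefined
    end.
```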

How to implement status in Erlang?

I am thinking of an Erlang program that has many workers (each a receive loop). These workers almost always manipulate their status (state) at the same time, i.e. massively concurrently, and there are so many of them that keeping their status in mnesia would cause performance problems. So I am thinking of passing the status as an argument in each loop and writing it to mnesia some time later. Is this good practice? Is there a better way to do this? (Roughly speaking, I'm looking for something like an instance with attributes in an object-oriented language.)
Thanks.
With Erlang, it is a good habit to see processes as actors, each with a dedicated and limited role. With this in mind, you will find that your problem splits into different categories, such as:
Maintain the state of a connection with a user over the Internet,
Keep information such as login, user profile, friends, shopping cart...
Log events
...
For each role you will have to decide whether the state information must outlive the process.
In a lot of cases this is not necessary (case 1), and the solution is simply to keep the state in the argument of the process's loop function. I encourage you to look at the OTP behaviors; gen_server and gen_fsm are made for this.
Case 2 obviously manipulates permanent data, which must survive a process crash or even a hardware crash. Such data will be stored using dets, mnesia, or any database suited to your problem (Redis, CouchDB ...).
It is important to limit the information stored in an external database; otherwise you lose the benefit of a very powerful feature, the lack of side effects. In other words, it is a very bad idea to have process behavior that depends on external information.
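Keeping the state in the loop argument (case 1) can be sketched as below. The module name state_loop and its message shapes are invented for this example, and persisting to mnesia is deliberately left out.

```erlang
-module(state_loop).
-export([start/1, add/2, value/1]).

%% The process state lives only in the argument of loop/1.
start(Initial) ->
    spawn(fun() -> loop(Initial) end).

add(Pid, N) ->
    Pid ! {add, N},
    ok.

value(Pid) ->
    Ref = make_ref(),
    Pid ! {get, self(), Ref},
    receive
        {Ref, Value} -> Value
    after 1000 ->
        exit(timeout)
    end.

loop(State) ->
    receive
        {add, N} ->
            loop(State + N);            % "update" = recurse with the new state
        {get, From, Ref} ->
            From ! {Ref, State},
            loop(State)
    end.
```

A gen_server gives the same structure with the receive loop, timeouts, and supervision hooks already written.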

Is the process dictionary appropriate in this case?

I've read several comments here and elsewhere suggesting that Erlang's process dictionary was a bad idea and should die. Normally, as a total Erlang newbie, I'd just avoid it. However, in this situation my other options aren't great.
I have a main dispatcher function that looks something like this:
dispatch(State) ->
    receive
        {cmd1, Params} ->
            NewState = do_cmd1_stuff(Params, State),
            dispatch(NewState);
        {cmd2, Params} ->
            NewState = do_cmd2_stuff(Params, State),
            dispatch(NewState);
        BadMsg ->
            log_error(BadMsg),
            dispatch(State)
    end.
Obviously, my names are more meaningful to me, but that's the gist of it. Deep down in a function called by a function called by a function called by do_cmd2_stuff(), I want to send out messages to all my users telling them about something I've done. In order to do that, I need to get the list of users from the point where I send the messages. The user list doesn't lend itself easily to sticking in the global state, since that's just one data structure representing the only block of data on which I operate.
The way I see it, I have a couple unpleasant options other than using the process dictionary. I can send the user list through all the various levels of functions down to the very bottom one that does the broadcasting. That's unpleasant because it causes all my functions to gain a parameter, whether they really care about it or not.
Alternatively, I could have all the do_cmdN_stuff() functions return a message to send. That's not great either though, since sending the message may not be the last thing I want to do and it clutters up my dispatcher with a bunch of {Msg, NewState} tuples. Furthermore, some of the functions might not have any messages to send some of the time.
Like I said earlier, I'm very new to Erlang. Maybe someone with more experience can point me at a better way. Is there one? Is the process dictionary appropriate in this case?
The general rule is that if you have doubts, you shouldn't use the process dictionary.
If the two options you mentioned aren't good enough (I personally like the one where you return the messages to send) and what you want is some particular piece of code to track users and forward messages to them, maybe what you want to do is have a process holding that info.
Pid ! {forward, Msg}
where Pid will take care of sending everything to a bunch of other processes. Now, you would still need to pass the Pid around, unless you give it a name in some registry so that it can be found, using register/2, global, or gproc.
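A minimal sketch of such a forwarding process, registered locally with register/2. The module name broadcaster, the {set_users, ...} message, and the idea of holding the users as a list of pids are assumptions made for illustration.

```erlang
-module(broadcaster).
-export([start/1, forward/1]).

%% Starts the process that owns the user list and registers it by name,
%% so callers deep in the call stack do not need the pid or the list.
start(Users) when is_list(Users) ->
    Pid = spawn(fun() -> loop(Users) end),
    true = register(?MODULE, Pid),
    Pid.

forward(Msg) ->
    ?MODULE ! {forward, Msg},
    ok.

loop(Users) ->
    receive
        {forward, Msg} ->
            lists:foreach(fun(U) -> U ! Msg end, Users),
            loop(Users);
        {set_users, NewUsers} ->
            loop(NewUsers)
    end.
```

Any function, however deeply nested, can then call broadcaster:forward(Msg) without receiving the user list as a parameter.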
A simple answer would be to nest your global within a state record, which is then threaded through the system, at least at the top level. This makes it easy to add new fields to the state in the future, not an uncommon occurrence, and allows you to keep your global state data structure untouched. So initially
-record(state, {users=[],state_data}).
Defining it as a record makes it easy to access and extend when necessary.
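How such a record might be used can be sketched as follows. The module name state_rec and the accessor functions are hypothetical; only the record definition comes from the answer.

```erlang
-module(state_rec).
-export([new/2, add_user/2, users/1, set_data/2]).

-record(state, {users = [], state_data}).

%% The original data structure sits untouched in the state_data field.
new(Users, Data) -> #state{users = Users, state_data = Data}.

add_user(User, S = #state{users = Us}) -> S#state{users = [User | Us]}.

users(#state{users = Us}) -> Us.

%% Code that only cares about the data updates that one field.
set_data(Data, S = #state{}) -> S#state{state_data = Data}.
```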
As you mentioned, you can always pass the user list as an extra parameter; that's not so bad.
If you don't want to do this just put it in State. You can have a special State just for this part of the calculation that also contains the user list.
Then there always is the possibility of putting it in ETS or in another server process.
What exactly to do is hard to recommend since it depends a lot on your exact application and preferences.
Just choose from the possibilities mentioned as if the process dictionary didn't exist. Maybe your code needs restructuring if none of the variants looks elegant; there is always some better way without the process dictionary.
It's really bad that it is still there, because it's alluring to many beginning Erlang users.
You really should not use the process dictionary. I accept using it only if:
1. It is a short-lived process.
2. I have full control over the process from spawn to termination, i.e. I use a minimal and well-known set of external modules.
3. I badly need the performance gain, i.e. I need to avoid the copying of data that ets involves, and dict/gb_tree is too slow (for GC reasons).
Ad 1: this is not your case, since you are using it in a server. Ad 2: I don't know whether it is your case. Ad 3: this is not your case, because you need the whole list of recipients, so you gain nothing from the process dictionary being a very fast key/value store. In your case I don't see any reason why you should not include what you need in your State. IMHO State is exactly the right place for it.
It's an interesting question because it involves the fundamentals of functional design.
My opinion:
Try as much as possible to make the function return the messages, then send them. This separates the two different tasks nicely, and separates the purely functional task from the one that causes side effects.
If this isn't possible, pass the receivers as an argument even if it's a bit messy. If the broadcasting function uses that data, it should be given to it explicitly, for clarity and predictability.
Using ETS, as Peer Stritzinger suggests, is really not any better than the PD: both hide the fact that the broadcasting function uses the receiver list, and both make it dependent on global data.
I'm not sure about the Erlang way of encapsulating some state in a process, as I GIVE TERRIBLE ADVICE suggests. Is it really any better than ETS or the PD?
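The first suggestion, returning the messages and sending them afterwards, can be sketched like this. The module name dispatch_sketch, the integer state, and the do_cmd/2 clauses are all placeholders invented for this example.

```erlang
-module(dispatch_sketch).
-export([handle/3]).

%% Pure step first, side effects afterwards.
handle(Cmd, State, Users) ->
    {Msgs, NewState} = do_cmd(Cmd, State),
    lists:foreach(fun(Msg) -> broadcast(Msg, Users) end, Msgs),
    NewState.

%% Hypothetical command handlers: they return the list of messages to send
%% (possibly empty) instead of sending anything themselves.
do_cmd({cmd1, N}, State) -> {[], State + N};
do_cmd({cmd2, N}, State) -> {[{changed, State * N}], State * N}.

broadcast(Msg, Users) ->
    lists:foreach(fun(U) -> U ! Msg end, Users).
```

Returning a list covers the "no message this time" case with an empty list, and the command handlers stay testable without any processes involved.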
clutters up my dispatcher with a bunch
of {Msg, NewState}
This is my experience also, that you often end up like this. It's not particularly pretty, but functional design seems to encourage this. Could some language feature be introduced to make it more beautiful and natural?
EDIT:
6 years ago I wrote:
Could some language feature be introduced to make it more beautiful and natural?
After learning much more about functional programming I have realised that examples of this are state-monads and do-notation that are found in Haskell.
I would consider sending a special message to self() from deep inside the call stack, and handling it at the top level dispatch method that you've sketched, where list of users is available.
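A rough sketch of that idea, with a made-up module self_msg, a made-up {broadcast, Msg} message, and placeholder bodies for do_cmd2_stuff/2 and the deep helper. One caveat worth stating: the broadcast is only performed after the current command has returned, and after any messages already waiting in the mailbox.

```erlang
-module(self_msg).
-export([start/1]).

start(Users) ->
    spawn(fun() -> dispatch(0, Users) end).

%% The user list is known only at the top level.
dispatch(State, Users) ->
    receive
        {cmd2, Params} ->
            NewState = do_cmd2_stuff(Params, State),
            dispatch(NewState, Users);
        {broadcast, Msg} ->
            lists:foreach(fun(U) -> U ! Msg end, Users),
            dispatch(State, Users)
    end.

%% Placeholder for the real work; it never sees the user list.
do_cmd2_stuff(Params, State) ->
    deep_helper(Params),
    State + 1.

%% Deep in the call stack: ask the dispatcher (this same process) to broadcast.
deep_helper(Params) ->
    self() ! {broadcast, {did_something, Params}},
    ok.
```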

Resources