How to run an Erlang application under the supervisor - erlang

What does it mean to run an Erlang application under the supervisor? If there are any examples how to make it, could you please show it.

It means you should create a top level supervisor that makes sure parts of your application are restarted if they crash.
The exact topology of your application depends on what you're trying to do (how many processes, what they do and what their relationship is). For applications that are not trivial you might want to create a top level supervisor and then further supervisors responsible for the different processes or process groups.
The most common for simple applications is to create an application, and to make your application start a supervisor which in turn starts up one or more worker children. If you use rebar to create your project that will by default create the stubs for this.

Related

Erlang applications design (how to short-circuit)

I am having an existential question about how I design my erlang applications:
I usually create an application, which starts a supervisor and some workers.
Aside from the supervision tree, I have modules with functions (duh).
I also have a web API that calls functions from applications' modules.
When I stop my application (application:stop(foo).), the webserver can still call foo's functions.
I find it "not idiomatic" to not be able to have a proper circuit-breaker for the foo application.
Does it mean that every public functions from foo should spawn a process under it's supervisor?
Thanks,
Bastien
Not necessarily, for two reasons:
The foo application will have two kinds of functions: those that require the worker processes to be running, and those that don't (most likely pure functions). If the application is stopped, obviously the former will fail when called, while the latter will still work. As per Erlang's "let it crash" philosophy, this is just another error condition that the web server needs to handle (or not handle). If the pure functions still work, there is no reason to prohibit the web server from calling them: it means that a greater portion of the system is functional.
In an Erlang node, stopping an application is not something you'd normally do. An Erlang application declares dependencies, that is, applications that need to be running for it to function correctly. You'll notice that if you try to start an application before its dependencies, it will refuse to start. While it's possible to stop applications manually, this means that the state of the node is no longer in accordance with the assumptions of the application model. When building a "release" consisting of a set of Erlang applications, normally they would all be started as permanent applications, meaning that if any one application crashes, the entire Erlang node would exit, in order not to violate this assumption.

Erlang supervisor processes

I have been learning Erlang intensively, and after finishing 'Programming Erlang' from Joe Armstrong, there is one thing that I keep coming back to.
In my mind a Supervisor spawns One process per child handler. So each declared gen_server type handler will run as a separate process.
What happens if you are building a tiny web server and you want each requests to be its own process. Do you still conform to OTP principles and use a gen_server somehow (how ?), or do you create your own behaviour?
How does Cowboy handle this for eg. ? Does it still use gen_server ?
tl;dr: I find that trying to figure out the "correct" supervision structure a the beginning of a project is a form of premature optimization.
The "right" way to design your supervision tree depends on what the worker parts of your application are doing. In the case of a web server I would probably first explore something along the lines of:
top supervisor (singular)
data service supervisor (one per service type)
worker pool (all workers under the service sup)
client connection supervisor (one)
connection worker pool (or one per connection, have to play with it to decide)
logical supervisor (as appropriate -- massive variance here, depending on problem domain)
workers or supervisors (as appropriate -- have to explore/know the problem domain to have any idea how this should be structured)
So that's several workers per supervisor type at the lower level. I haven't used Cowboy so I don't know how it is organized. The point I'm trying to make is that while the mechanics of handling data services serving web pages are relatively trivial, the part of the system that actually does the core problem-solving work might not be and this is going to dictate everything interesting about the system.
It is a bad thing to have your problem-solving bits mixed in the same module as your web-displaying or connection handling bits. Ideally you should be able to use the same logic units in a native application, a web application and a networked service without any changes.
Ultimately the answer to whether you should have 1:1 supervisors to workers or 1:n depends on what you're doing and what restart strategy gives you the best balance among recovery to a known consistent state, latency felt by the user, and resource usage.
One of my favorite things about Erlang is that I can start with a naive supervisor structure like the one above, play with it until I see where its not so good, and rather easily switch things around and experiment with alternatives without fundamentally altering my system much. (The same goes for playing with alternative data representations if you write proper abstractions around them.) So first, get something that works in testing. Then load it up and see if you can break it. Then start worrying about the details, after you understand where the problems actually are.
It is a common pattern to spawn one server per client in erlang, You will then use a supervisor using the simple_one_to_one strategy for the children servers. This allows to ask the server to start a server on_demand. Generally this is used when you don't know how many processes you will need, and when the processes are independent (a crash of one process should not impacts the other).
There is a very good information in the site learningyousomeerlang.com (LYSE supervisor chapter). the whole site is worth to read.

Erlang supervision and applications

I have number of supervised components that can stand alone as separate applications. But I would like to cascade them such that a call or event in a worker in one component starts the next component down in an inverted tree-like structure.
1) Can I package each of these components as separate applications?
2) If so,how do I write the calling code to start the child application?
3) Or do I need to do something else altogether and,if so, what?
Note: I'm still working on mastery of supervision trees. The chain of events following application:start(Mod) is still not burned well into my head.
Many thanks,
LRP
Supervision trees and applications are complex Erlang/OTP concepts. They are both documented in OTP Design Principles User's Guide and in particular in:
chapter 5: Supervisor behaviour
chapter 7: Applications.
Supervision trees are not dependency trees and should not be designed as such. Instead, a supervision tree should be defined based on the desired crash behavior, as well as the desired start order. As a reminder, every process running long enough must be in the supervision tree.
An application is a reusable component that can be started and stopped. Applications can depend on other applications. However, applications are meant to be started at boot-time, and not when an event occurs.
Processes can be started when a given event occurs. If such processes shall be supervised, simply call supervisor:start_child/2 on its supervisor when the event occurs. This will start the process and it will be inserted in the supervision tree. You will typically use a Simple-one-for-one supervisor which will initially have no child.
Consequently:
You can package components as separate applications. In this case, you will declare the dependencies of applications in each application's app(4) file. Applications could then only be started in the proper order, either with a boot script or interactively with application:start/1.
You can package all your components in a single application and have worker processes starting other worker processes with supervisor:start_child/2.
You can package components as separate applications and have worker processes in one application starting processes in another application. In this case, the best would be to define a module in the target application that will call supervisor:start_child/2 itself, as applications should have clean APIs.
When you have worker processes (parents) starting other worker processes (children), you probably will link those processes. Linking is achieved with link/1. Links are symmetric and are usually established from the parent since the parent knows the pid of the child. If the parent process exits abnormally, the child will be terminated, and reciprocally.
Links are the most common way to handle crashes, for example a child shall be terminated if the parent is no longer there. Links are actually the foundation of OTP supervision. Adding links between worker processes reveals that designing supervision trees is actually difficult. Indeed, with links, you will have both processes terminating if one crashes, and yet, you probably do not want the child process to be restarted by the supervisor, as a supervisor-restarted child process will not be known (or linked) to a supervisor-restarted parent process.
If the parent shall terminate when the child exits normally, then this is a totally different design. You can either have the child send a message to the parent (e.g. return a result) or the parent monitor the child.
Finally, the parent process can terminate a child process. If the child is supervised, use supervisor:terminate_child/2. Otherwise, you can simply send an exit signal to the child process. In either cases, you will need to unlink the child process to avoid an exit of the parent process.
Both links and monitors are documented in the Erlang Reference Manual User's Guide. Instead of monitors, you might be tempted to trap exits, something explained in the guide. However, the manual page for the function to achieve this (process_flag/2) specifically reads:
Application processes should normally not trap exits.
This is typical OTP design wisdom, spread here and there in the documentation. Use monitors or simple messages instead.

Simple_one_for_one application

I have a supervisor which starts simple_one_for_one children. Each child is in fact a supervisor which has its own tree. Each child is started with an unique ID, so I can distinguish them. Each gen_server is then started with start_link(Id), where:
-define(SERVER(Id), {global, {Id, ?MODULE}}).
start_link(Id) ->
gen_server:start_link(?SERVER(Id), ?MODULE, [Id], []).
So, each gen_server can easily be addresed with {global, {Id, module_name}}.
Now I'd like to make this child supervisor into application. So, my mother supervisor should start applications instead of supervisors. That should be straightforward, except one part: passing ID to an application. Starting supervisor with an ID is easy: supervisor:start_child(?SERVER, [Id]). How do I do it for application? How can I start several applications of the same name (so I can access the same .app file) with different ID (so I can start my children with supervisor:start_child(?SERVER, [Id]))?
If my question is not clear enough, here is my code. So, currently, es_simulator_dispatcher starts es_simulator_sup. I'd like to have this: es_simulator_dispatcher starts es_simulator_app which starts es_simulator_sup. That's all there is to it :-)
Thanks in advance,
dijxtra
Applications don't run under anything else, they are a top-level abstraction. When you start an application with application:start/1 the application is started by the application controller which manages applications. Applications contain code and data, and maybe at runtime a supervision tree of processes doing the applications thing at runtime. Running multiple invocations of an application does not really make sense because of the nature of applications.
I would suggest reading OTP Design Principles User's Guide for a description of the components of OTP, how they relate and how they are intended to be used.
I don't think applications where meant for dynamic construction like you want. I'd make a single application, because in Erlang, applications are bundles of code more than they are bundles of running processes (you can say they are an artifact of compile-time moreso than of runtime).
Usually you feed configuration to an application through the built-in configuration system. That is, you use application:get_env(Key) to read something it should use. There is also an application:set_env(...) to feed specific configuration into one - but the preferred way is the config file on disk. This may or may not work in your case.
In some sense, what you are up to corresponds to creating 200 Apache configuration files and then spawn 200 Apache systems next to each other, rather than running a single one and then handle the multiple domains inside it.

Erlang: Who supervises the supervisor?

In all Erlang supervisor examples I have seen yet, there usually is a "master" supervisor who supervises the whole tree (or at least is the root node in the supervisor tree). What if the "master"-supervisor breaks? How should the "master"-supervisor be supervised?? any typical pattern?
The top supervisor is started in your application start/2 callback using start_link, this means that it links with the application process. If the application process receives an exit signal from the top supervisor dying it does one of two things:
If the application is started as an permanent application the entire node i terminated (and maybe restarted using HEART).
If the application is started as temporary the application stops running, no restart attempts will be made.
Typically Supervisor is set to "only" supervise other processes. Which mens there is no user written code which is executed by Supervisor - so it very unlikely to crash.
Of course, this cannot be enforced ... So typical pattern is to not have any application specific logic in Supervisor ... It should only Supervise - and do nothing else.
Good question. I have to concur that all of the examples and tutorials mostly ignore the issue - even if occasionally someone mentions the issue (without providing an example solution):
If you want reliability, use at least two computers, and then make them supervise each other. How to actually implement that with OTP is (with the current state of documentation and tutorials), however, appears to be somewhere between well hidden and secret.

Resources