supervisor start multiple children as atomic operation - erlang

I need to start multiple supervisor children atomically. That is, if any child in the group fails at startup, then none of them should be started.
I see this operation as a function:
start_children(SupRef, ChildSpecs)
ChildSpecs = List :: [child_spec()]
How should I implement this in a proper way? Any examples, libraries etc.? My intuition tells me that starting all children from the list, checking if all of them were successful and then killing remaining ones is not the way.
Or perhaps my design is flawed and I really should not need to do such things?

OTP's supervisor supports this with the one_for_all strategy. By default, if one process fails, all processes are restarted, but you can change this by choosing a Restart parameter that suits your purpose (e.g. temporary).
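One way to get the all-or-nothing start is to put the group under its own one_for_all supervisor and add that supervisor as a single child of SupRef. A supervisor that fails to start one of its initial children terminates the ones it already started and fails itself, so the group never ends up half-started. The following is only a sketch under assumptions: the module name group_sup, the extra GroupId argument and the start_worker/1 demo worker are hypothetical, not part of OTP.

```erlang
-module(group_sup).
-behaviour(supervisor).

-export([start_children/3, start_link/1, start_worker/1]).
-export([init/1]).

%% Start ChildSpecs as one group under SupRef. The group lives in its own
%% supervisor, so either every child starts or the whole group is torn down.
start_children(SupRef, GroupId, ChildSpecs) ->
    GroupSpec = #{id => GroupId,
                  start => {?MODULE, start_link, [ChildSpecs]},
                  restart => temporary,   % the parent does not retry a failed group
                  type => supervisor},
    supervisor:start_child(SupRef, GroupSpec).

start_link(ChildSpecs) ->
    supervisor:start_link(?MODULE, ChildSpecs).

init(ChildSpecs) ->
    %% If any child spec fails to start here, the supervisor stops the
    %% children already started and start_link/1 returns {error, _}.
    {ok, {#{strategy => one_for_all}, ChildSpecs}}.

%% Hypothetical demo worker: `ok` starts an idle linked process,
%% anything else simulates a failed start.
start_worker(ok) ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)};
start_worker(Other) ->
    {error, Other}.
```

Calling group_sup:start_children(SupRef, my_group, ChildSpecs) then returns {ok, GroupPid} or {error, Reason}. Note that with one_for_all a later crash of one permanent child restarts the whole group; use temporary child specs if that is not wanted.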

Related

How do I tell supervisor to start 1000 instances of a specific gen_server?

As part of solving a computationally intensive task, I wish to have 1000 gen_servers each doing a small task and updating a global database. How can I achieve this in Erlang/OTP? In most of the examples the supervisor supervises only a single gen_server. Can a supervisor supervise more than a thousand instances of the same gen_server?
e.g. Say I want to find the maximum of an extremely long array, and each gen_server instance should work on a part of the array and update the global maximum.
Like Pascal said, it is possible to start a set number of children, but the use case you described would probably work better with a simple_one_for_one strategy, as all children are the same. This lets you add as many of the same type of children as needed at a smaller cost. gen_servers have overhead, and even though it's not too big, when you're talking about 1000 processes crunching numbers it makes a difference.
If your processes will be doing something very simple and you want it to be fast, I would consider not using gen_servers, but instead just spawning processes. For real power, you would have to spawn processes on different nodes using spawn/4 to make use of more cores. If you are using machines in different locations you can also use a message bus as a load balancer to distribute the work between nodes. It all depends on how much work you need done.
Also keep in mind that Erlang is not the best for crunching numbers. You could use C code to do the crunching and have each Erlang process you spawn (each child) call a NIF.
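As a rough illustration of the "just spawn processes" idea, here is a minimal sketch that splits a list into chunks, lets one plain process find the maximum of each chunk, and combines the partial results in the caller. The module name par_max and the ChunkSize parameter are assumptions made for this example.

```erlang
-module(par_max).
-export([find_max/2]).

%% Split List into chunks of at most ChunkSize elements, compute each
%% chunk's maximum in its own process, then combine the partial results.
find_max(List, ChunkSize) when List =/= [], ChunkSize > 0 ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, lists:max(Chunk)} end),
                Ref
            end || Chunk <- chunks(List, ChunkSize)],
    %% Collect exactly one answer per worker, matched by its reference.
    lists:max([receive {Ref, Max} -> Max end || Ref <- Refs]).

chunks([], _N) ->
    [];
chunks(List, N) when length(List) =< N ->
    [List];
chunks(List, N) ->
    {Chunk, Rest} = lists:split(N, List),
    [Chunk | chunks(Rest, N)].
```

For example, par_max:find_max([3, 9, 1, 4, 7, 2, 8], 3) uses three worker processes. No shared "global maximum" is updated; each worker only sends its partial result back as a message.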
Is it possible: yes. For example you can create a pool of 1000 processes with the following supervisor:
-module(big_supervisor).
-export([start_link/0]).
-behaviour(supervisor).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, {}).

%% @private
init({}) ->
    Children = create_child_specs(1000),
    RestartStrategy = {one_for_one, 5, 10},
    {ok, {RestartStrategy, Children}}.

create_child_specs(Number) ->
    [{{child_process, X}, {child_process, start_link, []}, permanent, 5000, worker, [child_process]}
     || X <- lists:seq(1, Number)].
Is it a good architecture? I don't know. So far I have come across two kinds of architecture:
One with a limited number of children, each well identified by its role.
One with a kind of process factory that dynamically creates as many children as needed on demand, using the simple_one_for_one strategy and the start_child/2 and terminate_child/2 functions.
Note also that supervisors are not mandatory if you want to spawn processes. From your explanation it seems that the processes could be created for a very limited time, in order to compute something in parallel. Two remarks in this case:
It is not worth spawning more processes than the number of threads that will actually run in parallel on your VM.
One exception is when the work in each process has to wait for external information, for example the reply from an external database. In that case it may be worthwhile to spawn more processes, with the optimal number depending on the limits of the external access.
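The "process factory" architecture mentioned above could look roughly like the following sketch. The module name worker_factory and the run/1 worker are hypothetical; only supervisor:start_link/3 and supervisor:start_child/2 are real OTP calls.

```erlang
-module(worker_factory).
-behaviour(supervisor).

-export([start_link/0, start_worker/1, run/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% For simple_one_for_one, the list given to start_child/2 is appended
%% to the argument list in the child spec, so run/1 receives Task.
start_worker(Task) when is_function(Task, 0) ->
    supervisor:start_child(?MODULE, [Task]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one, intensity => 5, period => 10},
    ChildSpec = #{id => worker,
                  start => {?MODULE, run, []},
                  restart => temporary},   % short-lived jobs are not restarted
    {ok, {SupFlags, [ChildSpec]}}.

%% Hypothetical worker: runs the given fun in a linked process.
run(Task) ->
    {ok, spawn_link(Task)}.
```

After worker_factory:start_link(), each call to worker_factory:start_worker(Fun) adds one more child on demand, and supervisor:count_children(worker_factory) reports how many are alive.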

How to restart Erlang Supervised function with parameters?

I'm learning Erlang, and have a Supervisor question...
I have a function which requires 3 parameters (string, string, number). I want to Supervise that and make sure that if it fails, it gets restarted with the 3 parameters I passed it.
Is this something a Supervisor can handle, or do I need to look into some other concept?
Thanks.
Update 1/23/2016
One thing I want to mention... I have a list of 1439 entries. I need to create a Supervisor for each entry in that list. Each entry will incur different arguments. For example, here's some pseudo-code (reminiscent of Ruby):
(360..1799).each do |index|
export(output_path, table_name, index) # Supervise this
end
This is triggered by a user interaction at runtime. The output_path and table_name are dynamic too, but won't change for a given batch. Unravelled, a run may look something like this:
export("/output/2016-01-23/", "temp1234", 360)
export("/output/2016-01-23/", "temp1234", 361)
export("/output/2016-01-23/", "temp1234", 362)
.
.
So if 361 fails, I need it restarted with "/output/2016-01-23/", "temp1234", and 361.
Is this something I can do with a Supervisor?
Yes, this is what supervisor does, but you mean "arguments", not "parameters".
For ordinary (not simple_one_for_one) supervisors, your init/1 implementation returns a list of so-called child specifications. Each one specifies a child that will be spawned, and the arguments that will be passed to it are part of that child specification.
With simple_one_for_one supervisors you still have to provide a child specification, but you provide the arguments when you start each supervised child.
In all cases your supervised children will be restarted with the same arguments.
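To make this concrete for the export example, here is a sketch of a supervisor where each export job is added as its own child, with its three arguments stored in the child specification. The names export_sup, start_export/3, start_worker/3 and the placeholder export/3 body are assumptions; a real export function would do the actual work instead of waiting for a message.

```erlang
-module(export_sup).
-behaviour(supervisor).

-export([start_link/0, start_export/3, start_worker/3]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% The arguments live in the child spec, so a restart reuses them as-is.
start_export(OutputPath, TableName, Index) ->
    Spec = #{id => {export, Index},
             start => {?MODULE, start_worker, [OutputPath, TableName, Index]},
             restart => transient},   % restart only after an abnormal exit
    supervisor:start_child(?MODULE, Spec).

init([]) ->
    {ok, {#{strategy => one_for_one, intensity => 5, period => 10}, []}}.

start_worker(OutputPath, TableName, Index) ->
    {ok, spawn_link(fun() -> export(OutputPath, TableName, Index) end)}.

%% Placeholder for the real export: `crash` simulates a failure,
%% `done` simulates normal completion.
export(_OutputPath, _TableName, _Index) ->
    receive
        crash -> exit(simulated_failure);
        done  -> ok
    end.
```

If the process for index 361 exits abnormally, the supervisor calls start_worker("/output/2016-01-23/", "temp1234", 361) again. The intensity and period values are illustrative and would need tuning for a batch of many jobs.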
General Concepts
Erlang systems are divided into modules.
Each module is composed of functions.
Work is done by processes, which use functions to do it.
A supervisor is a process that supervises other processes.
So your function will certainly be executed inside a process, and you can specify and supervise that process with a supervisor. This way, when your function fails and crashes, the process that is executing it will crash as well. When the process crashes, its supervisor will find out and can act on it based on a predefined restart strategy. These strategies can be one_for_one, one_for_all, rest_for_one and simple_one_for_one.
You can check the supervisor manual for further options.

How to maintain state in Erlang?

I have seen people use dict, orddict and records for maintaining state in many blogs that I have read. I find it a very vital concept.
Generally I understand the meaning of maintaining state and recursion, but when it comes to Erlang I am a little vague about how it is handled.
Any help?
State is the present arrangement of data. It is sometimes hard to remember this for two reasons:
State means both the data in the program and the program's current point of execution and "mode".
We build this up to be some magical thing unnecessarily.
Consider this:
"What is the process's state?" is asking about the present value of variables.
"What state is the process in?" usually refers to the mode, options, flags or present location of execution.
If you are a Turing machine then these are the same question; we have separated the ideas to give us handy abstractions to build on (like everything else in programming).
Let's think about state variables for a moment...
In many older languages you can alter state variables from whatever context you like, whether the modification of state is appropriate or not, because you manage this directly. In more modern languages this is a bit more restricted by imposing type declarations, scoping rules and public/private context to variables. This is really a rules arms-race, each language finding more ways to limit when assignment is permitted. If scheduling is the Prince of Frustration in concurrent programming, assignment is the Devil Himself. Hence the various cages built to manage him.
Erlang restricts the situations that assignment is permitted in a different way by setting the basic rule that assignment is only once per entry to a function, and functions are themselves the sole definition of procedural scope, and that all state is purely encapsulated by the executing process. (Think about the statement on scope to understand why many people feel that Erlang macros are a bad thing.)
These rules on assignment (use of state variables) encourage you to think of state as discrete slices of time. Every entry to a function starts with a clean slate, whether the function is recursive or not. This is a fundamentally different situation than the ongoing chaos of in-place modifications made from anywhere to anywhere in most other languages. In Erlang you never ask "what is the value of X right now?" because it can only ever be what it was initially assigned to be in the context of the current run of the current function. This significantly limits the chaos of state changes within functions and processes.
The details of those state variables and how they are assigned is incidental to Erlang. You already know about lists, tuples, ETS, DETS, mnesia, db connections, etc. Whatever. The core idea to understand about Erlang's style is how assignment is managed, not the incidental details of this or that particular data type.
What about "modes" and execution state?
If we write something like:
has_cheeseburger(BurgerName) ->
    receive
        {From, ask, burger_name} ->
            From ! {ok, BurgerName},
            has_cheeseburger(BurgerName);
        {From, new_burger, _SomeBurger} ->
            From ! {error, already_have_a_burger},
            has_cheeseburger(BurgerName);
        {From, eat_burger} ->
            From ! {ok, {ate, BurgerName}},
            lacks_cheeseburger()
    end.

lacks_cheeseburger() ->
    receive
        {From, ask, burger_name} ->
            From ! {error, no_burger},
            lacks_cheeseburger();
        {From, new_burger, BurgerName} ->
            From ! {ok, thanks},
            has_cheeseburger(BurgerName);
        {From, eat_burger} ->
            From ! {error, no_burger},
            lacks_cheeseburger()
    end.
What are we looking at? A loop. Conceptually it's just one loop. Quite often a programmer would choose to write just one loop in code and add an argument like IsHoldingBurger to the loop and check it after each message in the receive clause to determine what action to take.
Above, though, the idea of two operating modes is both more explicit (it's baked into the structure, not arbitrary procedural tests) and less verbose. We have separated the context of execution by writing basically the same loop twice, once for each condition we might be in, either having a burger or lacking one. This is at the heart of how Erlang deals with a concept called "finite state machines" and it's really useful. OTP includes a tool built around this idea in the gen_fsm module. You can write your own FSMs by hand as I did above or use gen_fsm -- either way, when you identify that you have a situation like this, writing code in this style makes reasoning much easier. (For anything but the most trivial FSM you will really appreciate gen_fsm.)
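For comparison, the same two-mode burger machine can be sketched with an OTP behaviour. The text names gen_fsm; in newer OTP releases gen_statem is its successor, so this sketch uses gen_statem instead. The module name burger_statem and the client functions ask/1, new_burger/2 and eat/1 are made up for the example.

```erlang
-module(burger_statem).
-behaviour(gen_statem).

-export([start_link/0, ask/1, new_burger/2, eat/1]).
-export([init/1, callback_mode/0, lacks_cheeseburger/3, has_cheeseburger/3]).

start_link() -> gen_statem:start_link(?MODULE, [], []).

ask(Pid)              -> gen_statem:call(Pid, ask).
new_burger(Pid, Name) -> gen_statem:call(Pid, {new_burger, Name}).
eat(Pid)              -> gen_statem:call(Pid, eat).

%% One callback function per state, mirroring the two hand-written loops.
callback_mode() -> state_functions.

init([]) -> {ok, lacks_cheeseburger, undefined}.

lacks_cheeseburger({call, From}, {new_burger, Name}, _Data) ->
    {next_state, has_cheeseburger, Name, [{reply, From, {ok, thanks}}]};
lacks_cheeseburger({call, From}, _Request, Data) ->
    {keep_state, Data, [{reply, From, {error, no_burger}}]}.

has_cheeseburger({call, From}, ask, Name) ->
    {keep_state, Name, [{reply, From, {ok, Name}}]};
has_cheeseburger({call, From}, {new_burger, _Other}, Name) ->
    {keep_state, Name, [{reply, From, {error, already_have_a_burger}}]};
has_cheeseburger({call, From}, eat, Name) ->
    {next_state, lacks_cheeseburger, undefined,
     [{reply, From, {ok, {ate, Name}}}]}.
```

The state name plays the role of "which loop are we in", and the data term plays the role of the BurgerName argument.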
Conclusion
That's it for state handling in Erlang. The chaos of untamed assignment is rendered impotent by the basic rules of single-assignment and absolute data encapsulation within each process (this implies that you shouldn't write gigantic processes, by the way). The supremely useful concept of a limited set of operating modes is abstracted by the OTP module gen_fsm or can be rather easily written by hand.
Since Erlang does such a good job limiting the chaos of state within a single process and makes the nightmare of concurrent scheduling among processes entirely invisible, that only leaves one complexity monster: the chaos of interactions among loosely coupled actors. In the mind of an Erlanger this is where the complexity belongs. The hard stuff should generally wind up manifesting there, in the no-man's-land of messages, not within functions or processes themselves. Your functions should be tiny, your needs for procedural checking relatively rare (compared to C or Python), your need for mode flags and switches almost nonexistent.
Edit
To reiterate Pascal's answer, in a super limited way:
loop(State) ->
    receive
        {async, Message} ->
            NewState = do_something_with(Message),
            loop(NewState);
        {sync, From, Message} ->
            NewState = do_something_with(Message),
            Response = process_some_response_on(NewState),
            From ! {ok, Response},
            loop(NewState);
        shutdown ->
            exit(shutdown);
        Any ->
            io:format("~p: Received: ~tp~n", [self(), Any]),
            loop(State)
    end.
Re-read tkowal's response for the most minimal version of this. Re-read Pascal's for an expansion of the same idea to include servicing messages. Re-read the above for a slightly different style of the same pattern of state handling with the addition of outputting unexpected messages. Finally, re-read the two-state loop I wrote above and you'll see it's actually just another expansion on this same idea.
Remember, you can't re-assign a variable within the same iteration of a function but the next call can have different state. That is the extent of state handling in Erlang.
These are all variations on the same thing. I think you're expecting there to be something more, a more expansive mechanism or something. There is not. Restricting assignment eliminates all the stuff you're probably used to seeing in other languages. In Python you do somelist.append(NewElement) and the list you had has now changed. In Erlang you do NewList = SomeList ++ [NewElement] (or lists:append(SomeList, [NewElement])) and SomeList is still exactly the same as it used to be, and a new list has been returned that includes the new element. Whether this actually involves copying in the background is not your problem. You don't handle those details, so don't think about them. This is how Erlang is designed, and that leaves single assignment and making fresh function calls to enter a fresh slice of time where the slate has been wiped clean again.
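A tiny sketch of that last point, with a made-up module name immutable_demo: appending produces a new list and leaves the original binding untouched.

```erlang
-module(immutable_demo).
-export([run/0]).

run() ->
    SomeList = [1, 2, 3],
    NewList = SomeList ++ [4],   % builds a new list; SomeList is unchanged
    {SomeList, NewList}.
```

Both lists are returned so that the caller can see the old value is still there next to the new one.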
The easiest way to maintain state is to use the gen_server behaviour. You can read more in Learn You Some Erlang and in the docs.
gen_server is a process that:
is initialised with a given state,
can define synchronous and asynchronous callbacks (synchronous for querying the data in "request-response" style and asynchronous for changing the state in "fire and forget" style).
It also has a couple of nice OTP mechanisms:
it can be supervised
it gives you basic logging
its code can be upgraded while the server is running without losing the state
and so on...
Conceptually gen_server is an endless loop that looks like this:
loop(State) ->
    NewState = handle_requests(State),
    loop(NewState).
where handle_requests receives messages. This way all requests are serialised, so there are no race conditions. Of course the real thing is a little more complicated, in order to give you all the goodies that I described.
You can choose which data structure you want to use for State. It is common to use records, because they have named fields, but since Erlang 17 maps can come in handy too. It depends on what you want to store.
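As an illustration of the description above, here is a minimal gen_server sketch that keeps a list of items as its state. The module name item_store and its client functions are assumptions for this example; the callbacks are the standard gen_server ones.

```erlang
-module(item_store).
-behaviour(gen_server).

-export([start_link/0, add/2, remove/2, items/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).

%% Asynchronous, "fire and forget" state changes.
add(Pid, Item)    -> gen_server:cast(Pid, {add, Item}).
remove(Pid, Item) -> gen_server:cast(Pid, {remove, Item}).

%% Synchronous, "request-response" query.
items(Pid) -> gen_server:call(Pid, items).

%% The state is simply the list of items, starting empty.
init(Items) -> {ok, Items}.

handle_call(items, _From, Items) ->
    {reply, Items, Items}.

%% Each callback returns the state to use for the next request.
handle_cast({add, Item}, Items) ->
    {noreply, [Item | Items]};
handle_cast({remove, Item}, Items) ->
    {noreply, [X || X <- Items, X =/= Item]}.
```

Every callback receives the current state and returns the next one, which is the loop(NewState) idea with the receive loop hidden inside the behaviour.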
Variables are not mutable, so when you want the state to evolve, you create a new variable and then call the same function again with this new state as its argument.
This structure is meant for server-like processes. There is no base case as in the usual factorial example; generally there is a specific message to stop the server cleanly.
loop(State) ->
    receive
        {add, Item} ->
            NewState = [Item | State],    % create a new variable
            loop(NewState);               % call loop again with the new variable
        {remove, Item} ->
            NewState = lists:filter(fun(X) -> X /= Item end, State),  % create a new variable
            loop(NewState);               % call loop again with the new variable
        {items, Pid} ->
            Pid ! {items, State},
            loop(State);
        stop ->
            stopped;                      % this is the stop condition
        _ ->
            loop(State)                   % ignoring other messages may be useful in a never-ending loop
    end.

Replace the associated pid (i.e. unregister and register) atomically

Suppose a Pid is registered as follows.
register(foobar, Pid).
Now I want to replace the associated pid:
unregister(foobar),
register(foobar, NewPid).
How can I achieve this atomically?
Use gproc, https://github.com/uwiger/gproc
The advantage is that its registry is an ETS table, and ETS tables have atomic updates, so you can overwrite a name atomically, which is what you want. I am almost positive it can do this kind of thing.
I don't think this is possible, at least, using the register/2 and unregister/1 BIFs.
You need to serialize requests to the registry, for example using a gen_server or an ETS table.
Also, consider the following. Registered names for processes are atoms, and atoms in the Erlang VM are limited in number and not garbage collected. If you're dynamically registering and unregistering a huge number of processes (e.g. one process per request), you might want to rethink this approach, since you might run out of atoms at some point.
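To illustrate the ETS idea from the answers above, here is a minimal hand-rolled registry sketch. It is not gproc and not a drop-in replacement for register/2; the module name name_registry and its functions are hypothetical. It relies on the fact that ets:insert/2 on a set table replaces an existing entry with the same key as a single atomic operation.

```erlang
-module(name_registry).
-export([init/0, set_name/2, lookup_name/1]).

%% Create the table. It is owned by the calling process and disappears
%% when that process exits.
init() ->
    ?MODULE = ets:new(?MODULE, [named_table, public, set]),
    ok.

%% Inserting into a set table overwrites any existing entry for Name,
%% so readers see either the old pid or the new one, never a gap.
set_name(Name, Pid) when is_pid(Pid) ->
    true = ets:insert(?MODULE, {Name, Pid}),
    ok.

lookup_name(Name) ->
    case ets:lookup(?MODULE, Name) of
        [{Name, Pid}] -> Pid;
        []            -> undefined
    end.
```

Names here can be any term, which also sidesteps the atom limit. Unlike the built-in registry, this sketch does not remove an entry when the process dies; that would need monitors or a library such as gproc.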

Is it possible to implement Pregel in Erlang without supersteps?

Let's say we implement Pregel with Erlang. Why do we actually need supersteps? Isn't it better to just send messages from one supervisor to processes that represent nodes? They could just apply the calculation function to themselves, send messages to each other and then send a 'done' message to the supervisor.
What is the whole purpose of supersteps in concurrent Erlang implementation of Pregel?
The SuperStep concept as espoused by the Pregel model can be viewed as a sort of barrier for entities executing in parallel. At the end of each superstep, each worker flushes its state to the persistent store.
The algorithm is check-pointed at the end of each SuperStep so that in case of failure, when a new node has to take over the function of a failed peer, it has a point to start from. Pregel guarantees that since the data of the node has been flushed to disk before the SuperStep started, it can reliably start from exactly that point.
It also, in a way, signifies the "progress" of the algorithm. A Pregel algorithm/job can be given a "max number of supersteps" after which the algorithm should terminate.
What you described in your question (a supervisor sending workers a calculation function and waiting for a "done") can definitely be implemented (although I don't think the current supervisor packaged with OTP can do that out of the box), but I guess the concept of a SuperStep is simply a requirement of the Pregel model. If, on the other hand, you were implementing something like a parallel mapper (like the one Joe implements in his book), you wouldn't need supersteps.
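The barrier and the "max number of supersteps" can be sketched with plain processes. This is a heavy simplification under assumptions: the module name superstep_demo is made up, the vertices do not exchange messages with each other, and the compute step is a placeholder that just adds 1. It only shows a coordinator that starts a step, waits for every vertex to report, and stops after MaxSteps.

```erlang
-module(superstep_demo).
-export([run/2]).

%% Run MaxSteps supersteps over one process per initial value and
%% return the values reported in the last completed superstep.
run(Values, MaxSteps) ->
    Coordinator = self(),
    Pids = [spawn_link(fun() -> vertex(Coordinator, V) end) || V <- Values],
    Final = steps(Pids, 1, MaxSteps, Values),
    [P ! stop || P <- Pids],
    Final.

steps(_Pids, Step, MaxSteps, Last) when Step > MaxSteps ->
    Last;
steps(Pids, Step, MaxSteps, _Last) ->
    [P ! {superstep, Step} || P <- Pids],
    %% Barrier: wait until every vertex has finished this superstep.
    Results = [receive {done, P, Step, V} -> V end || P <- Pids],
    %% A checkpoint of Results could be written to stable storage here.
    steps(Pids, Step + 1, MaxSteps, Results).

vertex(Coordinator, Value) ->
    receive
        {superstep, Step} ->
            NewValue = Value + 1,   % placeholder for the real compute function
            Coordinator ! {done, self(), Step, NewValue},
            vertex(Coordinator, NewValue);
        stop ->
            ok
    end.
```

Because the coordinator only moves on after collecting a done message from every vertex, the end of each call to steps/4 is a natural point for the checkpointing described above.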
