Replace the associated pid (i.e. unregister and register) atomically - erlang

Suppose a Pid is registered as follows.
register(foobar, Pid).
Now I want to replace the associated pid:
unregister(foobar),
register(foobar, NewPid).
How can I achieve this atomically?

Use gproc, https://github.com/uwiger/gproc
The advantage is that its registry is an ETS table, and ETS updates are atomic, so you can overwrite a name in a single atomic operation, which is what you want. I am almost positive it can do this kind of thing.

I don't think this is possible, at least not with the register/2 and unregister/1 BIFs.
You need to serialize requests to the registry, for example using a gen_server or an ETS table.
Also, consider the following. Registered names for processes are atoms, and atoms in the Erlang VM are limited in number and not garbage collected. If you're dynamically registering and unregistering a huge number of processes (e.g. one process per request), you might want to rethink this approach, since you might run out of atoms at some point.
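As a minimal sketch of the ETS route, assuming you are free to look names up through your own module instead of whereis/1: the module name name_registry and its functions are hypothetical, not part of OTP or gproc. It relies on ets:insert/2 replacing an existing object with the same key in a set table as a single atomic operation.

```erlang
-module(name_registry).
-export([new/0, register_name/2, whereis_name/1]).

%% Hypothetical helper module, for illustration only.
%% The table is owned by the process that calls new/0 and
%% disappears when that process exits.
new() ->
    ets:new(?MODULE, [named_table, public, set, {read_concurrency, true}]).

%% In a set table, inserting an object whose key already exists
%% replaces the old object atomically, so readers see either the
%% old pid or the new one, never a missing entry.
register_name(Name, Pid) when is_pid(Pid) ->
    true = ets:insert(?MODULE, {Name, Pid}),
    ok.

%% Returns the registered pid, or undefined if the name is unknown.
whereis_name(Name) ->
    case ets:lookup(?MODULE, Name) of
        [{Name, Pid}] -> Pid;
        [] -> undefined
    end.
```

Replacing the pid is then a single call, name_registry:register_name(foobar, NewPid). Note that this sketch does not remove entries when a registered process dies; that would need monitors in an owning process.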

Related

How to restart Erlang Supervised function with parameters?

I'm learning Erlang, and have a Supervisor question...
I have a function which requires 3 parameters (string, string, number). I want to Supervise that and make sure that if it fails, it gets restarted with the 3 parameters I passed it.
Is this something a Supervisor can handle, or do I need to look into some other concept?
Thanks.
Update 1/23/2016
One thing I want to mention... I have a list of 1439 entries. I need to create a Supervisor for each entry in that list. Each entry will incur different arguments. For example, here's some pseudo-code (reminiscent of Ruby):
(360..1799).each do |index|
  export(output_path, table_name, index) # Supervise this
end
This is triggered by a user interaction at runtime. The output_path and table_name are dynamic too, but won't change for a given batch. Unravelled, a run may look something like this:
export("/output/2016-01-23/", "temp1234", 360)
export("/output/2016-01-23/", "temp1234", 361)
export("/output/2016-01-23/", "temp1234", 362)
.
.
So if 361 fails, I need it restarted with "/output/2016-01-23/", "temp1234", and 361.
Is this something I can do with a Supervisor?
Yes, this is what supervisor does, but you mean "arguments", not "parameters".
For ordinary (not simple_one_for_one) supervisors, your init/1 implementation returns a list of so-called child specifications. Each one specifies a child that will be spawned, and the arguments to pass are provided as part of that child specification.
With simple_one_for_one supervisors you still have to provide a child specification, but you provide the arguments when you start each supervised child.
In all cases your supervised children will be restarted with the same arguments.
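A minimal sketch of the simple_one_for_one variant, assuming OTP 18 or later (for map-based supervisor flags and child specs). The module name export_sup, the helper start_worker/3 and the placeholder export/3 are hypothetical stand-ins for your own code, and the restart intensity values are arbitrary.

```erlang
-module(export_sup).
-behaviour(supervisor).
-export([start_link/0, start_export/3]).
-export([init/1, start_worker/3]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% The list given here is appended to the argument list in the
%% child spec, and the supervisor reuses it on every restart.
start_export(OutputPath, TableName, Index) ->
    supervisor:start_child(?MODULE, [OutputPath, TableName, Index]).

init([]) ->
    %% Arbitrary limits: more than 5 restarts within 10 seconds
    %% makes the supervisor itself give up.
    SupFlags = #{strategy => simple_one_for_one, intensity => 5, period => 10},
    ChildSpec = #{id => export_worker,
                  start => {?MODULE, start_worker, []},
                  restart => transient,   % restart only after a crash
                  shutdown => brutal_kill,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

%% Start function for one child; it must return {ok, Pid}.
start_worker(OutputPath, TableName, Index) ->
    Pid = proc_lib:spawn_link(fun() -> export(OutputPath, TableName, Index) end),
    {ok, Pid}.

%% Placeholder for the real export work.
export(OutputPath, TableName, Index) ->
    io:format("exporting ~s index ~p to ~s~n", [TableName, Index, OutputPath]).
```

A batch could then be started with something like [export_sup:start_export("/output/2016-01-23/", "temp1234", I) || I <- lists:seq(360, 1799)]. With restart set to transient, a worker that finishes normally is not restarted, while one that crashes is started again with the same three arguments.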
General Concepts
Erlang systems are divided into modules.
Each module is composed of functions.
Work is done by processes, which use functions to do it.
A supervisor is a process which supervises other processes.
So your function will certainly be executed inside a process, and you can have that process started and supervised by a supervisor. When your function fails and crashes, the process that is executing it crashes as well. When the process crashes, its supervisor finds out and acts on it according to a predefined restart strategy. The strategies are one_for_one, one_for_all and rest_for_one.
You can check its manual for further options.

Should a mapping of Id to Pid be stored in an ets table, or the gen_server's state?

I'm building an OTP application which follows a pattern similar to one described on trapexit, where I implement a non-blocking gen_server using gen_server:call/3 to initiate a transaction with a backend and store a mapping of transaction id to the From pid. When the gen_server receives a message from the backend, it extracts the transaction id and uses this mapping to look up the correct pid, which it forwards the message to.
In the trapexit example, this mapping is implemented using ets; however, I found that having the gen_server's state contain a dict with these mappings is a very natural solution.
For my particular use case the mapping will contain, at most, 200 entries.
Which implementation is recommended?
Thanks in advance!
200 is enough to have some impact on performance compared to ets (probably one order of magnitude or less). The real question you must ask yourself is "Do I need this extra performance or will this be sufficient?".
If performance isn't an issue use the dict.
The functional approach is to keep your private data in state. One practical consideration against having very large state data (which yours does not appear to be), however, is that it will get dumped in a crash log.
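A minimal sketch of the dict-in-state approach, under stated assumptions: the module name txn_server and the message shape {backend_reply, TxnId, Result} are hypothetical, and the backend is simulated by the server sending itself a message. The waiting caller is answered with gen_server:reply/2 once the matching backend message arrives.

```erlang
-module(txn_server).
-behaviour(gen_server).
-export([start_link/0, request/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

request(Payload) ->
    gen_server:call(?MODULE, {request, Payload}).

%% The state is a dict mapping transaction id -> From.
init([]) ->
    {ok, dict:new()}.

handle_call({request, Payload}, From, Pending) ->
    TxnId = make_ref(),
    %% Assumption: a real backend would be contacted here. The
    %% sketch simulates its answer with a message to ourselves.
    self() ! {backend_reply, TxnId, {echo, Payload}},
    %% Return without replying, so the server is not blocked.
    {noreply, dict:store(TxnId, From, Pending)}.

handle_cast(_Msg, Pending) ->
    {noreply, Pending}.

handle_info({backend_reply, TxnId, Result}, Pending) ->
    case dict:find(TxnId, Pending) of
        {ok, From} ->
            gen_server:reply(From, Result),
            {noreply, dict:erase(TxnId, Pending)};
        error ->
            %% Unknown or already answered transaction: ignore it.
            {noreply, Pending}
    end;
handle_info(_Other, Pending) ->
    {noreply, Pending}.
```

Swapping the dict for an ETS table would only change the store, find and erase calls, so it is cheap to start with the dict and measure before deciding.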

supervisor start multiple children as atomic operation

I need to start multiple supervisor children in an atomic way. That is, if one of the children in the group fails at startup, then none of them should be started.
I see this operation as a function:
start_children(SupRef, ChildSpecs)
ChildSpecs = List :: [child_spec()]
How should I implement this in a proper way? Any examples, libraries etc.? My intuition tells me that starting all children from the list, checking if all of them were successful and then killing remaining ones is not the way.
Or perhaps my design is flawed and I really should not need to do such things?
OTP's supervisor provides support for this with the one_for_all strategy. By default, if one process fails, all processes are restarted, but you can change this by choosing a Restart value that suits your purpose (e.g. temporary).
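One way to get the all-or-nothing start, sketched under the assumption that the group can live under its own dedicated supervisor: pass the whole list of child specs to init/1. As documented for supervisor, if a child fails to start during supervisor startup, the supervisor terminates the children it has already started and then terminates itself. The module name group_sup and the function start_group/1 are hypothetical, and the map-based flags assume OTP 18 or later.

```erlang
-module(group_sup).
-behaviour(supervisor).
-export([start_group/1]).
-export([init/1]).

%% Starts one supervisor whose children are exactly ChildSpecs.
%% If any child fails to start, no supervisor and no children
%% are left running.
start_group(ChildSpecs) ->
    supervisor:start_link(?MODULE, ChildSpecs).

init(ChildSpecs) ->
    %% Arbitrary restart limits; the restart behaviour of each
    %% child comes from its own child spec.
    SupFlags = #{strategy => one_for_all, intensity => 1, period => 5},
    {ok, {SupFlags, ChildSpecs}}.
```

Because start_link/2 links the new supervisor to the caller, a failed start is also delivered to the caller as an exit signal, so a caller that wants to handle the {error, Reason} return value would need to trap exits.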

Is it possible implement Pregel in Erlang without supersteps?

Let's say we implement Pregel with Erlang. Why do we actually need supersteps? Isn't it better to just send messages from one supervisor to processes that represent nodes? They could just apply the calculation function to themselves, send messages to each other and then send a 'done' message to the supervisor.
What is the whole purpose of supersteps in concurrent Erlang implementation of Pregel?
The superstep concept espoused by the Pregel model can be viewed as a kind of barrier for entities executing in parallel. At the end of each superstep, each worker flushes its state to the persistent store.
The algorithm is checkpointed at the end of each superstep so that, in case of failure, when a new node has to take over the function of a failed peer, it has a point to start from. Pregel guarantees that since the data of the node has been flushed to disk before the superstep started, it can reliably start from exactly that point.
It also, in a way, signifies the "progress" of the algorithm. A Pregel algorithm/job can be given a maximum number of supersteps, after which it should terminate.
What you describe in your question (a supervisor sending workers a calculation function and waiting for a "done" message) can definitely be implemented (although I don't think the supervisor that ships with OTP can do that out of the box), but I guess the concept of a superstep is simply a requirement of the Pregel model. If, on the other hand, you were implementing something like a parallel mapper (like the one Joe implements in his book), you wouldn't need supersteps.
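The barrier idea can be sketched in plain Erlang as a coordinator that does not start superstep N+1 until every worker has reported done for superstep N. This is only an illustration of the barrier and the step limit, not a Pregel implementation: the module name superstep_demo is hypothetical, and vertex computation, message passing between vertices and checkpointing are left out.

```erlang
-module(superstep_demo).
-export([run/2]).

%% Runs MaxSteps supersteps over NumWorkers worker processes.
run(NumWorkers, MaxSteps) ->
    Coordinator = self(),
    Workers = [spawn_link(fun() -> worker(Coordinator) end)
               || _ <- lists:seq(1, NumWorkers)],
    loop(Workers, 1, MaxSteps).

loop(Workers, Step, MaxSteps) when Step > MaxSteps ->
    [W ! stop || W <- Workers],
    {finished, MaxSteps};
loop(Workers, Step, MaxSteps) ->
    [W ! {superstep, Step} || W <- Workers],
    %% Barrier: wait for a done message from every worker for this
    %% step. A real system would checkpoint state at this point.
    [receive {done, W, Step} -> ok end || W <- Workers],
    loop(Workers, Step + 1, MaxSteps).

worker(Coordinator) ->
    receive
        {superstep, Step} ->
            %% The per-vertex computation would happen here.
            Coordinator ! {done, self(), Step},
            worker(Coordinator);
        stop ->
            ok
    end.
```

Calling superstep_demo:run(3, 2) drives three workers through two supersteps and returns {finished, 2}.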

Using ets:foldl as a poor man's forEach on every record

Short version: is it safe to use ets:foldl to delete every ETS record as one is iterating through them?
Suppose an ETS table is accumulating information and now it's time to process it all. A record is read from the table, used in some way, then deleted. (Also, assume the table is private, so no concurrency issues.)
In another language, with a similar data structure, you might use a for...each loop, processing every record and then deleting it from the hash/dict/map/whatever. However, the ets module does not have foreach as e.g. lists does.
But this might work:
1> ets:new(ex, [named_table]).
ex
2> ets:insert(ex, {alice, "high"}).
true
3> ets:insert(ex, {bob, "medium"}).
true
4> ets:insert(ex, {charlie, "low"}).
true
5> ets:foldl(fun({Name, Adjective}, DontCare) ->
       io:format("~p has a ~p opinion of you~n", [Name, Adjective]),
       ets:delete(ex, Name),
       DontCare
     end, notused, ex).
bob has a "medium" opinion of you
alice has a "high" opinion of you
charlie has a "low" opinion of you
notused
6> ets:info(ex).
[...
{size,0},
...]
7> ets:lookup(ex, bob).
[]
Is this the preferred approach? Is it at least correct and bug-free?
I have a general concern about modifying a data structure while processing it, however the ets:foldl documentation implies that ETS is pretty comfortable with you modifying records inside foldl. Since I am essentially wiping the table clean, I want to be sure.
I am using Erlang R14B with a set table; however, I'd like to know if there are any caveats with any Erlang version, or any type of table, as well. Thanks!
Your approach is safe. The reason it is safe is that ets:foldl/3 internally uses ets:first/1, ets:next/2 and ets:safe_fixtable/2. These give the guarantee you want, namely that you can delete elements and still get a full traversal. See the CONCURRENCY section of erl -man ets.
For your removal of all elements from the table, there is a simpler one-liner however:
ets:match_delete(ex, '_').
although it doesn't work should you want to do the IO-formatting for each row in which case your approach with foldl is probably easier.
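For illustration, the same process-then-delete traversal can be written out by hand with the functions named above. This is a sketch, not the actual implementation of ets:foldl/3; the module name ets_drain and the function drain/2 are hypothetical.

```erlang
-module(ets_drain).
-export([drain/2]).

%% Calls Fun on every object in Tab, deletes each key once it has
%% been processed, and returns the number of keys handled.
drain(Tab, Fun) ->
    %% Fixing the table keeps first/next traversal safe while
    %% objects are being deleted.
    ets:safe_fixtable(Tab, true),
    try
        drain_from(Tab, ets:first(Tab), Fun, 0)
    after
        ets:safe_fixtable(Tab, false)
    end.

drain_from(_Tab, '$end_of_table', _Fun, Count) ->
    Count;
drain_from(Tab, Key, Fun, Count) ->
    %% Find the successor before removing the current key.
    Next = ets:next(Tab, Key),
    lists:foreach(Fun, ets:lookup(Tab, Key)),
    true = ets:delete(Tab, Key),
    drain_from(Tab, Next, Fun, Count + 1).
```

With the table from the question, ets_drain:drain(ex, fun({Name, Adjective}) -> io:format("~p has a ~p opinion of you~n", [Name, Adjective]) end) would print each row and leave the table empty.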
For cases like this we will alternate between two tables or just create a new table every time we start processing. When we want to start a processing cycle we switch the writers to start using the alternate or new table, then we do our processing and clear or delete the old table.
We do this because there might otherwise be concurrent updates to a tuple that we might miss. We're working with high frequency concurrent counters when we use this technique.
