How to get the pid from a node name in Erlang?

I have a node, app01@mdiaz, and I need to know the pid (something like <2908.77.0>).

An Erlang node doesn't have one single pid: there are many processes running on each node, so you need to specify which one you want.
If you want to know the pid of the process registered with the name foo on the node bar@localhost, you can make an RPC call to erlang:whereis/1:
(foo@localhost)1> rpc:call(bar@localhost, erlang, whereis, [foo]).
<7120.56.0>
Though you might not need that: if you want to send a message to a named process on another node, you can use {Name, Node} instead of first getting the pid. For example, to send a message to the process called foo on bar@localhost:
{foo, bar@localhost} ! my_message
You can also go the other direction, getting the node name from a pid, with the node/1 function:
(foo@localhost)1> RemotePid = rpc:call(bar@localhost, erlang, whereis, [foo]).
<6928.32.0>
(foo@localhost)2> node(RemotePid).
bar@localhost

Related

Erlang: spawn a process and wait for termination without using `receive`

In Erlang, can I call some function f (BIF or not) whose job is to spawn a process, run the function argf I provide, and not "return" until argf has "returned", and do this without using a receive clause? (The reason is that f will be invoked in a gen_server, and I don't want to pollute the gen_server's mailbox.)
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F)`, but doesn't return until 10 seconds have passed
%% ...
The only way to communicate between processes is message passing (of course you could poll for a specific key in an ETS table or a file, but I don't like that).
If you use spawn_monitor/1 in f/1 to start the F process, and then have a receive block that only matches the possible system messages from this monitor:
f(F) ->
    {_Pid, MonitorRef} = spawn_monitor(F),
    receive
        {_Tag, MonitorRef, _Type, _Object, _Info} -> ok
    end.
you will not mess up your gen_server's mailbox. The example is the minimum code; you can add a timeout (fixed or passed as a parameter), execute some code on normal or error completion, and so on.
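A minimal sketch of that timeout variant (f/2, the Timeout parameter, and the timeout return value are illustrative choices, not fixed API):

f(F, Timeout) ->
    {Pid, MonitorRef} = spawn_monitor(F),
    receive
        {'DOWN', MonitorRef, process, Pid, _Reason} -> ok
    after Timeout ->
        %% Give up waiting: flush the monitor, then kill the worker.
        erlang:demonitor(MonitorRef, [flush]),
        exit(Pid, kill),
        timeout
    end.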
You will not "pollute" the gen_servers mailbox if you spawn+wait for message before you return from the call or cast. A more serious problem with this maybe that you will block the gen_server while you are waiting for the other process to terminate. A way around this is to not explicitly wait but return from the call/cast and then when the completion message arrives handle it in handle_info/2 and then do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process, then you can delay replying to the original call until the handle_info that handles the process termination message.
Note that however you do it, a gen_server:call has a timeout value, either implicit or explicit, and if no reply arrives within it, an error is generated in the calling process.
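A rough sketch of that deferred-reply pattern (the run request, do_work/0, and the map-based state are illustrative assumptions, not code from the question):

handle_call(run, From, State) ->
    {_Pid, Ref} = spawn_monitor(fun() -> exit({done, do_work()}) end),
    %% Don't reply yet; remember which caller is waiting on this worker.
    {noreply, State#{Ref => From}};

handle_info({'DOWN', Ref, process, _Pid, Reason}, State) ->
    case maps:take(Ref, State) of
        {From, NewState} ->
            %% Reply to the original gen_server:call now.
            gen_server:reply(From, Reason),
            {noreply, NewState};
        error ->
            {noreply, State}
    end.

The worker deliberately exits with {done, Result} so the result travels in the 'DOWN' message, and the caveat above still applies: if the worker outlives the caller's gen_server:call timeout, the caller crashes anyway.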
The main way to communicate between processes in the Erlang VM is message passing, with the erlang:send/2 or erlang:send/3 functions (aliased as !). But there are other mechanisms you can use to communicate between processes.
You can use erlang:link/1 to learn about the state of another process; it is mainly used to find out when a process dies, finishes, or fails with an exception or throw.
You can use erlang:monitor/2, which is similar to erlang:link/1 except that it is one-directional and the notification arrives as an ordinary 'DOWN' message in the monitoring process's mailbox (a link delivers an exit signal, which only shows up as a message if the receiver traps exits).
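A quick shell session illustrating the difference (the pids and references are whatever your shell happens to print):

1> {Pid1, Ref} = spawn_monitor(fun() -> exit(boom) end).
{<0.88.0>,#Ref<0.1298538867.2952790017.76026>}
2> flush().
Shell got {'DOWN',#Ref<0.1298538867.2952790017.76026>,process,<0.88.0>,boom}
ok
3> process_flag(trap_exit, true).
false
4> spawn_link(fun() -> exit(boom) end).
<0.92.0>
5> flush().
Shell got {'EXIT',<0.92.0>,boom}
ok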
You can also hack around Erlang and use indirect channels (shared ETS/DETS/Mnesia tables) or external ones (a database or the like). This is clearly not recommended and works against the Erlang philosophy, but you can do it.
It seems your problem can be solved with the supervisor behaviour. A supervisor supports several strategies for controlling its supervised processes:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify or create your own supervision strategy from scratch, or base it on supervisor_bridge.
So, to summarize, you need a process that waits for one or more terminating processes. This behaviour is supported natively by OTP, but you can also create your own model. To do that, you need to share the status of every started process, using a cache or a database, or by passing it along when your process is spawned. Something like this:
Fun = fun
          MyFun(ParentProcess, {result, Data})
            when is_pid(ParentProcess) ->
              ParentProcess ! {self(), Data};
          MyFun(ParentProcess, MyData)
            when is_pid(ParentProcess) ->
              %% do something with MyData to produce MyData2
              %% (do_something/1 is a placeholder)
              MyData2 = do_something(MyData),
              MyFun(ParentProcess, MyData2)
      end,
%% Capture self() before spawning: inside the spawned fun, self() is
%% the new process, not the caller.
Parent = self(),
spawn(fun() -> Fun(Parent, InitData) end).
EDIT: I forgot to add an example without send/receive. This one uses an ETS table to store every result from the lambda function. The ETS table is passed in when we spawn the process. To get a result, we can select data from this table; note that the key of each row is the pid of the process that produced it.
%% Note: ETS table identifiers are references (or atoms, for named
%% tables) in modern OTP, so there is no is_integer/1 guard here.
spawner(Ets, Fun, Args) when is_function(Fun, 2) ->
    spawn(fun() -> Fun(Ets, Args) end).

Fun = fun
          F(Ets, {result, Data}) ->
              ets:insert(Ets, {self(), Data});
          F(Ets, Data) ->
              %% do something here, producing the next state
              Data2 = Data,
              F(Ets, Data2)
      end.

spawn_monitor registered name in different module

I'm trying to monitor a process with a registered name in a different module than where the monitor code is placed. This is an assignment for school, which is why I'm not going to post all of my code. However, here's the outline:
module1:start() spawns a process and registers its name:
register(name, Pid = spawn(?MODULE, loop, [])), Pid.
The loop waits for messages. If the message is of the wrong type it crashes.
module2:start() should start the registered process in module1 and monitor it, restarting it if it's crashed. I've been able to get it working using:
spawn(?MODULE, loop, [module1:start()]).
Then in the loop function I use erlang:monitor(process, Pid).
This way of solving the problem means the registered process can crash before the monitoring starts. I've been looking at spawn_monitor, but haven't been able to get the monitoring to work. The latest I've tried is:
spawn(?MODULE, loop, [spawn_monitor(name, start, [])]).
It starts the registered process. I can send messages to it, but I can't seem to detect anything. In the loop function I have a receive block, where I try to pattern match {'DOWN', Ref, process, Pid, _Why}. I've tried using spawn_monitor in module1 instead of simply spawn, but I noticed no change. I've also been trying to solve this using links (as in spawn_link), but I haven't gotten that to work either.
Any suggestions? What am I monitoring, if I'm not monitoring the registered process?
Since this is a homework assignment, I won't give you a complete answer.
Generally, you need two loops, one in module1 to do the work, and one in module2 to supervise the work. You already have a module1:start/0 function that calls spawn to execute the module1:loop/0 function to do the work, but as you've stated, this leaves a window of vulnerability between the spawning of the process and its monitoring by module2 that you're trying to close. As a hint, you could change the start function to call spawn_monitor instead:
start() ->
    {Pid, Ref} = spawn_monitor(?MODULE, loop, []),
    register(name, Pid),
    {Pid, Ref}.
and your module2:start/0 function would then just call it like this:
start() ->
    {Pid, Ref} = module1:start(),
    receive
        {'DOWN', Ref, process, Pid, _Why} ->
            %% restart the module1 pid
            %% details left out intentionally
    end.
Note that this implies that module2:start/0 needs a loop of some sort to spawn and monitor the module1 pid, and restart it when necessary. I leave that to your homework efforts.
Also, using spawn_link instead of spawn_monitor is definitely worth exploring.

How to start multiple instances of the same module/function under the Supervisor behavior in erlang?

Given a module mymodule, how do I start it multiple times under the supervisor behavior?
I need, for example, 2 instances of the same process (mymodule) to be started concurrently. I called the child identifiers child1 and child2; they both point to the mymodule module that I want to start. I have specified two different functions to start each instance of the worker process mymodule (start_link1 and start_link2).
-module(my_supervisor).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, _Arg = []).

init([]) ->
    {ok, {{one_for_one, 10, 10},
          [{child1,
            {mymodule, start_link1, []},
            permanent,
            10000,
            worker,
            [mymodule]},
           {child2,
            {mymodule, start_link2, []},
            permanent,
            10000,
            worker,
            [mymodule]}]}}.
The worker has two distinct start_link functions (start_link1 and start_link2) for testing purposes:
-module(mymodule).
-behaviour(gen_server).

start_link1() ->
    log_something("at link 1"),
    gen_server:start_link({global, child1}, ?MODULE, [], []).

start_link2() ->
    log_something("at link 2"),
    gen_server:start_link({global, child2}, ?MODULE, [], []).

init([]) ->
    ....
With the above I can see the message "at link 1" in my log, but "at link 2" does not appear anywhere. The instance started via start_link1 also does not do anything: apparently it just dies.
The only scenario that works is when the name "child1" matches the worker module name "mymodule".
As @MilleBessö asks: are you trying to start two processes which have the same registered name? Does mymodule:start_link register the mymodule process under a fixed name? If so, then trying to start a second one will cause a clash. Or are you trying to start multiple my_supervisor supervisors? Then you will also get a name clash. You have not included the code for mymodule.
Remember you can only have one process registered under a name. This holds for both local registered processes and those registered using global.
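For example, in the shell (the pid shown is illustrative):

1> register(my_name, self()).
true
2> register(my_name, spawn(fun() -> receive stop -> ok end end)).
** exception error: bad argument
     in function  register/2
        called as register(my_name,<0.90.0>)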
EDIT: Does the supervisor die as well?
A gen_server, like all other behaviours, isn't considered to be properly started until its init callback has completed and returned a correct value ({ok,State}). So if there is an error in mymodule:init/1, this will crash the child process before it has been initialised, and the supervisor will give up. While a supervisor can and will restart children when they die, it does require that they all start correctly. From the documentation for supervisor:start_link/3:
If the supervisor and its child processes are successfully created (i.e. if all child process start functions return {ok,Child}, {ok,Child,Info}, or ignore) the function returns {ok,Pid}, where Pid is the pid of the supervisor. If there already exists a process with the specified SupName the function returns {error,{already_started,Pid}}, where Pid is the pid of that process.
If Module:init/1 returns ignore, this function returns ignore as well and the supervisor terminates with reason normal. If Module:init/1 fails or returns an incorrect value, this function returns {error,Term} where Term is a term with information about the error, and the supervisor terminates with reason Term.
I don't know if this is the problem, but it gives the same behaviour as you are seeing.
Check the docs for supervisor:start_link(). The first parameter you pass in here is the name it uses to register with global, which provides a global name -> pid lookup. Since this must be unique, your second process fails to start, since the name is already taken.
Edit: Here is link to the docs: http://erldocs.com/R15B/stdlib/supervisor.html?i=5&search=supervisor:start#start_link/3
Check also the simple_one_for_one supervisor restart strategy. It allows you to start multiple processes with the same child specification in a more automated way.
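A sketch of what that could look like here (assuming mymodule:start_link/1 accepts the registration name as an argument; the restart values are illustrative):

init([]) ->
    {ok, {{simple_one_for_one, 10, 10},
          [{mymodule,
            {mymodule, start_link, []},
            permanent,
            10000,
            worker,
            [mymodule]}]}}.

Each instance is then started dynamically, with its extra arguments appended to the child spec's argument list:

supervisor:start_child(my_supervisor, [child1]),
supervisor:start_child(my_supervisor, [child2]).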

What should I use for service location in erlang?

I'm writing my first distributed Erlang application, and I notice that I have to know the node on which my "service" is running. How can I send requests to my service without knowing which node it is running on?
Basically I want to do something like this:
ReferenceToTheServiceProcess = locate(my_service).
ReferenceToTheServiceProcess ! {request, Stuff}.
Or something else to the equivalent effect (loose coupling).
Thanks!
You could register your service process under a global name, for example using gproc. That way you don't have to know which node your service currently resides on, and your code would look pretty much like you wanted.
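A rough sketch with gproc (assuming the gproc application is running and set up for global registration; {n, g, Name} denotes a globally unique name):

%% In the service process:
true = gproc:reg({n, g, my_service}),

%% In a client, on any node in the cluster:
Pid = gproc:where({n, g, my_service}),   %% a pid, or undefined
gproc:send({n, g, my_service}, {request, Stuff}).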
You can register the process using the global module.
From your service process call:
global:register_name(my_service, self()).
To send a message to the globally registered process call:
Pid = global:whereis_name(my_service),
Pid ! {request, Stuff}.
or call:
global:send(my_service, {request, Stuff}).
The registration functionality is atomic.
If the service process terminates or the node goes down, the name will be globally unregistered.
Or, if you still want local registration of the process, you could write something like:
where_is_service(Service) ->
    lists:filter(fun(X) -> X =/= undefined end,
                 [rpc:call(Node, erlang, whereis, [Service])
                  || Node <- [node() | nodes()]]).
This function returns the pids of all processes registered locally (on each node) as Service.
After this, if you just want the first running process, use the list returned by the function:
send_msg_to_service(Service, Message) ->
    case where_is_service(Service) of
        [APid | _] -> APid ! Message, ok;
        [] -> {error, service_not_running}
    end.
Hope this helps!

Erlang Takeover failing after successful Failover

I have an application distributed over 2 nodes. When I halt() the first node, the failover works perfectly, but (sometimes?) when I restart the first node, the takeover fails and the application crashes, since start_link returns already_started.
SUPERVISOR REPORT <0.60.0> 2009-05-20 12:12:01
===============================================================================
Reporting supervisor {local,twitter_server_supervisor}
Child process
    errorContext   start_error
    reason         {already_started,<2415.62.0>}
    pid            undefined
    name           tag1
    start_function {twitter_server,start_link,[]}
    restart_type   permanent
    shutdown       10000
    child_type     worker
ok
My app:

start(_Type, Args) ->
    twitter_server_supervisor:start_link(Args).

stop(_State) ->
    ok.
My supervisor:

start_link(Args) ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, Args).
Both nodes are using the same sys.config file.
What am I not understanding about this process, such that the above does not work?
It seems like your problem stems from the twitter server supervisor trying to start one of its children, since the error report complains about the child with start_function
{twitter_server,start_link,[]}
And since you are not showing that code, I can only guess that it is trying to register a name for itself, but there is already a process registered with that name.
Guessing even more: the reason shows a Pid, the Pid that already holds the name we tried to grab for ourselves:
{already_started,<2415.62.0>}
The Pid there has a non-zero initial integer; if it were zero, it would mean a local process. From this I deduce that you are trying to register a global name, and you are connected to another node where a process is already globally registered under that name.
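One hedged workaround (an illustrative sketch, not code from the question) is to have the child's start function treat an existing global registration as success, so the restarted node doesn't crash its supervisor during takeover:

start_link() ->
    case gen_server:start_link({global, ?MODULE}, ?MODULE, [], []) of
        {ok, Pid} ->
            {ok, Pid};
        {error, {already_started, Pid}} ->
            %% Already running on the other node; link the supervisor
            %% to it so a later exit is still noticed.
            link(Pid),
            {ok, Pid}
    end.

Whether that is appropriate depends on how your takeover is meant to hand things over; the cleaner fix is usually to handle the {takeover, Node} start type explicitly in your application's start/2 callback.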
