If an Erlang process proc1 is killed (exit(killed)) and another process proc2 is notified of this event because it is linked to proc1, is it possible to respawn a replacement process that gets the mailbox of the killed process proc1?
Not really. Processes don't share any data, so when one dies no other process can access any of its memory, and Erlang would not know how long to keep such a mailbox around before garbage-collecting it.
You could simulate that with a separate proxy process that just keeps track of the mailbox.
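A rough sketch of that proxy idea (all names here are made up, and the {done, Msg} acknowledgement protocol is an assumption): senders talk only to the proxy, which forwards each message to the current worker, keeps its own copy until the worker acknowledges it, and replays the backlog to a replacement worker if the current one dies.

proxy_start(WorkerFun) ->
    spawn(fun() ->
                  process_flag(trap_exit, true),
                  proxy_loop(spawn_link(WorkerFun), WorkerFun, [])
          end).

proxy_loop(Worker, WorkerFun, Pending) ->
    receive
        {'EXIT', Worker, _Reason} ->
            %% The worker died: start a replacement and replay what we still hold.
            NewWorker = spawn_link(WorkerFun),
            [NewWorker ! Msg || Msg <- lists:reverse(Pending)],
            proxy_loop(NewWorker, WorkerFun, Pending);
        {done, Msg} ->
            %% The worker confirms it has processed Msg; forget our copy.
            proxy_loop(Worker, WorkerFun, lists:delete(Msg, Pending));
        Msg ->
            %% Forward to the worker and remember the message until it is acknowledged.
            Worker ! Msg,
            proxy_loop(Worker, WorkerFun, [Msg | Pending])
    end.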
Or you could try to handle being killed a little more gracefully in proc1. With process_flag(trap_exit, true), the exit(killed) signal is delivered as a normal message {'EXIT', FromPid, killed}. When you receive it, you can drain the whole mailbox and then exit with all remaining messages included in the exit reason.
It could look somewhat like this:
init(Args) ->
    process_flag(trap_exit, true),
    ....  % continue process initialization

loop(State) ->
    receive
        ....
        {'EXIT', _FromPid, killed} ->
            exit({killed, all_messages()})
    end.
all_messages() ->
    all_messages([]).

all_messages(Messages) ->
    receive
        AnyMessage ->
            all_messages([AnyMessage | Messages])
    after 0 ->
        lists:reverse(Messages)
    end.
And proc2 would receive all the unprocessed messages and could send them again to the newly spawned process.
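A hypothetical sketch of that proc2 side: proc2 is linked to proc1 and traps exits; when it sees the {killed, Messages} exit reason it spawns a replacement (RestartFun is a made-up fun returning the new pid) and re-sends the unprocessed messages.

proc2_loop(Proc1, RestartFun) ->
    receive
        {'EXIT', Proc1, {killed, Messages}} ->
            NewProc1 = RestartFun(),          % start the replacement process
            link(NewProc1),                   % keep watching the new one too
            [NewProc1 ! Msg || Msg <- Messages],
            proc2_loop(NewProc1, RestartFun);
        Other ->
            io:format("proc2 got: ~p~n", [Other]),
            proc2_loop(Proc1, RestartFun)
    end.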
Related
I'm trying to connect a gen_server to another gen_server. During the connect, the servers need to monitor each other and know when the other server has crashed, either the entire node or the server process. After I do the first start_link and one of the servers crashes, the other server gets a message from the monitor in the code (the handle_info function is activated). But when it happens a second time, the monitor sends the information directly to the shell (the message does not go through handle_info and is only visible using flush() in the shell), and the server that was supposed to be alerted by the monitor doesn't receive any message.
My code on the sending side:
handle_call({connect, Node, Who}, _From, _State) ->
    case Who of
        cdot    -> ets:insert(address, {cdot, Node}),
                   ets:insert(address, {Node, cdot}),
                   monitor_node(Node, true);
        cact    -> ets:insert(address, {cact, Node}),
                   ets:insert(address, {Node, cdot}),
                   monitor_node(Node, true);
        ctitles -> ets:insert(address, {ctitles, Node}),
                   ets:insert(address, {Node, cdot}),
                   monitor_node(Node, true);
        _ -> ok
    end,
    [{_, Pid2}] = ets:lookup(?name_table3, pidGui),
    Pid2 ! {db, "Node " ++ atom_to_list(Who) ++ " connected"},  % print to the GUI which node was connected
    {reply, {{node(), self()}, connected}, node()};
And the one on the receiving side is:
connect() ->
    {{Node, Pid}, Connected} =
        gen_server:call(server_node(), {connect, node(), cact}),
    monitor_node(Node, true),
    monitor(process, Pid),
    Connected.
Please, can anyone tell me why this is happening? The same happens for both node and process monitoring.
If you get the second monitor message in the shell, it is because you call the connect function in the shell context.
Check how you call this function: it must be done in the server context, that is, inside a handle_call, handle_cast or handle_info function.
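For example (a hedged sketch: the do_connect request and the ?MODULE registration are made up, server_node/0 is taken from the question), the monitoring can be moved into the calling server's own callback so the resulting messages land in its handle_info/2:

connect() ->
    gen_server:call(?MODULE, do_connect).

handle_call(do_connect, _From, State) ->
    {{Node, Pid}, Connected} =
        gen_server:call(server_node(), {connect, node(), cact}),
    monitor_node(Node, true),      % this monitor now belongs to the server process
    erlang:monitor(process, Pid),  % so 'DOWN' / {nodedown, Node} arrive in handle_info/2
    {reply, Connected, State}.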
after im doing the first start_link and one of the servers crashes the other server gets a message from the monitor in the code, but when it happens for the second time
It sounds like you are starting a new server after a server crashes. Do you call monitor() on the new server Pid?
A monitor is triggered only once, after that it is removed from both monitoring process and the monitored entity. Monitors are fired when the monitored process or port terminates, does not exist at the moment of creation, or if the connection to it is lost. In the case with connection, we lose knowledge about the fact if it still exists or not. The monitoring is also turned off when demonitor/1 is called.
Even though I unregister sts, my spawned process is not stopped. How can I stop it without using gen_server?
start() ->
    case whereis(sts) of
        undefined ->
            PidA = spawn(dist_erlang, init, []),
            register(sts, PidA),
            {ok, PidA};
        _ ->
            {ok, whereis(sts)}
    end.
stop() ->
    case whereis(sts) of
        undefined ->
            already_stopped;
        _ ->
            unregister(sts),
            stopped
    end.
Using unregister does not stop the process. Stopping the process does, however, unregister it. So instead of using unregister here, use erlang:exit/2:
stop() ->
    case whereis(sts) of
        undefined ->
            already_stopped;
        Pid ->
            exit(Pid, kill),  % any reason other than normal works; a process
                              % that does not trap exits ignores a normal exit signal
            stopped
    end.
All that being said, you should really be using the OTP process behaviours (like gen_server), as they make process management much easier. With an OTP process, you can instead call the process and tell it to stop, so that when you get your reply it has already stopped. Otherwise your exit message may take some time to get through.
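For instance, assuming sts is registered as a gen_server, a stop that only returns once the process is really gone could use gen_server:stop/1 (available in OTP 18 and later), which waits for terminate/2 to finish:

stop() ->
    case whereis(sts) of
        undefined -> already_stopped;
        _Pid      -> gen_server:stop(sts), stopped  % returns only after the server has terminated
    end.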
unregister does not stop the process. It just removes the binding between the process id and the given atom.
You need to remember that the stop/0 function runs in the context of the process that called it, not in the gen_server itself. In practice, (almost) the only way to interact with another process is to send it a message. So you could implement your stop/0 function like this:
%% send stop message to the `sts` server
stop() ->
    gen_server:cast(sts, stop).

%% [...]

%% handle the stop message (this clause must come before the catch-all)
handle_cast(_Message = stop, State) ->
    {stop,
     _Reason = normal,
     State};  % returning {stop, ...} makes the server terminate
%% handle other cast messages
handle_cast(_OtherCastMessages, State) ->
    %% [...]
    {noreply, State}.

%% [...]

terminate(_Reason = normal, _State) ->
    %% could do some cleanup in this callback
    ok.
So to stop the server you have to return a special tuple from one of the behaviour callbacks. You can read more about this here. And of course, to trigger one of the behaviour callbacks, you have to send a message to your server with gen_server:cast or gen_server:call (or just send a plain message and handle it with handle_info). Which one to use is your decision. Finally, terminate/2 is called (no matter which callback returned the tuple with the stop atom), and there you can do some cleanup with your state.
Of course you could unregister your process in the terminate callback, but when the process dies, unregistration is handled automatically.
What is the most recommended way in Erlang to ensure that some process exists before sending a message / event to it? In my scenario I start the process upon the first occurrence of the message, and then it stays alive. When passing further messages, I first try to start a process with the same name to ensure it is running, something like this (using gen_fsm and a simple_one_for_one restart scenario):
%% DeviceId - process name
heartbeat(ApplicationKey, DeviceId, Timeout) ->
    ensure_session_started(ApplicationKey, DeviceId, Timeout),
    gen_fsm:send_event(DeviceId, {heartbeat, Timeout}).

ensure_session_started(ApplicationKey, DeviceId, Timeout) ->
    case session_server_sup:start_child(ApplicationKey, DeviceId, Timeout) of
        {ok, _Pid} -> {ok, running};
        {error, {already_started, _}} -> {ok, running};
        {error, Error} -> erlang:throw({error, {failed_to_start_session, Error}})
    end.
I believe this solution is not perfect and probably has some overhead, but I still believe it is less prone to race conditions than using erlang:is_process_alive. Am I right? Any ideas how to improve it?
You are right, the erlang:is_process_alive/1 approach is useless in this scenario because of the race condition.
Your example is workable and I have seen it in the wild a few times. Note that it does not guarantee that the message will be processed. To be sure of that, you need to monitor your receiver and get a confirmation from it. This is how it is done in gen_server:call/2.
With gen_fsm:sync_send_event/2,3 you can send events to your FSM and wait for a response. So you can tell your calling process that its message has been received.
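A hedged sketch of that synchronous variant (the state name active is an assumption): the caller blocks until the FSM has actually handled the heartbeat and replied, so delivery is confirmed.

heartbeat(ApplicationKey, DeviceId, Timeout) ->
    ensure_session_started(ApplicationKey, DeviceId, Timeout),
    ok = gen_fsm:sync_send_event(DeviceId, {heartbeat, Timeout}).

%% In the FSM module, the corresponding state callback replies to the caller:
active({heartbeat, Timeout}, _From, StateData) ->
    {reply, ok, active, StateData, Timeout}.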
This could be a very basic question, but is Erlang capable of calling a method on another process and waiting for it to respond back with some object type without sleeping threads?
Well, if you're waiting for an answer, the calling process will have to sleep eventually… But that's no big deal.
While processes are stuck in a receive, other processes can keep working. In fact, it's not uncommon to have thousands of processes just waiting for messages. And since Erlang processes are not true OS threads, they're very lightweight, so the performance cost is minimal.
In fact, the way sleep is implemented looks like:
sleep(Milliseconds) ->
    receive
        % Intentionally left empty
    after Milliseconds -> ok
    end.
Yes, it is possible to peek into the mailbox if that is what you mean. Say we have sent a message to another process and now we want to see if the other process has sent something back to us. But we don't want to block on the receive:
receive
    Pattern -> Body;
    Pattern2 -> Body2
after 0 ->
    AfterBody
end
will try to match Pattern and Pattern2 against the messages in the mailbox. If none matches, it will immediately time out and run AfterBody. This allows you to implement a non-blocking peek into the mailbox.
If the process is a gen_server, the same thing can be achieved by playing with the internal state and the Timeout value returned when a callback hands control back to the gen_server. Set a Timeout of 0 to achieve this.
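A minimal sketch of that gen_server pattern (do_something/2 and check_for_replies/1 are hypothetical helpers): the callback returns a 0 timeout, so control comes straight back via handle_info(timeout, ...), where the non-blocking check can happen.

handle_call(Request, _From, State) ->
    Reply = do_something(Request, State),   % hypothetical work function
    {reply, Reply, State, 0}.               % 0 ms timeout: handle_info(timeout, ...) runs next

handle_info(timeout, State) ->
    NewState = check_for_replies(State),    % hypothetical non-blocking check
    {noreply, NewState}.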
What I am getting from the question is that we are talking about synchronous message passing. Yes! Erlang can do this perfectly well; it's the most basic way of handling concurrency in Erlang. Consider this below:
rpc(Request, To) ->
    MySelf = self(),
    To ! {MySelf, Request},
    receive
        {To, Reply} -> Reply
    after timer:seconds(5) -> erlang:exit({error, timedout})
    end.
The code above shows a process sending a message to another and immediately going into a wait (for a reply) without having to sleep. If it does not get a reply within 5 seconds, it will exit.
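For completeness, a sketch of the receiving side of that protocol (handle_request/2 is a made-up handler): the server tags its reply with its own pid so the {To, Reply} pattern above matches.

server_loop(State) ->
    receive
        {From, Request} ->
            Reply = handle_request(Request, State),  % hypothetical handler
            From ! {self(), Reply},                  % tag the reply with our pid
            server_loop(State)
    end.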
After watching the Pragmatic Studio screencasts on Erlang, the last video on supervisors mentioned that in order for a supervisor to be notified about one of its children so it can properly restart it, the child should register with process_flag(trap_exit, true). Perhaps I just misunderstood the author (and chances are VERY high that I did), but I thought supervisors automagically know when their children die (probably through spawn_link or something similar in the background). Is this really necessary? When should one use process_flag(trap_exit, true) in a real-world case, given that the documentation explicitly states the following:
http://www.erlang.org/doc/man/erlang.html#process_flag-2
process_flag(trap_exit, Boolean)
When trap_exit is set to true, exit signals arriving to a process are converted to {'EXIT', From, Reason} messages, which can be received as ordinary messages. If trap_exit is set to false, the process exits if it receives an exit signal other than normal and the exit signal is propagated to its linked processes. Application processes should normally not trap exits.
You have 3 idioms:
1/ I don't care if my child process dies:
spawn(...)
2/ I want to crash if my child process crashes:
spawn_link(...)
3/ I want to receive a message if my child process terminates (normally or not):
process_flag(trap_exit, true),
spawn_link(...)
Please see this example and try different values (call inverse with 2, or with 0 to provoke an exception, and run with and without trap_exit):
-module(play).
-compile(export_all).

start() ->
    process_flag(trap_exit, true),
    spawn_link(?MODULE, inverse, [2]),
    loop().

loop() ->
    receive
        Msg -> io:format("~p~n", [Msg])
    end,
    loop().

inverse(N) -> 1/N.
Supervisors use links and trap exits so they can keep track of their children and restart them when necessary. Child processes do not have to trap exits to be properly managed by their supervisors; indeed, they should only trap exits when they specifically need to know that some process to which they are linked has died and they don't want to crash themselves.
The OTP behaviours handle supervision properly whether or not they trap exits.
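As a hedged illustration (my_worker is a made-up gen_server module), here is a minimal OTP supervisor whose child never touches trap_exit itself; the supervisor's own link/trap_exit machinery handles the restart.

-module(my_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% one_for_one: restart only the child that died, up to 5 times in 10 seconds
    {ok, {{one_for_one, 5, 10},
          [{my_worker,
            {my_worker, start_link, []},
            permanent, 5000, worker, [my_worker]}]}}.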
In Erlang, processes can be linked together. These links are bidirectional. Whenever a process dies, it sends an exit signal to all linked processes. Each of these processes has the trap_exit flag enabled or disabled. If the flag is disabled (the default), the linked process crashes as soon as it gets the exit signal. If the flag has been enabled by a call to process_flag(trap_exit, true), the process converts the received exit signal into an exit message and does not crash. The exit message is queued in its mailbox and treated as a normal message.
If you're using OTP supervisors, they take care of the trap_exit flags and details for you, so you don't have to care about it.
If you're implementing a supervision mechanism yourself, which is probably what the screencast is about (I haven't seen it), you will have to take care of the trap_exit handling.
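A bare-bones sketch of what that involves (child_module:child_start/0 is a made-up child entry point): trap exits, link to the child, and restart it when the exit signal arrives as a message.

simple_supervise() ->
    process_flag(trap_exit, true),
    supervise_loop(spawn_link(child_module, child_start, [])).

supervise_loop(Child) ->
    receive
        {'EXIT', Child, normal} ->
            ok;                                          % clean exit: stop supervising
        {'EXIT', Child, Reason} ->
            io:format("child died: ~p, restarting~n", [Reason]),
            supervise_loop(spawn_link(child_module, child_start, []))
    end.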