What is the recommended way in Erlang to ensure that a process exists before sending a message / event to it? In my scenario I start the process upon the first occurrence of a message, and then it stays alive. While passing further messages, I first try to start a process with the same name to ensure it is started, something like this (using gen_fsm and a simple-one-for-one restart scenario):
%% DeviceId - process name
heartbeat(ApplicationKey, DeviceId, Timeout) ->
    ensure_session_started(ApplicationKey, DeviceId, Timeout),
    gen_fsm:send_event(DeviceId, {heartbeat, Timeout}).

ensure_session_started(ApplicationKey, DeviceId, Timeout) ->
    case session_server_sup:start_child(ApplicationKey, DeviceId, Timeout) of
        {ok, _Pid} -> {ok, running};
        {error, {already_started, _}} -> {ok, running};
        {error, Error} -> erlang:throw({error, {failed_to_start_session, Error}})
    end.
I believe this solution is not perfect and probably has some overhead, but I still believe it is less prone to race conditions than using erlang:is_process_alive/1. Am I right? Any ideas how to improve it?
You are right: the erlang:is_process_alive/1 approach is useless in this scenario because of the race condition.
Your example is workable, and I have seen it in the wild a few times. Note that it does not guarantee that the message will be processed. To be sure of that, you need to monitor your receiver and get a confirmation from it. This is how it is done in gen_server:call/2.
With gen_fsm:sync_send_event/2,3 you can send events to your FSM and wait for a response, so you can tell your calling process that its message has been received.
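For completeness, a minimal sketch of the synchronous variant. The module, state, and function names here are hypothetical (not from the question), and gen_fsm is deprecated in modern OTP in favour of gen_statem, but the mechanics are the same: the state callback receives an extra From argument and replies.

```erlang
-module(session_fsm).
-behaviour(gen_fsm).
-export([start_link/1, heartbeat_sync/2]).
-export([init/1, active/3, handle_event/3, handle_sync_event/4,
         handle_info/3, terminate/3, code_change/4]).

start_link(DeviceId) ->
    gen_fsm:start_link({local, DeviceId}, ?MODULE, [], []).

%% Caller side: blocks until the FSM replies (5000 ms default timeout).
heartbeat_sync(DeviceId, Timeout) ->
    gen_fsm:sync_send_event(DeviceId, {heartbeat, Timeout}).

init([]) -> {ok, active, #{}}.

%% FSM side: a sync event callback must reply to the caller.
active({heartbeat, _Timeout}, _From, StateData) ->
    {reply, {ok, received}, active, StateData}.

handle_event(_Event, StateName, StateData) -> {next_state, StateName, StateData}.
handle_sync_event(_Event, _From, StateName, StateData) ->
    {reply, ok, StateName, StateData}.
handle_info(_Info, StateName, StateData) -> {next_state, StateName, StateData}.
terminate(_Reason, _StateName, _StateData) -> ok.
code_change(_OldVsn, StateName, StateData, _Extra) -> {ok, StateName, StateData}.
```

The caller now knows the event was received (though not necessarily fully acted upon) before it proceeds.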
I'm trying to connect a gen_server to another gen_server. During the connect, the servers need to monitor each other and know when the other server has crashed, either the entire node or the server process. After the first start_link, when one of the servers crashes, the other server gets a message from the monitor in the code (the handle_info function is activated). But when it happens the second time, the monitor sends the information directly to the shell (the message does not go through handle_info and is only visible using flush() inside the shell), and the server that was supposed to be alerted by the monitor doesn't receive any message.
My code on the sending side:
handle_call({connect, Node, Who}, _From, _State) ->
    case Who of
        cdot ->
            ets:insert(address, {cdot, Node}),
            ets:insert(address, {Node, cdot}),
            monitor_node(Node, true);
        cact ->
            ets:insert(address, {cact, Node}),
            ets:insert(address, {Node, cdot}),
            monitor_node(Node, true);
        ctitles ->
            ets:insert(address, {ctitles, Node}),
            ets:insert(address, {Node, cdot}),
            monitor_node(Node, true);
        _ -> ok
    end,
    [{_, Pid2}] = ets:lookup(?name_table3, pidGui),
    %% print to the GUI which node was connected
    Pid2 ! {db, "Node " ++ atom_to_list(Who) ++ " connected"},
    {reply, {{node(), self()}, connected}, node()};
And the one on the receiving side:
connect() ->
    {{Node, Pid}, Connected} =
        gen_server:call(server_node(), {connect, node(), cact}),
    monitor_node(Node, true),
    monitor(process, Pid),
    Connected.
Can anyone tell me why this is happening?
The same happens for both node and process monitoring.
If you get the second monitor message in the shell, it is because you call the connect function in the shell context.
Check how you call this function: it must be done in the server context, that is, inside a handle_call, handle_cast or handle_info function.
after the first start_link, when one of the servers crashes, the
other server gets a message from the monitor in the code, but when it
happens the second time
It sounds like you are starting a new server after a server crashes. Do you call erlang:monitor/2 on the new server's Pid?
A monitor is triggered only once, after that it is removed from
both monitoring process and the monitored entity. Monitors are fired
when the monitored process or port terminates, does not exist at the
moment of creation, or if the connection to it is lost. In the case
with connection, we lose knowledge about the fact if it still exists
or not. The monitoring is also turned off when demonitor/1 is called.
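A small self-contained sketch of that one-shot behaviour (the module name is made up for illustration): the monitor below fires exactly once, so to keep watching a restarted peer you must call erlang:monitor/2 again on the new pid.

```erlang
-module(remonitor_demo).
-export([watch/1]).

%% Spawn a short-lived process and monitor it. The 'DOWN' message is
%% delivered once, and then the monitor is gone; watching a replacement
%% process requires a fresh erlang:monitor/2 call on the new pid.
watch(Fun) ->
    Pid = spawn(Fun),
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {down, Reason}
    after 1000 -> timeout
    end.
```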
I have a gen_server that mimics a gen_fsm (don't ask me why...). A gen_server:call causes this gen_server to transfer from the current state to the next state; if no gen_server:call is invoked within a certain time, the gen_server terminates.
To be more specific, the life of the gen_server looks like this:
state_1 -> state_2 -> ... -> state_N -> terminated
When the server is at state_i and no gen_server:call is invoked on it within t_i seconds, the server goes to the terminated state. Of course, this is achieved by using {reply, Reply, NewState, t_i} as the return value of handle_call/3.
The problem with this method is that I cannot retrieve information from this gen_server, because to do that I need to invoke gen_server:call on it, which would mess the timeout up.
One possible workaround is to store the timestamp of the last state transfer in the state, and after each retrieval call reset the timeout to an appropriate value. A prototype looks like this:
handle_call(get_a, _From,
            #state{a = 1,
                   state = state_2,
                   %% timestamp when the server transferred to state_2
                   unixtime = T1} = S) ->
    Reply = S#state.a,
    NewTimeout = t_2 - (current_unixtime() - T1),
    {reply, Reply, S, NewTimeout};
In this way, I can get the effect I want, but it's ugly. Is there any better way to do this?
If you want to set timeouts independently of other events like calls it's probably best to use a timed message.
When you set the timeout use erlang:send_after/3:
TimerRef = erlang:send_after(10000, self(), my_timeout).
You can cancel your timeout any time with erlang:cancel_timer/1.
erlang:cancel_timer(TimerRef).
Receive it with handle_info:
handle_info(my_timeout, State) ->
If you need multiple such timeouts, use a different message for each. Or, if you have the possibility of some sort of race condition and need further control, you can create unique references with erlang:make_ref/0 and send a message like {Ref, my_timeout}.
Watch out for the edge cases: remember you could cancel a timer and still receive its message unexpectedly (because it was already in your message queue when you cancelled), and if you don't make the messages unique (as suggested above, using a reference) you could be expecting a timeout and get it early, because it's a previous one that entered your message queue (whereas with a reference you can check that it is the latest one set). These cases are easy to deal with, but watch out for them.
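A self-contained sketch of the reference-tagged pattern (module and function names are made up for illustration): a stale timer message is recognized and ignored because it carries an old reference.

```erlang
-module(timeout_demo).
-export([demo/0]).

%% Schedule a timeout, then "reschedule" it without cancelling the first
%% timer. Only the message carrying the latest reference is honoured.
demo() ->
    Ref1 = erlang:make_ref(),
    _T1 = erlang:send_after(10, self(), {Ref1, my_timeout}),
    %% Pretend a new event arrived and we rescheduled the timeout:
    Ref2 = erlang:make_ref(),
    _T2 = erlang:send_after(20, self(), {Ref2, my_timeout}),
    wait_for(Ref2).

wait_for(Current) ->
    receive
        {Current, my_timeout} ->
            got_current_timeout;
        {_StaleRef, my_timeout} ->
            %% A stale timer fired; ignore it and keep waiting.
            wait_for(Current)
    after 1000 ->
        no_timeout
    end.
```

In a gen_server you would store the current reference in the state and apply the same comparison inside handle_info.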
If an Erlang process proc1 is killed (exit(killed)) and another process proc2 is notified of this event because it's linked to proc1, is it possible to respawn a replacement process with the mailbox of the killed process proc1?
Not really. Processes don't share any data, so when one dies, no other process can access any of its memory. Erlang would not know how long to wait before garbage-collecting such a mailbox.
You could simulate that with a separate proxy process that just keeps track of the mailbox.
Or you could try to handle being killed a little better in proc1. You could use process_flag(trap_exit, true), which allows you to receive the exit(killed) signal as a normal message {'EXIT', FromPid, killed}. When you receive it, you read your whole mailbox and then continue exiting, with all the messages being part of the exit reason.
It could look somewhat like this:
init(Args) ->
    process_flag(trap_exit, true),
    .... % continue process initialization

loop(State) ->
    receive
        ....
        {'EXIT', _FromPid, killed} ->
            exit({killed, all_messages()})
    end.

all_messages() ->
    all_messages([]).

all_messages(Messages) ->
    receive
        AnyMessage ->
            all_messages([AnyMessage | Messages])
    after 0 ->
        lists:reverse(Messages)
    end.
And proc2 would receive all unprocessed messages and could send them again to the newly spawned process.
Even though I unregister sts, my spawned process is not stopped. How can I stop it without using gen_server?
start() ->
    case whereis(sts) of
        undefined ->
            PidA = spawn(dist_erlang, init, []),
            register(sts, PidA),
            {ok, PidA};
        _ ->
            {ok, whereis(sts)}
    end.

stop() ->
    case whereis(sts) of
        undefined ->
            already_stopped;
        _ ->
            unregister(sts),
            stopped
    end.
Using unregister does not stop the process. Stopping the process does, however, unregister it. So instead of using unregister here, use erlang:exit/2
stop() ->
    case whereis(sts) of
        undefined ->
            already_stopped;
        Pid ->
            exit(Pid, normal), % Use whatever exit reason you want
            stopped
    end.
All that being said, you should really be using the OTP process behaviours (like gen_server), as they make process management much easier. With an OTP process, you can instead call the process and tell it to stop, so that when you get your reply it has already stopped. Otherwise your exit message may take some time to get through.
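A minimal sketch of such a synchronous stop (the module name and callbacks are illustrative, not the asker's code): by the time gen_server:call/2 returns, the server has already decided to stop, so there is no window where the caller believes it stopped but the exit message is still in flight.

```erlang
-module(sts_server).
-behaviour(gen_server).
-export([start/0, stop/0]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

start() ->
    gen_server:start({local, sts}, ?MODULE, [], []).

%% Synchronous stop: the reply 'stopped' is sent as part of the
%% {stop, ...} return, then the server terminates.
stop() ->
    gen_server:call(sts, stop).

init([]) -> {ok, #{}}.

handle_call(stop, _From, State) ->
    {stop, normal, stopped, State}.

handle_cast(_Msg, State) -> {noreply, State}.

terminate(_Reason, _State) -> ok.
```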
unregister does not stop the process. It just removes the binding between the process id and the given atom.
You need to remember that the stop/0 function runs in the context of the process that called it, not the gen_server itself. Actually, (almost) the only way to interact with another process is to send it a message. So you could implement your stop/0 function like this:
%% send stop message to the `sts` server
stop() ->
    gen_server:cast(sts, stop).

%% [...]
%% handle the stop message (this clause must come before any catch-all clause)
handle_cast(stop, State) ->
    {stop,
     _Reason = normal,
     State}; % return value that stops the server
%% handle other cast messages
handle_cast(_OtherCastMessages, State) ->
    %% [...]
    {noreply, State}.

%% [...]
terminate(_Reason = normal, _State) ->
    %% could do some cleanup in this callback
    ok.
So to stop the server, you have to return a special tuple from one of the behaviour functions; you can read more about this in the gen_server documentation. And of course, to trigger one of the behaviour functions, you have to send a message to your server with gen_server:cast or gen_server:call (or just send a plain message and handle it with handle_info); which to use is your decision. Finally, terminate/2 is called (no matter which callback returned the tuple with the stop atom), where you can do some cleanup with your state.
Of course you could unregister your process in the terminate callback, but when a process dies, unregistration is handled automatically.
This could be a very basic question, but is Erlang capable of calling a function in another process and waiting for it to respond back with some value, without putting threads to sleep?
Well, if you're waiting for an answer, the calling process will have to sleep eventually… But that's no big deal.
While processes are stuck on the receive loop, other processes can work. In fact, it's not uncommon to have thousands of processes just waiting for messages. And since Erlang processes are not true OS threads, they're very lightweight so the performance loss is minimal.
In fact, the way sleep is implemented looks like:
sleep(Milliseconds) ->
    receive
        % Intentionally left empty
    after Milliseconds -> ok
    end.
Yes, it is possible to peek into the mailbox if that is what you mean. Say we have sent a message to another process and now we want to see if the other process has sent something back to us. But we don't want to block on the receive:
receive
    Pattern -> Body;
    Pattern2 -> Body2
after 0 ->
    AfterBody
end
will try to match against the Pattern and Pattern2 in the mailbox. If none matches, it will immediately time out and go to AfterBody. This allows you to implement a non-blocking peek into the mailbox.
If the process is a gen_server the same thing can be had by playing with the internal state and the Timeout setting when a callback returns to the gen_server's control. You can set a Timeout of 0 to achieve this.
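A minimal sketch of that gen_server pattern, with a hypothetical server that defers work to handle_info via a zero timeout: the reply goes out immediately, and any already-queued messages are drained before the 'timeout' message arrives.

```erlang
-module(peek_server).
-behaviour(gen_server).
-export([start/0, ping/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start() -> gen_server:start({local, peek_server}, ?MODULE, [], []).

ping() -> gen_server:call(peek_server, ping).

init([]) -> {ok, #{deferred => 0}}.

%% Reply, then request an immediate 'timeout': the server first processes
%% anything already in its mailbox, then handle_info(timeout, ...) runs.
handle_call(ping, _From, State) ->
    {reply, pong, State, 0}.

handle_cast(_Msg, State) -> {noreply, State}.

handle_info(timeout, State = #{deferred := N}) ->
    %% Deferred work goes here.
    {noreply, State#{deferred := N + 1}}.
```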
What I am getting from the question is that we are talking about synchronous message passing. Yes! Erlang handles this perfectly well; it's the most basic way of handling concurrency in Erlang. Consider the code below:
rpc(Request, To) ->
    MySelf = self(),
    To ! {MySelf, Request},
    receive
        {To, Reply} -> Reply
    after timer:seconds(5) ->
        erlang:exit({error, timedout})
    end.
The code above shows that a process sends a message to another and immediately goes into waiting (for a reply) without having to sleep. If it does not get a reply within 5 seconds, it will exit.
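To see the round trip end to end, here is a sketch pairing rpc/2 with a hypothetical echo process that follows the same {From, Request} / {Pid, Reply} convention (the echo process is an assumption for illustration, not part of the answer above):

```erlang
-module(rpc_demo).
-export([rpc/2, start_echo/0]).

%% The rpc/2 helper from the answer: send, then block waiting for a
%% reply tagged with the peer's pid, giving up after 5 seconds.
rpc(Request, To) ->
    MySelf = self(),
    To ! {MySelf, Request},
    receive
        {To, Reply} -> Reply
    after timer:seconds(5) -> erlang:exit({error, timedout})
    end.

%% A hypothetical peer that echoes each request back, tagged with its
%% own pid so the caller's receive pattern {To, Reply} matches.
start_echo() ->
    spawn(fun Loop() ->
        receive
            {From, Request} ->
                From ! {self(), Request},
                Loop()
        end
    end).
```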