I noticed that messages sent to the pid of a gen_fsm process are matched in the state callbacks as events. Is this just accidental or can I rely on this feature?
Normally I would expect general messages sent to a gen_fsm to show up in the handle_info/3 callback and thought I would have to re-send it using gen_fsm:send_event.
Does gen_fsm try to match the message first to the state callback and then always with the handle_info/3 callback? Or only if it doesn't match a state callback clause?
However when I try it my message seems to be handled twice according to debug output.
So basically the question can also be stated like: how to correctly handle received messages as events in gen_fsm state functions?
Clarification: for this question, take it as given that some of the events arrive as received messages.
I'm aware that in many cases it's cleaner to make the protocol visible by using only function calls into the fsm.
I'm not so sure this would improve the current framework that the mentioned gen_fsm has to fit into: diverse protocol stacks where each layer calls a connect() function to attach (and sometimes start) the lower layer. Packets are sent to lower layers by calling a function (send) and received by receiving a message, much like gen_tcp.
By looking at the code for gen_fsm I already figured out that general messages are only passed to handle_info, so the only remaining question is whether to call the state function directly from the handle_info/3 callback or to re-send the message using gen_fsm:send_event.
General messages are handled by the handle_info callback, unless you have something like this in your code:
handle_info(Info, StateName, StateData) ->
    ?MODULE:StateName(Info, StateData).
This avoids resending, but I recommend neither that nor resending.
Delivering events exclusively through API functions that encapsulate send_event/sync_send_event/send_all_state_event/sync_send_all_state_event makes the protocol explicit. That is the right thing to do, as it is easier to understand, maintain and document with edoc.
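A minimal sketch of what such an API could look like. The module name link_fsm, the states idle and connected, and the function names are all made up for illustration; only the gen_fsm calls themselves are real.

```erlang
-module(link_fsm).
-behaviour(gen_fsm).

%% Public API: the only way events are meant to reach the FSM.
-export([start_link/0, connect/1, packet/2, current_state/1]).
%% gen_fsm callbacks.
-export([init/1, idle/2, connected/2, handle_event/3,
         handle_sync_event/4, handle_info/3, terminate/3, code_change/4]).

start_link() -> gen_fsm:start_link(?MODULE, [], []).

%% Asynchronous events, hidden behind named functions.
connect(Pid) -> gen_fsm:send_event(Pid, connect).
packet(Pid, Data) -> gen_fsm:send_event(Pid, {packet, Data}).

%% Synchronous query that works in any state.
current_state(Pid) -> gen_fsm:sync_send_all_state_event(Pid, current_state).

%% The state data is simply the list of packets received so far.
init([]) -> {ok, idle, []}.

idle(connect, Packets) -> {next_state, connected, Packets};
idle(_Other, Packets) -> {next_state, idle, Packets}.

connected({packet, Data}, Packets) -> {next_state, connected, [Data | Packets]};
connected(_Other, Packets) -> {next_state, connected, Packets}.

handle_event(_Event, StateName, Packets) -> {next_state, StateName, Packets}.

handle_sync_event(current_state, _From, StateName, Packets) ->
    {reply, {StateName, lists:reverse(Packets)}, StateName, Packets}.

%% Plain messages still arrive here; this sketch simply drops them.
handle_info(_Info, StateName, Packets) -> {next_state, StateName, Packets}.

terminate(_Reason, _StateName, _Packets) -> ok.
code_change(_OldVsn, StateName, Packets, _Extra) -> {ok, StateName, Packets}.
```

Callers then write link_fsm:connect(Pid) and link_fsm:packet(Pid, Data) and never need to know which gen_fsm function carries the event.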
I am having a hard time wrapping my head around the correct way to make calls against a gen_server instance dynamically created by a supervisor with a simple_one_for_one child strategy. I am attempting to create data access controls as gen_servers. Each entity will have its own supervisor, and that supervisor will create gen_server instances as needed to actually perform CRUD operations on the database. I understand the process for defining the child processes, as well as the process for creating them as needed.
Initially, my plan was to abstract the child creation process into custom functions in the gen_server module that created a child, fired off the requested operation (e.g. find, store, delete) on that child using gen_server:call(), and then returned the operation results to the calling process. Unless I am mistaken, though, that will block any other processes attempting to use those functions until the call returns. That is definitely not what I have in mind.
I may be stuck in OO mode (my background is Java), but it seems like there should be a clean way of allowing a function in one module to obtain a reference to a child process and then make calls against that process without leaking the internals of that child. In other words, I do not want to have to call the create_child() method on an entity supervisor and then have my application code make gen_server:calls against that child PID (i.e. gen_server:call(Pid, {find_by_id, Id})). I would instead like to be able to call a function more like Child:find_by_id(Id).
A full answer is highly dependent on your application — for example, one gen_server might suffice, or you might really need a pool of database connections instead. But one thing you should be aware of is that a gen_server can return from a handle_call callback before it actually has a reply ready for the client by returning {noreply, NewState} and then later, once it has a client reply ready, calling gen_server:reply/2 to send it back to the client. This allows the gen_server to service calls from other clients without blocking on the first call. Note though that this requires that the gen_server has a way of sending a request into the database without having to block waiting for a reply; this is often achieved by having the database send a reply that arrives in the gen_server:handle_info/2 callback, passing enough info back that the gen_server can associate the database reply with the correct client request. Note also that gen_server:call/2,3 has a default timeout of 5 seconds, so you'll need to deal with that if you expect the duration of database calls to exceed the default.
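A rough sketch of that deferred-reply pattern, under stated assumptions: the module name async_db, the {db_reply, Ref, Result} message shape and the fake_db_request/3 helper are all hypothetical stand-ins for whatever non-blocking driver you actually use.

```erlang
-module(async_db).
-behaviour(gen_server).

-export([start_link/0, find/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).

%% Client API; the explicit timeout replaces the 5 second default.
find(Server, Id) -> gen_server:call(Server, {find, Id}, 10000).

%% The state maps a request reference to the waiting client (From).
init([]) -> {ok, #{}}.

handle_call({find, Id}, From, Pending) ->
    Ref = make_ref(),
    fake_db_request(self(), Ref, Id),
    %% No reply yet: the server is free to handle other calls.
    {noreply, Pending#{Ref => From}}.

handle_cast(_Msg, Pending) -> {noreply, Pending}.

%% The database answer arrives as a plain message tagged with Ref.
handle_info({db_reply, Ref, Result}, Pending) ->
    case maps:take(Ref, Pending) of
        {From, Rest} ->
            gen_server:reply(From, Result),
            {noreply, Rest};
        error ->
            %% Unknown reference: ignore the stray reply.
            {noreply, Pending}
    end;
handle_info(_Other, Pending) -> {noreply, Pending}.

%% Hypothetical stand-in for a non-blocking database driver.
fake_db_request(ReplyTo, Ref, Id) ->
    spawn(fun() ->
        timer:sleep(50),
        ReplyTo ! {db_reply, Ref, {ok, {row, Id}}}
    end),
    ok.
```

The reference is what lets the server match a late database answer to the right caller, so several find/2 calls can be in flight at once.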
When you create, modify or delete a record, you don't need to wait for an answer. You can use gen_server:cast for this, but you don't need a gen_server at all: as I said in my first comment, a simple call to an interface function executed in the client process will save time.
If you want to read, there are two cases:
You can do something else while waiting for the answer: then a gen_server call is OK, but a simple spawned process that waits for the answer and sends it back to the client provides the same service.
You cannot do anything before getting the answer: then there is no blocking issue, and I think it is really preferable to use as little code as possible, so again a simple function call is enough.
gen_server is meant to be persistent and react to messages. I don't see the need for persistence in your example.
-module(access).
-export([add/2,get/1]).
-record(foo, {bar, baz}).

add(A,B) ->
    F = fun() ->
            mnesia:write(#foo{bar=A,baz=B})
        end,
    spawn(mnesia,activity,[transaction, F]). %% the function returns immediately,
                                             %% but you will not know if the transaction failed

get(Bar) ->
    F = fun() ->
            case mnesia:read({foo, Bar}) of
                [#foo{baz=Baz}] -> Baz;
                [] -> undefined
            end
        end,
    Pid = self(),
    Ref = make_ref(),
    Get = fun() ->
              R = mnesia:activity(transaction, F),
              Pid ! {Ref,baz,R}
          end,
    spawn(Get),
    Ref. %% the function immediately returns a ref; the message {Ref,baz,Baz} is sent later.
If the problem you see is that you are leaking that the internal implementation of your db-process is a gen_server, you could implement the api such that it takes the pid as argument as well.
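-module(user).  %% NB: kernel already ships a module named user; pick another name in real code
-behaviour(gen_server).
-export([find_by_id/2]).
find_by_id(Pid, Id) ->
    gen_server:call(Pid, {find_by_id, Id}).
%% Lots of code omitted
handle_call({find_by_id, _Id}, _From, State) ->
    %% Placeholder reply; the real lookup is omitted.
    {reply, ok, State}.
%% Lots more code omitted.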
This way you don't tell clients that the implementation is in fact a gen_server (although someone could use gen_server:call as well).
If I write the following Dart code, how do I know which click handler happens first?
main() {
  var button = new ButtonElement();
  var stream = button.onClick.asBroadcastStream();
  stream.listen(clickHandler1);
  stream.listen(clickHandler2);
}
Let's say I'm in other code that doesn't know anything about the first two click handlers, but I register another one.
Can I know that the stream has two listeners?
Can I pause or cancel all other subscribers?
If I write button.onClick.asBroadcastStream() again elsewhere, does it point to the same stream as was used in main?
Can I say in one of the handlers to not pass event on to the other broadcast listener? Is that a consumer?
Let's say I'm in other code that doesn't know anything about the first
two click handlers, but I register another one.
Can I know that the stream has two listeners?
No, you can't. You could extend the stream class or wrap it and provide this functionality yourself, but it does not feel like a good design choice, because I don't think a listener should know about other listeners. What are you trying to do exactly? Perhaps there's a better way than letting listeners know about each other.
Can I pause or cancel all other subscribers?
You can cancel/pause/resume only the subscriber you are dealing with. Again, you probably shouldn't touch other listeners, but I guess you could wrap/extend the Stream class to have this behavior.
If I write button.onClick.asBroadcastStream() again elsewhere, does it point to the same stream as was used in main?
No, at least not in the current version of the SDK. So, unfortunately, you need to store a reference to this broadcast stream somewhere and refer to it, because calling asBroadcastStream() multiple times will not yield the result you might expect. (Note: this is based on quick testing, http://d.pr/i/Ip0K, although the documentation seems to indicate otherwise; I have yet to test a bit more when I find the time.)
Can I say in one of the handlers to not pass event on to the other broadcast listener?
Well, there's stopPropagation() in the HTML land which means that the event won't propagate to other elements, but it's probably not what you were looking for.
To be able to stop an event from firing in other listeners, there would need to be a defined order in which the listeners are called. I believe the order is the order of registration of those listeners. From the design perspective, I don't think it would be a good idea to allow a listener to cancel/pause others.
Event propagation in HTML makes sense since it's about hierarchy, but here we don't have that (and even in case of events in HTML there can be multiple listeners for the single element).
There's no way to assign weight to listeners or define the order of importance, therefore it's not surprising that there isn't a way to stop the event.
Instead of letting listeners know about each other and manipulate each other, maybe you should try to think of another way to approach your problem (whatever that is).
Is that a consumer?
The StreamConsumer is just a class that you can implement if you want to allow other streams to be piped into your class.
Can I know that the stream has two listeners?
No, you have a Stream that wraps the DOM event handling. There is no such functionality.
Can I pause or cancel all other subscribers?
Look at Event.stopPropagation() and Event.stopImmediatePropagation(), and possibly Event.preventDefault().
If I write button.onClick.asBroadcastStream() again elsewhere, does it point to the same stream as was used in main?
[Updated] No, the current implementation doesn't give you the same Stream back, since the onClick getter returns a new stream every time it is invoked. However, the returned stream is already a broadcast stream, so you shouldn't invoke asBroadcastStream() on it. If you do, however, you will just get a reference to the same object back.
Stream<T> asBroadcastStream() => this;
Can I say in one of the handlers to not pass event on to the other broadcast listener? Is that a consumer?
Again, take a look at Event.stopPropagation() and Event.stopImmediatePropagation(), and possibly Event.preventDefault().
I have an OTP application with an event that happens periodically. There are several actors that want to do stuff in response to the event. The stuff each actor does is a function of its own state, but otherwise they're identical.
My problem is with how I go about incorporating this setup into a supervision tree. I have a gen_event manager with each actor being an event handler. This would work well if it weren't for the fact that gen_event supervision is weird. Once my first handler is add_sup_handler'd the rest fail with already_started and my gen_server that's acting as a supervisor for the event handlers dies.
So what should I do here? I'm starting to think I should just write my own event manager that can keep track of all my actors and their state.
gen_event:add_handler/3:
Handler is the name of the callback module Module or a tuple
{Module,Id}, where Id is any term. The {Module,Id} representation
makes it possible to identify a specific event handler when there are
several event handlers using the same callback module.
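A small sketch of the {Module, Id} form, assuming a hypothetical handler module tick_handler whose only state is a counter of tick events. Each actor attaches its own instance under a distinct Id; the function names attach/2 and count/2 are made up for illustration.

```erlang
-module(tick_handler).
-behaviour(gen_event).

-export([attach/2, count/2]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Attach a supervised handler; Id distinguishes instances of this module.
attach(Manager, Id) ->
    gen_event:add_sup_handler(Manager, {?MODULE, Id}, []).

%% Ask one specific instance how many ticks it has seen.
count(Manager, Id) ->
    gen_event:call(Manager, {?MODULE, Id}, count).

%% Every instance starts with its own private counter.
init([]) -> {ok, 0}.

handle_event(tick, N) -> {ok, N + 1};
handle_event(_Other, N) -> {ok, N}.

handle_call(count, N) -> {ok, N, N}.

handle_info(_Info, N) -> {ok, N}.
terminate(_Arg, _N) -> ok.
code_change(_OldVsn, N, _Extra) -> {ok, N}.
```

With this, tick_handler:attach(Manager, a) and tick_handler:attach(Manager, b) register two independent instances of the same callback module on one manager.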
One of the things that attracted me to Erlang in the first place is the Actor model; the idea that different processes run concurrently and interact via asynchronous messaging.
I'm just starting to get my teeth into OTP and in particular looking at gen_server. All the examples I've seen - and granted they are tutorial type examples - use handle_call() rather than handle_cast() to implement module behaviour.
I find that a little confusing. As far as I can tell, handle_call is a synchronous operation: the caller is blocked until the callee completes and returns. Which seems to run counter to the async message passing philosophy.
I'm about to start a new OTP application. This seems like a fundamental architectural decision so I want to be sure I understand before embarking.
My questions are:
In real practice do people tend to use handle_call rather than handle_cast?
If so, what's the scalability impact when multiple clients can call the same process/module?
Depends on your situation.
If you want to get a result, handle_call is really common. If you're not interested in the result of the call, use handle_cast. When handle_call is used, the caller will block, yes. Most of the time this is okay. Let's take a look at an example.
If you have a web server that returns the contents of files to clients, you'll be able to handle multiple clients. Each client has to wait for the file contents to be read, so using handle_call in such a scenario would be perfectly fine (stupid example aside).
When you really need the behavior of sending a request, doing some other processing and then getting the reply later, typically two calls are used (for example, one cast to submit the request and one call to fetch the result) or normal message passing. But this is a fairly rare case.
Using handle_call will block the process for the duration of the call. This will lead to clients queuing up to get their replies and thus the whole thing will run in sequence.
If you want parallel code, you have to write parallel code. The only way to do that is to run multiple processes.
So, to summarize:
Using handle_call will block the caller and occupy the process called for the duration of the call.
If you want parallel activities to go on, you have to parallelize. The only way to do that is by starting more processes, and suddenly call vs cast is not such a big issue any more (in fact, it's more comfortable with call).
Adam's answer is great, but I have one point to add
Using handle_call will block the process for the duration of the call.
This is always true for the client that made the call. This took me a while to wrap my head around, but it doesn't necessarily mean the gen_server also has to block while answering the call.
In my case, I encountered this when I created a database handling gen_server and deliberately wrote a query that executed SELECT pg_sleep(10), which is PostgreSQL-speak for "sleep for 10 seconds", and was my way of testing for very expensive queries. My challenge: I don't want the database gen_server to sit there waiting for the database to finish!
My solution was to use gen_server:reply/2:
This function can be used by a gen_server to explicitly send a reply to a client that called call/2,3 or multi_call/2,3,4, when the reply cannot be defined in the return value of Module:handle_call/3.
In code:
-module(database_server).
-behaviour(gen_server).
-define(DB_TIMEOUT, 30000).
<snip>
get_very_expensive_document(DocumentId) ->
    gen_server:call(?MODULE, {get_very_expensive_document, DocumentId}, ?DB_TIMEOUT).
<snip>
handle_call({get_very_expensive_document, DocumentId}, From, State) ->
    %% Spawn a new process to perform the query. Give it From,
    %% the handle that identifies the waiting caller (not a bare PID).
    proc_lib:spawn_link(?MODULE, query_get_very_expensive_document, [From, DocumentId]),
    %% This gen_server process couldn't care less about the query
    %% any more! It's up to the spawned process now.
    {noreply, State};
<snip>
query_get_very_expensive_document(From, DocumentId) ->
    %% Reference: http://www.erlang.org/doc/man/proc_lib.html#init_ack-1
    proc_lib:init_ack(ok),
    Result = query(pgsql_pool, "SELECT pg_sleep(10);", []),
    gen_server:reply(From, {return_query, ok, Result}).
IMO, in a concurrent world handle_call is generally a bad idea. Say we have process A (a gen_server) receiving some event (the user pressed a button) and then casting a message to process B (a gen_server) requesting heavy processing of this button press. Process B can spawn a sub-process C, which in turn casts a message back to A when ready (or to B, which then casts a message to A). During the processing time both A and B are ready to accept new requests. When A receives the cast message from C (or B), it e.g. displays the result to the user. Of course, it is possible that the second button press will be processed before the first, so A should probably accumulate results in the proper order. Blocking A and B through handle_call would make this system single-threaded (though it would solve the ordering problem).
In fact, spawning C is similar to handle_call; the difference is that C is highly specialized: it processes just "one message" and exits after that. B is supposed to have other functionality (e.g. limiting the number of workers, controlling timeouts); otherwise C could be spawned from A.
Edit: C is asynchronous as well, so spawning C is not similar to handle_call (B is not blocked).
There are two ways to go with this. One is to change to using an event management approach. The one I am using is to use cast as shown...
submit(ResourceId,Query) ->
    %%
    %% non blocking query submission
    %%
    Ref = make_ref(),
    From = {self(),Ref},
    gen_server:cast(ResourceId,{submit,From,Query}),
    {ok,Ref}.
And the cast/submit code is...
handle_cast({submit,{Pid,Ref},Query},State) ->
    Result = process_query(Query,State),
    gen_server:cast(Pid,{query_result,Ref,Result}),
    %% handle_cast must return a gen_server tuple, not the result of cast/2
    {noreply,State};
The reference is used to track the query asynchronously.
Can an OTP event manager process (e.g. a logger) have some state of its own (e.g. logging level) and filter/transform events based on it?
I also have a need to put some state into the gen_event itself, and my best idea at the moment is to use the process dictionary (get/put). Handlers are invoked in the context of the gen_event process, so the same process dictionary will be there for all handler calls.
Yes, process dictionaries are evil, but in this case they seem less evil than alternatives (ets table, state server).
The gen_event implementation contained in OTP does not provide a means for adding state.
You could extend the implementation to achieve this and use your implementation instead of gen_event. However I would advise against it.
The kind of state you want to add to the event manager belongs really in the event handler for several reasons:
You might want to use different levels in different handlers, e.g. only show errors on the console but write everything to the disk.
If the event level were changed in the manager, event handlers that depend on receiving all unfiltered events might cease to function (events have more uses than just logging). This might lead to hard-to-debug problems.
If you want an event manager for multiple handlers that all receive only filtered events, you can easily achieve this by having two managers: one for unfiltered messages and one for e.g. level-filtered messages. Then install a handler on the unfiltered one, filter in the handler by level (easy) and pass the filtered events on to the other manager. All handlers that only want filtered messages can be registered with the second manager.
The handlers can have their own state that gets passed on every callback like:
Module:handle_event(Event, State) -> Result
Filtering might look like this (assuming e.g. {level, N, Content} events):
handle_event({level, Lvl, Content}, #state{max_level=Max} = State) when Lvl >= Max ->
    gen_event:notify(filtered_man, Content),
    {ok, State};
The State can be changed either by special events, by gen_event:call/3,4 (preferably) or by messages handled by handle_info.
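To make the two-manager idea concrete, here is a hedged sketch of a complete filter handler whose threshold can be changed through gen_event:call/3. The module name level_filter, the threshold field and the helper functions are assumptions for illustration, and it assumes that higher level numbers mean more important events.

```erlang
-module(level_filter).
-behaviour(gen_event).

-export([install/3, set_threshold/2, threshold/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {threshold, target}).

%% Install the filter on the unfiltered manager; Target is the second manager.
install(RawManager, Target, Threshold) ->
    gen_event:add_handler(RawManager, ?MODULE, {Threshold, Target}).

%% Change or read the handler's state via gen_event:call/3.
set_threshold(RawManager, Threshold) ->
    gen_event:call(RawManager, ?MODULE, {set_threshold, Threshold}).

threshold(RawManager) ->
    gen_event:call(RawManager, ?MODULE, threshold).

init({Threshold, Target}) ->
    {ok, #state{threshold = Threshold, target = Target}}.

%% Forward only events at or above the threshold to the second manager.
handle_event({level, Lvl, Content}, #state{threshold = T, target = Target} = State)
  when Lvl >= T ->
    gen_event:notify(Target, Content),
    {ok, State};
handle_event(_Other, State) ->
    {ok, State}.

handle_call({set_threshold, T}, State) ->
    {ok, ok, State#state{threshold = T}};
handle_call(threshold, State) ->
    {ok, State#state.threshold, State}.

handle_info(_Info, State) -> {ok, State}.
terminate(_Arg, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Handlers that only care about filtered events are then added to the second manager as usual and never see anything below the threshold.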
For details see Gen_Event Behaviour and gen_event(3)
When you start_link a gen_event process (something you should always do via a supervisor), you can merely specify a name for the new process, if you need/want it to be registered.
As far as I can see, there's no way to initiate a state of some sort using that behaviour.
Of course, you can write your own behaviour, on the top of a gen_event or of a simple gen_server.
As an alternative, you might use a separate gen_event process for each debugging level.
Or you can just filter the messages in the handlers.