Related
I'm trying to understand what is an FP-alternative to good old dependency injection from OOP.
Say I have the following app (pseudocode)
app() is where application starts. It allows user to register and list user posts (whatever). These two functions are composed out of several other functions (register does it step by step, imperatively, while list posts really composes them (at least this is how I understand function composition).
app()
registerUser(u)
validate(u)
persist(u)
callSaveToDB(u)
notify(u)
sendsEmail
listPosts(u)
postsToView(loadUserPosts(findUser(u)))
Now I'd like to test this stuff (registerUser and listPosts) and would like to have stubbed functions so that I don't call db etc - you know, usual testing stuff.
I know it's possible to pass functions to functions e.g
registerUser(validateFn, persistFn, notifyFn, u)
and have it partially applied so it looks like registerUser(u) with other functions closed over and so on. But it all needs to be done on app boot level as it was in OOP (wiring dependencies and bootstraping an app). It looks like manually doing this will take ages and tons of boilerplate code. Is there something obvious I'm missing there? Is there any other way of doing that?
EDIT:
I see having IO there is not a good example. So what if I have function composed of several other functions and one of them is really heavy (in terms of computations) and I'd like to swap it?
Simply - I'm looking for FP way of doing DI stuff.
The way to answer this is to drop the phrase "dependency injection" and think about it more fundamentally. Write down interfaces as types for each component. Implement functions that have those types. Replace them as needed. There's no magic, and language features like type classes make it easy for the compiler to ensure you can substitute methods in an interface.
The previous Haskell-specific answer, shows how to use Haskell types for the API: https://stackoverflow.com/a/14329487/83805
I'm writing an event manager that will take a lot of different event handlers. This event manager will be notified with a lot of different events. Each handler only handle certain events, and ignore the rest. Each handler can also trigger certain other events based on situation.
For example, first handler to handle Event1
-module (first_handler).
-behavior (gen_event).
...
handle_event(Event1, State) -> {ok, State};
handle_event(_, State) -> {ok, State}.
Second handler to handle Event2
-module (second_handler).
-behavior (gen_event).
...
handle_event(Event2, State) ->
gen_event:notify(self(), Event1),
{ok, State};
handle_event(_, State) -> {ok, State}.
The event triggering can be done by calling gen_event:notify(self(), NewEvent) within a handle_event of the handler, but I would rather abstract and export that out so that it can be called from the event manager.
Since pattern matching and ignoring events and triggering events are common to all the handlers, is there anyway I can extend gen_event behavior to provide those as built-ins?
I'll start with the default way to create a custom behavior:
-module (gen_new_event).
-behaviour (gen_event).
behaviour_info(Type) -> gen_event:behaviour_info(Type).
I'm not sure what to do next.
What are you trying to do exactly? I could not understand from the examples you provided. In second_handler's handle_event/2, Event1 is unbound. Also, does using self() work? Shouldn't that be the registered name of the manager. Not sure whether handle_event/2 gets executed by the manager or each handler process (but the latter makes more sense).
By implementing your gen_new_event module, you are implementing a handler (i.e. a callback module), and not an event manager. The fact that you have -behaviour(gen_event) means that you're asking the compiler to check that gen_new_event actually implements all the functions listed by gen_event:behaviour_info(callbacks), thereby making gen_new_event an eligible handler which you could add to an event manager via gen_event:add_handler(manager_registered_name, gen_new_event, []).
Now, if you take away -behaviour (gen_event), gen_new_event no longer has to implement the following functions:
35> gen_event:behaviour_info(callbacks).
[{init,1},
{handle_event,2},
{handle_call,2},
{handle_info,2},
{terminate,2},
{code_change,3}]
You could make gen_new_event a behaviour (i.e. an interface) by adding more functions which you will be requiring any module which uses -behaviour(gen_new_event) to implement:
-module (gen_new_event).
-export([behaviour_info/1]).
behaviour_info(callbacks) ->
[{some_fun, 2}, {some_other_fun, 3} | gen_event:behaviour_info(callbacks)].
Now, if in some module, for e.g. -module(example), you add the attribute -behaviour(gen_new_event), then the module example will have to implement all the gen_event callback functions + some_fun/2 and some_other_fun/3.
I doubt that's what you were looking for, but your last example seemed to suggest that you wanted to implement a behaviour. Note that, all you're doing by implementing a behaviour is requiring other modules to implement certain functions should they use -behaviour(your_behaviour).
(Also, if I understood you correctly, if you want to extend gen_event then you could always simply copy the code in gen_event.erl and extend it ... I guess, but is this really necessary for what you're trying to do?).
Edit
Objective: extract common code out of gen_event implementations. So for e.g. there's a handle_event/2 clause which you want in every one of your gen_events.
One way of going about it: You could use a parameterized module. This module would implement the gen_event behaviour, but, only the common behaviour which all your gen_event callback modules should have. Anything which is not "common" can be delegated to the module's parameter (which you'd bind to a module name containing the "custom" implementation of the gen_event callback.
E.g.
-module(abstract_gen_event, [SpecificGenEvent]).
-behaviour(gen_event).
-export(... all gen_event functions).
....
handle_event({info, Info}, State) ->
%% Do something which you want all your gen_events to do.
handle_event(Event, State) ->
%% Ok, now let the particular gen_event take over:
SpecificGenEvent:handle_event(Event, State).
%% Same sort of thing for other callback functions
....
Then you'd implement one or more gen_event modules which you'll be plugging into abstract_gen_event. Lets say one of them is a_gen_event.
Then you should be able to do:
AGenEvent = abstract_gen_event:new(a_gen_event). %% Note: the function new/x is auto-generated and will have arity according to how many parameters a parameterized module has.
Then, I guess you could pass AGenEvent to gen_event:add_handler(some_ref, AGenEvent, []) and it should work but note that I have never tried this out.
Perhaps you could also get around this using macros or (but this is a bit overkill) do some playing around at compilation time using parse_transform/2. Just a thought though. See how this parameterized solution goes first.
2nd Edit
(Note: not sure whether I should delete everything prior to what is in this section. Please let me know or just delete it if you know what you're doing).
Ok, so I tried it out myself and yes, the return value of a parameterized module will crash when feeding it to gen_event:add_handler/3's second argument... too bad :(
I can't think of any other way of going about this then other than a) using macros b) using parse_transform/2.
a)
-module(ge).
-behaviour(gen_event).
-define(handle_event,
handle_event({info, Info}, State) ->
io:format("Info: ~p~n", [Info]),
{ok, State}).
?handle_event;
handle_event(Event, State) ->
io:format("got event: ~p~n", [Event]),
{ok, State}.
So basically you would have all the callback function clauses for the common functionality defined in macro definitions in a header file which you include in every gen_event which uses this common functionality. Then you ?X before/after each callback function which uses the common functionality... I know it's not that clean and I'm generally weary of using macros myself but hey... if the problem is really nagging you that's one way to go about it.
b) Google around for some info on using parse_transform/2 in Erlang. You could implement a parse_transform which looks for the callback functions in you gen_event modules which have the specific cases for the callbacks but do not have the generic cases (i.e. clauses like the ({info, Info}, State) in the macro above). Then you would simply add the forms which make up the generic cases.
I would suggest doing something like this (add exports):
-module(tmp).
parse_transform(Forms, Options) ->
io:format("~p~n", [Forms]),
Forms.
-module(generic).
gen(Event, State) ->
io:format("Event is: ~p~n", [Event]),
{ok, State}.
Now you can compile with:
c(tmp).
c(generic, {parse_transform, tmp}).
[{attribute,1,file,{"../src/generic.erl",1}},
{attribute,4,module,generic},
{attribute,14,compile,export_all},
{function,19,gen,2,
[{clause,19,
[{var,19,'Event'},{var,19,'State'}],
[],
[{call,20,
{remote,20,{atom,20,io},{atom,20,format}},
[{string,20,"Event is: ~p~n"},
{cons,20,{var,20,'Event'},{nil,20}}]},
{tuple,21,[{atom,21,ok},{var,21,'State'}]}]}]},
{eof,28}]
{ok,generic}
That way you can copy-paste the forms you'll be injecting. You would copy them into a proper parse_transform/2 which, rather than just printing, would actually go through your source's code and inject the code you want where you want it.
As a side note, you could include the attribute -compile({parse_transform, tmp}) to every gen_event module of yours which needs to be parse_transformed in this way to add the generic functionality (i.e. and avoid having to pass this to the compiler yourself). Just make sure tmp or whichever module contains your parse_transform is loaded or compiled in a dir on the path.
b) seems like a lot of work I know...
Your installed handlers are already running in the context of the event manager which you start and then install handlers into. So if their handle-event function throws out data, they already do what you want.
You don't need to extend the event behaviour. What you do is:
handle_event(Event, State) ->
generic:handle_event(Event, State).
and then let the generic module handle the generic parts. Note that you could supply generic a way to callback to this handler module for specialized handler behaviour should you need it. For example:
generic:handle_event(fun ?MODULE:callback/2, Event, State)...
and so on.
I have an OTP application with an event that happens periodically. There are several actors that want to do stuff in response to the event. The stuff each actor does is a function of its own state, but otherwise they're identical.
My problem is with how I go about incorporating this setup into a supervision tree. I have a gen_event manager with each actor being an event handler. This would work well if it weren't for the fact that gen_event supervision is weird. Once my first handler is add_sup_handler'd the rest fail with already_started and my gen_server that's acting as a supervisor for the event handlers dies.
So what should I do here? I'm starting to think I should just write my own event manager that can keep track of all my actors and their state.
gen_event:add_handler/3:
Handler is the name of the callback module Module or a tuple
{Module,Id}, where Id is any term. The {Module,Id} representation
makes it possible to identify a specific event handler when there are
several event handlers using the same callback module.
One of the things that attracted me to Erlang in the first place is the Actor model; the idea that different processes run concurrently and interact via asynchronous messaging.
I'm just starting to get my teeth into OTP and in particular looking at gen_server. All the examples I've seen - and granted they are tutorial type examples - use handle_call() rather than handle_cast() to implement module behaviour.
I find that a little confusing. As far as I can tell, handle_call is a synchronous operation: the caller is blocked until the callee completes and returns. Which seems to run counter to the async message passing philosophy.
I'm about to start a new OTP application. This seems like a fundamental architectural decision so I want to be sure I understand before embarking.
My questions are:
In real practice do people tend to use handle_call rather than handle_cast?
If so, what's the scalability impact when multiple clients can call the same process/module?
Depends on your situation.
If you want to get a result, handle_call is really common. If you're not interested in the result of the call, use handle_cast. When handle_call is used, the caller will block, yes. This is most of time okay. Let's take a look at an example.
If you have a web server, that returns contents of files to clients, you'll be able to handle multiple clients. Each client have to wait for the contents of files to be read, so using handle_call in such a scenario would be perfectly fine (stupid example aside).
When you really need the behavior of sending a request, doing some other processing and then getting the reply later, typically two calls are used (for example, one cast and the one call to get the result) or normal message passing. But this is a fairly rare case.
Using handle_call will block the process for the duration of the call. This will lead to clients queuing up to get their replies and thus the whole thing will run in sequence.
If you want parallel code, you have to write parallel code. The only way to do that is to run multiple processes.
So, to summarize:
Using handle_call will block the caller and occupy the process called for the duration of the call.
If you want parallel activities to go on, you have to parallelize. The only way to do that is by starting more processes, and suddenly call vs cast is not such a big issue any more (in fact, it's more comfortable with call).
Adam's answer is great, but I have one point to add
Using handle_call will block the process for the duration of the call.
This is always true for the client who made the handle_call call. This took me a while to wrap my head around but this doesn't necessarily mean the gen_server also has to block when answering the handle_call.
In my case, I encountered this when I created a database handling gen_server and deliberately wrote a query that executed SELECT pg_sleep(10), which is PostgreSQL-speak for "sleep for 10 seconds", and was my way of testing for very expensive queries. My challenge: I don't want the database gen_server to sit there waiting for the database to finish!
My solution was to use gen_server:reply/2:
This function can be used by a gen_server to explicitly send a reply to a client that called call/2,3 or multi_call/2,3,4, when the reply cannot be defined in the return value of Module:handle_call/3.
In code:
-module(database_server).
-behaviour(gen_server).
-define(DB_TIMEOUT, 30000).
<snip>
get_very_expensive_document(DocumentId) ->
gen_server:call(?MODULE, {get_very_expensive_document, DocumentId}, ?DB_TIMEOUT).
<snip>
handle_call({get_very_expensive_document, DocumentId}, From, State) ->
%% Spawn a new process to perform the query. Give it From,
%% which is the PID of the caller.
proc_lib:spawn_link(?MODULE, query_get_very_expensive_document, [From, DocumentId]),
%% This gen_server process couldn't care less about the query
%% any more! It's up to the spawned process now.
{noreply, State};
<snip>
query_get_very_expensive_document(From, DocumentId) ->
%% Reference: http://www.erlang.org/doc/man/proc_lib.html#init_ack-1
proc_lib:init_ack(ok),
Result = query(pgsql_pool, "SELECT pg_sleep(10);", []),
gen_server:reply(From, {return_query, ok, Result}).
IMO, in concurrent world handle_call is generally a bad idea. Say we have process A (gen_server) receiving some event (user pressed a button), and then casting message to process B (gen_server) requesting heavy processing of this pressed button. Process B can spawn sub-process C, which in turn cast message back to A when ready (of to B which cast message to A then). During processing time both A and B are ready to accept new requests. When A receives cast message from C (or B) it e.g. displays result to the user. Of course, it is possible that second button will be processed before first, so A should probably accumulate results in proper order. Blocking A and B through handle_call will make this system single-threaded (though will solve ordering problem)
In fact, spawning C is similar to handle_call, the difference is that C is highly specialized, process just "one message" and exits after that. B is supposed to have other functionality (e.g. limit number of workers, control timeouts), otherwise C could be spawned from A.
Edit: C is asynchronous also, so spawning C it is not similar to handle_call (B is not blocked).
There are two ways to go with this. One is to change to using an event management approach. The one I am using is to use cast as shown...
submit(ResourceId,Query) ->
%%
%% non blocking query submission
%%
Ref = make_ref(),
From = {self(),Ref},
gen_server:cast(ResourceId,{submit,From,Query}),
{ok,Ref}.
And the cast/submit code is...
handle_cast({submit,{Pid,Ref},Query},State) ->
Result = process_query(Query,State),
gen_server:cast(Pid,{query_result,Ref,Result});
The reference is used to track the query asynchronously.
Can an OTP event manager process (e.g. a logger) have some state of its own (e.g. logging level) and filter/transform events based on it?
I also have a need to put some state into the gen_event itself, and my best idea at the moment is to use the process dictionary (get/put). Handlers are invoked in the context of the gen_event process, so the same process dictionary will be there for all handler calls.
Yes, process dictionaries are evil, but in this case they seem less evil than alternatives (ets table, state server).
The gen_event implementation as contained in the OTP does no provide means for adding state.
You could extend the implementation to achieve this and use your implementation instead of gen_event. However I would advise against it.
The kind of state you want to add to the event manager belongs really in the event handler for several reasons:
You might want to use different levels in different handlers, e.g. only show errors on the console but write everything to the disk.
If the event level would be changed in the manager event handlers depending on getting all unfiltered events might cease to function (events have more uses than just logging). This might lead to hard to debug problems.
If you want a event manager for multiple handlers that all only get filtered events you can easily achieve this by having two managers: one for unfiltered messages and one for e.g. level filtered messages. Then install a handler to the unfiltered one, filter in the handler by level (easy) and pass on the filtered events to the other manager. All handlers that only want to get filtered messages can be registered to the second manager.
The handlers can have their own state that gets passed on every callback like:
Module:handle_event(Event, State) -> Result
Filtering might look like this (assuming e.g. {level N, Content} events):
handle_event({level, Lvl, Content}, State#state{max_level=Max}) when Lvl >= Max ->
gen_event:notify(filtered_man, Content);
The State can be changed either by special events, by gen_event:call\3,4 (preferably) or by messages handled by handle_info.
For details see Gen_Event Behaviour and gen_event(3)
When you start_link a gen_event process - thing that you should always do via a supervisor -, you can merely specify a name for the new process, if you need/want it to be registered.
As far as I can see, there's no way to initiate a state of some sort using that behaviour.
Of course, you can write your own behaviour, on the top of a gen_event or of a simple gen_server.
As an alternative, you might use a separate gen_event process for each debugging level.
Or you can just filter the messages in the handlers.