Erlang: Hooks vs gen_event

The question is: why do some applications (like ejabberd) use their own hooks module (e.g. ejabberd_hooks.erl) instead of gen_event?

Ejabberd hooks and gen_event are quite different things. Ejabberd hooks are evaluated by the process calling them, whereas gen_event handlers all run in a single gen_event process. Since ejabberd needs to run many hooks for most messages, sending every XMPP message to lots of different gen_event processes would not achieve the message throughput that the ejabberd approach does.
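To make the difference concrete, here is a minimal sketch of the hooks idea, using a hypothetical my_hooks module (this is not the real ejabberd_hooks API): handlers are registered in an ETS table and folded over directly in the calling process, so no message passing to a separate handler process is involved.

    %% Hypothetical sketch of the hooks idea (not the real ejabberd_hooks API).
    %% Handlers live in an ETS table and run in whatever process calls run_fold/3.
    -module(my_hooks).
    -export([new/0, add/3, run_fold/3]).

    new() ->
        ets:new(my_hooks, [named_table, public, bag]).

    %% Register Fun (a fun of arity 2: Acc, Args) under Hook; lower priority runs first.
    add(Hook, Priority, Fun) ->
        ets:insert(my_hooks, {Hook, Priority, Fun}).

    %% Run every handler for Hook in the calling process, threading an accumulator.
    run_fold(Hook, Acc0, Args) ->
        Handlers = lists:keysort(2, ets:lookup(my_hooks, Hook)),
        lists:foldl(fun({_Hook, _Prio, Fun}, Acc) -> Fun(Acc, Args) end, Acc0, Handlers).

A gen_event-based design would instead gen_event:notify/2 the manager process and let the handlers run there, serializing every hook invocation through that one process.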

Don't expect the answer to be too interesting. Either the author wasn't familiar with gen_event, or it didn't perform well enough back in 2004 when ejabberd_hooks was added.

Related

Which behaviour should be implemented?

So we have a generic part and a specific part, thanks to Erlang/OTP behaviours, which greatly simplifies our code and gives us a common skeleton that everybody understands.
I am an Erlang newbie trying to get my head around the concepts of OTP.
But amongst
gen_server
gen_fsm
gen_event
supervisor
which behaviour is suitable for what kind of application?
The reason I am asking is that, after going through some verbose theory, I am still unsure which one is best suited for an XYZ application.
Are there any criteria or rules of thumb, or is it just the programmer's instinct that matters in this case?
Any help?
As a remark, this is a very general question that doesn't fit the usage of this community very well.
Supervisors are there to control the way your application starts and evolves over time.
A supervisor starts processes in the right order, verifies that everything is going fine, and manages the restart strategy in case of failure, pools of processes, the supervision tree... The advantages are that it helps you follow the "let it crash" rule in your application, and that it concentrates into a few lines of code the complex task of coordinating all your processes.
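For illustration, here is a minimal supervisor sketch (the module and child names are made up) that starts two workers in order and restarts a crashed one on its own:

    %% Minimal supervisor sketch; db_worker and web_worker are hypothetical modules.
    -module(my_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        RestartStrategy = {one_for_one, 5, 10},   % at most 5 restarts in 10 seconds
        Children =
            [{db_worker,  {db_worker,  start_link, []}, permanent, 5000, worker, [db_worker]},
             {web_worker, {web_worker, start_link, []}, permanent, 5000, worker, [web_worker]}],
        {ok, {RestartStrategy, Children}}.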
I use gen_event mainly for logging purposes. You can use this behaviour to connect different handlers on demand, depending on your target. In one of my applications, I have by default a graphical logger that lets me check the behaviour at a glance, and a textual file handler that allows deeper, post-mortem analysis.
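As a rough sketch of such a handler (module name and log format are made up), a gen_event handler that appends every event to a file could look like this; it would be attached with gen_event:add_handler(my_logger, file_log_handler, "app.log") once the my_logger event manager is started:

    %% Sketch of a file-logging gen_event handler (names are illustrative).
    -module(file_log_handler).
    -behaviour(gen_event).
    -export([init/1, handle_event/2, handle_call/2, handle_info/2,
             terminate/2, code_change/3]).

    init(Filename) ->
        {ok, Fd} = file:open(Filename, [append]),
        {ok, Fd}.

    handle_event(Event, Fd) ->
        io:format(Fd, "~p~n", [Event]),   % write each event as one line
        {ok, Fd}.

    handle_call(_Request, Fd) -> {ok, ok, Fd}.
    handle_info(_Info, Fd)    -> {ok, Fd}.
    terminate(_Reason, Fd)    -> file:close(Fd).
    code_change(_OldVsn, Fd, _Extra) -> {ok, Fd}.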
gen_server is the basic brick of all applications. It lets you keep state and react to calls, casts and even plain messages. It manages code changes and provides hooks for starting and stopping. Most of my modules are gen_servers; you can see one as an actor responsible for a single task.
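A minimal gen_server sketch along those lines, keeping a simple counter as its state (the module and its API are illustrative):

    %% Minimal gen_server sketch: a counter with a small client API.
    -module(counter_server).
    -behaviour(gen_server).
    -export([start_link/0, increment/0, value/0]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

    increment()  -> gen_server:cast(?MODULE, increment).
    value()      -> gen_server:call(?MODULE, value).

    init([]) -> {ok, 0}.

    handle_call(value, _From, Count) -> {reply, Count, Count}.
    handle_cast(increment, Count)    -> {noreply, Count + 1}.
    handle_info(_Msg, Count)         -> {noreply, Count}.
    terminate(_Reason, _Count)       -> ok.
    code_change(_OldVsn, Count, _Extra) -> {ok, Count}.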
gen_fsm adds to gen_server the notion of a current state, with behaviour specific to that state and transitions from one state to another. It can be used, for example, in a game to control the evolution of the game (configuration, waiting for players, playing, changing level...).
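A sketch of such a gen_fsm, with made-up state names for a small game lobby that moves from waiting_for_players to playing:

    %% Sketch of a gen_fsm for a small game lobby (state names and events are
    %% made up): it waits for two players, then switches to the playing state.
    -module(game_fsm).
    -behaviour(gen_fsm).
    -export([start_link/0, join/1]).
    -export([init/1, waiting_for_players/2, playing/2,
             handle_event/3, handle_sync_event/4, handle_info/3,
             terminate/3, code_change/4]).

    start_link() -> gen_fsm:start_link({local, ?MODULE}, ?MODULE, [], []).

    join(Player) -> gen_fsm:send_event(?MODULE, {join, Player}).

    init([]) -> {ok, waiting_for_players, []}.

    %% One function per state; the return value names the next state.
    waiting_for_players({join, Player}, Players) when length(Players) >= 1 ->
        {next_state, playing, [Player | Players]};
    waiting_for_players({join, Player}, Players) ->
        {next_state, waiting_for_players, [Player | Players]}.

    playing(_Event, Players) -> {next_state, playing, Players}.

    handle_event(_Event, StateName, Data)             -> {next_state, StateName, Data}.
    handle_sync_event(_Event, _From, StateName, Data) -> {reply, ok, StateName, Data}.
    handle_info(_Info, StateName, Data)               -> {next_state, StateName, Data}.
    terminate(_Reason, _StateName, _Data)             -> ok.
    code_change(_OldVsn, StateName, Data, _Extra)     -> {ok, StateName, Data}.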
When you create an application, you will identify the different tasks you have to do and assign each of them to either a gen_server or a gen_fsm. You will add gen_event to provide a publish/subscribe mechanism. Then you will group all processes according to their dependencies and error recovery; this will give you the supervision tree and restart strategy.
I use the following rules.
If you have a long-living entity that serves multiple clients in a request-response manner, use gen_server.
If this long-living entity must support a more complex protocol than plain request/response, it is a good idea to use gen_fsm. I like to use gen_fsm for network protocol implementations, for example.
If you need an abstract event bus, where any process can publish messages and any client may want to receive them, you need gen_event.
If you have processes which must be restarted together (for example, when one of them fails), you should put them under a supervisor, which ties them together.
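As a sketch of that last rule (the child modules session and presence are hypothetical), a supervisor with a one_for_all strategy restarts both workers whenever either of them dies, in contrast to the one_for_one strategy of the earlier sketch, where only the failed child is restarted:

    %% Sketch: tie two workers together with one_for_all, so that when either
    %% crashes both are restarted.
    -module(pair_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() -> supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        {ok, {{one_for_all, 5, 10},
              [{session,  {session,  start_link, []}, permanent, 5000, worker, [session]},
               {presence, {presence, start_link, []}, permanent, 5000, worker, [presence]}]}}.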

How to fit non-event-driven processes into a supervision tree?

I want to be able to spawn a lot of processes that process data and fit them into a supervision tree. However, all the default behaviours, namely gen_server, gen_fsm and gen_event, are event-driven: they have to receive messages to do anything. What I need are just processes that process data and, in case they terminate abnormally, are restarted by their supervisor. What's the best way to go about doing this?
Yes, the standard behaviours all function as servers in that they sit and wait for requests before they do something. However, OTP is open in the sense that it provides the tools you need to implement processes which are not behaviours but which still fit into supervision trees and do "the right thing". For a description of what needs to be done and how to do it, see the section on special processes (sys and proc_lib) in the Erlang documentation.
This is really not surprising, as all the OTP behaviours are implemented in Erlang, so all the "tools" are there in the libraries.
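Following that documentation, a special process is started through proc_lib and answers system messages through sys. The sketch below (with a made-up crunch/1 standing in for the real data processing) shows the shape of such a worker that fits under a supervisor:

    %% Sketch of a "special process": started with proc_lib so it fits under a
    %% supervisor, and answering system messages via sys. crunch/1 is a placeholder.
    -module(cruncher).
    -export([start_link/0, init/1]).
    -export([system_continue/3, system_terminate/4, system_code_change/4]).

    start_link() ->
        proc_lib:start_link(?MODULE, init, [self()]).

    init(Parent) ->
        Deb = sys:debug_options([]),
        proc_lib:init_ack(Parent, {ok, self()}),
        loop(Parent, Deb, initial_state).

    loop(Parent, Deb, State) ->
        %% Do a bounded chunk of work, then check the mailbox for system messages.
        NewState = crunch(State),
        receive
            {system, From, Request} ->
                sys:handle_system_msg(Request, From, Parent, ?MODULE, Deb, NewState)
        after 0 ->
            loop(Parent, Deb, NewState)
        end.

    crunch(State) ->
        State.  % placeholder for the real data processing

    %% sys callbacks
    system_continue(Parent, Deb, State) ->
        loop(Parent, Deb, State).
    system_terminate(Reason, _Parent, _Deb, _State) ->
        exit(Reason).
    system_code_change(State, _Module, _OldVsn, _Extra) ->
        {ok, State}.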

How to call (and sleep) on gen_tcp:accept and still process system messages at the same time?

I'm still kind of new to the Erlang/OTP world, so I guess this is a pretty basic question. Nevertheless, I'd like to know the correct way of doing the following.
Currently, I have an application with a top supervisor. The latter supervises workers that call gen_tcp:accept (sleeping on it) and then spawn a process for each accepted connection. Note: for this question, it is irrelevant where the listen() is done.
My question is about the correct way of making these workers (the ones that sleep on gen_tcp:accept) respect the OTP design principles, so that they can handle system messages (shutdown, trace, etc.), according to what I've read here: http://www.erlang.org/doc/design_principles/spec_proc.html
So,
Is it possible to use one of the available behaviours, like gen_fsm or gen_server, for this? My guess would be no, because of the blocking call to gen_tcp:accept/1. Is it still possible by specifying an accept timeout? If so, where should I put the accept() call?
Or should I code it from scratch (i.e. without using an existing behaviour), like the examples in the link above? In that case, I thought about a main loop that calls gen_tcp:accept/2 instead of gen_tcp:accept/1 (i.e. specifying a timeout), immediately followed by a receive block so I can process the system messages. Is this correct/acceptable?
Thanks in advance :)
As Erlang is event-driven, it is awkward to deal with code that blocks, as accept/{1,2} does.
Personally, I would have a supervisor which has a gen_server for the listener, and another supervisor for the accept workers.
Hand-roll an accept worker that times out (gen_tcp:accept/2), effectively polling (the awkward part), rather than receiving a message for status.
This way, if a worker dies, it gets restarted by the supervisor above it.
If the listener dies, it restarts, but not before restarting the worker tree and supervisor that depended on that listener.
Of course, if the top supervisor dies, it gets restarted.
However, if you call supervisor:terminate_child/2 on the tree, you can effectively disable the listener and all acceptors for that socket. Later, supervisor:restart_child/2 can restart the whole listener + acceptor worker pool.
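A rough sketch of such a hand-rolled acceptor (the module and function names are made up; it assumes the listening socket was opened with {active, false} and is handed over by the listener): it polls gen_tcp:accept/2 with a timeout and checks its mailbox between polls. For full system-message handling it would additionally follow the special-process pattern shown earlier.

    %% Sketch of a polling acceptor worker. ListenSocket is owned by the
    %% listener gen_server and passed in at start; it must be a passive socket.
    -module(acceptor).
    -export([start_link/1, init/1]).

    start_link(ListenSocket) ->
        proc_lib:start_link(?MODULE, init, [{self(), ListenSocket}]).

    init({Parent, ListenSocket}) ->
        proc_lib:init_ack(Parent, {ok, self()}),
        accept_loop(ListenSocket).

    accept_loop(ListenSocket) ->
        case gen_tcp:accept(ListenSocket, 1000) of       % poll with a 1 s timeout
            {ok, Socket} ->
                Pid = spawn_link(fun() -> handle_connection(Socket) end),
                ok = gen_tcp:controlling_process(Socket, Pid),
                accept_loop(ListenSocket);
            {error, timeout} ->
                %% Nothing accepted: check the mailbox (e.g. for a shutdown request).
                receive
                    stop -> ok
                after 0 ->
                    accept_loop(ListenSocket)
                end;
            {error, Reason} ->
                exit(Reason)   % crash so the supervisor above can restart us
        end.

    handle_connection(Socket) ->
        %% Placeholder: echo whatever arrives until the peer closes.
        case gen_tcp:recv(Socket, 0) of
            {ok, Data}      -> gen_tcp:send(Socket, Data), handle_connection(Socket);
            {error, closed} -> ok
        end.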
If you want an app to manage this for you, cowboy implements the above. Although HTTP-oriented, it easily supports a custom handler for whatever protocol you want to use instead.
I've actually found the answer in another question: Non-blocking TCP server using OTP principles and here http://20bits.com/article/erlang-a-generalized-tcp-server
EDIT: The specific answer that was helpful to me was: https://stackoverflow.com/a/6513913/727142
You can make it as a gen_server similar to this one: https://github.com/alinpopa/qerl/blob/master/src/qerl_conn_listener.erl.
As you can see, this process is doing tcp accept and processing other messages (e.g. stop(Pid) -> gen_server:cast(Pid,{close}).)
HTH,
Alin

Erlang Design Advice regarding HTTP services

I'm new to Erlang, but I would like to get started with an application which seems a good fit for the technology, given the concurrency requirements I have.
This picture highlights what I want to do.
http://imagebin.org/163917
Messages are pulled from a queue and routed to worker processes which have previously been set up as a result of a user filling in a form in a Django app. The setup requires an additional lookup in a pre-existing database (so I don't want to use ETS/DETS for this bit), which then talks to the message router and creates the relevant process.
My issue is that I may later want to ask my Django app for all the workers that need to be set up and task them in the first place, so what is the best way to communicate here? I favour HTTP/JSON and have read what little I can find on Mochiweb and MochiJson, and I think they would do what I want. I was planning on having an OTP supervisor and application, so would it be sensible to have a separate mochiweb process which then passes Erlang messages to the router?
I have struggled a little with mochiweb because all the tutorials talk about using a script to create a directory structure, which seems to make mochiweb central to the design - that isn't what I want here; I want a lightweight mochiweb process that does occasional work.
Please tear this apart, all comments welcome.
Cheers
Dave
mochiweb is awesome, but I think what you actually want is webmachine. The complete documentation is available here and here. In a nutshell, webmachine is a toolkit for building REST applications, which I think is what you want. It uses mochiweb behind the scenes but hides all of the complex (and undocumented) details. When you create a webmachine project you'll get a complete OTP application and a default resource. From there you'll do something like the following:
Add your own resources (or modify + rename the default one).
Modify the dispatcher so your resources and paths make sense for your app.
Add code to create and monitor your worker processes - probably a gen_server and a supervisor. See this and related articles for ideas. Note you'll want to start both under the main supervisor provided to you when you created your project.
Modify your resources to communicate with your gen_server.
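As a sketch of that last step (assuming a hypothetical worker_manager gen_server and a resource module named workers_resource), a webmachine resource's to_html/2 can simply call into the gen_server:

    %% Sketch of a webmachine-style resource that asks a gen_server for its data
    %% (worker_manager and its list_workers call are hypothetical).
    -module(workers_resource).
    -export([init/1, to_html/2]).

    init([]) ->
        {ok, undefined}.

    to_html(ReqData, State) ->
        Workers = gen_server:call(worker_manager, list_workers),
        Body = io_lib:format("<pre>~p</pre>", [Workers]),
        {Body, ReqData, State}.

It would then be wired up in the dispatch list with an entry along the lines of {["workers"], workers_resource, []}.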
I didn't quite follow everything else you are asking - it may be easier to answer any follow-up questions in comments.

mochiweb and gen_server

[This will only make sense if you've seen Kevin Smith's 'Erlang in Practice' screencasts]
I'm an Erlang noob trying to build a simple Erlang/OTP system with an embedded webserver [mochiweb].
I've walked through the EIP screencasts, and I've toyed with simple mochiweb examples created using the new_mochiweb.erl script.
I'm trying to figure out how the webserver should relate to the gen_server modules. In the EIP examples [Ch7], the author creates a web_server.erl gen_server process and links the mochiweb_http process to it. However in a mochiweb project, the mochiweb_http process seems to be 'standalone'; it doesn't seem to be embedded in a separate gen_server process.
My question is: should one of these patterns be preferred over the other? If so, why? Or doesn't it matter?
Thanks in advance.
You link processes to the supervisor hierarchy of your application for two reasons: 1) to be able to restart your worker processes if they crash, and 2) to be able to kill all your processes when you stop the application.
As the previous answer says, 1) is not the case for HTTP request-handling processes. However, 2) is valid: if you leave your processes unsupervised, you can't guarantee that all of them will be cleared from the VM after stopping your application (think of processes stuck in endless loops, blocked in receives, etc.).
The reason to embed a process in a supervision tree is so that you can restart it if it fails.
A process that handles an HTTP request is responding to an event generated externally, in a browser. It is not possible to restart it (that is the prerogative of the person running the browser), so it is not necessary to run it under OTP; you can just spawn it without supervision.
