OTP behaviours: gen_fsm; gen_event. Practical examples? - erlang

I have used both the supervisor and gen_server behaviours, and I can understand the practical uses for both of them. However, I don't really understand the use of the gen_fsm and the gen_event behaviours. Can someone clarify with practical examples?
Thanks in advance

One classic example for FSM would be lock with timeout which is mentioned in manual,
Another example which I implemented in my experience would be telephone lines, because phone have states, like ringing, connected, disconnected etc and some operations are allowed and some are not allowed during this states.
An example for event is logging used in https://github.com/basho/lager

gen_fsm is a neat implementation of finite state machine, you cane do roughly the same thing that you do with a gen_server, and in addition manage easily the different states of your application (for example in a game server select a level, a table, modify player attribute, play, save, restore...).
gen-event is an easy way to dispach event, your application send all event to the gen_event knowing nothing about potential usage, and you dynamically add and delete handlers, with different behavior (log in file, in a database, display information in graphical interface...). I have used this to have a graphical view of the processes state and communication of my application, and file log for performance analysis.

Some good examples you can find them here:
"Event handlers" and "Finite State Machines"
gen_fsm:
The gen_fsm behaviour is somewhat similar to gen_server in that it is
a specialised version of it. The biggest difference is that rather
than handling calls and casts, we're handling synchronous and
asynchronous events. Much like our dog and cat examples, each state is
represented by a function. Again, we'll go through the callbacks our
modules need to implement in order to work.
gen_event:
The gen_event behaviour differs quite a bit from the gen_server and
gen_fsm behaviours in that you are never really starting a process.
The gen_event behaviour basically runs the process that accepts and
calls functions, and you only provide a module with these functions.
This is to say, you have nothing to do with regards to event
manipulation except give your callback functions in a format that
pleases the event manager. All managing is done for free; you only
provide what's specific to your application. This is not really
surprising given OTP is, again, all about separating what's generic
from specific.

Related

Clarification of Events vs Observer vs MailboxProcessor in F#

I have a system, connected to financial markets, that makes a very heavy use of events.
All the code is structured as a cascade of events with filters, aggregations, etc in between.
Originally the system was written in C# and then ported to F# (which in retrospect was a great move) and events in the C# code got replaced by events in F# without giving it much thoughts.
I have heard about the observer pattern, but I haven't really gone through the topic. And recently, I have read, through some random browsing, about F#'s Mailbox processor.
I read this: Difference between Observer Pattern and Event-Driven Approach and I didn't get it, but apparently over 150 people voted that the answer wasn't too clear as well :)
In an article like this: https://hackernoon.com/observer-vs-pub-sub-pattern-50d3b27f838c it seems like the observer pattern is strictly identical to events...
At first glance, they seem to be solving the same kind of problems, just with different interfaces but that got me to think about 2 questions:
Is the mailbox processor really a thing being used? it seems to appear mostly in older documentation and, in the packages I'm using, I haven't come across any using it
Regarding the observer pattern, only one package across the sizeable amount we're using makes internal use of it, but everything else is just using basic events.
Are there specific use cases fitting the Observable pattern and the MailboxProcessor? Do they have features that are unique? or are they just syntactic help around events in the end?
As simplified as possible:
Mailbox
This is a minimal implementation of the actor model.
You post messages to a queue, and your loop reads the messages from the queue, one by one. Maybe it posts to another mailbox or it does something with the messages.
Any action can only take place when a message is received.
Posting to the queue is non-blocking, i.e, no back-pressure.
All exceptions are caught and exposed as an event on the mailbox. They are expected to be handled by the actor above it.
Other actor frameworks provide features like supervisors, contracts, failover, etc.
Events
Events are a language supported callback mechanism.
It's a simple implementation. You register a callback delegate, and when the event is raised, your delegate is called.
Delegates are called in the order they are added.
Events are blocking, and synchronous. The one delegate blocks, the rest are delayed.
Events are about writing code to respond to events, as opposed what came before it, which was polling.
The handler for an event is usually the final end-point for that event, and it usually has side-effects.
Sharing a handler is common. For example, ten buttons might have the same function handling clicks, because the sender of the event is known.
You handle exceptions by yourself, typically in the handler code
Observables
There's a source (Observable) which you can subscribe to with a sink (Observer). An observable represents a bounded or un-bounded stream of values. An unbounded stream (an Observable which never completes) seems similar to an event, but there are several important properties to Observables.
An Observable emits a series of notifications, which follows this contract:
OnNext* (OnError|OnCompleted)+
All notifications are serialized
Notifications may or may not be synchronous. There's no guarantee of back-pressure.
The value of Observables lies in the fact that they are compose-able.
An observable represents a stream of future notifications, operators act to transform this stream.
This approach is sometimes called complex event processing (CEP).
Exception handling is part of the pipeline, and there are many combinators to deal with it.
You typically never implement an Observer yourself. You use combinators to set up a pipeline which models the behavior you want.

Is the process dictionary appropriate in this case?

I've read several comments here and elsewhere suggesting that Erlang's process dictionary was a bad idea and should die. Normally, as a total Erlang newbie, I'd just avoid it. However, in this situation my other options aren't great.
I have a main dispatcher function that looks something like this:
dispatch(State) ->
receive
{cmd1, Params} ->
NewState = do_cmd1_stuff(Params, State),
dispatch(NewState);
{cmd2, Params} ->
NewState = do_cmd2_stuff(Params, State),
dispatch(NewState);
BadMsg ->
log_error(BadMsg),
dispatch(State)
end.
Obviously, my names are more meaningful to me, but that's the gist of it. Deep down in a function called by a function called by a function called by do_cmd2_stuff(), I want to send out messages to all my users telling them about something I've done. In order to do that, I need to get the list of users from the point where I send the messages. The user list doesn't lend itself easily to sticking in the global state, since that's just one data structure representing the only block of data on which I operate.
The way I see it, I have a couple unpleasant options other than using the process dictionary. I can send the user list through all the various levels of functions down to the very bottom one that does the broadcasting. That's unpleasant because it causes all my functions to gain a parameter, whether they really care about it or not.
Alternatively, I could have all the do_cmdN_stuff() functions return a message to send. That's not great either though, since sending the message may not be the last thing I want to do and it clutters up my dispatcher with a bunch of {Msg, NewState} tuples. Furthermore, some of the functions might not have any messages to send some of the time.
Like I said earlier, I'm very new to Erlang. Maybe someone with more experience can point me at a better way. Is there one? Is the process dictionary appropriate in this case?
The general rule is that if you have doubts, you shouldn't use the process dictionary.
If the two options you mentioned aren't good enough (I personally like the one where you return the messages to send) and what you want is some particular piece of code to track users and forward messages to them, maybe what you want to do is have a process holding that info.
Pid ! {forward, Msg}
where Pid will take care of sending everything to a bunch of other processes. Now, you would still need to pass the Pid around, unless you give it a name in some registry to find it. Either with register/2, global or gproc.
A simple answer would be to nest your global within a state record, which is then threaded through the system, at least at the stop level. This makes it easy to add new fields to the state in the future, not an uncommon occurrence, and allow you to keep your global state data structure untouched. So initially
-record(state, {users=[],state_data}).
Defining it as a record makes it easy to access and extend when necessary.
As you mentioned you can always pass the user list as extra param, thats not so bad.
If you don't want to do this just put it in State. You can have a special State just for this part of the calculation that also contains the user list.
Then there always is the possibility of putting it in ETS or in another server process.
What exactly to do is hard to recommend since it depends a lot on your exact application and preferences.
Just choose from the mentioned possibilities as if the process dictionary doesn't exist. Maybe your code needs restructuring if none of the variants look elegant, there always is some better way without the process dictionary.
Its really bad it is still there, because its alluring to many beginning Erlang users.
You really should not use process dictionary. I accept using dictionary only if
It is short living process.
I have full control about the process from spawn to termination i.e. I use minimum and well known set of external modules.
I need performance gain badly. It means avoid copy of data when using ets and dict/gb_tree is too slow (for GC reason).
ad 1. is not your case, you are using in server. ad 2. I don't know if it is your case. ad 3. is not your case because you need list of recipient so you don't gain nothing from that process dictionary is very fast key/value storage. In your case I don't see any reason why you should not include what you need to your State. IMHO State is exactly the right place for it.
Its an interesting question because it involves the fundamentals of functional design.
My opinion:
Try as much as possible to make the function return the messages, then send them. This separates the two different tasks nicely, and separates the purely functional task from the one that causes side effects.
If this isn't possible, pass receivers as argument even if its a bit messy. If the broadcasting function uses that data, it should be given to it explicitly, for clarity and predictability.
Using ETS as Peer Stritzinger suggests is really not any better than the PD, both hides the fact that the broadcasting function uses the receiver list and makes it dependent on global data.
I'm not sure about the Erlang way of encapsulating some state in a process, as I GIVE TERRIBLE ADVICE suggests. Is it really any better that ETS or PD?
clutters up my dispatcher with a bunch
of {Msg, NewState}
This is my experience also, that you often end up like this. It's not particularly pretty, but functional design seems to encourage this. Could some language feature be introduced to make it more beautiful and natural?
EDIT:
6 years ago I wrote:
Could some language feature be introduced to make it more beautiful and natural?
After learning much more about functional programming I have realised that examples of this are state-monads and do-notation that are found in Haskell.
I would consider sending a special message to self() from deep inside the call stack, and handling it at the top level dispatch method that you've sketched, where list of users is available.

Event manager process in erlang. Named processes or Pids?

I have event manager process that dispatches events to subscribers (e.g. http_session_created, http_sesssion_destroyed). If Pid is used instead of named process, I must put it into functions to operate with event manager but if Named process is used, code will be more clear.
Which variant is right?
Thank you!
While there is no actual difference to the process naming a process, registering it, makes it global. You in essence you are telling the system that here is a global service which anyone can use.
From you description it more sounds like you are giving them names to save the, small, effort of carrying them around in your loop. If this is the case I would put their pids in a record with all the other state data you carry around. This much better indicates the type of the processes.
If you have a fixed set of "subscriber" processes, then use registered names IMO.
If, on the contrary, you have a publish/subscribe sort of architecture where subscribers come and go, then you need an infrastructure to track those and from this point it doesn't really matter if you use Pid() or names.
One of the drawbacks of using registered names is that you need to track them in your code base to avoid "collisions". So it is up to you: personally, I tend to favor named processes as, like you say, it makes reading the code clearer. One way or another, OTP doesn't care.

Erlang gen_server vs stateless module

I've recently finished Joe's book and quite enjoyed it.
I'm since then started coding a soft realtime application with erlang and I have to say I am a bit confused at the use of gen_server.
When should I use gen_server instead of a simple stateless module?
I define a stateless module as follow:
- A module that takes it's state as a parameter (much like ETS/DETS) as opposed to keeping it internally (like gen_server)
Say for an invoice manager type module, should it initialize and return state which I'd then pass subsequently to it?
SomeState = InvoiceManager:Init(),
SomeState = InvoiceManager:AddInvoice(SomeState, AnInvoiceFoo).
Suppose I'd need multiple instances of the invoice manager state (say my application manages multiple companies each with their own invoices), should they each have a gen_server with internal state to manage their invoices or would it better fit to simply have the stateless module above?
Where is the line between the two?
(Note the invoice manage example above is just that, an example to illustrate my question)
I don't really think you can make that distinction between what you call a stateless module and gen_server. In both cases there is a recursive receive loop which carries state in at least one argument. This main loop handles requests, does work depending on the requests and, when necessary, sends results back the requesters. The main loop will most likely handle a number of administrative requests as well which may not be part of the main API/protocol.
The difference is that gen_server abstracts away the main receive loop and allows the user to only the write the actual user code. It will also handle many administrative OTP functions for you. The main difference is that the user code is in another module which means that you see the passed through state more easily. Unless you actually manage to write your code in one big receive loop and not call other functions to do the work there is no real difference.
Which method is better depends very much on what you need. Using gen_server will simplify your code and give you added functionality "for free" but it can be more restrictive. Rolling your own will give you more power but also you give more possibilities to screww things up. It is probably a little faster as well. What do you need?
It strongly depend of your needs and application design. When you need shared state between processes you have to use process to keep this state. Then gen_server, gen_fsm or other gen_* is your friend. You can avoid this design when your application is not concurrent or this design doesn't bring you some other benefits. For example break your application to processes will lead to simpler design. In other case sometimes you can choose single process design and using "stateless" modules for performance or such. "stateless" module is best choice for very simply stateless (pure functional) tasks. gen_server is often best choice for thinks that seems naturally "process". You must use it when you want share something between processes (using processes can be constrained by scalability or concurrency).
Having used both models, I must say that using the provided gen_server helps me stay structured more easily. I guess this is why it is included in the OTP stack of tools: gen_server is a good way to get the repetitive boiler-plate out of the way.
If you have shared state over multiple processes you should probably go with gen_server and if the state is just local to one process a stateless module will do fine.
I suppose your invoices (or whatever they stand for) should be persistent, so they would end up in an ETS/Mnesia table anyway. If this is so, you should create a stateless module where you put your API for accessing the invoice table.

EventAggregator vs CompositeCommand

I worked my way through the Prism guidance and think I got a grasp of most of their communication vehicles.
Commanding is very straightforward, so it is clear that the DelegateCommand will be used just to connect the View with its Model.
It is somewhat less clear, when it comes to cross Module Communication, specifically when to use EventAggregation over Composite Commands.
The practical effect is the same e.g.
You publish an event -> all subscribers receive notice and execute code in response
You execute a composite command -> all registered commands get executed and with it their attached code
Both work along the lines of "fire and forget", that is they don't care about any responses from their subscribers after firing the event/executing the commands.
I have trouble seeing a practical difference in usage although I understand that the implementation of both (under the hood) is very different.
So should we think of what it actually means - Event? Is that when something happens (an event occurs)? Something the user did not directly request like a "web request completed"?
And Command? Does that mean a user clicked something and thus issued a command to our application, requesting a service directly?
Is that it? Or are there other ways to determine when to use one of these communication vehicles over the other. The guidance, although one of the best documentations I read, gives no specific explanation.
So I hope people involved in/using Prism can help in shedding some light on this.
There are two primary differences between these two.
CanExecute for Commands. A Command
can say whether or not it is valid
for execution by calling
Command.RaiseCanExecuteChanged() and
having its CanExecute delegate
return false. If you consider the
case of a "Save All"
CompositeCommand compositing several
"Save" commands, but one of the
commands saying that it can't
execute, the Save All button will
automatically disable (nice!).
EventAggregator is a Messaging
pattern and Commands are a
Commanding pattern. Although
CompositeCommands are not explicitly
a UI pattern, it is implicitly so
(generally they are hooked up to an
input action, like a Button click).
EventAggregator is not this way -
any part of the application
effectively raise an EventAggregator
event: background processes,
ViewModels, etc. It is a
brokered avenue for messaging
across your application with support
for things like filtering,
background thread execution, etc.
Hope this helps explain the differences. It's more difficult to say when to use each, but generally I use the rule of thumb that if it's user interaction that raises the event, use a command for anything else, use EventAggregator.
Hope this helps.
Additionally, there is one more important difference: With the current implementation, an event from the EventAggregator is asynchronous, while the CompositeCommand is synchronous.
If you want to implement something like "notify that event X happened; do something that relies on the event handlers for event X to have executed", you either have to do something like Application.DoEvents() or use CompositeCommands.

Resources