Architecture(structure)-oriented vs. feature-oriented project structure - project-organization

The project I am involved in has an architecture-oriented file/folder structure:
Root
|____ Node1
|     |____ Event Handlers
|     |     |___ <all event handlers of project>
|     |____ Events
|     |     |___ <all events of project>
|     |____ Request Handlers
|     |     |___ <all request handlers of project>
|     |____ Requests
|     |     |___ <all requests of project>
|     |____ ...
It is clear from the architectural point of view of the system (it was proposed by the development team).
A feature-oriented structure has been proposed by the designer team:
Root
|____ Feature #1
|     |____ Event Handlers
|     |     |___ <all event handlers of Feature #1>
|     |____ Events
|     |     |___ <all events of Feature #1>
|     |____ Request Handlers
|     |     |___ <all request handlers of Feature #1>
|     |____ Requests
|     |     |___ <all requests of Feature #1>
|____ ...
This variant is closer to the designers, and it clearly describes the feature to be implemented.
Our teams have started a holy war over which approach is best.
Could somebody explain the pros and cons of the first and second approaches?
Maybe there is a third one that is more useful and beneficial for both of us.
Thank you.

I would choose the second option, for the maintainability of a long-lived application.
Let me explain it with an example:
One day, a year after the application's release, and months after the team who wrote the original code has left, a user detects and reports a bug in a certain process. The ticket will surely look something like "This stuff does not work", which, after some email ping-pong, will end up being "I can't save a multi-product order for an Australian customer".
Well, with the first project structure, you have to search among all the project's Request and Event Handlers to find the buggy code.
With the second one, you can narrow your search to the order-saving module (or, depending on your structure's granularity, the "overseas/multi-product order saving" module).
It can save a lot of time and ease maintainability, IMO.


Message queue that removes messages on consumption

I'm working on a real-time pipeline (you know the term "real-time" is tricky, so don't worry about delays), and my first thought was to use Kafka with some Python consumers. However, Kafka does not let you remove messages once they are consumed, which I don't like.
I'm looking for something that:
It is a publisher/consumer framework in which I can delete a message as soon as it has been processed; I want to process each message only once.
I can develop the consumers using Python.
It works "Kafka style", i.e. it scales horizontally, so it performs well.
Basically, what I want to build is a pipeline similar to this:
input stream -> queue 1 -> consumer 1.1 -> queue 2 -> consumer 2.1 -> queue 3 ...
                           consumer 1.2 ...
                           ...
                           consumer 1.N
Right now I'm taking a look at RabbitMQ. At least, it seems that I'll be able to clean the queues with it.
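To make the behaviour I'm after concrete, here is a minimal Python sketch, using the standard-library queue.Queue as a stand-in for a broker (none of this is any broker's actual API): the message disappears from the queue the moment a consumer takes it, which is the effect acknowledging a delivery gives you in RabbitMQ.

```python
import queue

# Stand-in for a broker queue: a message is removed the moment a
# consumer takes it, so each message is processed exactly once.
broker = queue.Queue()

for msg in ("order-1", "order-2", "order-3"):
    broker.put(msg)

processed = []
while not broker.empty():
    msg = broker.get()        # removes the message from the queue
    processed.append(msg)     # "process" it
    broker.task_done()        # acknowledge completion

print(processed)              # every message seen once
print(broker.empty())         # nothing left to redeliver
```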
Do you guys know some good alternatives?

NServiceBus and multiple message types on same SQS Queue

I am fairly new to NServiceBus and have run into a problem that I think may have to do with my architecture.
I have one SQS queue with three SNS topics. So let's say:
Queue1
MessageType1
MessageType2
MessageType3
I have created three NServicebus subscribers that will all run as three separate services. Each Subscriber is monitoring Queue1, and each one has a handler for a different message type. This is a rough sketch of how I am envisioning this to work:
[Outside Publisher] --MessageType1--> [Queue1] --MessageType1--> [Subscriber1]
      [Subscriber1] --MessageType2--> [Queue1] --MessageType2--> [Subscriber2]
      [Subscriber1] --MessageType3--> [Queue1] --MessageType3--> [Subscriber3]
An outside service publishes MessageType1 to Queue1. Subscriber1 picks up the message, does some processing, and publishes MessageType2 and MessageType3 back to Queue1. Then Subscribers 2 and 3 pick up their respective messages and do their thing.
But what is happening is that it is random which subscriber (1, 2, or 3) picks up the initial MessageType1. So Sub2 picks it up and errors, because it doesn't have a handler for it.
Why is this happening? I thought NServiceBus would only pick up messages it has a handler for. Does NServiceBus only like one message type per queue? Do I need to make separate queues for each message type?
I am hoping there is some way to configure the three subscriber services to pick up only the intended messages, but I realize that maybe my understanding of NServiceBus is lacking and I need to rethink my design.
Yep, you've got some misconceptions going on here, let's see if I can help clear them up.
What you call "subscribers" are not subscribers. A queue defines a logical endpoint, and multiple processes monitoring that single queue are endpoint instances, not subscribers. They cooperate to scale out the processing of a single queue.
A single queue can process multiple message types if you so choose, but when any endpoint instance asks for a message from the queue, it can't control which type it will get. The queue is just a line, it's going to get the message that's next, whatever that message is.
So all the endpoint instances have to have ALL of the message handlers for messages that could go through that queue, or you will get that error.
The only reasons to have multiple endpoint instances are scalability (process more messages at a time) and availability (process messages on ServerA even while ServerB is being rebooted).
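To see why the error occurs, here is a tiny Python simulation of the point above (hypothetical names, not NServiceBus code): an instance polling a shared queue gets whatever message is next, so any instance missing a handler for a type that travels through that queue will eventually fail.

```python
from collections import deque

# One shared queue carrying several message types.
queue1 = deque(["MessageType1", "MessageType2", "MessageType3"])

def make_instance(handlers):
    """An endpoint instance: it can only process types it has handlers for."""
    def poll():
        msg = queue1.popleft()          # gets whatever is next, any type
        if msg not in handlers:
            raise KeyError(f"no handler for {msg}")
        return handlers[msg](msg)
    return poll

# An instance with a handler for only one type fails on the wrong message:
sub2 = make_instance({"MessageType2": lambda m: f"handled {m}"})
try:
    sub2()                              # next message is MessageType1
except KeyError as e:
    print(e.args[0])                    # no handler for MessageType1

# An instance with ALL the handlers never fails, whatever comes next:
all_handlers = {t: (lambda m: f"handled {m}") for t in
                ("MessageType1", "MessageType2", "MessageType3")}
sub_ok = make_instance(all_handlers)
print(sub_ok())                         # handled MessageType2
```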
Actual subscribers are different. Subscribers are where (in SQS/SNS parlance) a single SNS topic delivers copies of a message to multiple queues. You publish OrderPlaced and one copy goes to the Sales queue just so we can store that a sale was made, and a copy goes to the Billing queue (so that the credit card can be charged) and another copy yet goes to the Warehouse queue (so that they can start the process of getting it ready to put in a box.)
The power of subscribers is that maybe 6 months down the line you create another Subscriber called CustomerCare that subscribes to OrderPlaced in order to store a running total of how much that customer has bought over the past year (see Death to the batch job) and the important bit is you DON'T have to go back to the original code where the order was placed—you just add another subscriber with its own queue.
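That fan-out can be sketched in a few lines of Python (an illustrative stand-in, not the SNS or NServiceBus API): the topic delivers a copy of each published message to every subscribed queue, so adding CustomerCare later touches only the subscriptions, never the publishing code.

```python
from collections import deque

class Topic:
    """Stand-in for an SNS topic: delivers a COPY of each message
    to every subscribed queue."""
    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        for q in self.queues.values():
            q.append(message)           # every subscriber gets its own copy

order_placed = Topic()
sales = order_placed.subscribe("Sales")
billing = order_placed.subscribe("Billing")
warehouse = order_placed.subscribe("Warehouse")

order_placed.publish("OrderPlaced #1001")

# Six months later: a new subscriber, with no change to the publisher.
customer_care = order_placed.subscribe("CustomerCare")
order_placed.publish("OrderPlaced #1002")

print(list(sales))          # ['OrderPlaced #1001', 'OrderPlaced #1002']
print(list(customer_care))  # ['OrderPlaced #1002'] -- only sees new orders
```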
You might want to check out the NServiceBus step-by-step tutorial which goes over this in a lot of detail.

Which behaviour should be implemented?

So we have the generic part and the specific part, thanks to Erlang/OTP behaviours, which greatly simplifies our code and gives us a common skeleton that everybody understands.
I am an Erlang newbie and trying to get my head around concepts of OTP.
But amongst
gen_server
gen_fsm
gen_event
supervisor
which behaviour is suitable for which kind of application?
The reason I am asking is that, even after going through some verbose theory, I am still unsure which is best suited for an XYZ application.
Are there any criteria or rules of thumb, or is it just the programmer's instinct that matters here?
Any help?
As a remark, this is a very general question that doesn't fit the usage of this community well.
Supervisors are there to control the way your application starts and evolves over time.
A supervisor starts processes in the right order, verifies that everything is going fine, and manages the restart strategy in case of failure, process pools, the supervision tree... The advantage is that it helps you follow the "let it crash" rule in your application, and it concentrates into a few lines of code the complex task of coordinating all your processes.
I use gen_event mainly for logging purposes. You can use this behaviour to connect different handlers on demand, depending on your target. In one of my applications, I have by default a graphical logger that lets me check the behaviour at a glance, and a text-file handler that allows deeper, post-mortem analysis.
gen_server is the basic brick of all applications. It lets you keep state; react to calls, casts, and even plain messages; manage code changes; and start and stop cleanly. Most of my modules are gen_servers; you can see one as an actor responsible for executing a single task.
gen_fsm brings to gen_server the notion of a current state, with behaviour specific to each state and transitions from state to state. It can be used, for example, in a game to control the evolution of the game (configuration, waiting for players, playing, changing level...).
When you create an application, you will identify the different tasks you have to do and assign each of them to either a gen_server or a gen_fsm. You will add gen_event to allow a publish/subscribe mechanism. Then you will group all the processes according to their dependencies and error recovery; this gives you the supervision tree and restart strategy.
I use the following rules.
If you have a long-lived entity that serves multiple clients in a request/response manner, use gen_server.
If this long-lived entity has to support a more complex protocol than just request/response, it is a good idea to use gen_fsm. I like to use gen_fsm for network protocol implementations, for example.
If you need an abstract event bus, where any process can publish messages and any client can choose to receive them, you need gen_event.
If you have processes that must be restarted together (for example, if one of them fails), you should put those processes under a supervisor, which ties them together.

Simple_one_for_one application

I have a supervisor which starts simple_one_for_one children. Each child is in fact a supervisor which has its own tree. Each child is started with an unique ID, so I can distinguish them. Each gen_server is then started with start_link(Id), where:
-define(SERVER(Id), {global, {Id, ?MODULE}}).

start_link(Id) ->
    gen_server:start_link(?SERVER(Id), ?MODULE, [Id], []).
So, each gen_server can easily be addressed with {global, {Id, module_name}}.
Now I'd like to turn this child supervisor into an application, so my mother supervisor should start applications instead of supervisors. That should be straightforward, except for one part: passing the ID to the application. Starting a supervisor with an ID is easy: supervisor:start_child(?SERVER, [Id]). How do I do it for an application? How can I start several applications of the same name (so they can share the same .app file) with different IDs (so I can start my children with supervisor:start_child(?SERVER, [Id]))?
If my question is not clear enough, here is my code. So, currently, es_simulator_dispatcher starts es_simulator_sup. I'd like to have this: es_simulator_dispatcher starts es_simulator_app which starts es_simulator_sup. That's all there is to it :-)
Thanks in advance,
dijxtra
Applications don't run under anything else; they are a top-level abstraction. When you start an application with application:start/1, it is started by the application controller, which manages applications. Applications contain code and data, and possibly a supervision tree of processes doing the application's work at runtime. Running multiple invocations of an application does not really make sense, given the nature of applications.
I would suggest reading OTP Design Principles User's Guide for a description of the components of OTP, how they relate and how they are intended to be used.
I don't think applications were meant for the kind of dynamic construction you want. I'd make a single application, because in Erlang, applications are bundles of code more than they are bundles of running processes (you can say they are an artifact of compile time more so than of runtime).
Usually you feed configuration to an application through the built-in configuration system. That is, you use application:get_env(Key) to read something it should use. There is also an application:set_env(...) to feed specific configuration into one - but the preferred way is the config file on disk. This may or may not work in your case.
In some sense, what you are up to corresponds to creating 200 Apache configuration files and then spawning 200 Apache instances next to each other, rather than running a single one and handling the multiple domains inside it.

Erlang Design Advice regarding HTTP services

I'm new to Erlang, but I would like to get started with an application that seems a good fit for the technology, given the concurrency requirements I have.
This picture highlights what I want to do:
http://imagebin.org/163917
Messages are pulled from a queue and routed to worker processes that have previously been set up as a result of a user submitting a form in a Django app. The setup requires an additional lookup in a pre-existing database (so I don't want to use ETS/DETS for this bit), which then talks to the message router and creates the relevant process.
My issue is this: given that I may want to ask my Django app in the future for all the workers that need to be set up, and to task them in the first place, what is the best way to communicate here? I favour HTTP/JSON, and from what little I can find on Mochiweb and MochiJson, I think they would do what I want. I was planning on having an OTP supervisor and application, so would it be sensible to have a separate Mochiweb process which then passes Erlang messages to the router?
I have struggled a little with Mochiweb, because all the tutorials talk about using a script to create a directory structure, which seems to make the design Mochiweb-centric. That isn't what I want here; I want a lightweight Mochiweb process that does occasional work.
Please tear this apart, all comments welcome.
Cheers
Dave
mochiweb is awesome but I think what you actually want is webmachine. The complete documentation is available here and here. In a nutshell, webmachine is a toolkit for making REST applications, which I think is what you want. It uses mochiweb behind the scenes but hides all of the complex (and undocumented) details. When you create a webmachine project you'll get a complete OTP application and a default resource. From there you'll do something like the following:
Add your own resources (or modify + rename the default one).
Modify the dispatcher so your resources and paths make sense for your app.
Add code to create and monitor your worker processes - probably a gen_server and a supervisor. See this and related articles for ideas. Note you'll want to start both under the main supervisor provided to you when you created your project.
Modify your resources to communicate with your gen_server.
I didn't quite follow everything else you are asking - it may be easier to answer any follow-up questions in comments.
