I have services A, B and C (and possibly more) and I may have NA, NB and NC running instances of each.
I want to be able to notify services A, B and C (and every other that may exist) that a given event has occurred (as in fanout), but I want that only exactly one instance of each service receives the notification (as in a work queue).
Can this be achieved with Spring AMQP?
Each your service must provide its own queue binded to that fanout exchange. Independently of instance (but not service) the queue must be the same. So, it will look like several consumer on the same queue, but only one of them will receive a message.
Frankly, you don't need to think about duplications at all. Just develop the app like it will be as only instance. And distribute it in cluster. Only one consumer (per service) will handle your message.
Related
I have several Python application that all connect to a Redis server and consume messages using the pubsub mechanism. I have containerized the applications with Docker and I would like to scale each application by replicating the number of container instances. The challenge is that I don’t want each container to act as an independent subscriber to Redis, meaning I would essentially like to load balance the network traffic so that, when a message is published, only one container receives it per service.
Let’s take the simple example of two services, Service A and Service B. Both services need to be subscribed to the same topic so that each is notified upon a message published to that topic. Each service will process the message differently; in other words the same message will trigger two different outcomes, one executed by Service A and one by Service B. Now, I am trying to imagine an architecture in which these services consist of replicated containers, let’s call them workers. Say Service A consists of two workers A1 and A2, and Service B consists of three workers B1, B2, and B3 (maybe it requires more processing power per message than Service A, so it requires more workers for the same message load). So my use case requires that both Service A and Service B need to subscribe to the same topic so that they both receive updates as they come in, but I only want one worker to handle the message per service. Imagine that a message comes in and worker A1 handles it for Service A while B3 handles it for Service B.
Overall this feels like it should be pretty straightforward, I essentially have multiple applications, each of which needs to scale horizontally and should handle network traffic as if they were sitting behind a load balancer.
I am intending to deploy these applications with something like Amazon ECS, where each application is essentially a service with task replication and all services connect to a centralized Redis cache acting as a message broker. In a situation like this, from the limited research I’ve done, it would be nice to just put a network load balancer up in front of each service so that published messages would be directed to what looks like a single subscriber, but behind the scenes is a collection of workers acting like they’re pulling off a task queue.
I haven’t had much luck finding examples of this kind of architecture, or for that matter any examples of tasks that use something like Redis in the way I’m imagining. This is an architecture I’ve more or less dreamed up, so I could just be thinking about this all wrong, but at the same time it doesn’t seem like a crazy use case to me. I’m looking for any advice about how this could be accomplished and/or if what I’m talking about just sounds insane and there’s a better way.
One of the new features of MQTT 5 is the shared subscriptions feature, which allows client-side load balancing between multiple workers, so that multiple workers can be responsible for handling messages, but every message is only ever sent to a single server.
By default, this works with a round-robin approach, but I am in the need of a slightly more advanced scenario:
What I want is some kind of routing, so that one of the messages' properties gets used as some kind of routing key. I.e., I want multiple workers to be responsible for the messages, but all messages with value X in their routing key property should always go to the same worker, and all messages with Y should do as well. The workers for X and Y may be different, but all messages with X should always go to the same one.
Question 1: Is this even possible with MQTT 5? If so, what is the term I need to look for? I tried googling for this, but wasn't really successful (mainly, I guess, because I don't know exactly what to look for).
Now, supposed this is possible: How can I then handle cases where nodes join or leave? Then I still want only a single node to be responsible, so it would be great if the assignment was not statically, but could be adjusted dynamically (or even better, would adjust itself automatically). However, what I strictly need to avoid is that two messages with X ever go to different servers at the same time.
Question 2: Supposed, this is not possible – what alternatives do I have to MQTT 5?
You don't at a protocol level. That is the whole point of a shared subscription to distribute the incoming messages evenly across all the subscribers.
This also goes against the pub/sub paradigm, that messages are published to a topic not an individual subscriber.
If you want to route messages differently publish them to different topics. There is nothing to stop you republishing a message on a separate topic based on it's meta data once it's been received by a client if needed.
I have a GenServer on a remote node with both implementation and client functions in the module. Can I use the GenServer client functions remotely somehow?
Using GenServer.call({RemoteProcessName, :"app#remoteNode"}, :get) works a I expect it to, but is cumbersome.
If I want clean this up am I right in thinking that I'd have to write the client functions on the calling (client) node?
You can use the :rpc.call/{4,5} functions.
:rpc.call(:"app#remoteNode", MyModule, :some_func, [arg1, arg2])
For large number of calls, It's better to user gen_server:call/2-3.
If you want to use rpc:call/4-5, you should know that it is just one process named rex on each node for handling all requests. So if it is running one Mod:Func(Arg1, Arg2, Argn), It can not response to other request at this time !
TL;DR
Yes
Discussion
There are PIDs, messages, monitors and links. Nothing more, nothing less. That is your universe. (Unless you get into some rather esoteric aspects of the runtime implementation -- but at the abstraction level represented by EVM languages the previously stated elements (should) constitute your universe.)
Within an Erlang environment (whether local or distributed in a mesh) any PID can send a message addressed to any other PID (no middle-man required), as well as establish monitors and so on.
gen_server:cast sends a gen_server packaged message (so it will arrive in the form handle_cast/2 will be called on). gen_server:call/2 establishes a monitor and a timeout for receiving a labeled reply. Simply doing PID ! SomeMessage does essentially the same thing as gen_server:cast (sends a message) without any of the gen_server machinery behind it (messier to abstract as an interface).
That's all there is to it.
With this in mind, of course you can use gen_server:call/2 across nodes, as long as they are connected into a cluster/mesh via disterl. Two disconnected nodes would have to communicate a different way (network sockets) and wouldn't have any knowledge of each other's internal mapping of PIDs, but as long as disterl is being used they all translate PIDs amongst themselves quite readily. Named processes is where things get a little tricky, but that is the purpose of the global module and utilities such as gproc (though dependence on such facilities beyond a certain point is usually an indication of an architectural problem).
Of course, just because PIDs from any node can communicate with PIDs from another node doesn't always means they should. The physical topology of the network (bandwidth, latency, jitter) comes into play when you start sending high-frequency or large messages (lots of gen_server:calls), and you have always got to think of partition tolerance -- but for off-loading heavy sorts of work (rare) or physically partitioning sub-systems within a very large system (more common) directly sending messages is a very simple way to take a program coded for a single node and distribute it across a cluster.
(With all that in mind, it is somewhat rare to see the rpc module used.)
Does tibco support "multicast" ?
I guess another term used is "worker queues". (as seen in the rabbitmq link below)
See : http://www.rabbitmq.com/tutorials/tutorial-two-dotnet.html
I call them "fighters", as in, several processes can be wired to one queue, and when a message arrives in the queue, ONE of the several processes will get the message,but not all of them.
In EMS and most JMS based messaging system (supporting Queues and Topics), this is ALREADY the default behavior.
In would not call that "multicast" or "worker queues", but simply "load sharing" or "load balancing". Active-Mq calls it "Clustering" (I don't like the term, but the diagram is neat).
The official name for the pattern is "Competing consumers (EIP)".
Whatever you call it, it's super easy to do in EMS. By default, queue accept multiple clients for reading (you can change this and make them exclusive, see the user doc). When a queue is read by 2 or more consumers, and a message is sent to the queue, the message will go to one of ANY consumers. Hence your expected behavior.
Please refer to the same link for another chapter (14, page 411) on "Multicast" with EMS. This is different... it's ACTUAL NETWORK BASED Multicast, meant for helping lowering network traffic when a topic does publications to a massive amount of subscribers.
FYI, EMS is only one out of three messaging solution from TIBCO. The other two are Rendez-vous(older, UDP based) and FTL (newer, low latency solution).
I'm using Spring-AMQP for communication between some services. Inside the services I'm declaring consumers. One queue per consumer is used to dispatch the corresponding messages. A consumer-manager holds all known consumers. The actual message dispatching runs over a message-handler, which looks up for consumers subscribing the corresponding message-types. In almost any use-case a service subscribes a message only with one consumer.
For a special case I have to distinct not only the message-types but also the queue, from where the message is coming from.
Is is possible to retrieve somehow the queue name, from which the message was read?