How to target specific workers with shared subscriptions in MQTT 5? - mqtt

One of the new features of MQTT 5 is the shared subscriptions feature, which allows client-side load balancing between multiple workers, so that multiple workers can be responsible for handling messages, but every message is only ever sent to a single server.
By default, this works with a round-robin approach, but I am in the need of a slightly more advanced scenario:
What I want is some kind of routing, so that one of the messages' properties gets used as some kind of routing key. I.e., I want multiple workers to be responsible for the messages, but all messages with value X in their routing key property should always go to the same worker, and all messages with Y should do as well. The workers for X and Y may be different, but all messages with X should always go to the same one.
Question 1: Is this even possible with MQTT 5? If so, what is the term I need to look for? I tried googling for this, but wasn't really successful (mainly, I guess, because I don't know exactly what to look for).
Now, supposed this is possible: How can I then handle cases where nodes join or leave? Then I still want only a single node to be responsible, so it would be great if the assignment was not statically, but could be adjusted dynamically (or even better, would adjust itself automatically). However, what I strictly need to avoid is that two messages with X ever go to different servers at the same time.
Question 2: Supposed, this is not possible – what alternatives do I have to MQTT 5?

You don't at a protocol level. That is the whole point of a shared subscription to distribute the incoming messages evenly across all the subscribers.
This also goes against the pub/sub paradigm, that messages are published to a topic not an individual subscriber.
If you want to route messages differently publish them to different topics. There is nothing to stop you republishing a message on a separate topic based on it's meta data once it's been received by a client if needed.

Related

How to tell MQTT to keep messages even if there is no subscriber right now?

In MQTT, if you publish to a topic where there is no subscriber for, the message gets dropped.
While this is fine for classic pub/sub messaging, it is not so great for shared subscriptions (which have been introduced in MQTT 5), since this pattern is typically used for some kind of job queue, and you usually don't want to drop jobs just because there is no worker there right now (maybe it just crashed and is restarting).
Is it possible to tell MQTT servers not to drop messages, at least for shared subscriptions, even if there are no subscribers right now? If so, how?
PS: This is not just a persistent session, since I do not want to keep the subscriptions per client. It's more like a "persistent session" that spans multiple clients.
I don't know if any of the brokers supporting MQTT v5 shared subscriptions support this, but I can foresee a way it could work in a way that is line with the spec and spirit of pub/sub messaging.
A MQTT broker will queue messages for topics subscribed to at QOS 1 or 2 for a client that is currently offline, with a persistent session. So I see no reason why shared subscriptions should be any different. I can see it might be a little bit more technically complicated to implement but should be possible (You would need to treat the shared group as a single session).
That said I think the main focus for shared subscriptions is load balancing, followed by HA. So unless you are running all your shared subscribers on the same machine it should be unlikely that they all fail at the same time.

How to display delivered and read receipts in MQTT broker Mosquitto?

I want to display delivered and read receipts to users in my messaging platform. I am using Eclipse's Paho library with Mosquitto as the broker. Since Mosquitto does not store messages, which is the best way/plugin to
Display delivered receipts - how to use QoS2 acknowledgement receipts to do this?
Display read receipts - suggest me way to do this
How to store messages so that users can view their chat history? Any architectural insights in mysql will be very helpful.
The quick answers to your questions:
High QOS (1/2) is not end to end delivery confirmation, it is only confirmation between the broker and a client. e.g. a publisher publishing at QOS 2 the confirmation is only between the publisher and the broker, not then onward to the subscriber (who may be subscribed at a different QOS anyway). The only way to do this is to send a separate message from the receiving end back to the sender. Also there may be more than one subscriber to any given topic, so you have to think how this would work.
Again, the only way to do this is with a separate message sent when the message is read
You will have to implement this yourself. The only thing that may help is something like the built in support for storing messages in a database present in some brokers (this is not part of the spec, so totally propitiatory to the implementation) e.g. hivemq
Hardlib's answers are 100% on target, but I'll add some thoughts on implementation.
I think a common misunderstanding with MQTT is that it is really a M2M (machine-to-machine) protocol, not a system for exchanging messages between users. That isn't to say you can't use it for messaging (facebook did) but that exists in a layer on top of MQTT. Put another way, MQTT is designed to route messages between machines with little care about the content of those messages. What this means is that user-level niceties (delivery confirmations etc.) aren't really part of it but instead something that you implement on top of MQTT.
So here are some thoughts about how to implement what you propose on top of MQTT:
Consider a situation in which you have two clients (X & Z) which both have access to the same broker (Y). To have client X confirm it has received a message from client Z, simply have client X send a message to a topic (lets say confirmations/z) that client Z is subscribed to. This is trivial to implement in Python or whatever you are writing your application in. (For example, I use that basic procedure to measure round-trip time on my broker.)
However, given that QoS can guarantee that a broker has received the message (and it could be retained or otherwise held for other clients), I would question if this is really necessary unless it is critical that client Z know exactly when client X receives the message.
Depending on your needs there are any number of ways of providing a history for a topic. See the answers here and here for details on MySQL. But if all you need is a local history of a chat or a record of the activity on a few topics, consider simply outputting payloads with timestamps to a text file or JSON. MySQL feels like overkill unless you are dealing with a high volume of messages or need to compose complex queries.

Distributing an Erlang Chat system

I just finished Erlang in Practice screencasts (code here), and have some questions about distribution.
Here's the is overall architecture:
Here is how to the supervision tree looks like:
Reading Distributed Applications leads me to believe that one of the primary motivations is for failover/takeover.
However, is it possible, for example, the Message Router supervisor and its workers to be on one node, and the rest of the system to be on another, without much changes to the code?
Or should there be 3 different OTP applications?
Also, how can this system be made to scale horizontally? For example if I realize now that my system can handle 100 users, and that I've identified the Message Router as the main bottleneck, how can I 'just add another node' where now it can handle 200 users?
I've developed Erlang apps only during my studies, but generally we had many small processes doing only one thing and sending messages to other processes. And the beauty of Erlang is that it doesn't matter if you send a message within the same Erlang VM or withing the same Computer, same LAN or over the Internet, the call and the pointer to the other process looks always the same for the developer.
So you really want to have one application for every small part of the system.
That being said, it doesn't make it any simpler to construct an application which can scale out. A rule of thumb says that if you want an application to work on a factor of 10-times more nodes, you need to rewrite, since otherwise the messaging overhead would be too large. And obviously when you start from 1 to 2 you also need to consider it.
So if you found a bottleneck, the application which is particularly slow when handling too many clients, you want to run it a second time and than you need to have some additional load-balancing implemented, already before you start the second application.
Let's assume the supervisor checks the message content for inappropriate content and therefore is slow. In this case the node, everyone is talking to would be simple router application which would forward the messages to different instances of the supervisor application, in a round robin manner. In case those 1 or 2 instances are not enough, you could have the router written in a way, that you can manipulate the number of instances by sending controlling messages.
However for this, to work automatically, you would need to have another process monitoring the servers and discovering that they are overloaded or under utilized.
I know that dynamically adding and removing resources always sounds great when you hear about it, but as you can see it is a lot of work and you need to have some messaging system built which allows it, as well as a monitoring system which can monitor the need.
Hope this gives you some idea of how it could be done, unfortunately it's been over a year since I wrote my last Erlang application, and I didn't want to provide code which would be possibly wrong.

Passing messages between remote MailboxProcessors?

I'm using MailboxProcessor classes in order to keep separate agents that do their own thing. Normally agents can communicate with one another in the same process, but I want agents to talk to one another when they are on separate processes or even different machines. What kind of mechanism is best for implementing communication between them? Is there some standard solution?
Please note that I'm using Ubuntu instances to run the agents.
I think you're going to have write your own routines to serialize messages, pass them accross the process boundaries and then dispatch them on the other side. This will also require a implementation of a ID system where each mailbox has an ID and processes can send messages to IDs instead of just Mailbox.Send. This is not easy, as local boxes will be able to access local memory, but remote mailboxes will not.
I would look at something like RPyC (http://rpyc.wikidot.com/) as it provides a protocol somewhat like you are looking for.
Basically the answer is 'no' there isn't really a good way to do this.

Is this a good reason to use a service bus, alternatives please

I'm in the planning phase of our new site - it's an extension of some mobile apps we've built. We want to provide our users with a central point for communication and also provide features for users who don't want to/can't use the mobile apps. One of the features we're looking at adding is a reputation system similar in nature to the SO badge system. We're designing the system to use SOA.
I don't want to have to code all of this logic into the main app as discreet chunks. I'm thinking of creating a means to accomplish this which will allow us to define new thresholds and rules for gaining reputation and have them injected into some service. The two ways I've thought of doing this so far are:
To look for certain traits in a users actions and respond, this would mean having a service running that can run through the 'plugged in' award definitions and check for thresholds that have been met and respond appropriately.
To fire events when the user performs actions - listen out for those events and respond appropriately. Because the services which will be carrying out these actions are running in separate app domains potentially on separate servers the only way I can see having a central message bus to listen and respond to these events is by using something like MassTransit, nServiceBus or Rhino.Esb.
I know that using a service bus can very easily be inappropriately designed into an application that simply doesn't need it and most times - unless you're integrating disparate, heterogenous systems - you most likely won't need one when designing a new system but I'm a bit lost for options as to the best way to do this. I don't like the idea of having a service hammer the Db all the time in the background. But it does sound like it might be a lot simpler early on - later on - I dread to think!
Has anyone here designed a system like this? How did you accomplish this? We're designing for high throughput as we expect there will be times when the system will need to be able to cope with bursts of users.
I've designed a system that had similar requirements. To achieve this the key elements were:
Plugins
Event messaging - using Emesary
The basic concept is that the core is not aware of exactly which module will perform any given task.
The messages are defined and at points within the system they are dispatched. The sender is not aware if the message is required. This effectively decouples vast chunks of the system.
So to perform a job some code is plugged in, that registers with the event messaging bus and will receive messages. When it receives a message that it needs to process it will process it.
The Emesary code is extremely small and efficient in the first instance I've called it (Emesary and you're free to use it; or from Emesary CodePlex
As the system becomes more complex it is possible that there are lots of events flying about, if you get more than 20k a second it was always in my design to add filtering and routing (implemented by the recipient interface being extended to allow a recipient to specify messages it wants to receive during registration). I've never needed to add this filtering because Emesary is sufficiently efficient that it is the processing of the messages that takes the time.
I've build a version of Emesary which bridges two Notifiers across disparate systems using WCF, Corba and TCP/IP. I investigated using RabbitMQ and decided it was possible to use this underneath Emesary if needed.
Base Class Diagram
Scalable server.
This is a fairly complex example however it shows where Emesary fits in. In this diagram anything with a drop shadow can have multiple instances and this is managed outside of what I'm trying to explain here.

Resources