MQTT cluster information

I am quite confused about MQTT clustering. It doesn't seem to be part of the MQTT protocol, and I was wondering whether each MQTT broker implementation has its own way of implementing it. Also, do you know what kind of information is shared between cluster nodes? It seems that a cluster retains pub/sub session information but not the messages themselves; is that correct? Thanks!

No, there is nothing in the MQTT protocol about clustering brokers. There is support for bridging topics between two brokers, but this works purely at the message level and carries no information about clients or sessions.
Any clustering is implemented independently by each broker, so what information is shared also depends on that implementation. It would, however, need to include the following:
Client session information, including subscriptions
Messages
Information about which messages have been delivered to which clients

Related

Any implications of using UUIDs in MQTT topic names?

I am doing a request/response flow using an MQTT broker, and I wondered whether brokers like VerneMQ or Mosquitto deal well with a huge number of topics. Basically, every time I want to do a request/response, I publish to a topic that looks like rpc/{UUID}, meaning every request creates a new topic, and then I unsubscribe from it when the response is received. Will this come back to bite me later?

Topics are effectively ephemeral already.
Usually the only overhead of a topic is in the list of subscribed topic patterns (which can contain wildcards) held for each client. The topic is read from each incoming message and checked against this list.
Using UUIDs in topics should not cause any problems.
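
To make the check described above concrete, here is a minimal sketch of matching an incoming topic against a client's list of subscription patterns. It is not taken from any broker's source code: the function name topic_matches and the example topics are made up for illustration, and the special handling of topics beginning with '$' is left out.

```python
import uuid


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name.

    '+' matches exactly one level; '#' matches all remaining levels
    (including none) and is only valid as the last level of the filter.
    Simplification: topics starting with '$' get no special treatment here.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only meaningful as the final level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter has more levels than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without a trailing '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# A per-request response topic in the rpc/{UUID} style from the question.
request_id = str(uuid.uuid4())
response_topic = f"rpc/{request_id}"

# Hypothetical subscription list held for one client.
subscriptions = ["rpc/+", "sensors/#"]
matched = [f for f in subscriptions if topic_matches(f, response_topic)]
```

The cost of the check depends on the number of subscription patterns, not on how many distinct topic names have ever been published, which is why one-off UUID topics are cheap.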

How to display delivered and read receipts in MQTT broker Mosquitto?

I want to display delivered and read receipts to users in my messaging platform. I am using Eclipse's Paho library with Mosquitto as the broker. Since Mosquitto does not store messages, which is the best way/plugin to
Display delivered receipts - how to use QoS2 acknowledgement receipts to do this?
Display read receipts - please suggest a way to do this
How to store messages so that users can view their chat history? Any architectural insights on MySQL would be very helpful.

The quick answers to your questions:
High QoS (1/2) is not end-to-end delivery confirmation; it is only confirmation between the broker and a client. For example, when a publisher publishes at QoS 2, the confirmation is only between the publisher and the broker, not onward to the subscriber (who may be subscribed at a different QoS anyway). The only way to do this is to send a separate message from the receiving end back to the sender. Also, there may be more than one subscriber to any given topic, so you have to think about how this would work.
Again, the only way to do this is with a separate message sent when the message is read.
You will have to implement this yourself. The only thing that may help is the built-in support for storing messages in a database that some brokers offer (this is not part of the spec, so it is entirely proprietary to the implementation), e.g. HiveMQ.

Hardlib's answers are 100% on target, but I'll add some thoughts on implementation.
I think a common misunderstanding with MQTT is that it is really an M2M (machine-to-machine) protocol, not a system for exchanging messages between users. That isn't to say you can't use it for messaging (Facebook did), but that exists in a layer on top of MQTT. Put another way, MQTT is designed to route messages between machines with little care about the content of those messages. What this means is that user-level niceties (delivery confirmations etc.) aren't really part of it but are instead something that you implement on top of MQTT.
So here are some thoughts about how to implement what you propose on top of MQTT:
Consider a situation in which you have two clients (X & Z) which both have access to the same broker (Y). To have client X confirm it has received a message from client Z, simply have client X send a message to a topic (let's say confirmations/z) that client Z is subscribed to. This is trivial to implement in Python or whatever you are writing your application in. (For example, I use that basic procedure to measure round-trip time on my broker.)
However, given that QoS can guarantee that a broker has received the message (and it could be retained or otherwise held for other clients), I would question if this is really necessary unless it is critical that client Z know exactly when client X receives the message.
Depending on your needs there are any number of ways of providing a history for a topic. See the answers here and here for details on MySQL. But if all you need is a local history of a chat or a record of the activity on a few topics, consider simply outputting payloads with timestamps to a text file or JSON. MySQL feels like overkill unless you are dealing with a high volume of messages or need to compose complex queries.
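
As a rough illustration of the confirmation flow described above, the sketch below has client X publish "delivered" and "read" receipts to confirmations/z. Everything in it is an assumption made for illustration: InMemoryBroker is a tiny stand-in so the example runs without a real broker or client library, and the JSON field names and the chat/x topic are invented.

```python
import json
import time
from typing import Callable, Dict, List

Handler = Callable[[str, str], None]


class InMemoryBroker:
    """Hypothetical stand-in for a real broker: exact-match topics, no QoS."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: str) -> None:
        # Deliver to every handler subscribed to exactly this topic.
        for handler in list(self._subscribers.get(topic, [])):
            handler(topic, payload)


def make_receipt(message_id: str, reader: str, status: str) -> str:
    """Build an application-level receipt; the field names are made up."""
    return json.dumps(
        {"id": message_id, "from": reader, "status": status, "ts": time.time()}
    )


broker = InMemoryBroker()
receipts_seen_by_z: List[dict] = []

# Client Z listens for confirmations addressed to it.
broker.subscribe(
    "confirmations/z",
    lambda topic, payload: receipts_seen_by_z.append(json.loads(payload)),
)


def on_chat_message_for_x(topic: str, payload: str) -> None:
    """Client X: acknowledge delivery by publishing back to the sender."""
    message = json.loads(payload)
    broker.publish(
        f"confirmations/{message['sender']}",
        make_receipt(message["id"], "x", "delivered"),
    )


broker.subscribe("chat/x", on_chat_message_for_x)

# Client Z sends a chat message to client X.
broker.publish("chat/x", json.dumps({"id": "m1", "sender": "z", "text": "hello"}))

# Later, when the user actually opens the message, X sends a "read" receipt.
broker.publish("confirmations/z", make_receipt("m1", "x", "read"))
```

With a real broker the two publish calls and the subscription would go through an MQTT client instead; the receipt topics and payloads would stay the same.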

Using MQTT with multiple subscribers

I'm using mosquitto (http://mosquitto.org/) as an MQTT broker and am looking for advice on load balancing subscribers (to the same topic). How is this achieved? Everything I've read about the protocol states that all subscribers to the same topic will be given a published message.
This seems inefficient, and so I'm looking for a way for a published message to be given to one of the connected subscribers in a round-robin approach that would ensure a load balanced state.
If this is not possible with MQTT, how does a subscriber avoid being overwhelmed with messages?

Typically you design MQTT applications in a way that you don't have overwhelmed subscribers. You can achieve this by spreading load to different topics.
If you really can't do that, take a look at the shared subscription approach that sophisticated MQTT brokers like MessageSight and HiveMQ offer. This is exactly the feature you're looking for, but it is broker dependent and is not part of the official MQTT spec.

MQTT v5 supports shared subscriptions, and Mosquitto version 1.6 added support for MQTT v5.
Check release notes
Good article on shared subscriptions here
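
For reference, an MQTT v5 shared subscription is requested with a topic filter of the form $share/{ShareName}/{filter}. The helpers below are only a sketch of building and parsing that string; the function names, the group name "workers" and the topic "jobs/#" are made up for illustration.

```python
from typing import Tuple

SHARE_PREFIX = "$share"


def shared_filter(group: str, topic_filter: str) -> str:
    """Build an MQTT v5 shared subscription filter for a share name (group)."""
    if not group or any(c in group for c in "/+#"):
        raise ValueError(
            "share name must be non-empty and must not contain '/', '+' or '#'"
        )
    if not topic_filter:
        raise ValueError("topic filter must not be empty")
    return f"{SHARE_PREFIX}/{group}/{topic_filter}"


def parse_shared_filter(subscription: str) -> Tuple[str, str]:
    """Split a shared subscription filter into (share name, topic filter)."""
    parts = subscription.split("/", 2)
    if len(parts) != 3 or parts[0] != SHARE_PREFIX or not parts[1] or not parts[2]:
        raise ValueError(f"not a shared subscription filter: {subscription!r}")
    return parts[1], parts[2]


# Every worker that subscribes with this same string joins the same group.
worker_subscription = shared_filter("workers", "jobs/#")
```

Each worker passes the resulting string as the topic filter in its ordinary subscribe call. The broker then hands each matching message to one member of the group; how it picks that member is up to the broker.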

MQTT is a Pub/Sub protocol; its basis is 1-to-many distribution of messages, not the 1-to-1 (of many) delivery you describe. What you describe is more like a Message Queuing system, which is distinctly different from Pub/Sub.
Mosquitto, as a pure implementation of this protocol, does not support delivery as you describe it. One solution is to use a local queue within the subscriber, to which incoming messages are added and from which they are then consumed by a thread pool.
I do believe that the IBM MessageSight appliance may offer the type of message delivery you are looking for, as an extension to the protocol called Shared Subscriptions, but with this enabled it deviates from the pure MQTT spec.
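
The local-queue idea mentioned above could look roughly like the following. This is a sketch under assumptions, not a drop-in solution: run_worker_pool and handle are hypothetical names, and the messages are fed in from a plain iterable where a real subscriber would put each payload on the queue from its on-message callback.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List


def run_worker_pool(
    messages: Iterable[Any], handle: Callable[[Any], Any], workers: int = 4
) -> List[Any]:
    """Push messages through a local queue consumed by a pool of threads."""
    inbox: "queue.Queue[Any]" = queue.Queue()
    results: List[Any] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work for this worker
                return
            result = handle(item)
            with lock:
                results.append(result)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(workers):
            pool.submit(worker)
        # In a real subscriber, the on-message callback would do inbox.put().
        for message in messages:
            inbox.put(message)
        # One sentinel per worker so every thread shuts down.
        for _ in range(workers):
            inbox.put(None)
    # Leaving the 'with' block waits for all workers to finish.
    return results


processed = run_worker_pool(range(10), lambda payload: payload * 2)
```

Results arrive in whatever order the workers finish, so anything that depends on message order needs extra care. This spreads the work inside one subscriber process; it does not balance load across separate subscribers.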

why and when i need mqtt broker for IOT/M2M application

Just asking one silly question, hope someone can answer this.
I'm a bit confused about MQTT brokers. Basically, the confusion is that there are so many things being used for data storage, transfer and processing (like Flume, HDInsight, Spark etc.). So when and why do I need to use an MQTT broker?
If I would like to use a Windows 10 IoT application with HiveMQ, where can I get the details? How do I use it? How do I benefit from this MQTT broker? Can I not send data from my IoT application directly using Azure or HDFS? So how does an MQTT broker fit in, and what does it help me achieve?
I'm new to all of this and tried to find some tutorials, but I'm not finding anything suitable. Could you please explain it in more detail or point me to some tutorials?

MQTT is a client-server protocol for pub-sub based transport that has comparatively small overhead and is thus well suited to mobile and IoT applications (unlike Flume, etc.). The MQTT broker is basically a server that handles messaging to, from and among MQTT clients. The functionality pretty much stops at the transport layer, even though various MQTT add-ons exist.
If you are looking to implement a solution that would reliably transfer data from your IoT devices to the back-end system for processing, I would suggest you take a look at the Kaa open-source IoT platform. It goes much further than MQTT by providing not only a transport layer suitable for low-power IoT devices, but also a solid chunk of the application-level logic (including object bindings for your application-level data structures, temporary data persistence, etc.).
Here is a link to a webinar that explains how to build a scalable IoT analytics system with Kaa and Spark in less than an hour.

This is an architectural choice. IoT applications are possible without MQTT but there are some advantages when using MQTT. If you are completely new to MQTT, take a look at this in-depth MQTT series: http://forkbomb-blog.de/2015/all-you-need-to-know-about-mqtt
Basically the main architectural advantage is publish / subscribe designed for low-latency, high throughput (mobile) communication with minimal protocol overhead (which is important if bandwidth is at a premium). You can completely decouple consumers and producers.
HDFS is the (distributed) Hadoop file system and is the foundation for Map / Reduce processing. It is not comparable to an MQTT broker. The MQTT broker could write to HDFS, though (in the case of HiveMQ, with a custom plugin).
Basically MQTT is a protocol while the products you are mentioning are, well, products which solve completely different problems:
Flume is basically used for log aggregation at scale. You won't use MQTT for that, at least there is not too much advantage because this is typically done in backend applications.
Spark and Hadoop shine at Big Data crunching. They are frameworks, not ready-to-use solutions, and are not really comparable to MQTT. MQTT brokers like HiveMQ are often used in conjunction with them: Spark / Hadoop for data processing and HiveMQ for communication.
I hope this helps you get started. The best approach would be to read about typical use cases for all these technologies; this is a bit too broad for a single SO answer.

MQTT is a data transport, so the usual thing I have to compare it with is HTTP. HTTP has two important characteristics, a) It goes from one point to another, b) It is request/response, so only one end can start a data transfer. MQTT connects many end points to many end points, and either end can start a data transfer. So, if you have just one device and only one service or person that will ever access it, and only by polling, then HTTP is great. MQTT means many devices can post data to many services or people, AND the other way around. Your question assumes that your data is always going to land up in some sort of data store, but many interactions are about events and responding to them immediately, like ringing a doorbell, or lowering the landing gear. In these cases you will often want to both record the data, and have an immediate action occur, like your phone making a doorbell noise.
Finally, you send data to MQTT semantically, rather than by IP address.
This means that your service subscribes to /mikeshouse/doorbell rather than polling 192.168.22.4, which is a huge gain once you have a number of devices.

Is it possible to distribute reads of an MQTT topic over multiple consumers?

With an MQTT broker, is it possible to set up multiple consumers for a topic such that for any given message on that topic only one consumer will receive the message?

The short answer is no, not with any broker that purely implements the MQTT spec.
I suppose it would be possible to write a broker that talked to the clients using MQTT and only delivered messages to a single subscriber. (It would have to deliver with QoS 2 to ensure that every message was consumed.)
By coincidence, I was talking to a colleague about something similar earlier in the week. He had found a way to do it using IBM* MQ Light and something called 'Shared Destinations'. (MQ Light uses AMQP, not MQTT.)
https://developer.ibm.com/messaging/mq-light/
Full disclosure: I work for IBM.
UPDATE:
I've since been informed that the IBM MessageSight v1.2 appliance can actually do shared destinations using MQTT (http://www-03.ibm.com/software/products/en/messagesight)
UPDATE 2:
Shared subscriptions are an optional part of the MQTT v5 spec, so it is worth checking whether any given v5 broker offers them.

Look at Shared Subscriptions: https://issues.oasis-open.org/browse/MQTT-234
Some MQTT servers support it:
EMQTT (open source):
https://github.com/emqtt/emqttd/issues/639#issuecomment-247851593
HiveMQ:
http://www.hivemq.com/blog/mqtt-client-load-balancing-with-shared-subscriptions/
IBM MessageSight:
http://www.ibm.com/support/knowledgecenter/SSCGGQ_1.2.0/com.ibm.ism.doc/Developing/devsharedsubscriptions.html
VerneMQ:
https://vernemq.com/docs/configuration/balancing.html

That is not possible. In MQTT all subscribers to a particular topic receive messages published to said topic. In order to direct a message to a particular subscriber, both publisher and subscriber would have to use a particular topic different to that used by other subscribers.

Independent of the broker that you are using, you can use Apache Camel to implement a route that copies all messages from Topic A to Topic B.
Or it can copy only specific messages that match a specific rule, such as user, content pattern or QoS.
Another solution is to use a multi-protocol broker such as ActiveMQ and copy specific message topics to a queue (each message on a queue is delivered to only one consumer), then consume the queue with another protocol such as JMS or STOMP.
