Number of subscribers to an MQTT broker to get all messages - mqtt

I'm running Mosquitto as an MQTT broker and I have multiple devices sending sensor data periodically. I want to collect all the messages and store them. My question is: is there any advantage to having multiple connections to the broker (each with a unique client ID and subscribed to a subset of the topics), or is it preferable to have a single connection gathering all the data?
Note: the subscribers will be on the same machine as the broker.

It probably depends on the messages in question and what processing the client is going to do on those messages.
Having a single client subscribed to '#' means the broker has only a single entry in its list of subscriptions to check for matches when processing a message. But the difference is probably a negligible amount of overhead in most situations.
If the message rate is high enough and storing each message carries any overhead, you can use a feature called Shared Subscriptions. It allows a pool of clients to all subscribe to the same topic (or wildcard) while ensuring that each message is delivered to only one client in the pool. This means that the processing of messages can be load balanced over the pool of clients.
Using Shared Subscriptions means that you can dynamically add or remove clients from the pool without having to repartition the topic space across the clients.
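As a rough illustration of the idea above, here is a minimal Python sketch. The `$share/<group>/<filter>` form is the MQTT v5 shared subscription filter syntax; everything else (the `SharedGroup` class, the worker names, strict round-robin dispatch) is a hypothetical toy model of what a broker might do, not how Mosquitto is implemented.

```python
def shared_filter(group: str, topic_filter: str) -> str:
    """Build an MQTT v5 shared-subscription filter: $share/<group>/<filter>."""
    # The share name must be non-empty and must not contain '/', '+' or '#'.
    if not group or any(ch in group for ch in "/+#"):
        raise ValueError("invalid share name")
    return f"$share/{group}/{topic_filter}"


class SharedGroup:
    """Toy model: each message goes to exactly one member of the pool.

    Round-robin is an assumption made for this sketch; a real broker
    is free to pick the receiving client differently.
    """

    def __init__(self, members):
        self._members = list(members)
        self._next = 0

    def add(self, member):
        # A new worker joins without any change to the topic layout.
        self._members.append(member)

    def deliver(self, message):
        # Return the single member that receives this message.
        member = self._members[self._next % len(self._members)]
        self._next += 1
        return member


# Every worker in the pool would subscribe to the same filter.
pool_filter = shared_filter("storage", "sensors/#")
group = SharedGroup(["worker-1", "worker-2"])
first_four = [group.deliver(n) for n in range(4)]
```

With two workers, `first_four` alternates between them; after calling `group.add("worker-3")`, later messages are spread over three workers with no change to the subscriptions.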

Related

Is MQTT efficient for real-time communication between a server and thousands of clients?

I have a backend server and thousands of clients.
Each client has its own topic in MQTT.
Communication is bi-directional: clients can ask the backend for something, and the backend can respond or send notifications in real time.
How should I scale my backend subscribers to process a huge number of messages from MQTT?
Because MQTT implements the pub/sub pattern, when I scale the subscribers by adding one more instance to process more messages at the same time, the new instance subscribes to the same topic and receives the same messages as the other subscribers.
Pub/sub scaling issue: Subscriber1 and Subscriber2 both get Message1, then Subscriber1 and Subscriber2 both get Message2.
This is the opposite of AMQP, where I have consumers of a queue instead of pub/sub.
Consumer1 gets Message1 and Consumer2 gets Message2, so scaling is efficient.
So is MQTT a good choice for a backend server that needs real-time communication with a huge number of clients? How should I deal with this?
I believe that shared subscriptions, a feature of MQTT v5, address your concern:
Like a Non‑shared Subscription, it has a Topic Filter and Subscription Options; however, a publication that matches its Topic Filter is only sent to one of its subscribing Sessions. Shared Subscriptions are useful where several consuming Clients share the processing of the publications in parallel.

Any implications of using UUIDs in MQTT topic names?

I am doing a request/response flow using an MQTT broker, and I wondered whether brokers like VerneMQ or Mosquitto deal well with a huge number of topics. Basically, every time I want to do a request/response, I publish to a topic that looks like rpc/{UUID}, meaning every request creates a new topic, and I unsubscribe from it when the response is received. Will this come back to bite me later?
Topics are effectively ephemeral already.
Usually the only overhead of a topic is in the list of subscribed topic patterns (which can contain wildcards) held for each client. The topic is read from an incoming message and checked against this list.
Using UUIDs in topics should not cause any problems.
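To make the answer above concrete, here is a minimal Python sketch of a per-request topic and of checking an incoming topic against each client's subscription list. The function names, the `rpc` prefix default and the client IDs are hypothetical, and the matcher is simplified (it follows the standard `+` and `#` wildcard rules but ignores shared subscriptions). It is not how any particular broker stores subscriptions.

```python
import uuid


def new_rpc_topic(prefix: str = "rpc") -> str:
    """Create a one-off topic such as rpc/<uuid> for a single request."""
    return f"{prefix}/{uuid.uuid4()}"


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check a topic against a subscription filter with + and # wildcards."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # Wildcards at the first level do not match topics starting with '$'.
    if topic.startswith("$") and f_levels[0] in ("+", "#"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything below
        if i >= len(t_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != t_levels[i]:
            return False  # literal level differs
    return len(f_levels) == len(t_levels)


def subscribers_for(subscriptions: dict, topic: str) -> list:
    """Return the client IDs with at least one filter matching the topic."""
    return [
        client_id
        for client_id, filters in subscriptions.items()
        if any(topic_matches(f, topic) for f in filters)
    ]


reply_topic = new_rpc_topic()
subscriptions = {
    "requester": [reply_topic],  # exact subscription for this one request
    "logger": ["rpc/#"],         # wildcard subscription
    "other": ["sensors/#"],
}
receivers = subscribers_for(subscriptions, reply_topic)
```

Nothing is created for the topic itself in this model: once the requester unsubscribes, its filter leaves the list and no trace of `rpc/<uuid>` remains.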

Ordering MQTT messages across topics

I would like to read messages from an MQTT broker in order (chronologically, as written) from more than one topic. For example, I have these topics (which are published to independently by clients at different rates, all with QoS 2):
/foo/a
/foo/b
/foo/c
The messages are in a Persistent Session for a long period using Message Expiry Interval and the subscriber could come and go, on and offline, with any number of messages on each topic not yet read.
When I subscribe to /foo/#, will I receive messages from topic /foo/a interleaved with messages from /foo/b and /foo/c in the order they were received by the broker?
The specification on Message Ordering says:
... When a Server processes a message that has been published to an Ordered Topic, it MUST send PUBLISH packets to consumers (for the same Topic and QoS) in the order that they were received from any given Client [MQTT-4.6.0-5 ...
"(for the same Topic and QoS)" suggests ordering can only be guaranteed for the same Topic and QoS. So the answer to my question about ordering across topics seems to be that it is undefined?
From the Mosquitto broker point of view, if a client has disconnected but has a long session expiry time and has active subscriptions with QoS>0, that is no different to a client that is connected - the session remains open. That means the messages will be delivered according to the ordering requirements in the spec.
This part of the answer covers retained messages only:
My understanding is that message ordering rules only apply for active sessions. That is to say, a client publishes messages and they must be delivered to current consumers only in the same order they were received.
It does not, however, apply to the situation when a client subscribes to a topic filter and receives retained messages. You can get a clue to the intent of the spec there, because the concept of messages being out of order for the same topic and QoS is nonsensical when there is only a single retained message per topic.
Ordering of delivery of retained messages that match a wildcard subscription is undefined. In Mosquitto it is roughly in order of delivery, breadth then depth. This is likely to change in the future to being sorted though.
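The scope of the guarantee discussed above can be illustrated with a small Python check. This is only a sketch with made-up payloads and a hypothetical helper name: it treats a delivery sequence as acceptable when each topic's messages arrive in publish order, however the topics are interleaved with each other.

```python
from collections import defaultdict


def per_topic_order_preserved(published: dict, received: list) -> bool:
    """Check that each topic's payloads arrive in the order they were published.

    published: topic -> payloads in publish order (one publisher, one QoS)
    received:  (topic, payload) pairs in the order the subscriber got them
    """
    seen = defaultdict(list)
    for topic, payload in received:
        seen[topic].append(payload)
    return all(seen[topic] == list(payloads) for topic, payloads in published.items())


published = {"/foo/a": [1, 2], "/foo/b": [10, 20]}

# Two different interleavings across topics; both keep per-topic order.
interleaved = [("/foo/a", 1), ("/foo/b", 10), ("/foo/a", 2), ("/foo/b", 20)]
grouped = [("/foo/b", 10), ("/foo/b", 20), ("/foo/a", 1), ("/foo/a", 2)]

# Reordering within a single topic is what the rule forbids.
swapped = [("/foo/a", 2), ("/foo/a", 1), ("/foo/b", 10), ("/foo/b", 20)]
```

Both `interleaved` and `grouped` pass the check while `swapped` does not, which is why a subscriber to /foo/# cannot rely on any particular ordering between /foo/a and /foo/b.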

Can MQTT (such as Mosquitto) be used so that a published topic is picked up by one, and only one, of the subscribers?

I have a system that relies on a message bus and broker to spread messages and tasks from producers to workers.
It benefits from being able to do true pub/sub-type communication for the messages.
However, it also needs to communicate tasks. These should be done by a worker and reported back to the broker when/if the worker is finished with the task.
Can MQTT be used to publish this task by a producer, so that it is picked up by a single worker?
In my mind the producer would publish the task with a topic "TASK_FOR_USER_A" and there are X amount of workers subscribed to that topic.
The MQTT broker would then determine that it is a task and send it selectively to one of the workers.
Can this be done or is it outside the scope of MQTT brokers such as Mosquitto?
MQTT v5 has an optional extension called Shared Subscriptions which will deliver messages to a group of subscribers in a round robin approach. So each message will only be delivered to one of the group.
Mosquitto v1.6.x has implemented MQTT v5 and the shared subscription capability.
It's not clear what you mean by 1 message at a time. Messages will be delivered as they arrive and the broker will not wait for one subscriber to finish working on a message before delivering the next message to the next subscriber in the group.
If you have low-level enough control over the client, you can hold back the acknowledgement of a high-QoS message, which forces the broker to allow only 1 message in flight at a time and effectively throttles message delivery. You should only do this if message processing is very quick; otherwise the broker may decide that delivery has failed and attempt to deliver the message to another client in the shared group.
Normally the broker will not do any routing above and beyond that based on the topic. As mentioned in a comment on this answer, Flespi has implemented "sticky sessions" so that messages from a specific publisher will be delivered to the same client in the shared subscription pool, but this is a custom add-on and not part of the spec.
What you're looking for is a message broker for a producer/consumer scenario. MQTT is a lightweight messaging protocol based on the pub/sub model. If you use an MQTT broker for this, you might face issues depending on your use case. A few issues to consider:
- You need ordering of the messages (the consumer must get the messages in the same order the producer published them). While QoS 2 guarantees message order without shared subscriptions, shared subscriptions don't provide ordered topic guarantees.
- The consumer gets the message but fails before processing it, and the MQTT broker has already acknowledged the message delivery. In this case, the consumer needs to specifically handle the reprocessing of failed messages.
- If you go with a single topic with multiple subscribers, you must have idempotency in your consumer.
I would suggest using a message broker suited to this purpose, such as Kafka or RabbitMQ, to name a few.
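The idempotency requirement in the list above could look roughly like the following Python sketch. It assumes each task carries a unique ID chosen by the producer (the `task_id` parameter, the `IdempotentWorker` class and the in-memory set are all hypothetical); a real worker would need a persistent store shared across restarts.

```python
class IdempotentWorker:
    """Toy consumer that runs each task at most once per task ID."""

    def __init__(self, handler):
        self._handler = handler
        self._done = set()  # IDs of tasks that finished successfully

    def handle(self, task_id: str, payload) -> bool:
        """Process a task; return False if it was already completed."""
        if task_id in self._done:
            return False  # duplicate delivery, safe to ignore
        self._handler(payload)
        # Record the ID only after the handler succeeded, so a task
        # that raised can still be retried on redelivery.
        self._done.add(task_id)
        return True


processed = []
worker = IdempotentWorker(processed.append)
first = worker.handle("task-1", "resize image")
duplicate = worker.handle("task-1", "resize image")
```

Here the second delivery of `task-1` is ignored, so `processed` contains the payload only once.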
As far as I know, MQTT is not meant for this purpose. It doesn't have any internal mechanism to distribute tasks among workers (consumers). On the other hand, AMQP can be used here. One hack would be to make each worker accept only a particular type of task, but that requires producers to send the task type as well. In that case, you won't be able to scale as well.
It's better to explore other protocols for this type of use case.

What is the maximum number of connections for Mosquitto?

I'm programming a chat app with the MQTT protocol.
You can read about MQTT at mqtt.org.
The broker I use is Mosquitto.
What I'm wondering is:
How many users can connect to a Mosquitto broker?
I saw 100k on some website, but I am not sure.
This is dependent on a number of factors:
- What OS you're running on
- The size of the machine(s) you run the broker on
- What broker you choose to use
- How much and what type of load you are generating:
  - How many clients are subscribed to each topic
  - How big the messages are
  - How many retained messages you are generating
  - What the message rate is
  - Whether you are queuing messages for offline clients
You also need to configure the broker/OS to get the most out of it, e.g. for Mosquitto you need to set the number of open file handles on Linux to the maximum.
For a large scale app you may want to look at one of the brokers that supports federation/clustering to spread the load and allow fail over.
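The open file handle point above matters because each client connection uses one file descriptor in the broker process. As a minimal sketch, assuming a Unix-like system, the limits that apply to the current process can be inspected from Python as below. The `reserve` value is an arbitrary allowance for files the process needs besides client sockets, and `can_hold` is a hypothetical helper.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(connections: int, reserve: int = 64) -> bool:
    """Rough check: do this many sockets fit under the soft limit?"""
    soft, _hard = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return True  # no limit configured
    return connections + reserve <= soft


soft_limit, hard_limit = open_file_limits()
enough_for_100k = can_hold(100_000)
```

This only reports the limits of the process running the script. The broker's own limit depends on how it is started (for example the service manager's settings), so it has to be checked and raised there.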

Resources