Determining how many subtopics are inside of an MQTT topic

I am using the Mosquitto MQTT broker for a project and I'm subscribing to a topic that has a variable number of subtopics. Is there any way of knowing how many subtopics a topic has without first receiving an update from every subtopic?

No.
Topics are ephemeral: they don't really exist except at the instant a message is published to one.
Subscribing clients supply a pattern (which can include the wildcards + and #, matching one or many topic levels respectively), and the broker matches that pattern against the topic in each published message to decide whether it should forward the message to the subscriber.
The only time the broker keeps track of a topic is if it is storing a message with the retained flag set, or queuing a message for an offline client with a persistent subscription.
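To make the matching step concrete, here is a minimal sketch of how a filter containing + and # could be compared against a published topic, and why a client can only count the subtopics it has already seen messages on. The names topic_matches and count_seen_subtopics are made up for illustration; this is not Mosquitto's implementation, and it ignores the special handling of topics that start with $.

```python
from typing import Iterable


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent and everything below it.
            return True
        if i >= len(topic_levels):
            # The filter is deeper than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # Neither a single-level wildcard nor an exact match.
            return False
    # No '#' was seen, so both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


def count_seen_subtopics(topic_filter: str, published_topics: Iterable[str]) -> int:
    """Count distinct topics that matched the filter among messages seen so far."""
    seen = set()
    for topic in published_topics:
        if topic_matches(topic_filter, topic):
            seen.add(topic)
    return len(seen)
```

The count only reflects topics on which a message has actually arrived, which is the point of the answer: the broker has no list of "existing" subtopics to hand out in advance.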

Related

Number of subscribers to an MQTT brokers to get all messages

I'm running Mosquitto as an MQTT broker and I have multiple devices sending sensor data periodically. I want to collect all the messages and store them. Is there any advantage to having multiple connections to the broker (each connection with a unique id and subscribed to a subset of the topics), or is it preferable to have a single connection gathering all the data?
Note: the subscribers will be on the same machine as the broker.
It probably depends on the messages in question and what processing the client is going to do on those messages.
Having a single client subscribed to '#' means there is only a single entry in the list of subscribed topics to search for matches when processing a message. But this is probably a negligible amount of overhead in most situations.
If the message rate is high enough and there is any overhead to the storage, then you can use something called Shared Subscriptions. These allow a pool of clients to all subscribe to the same topic (or wildcard) while ensuring that any message is delivered to only a single client in the pool. This means that the processing of messages can be load balanced over the pool of clients.
Using Shared Subscriptions means that you can dynamically add or remove clients from the pool without having to repartition the topic space across the clients.
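In MQTT v5 a client joins such a pool by subscribing with a filter of the form $share/{ShareName}/{filter}; every client that uses the same share name and filter is in the same group, which is why no repartitioning is needed. The sketch below only builds and parses that string. The helper names shared_filter and parse_shared_filter are hypothetical and not part of any client library.

```python
from typing import Optional, Tuple

SHARE_PREFIX = "$share/"


def shared_filter(share_name: str, topic_filter: str) -> str:
    """Build a shared-subscription filter string for the given group name."""
    if not share_name or any(ch in share_name for ch in "/+#"):
        raise ValueError("share name must be non-empty and contain no '/', '+' or '#'")
    if not topic_filter:
        raise ValueError("topic filter must be non-empty")
    return f"{SHARE_PREFIX}{share_name}/{topic_filter}"


def parse_shared_filter(subscription: str) -> Tuple[Optional[str], str]:
    """Return (share_name, topic_filter), or (None, subscription) if not shared."""
    if not subscription.startswith(SHARE_PREFIX):
        return None, subscription
    share_name, sep, topic_filter = subscription[len(SHARE_PREFIX):].partition("/")
    if not sep or not share_name or not topic_filter:
        raise ValueError("malformed shared subscription")
    return share_name, topic_filter
```

Every storage worker would subscribe with the same string, for example shared_filter("storage", "#"), so adding a worker is just one more client subscribing to that filter.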

Any implications of using UUIDs in MQTT topic names?

I am doing a request/response flow using an MQTT broker and I wondered if brokers like VerneMQ or Mosquitto deal well with a huge number of topics. Basically, every time I want to do a request/response, I publish to a topic that looks like rpc/{UUID}, meaning every request creates a new topic, and I then unsubscribe from it when the response is received. Will this come back to bite me later?
Topics are effectively ephemeral already.
Usually the only overhead to a topic is in the list of subscribed topic patterns (because they can be wildcards) held for each client. The topic is read from each incoming message and checked against this list.
Using UUIDs in topics should not cause any problems.
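As an illustration, a per-request topic can be generated with the standard uuid module, and the only state that accumulates is the set of subscriptions still waiting for a reply. This is a sketch under assumptions: new_rpc_topic and PendingRequests are hypothetical names, and the actual subscribe/unsubscribe calls to the broker are left out.

```python
import uuid


def new_rpc_topic(prefix: str = "rpc") -> str:
    """Return a fresh topic such as 'rpc/<uuid4>' for one request/response."""
    return f"{prefix}/{uuid.uuid4()}"


class PendingRequests:
    """Tracks response topics that are currently subscribed (client side)."""

    def __init__(self) -> None:
        self._topics = set()

    def open(self) -> str:
        # A real client would subscribe to this topic here.
        topic = new_rpc_topic()
        self._topics.add(topic)
        return topic

    def close(self, topic: str) -> None:
        # A real client would unsubscribe here once the response has arrived.
        self._topics.discard(topic)

    def __len__(self) -> int:
        return len(self._topics)
```

A UUID string contains only hexadecimal digits and hyphens, so it never collides with the wildcard characters + and # or the level separator /.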

Ordering MQTT messages across topics

I would like to read messages from an MQTT broker in order (chronologically, as written) from more than one topic. For example, I have these topics (which are published to independently by clients at different rates, all with QoS 2):
/foo/a
/foo/b
/foo/c
The messages are in a Persistent Session for a long period using Message Expiry Interval and the subscriber could come and go, on and offline, with any number of messages on each topic not yet read.
When I subscribe to /foo/#, will I receive messages from topic /foo/a interleaved with messages from /foo/b and /foo/c in the order they were received by the broker?
The specification on Message Ordering says:
... When a Server processes a message that has been published to an Ordered Topic, it MUST send PUBLISH packets to consumers (for the same Topic and QoS) in the order that they were received from any given Client [MQTT-4.6.0-5 ...
"(for the same Topic and QoS)" suggests ordering can only be guaranteed for the same Topic and QoS. So the answer to my question about ordering across topics seems to be that it is undefined?
From the Mosquitto broker point of view, if a client has disconnected but has a long session expiry time and has active subscriptions with QoS>0, that is no different to a client that is connected - the session remains open. That means the messages will be delivered according to the ordering requirements in the spec.
This part of the answer covers retained messages only:
My understanding is that message ordering rules only apply for active sessions. That is to say, a client publishes messages and they must be delivered to current consumers only in the same order they were received.
It does not, however, apply to the situation when a client subscribes to a topic filter and receives retained messages. You can get a clue to the intent of the spec there, because the concept of messages being out of order for the same topic and QoS is nonsensical when there is only a single retained message per topic.
Ordering of delivery of retained messages that match a wildcard subscription is undefined. In Mosquitto it is roughly in order of delivery, breadth then depth. This is likely to change in the future to being sorted, though.

Can MQTT (such as Mosquitto) be used so that a published topic is picked up by one, and only one, of the subscribers?

I have a system that relies on a message bus and broker to spread messages and tasks from producers to workers.
It benefits from being able to do true pub/sub-type communication for the messages.
However, it also needs to communicate tasks. These should be done by a worker and reported back to the broker when/if the worker is finished with the task.
Can MQTT be used to publish this task by a producer, so that it is picked up by a single worker?
In my mind the producer would publish the task with a topic "TASK_FOR_USER_A" and there are X amount of workers subscribed to that topic.
The MQTT broker would then determine that it is a task and send it selectively to one of the workers.
Can this be done or is it outside the scope of MQTT brokers such as Mosquitto?
MQTT v5 has an optional extension called Shared Subscriptions which will deliver messages to a group of subscribers in a round robin approach. So each message will only be delivered to one of the group.
Mosquitto v1.6.x has implemented MQTT v5 and the shared subscription capability.
It's not clear what you mean by 1 message at a time. Messages will be delivered as they arrive and the broker will not wait for one subscriber to finish working on a message before delivering the next message to the next subscriber in the group.
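The round-robin delivery described above can be modelled with a rotating queue of group members. This is a toy simulation, not broker code: the class SharedGroup and its methods are hypothetical, and it deliberately does not model acknowledgements, matching the point that the broker does not wait for one subscriber to finish before handing the next message to the next one.

```python
from collections import deque
from typing import Optional


class SharedGroup:
    """Toy model of a shared-subscription group with round-robin delivery."""

    def __init__(self) -> None:
        self._members = deque()

    def join(self, client_id: str) -> None:
        # A new worker subscribes to the same shared filter as the others.
        self._members.append(client_id)

    def leave(self, client_id: str) -> None:
        # A worker unsubscribes or disconnects.
        self._members.remove(client_id)

    def deliver(self, payload: str) -> Optional[str]:
        """Pick exactly one member for this message and return its id."""
        if not self._members:
            return None
        client_id = self._members[0]
        # Move the chosen member to the back so the next message goes elsewhere.
        self._members.rotate(-1)
        return client_id
```

Each call to deliver returns a single client id, so a task published once is handled by one worker, and workers can join or leave between messages without any change to the topic layout.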
If you have low-level enough control over the client, you can hold back the high-QoS responses so that the client does not acknowledge the message, and force the broker to allow only 1 message to be in flight at a time, which would effectively throttle message delivery. You should only do this if message processing is very quick, so that the broker does not decide delivery has failed and attempt to deliver the message to another client in the shared group.
Normally the broker will not do any routing above and beyond that based on the topic. As mentioned in a comment on this answer, Flespi has implemented "sticky sessions" so that messages from a specific publisher will be delivered to the same client in the shared subscription pool, but this is a custom add-on and not part of the spec.
What you're looking for is a message broker for a producer/consumer scenario. MQTT is a lightweight messaging protocol which is based on pub/sub model. If you start using any MQTT broker for this, you might face issues depending upon your use case. A few issues to list:
You need ordering of the messages (consumer must get the messages in the same order the producer published those). While QoS 2 guarantees message order without having shared subscriptions, having shared subscriptions doesn't provide ordered topic guarantees.
Consumer gets the message but fails before processing it and the MQTT broker has already acknowledged the message delivery. In this case, the consumer needs to specifically handle the reprocessing of failed messages.
If you go with a single topic with multiple subscribers, you must have idempotency in your consumer.
I would suggest going for a message broker suited to this purpose, such as Kafka or RabbitMQ, to name a few.
As far as I know, MQTT is not meant for this purpose. It doesn't have any internal mechanism to distribute tasks among workers (consumers). On the other hand, AMQP can be used here. One hack would be to make each worker accept only a particular type of task, but that requires producers to send the task type as well. In this case, you won't be able to scale as well.
It's better to explore other protocols for this type of use case.

MQTT catch-up missed messages, looking for feedback on design/assumptions

I would like some feedback on this problem and my proposed solution to catching up after missed MQTT messages please:
[Update 1] Simplified problem diagram and added solution diagram. Added mention of QoS
Scenario:
Client A publishes messages that we wish Client B to receive, even if connections are temporarily dropped then restored.
Config
Client A: connect with clean=false. Publish stateful messages with retain = true, non-stateful messages published with retain = false
Client B: connect with clean=false
What will happen
Each time Client A publishes to topic "foo", the previous retained message is replaced on the broker. For example, Client A publishes 111, 222, 333. Client B connects after the messages are published. Client B will receive only 333. Thus, messages 111 and 222 were missed because each message replaced the previous one on that same topic (different topics do not replace each other).
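The replacement behaviour can be modelled as a dictionary that keeps one payload per topic. This is only a sketch of the retained-message rule, with a hypothetical class name RetainedStore; it does not model session queues for clients that already hold a persistent subscription.

```python
from typing import Dict, Optional


class RetainedStore:
    """Toy model: the broker keeps at most one retained message per topic."""

    def __init__(self) -> None:
        self._retained: Dict[str, str] = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if not retain:
            # Non-retained messages are not stored for future subscribers.
            return
        if payload == "":
            # A retained publish with an empty payload clears the stored message.
            self._retained.pop(topic, None)
        else:
            # A newer retained message replaces the older one on the same topic.
            self._retained[topic] = payload

    def on_subscribe(self, topic: str) -> Optional[str]:
        """Return what a new subscriber to this exact topic would receive."""
        return self._retained.get(topic)
```

Publishing 111, 222 and 333 to "foo" with retain=True leaves only 333 for a subscriber that arrives afterwards, while a retained message on a different topic is unaffected.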
Proposed solution
I envision two types of messages. Stateful and non-stateful. Stateful messages would be things like, voltage, temperature, gps location, pressure. Non-stateful messages would be things like a chat message where history is more likely to be important for context. Missed stateful messages are more likely to be tolerable while non-stateful messages might not be tolerable.
All messages are published with QoS 1 in my case.
For the stateful messages I am thinking Client A will publish with retain = true.
For the non-stateful messages, I am thinking Client A will publish with retain = false (because what good is the last message if we don't have the full historical context of previous messages). When Client B connects/reconnects, I will publish a catch-up (arbitrary name) message containing all the ids of the messages it received, which when Client A receives it, will respond by publishing the whole history of messages minus those in the id list (ids maintained in Client A db). This might work for me if the total aggregate message history isn't too big.
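A rough sketch of that catch-up exchange is shown below, assuming the history lives in a dictionary keyed by message id and the catch-up message is JSON. The function names, the "received" field and the payload layout are all assumptions made for illustration, not an established protocol.

```python
import json
from typing import Dict, Iterable, List


def build_catch_up_request(received_ids: Iterable[int]) -> str:
    """Client B: encode the ids it already has as a catch-up message."""
    return json.dumps({"received": sorted(received_ids)})


def handle_catch_up(history: Dict[int, str], request: str) -> List[dict]:
    """Client A: return the messages Client B is missing, in id order."""
    received = set(json.loads(request)["received"])
    return [
        {"id": message_id, "payload": history[message_id]}
        for message_id in sorted(history)
        if message_id not in received
    ]
```

Client A would then publish each returned entry. Because the request carries every received id, its size grows with the history, which is the limitation already noted above.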
The alternative might be for Client B to send read receipts for each message received.
For me, these two solutions will require a database of messages and some custom logic.
This is a follow-up question to this one which I tried answering but was asked to instead form it as an independent, follow-up question.

Resources