The packet identifier is required for certain MQTT control packets (http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/csprd02/mqtt-v3.1.1-csprd02.html#_Toc385349761). It's defined by the standard as a 16-bit integer and is generated by each client. Identifiers are reusable by the client after the acknowledgement packet is received, so the standard allows up to 64k in-flight messages. In practice, the clients I've looked at seem to just increment a counter, and so allow a total of 64k messages to be sent by a client. Both of the Rust MQTT client libraries panic when that counter overflows. (UPDATED 2016-09-07: if the Rust clients are compiled in release mode then they don't panic; the Packet Identifier wraps around to 0 -- in normal circumstances this will work, but...)
Does anyone know of an MQTT client that allows more than 64k messages/client (i.e. re-uses packet identifiers)? I'm wondering if this is a limitation that I need to be aware of in general, or if it's just a few clients. I've taken a quick look at compliance tests and haven't yet seen much to indicate that this is checked -- I'll keep looking.
Edit: It could be that some clients achieve this as a side-effect of limiting the number of in-flight messages. UPDATE 2016-09-07: the Rust clients do it by assuming the counter can wrap on overflow and will never catch up to lagging (still-unacknowledged) messages (maybe a good bet, but not assured, and with an ugly outcome if it happens).
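For reference, a minimal Rust sketch (my own illustration, not code from either client library) of why the debug and release builds behave differently here:

```rust
// Minimal illustration of incrementing a u16 packet identifier past 65535.
fn main() {
    let last: u16 = u16::MAX;

    // `last + 1` panics in a debug build and silently wraps to 0 in a release
    // build, which matches the behaviour described above.
    // let next = last + 1;

    // Spelling the intent out removes the dependency on the build profile:
    let wrapped = last.wrapping_add(1); // always 0
    let checked = last.checked_add(1);  // always None
    println!("wrapping_add: {wrapped}, checked_add: {checked:?}");
}
```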
As you have pointed out, the packet identifiers are intended as temporary values that must persist until the published packet is received and acknowledged.
Once acknowledged, you can reuse the identifier (or not).
Most clients run on embedded systems and don't track more than a single packet (so only a single identifier is in use at a time), since they wait for the PUBACK or PUBREC/PUBCOMP exchange before publishing anything else.
So for these clients, even a single identifier would be enough.
Please note that for QoS 1, remembering the identifier is pointless: the sender is allowed to resend the packet if the next packet it receives is not the acknowledgement, and the receiver always has the identifier it needs for the reply right there in the PUBLISH it is acknowledging.
The rare clients that do support interleaved publish packets only need to track two valid identifiers at any time (that is, the case where they have received a QoS 2 packet, answered with PUBREC, and then receive another QoS 1 or 2 packet).
Once they receive a PUBREL packet they can reply with a PUBCOMP without having to look the identifier up (it is carried in the PUBREL header), so the only span where they really must remember an identifier is between the PUBLISH and the corresponding PUBREL. Provided they allow interleaved publish packets, the only case where a second identifier is required is when they are publishing while receiving a published packet at the same time.
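To make that bookkeeping concrete, here is a minimal sketch (my own illustration, not from any particular client) of the receiver-side QoS 2 state: the identifier only has to be remembered between the PUBLISH and the matching PUBREL.

```rust
use std::collections::HashSet;

/// Receiver-side QoS 2 bookkeeping: the only identifiers worth remembering are
/// those for which we have sent PUBREC but not yet seen the PUBREL.
struct Qos2Receiver {
    awaiting_pubrel: HashSet<u16>,
}

impl Qos2Receiver {
    fn new() -> Self {
        Self { awaiting_pubrel: HashSet::new() }
    }

    /// A QoS 2 PUBLISH arrived: record its identifier (and send PUBREC either way).
    /// Returns true if the message is new; false means a retransmission that
    /// must not be delivered to the application again.
    fn on_publish(&mut self, packet_id: u16) -> bool {
        self.awaiting_pubrel.insert(packet_id)
    }

    /// PUBREL arrived: the identifier can be forgotten immediately; the
    /// PUBCOMP reply simply echoes the identifier carried by the PUBREL.
    fn on_pubrel(&mut self, packet_id: u16) {
        self.awaiting_pubrel.remove(&packet_id);
    }
}

fn main() {
    let mut rx = Qos2Receiver::new();
    assert!(rx.on_publish(42));  // first delivery: forward to the application
    assert!(!rx.on_publish(42)); // retransmission: already seen, don't deliver again
    rx.on_pubrel(42);            // identifier released, can be reused by the sender
}
```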
Now, from the point of view of the broker, most implementations use a 16-bit counter per client, so they could in theory support up to 65535 in-transit packets.
In reality, since the minimum size of a publish packet is 8 bytes (usually more), that means storing at least 9 bytes per potential packet (the extra byte remembers the current state in the QoS 2 case), i.e. 65535 × 9 bytes ≈ 576 KB, roughly half a MB of memory in the minimal case, and likely much more in real life, since you never have an empty publish payload and topic name.
As you can see, such a storage requirement is almost impossible to meet on an embedded system, so shortcuts are taken.
In most scenarios, either the server does not allow that many unacknowledged packets (by simply replying to the client promptly so the identifier is released) or it shares the identifier pool between different clients.
So typically, again, the worst case for the broker can only happen if the client does not acknowledge the published packets. If the broker does not get any answer from the client it can:
close the connection
refuse to send new published information or
ignore the answer and republish
All of these strategies need to be implemented anyway, since you can hit the same issue with a slow client, a fast publisher, and all 65535 identifiers in use.
Since you need these strategies anyway, there is no point wasting a megabyte of memory per client; instead you cut things off much earlier (while keeping reasonable working conditions).
In the end, packet identifiers are a tool for identifying recent packets, not for indexing every packet ever received. A counter is good enough for this, and wrapping around should not pose any issue once you account for the memory and bandwidth requirements.
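On the sending side, the wrap-around is harmless as long as the counter never hands out an identifier that is still in flight. A minimal sketch of such an allocator (my own illustration, not taken from any existing client):

```rust
use std::collections::HashSet;

/// Sender-side packet identifier allocator: a wrapping 16-bit counter that
/// skips identifiers still awaiting acknowledgement. MQTT forbids identifier
/// 0, so the counter wraps from 65535 back to 1.
struct PacketIdAllocator {
    next: u16,
    in_flight: HashSet<u16>,
}

impl PacketIdAllocator {
    fn new() -> Self {
        Self { next: 1, in_flight: HashSet::new() }
    }

    /// Returns a free identifier, or None if all 65535 are in flight
    /// (at that point the sender has to wait for acknowledgements).
    fn acquire(&mut self) -> Option<u16> {
        for _ in 0..=u16::MAX {
            let candidate = self.next;
            self.next = if self.next == u16::MAX { 1 } else { self.next + 1 };
            if candidate != 0 && self.in_flight.insert(candidate) {
                return Some(candidate);
            }
        }
        None
    }

    /// Call when the matching PUBACK / PUBCOMP arrives.
    fn release(&mut self, packet_id: u16) {
        self.in_flight.remove(&packet_id);
    }
}

fn main() {
    let mut ids = PacketIdAllocator::new();
    let a = ids.acquire().unwrap(); // 1
    let b = ids.acquire().unwrap(); // 2
    ids.release(a);                 // 1 may be reused once acknowledged
    println!("allocated {a} and {b}");
}
```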
Related
In the context of the MQTT protocol, is there a way to make a client not send publish messages when there are no subscribers to that topic?
In other words, is there a standard way to perform subscriber-aware publishing, reducing network traffic from publishing clients to the broker?
This is important in applications where we have many sensors capable of producing huge amounts of data, but most of the time nobody is interested in more than a small subset of it, and we want to save battery or avoid network congestion.
In the upcoming MQTT v5 specification the broker can indicate to a client that there are no subscribers for a topic when the client publishes to it (the "No matching subscribers" reason code in the PUBACK or PUBREC). This is only possible for QoS 1 or QoS 2 publishes, because a QoS 0 message does not result in a reply.
No, the publisher has absolutely no idea how many subscribers to a given topic there are, there could be zero or thousands.
This is a key point of pub/sub messaging: the near-total decoupling of the information producer and consumer.
Presumably you can design your devices and applications so that, as well as publishing data to a 'data topic', each device also subscribes to a device-specific 'command topic' that controls its data publishing. If an application is interested in data from a specific device, it already knows which data topic to subscribe to, so it can also publish a 'please publish data now' command to the corresponding command topic.
I suppose there might be a somewhere-in-between solution where devices publish data less frequently when no apps are interested, and faster when at least one app is asking for data to be published.
Seems to me that one thing about MQTT is that you should ideally design the devices and applications as a system, not in isolation.
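To make that concrete, here is a rough sketch of the device side of such a command-topic pattern, written against the rumqttc crate; the crate choice, topic names, command handling, and exact API details are my assumptions rather than anything stated in the answer above.

```rust
use rumqttc::{Client, Event, MqttOptions, Packet, QoS};

fn main() {
    // Hypothetical broker address and client id.
    let opts = MqttOptions::new("sensor-42", "broker.example.org", 1883);
    let (mut client, mut connection) = Client::new(opts, 10);

    // The device stays subscribed to its own command topic...
    client
        .subscribe("devices/sensor-42/cmd", QoS::AtLeastOnce)
        .unwrap();

    for event in connection.iter() {
        if let Ok(Event::Incoming(Packet::Publish(cmd))) = event {
            // ...and only publishes a reading when an application asks for one.
            // (A real device would inspect cmd.payload for the actual command.)
            println!("command received on {}", cmd.topic);
            client
                .publish("devices/sensor-42/data", QoS::AtLeastOnce, false, read_sensor())
                .unwrap();
        }
    }
}

// Stand-in for the real sensor read.
fn read_sensor() -> Vec<u8> {
    b"23.5C".to_vec()
}
```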
How many times per minute does the MQTT client poll the server? Does it generate much traffic? I know each packet can be small, but how often does the client ping the broker to keep itself "online" from the broker's point of view?
If I was not clear, please comment on this question and I'll try to explain my doubt better.
My broker is Mosquitto and the clients are small devices (sensors, etc.).
Assuming no data flow (which is of course application dependent), the client will periodically send a PINGREQ message to the broker. This is a 2 byte message and the broker replies with a PINGRESP, also 2 bytes.
The rate at which PINGREQ is sent depends on the keepalive parameter set when you connect. This tells the broker the interval at which it should expect at least one message from the client. In the absence of any other message, the client sends a PINGREQ.
60 seconds is often used as a default value (whether or not this is appropriate for you depends on how quickly you want the client/broker to respond to a hung connection). In the absence of any other messages flowing, maintaining the keepalive guarantee would mean 4 bytes in total transferred every minute. This is only the application-level data of course; the amount of data on the wire will be bigger.
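As a rough worked example of that idle traffic (my own arithmetic, application-level bytes only):

```rust
// Idle MQTT keepalive traffic: one 2-byte PINGREQ from the client and one
// 2-byte PINGRESP from the broker per keepalive interval (application-level
// bytes only; TCP/IP and any TLS framing add more on the wire).
fn idle_bytes_per_day(keepalive_secs: u32) -> u32 {
    let pings_per_day = 86_400 / keepalive_secs;
    pings_per_day * (2 + 2)
}

fn main() {
    // With the common 60 second default: 1440 pings/day, 5760 bytes/day.
    println!("{} bytes/day", idle_bytes_per_day(60));
}
```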
How much memory is used when we use the # wildcard to subscribe to many topics? For example, if we have over 10M topics, is it possible to use # to subscribe to all of them, or would that cause a memory leak?
This problem is strictly related to the MQTT broker and client implementation.
Of course, the MQTT specification doesn't say anything about such implementation details.
Paolo.
Extending ppatierno's answer:
For most well-designed brokers, the number or scope (for wildcards) of subscriptions shouldn't really change the amount of memory used under normal circumstances. At most, the storage should equate to the topic string the client subscribes to, which is matched against each incoming message to see whether it should be delivered.
Where this may not hold true is with persistent subscriptions (where the clean session value is not set to true). In this case if a client disconnects then messages may be queued until it reconnects. The amount of memory consumed here will be a function of the number of messages and their size (plus what discard policy the broker may have) and not directly a function of the number of subscribed topics.
To answer the second part of your question, subscribing to 10,000,000 topics using the wildcard is not likely to cause a memory leak, but it may very well flood the client depending on how often messages are published on those topics.
I am working on an IOCP server on Windows, and I have to send a buffer to all connected sockets.
The buffer is small -- up to 10 bytes. When I get the notification for each WSASend from GetQueuedCompletionStatus, is there a guarantee that the buffer was sent in one piece by a single WSASend? Or should I add code that checks whether all 10 bytes were sent and posts another WSASend if necessary?
There is no guarantee but it's highly unlikely that a send that is less than a single operating system page size would partially fail.
Failures are more likely if you're sending a buffer that is more than a single operating system page size in length, and if you're not actively managing how many outstanding operations you have and how many your system can support before running out of "non-paged pool" or hitting the "I/O page lock limit".
It's only possible to recover from a partial failure if you never have any other sends pending on that connection.
I tend to check that the value is as expected in the completion handler and abort the connection with an RST if it's not. I've never had this code execute in production and I've been building lots of different kinds of IOCP based client and server systems for well over 10 years now.
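For illustration, a minimal sketch of that completion-handler check; the two byte counts are assumed to be supplied by the surrounding IOCP code (the length passed to WSASend and the count reported by GetQueuedCompletionStatus), so this is just the decision logic, not a full IOCP example.

```rust
/// Outcome of checking a completed send against the length that was requested.
#[derive(Debug, PartialEq)]
enum SendOutcome {
    /// The whole buffer went out; nothing to do.
    Complete,
    /// Partial send: with other sends possibly pending on the connection there
    /// is no safe way to resume, so abort the connection (RST) instead.
    AbortConnection,
}

fn check_send_completion(bytes_transferred: u32, bytes_requested: u32) -> SendOutcome {
    if bytes_transferred == bytes_requested {
        SendOutcome::Complete
    } else {
        SendOutcome::AbortConnection
    }
}

fn main() {
    assert_eq!(check_send_completion(10, 10), SendOutcome::Complete);
    assert_eq!(check_send_completion(7, 10), SendOutcome::AbortConnection);
}
```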
I am looking at developing my first multiplayer RTS game and I'm naturally going to be using UDP sockets for receiving/sending data.
One thing that I've been trying to figure out is how to protect these ports from being flooded by fake packets in a DoS attack. Normally a firewall would protect against flood attacks but I will need to allow packets on the ports that I'm using and will have to rely on my own software to reject bogus packets. What will stop people from sniffing my packets, observing any authentication or special structure I'm using and spamming me with similar packets? Source addresses can easily be changed to make detecting and banning offenders nearly impossible. Are there any widely accepted methods for protecting against these kind of attacks?
I know all about the differences between UDP and TCP and so please don't turn this into a lecture about that.
===================== EDIT =========================
I should add that I'm also trying to work out how to protect against someone 'hacking' the game and cheating by sending packets that appear to be coming from my game. Sequencing/sync numbers or IDs could easily be faked. I could use encryption, but I am worried about how much this would slow my server's responses, and it wouldn't provide protection from DoS anyway.
I know these are basic problems every programmer using a UDP socket must encounter, but for the life of me I cannot find any relevant documentation on methods for working around them!
Any direction would be appreciated!
The techniques you need would not be specific to UDP: you are looking for general message authentication to handle spoofing, rate throttling to handle DoS, and server-side state heuristics ("does this packet make sense?") to handle client hacks.
For handling DoS efficiently, you need layers of detection. First, drop packets from invalid source addresses without even looking at the contents. Put a session ID at the start of each packet and drop anything whose ID isn't assigned or doesn't match the expected source address. Next, keep track of arrival rates per session and start dropping packets from addresses that are sending too fast. These techniques will block everything except someone who is able to sniff legitimate packets in real time.
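As a rough sketch of those first two layers over a plain UDP socket (the packet layout, rate limits, and session-table handling are my own assumptions, not part of the answer):

```rust
use std::collections::HashMap;
use std::net::{SocketAddr, UdpSocket};
use std::time::{Duration, Instant};

// Assumed packet layout: the first 4 bytes are the session id handed out at login.
const MAX_PACKETS_PER_WINDOW: u32 = 200;
const WINDOW: Duration = Duration::from_secs(1);

struct SessionState {
    addr: SocketAddr,
    window_start: Instant,
    packets_in_window: u32,
}

fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:9000")?;
    // Populated when a client completes the (reliable) login handshake.
    let mut sessions: HashMap<u32, SessionState> = HashMap::new();
    let mut buf = [0u8; 1500];

    loop {
        let (len, src) = socket.recv_from(&mut buf)?;

        // Layer 1: drop anything without a known session id from the right address.
        if len < 4 {
            continue;
        }
        let session_id = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
        let Some(session) = sessions.get_mut(&session_id) else { continue };
        if session.addr != src {
            continue; // spoofed or stale source address
        }

        // Layer 2: per-session rate limit; drop the excess without processing it.
        let now = Instant::now();
        if now.duration_since(session.window_start) > WINDOW {
            session.window_start = now;
            session.packets_in_window = 0;
        }
        session.packets_in_window += 1;
        if session.packets_in_window > MAX_PACKETS_PER_WINDOW {
            continue;
        }

        // Only now is the payload worth parsing and validating server-side.
        handle_game_packet(session_id, &buf[4..len]);
    }
}

fn handle_game_packet(_session_id: u32, _payload: &[u8]) {
    // Game-specific validation ("does this packet make sense?") goes here.
}
```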
But a DoS attack based on real-time sniffing would be very rare and the rate of attack would be limited to the speed of a single source network. The only way to block packet sniffing is to use encryption and checksums, which is going to be a lot of work. Since this is your "first multiplayer RTS", I suggest doing everything short of encryption.
If you do decide to use encryption, AES-128 is relatively fast and very secure. Brian Gladman's reference Rijndael implementation is a good starting point if you really want to optimize, or there are plenty of AES libraries out there. Checksumming the clear-text data can be done with a simple CRC-16. But that's probably overkill for your likely attack vectors.
Most important of all: Never trust the client! Always keep track of everything server-side. If a packet arrives that seems bogus (like a unit moving Y units per second when it should only be able to move X units per second) then simply drop the packet.
Also, if the number of packets per second grows too big, start dropping packets as well.
And don't use the UDP packets for "unimportant" things... In-game chat and similar things can go through normal TCP streams.