Message queues: is there a use case for multiple consumers to consume the message? - twitter

Let's say that on Twitter, a celebrity updates her status, and it's pushed to all of her followers.
Suppose we set up a queue where the publisher is the service that fetches the celebrity's status and the consumers are the individual followers. But with a million followers, whoever receives that message first will see the update, while the others will not. What is a common pattern to use here so that every one of her followers sees the update and they don't 'compete' with each other to consume the message first?

I guess you are thinking of a queueing system as only being able to do point-to-point communication, i.e., producer to queue to consumer. This is only partly correct. Most queueing systems support at least two patterns:
Producer-consumer: a message is delivered to just one consumer, i.e., if there are multiple consumers, they compete against each other to take the message from the "queue".
Publish-subscribe: publishers push messages to a "topic" and consumers subscribe to the topic to get messages. The consumers are not competing; each consumer gets all the messages published once it has subscribed.
In your example, it's the publish-subscribe pattern in action. The underlying implementation may be different, but the basic pattern is that of publish-subscribe.
Refer: https://stackoverflow.com/a/42477769/6886283
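As a minimal in-process sketch of the difference (plain Python using the standard library queue module; the Topic class is made up purely for illustration and is not the API of any particular broker):

import queue

# Point-to-point: one queue, many competing consumers.
# Each message is taken by exactly one consumer.
work_queue = queue.Queue()
work_queue.put("celebrity status update")
# Whichever follower calls get() first wins; the others see nothing.
print(work_queue.get())

# Publish-subscribe: the "topic" keeps a queue per subscriber and
# copies every published message into each of them.
class Topic:
    def __init__(self):
        self.subscriptions = []

    def subscribe(self):
        q = queue.Queue()
        self.subscriptions.append(q)
        return q

    def publish(self, message):
        for q in self.subscriptions:
            q.put(message)  # every subscriber gets its own copy

topic = Topic()
follower_queues = [topic.subscribe() for _ in range(3)]
topic.publish("celebrity status update")
for q in follower_queues:
    print(q.get())  # each follower sees the update

A real broker does the same thing with durable topics and per-subscriber queues or cursors instead of in-memory objects.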

Related

NServicebus and multiple message types on same SQS Queue

I am fairly new to NServiceBus and have run into a problem that I think may have to do with my architecture.
I have one SQS queue with three SNS topics. So let's say:
Queue1
MessageType1
MessageType2
MessageType3
I have created three NServicebus subscribers that will all run as three separate services. Each Subscriber is monitoring Queue1, and each one has a handler for a different message type. This is a rough sketch of how I am envisioning this to work:
                                       +---------- MessageType3 ----------+
                                       |                                  |
                                       |  +------- MessageType2 ------+   |
                                       v  v                           |   |
[Outside Publisher] --MessageType1--> [Queue1] --MessageType1--> [Subscriber1]
                                        |   |
                                        |   +---MessageType2---> [Subscriber2]
                                        |
                                        +-------MessageType3---> [Subscriber3]
An outside service publishes MessageType1 to Queue1. Subscriber1 picks up the message, does some processing, and publishes MessageType2 and MessageType3 back to Queue1. Then Subscribers 2 & 3 pick up their respective messages and do their thing.
But what is actually happening is that it is random which subscriber (1, 2, or 3) picks up the initial MessageType1. So Sub2 picks it up and errors out because it doesn't have a handler for it.
Why is this happening? I thought NServiceBus would only pick up messages it has a handler for. Does NServiceBus only want one message type per queue? Do I need to make a separate queue for each message type?
I am hoping there is some way to configure the three subscriber services to only pick up the intended message, but I realize that maybe my understanding of NServiceBus is lacking and I need to rethink my design.
Yep, you've got some misconceptions going on here; let's see if I can help clear them up.
What you call "subscribers" are not subscribers. A queue defines a logical endpoint, and multiple processes monitoring that single queue are endpoint instances, not subscribers. They cooperate to scale out the processing of a single queue.
A single queue can process multiple message types if you so choose, but when any endpoint instance asks for a message from the queue, it can't control which type it will get. The queue is just a line, it's going to get the message that's next, whatever that message is.
So all the endpoint instances have to have ALL of the message handlers for messages that could go through that queue, or you will get that error.
The only reasons to have multiple endpoint instances are scalability (process more messages at a time) and availability (process messages on ServerA even while ServerB is getting rebooted).
Actual subscribers are different. Subscribers come into play where (in SQS/SNS parlance) a single SNS topic delivers copies of a message to multiple queues. You publish OrderPlaced and one copy goes to the Sales queue (just so we can store that a sale was made), a copy goes to the Billing queue (so that the credit card can be charged), and yet another copy goes to the Warehouse queue (so that they can start the process of getting it ready to put in a box).
The power of subscribers is that maybe 6 months down the line you create another subscriber called CustomerCare that subscribes to OrderPlaced in order to store a running total of how much that customer has bought over the past year (see Death to the batch job). The important bit is that you DON'T have to go back to the original code where the order was placed; you just add another subscriber with its own queue.
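In raw SQS/SNS terms, that fan-out looks roughly like the sketch below (boto3, with made-up topic and queue names; NServiceBus's SQS transport manages this kind of wiring for you when endpoints subscribe to the event):

import json
import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

# One topic for the event, one queue per logical subscriber (names are illustrative).
topic_arn = sns.create_topic(Name="OrderPlaced")["TopicArn"]
for name in ("Sales", "Billing", "Warehouse"):
    queue_url = sqs.create_queue(QueueName=name)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]
    # Each queue gets its own subscription, so each gets its own copy of the event.
    # (In practice each queue also needs an access policy allowing the topic to send to it.)
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# Publishing once delivers a copy of OrderPlaced to all three queues.
sns.publish(TopicArn=topic_arn, Message=json.dumps({"orderId": 42}))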
You might want to check out the NServiceBus step-by-step tutorial which goes over this in a lot of detail.

MQXR load balancing and request/response

Is there any way MQXR can be considered scalable?
If the subscribers don't know who the publishers are, how is it possible to ensure that load-balanced publishers can publish to load-balanced subscribers?
If my terminology is all wrong I'm happy to be corrected.
For any given message there is only ever a single publisher, so I'm not sure how you would get "load-balanced publishers".
Load balancing messages across subscribers (to handle more messages than a single subscriber could process, or to provide redundancy if a processing subscriber fails) can be done with Shared Subscriptions, where the broker ensures that each message is sent to only a single subscriber within a group.
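As a rough consumer-side sketch using the Python paho-mqtt client, assuming a broker that supports shared subscriptions via the $share topic-filter prefix (MQTT v5 or a broker-specific extension); the broker address, group name, and topic are made up:

import paho.mqtt.client as mqtt

GROUP = "workers"            # hypothetical shared-subscription group
TOPIC = "telemetry/events"   # hypothetical topic

def on_message(client, userdata, msg):
    print(f"handled by this instance: {msg.topic} {msg.payload!r}")

# paho-mqtt 1.x constructor; on 2.x pass mqtt.CallbackAPIVersion.VERSION1 as the first argument.
client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.com", 1883)

# Every instance of this process subscribes with the same group name;
# the broker delivers each publication to only ONE member of the group.
client.subscribe(f"$share/{GROUP}/{TOPIC}", qos=1)
client.loop_forever()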

Make permissions for SQS FIFO queue based on Message Group ID?

I'm building a system which sends commands to external systems (all identical but with different locations and IDs).
The commands are sent to a FIFO SQS queue and the external systems read and delete from that queue.
Currently the plan is to create one queue for each external system, so I'd just have a Lambda that updates the list of queues when the DB table of systems is changed.
But I can see that SQS FIFO queues support message group IDs, so I wonder if I should just have one single queue where each system reads only from its own message group ID.
I like the simplicity of this solution. However, I cannot see a way to limit access for reading and deleting messages to a specific message group, which means that if one external system is compromised, its credentials could be used to hijack the shared queue for all external systems and therefore take down everything.
Is there any workaround that would let me set permissions for a specific queue and message group ID?
I am also concerned that there is no option to purge only one group of messages rather than the entire queue.
You can't read "from" a specific message group in a FIFO Queue, and there are no related permissions.
Message groups are opaque labels that tell the FIFO queue whether any two messages must be delivered to consumers in strict FIFO order relative to each other. If two messages share the same message group, they must be strictly ordered, but two messages with different message-group-ids do not need to be strictly ordered.
This capability allows faster overall processing of messages when there are parallel identical consumers, because without this feature, only one consumer could be handling messages at any one time, and the overall throughput of the queue would be limited to how quickly a single consumer could handle a message (since no messages would be delivered to another consumer as long as a single message was in flight).
Message Group ID
The tag that specifies that a message belongs to a specific message group. Messages that belong to the same message group are always processed one by one, in a strict order relative to the message group (however, messages that belong to different message groups might be processed out of order).
[...]
If you require a single group of ordered messages, provide the same message group ID for messages sent to the FIFO queue.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
Note also that while the ReceiveMessage API allows you to ask that the message-group-id be returned with each message, it has no provision for specifying which message-group-id you want to receive messages from, because that isn't the purpose of this feature.
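For example, with boto3 (the queue URL is a placeholder), you can ask SQS to include the group id with each received message, but there is no parameter for requesting messages from a particular group:

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/commands.fifo"  # placeholder

# There is no MessageGroupId filter on ReceiveMessage; you can only ask
# SQS to tell you which group the messages it hands you belong to.
resp = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    AttributeNames=["MessageGroupId"],
)
for msg in resp.get("Messages", []):
    group = msg["Attributes"]["MessageGroupId"]
    print(f"received a message for group {group}: {msg['Body']}")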

Controlled concurrency with amazon SQS

I have multiple publishers publishing events for a shipment entity on an SQS queue, and I have multiple listeners on it for parallel processing. But I want events for a particular shipment (having some identifier) to be processed sequentially, in order. Is there any built-in feature to support this?
ActiveMQ has a similar concept, the Exclusive Consumer, which is not exactly what I need but could be adapted.
Yes, there is; they are called FIFO (First-In-First-Out) queues.
FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated.
You will need to ensure that the messages you want processed in the correct order belong to the same Message Group ID:
The tag that specifies that a message belongs to a specific message group. Messages that belong to the same message group are always processed one by one, in a strict order relative to the message group (however, messages that belong to different message groups might be processed out of order).
Hope that helps!
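A minimal sketch with boto3 (the queue URL and event fields are placeholders): use the shipment identifier as the MessageGroupId so that events for one shipment are delivered in order while different shipments can still be processed in parallel:

import json
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/shipments.fifo"  # placeholder

def publish_shipment_event(shipment_id, event):
    # All events sharing a MessageGroupId (the shipment id) are delivered
    # in order; events for other shipments can be consumed in parallel.
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(event),
        MessageGroupId=shipment_id,
        MessageDeduplicationId=f"{shipment_id}-{event['sequence']}",
    )

publish_shipment_event("SHIP-123", {"type": "picked_up", "sequence": 1})
publish_shipment_event("SHIP-123", {"type": "delivered", "sequence": 2})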

How to guarantee that Amazon SQS will receive a message only once?

I'm using an Amazon SQS queue to send notifications to an external system.
If the HTTP request fails when using SQS' SendMessage, I don't know whether the message has been queued or not. My default policy would be to retry posting the message to the queue, but there's a risk of posting the message twice, which might not be acceptable depending on the use case.
Is there a way to have SQS refuse the message if there is a duplicate of the message body (or of some kind of message metadata, such as a unique ID we could provide), so that we could retry until the message is accepted and be confident that there won't be a duplicate if the first request had already been queued but the response had been lost?
No, there's no such mechanism in SQS. Going further, it is also possible that a message will be delivered twice or more (at-least-once delivery semantics). So even if such a mechanism existed, you wouldn't be able to guarantee that the message isn't delivered multiple times.
See: http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/DistributedQueues.html
For exactly-once deliveries, you need some form of transactions (and HTTP isn't a transactional protocol) both on the sending and receiving end.
AFAIK, right now SQS does support what was asked!
Please see the "What's new" post entitled Amazon SQS Introduces FIFO Queues with Exactly-Once Processing and Lower Prices for Standard Queues
According to the SQS FAQ:
FIFO queues provide exactly-once processing, which means that each message is delivered once and remains available until a consumer processes it and deletes it. Duplicates are not introduced into the queue.
There's also an AWS Blog post with a bit more insight on the subject:
These queues are designed to guarantee that messages are processed exactly once, in the order that they are sent, and without duplicates.
[...]
Exactly-once processing applies to both single-consumer and multiple-consumer scenarios. If you use FIFO queues in a multiple-consumer environment, you can configure your queue to make messages visible to other consumers only after the current message has been deleted or the visibility timeout expires. In this scenario, at most one consumer will actively process messages; the other consumers will be waiting until the first consumer finishes or fails.
Duplicate messages can sometimes occur when a networking issue outside of SQS prevents the message sender from learning the status of an action and causes the sender to retry the call. FIFO queues use multiple strategies to detect and eliminate duplicate messages. In addition to content-based deduplication, you can include a MessageDeduplicationId when you call SendMessage for a FIFO queue. The ID can be up to 128 characters long, and, if present, takes higher precedence than content-based deduplication.
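For the original question (retrying SendMessage without risking a duplicate), a sketch of what that looks like with boto3; the queue name, message body, and deduplication id are made up. A retried SendMessage with the same MessageDeduplicationId within the 5-minute deduplication interval is not enqueued a second time:

import boto3
from botocore.exceptions import ClientError

sqs = boto3.client("sqs")

# FIFO queue names must end in ".fifo".
queue_url = sqs.create_queue(
    QueueName="notifications.fifo",
    Attributes={"FifoQueue": "true"},
)["QueueUrl"]

def send_exactly_once(body, dedup_id):
    # If the HTTP call fails and we never see the response, retrying with
    # the same deduplication id cannot result in two copies on the queue.
    for attempt in range(3):
        try:
            return sqs.send_message(
                QueueUrl=queue_url,
                MessageBody=body,
                MessageGroupId="notifications",
                MessageDeduplicationId=dedup_id,
            )
        except ClientError:
            continue
    raise RuntimeError("could not reach SQS")

send_exactly_once('{"event": "user_signed_up", "userId": 42}', dedup_id="signup-42")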
