How to guarantee that Amazon SQS will receive a message only once? - amazon-sqs

I'm using an Amazon SQS queue to send notifications to an external system.
If the HTTP request fails when calling SQS's SendMessage, I don't know whether the message has been queued or not. My default policy would be to retry posting the message to the queue, but there's a risk of posting the message twice, which might not be acceptable depending on the use case.
Is there a way to have SQS refuse the message if there is a duplicate on the message body (or some kind of message metadata, such as a unique ID we could provide) so that we could retry until the message is accepted, and be confident that there won't be a duplicate if the first request had been already queued, but the response had been lost?

No, there's no such mechanism in SQS. Going further, it is also possible that a message will be delivered twice or more (at-least-once delivery semantics). So even if such a mechanism existed, you wouldn't be able to guarantee that the message isn't delivered multiple times.
See: http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/DistributedQueues.html
For exactly-once deliveries, you need some form of transactions (and HTTP isn't a transactional protocol) both on the sending and receiving end.

AFAIK, right now SQS does support what was asked!
Please see the "What's new" post entitled "Amazon SQS Introduces FIFO Queues with Exactly-Once Processing and Lower Prices for Standard Queues".
According to the SQS FAQ:
FIFO queues provide exactly-once processing, which means that each message is delivered once and remains available until a consumer processes it and deletes it. Duplicates are not introduced into the queue.
There's also an AWS Blog post with a bit more insight on the subject:
These queues are designed to guarantee that messages are processed exactly once, in the order that they are sent, and without duplicates.
......
Exactly-once processing applies to both single-consumer and multiple-consumer scenarios. If you use FIFO queues in a multiple-consumer environment, you can configure your queue to make messages visible to other consumers only after the current message has been deleted or the visibility timeout expires. In this scenario, at most one consumer will actively process messages; the other consumers will be waiting until the first consumer finishes or fails.
Duplicate messages can sometimes occur when a networking issue outside of SQS prevents the message sender from learning the status of an action and causes the sender to retry the call. FIFO queues use multiple strategies to detect and eliminate duplicate messages. In addition to content-based deduplication, you can include a MessageDeduplicationId when you call SendMessage for a FIFO queue. The ID can be up to 128 characters long, and, if present, takes higher precedence than content-based deduplication.
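The retry scenario from the question can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes `sqs` is a boto3-style SQS client (for example `boto3.client("sqs")`) and that the queue is a FIFO queue. The function name `send_with_retry`, the retry count and the catch-all exception handling are illustrative assumptions. The key point is that the deduplication ID is chosen once and reused on every retry.

```python
import uuid


def send_with_retry(sqs, queue_url, body, group_id, dedup_id=None, attempts=3):
    """Send one message to a FIFO queue, retrying with the SAME deduplication ID.

    `sqs` is assumed to be a boto3-style SQS client; nothing here is
    specific to the original poster's setup.
    """
    # Pick the ID once, outside the loop, so every retry reuses it and
    # SQS can recognise a retry of an already-queued message.
    dedup_id = dedup_id or str(uuid.uuid4())
    last_error = None
    for _ in range(attempts):
        try:
            return sqs.send_message(
                QueueUrl=queue_url,
                MessageBody=body,
                MessageGroupId=group_id,
                MessageDeduplicationId=dedup_id,
            )
        except Exception as exc:  # illustrative; narrow this in real code
            last_error = exc
    raise last_error
```

Note that SQS only deduplicates within a 5-minute interval, so a retry sent later than that with the same ID would be accepted as a new message.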

Related

Is there a way to receive most messages out of the standard SQS Queue? [NOT FIFO]

I tried using parallel requests, but due to retention by AWS, it does not allow to poll back the same queue unless previously polled messages are deleted.
However, I managed to do this with a FIFO queue, just not with a standard queue.
Thanks in Advance!
:)
When you say "it does not allow to poll back the same queue unless previously polled messages are deleted", I assume you're talking about the inflight messages per queue limit, which is pretty high at 120,000:
For most standard queues (depending on queue traffic and message backlog), there can be a maximum of approximately 120,000 inflight messages (received from a queue by a consumer, but not yet deleted from the queue). If you reach this limit, Amazon SQS returns the OverLimit error message. To avoid reaching the limit, you should delete messages from the queue after they're processed. You can also increase the number of queues you use to process your messages. To request a limit increase, file a support request.
The expected use case of SQS is to have workers that receive a message, do some work, then delete the message. If you're not following this pattern, I'd strongly recommend reevaluating whether SQS is the right tool for what you're trying to do.
However, if you really have a valid use case for having more than 120K messages inflight at once, you'll need to describe your use case to AWS and get their approval to increase that limit.
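The receive, work, delete pattern described above might look roughly like this. It is a sketch under assumptions: `sqs` is a boto3-style SQS client, and `drain_batch` and `handle` are hypothetical names introduced for illustration.

```python
def drain_batch(sqs, queue_url, handle, max_messages=10, wait_seconds=20):
    """Receive one batch, process each message, and delete it on success.

    Returns the number of messages processed and deleted.
    `handle` is a caller-supplied function that does the actual work.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
        WaitTimeSeconds=wait_seconds,      # long polling, at most 20 seconds
    )
    done = 0
    for message in response.get("Messages", []):
        handle(message["Body"])
        # Deleting the message takes it out of the inflight count.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        done += 1
    return done
```

If `handle` raises, the message is not deleted and becomes visible again once its visibility timeout expires, so another worker can pick it up.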

Make permissions for SQS FIFO queue based on Message Group ID?

I'm building a system which sends commands to external systems (all identical but with different locations and IDs).
The commands are sent to a FIFO SQS queue and the external systems read and delete from that queue.
Currently the plan is to create one queue for each external system, so I'd just have a Lambda that updates the list of queues when the DB table of systems is changed.
But I can see that the SQS FIFO supports message group IDs so I wonder if I should just have one single queue, where all systems only read from their own message group ID.
I like the simplicity of this solution - however, I cannot see a way to limit access for reading and deleting messages for a specific message group, which means that if one external system is compromised, its credentials can be used to hijack the shared queue for all external systems and therefore, take down everything.
Is there a workaround for this, so I can set some permissions for a specific queue and message group ID, in any way?
I am also concerned about the missing option of purging only one group of messages, not the entire queue.
You can't read "from" a specific message group in a FIFO Queue, and there are no related permissions.
Message groups are opaque labels that tell the FIFO queue whether any two messages must be delivered to consumers in strict FIFO order relative to each other. If two messages share the same message group, they must be strictly ordered, but two messages with different message-group-ids do not need to be strictly ordered.
This capability allows faster overall processing of messages when there are parallel identical consumers, because without this feature, only one consumer could be handling messages at any one time, and the overall throughput of the queue would be limited to how quickly a single consumer could handle a message (since no messages would be delivered to another consumer as long as a single message was in flight).
Message Group ID
The tag that specifies that a message belongs to a specific message group. Messages that belong to the same message group are always processed one by one, in a strict order relative to the message group (however, messages that belong to different message groups might be processed out of order).
[...]
If you require a single group of ordered messages, provide the same message group ID for messages sent to the FIFO queue.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
Note also that while the ReceiveMessage API allows you to ask that the message-group-id be returned with each message, it has no provision for specifying which message-group-id you want to receive messages from, because that isn't the purpose of this feature.
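To illustrate the last point, here is a rough sketch assuming a boto3-style SQS client passed in as `sqs`; `receive_grouped` is a hypothetical helper name. The request can ask for the group ID to be returned with each message, but there is no parameter to select a group, so any grouping can only happen after the messages have been received.

```python
from collections import defaultdict


def receive_grouped(sqs, queue_url, max_messages=10):
    """Receive a batch and bucket the message bodies by message group ID."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,
        # Ask SQS to return the group ID with each message. There is no
        # matching parameter for choosing which group to receive from.
        AttributeNames=["MessageGroupId"],
    )
    grouped = defaultdict(list)
    for message in response.get("Messages", []):
        group_id = message.get("Attributes", {}).get("MessageGroupId")
        grouped[group_id].append(message["Body"])
    return dict(grouped)
```

Whichever consumer makes the call receives messages from whatever groups SQS chooses to deliver, which is why a shared queue cannot isolate the external systems from each other.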

Is there a way to get messages from reply queue after restart of the producer in spring-amqp

So my scenario is the following:
Producer will send a message to a queue using AsyncRabbitTemplate#sendAndReceive
Consumer will process the message and send a reply to the reply queue
Up to this point everything works fine as long as the producer is up and running. The message from the reply queue is received and everything is ok.
But when the producer goes down before all replies have been received, there is no way to get them later. All pending replies will produce a warning "No pending reply - perhaps timed out:". I totally understand why this is happening when I look at the code.
Is there no way to have a persistent store of the information about incoming reply messages? Am I doing something completely wrong, or is it just not possible to cover my use case with spring-amqp?
So the question is what is the best way to receive replies from a fixed reply queue after a restart of the producer.
There's not currently any support for persisting pending reply information; typically with request/reply scenarios, if the requestor dies the reply makes no sense. But I can see there are scenarios where that might not be the case.
Instead of using the async template, you could simply use the RabbitTemplate send() and use a listener container configured to handle and route the replies.
You would need to do your own request/reply correlation (e.g. with the correlationId header), persisting the pending reply correlations someplace.
Spring Integration provides a MetadataStore abstraction with several implementations which might be suitable.
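The idea of persisting pending reply correlations can be sketched independently of any framework. The following is not Spring AMQP or Spring Integration code. It is a language-neutral illustration written in Python, with SQLite standing in for "someplace" durable, and the class and method names (`PendingReplyStore`, `remember`, `resolve`) are hypothetical.

```python
import sqlite3


class PendingReplyStore:
    """Keeps request correlation IDs on disk so they survive a restart."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "correlation_id TEXT PRIMARY KEY, context TEXT NOT NULL)"
        )
        self.db.commit()

    def remember(self, correlation_id, context):
        """Record a request before it is sent."""
        self.db.execute(
            "INSERT OR REPLACE INTO pending VALUES (?, ?)",
            (correlation_id, context),
        )
        self.db.commit()

    def resolve(self, correlation_id):
        """Return and remove the stored context, or None if unknown."""
        row = self.db.execute(
            "SELECT context FROM pending WHERE correlation_id = ?",
            (correlation_id,),
        ).fetchone()
        if row is None:
            return None
        self.db.execute(
            "DELETE FROM pending WHERE correlation_id = ?", (correlation_id,)
        )
        self.db.commit()
        return row[0]
```

The producer would call `remember` with the correlationId before sending, and the reply listener would call `resolve` when a reply arrives. Because the table lives on disk, replies that arrive after a restart can still be matched to their requests.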

Monitor Amazon SQS delayed processing

I have a series of applications that consume messages from SQS queues. If for some reason one of these consumers fails and stops consuming messages, I'd like to be notified. What's the best way to do this?
Note that some of these queues could only have one message placed into the queue every 2 - 3 days, so waiting for the # of messages in the queue to trigger a notification is not a good option for me.
What I'm looking for is something that can monitor an SQS queue and say "This message has been here for an hour and nothing has processed it ... let someone know."
A possible solution off the top of my head (possibly not the most elegant one), which does not require CloudWatch at all (according to the OP's comment, the required tracking cannot be implemented through CloudWatch alarms). Assume the queue is processed by your service, and that the receiving side is implemented through long polling.
Run a Lambda function on a schedule (say hourly) that reads messages from the queue but never deletes them (the service deletes messages once they are processed). On the queue, set Maximum Receives to any value you want, let's say 3. If the Lambda function has run 3 times and the message was still in the queue each time, the message is moved to the dead-letter queue (automatically, if the redrive policy is set). A new message arriving in the dead-letter queue is a good indicator that your service is either down or not handling requests fast enough. All of these values can be changed to suit your needs.
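A rough sketch of the two pieces involved, assuming a boto3-style SQS client passed in as `sqs`. The helper names `redrive_attributes` and `probe` are hypothetical, and the choice of a zero visibility timeout in the probe is an assumption made here so that the probe does not hide messages from the real service.

```python
import json


def redrive_attributes(dead_letter_queue_arn, max_receives=3):
    """Build the queue attributes that enable a dead-letter queue."""
    policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": max_receives,
    }
    # SQS expects the policy as a JSON string under the RedrivePolicy key.
    return {"RedrivePolicy": json.dumps(policy)}


def probe(sqs, queue_url):
    """Receive without deleting, so each run raises the receive count.

    Returns how many messages were seen on this run.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        VisibilityTimeout=0,  # hand the messages straight back to the queue
    )
    return len(response.get("Messages", []))
```

The policy would be applied once with `sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=redrive_attributes(dlq_arn))`, and `probe` would be the body of the scheduled function. Keep in mind that receives by the real service count towards the same limit, so the threshold should leave room for its normal retries.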

how to retrieve nth item in a queue with amazon sqs and ruby

I'm sending messages to a queue using the Amazon SQS queuing system in a Rails application. Since the queue follows a FIFO process, it returns items in that order. Suppose I have 100 items in a queue: how can I retrieve the 35th item and process it? As far as I know, Amazon SQS provides no method for doing this. Is there any other method or workaround with which I can achieve this functionality?
There is no method to do that; SQS does not guarantee order of items in the queue due to its geographically redundant nature; it can't even guarantee FIFO. If you absolutely must process things in order, and need the ability to 'look ahead' in the queue, SQS may not be your best choice. Perhaps a custom-made queue in something like DynamoDB may work better.
SQS is designed to guarantee at-least-once delivery and does not take into account the order of messages. So the simple answer to your question on whether you can do that, is no.
A work around would depend on your use-case:
To split work among different processes handling queue messages while making sure they don't both process the same item - Using different queues is one approach; another is prefixing every message with an identifier denoting which process is supposed to work on it. For example, if I have 4 daemons running, I could prefix every message in the queue with the ID of the process which should work on it - 1, 2, 3 or 4. Every process would only process messages with the number corresponding to its ID.
Order of arrival is critical - In this case, you're better off not using SQS because it wasn't designed to be used this way. CloudAMQP is a cloud-based service built on RabbitMQ, which is a true FIFO queue and would suit this case better than SQS.
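The prefixing idea from the first workaround can be sketched as two small helpers. The `worker_id:payload` format, the `:` separator and the function names are assumptions made for illustration; the question itself uses Ruby, and the same logic carries over directly.

```python
def tag_for_worker(worker_id, payload):
    """Prefix a message body with the ID of the worker that should handle it."""
    return f"{worker_id}:{payload}"


def payload_if_mine(body, my_worker_id):
    """Return the payload if the message is addressed to this worker, else None."""
    # Split on the first separator only, so payloads may contain ":".
    prefix, separator, payload = body.partition(":")
    if separator and prefix == str(my_worker_id):
        return payload
    return None
```

A worker that receives a message meant for someone else should not delete it. It can leave the message alone until its visibility timeout expires, or make it visible again sooner by setting the timeout to 0 with ChangeMessageVisibility. Separate queues avoid this extra traffic altogether.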
