What is a good practice to achieve the "Exactly-once delivery" behavior with Amazon SQS? - amazon-sqs

According to the documentation:
Q: How many times will I receive each message?
Amazon SQS is
engineered to provide “at least once” delivery of all messages in its
queues. Although most of the time each message will be delivered to
your application exactly once, you should design your system so that
processing a message more than once does not create any errors or
inconsistencies.
Is there any good practice to achieve the exactly-once delivery?
I was thinking about using the DynamoDB “Conditional Writes” as distributed locking mechanism but... any better idea?
Some reference to this topic:
At-least-once delivery (Service Behavior)
Exactly-once delivery (Service Behavior)

FIFO queues are now available and provide ordered, exactly-once processing out of the box.
https://aws.amazon.com/sqs/faqs/#fifo-queues
Check your region for availability.

The best solution really depends on exactly how critical it is that you not perform the action suggested in the message more than once. For some actions, such as deleting a file or resizing an image, it doesn't really matter if it happens twice, so it is fine to do nothing. When it is more critical not to do the work a second time, I use an identifier for each message (generated by the sender), and the receiver tracks duplicates by marking the IDs as seen in memcached. That is fine for many things, but probably not if life or money depends on it, especially if there are multiple consumers.
Conditional writes sound like a clever solution, but it has me wondering whether AWS is really the right fit for your problem if you need a bulletproof exactly-once solution.
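A minimal sketch of the sender-generated ID approach described above. The SeenStore class is a hypothetical in-memory stand-in for a shared cache such as memcached; its add method is assumed to mirror the "store only if absent" behaviour of a cache add command, and the message layout is made up for illustration.

```python
import uuid


class SeenStore:
    """In-memory stand-in for a shared cache such as memcached (hypothetical helper)."""

    def __init__(self):
        self._seen = set()

    def add(self, key):
        # Mirrors "add only if absent": True the first time a key is stored,
        # False on every later attempt.
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def make_message(body):
    # The sender attaches a unique ID so receivers can recognise redeliveries.
    return {"id": str(uuid.uuid4()), "body": body}


def handle(message, store, process):
    # Process the message only if its ID has not been seen before.
    if not store.add(message["id"]):
        return False  # duplicate delivery, skip it
    process(message["body"])
    return True


store = SeenStore()
processed = []
msg = make_message("resize image 42")
first = handle(msg, store, processed.append)
second = handle(msg, store, processed.append)  # simulated redelivery
```

With several consumers the store has to be shared, and the check-and-mark step has to be atomic in that store, which is where a plain cache may fall short.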

Another alternative for distributed locking is a Redis cluster, which can also be provisioned with Amazon ElastiCache. Redis supports transactions, which guarantee that concurrent calls are executed in sequence.
One of the advantages of using a cache is that you can set expiration timeouts, so if your message processing fails, the lock is released automatically once the timeout expires.
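As a rough illustration of a lock with an expiration timeout, the sketch below takes a lock per message ID using "set if absent, with expiry". The try_lock helper and the key naming are assumptions; FakeRedis is a hypothetical in-memory stand-in whose set method imitates the nx and ex keyword arguments of the redis-py client, so the example runs without a server.

```python
import time


class FakeRedis:
    """Tiny in-memory stand-in so the sketch runs without a server (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, name, value, nx=False, ex=None):
        now = self._clock()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            del self._data[name]  # the key has expired
            current = None
        if nx and current is not None:
            return None  # key exists, so the lock is held by someone else
        expires_at = now + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True


def try_lock(client, message_id, owner, ttl_seconds=30):
    # Create the key only if it is absent, with an expiry, so a crashed
    # consumer releases the lock once the TTL elapses.
    return bool(client.set(f"lock:{message_id}", owner, nx=True, ex=ttl_seconds))


now = [0.0]
client = FakeRedis(clock=lambda: now[0])
got_a = try_lock(client, "msg-1", "worker-a")
got_b = try_lock(client, "msg-1", "worker-b")  # still locked by worker-a
now[0] = 31.0                                  # the TTL has passed
got_b_later = try_lock(client, "msg-1", "worker-b")
```

The TTL should comfortably exceed the longest expected processing time, otherwise a slow consumer can lose its lock while still working.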

In this blog post the usage of a low-latency control database like Amazon DynamoDB is also recommended:
https://aws.amazon.com/blogs/compute/new-for-aws-lambda-sqs-fifo-as-an-event-source/
Amazon SQS FIFO queues ensure that the order of processing follows the
message order within a message group. However, it does not guarantee
only once delivery when used as a Lambda trigger. If only once
delivery is important in your serverless application, it’s recommended
to make your function idempotent. You could achieve this by tracking a
unique attribute of the message using a scalable, low-latency control
database like Amazon DynamoDB.
In short: we can put or update an item in a DynamoDB table using attribute_not_exists in a condition expression (for put) or if_not_exists in an update expression (for update); please check the example here:
https://stackoverflow.com/a/55110463/9783262
If the put/update operation raises an exception, we have to return from the Lambda without further processing; if it does not, we process the message (https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-idempotent/).
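A hedged sketch of that flow, assuming a table whose partition key is named message_id (an invented name) and a table object with the put_item(Item=..., ConditionExpression=...) shape of a boto3 Table resource. FakeTable and ConditionFailed are hypothetical stand-ins so the example runs offline; with real boto3 the error would be a botocore ClientError carrying the same error code.

```python
class ConditionFailed(Exception):
    """Stand-in for botocore's ClientError in this offline demo (hypothetical)."""
    response = {"Error": {"Code": "ConditionalCheckFailedException"}}


class FakeTable:
    """In-memory imitation of a table with a conditional put (hypothetical)."""

    def __init__(self):
        self.items = {}

    def put_item(self, Item, ConditionExpression=None):
        if ConditionExpression and Item["message_id"] in self.items:
            raise ConditionFailed()
        self.items[Item["message_id"]] = Item


def claim_message(table, message_id):
    """Return True if this call recorded the ID first, False if it already existed."""
    try:
        table.put_item(
            Item={"message_id": message_id},
            # The write succeeds only when no item with this key exists yet.
            ConditionExpression="attribute_not_exists(message_id)",
        )
        return True
    except Exception as err:
        # Inspect the error code carried on the exception; re-raise anything else.
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise


def handler(event, table, process):
    # Lambda-style handler for an SQS event: skip records that were already claimed.
    handled = []
    for record in event["Records"]:
        if claim_message(table, record["messageId"]):
            process(record["body"])
            handled.append(record["messageId"])
    return handled


event = {"Records": [
    {"messageId": "m-1", "body": "a"},
    {"messageId": "m-1", "body": "a"},  # duplicate delivery
    {"messageId": "m-2", "body": "b"},
]}
processed = []
handled = handler(event, FakeTable(), processed.append)
```

Note that this sketch claims the ID before processing, so a failure during processing leaves the ID marked as seen; whether to delete the marker on failure depends on the application.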
The following resources were helpful for me too:
https://ably.com/blog/sqs-fifo-queues-message-ordering-and-exactly-once-processing-guaranteed
https://aws.amazon.com/blogs/aws/introducing-amazon-sns-fifo-first-in-first-out-pub-sub-messaging/
https://youtu.be/8zysQqxgj0I

Related

Background Tasks in Spring (AMQP)

I need to handle a time-consuming and error-prone task (e.g., invoking a SOAP endpoint that will trigger the delivery of an SMS) whenever a given endpoint of my REST API is invoked, but I'd prefer not to make my users wait for that before sending a response back. Spring AMQP is already part of my stack, so I thought about leveraging it to establish a "work queue" and have a number of worker processes consuming from the queue and taking care of the "work units". I have, however, the following requirements:
A work unit is guaranteed to be delivered, and delivered to exactly one worker.
Should a work unit fail to be completed for any reason, it must be placed back in the queue so that another worker can pick it up later.
Work units survive server reboots and crashes. This is mandatory because I won't be using a DB of any kind to store them.
I know RabbitMQ and Spring AMQP can be configured in such a way that ensures these three requirements, but I've only ever used it to achieve RPC so I don't know much about anything other than that. Is there any example I might follow? What are some of the pitfalls to watch out for?
When creating queues, RabbitMQ gives you two options: transient or durable. Messages in a durable queue remain available until you acknowledge them (to survive a broker restart, the messages themselves must also be published as persistent). Messages won't expire unless you give the queue a TTL. For starters, you can enable the RabbitMQ management plugin and play around a little.
But if you really want to guarantee the safety of your messages against hard resets or hardware problems, I guess you need to use a RabbitMQ cluster.
See the RabbitMQ Clustering guide; you can find the high availability topic on the right side of that page.
This guy explains how to cluster
By the way, I like beanstalkd too. You can make it write messages to disk, and they will be safe except in the case of disk failures.
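For the RabbitMQ side, the durable queue plus acknowledgement setup discussed in this answer could look roughly like the sketch below. It uses the Python pika client rather than Spring AMQP, purely as an illustration; the queue name work_units, the host and the helper names are assumptions. The pika import is kept inside the functions that need a broker so the callback logic can be exercised without one.

```python
def make_callback(work):
    """Build a pika-style consumer callback that acks only after `work` succeeds."""

    def on_message(channel, method, properties, body):
        try:
            work(body)
        except Exception:
            # Put the work unit back on the queue so another worker can retry it.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
        else:
            channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message


def run_worker(work, queue="work_units", host="localhost"):
    import pika  # imported here so make_callback can be tested without a broker

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=queue, durable=True)  # queue survives broker restarts
    channel.basic_qos(prefetch_count=1)               # one unacked unit per worker
    channel.basic_consume(queue=queue, on_message_callback=make_callback(work))
    channel.start_consuming()


def publish(channel, body, queue="work_units"):
    import pika

    channel.basic_publish(
        exchange="",
        routing_key=queue,
        body=body,
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )
```

Because a unit is redelivered if a worker dies before acknowledging it, this gives at-least-once handling, so the work itself should tolerate an occasional repeat. A unit that always fails would be requeued forever here; a retry limit or dead-letter queue is a common addition.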

MSMQ max count message notification

We are in the process of implementing MSMQ for quick storage of messages and processing them in disconnected mode: the typical usage of any message broker.
One of the administration requirements is to send an automatic notification to administrators/developers if the count of unprocessed queue messages reaches 1000.
Can it be done out of the box? If yes, then how?
If no, then do I need to write a Windows service (or some sort of scheduler) to check the count every x seconds?
Any suggestions or past experience are welcome.
The only (partially) built-in solution would be to set up the MSMQ Queue performance counter which gives you this information for private queues on the server.
There are a number of other solutions, including a SCOM management pack and some third-party solutions like evtools, or you could roll your own using System.Messaging.
Hope this is of help.
There's a commercial solution for this: QueueMonitor.
Disclaimer: I'm the author of that software.
Edit
A few tips for this scenario:
set the message's UseDeadLetterQueue to true - this way, if there's any issue with delivering messages, at least they won't be lost but moved to the system's dead letter queue.
set the message's Recoverable property to true - it does reduce performance, but for this kind of long-running scenario there's too much risk that some restart or failure would lose messages which are only stored in memory.
if messages are no longer valid after some period, you can use TimeToReachQueue to automatically delete them.

how to retrieve nth item in a queue with amazon sqs and ruby

I am sending messages to a queue using the Amazon SQS queuing system in a Rails application. Since the queue follows a FIFO process, it will return the next items in that order. Suppose I have 100 items in a queue: how can I retrieve the 35th item from the queue and process it? As far as I know, Amazon SQS provides no method for doing this. Is there any other method or workaround with which I can achieve this functionality?
There is no method to do that; SQS does not guarantee the order of items in the queue due to its geographically redundant nature; it can't even guarantee FIFO. If you absolutely must process things in order, and need the ability to 'look ahead' in the queue, SQS may not be your best choice. Perhaps a custom-made queue in something like DynamoDB may work better.
SQS is designed to guarantee at-least-once delivery and does not take into account the order of messages. So the simple answer to your question on whether you can do that, is no.
A work around would depend on your use-case:
To split work among different processes handling queue messages and make sure they don't both process the same item - Different queues is one approach, or prefixing every message with an identifier denoting which process is supposed to work on it. For example, if I have 4 daemons running, I could prefix every message in the queue with the ID of the process which should work on it - 1, 2, 3 or 4. Every process would only process messages with the number corresponding to its ID.
Order of arrival is critical - In this case, you're better off not using SQS because it wasn't designed to be used this way. CloudAMQP is a cloud-based service built on RabbitMQ, which is a true FIFO queue and would suit this case better than SQS.
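The prefixing workaround above can be sketched as follows. The "ID:payload" layout and the helper names are assumptions made for illustration, and a plain list stands in for the queue.

```python
def tag(worker_id, payload):
    # Prefix the payload with the ID of the worker that should handle it.
    return f"{worker_id}:{payload}"


def is_mine(worker_id, message):
    # Split on the first colon only, so payloads may contain colons themselves.
    target, _, _ = message.partition(":")
    return target == str(worker_id)


def payload_of(message):
    return message.partition(":")[2]


queue = [tag(1, "job-a"), tag(3, "job-b"), tag(1, "job:c")]
mine = [payload_of(m) for m in queue if is_mine(1, m)]
```

With a real SQS queue, a worker that receives a message meant for someone else still has to leave it undeleted so it becomes visible again, which is one reason separate queues per worker are usually simpler.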

Parallel asynchronous requests in SOA using a messaging broker

I've been looking at an SOA using a messaging broker (RabbitMQ / Rails), however there are still a few niggles I can't get my head around.
If I wanted to run parallel requests, as you would using something like Typhoeus with HTTP:
a) In an asynchronous system like this, when you potentially have multiple threads publishing to the same topic exchange, how do you connect the response message with your request? Would you add a unique routing key?
c) What would be the best way of initiating and managing multiple parallel calls of this nature in Ruby?
Many thanks
In answer to a), yes you use a routing key, or in the parlance of messaging, a correlation identifier.
In answer to c), sorry I haven't a clue about Ruby, but messaging by nature supports parallelism by using queues to manage throughput. I assume that whatever broker you choose would provide the appropriate samples and tooling for your needs.
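To make the correlation identifier idea concrete, here is a small broker-agnostic sketch. The Correlator class and the message dictionaries are hypothetical; with a real broker the ID would typically travel as a message property or header and the reply would echo it back.

```python
import uuid


class Correlator:
    """Tracks outstanding requests so replies can be matched by correlation ID."""

    def __init__(self):
        self._pending = {}

    def new_request(self, payload):
        correlation_id = str(uuid.uuid4())
        self._pending[correlation_id] = payload
        # The ID travels with the message (for example as a header or property).
        return {"correlation_id": correlation_id, "payload": payload}

    def match_reply(self, reply):
        # Returns the original request payload, or None for an unknown or
        # already-handled reply.
        return self._pending.pop(reply["correlation_id"], None)


correlator = Correlator()
req_a = correlator.new_request("fetch user 1")
req_b = correlator.new_request("fetch user 2")

# Replies may arrive in any order; each echoes the ID of its request.
reply_b = {"correlation_id": req_b["correlation_id"], "result": "user 2 data"}
matched = correlator.match_reply(reply_b)
```

Because each request gets its own ID, several threads can publish to the same exchange and still pair every reply with the call that produced it.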
I would look at Sidekiq or Resque for jobs like that. If your system is larger and distributed, you can create a module/class which takes your job, including a key, as an argument and sends it to RabbitMQ; some worker which is subscribed to the fanout exchange or channel picks it up and sends the result back as a POST to your app (webhook approach).
For simplicity, you can also just put some sort of Ajax spinner on your view and poll every 10 seconds, or whatever suits you, to see if the result is back. For sure you should have some kind of ID for every job. If you have questions about it, I could elaborate more. My apps crunch a lot of data in long-running tasks with up to 500,000,000 items in Rabbit queues.

Executing large numbers of asynchronous IO-bound operations in Rails

I'm working on a Rails application that periodically needs to perform large numbers of IO-bound operations. These operations can be performed asynchronously. For example, once per day, for each user, the system needs to query Salesforce.com to fetch the user's current list of accounts (companies) that he's tracking. This results in huge numbers (potentially > 100k) of small queries.
Our current approach is to use ActiveMQ with ActiveMessaging. Each of our users is pushed onto a queue as a different message. Then, the consumer pulls the user off the queue, queries Salesforce.com, and processes the results. But this approach gives us horrible performance. Within a single poller process, we can only process a single user at a time. So, the Salesforce.com queries become serialized. Unless we run literally hundreds of poller processes, we can't come anywhere close to saturating the server running poller.
We're looking at EventMachine as an alternative. It has the advantage of allowing us to kickoff large numbers of Salesforce.com queries concurrently within a single EventMachine process. So, we get great parallelism and utilization of our server.
But there are two problems with EventMachine. 1) We lose the reliable message delivery we had with ActiveMQ/ActiveMessaging. 2) We can't easily restart our EventMachine processes periodically to lessen the impact of memory growth. For example, with ActiveMessaging, we have a cron job that restarts the poller once per day, and this can be done without worrying about losing any messages. But with EventMachine, if we restart the process, we could literally lose hundreds of messages that were in progress. The only way I can see around this is to build a persistence/reliable delivery layer on top of EventMachine.
Does anyone have a better approach? What's the best way to reliably execute large numbers of asynchronous IO-bound operations?
I maintain ActiveMessaging, and have been thinking about the issues of a multi-threaded poller also, though perhaps not at the same scale you guys are. I'll give you my thoughts here, but am also happy to discuss further on the ActiveMessaging list, or via email if you like.
One trick is that the poller is not the only serialized part of this. STOMP subscriptions, if you do client -> ack in order to prevent losing messages on interrupt, will only get sent a new message on a given connection when the prior message has been ack'd. Basically, you can only have one message being worked on at a time per connection.
So to keep using a broker, the trick will be to have many broker connections/subscriptions open at once. The current poller is pretty heavy for this, as it loads up a whole rails env per poller, and one poller is one connection. But there is nothing magical about the current poller, I could imagine writing a poller as an event machine client that is implemented to create new connections to the broker and get many messages at once.
In my own experiments lately, I have been thinking about using Ruby Enterprise Edition and having a master thread that forks many poller worker threads so as to get the benefit of the reduced memory footprint (much like passenger does), but I think the EM trick could work as well.
I am also an admirer of the Resque project, though I do not know that it would be any better at scaling to many workers - I think the workers might be lighter weight.
http://github.com/defunkt/resque
I've used AMQP with RabbitMQ in a way that would work for you. Since ActiveMQ implements AMQP, I imagine you can use it in a similar way. I have not used ActiveMessaging, which although it seems like an awesome package, I suspect may not be appropriate for this use case.
Here's how you could do it, using AMQP:
Have Rails process send a message saying "get info for user i".
The consumer pulls this off the message queue, making sure to specify that the message requires an 'ack' to be permanently removed from the queue. This means that if the message is not acknowledged as processed, it is returned to the queue for another worker eventually.
The worker then spins off the message into the thousands of small requests to SalesForce.
When all of these requests have successfully returned, another callback should be fired to ack the original message and return a "summary message" that has all the info germane to the original request. The key is using a message queue that lets you acknowledge successful processing of a given message, and making sure to do so only when relevant processing is complete.
Another worker pulls that message off the queue and performs whatever synchronous work is appropriate. Since all the latency-inducing bits have already been performed, I imagine this should be fine.
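The steps above can be sketched with Python's asyncio standing in for EventMachine. Everything here is illustrative: fake_fetch replaces the Salesforce call, and the ack and nack callables are assumed to wrap whatever acknowledgement mechanism the broker client provides.

```python
import asyncio


async def handle_message(user_id, accounts, fetch, ack, nack):
    # Fan out one small request per account and wait for all of them.
    results = await asyncio.gather(
        *(fetch(user_id, account) for account in accounts),
        return_exceptions=True,
    )
    if any(isinstance(r, Exception) for r in results):
        nack()  # leave the original message for another attempt
        return None
    ack()  # acknowledge only once every request has returned
    return {"user": user_id, "results": results}  # the "summary message"


async def fake_fetch(user_id, account):
    await asyncio.sleep(0)  # stands in for a network call
    return f"{user_id}/{account}"


events = []
summary = asyncio.run(
    handle_message(
        "u1",
        ["acme", "globex"],
        fake_fetch,
        ack=lambda: events.append("ack"),
        nack=lambda: events.append("nack"),
    )
)
```

The point of the design is that a restart in the middle of the fan-out loses nothing: the original message was never acknowledged, so the broker hands it to another worker.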
If you're using (C)Ruby, try to never combine synchronous and asynchronous stuff in a single process. A process should either do everything via Eventmachine, with no code blocking, or only talk to an Eventmachine process via a message queue.
Also, writing asynchronous code is incredibly useful, but also difficult to write, difficult to test, and bug-prone. Be careful. Investigate using another language or tool if appropriate.
Also check out "cramp" and "beanstalk".
Someone sent me the following link: http://github.com/mperham/evented/tree/master/qanat/. This is a system that's somewhat similar to ActiveMessaging except that it is built on top of EventMachine. It's almost exactly what we need. The only problem is that it seems to only work with Amazon's queue, not ActiveMQ.
