Jobs pushing to queue, but not processing - ruby-on-rails

I am using AWS SQS. I am getting 2 issues.
Sometime, messages are present in the queue but I am not able to read that.
When I fetch, I am getting blank array, same like not any messages found in queue.
When I am deleting a message from queue then it gives me like
sqs.delete_message({queue_url: queue_url, receipt_handle: receipt_handle})
=> Aws::EmptyStructure
When I check in SQS (In AWS), message still present even I refresh page more then 10 times.
Can you help me why this happens ?

1. You may need to implement Long Polling.
SQS is a distributed system. By default, when you read from a queue, AWS returns you the response only from a small subset of its servers. That's why you receive empty array some times. This is known as Short Polling.
When you implement Long Polling, AWS waits until it gets the response from all it's servers.
When you're calling ReceiveMessage API, set the parameter WaitTimeSeconds > 0.
2. Visibility Timeout may be too short.
The Visibility Timeout controls how long a message currently being read by one poller is invisible to other pollers. If the visibility timeout is too short, then other pollers may start reading the message before your first poller has processed and deleted it.
Since SQS supports multiple pollers reading the same message. From the docs -
The ReceiptHandle is associated with a specific instance of receiving a message. If you receive a message more than once, the ReceiptHandle is different each time you receive a message. When you use the DeleteMessage action, you must provide the most recently received ReceiptHandle for the message (otherwise, the request succeeds, but the message might not be deleted).

Related

Is there any latency in SQS while creating it using AWS API and sending messages immediately after creating it

I want to create SQS using code whenever it is required to send messages and delete it after all messages are consumed.
I just wanted to know if there is some delay required between creating an SQS using Java code and then sending messages to it.
Thanks.
Virendra Agarwal
You'll have to try it and make observations. SQS is a dostributed system, so there is a possibility that a queue might not immediately be usable, though I did not find a direct documentation reference for this.
Note the following:
If you delete a queue, you must wait at least 60 seconds before creating a queue with the same name.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_CreateQueue.html
This means your names will always need to be different, but it also implies something about the internals of SQS -- deleting a queue is not an instantaneous process. The same might be true of creation, though that is not necessarily the case.
Also, there is no way to know with absolute certainty that a queue is truly empty. A long poll that returns no messages is a strong indication that there are no messages remaining, as long as there are also no messages in flight (consumed but not deleted -- these will return to visibility if the consumer resets their visibility or improperly handles an exception and does not explicitly reset their visibility before the visibility timeout expires).
However, GetQueueAttributes does not provide a fail-safe way of assuring a queue is truly empty, because many of the counter attributes are the approximate number of messages (visible, in-flight, etc.). Again, this is related to the distributed architecture of SQS. Certain rare, internal failures could potentially cause messages to be stranded internally, only to appear later. The significance of this depends on the importance of the messages and the life cycle of the queue, and the risks of any such an issue seem -- to me -- increased when a queue does not have an indefinite lifetime (i.e. when the plan for a queue is to delete it when it is "empty"). This is not to imply that SQS is unreliable, only to make the point that any and all systems do eventually behave unexpectedly, however rare or unlikely.

Speed up the proces of requesting messages from SQS

We need to process a big number of messages stored in SQS (the messages originate from Amazon store and SQS is the only place we can save them to) and save the result to our database. The problem is, SQS can only return 10 messages at a time. Considering we can have up to 300000 messages in SQS, even if requesting and processing a 10 messages takes little time, the whole process takes forever with the main culprit being actually requesting and receiving the messages from SQS.
We're looking for a way to speed this up. The intended result would be dumping the results to our database. The process would probably run a few times per day (the number of messages would likely be less per run in that scenario).
Like Michael-sqlbot wrote, parallel requests were the solution. By rewriting our code to use async and making 10 requests at the same time, we managed to reduce the execution time to something much reasonable.
I guess it's because I rarely use multithreading directly in my job, that I haven't thought of using it to solve this problem.

Looking to implement write timeout when there is a delay in writing message to a queue

We are working on a billing invoice system. As a part of processing our request, we need to make an asynchronous call by placing a message in a queue. We work at 20TPS and have SLA for entire transaction of 12 sec. Occasionally, we have observed that when MQ server becomes very slow but still operational it's taking a lot of time just to write the message in the queue. We want to handle this scenario and have a system that throws an exception when it exceeds a predefined limit for writing the message in the queue.
In simple words, we want to implement a write timeout when there is a delay in writing a message in the queue. Any help is appreciated.
We are aware of mentioning timeout for receiving the response but we are unable to find any fix for mentioning timeout while writing the message in the queue.
We have found some suggestions on revalidating the destination. But in our case, we already know the destination is operational and our system becomes slow only during the response.

Erlang dead letter queue

Let's say my Erlang application receives an important message from the outside (through an exposed API endpoint, for example). Due to a bug in the application or an incorrectly formatted message the process handling the message crashes.
What happens to the message? How can I influence what happens to the message? And what happens to the other messages waiting in the process mailbox? Do I have to introduce a hierarchy of processes just to make sure that no messages are lost?
Is there something like Akka's dead letter queue in Erlang? Let's say I want to handle the message later - either by fixing the message or fixing the bug in the application itself, and then rerunning the message processing.
I am surprised how little information about this topic is available.
There is no information because there is no dead letter queue, if you application crashed while processing your message the message would be already received, why would it go on a dead letter queue (if one existed).
Such a queue would be a major scalability issue with not much use (you would get arbitrary messages which couldn't be sent and would be totally out of context)
If you need to make sure a message is processed you usually use a way to get a reply back when the message is processed like a gen_server call.
And if your messages are such important that it would be a catastrophe if lost you should probably persist it in a external DB, because otherwise if your computer crashes what would happen to all the messages in transit?

Amazon SQS End of Queue Detection

I was wondering if there was a best practice for notifying the end of an sqs queue. I am spawning a bunch of generic workers to consume data from a queue and I want to notify them that they can stop processing once they detect no more messages in the queue. Does sqs provide this type of feature?
By looking at the right_aws ruby gem source code for SQS I found that there is the ApproximateNumberOfMessages attribute on a queue. Which you can request using a standard API call.
You can find more information including examples here:
http://docs.amazonwebservices.com/AWSSimpleQueueService/latest/APIReference/Query_QueryGetQueueAttributes.html
For more information on how to do this using the right_aws gem in ruby look at:
https://github.com/rightscale/right_aws/blob/master/lib/sqs/right_sqs_gen2_interface.rb#L187
https://github.com/rightscale/right_aws/blob/master/lib/sqs/right_sqs_gen2_interface.rb#L389
Do you mean "is there a way for the producer to notify consumers that it has finished sending messages?" . If so, then no there isn't. If a consumer calls "ReceiveMessage" and gets nothing back, or "ApproximateNumberOfMessages" returns zero, that's not a guarantee that no more messages will be sent or even that there are no messages in flight. And the producer can't send any kind of "end of stream" message because only one consumer will receive it, and it might arrive out of order. Even if you used a separate notification mechanism such as an SNS topic to notify all consumers, there's no guarantee that the SNS notification won't arrive before all the messages have been delivered.
But if you just want your pool of workers to back off when there are no messages left in the queue, then consider setting the "ReceiveMessageWaitTimeSeconds" property on your queue to its maximum value of 20 seconds. When there are no more messages to process, a ReceiveMessage call will block for up to 20s to see if a message arrives instead of returning immediately.
You could have whatever's managing your thread pool query ApproximateNumberOfMessages to regularly scale/up down your thread pool if you're concerned about releasing resources. If you do, then beware that the number you get back is Approximate, and you should always assume there may be one or more messages left on the queue even if ApproximateNumberOfMessages returns zero.

Resources