Amazon SQS End of Queue Detection

I was wondering if there is a best practice for signalling the end of an SQS queue. I am spawning a bunch of generic workers to consume data from a queue, and I want to notify them that they can stop processing once they detect there are no more messages in the queue. Does SQS provide this type of feature?

Looking at the right_aws Ruby gem source code for SQS, I found that there is an ApproximateNumberOfMessages attribute on a queue, which you can request using a standard API call.
You can find more information including examples here:
http://docs.amazonwebservices.com/AWSSimpleQueueService/latest/APIReference/Query_QueryGetQueueAttributes.html
For more information on how to do this using the right_aws gem in Ruby, look at:
https://github.com/rightscale/right_aws/blob/master/lib/sqs/right_sqs_gen2_interface.rb#L187
https://github.com/rightscale/right_aws/blob/master/lib/sqs/right_sqs_gen2_interface.rb#L389
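For illustration only, here is a minimal sketch of fetching that attribute with the newer aws-sdk-sqs gem rather than right_aws; the region and queue name are placeholders:

require 'aws-sdk-sqs'

sqs = Aws::SQS::Client.new(region: 'us-east-1')            # placeholder region
queue_url = sqs.get_queue_url(queue_name: 'my-queue').queue_url

resp = sqs.get_queue_attributes(
  queue_url: queue_url,
  attribute_names: ['ApproximateNumberOfMessages']
)
puts resp.attributes['ApproximateNumberOfMessages']        # returned as a string, e.g. "0"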

Do you mean "is there a way for the producer to notify consumers that it has finished sending messages?" If so, then no, there isn't. If a consumer calls ReceiveMessage and gets nothing back, or ApproximateNumberOfMessages returns zero, that's not a guarantee that no more messages will be sent, or even that there are no messages in flight. And the producer can't send any kind of "end of stream" message, because only one consumer would receive it, and it might arrive out of order. Even if you used a separate notification mechanism such as an SNS topic to notify all consumers, there's no guarantee that the SNS notification won't arrive before all the messages have been delivered.
But if you just want your pool of workers to back off when there are no messages left in the queue, then consider setting the "ReceiveMessageWaitTimeSeconds" property on your queue to its maximum value of 20 seconds. When there are no more messages to process, a ReceiveMessage call will block for up to 20s to see if a message arrives instead of returning immediately.
You could have whatever's managing your thread pool query ApproximateNumberOfMessages regularly to scale your thread pool up or down if you're concerned about releasing resources. If you do, beware that the number you get back is approximate, and you should always assume there may be one or more messages left on the queue even if ApproximateNumberOfMessages returns zero.
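As a rough sketch of that "back off and eventually stop" idea, again using the aws-sdk-sqs gem (the three-empty-polls threshold and the process helper are my own placeholders, not anything SQS prescribes):

empty_polls = 0
while empty_polls < 3                         # treat 3 empty long polls in a row as "probably drained"
  resp = sqs.receive_message(
    queue_url: queue_url,
    max_number_of_messages: 10,
    wait_time_seconds: 20                     # long poll: block up to 20s instead of returning immediately
  )
  if resp.messages.empty?
    empty_polls += 1
  else
    empty_polls = 0
    resp.messages.each do |msg|
      process(msg.body)                       # hypothetical processing method
      sqs.delete_message(queue_url: queue_url, receipt_handle: msg.receipt_handle)
    end
  end
end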

Related

Jobs pushing to queue, but not processing

I am using AWS SQS and I am running into two issues.
Sometimes messages are present in the queue but I am not able to read them; when I fetch, I get an empty array, as if there were no messages in the queue.
When I delete a message from the queue, the call just returns:
sqs.delete_message({queue_url: queue_url, receipt_handle: receipt_handle})
=> Aws::EmptyStructure
When I check the queue in the AWS console, the message is still present even after I refresh the page more than 10 times.
Can you help me understand why this happens?
1. You may need to implement long polling.
SQS is a distributed system. By default, when you read from a queue, AWS returns a response from only a small subset of its servers, which is why you sometimes receive an empty array. This is known as short polling.
With long polling, AWS waits until it has checked all of its servers before responding.
When calling the ReceiveMessage API, set the WaitTimeSeconds parameter to a value greater than 0.
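Since your delete call looks like the Ruby aws-sdk, the receive call would look roughly like this (the 10-second wait is just an example value):

resp = sqs.receive_message(
  queue_url: queue_url,
  wait_time_seconds: 10    # > 0 enables long polling for this call
)

You can also set the ReceiveMessageWaitTimeSeconds attribute on the queue itself so that every consumer long-polls by default.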
2. Visibility Timeout may be too short.
The Visibility Timeout controls how long a message currently being read by one poller is invisible to other pollers. If the visibility timeout is too short, then other pollers may start reading the message before your first poller has processed and deleted it.
This matters because SQS allows the same message to be received by multiple pollers. From the docs:
The ReceiptHandle is associated with a specific instance of receiving a message. If you receive a message more than once, the ReceiptHandle is different each time you receive a message. When you use the DeleteMessage action, you must provide the most recently received ReceiptHandle for the message (otherwise, the request succeeds, but the message might not be deleted).
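As a sketch (again assuming the Ruby aws-sdk), always delete with the receipt handle from the most recent receive, and consider extending the visibility timeout while a message is being processed; the 60/120-second values are arbitrary examples:

resp = sqs.receive_message(queue_url: queue_url, visibility_timeout: 60)  # hide received messages for 60s
resp.messages.each do |msg|
  # ... process the message ...
  # optionally extend the timeout if processing runs long
  sqs.change_message_visibility(
    queue_url: queue_url,
    receipt_handle: msg.receipt_handle,   # must be the handle from this receive
    visibility_timeout: 120
  )
  sqs.delete_message(queue_url: queue_url, receipt_handle: msg.receipt_handle)
end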

Is there any latency in SQS while creating it using AWS API and sending messages immediately after creating it

I want to create an SQS queue programmatically whenever it is required to send messages, and delete it after all messages are consumed.
I just wanted to know whether some delay is required between creating an SQS queue using Java code and then sending messages to it.
You'll have to try it and make observations. SQS is a distributed system, so there is a possibility that a queue might not be immediately usable after creation, though I did not find a direct documentation reference for this.
Note the following:
If you delete a queue, you must wait at least 60 seconds before creating a queue with the same name.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_CreateQueue.html
This means your names will always need to be different, but it also implies something about the internals of SQS -- deleting a queue is not an instantaneous process. The same might be true of creation, though that is not necessarily the case.
Also, there is no way to know with absolute certainty that a queue is truly empty. A long poll that returns no messages is a strong indication that there are no messages remaining, as long as there are also no messages in flight (consumed but not deleted -- these will return to visibility if the consumer explicitly resets their visibility, or improperly handles an exception and fails to delete them before the visibility timeout expires).
However, GetQueueAttributes does not provide a fail-safe way of assuring a queue is truly empty, because many of the counter attributes are only approximate numbers of messages (visible, in flight, etc.). Again, this is related to the distributed architecture of SQS. Certain rare, internal failures could potentially cause messages to be stranded internally, only to appear later. The significance of this depends on the importance of the messages and the life cycle of the queue, and the risks of any such issue seem -- to me -- increased when a queue does not have an indefinite lifetime (i.e. when the plan is to delete the queue once it is "empty"). This is not to imply that SQS is unreliable, only to make the point that any and all systems eventually behave unexpectedly, however rare or unlikely that may be.
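If you do go down the create-and-delete road, a hedged sketch of the life cycle might look like the following. The question mentions Java, but this stays in Ruby for consistency with the snippets above; the QueueDeletedRecently error class name is as generated by the aws-sdk-sqs gem, and the "empty after one full long poll" check is a heuristic, not official guidance:

begin
  queue_url = sqs.create_queue(queue_name: 'billing-temp-queue').queue_url
rescue Aws::SQS::Errors::QueueDeletedRecently
  sleep 60          # a queue with the same name was deleted less than 60s ago
  retry
end

# ... send and consume messages ...

# Heuristic drain check before deleting: one full 20s long poll with no messages.
resp = sqs.receive_message(queue_url: queue_url, wait_time_seconds: 20)
sqs.delete_queue(queue_url: queue_url) if resp.messages.empty?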

Looking to implement write timeout when there is a delay in writing message to a queue

We are working on a billing invoice system. As part of processing our requests, we need to make an asynchronous call by placing a message on a queue. We operate at 20 TPS and have an SLA of 12 seconds for the entire transaction. Occasionally we have observed that when the MQ server becomes very slow but is still operational, it takes a long time just to write the message to the queue. We want to handle this scenario with a system that throws an exception when writing the message to the queue exceeds a predefined limit.
In simple words, we want to implement a write timeout for when there is a delay in writing a message to the queue. Any help is appreciated.
We know how to specify a timeout for receiving a response, but we have been unable to find a way to specify a timeout while writing a message to the queue.
We have found some suggestions about revalidating the destination, but in our case we already know the destination is operational, and our system only becomes slow while waiting for the response.
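Without knowing the specific MQ client in use, one generic pattern is to wrap the send call in a client-side timeout and fail the transaction when it fires. A rough Ruby sketch of that idea (publish_invoice_message is a hypothetical stand-in for the existing enqueue call, and the 2-second budget is arbitrary):

require 'timeout'

WRITE_TIMEOUT_SECONDS = 2   # arbitrary slice of the 12s SLA reserved for the enqueue

begin
  Timeout.timeout(WRITE_TIMEOUT_SECONDS) do
    publish_invoice_message(queue, payload)   # hypothetical existing send call
  end
rescue Timeout::Error
  raise "Timed out writing message to queue after #{WRITE_TIMEOUT_SECONDS}s"
end

Note that Ruby's Timeout interrupts the thread and can leave the client connection in an undefined state, so a timeout option native to the MQ client, or a bounded wait on a separate sender thread, is preferable where one exists.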

Erlang message processing transaction

When is the "transaction" of a process trying to fetch a message from its message queue considered to be committed or rolled back? In other words, at what point of execution is the message removed permanently from the message queue?
When it is read by a receive call.
If a message is in a process's message queue and is read by that process calling receive, it's just memory manipulation: no other process can contend for the data, so there is no transactional nature to it as such, and no need for locking or rolling back.
The language you use makes me worry you think there are more guarantees than there are. It's important to remember that at the fundamental message send and receive level (without any extra layer on top that OTP might provide or you might write yourself) you are sending messages without any guarantee they will be delivered, or that the process you are sending to even exists.

A data buffer to subscribe and unsubscribe with real-time data using RabbitMQ

Basically, I want to create a data buffer that a client could occasionally subscribe to, get all of the data from the recent past, keep listening for real-time data, then unsubscribe after some time, and repeat.
I'm thinking of using a RabbitMQ queue with a TTL so that messages expire. The idea is for a client to occasionally subscribe to and unsubscribe from it. When the client subscribes to the queue, it should fetch all available messages on the queue, then stay on the channel to have real-time data pushed to it.
Is this a good way to go about it? I know how to pub/sub with RabbitMQ, but how do I make it push all of the data on the queue every time a client subscribes?
It depends on how much data you are talking about. The drawback of your method is that the queue could fill up with a large amount of data if the data rate is high and the TTL is set for a long time. You also have to keep the queue alive, and you must have one queue alive from the start for every possible subscriber.
I would suggest the Recent History Exchange, perhaps modifying it so that it holds more messages.
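If you do stay with the per-subscriber TTL queue idea from the question, a minimal sketch with the Bunny gem might look like this (the queue name, the 10-minute message TTL, and the 30-minute queue expiry are placeholders; the Recent History Exchange alternative requires the community plugin):

require 'bunny'

conn = Bunny.new            # assumes a local RabbitMQ with default credentials
conn.start
ch = conn.create_channel

# Buffer queue: messages expire after 10 minutes, and the queue itself is
# removed 30 minutes after the last consumer disconnects.
q = ch.queue('recent-data-buffer',
             durable: true,
             arguments: { 'x-message-ttl' => 600_000, 'x-expires' => 1_800_000 })

# Subscribing delivers whatever is still buffered, then keeps streaming new messages.
consumer = q.subscribe(manual_ack: false) do |_delivery, _props, payload|
  puts "received: #{payload}"
end

# Later, to "unsubscribe" for a while:
consumer.cancel

Keep in mind that messages consumed from a queue are gone for everyone else, which is why the answer above notes you would need one such queue per possible subscriber.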
