I am a Ruby on Rails developer. I am using RabbitMQ in my project to process data as soon as it arrives in a queue. I am using the bunny gem, a RabbitMQ client that provides an interface for interacting with RabbitMQ.
My issue is that whenever an exception occurs or the server stops unexpectedly while processing data from the queue, my message from the queue is lost.
I want to know how people deal with lost messages from a RabbitMQ queue. Is there any way to get those messages back for processing?
There is no way to get the messages back when they're lost. Maybe you could try and track down some entries in RMQ's database cache - but that's just a wild guess/long shot and I don't think that it will help.
What you do need to do for the future is:
If you are using a single server: make the queues durable and publish the messages as persistent, and explicitly acknowledge messages on the consumer side (so switch off the auto-ACK flag) only once they're processed.
If you are using a cluster of RMQ nodes (which is of course recommended exactly to avoid these situations): set up queue mirroring.
Take a look at RMQ persistence and high availability.
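For the single-server case, here is a minimal sketch of what that could look like with the bunny gem. The queue name "work.items", the broker URL, and the process_payload method are placeholders I made up for illustration; the Bunny calls (durable: true, persistent: true, manual_ack: true, ack, reject) are the standard ones, but treat the wiring as a starting point rather than a finished worker.

```ruby
# Sketch only. "work.items", the URL and process_payload are hypothetical.

# Stand-in for the application's real processing; raises on bad input.
def process_payload(body)
  raise ArgumentError, "empty payload" if body.nil? || body.empty?
  body.upcase
end

# Handle one delivery: ack only after processing succeeded, otherwise
# reject with requeue so the broker hands the message out again.
def handle_delivery(channel, delivery_info, body)
  process_payload(body)
  channel.ack(delivery_info.delivery_tag)
  :acked
rescue StandardError
  channel.reject(delivery_info.delivery_tag, true) # true = requeue
  :requeued
end

# Wiring against a real broker. The gem is required here so that the
# handler above can be exercised without bunny installed.
def run_worker(url = "amqp://guest:guest@localhost:5672")
  require "bunny"
  conn = Bunny.new(url)
  conn.start
  ch = conn.create_channel
  ch.prefetch(1)                                # one unacked message at a time
  q = ch.queue("work.items", durable: true)     # queue survives a broker restart
  # Publisher side: persistent messages are written to disk by the broker.
  ch.default_exchange.publish("hello", routing_key: q.name, persistent: true)
  # Consumer side: manual_ack: true switches off automatic acknowledgement.
  q.subscribe(manual_ack: true, block: true) do |delivery_info, _properties, body|
    handle_delivery(ch, delivery_info, body)
  end
ensure
  conn&.close
end
```

If the worker process dies before it acks, the broker requeues the unacknowledged message once the connection is gone. Note that requeueing on every exception will loop forever on a message that can never succeed, so a real worker would probably cap retries or dead-letter such messages.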
I need to handle a time-consuming and error-prone task (e.g., invoking a SOAP endpoint that will trigger the delivery of an SMS) whenever a given endpoint of my REST API is invoked, but I'd prefer not to make my users wait for that before sending a response back. Spring AMQP is already part of my stack, so I thought about leveraging it to establish a "work queue" and have a number of worker processes consuming from the queue and taking care of the "work units". I have, however, the following requirements:
A work unit is guaranteed to be delivered, and delivered to exactly one worker.
Should a work unit fail to be completed for any reason, it must be placed back in the queue so that another worker can pick it up later.
Work units survive server reboots and crashes. This is mandatory because I won't be using a DB of any kind to store them.
I know RabbitMQ and Spring AMQP can be configured in such a way that ensures these three requirements, but I've only ever used it to achieve RPC so I don't know much about anything other than that. Is there any example I might follow? What are some of the pitfalls to watch out for?
When creating queues, RabbitMQ gives you two options: transient or durable. A durable queue survives a broker restart, and messages published to it as persistent will be available until you acknowledge them. Messages won't expire unless you give the queue a TTL. For starters, you can enable the RabbitMQ management plugin and play around a little.
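To make the publishing side concrete, here is a rough sketch in Ruby with the bunny gem rather than Spring AMQP (the same broker features are involved either way). It declares a durable queue, publishes a persistent message, and waits for a publisher confirm. The method names and the queue name are mine, and it assumes the channel was opened once with confirms enabled.

```ruby
# Sketch only: Ruby with the bunny gem, not Spring AMQP. Method and queue
# names are hypothetical.

# Open a channel with publisher confirms enabled (done once per channel).
def open_confirming_channel(connection)
  channel = connection.create_channel
  channel.confirm_select
  channel
end

# Publish one work unit as a persistent message to a durable queue and wait
# for the broker's confirm. Returns true when the broker confirmed it.
def publish_work_unit(channel, queue_name, payload)
  queue = channel.queue(queue_name, durable: true)
  channel.default_exchange.publish(payload, routing_key: queue.name, persistent: true)
  channel.wait_for_confirms
end
```

With a started Bunny connection this would be used as publish_work_unit(open_confirming_channel(conn), "sms.requests", body). If it returns false, the caller can retry or report the failure instead of assuming the work unit is safe.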
But if you really want to guarantee the safety of your messages against hard resets or hardware problems, I guess you need to use a RabbitMQ cluster.
See RabbitMQ Clustering; you can find the high availability topic on the right side of that page.
This guy explains how to cluster.
By the way, I like beanstalkd too. You can make it write messages to disk, and they will be safe except in the case of disk failure.
I have a system that wraps RabbitMQ using Erlang and the Erlang client. We have the occasional situation where a subscriber goes down and messages queue up. We will be implementing a dead-letter queue in the near future, but I would like to implement a tool in the meantime to bind to a given queue and PULL all messages. I can then push them off somewhere else and replay them when the subscriber comes back online. However, I am having a hard time determining the best way to do this with the Rabbit tutorials/docs, mainly because the tutorials are a bit lacking for Erlang clients.
Does anybody have experience with this or something similar?
I think the best thing to do is to declare the queue so that it does not auto-delete. That way the queue will stay alive when the subscriber goes down. The exchange will continue to push messages to the queue, which will store them until the subscriber comes back up and starts reading again.
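If you still want the pull-everything tool from the question, the idea can be sketched as below. This is Ruby with the bunny gem rather than the Erlang client, so take it as an illustration of the flow only: fetch one message at a time with an explicit ack, store it, and ack only after it has been stored. The drain_queue name and the sink argument are made up for the example.

```ruby
# Sketch only: Ruby/bunny instead of the Erlang client; drain_queue and
# sink are hypothetical names.

# Pull every message currently in the queue into `sink` (anything that
# responds to <<). Each message is acked only after it has been stored,
# so a crash mid-drain leaves the remaining messages in the queue.
def drain_queue(channel, queue, sink)
  count = 0
  loop do
    delivery_info, _properties, payload = queue.pop(manual_ack: true)
    break if delivery_info.nil? # pop returns nils when the queue is empty
    sink << payload
    channel.ack(delivery_info.delivery_tag)
    count += 1
  end
  count
end
```

With a real connection, queue would come from channel.queue(name, durable: true) and sink could be an open file or an array. Replaying is then a matter of publishing the stored payloads back once the subscriber is online.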
We have an existing API where a client asks our server for information that we have to get from another external server. When the external server takes a long time, say 10 seconds, it holds up a Rails passenger instance for that whole 10 seconds.
Is there some way to pass the rendering of our reply to delayed_job so that I can free up the Rails instance?
NOTE: Ideally, we would just update our API and reply to our API client that we are busy and to try back again in a few seconds to see if we are ready. However, there are already thousands of clients out there and changing them is not practical at this time.
The usual way to handle this is to queue up the job and return immediately, then poll or use some async notification framework like Pusher or Faye to update the remote client. You definitely cannot pass the connection to DJ as you describe. Another avenue you might investigate is using EventMachine to handle it, à la http://railstips.org/blog/archives/2011/05/04/eventmachine-and-passenger/. A third alternative would be to precache the data from the remote web service, but that is an avenue very dependent on what you're doing (authorization, for example, is not something you could do there).
The bottom line is that you're dealing with a bit of an architecture issue. If you absolutely have to talk to the remote service AND output the results in the request cycle, there's not a lot you can do about it short of changing to a more evented backend like EventMachine or Node.js.
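For what it's worth, the queue-and-poll pattern mentioned first looks roughly like this. It is a toy, in-memory sketch using plain Ruby threads, not delayed_job; the JobStore class and its methods are invented for the example, and it only helps for clients that can be changed to poll.

```ruby
require "securerandom"

# Sketch only: an in-memory job registry. A real application would use
# delayed_job (or similar) with a persistent store instead of threads.
class JobStore
  def initialize
    @jobs = {}
    @threads = {}
    @lock = Mutex.new
  end

  # Start the slow work in the background and return a job id right away,
  # so the request that triggered it can respond immediately.
  def enqueue(&work)
    id = SecureRandom.hex(8)
    @lock.synchronize { @jobs[id] = { status: :pending, result: nil } }
    @threads[id] = Thread.new do
      begin
        result = work.call
        @lock.synchronize { @jobs[id] = { status: :done, result: result } }
      rescue StandardError => e
        @lock.synchronize { @jobs[id] = { status: :failed, result: e.message } }
      end
    end
    id
  end

  # What a "status" endpoint would return when the client polls.
  def poll(id)
    @lock.synchronize { @jobs.fetch(id).dup }
  end

  # Block until the job has finished, then return its final state.
  def wait(id)
    @threads.fetch(id).join
    poll(id)
  end
end
```

A controller action would call enqueue and render the id; a second action would call poll with that id until the status is no longer :pending.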
I'm working on a Rails application that periodically needs to perform large numbers of IO-bound operations. These operations can be performed asynchronously. For example, once per day, for each user, the system needs to query Salesforce.com to fetch the user's current list of accounts (companies) that he's tracking. This results in huge numbers (potentially > 100k) of small queries.
Our current approach is to use ActiveMQ with ActiveMessaging. Each of our users is pushed onto a queue as a different message. Then, the consumer pulls the user off the queue, queries Salesforce.com, and processes the results. But this approach gives us horrible performance. Within a single poller process, we can only process a single user at a time. So, the Salesforce.com queries become serialized. Unless we run literally hundreds of poller processes, we can't come anywhere close to saturating the server running the poller.
We're looking at EventMachine as an alternative. It has the advantage of allowing us to kick off large numbers of Salesforce.com queries concurrently within a single EventMachine process. So, we get great parallelism and utilization of our server.
But there are two problems with EventMachine. 1) We lose the reliable message delivery we had with ActiveMQ/ActiveMessaging. 2) We can't easily restart our EventMachine processes periodically to lessen the impact of memory growth. For example, with ActiveMessaging, we have a cron job that restarts the poller once per day, and this can be done without worrying about losing any messages. But with EventMachine, if we restart the process, we could literally lose hundreds of messages that were in progress. The only way I can see around this is to build a persistence/reliable delivery layer on top of EventMachine.
Does anyone have a better approach? What's the best way to reliably execute large numbers of asynchronous IO-bound operations?
I maintain ActiveMessaging, and have been thinking about the issues of a multi-threaded poller also, though perhaps not at the same scale you guys are. I'll give you my thoughts here, but am also happy to discuss further on the ActiveMessaging list, or via email if you like.
One trick is that the poller is not the only serialized part of this. STOMP subscriptions, if you do client -> ack in order to prevent losing messages on interrupt, will only get sent a new message on a given connection when the prior message has been ack'd. Basically, you can only have one message being worked on at a time per connection.
So to keep using a broker, the trick will be to have many broker connections/subscriptions open at once. The current poller is pretty heavy for this, as it loads up a whole Rails env per poller, and one poller is one connection. But there is nothing magical about the current poller; I could imagine writing a poller as an EventMachine client that is implemented to create new connections to the broker and get many messages at once.
In my own experiments lately, I have been thinking about using Ruby Enterprise Edition and having a master process that forks many poller worker processes so as to get the benefit of the reduced memory footprint (much like Passenger does), but I think the EM trick could work as well.
I am also an admirer of the Resque project, though I do not know that it would be any better at scaling to many workers - I think the workers might be lighter weight.
http://github.com/defunkt/resque
I've used AMQP with RabbitMQ in a way that would work for you. Since ActiveMQ implements AMQP, I imagine you can use it in a similar way. I have not used ActiveMessaging, which although it seems like an awesome package, I suspect may not be appropriate for this use case.
Here's how you could do it, using AMQP:
Have Rails process send a message saying "get info for user i".
The consumer pulls this off the message queue, making sure to specify that the message requires an 'ack' to be permanently removed from the queue. This means that if the message is not acknowledged as processed, it is returned to the queue for another worker eventually.
The worker then spins off the message into the thousands of small requests to SalesForce.
When all of these requests have successfully returned, another callback should be fired to ack the original message and return a "summary message" that has all the info germane to the original request. The key is using a message queue that lets you acknowledge successful processing of a given message, and making sure to do so only when relevant processing is complete.
Another worker pulls that message off the queue and performs whatever synchronous work is appropriate. Since all the latency-inducing bits have already been performed, I imagine this should be fine.
If you're using (C)Ruby, try to never combine synchronous and asynchronous stuff in a single process. A process should either do everything via EventMachine, with no code blocking, or only talk to an EventMachine process via a message queue.
Also, asynchronous code is incredibly useful, but it is also difficult to write, difficult to test, and bug-prone. Be careful. Investigate using another language or tool if appropriate.
Also check out "cramp" and "beanstalk".
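The steps above can be sketched roughly as follows. To keep the example small it uses plain Ruby threads for the fan-out instead of EventMachine, and the channel is expected to behave like a Bunny channel; process_user_message, the fetcher argument, and the "user.summaries" routing key are all hypothetical names. The point it illustrates is only the ordering: the original message is acked after every sub-request has returned and the summary has been published.

```ruby
# Sketch only: threads stand in for EventMachine, and all names here
# (process_user_message, fetcher, "user.summaries") are hypothetical.

# Fan one "get info for user" message out into many small requests, then
# publish a summary and ack the original message only when all succeeded.
# `fetcher` is any callable that performs one small remote query.
def process_user_message(channel, delivery_info, account_ids, fetcher)
  threads = account_ids.map do |account_id|
    Thread.new do
      Thread.current.report_on_exception = false # failures surface via #value
      fetcher.call(account_id)
    end
  end
  results = threads.map(&:value) # re-raises if any sub-request failed

  channel.default_exchange.publish(results.join(","),
                                   routing_key: "user.summaries",
                                   persistent: true)
  channel.ack(delivery_info.delivery_tag) # only now is the work unit done
  results
rescue StandardError
  # Not acked: hand the message back so another worker can retry it later.
  channel.reject(delivery_info.delivery_tag, true)
  nil
end
```

Because the summary is published before the ack, a crash between the two can produce a duplicate summary on redelivery, so the downstream worker should tolerate seeing the same user twice.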
Someone sent me the following link: http://github.com/mperham/evented/tree/master/qanat/. This is a system that's somewhat similar to ActiveMessaging except that it is built on top of EventMachine. It's almost exactly what we need. The only problem is that it seems to only work with Amazon's queue, not ActiveMQ.