I have a web app which has a single long running task - generating a PDF report. Various graphs are generated, and it takes about 15 sec to process in all. The report is generated by a user.
Processing the report at the time of request currently causes the process to be tied up, and more importantly (given that use of this website is not heavy) sometimes the request times out.
I am therefore redesigning the architecture of this section of the app (Rails 2.3.8). To put this in context, it's unlikely that more than a couple of these reports will be generated per day, and this is an extremely niche application, so significant further scaling isn't a major concern. I do intend to hand off the project in the future though, so stability is.
The most obvious solution I think is to use Spawn to generate a report, and fire a download link to the user in an email once it's complete. Another solution I've looked into is DelayedJob.
Can anyone who's done something similar recommend one approach over another?
delayed_job, or some other queueing mechanism is going to be the easiest thing to set up. With delayed_job you would just enqueue your worker instead of creating the PDF, and a background process on the server would be working from the queue doing whatever work was available. Using spawn to fork your whole process seems a little heavy-handed, and doesn't seem to lend itself well to other minor, but still longer running tasks (like sending emails).
Related
I've currently got a ruby on rails app hosted on Heroku that I'm monitoring with New Relic. My app is somewhat laggy when using it, and my New Relic monitor shows me the following:
Given that majority of the time is spent in Request Queuing, does this mean my app would scale better if I used an extra worker dynos? Or is this something that I can fix by optimizing my code? Sorry if this is a silly question, but I'm a complete newbie, and appreciate all the help. Thanks!
== EDIT ==
Just wanted to make sure I was crystal clear on this before having to shell out additional moolah. So New Relic also gave me the following statistics on the browser side as you can see here:
This graph shows that majority of the time spent by the user is in waiting for the web application. Can I attribute this to the fact that my app is spending majority of its time in a requesting queue? In other words that the 1.3 second response time that the end user is experiencing is currently something that code optimization alone will do little to cut down? (Basically I'm asking if I have to spend money or not) Thanks!
Request Queueing basically means 'waiting for a web instance to be available to process a request'.
So the easiest and fastest way to gain some speed in response time would be to increase the number of web instances to allow your app to process more requests faster.
It might be posible to optimize your code to speed up each individual request to the point where your application can process more requests per minute -- which would pull requests off the queue faster and reduce the overall request queueing problem.
In time, it would still be a good idea to do everything you can to optimize the code anyway. But to begin with, add more workers and your request queueing issue will more than likely be reduced or disappear.
edit
with your additional information, in general I believe the story is still the same -- though nice work in getting to a deep understanding prior to spending the money.
When you have request queuing it's because requests are waiting for web instances to become available to service their request. Adding more web instances directly impacts this by making more instances available.
It's possible that you could optimize the app so well that you significantly reduce the time to process each request. If this happened, then it would reduce request queueing as well by making requests wait a shorter period of time to be serviced.
I'd recommend giving users more web instances for now to immediately address the queueing problem, then working on optimizing the code as much as you can (assuming it's your biggest priority). And regardless of how fast you get your app to respond, if your users grow you'll need to implement more web instances to keep up -- which by the way is a good problem since your users are growing too.
Best of luck!
I just want to throw this in, even though this particular question seems answered. I found this blog post from New Relic and the guys over at Engine Yard: Blog Post.
The tl;dr here is that Request Queuing in New Relic is not necessarily requests actually lining up in the queue and not being able to get processed. Due to how New Relic calculates this metric, it essentially reads a time stamp set in a header by nginx and subtracts it from Time.now when the New Relic method gets a hold of it. However, New Relic gets run after any of your code's before_filter hooks get called. So, if you have a bunch of computationally intensive or database intensive code being run in these before_filters, it's possible that what you're seeing is actually request latency, not queuing.
You can actually examine the queue to see what's in there. If you're using Passenger, this is really easy -- just type passenger status on the command line. This will show you a ton of information about each of your Passenger workers, including how many requests are sitting in the queue. If you run with preceded with watch, the command will execute every 2 seconds so you can see how the queue changes over time (so just execute watch passenger status).
For Unicorn servers, it's a little bit more difficult, but there's a ruby script you can run, available here. This script actually examines how many requests are sitting in the unicorn socket, waiting to be picked up by workers. Because it's examining the socket itself, you shouldn't run this command any more frequently than ~3 seconds or so. The example on GitHub uses 10.
If you see a high number of queued requests, then adding horizontal scaling (via more web workers on Heroku) is probably an appropriate measure. If, however, the queue is low, yet New Relic reports high request queuing, what you're actually seeing is request latency, and you should examine your before_filters, and either scope them to only those methods that absolutely need them, or work on optimizing the code those filters are executing.
I hope this helps anyone coming to this thread in the future!
Should verification emails be processed / sent in the background using some kind of background job (Resque, Delayed_Job, etc)?
The app does have a tendency to hang until a verification email is sent. However, it only hangs for a split second or two. Don't know if its worth sending over as a background task.
What are your thoughts?
The main thing to keep in mind is that this delay isn't affecting just the current user, it's also affecting other users because it's holding up the Rails process (the exact effects of this will depend on the web server you're using and your setup).
If this application doesn't have a lot of users (and won't in the future) and you don't have any background job processing at the moment, then it may not be worth adding it. Otherwise it's probably a good idea.
If it's a few split seconds then I think that you can live with it. But if it takes longer then you may move it to delayed_job unless your delayed_job is always clogged and you are too much into background jobs. In that case it may take a bit longer to send the email to user and may ruin user experience. Even in that case you could set a higher priority for sending the verification emails.
I have no doubts about the capability of delayed_job and have been personally using it for quite some time now. Even Github recommends it while releasing resque:
https://github.com/blog/542-introducing-resque
We need a background job system as serious as our web framework. I highly recommend DelayedJob to anyone whose site is not 50% background work.
When a background job starts, it's sent to the back of a queue where a worker handles it; a task clears and the other starts. I think I've got this one right except I don't understand the practical side of it in some cases. Sure, if you're a company sending out 15,000 newsletters once a week using a delayed job makes perfect sense. But when you have an application of even 100 users, in which some task is long enough to need background work (like sending/fetching emails that might take a minute) then each user will have to wait in line while another user gets cleared (in case there's a single worker).
This is the part I'm not sure I'm getting right. I'm talking about the same job, but individually for each user. Does that count as a job per user? If I have 100 users, do I need to keep 100 workers for each one's process to not get tied up?
I've tried using delayed_job to simulate that, and indeed when I sign in with a different account I have to wait until another user's email gets sent until mine is. While the plugin is swift and simple to work with, I think it's not the right approach here.
I've also tried using Ajax, but since it's an HTTP request it ties up the browser in loading mode until it gets a response from the server (even with async: true). Not sure if I ruled this one out too quickly, but I was sortof looking for a more elegant server solution.
Is there a way to achieve a background job like this? (I've heard of different, mostly commercial solutions promising little waiting time, but I'm interested in completely eliminating the queue between users). If not, is there a method to make an ajax request without waiting for a response? I realize my questions are both drastically different but both seem like an appropriate solution to this problem.
Resque is a background processing engine that can support multiple queues.
Ways you could use this:
Group your tasks into queues that make sense on their priority. If you need fast response times, use it in a 'foreground' queue. Slow? (like sending/receiving emails) can be in the 'background' queue
Have one queue per user (you will need to have many many workers for this)
This SO question also gives a way to use delayed_jobs with multiple queues/tables
The purpose of delayed_job and other message queues is to asynchronously process jobs outside of your core application. I always use a queue for sending email since I'm relying on an outside application (sometimes a third-party API like gmail) to send them and I can't guarantee available and operating efficiency.
So for your use case, even with very few users, I highly recommend offloading emails to delayed_job. This will speed up your front end (ajax) and will also give you retries upon failure. You could spin up multiple workers to process the queue, but it shouldn't be necessary with your numbers unless your calls to send mail are taking a really long time (more than a couple seconds?).
And yes in most situations I'd create separate jobs for each user even though the message might be identical. The only time I'd process them all together would be if the email application / API has bulk sending and you can reduce the number of calls significantly by sending a large payload in a few calls.
I'm working on a Rails application that periodically needs to perform large numbers of IO-bound operations. These operations can be performed asynchronously. For example, once per day, for each user, the system needs to query Salesforce.com to fetch the user's current list of accounts (companies) that he's tracking. This results in huge numbers (potentially > 100k) of small queries.
Our current approach is to use ActiveMQ with ActiveMessaging. Each of our users is pushed onto a queue as a different message. Then, the consumer pulls the user off the queue, queries Salesforce.com, and processes the results. But this approach gives us horrible performance. Within a single poller process, we can only process a single user at a time. So, the Salesforce.com queries become serialized. Unless we run literally hundreds of poller processes, we can't come anywhere close to saturating the server running poller.
We're looking at EventMachine as an alternative. It has the advantage of allowing us to kickoff large numbers of Salesforce.com queries concurrently within a single EventMachine process. So, we get great parallelism and utilization of our server.
But there are two problems with EventMachine. 1) We lose the reliable message delivery we had with ActiveMQ/ActiveMessaging. 2) We can't easily restart our EventMachine's periodically to lessen the impact of memory growth. For example, with ActiveMessaging, we have a cron job that restarts the poller once per day, and this can be done without worrying about losing any messages. But with EventMachine, if we restart the process, we could literally lose hundreds of messages that were in progress. The only way I can see around this is to build a persistance/reliable delivery layer on top of EventMachine.
Does anyone have a better approach? What's the best way to reliably execute large numbers of asynchronous IO-bound operations?
I maintain ActiveMessaging, and have been thinking about the issues of a multi-threaded poller also, though not perhaps at the same scale you guys are. I'll give you my thoughts here, but am also happy to discuss further o the active messaging list, or via email if you like.
One trick is that the poller is not the only serialized part of this. STOMP subscriptions, if you do client -> ack in order to prevent losing messages on interrupt, will only get sent a new message on a given connection when the prior message has been ack'd. Basically, you can only have one message being worked on at a time per connection.
So to keep using a broker, the trick will be to have many broker connections/subscriptions open at once. The current poller is pretty heavy for this, as it loads up a whole rails env per poller, and one poller is one connection. But there is nothing magical about the current poller, I could imagine writing a poller as an event machine client that is implemented to create new connections to the broker and get many messages at once.
In my own experiments lately, I have been thinking about using Ruby Enterprise Edition and having a master thread that forks many poller worker threads so as to get the benefit of the reduced memory footprint (much like passenger does), but I think the EM trick could work as well.
I am also an admirer of the Resque project, though I do not know that it would be any better at scaling to many workers - I think the workers might be lighter weight.
http://github.com/defunkt/resque
I've used AMQP with RabbitMQ in a way that would work for you. Since ActiveMQ implements AMQP, I imagine you can use it in a similar way. I have not used ActiveMessaging, which although it seems like an awesome package, I suspect may not be appropriate for this use case.
Here's how you could do it, using AMQP:
Have Rails process send a message saying "get info for user i".
The consumer pulls this off the message queue, making sure to specify that the message requires an 'ack' to be permanently removed from the queue. This means that if the message is not acknowledged as processed, it is returned to the queue for another worker eventually.
The worker then spins off the message into the thousands of small requests to SalesForce.
When all of these requests have successfully returned, another callback should be fired to ack the original message and return a "summary message" that has all the info germane to the original request. The key is using a message queue that lets you acknowledge successful processing of a given message, and making sure to do so only when relevant processing is complete.
Another worker pulls that message off the queue and performs whatever synchronous work is appropriate. Since all the latency-inducing bits have already performed, I imagine this should be fine.
If you're using (C)Ruby, try to never combine synchronous and asynchronous stuff in a single process. A process should either do everything via Eventmachine, with no code blocking, or only talk to an Eventmachine process via a message queue.
Also, writing asynchronous code is incredibly useful, but also difficult to write, difficult to test, and bug-prone. Be careful. Investigate using another language or tool if appropriate.
also checkout "cramp" and "beanstalk"
Someone sent me the following link: http://github.com/mperham/evented/tree/master/qanat/. This is a system that's somewhat similar to ActiveMessaging except that it is built on top of EventMachine. It's almost exactly what we need. The only problem is that it seems to only work with Amazon's queue, not ActiveMQ.
I have a Rails application that unfortunately after a request to a controller, has to do some crunching that takes awhile. What are the best practices in Rails for providing feedback or progress on a long running task or request? These controller methods usually last 60+ seconds.
I'm not concerned with the client side... I was planning on having an Ajax request every second or so and displaying a progress indicator. I'm just not sure on the Rails best practice, do I create an additional controller? Is there something clever I can do? I want answers to focus on the server side using Rails only.
Thanks in advance for your help.
Edit:
If it matters, the http request are for PDFs. I then have Rails in conjunction with Ruport generate these PDFs. The problem is, these PDFs are very large and contain a lot of data. Does it still make sense to use a background task? Let's assume an average PDF takes about one minute to two minutes, will this make my Rails application unresponsive to any other server request during this time?
Edit 2:
Ok, after further investigation, it seems my Rails application is indeed unresponsive to any other HTTP requests after a request comes in for a large PDF. So, I guess the question now becomes: What is the best threading/background mechanism to use? It must be stable and maintained. I'm very surprised Rails doesn't have something like this built in.
Edit 3:
I have read this page: http://wiki.rubyonrails.org/rails/pages/HowToRunBackgroundJobsInRails. I would love to read about various experiences with these tools.
Edit 4:
I'm using Passenger Phusion "modrails", if it matters.
Edit 5:
I'm using Windows Vista 64 bit for my development machine; however, my production machine is Ubuntu 8.04 LTS. Should I consider switching to Linux for my development machine? Will the solutions presented work on both?
The Workling plugin allow you to schedule background tasks in a queue (they would perform the lengthy task). As of version 0.3 you can ask a worker for its status, this would allow you to display some nifty progress bars.
Another cool feature with Workling is that the asynchronous backend can be switched: you can used DelayedJobs, Spawn (classic fork), Starling...
I have a very large volume site that generates lots of large CSV files. These sometimes take several minutes to complete. I do the following:
I have a jobs table with details of the requested file. When the user requests a file, the request goes in that table and the user is taken to a "jobs status" page that lists all of their jobs.
I have a rake task that runs all outstanding jobs (a class method on the Job model).
I have a separate install of rails on another box that handles these jobs. This box just does jobs, and is not accessible to the outside world.
On this separate box, a cron job runs all outstanding jobs every 60 seconds, unless jobs are still running from the last invocation.
The user's job status page auto-refreshes to show the status of the job (which is updated by the jobs box as the job is started, running, then finished). Once the job is done, a link appears to the results file.
It may be too heavy-duty if you just plan to have one or two running at a time, but if you want to scale... :)
Calling ./script/runner in the background worked best for me. (I was also doing PDF generation.) It seems like the lowest common denominator, while also being the simplest to implement. Here's a write-up of my experience.
A simple solution that doesn't require any extra Gems or plugins would be to create a custom Rake task for handling the PDF generation. You could model the PDF generation process as a state machine with states such as submitted, processing and complete that are stored in the model's database table. The initial HTTP request to the Rails application would simply add a record to the table with a submitted state and return.
There would be a cron job that runs your custom Rake task as a separate Ruby process, so the main Rails application is unaffected. The Rake task can use ActiveRecord to find all the models that have the submitted state, change the state to processing and then generate the associated PDFs. Finally, it should set the state to complete. This enables your AJAX calls within the Rails app to monitor the state of the PDF generation process.
If you put your Rake task within your_rails_app/lib/tasks then it has access to the models within your Rails application. The skeleton of such a pdf_generator.rake would look like this:
namespace :pdfgenerator do
desc 'Generates PDFs etc.'
task :run => :environment do
# Code goes here...
end
end
As noted in the wiki, there are a few downsides to this approach. You'll be using cron to regularly create a fairly heavyweight Ruby process and the timing of your cron jobs would need careful tuning to ensure that each one has sufficient time to complete before the next one comes along. However, the approach is simple and should meet your needs.
This looks quite an old thread. However, what I have down in my app, which required to run multiple Countdown Timers for different pages, was to use Ruby Thread. The timer must continue running even if the page was closed by users. Ruby makes it easy to write multi-threaded programs with the Thread class. Ruby threads are a lightweight and efficient way to achieve parallelism in your code. I hope this will help other wanderers who is looking to achieve background: parallelism/concurrent services in their app. Likewise Ajax makes it a lot easier to call a specific Rails [custom] action every second.
This really does sound like something that you should have a background process running rather than an application instance(passenger/mongrel whichever you use) as that way your application can stay doing what it's supposed to be doing, serving requests, while a background task of some kind, Workling is good, handles the number crunching. I know that this doesn't deal with the issue of progress, but unless it is absolutely essential I think that is a small price to pay.
You could have a user click the action required, have that action pass the request to the Workling queue, and have it send some kind of notification to the user when it is completed, maybe an email or something. I'm not sure about the practicality of that, just thinking out loud, but my point is that it really seems like that should be a background task of some kind.
I'm using Windows Vista 64 bit for my
development machine; however, my
production machine is Ubuntu 8.04 LTS.
Should I consider switching to Linux
for my development machine? Will the
solutions presented work on both?
Have you considered running Linux in a VM on top of Vista?
I recommend using Resque gem with it's resque-status plug-in for your heavy background processes.
Resque
Resque is a Redis-backed Ruby library for creating background jobs,
placing them on multiple queues, and processing them later.
Resque-status
resque-status is an extension to the resque queue system that provides
simple trackable jobs.
Once you run a job on a Resque worker using resque-status extension, you will be able to get info about your ongoing progresses and ability to kill a specific process very easily. See examples:
status.pct_complete #=> 0
status.status #=> 'queued'
status.queued? #=> true
status.working? #=> false
status.time #=> Time object
status.message #=> "Created at ..."
Also resque and resque-status has a cool web interface to interact with your jobs which is so cool.
There is the brand new Growl4Rails ... that is for this specific use case (among others as well).
http://www.writebetterbits.com/2009/01/update-to-growl4rails.html
I use Background Job (http://codeforpeople.rubyforge.org/svn/bj/trunk/README) to schedule tasks. I am building a small administration site that allows Site Admins to run all sorts of things you and I would run from the command line from a nice web interface.
I know you said you were not worried about the client side but I thought you might find this interesting: Growl4Rails - Growl style notifications that were developed for pretty much what you are doing judging by the example they use.
I've used spawn before and definitely would recommend it.
Incredibly simple to set up (which many other solutions aren't), and works well.
Check out BackgrounDRb, it is designed for exactly the scenario you are describing.
I think it has been around for a while and is pretty mature. You can monitor the status of the workers.
It's a pretty good idea to develop on the same development platform as your production environment, especially when working with Rails. The suggestion to run Linux in a VM is a good one. Check out Sun xVM for Open Source virtualization software.
I personally use active_messaging plugin with a activemq server (stomp or rest protocol). This has been extremely stable for us, processing millions of messages a month.