Polling, web sockets, or comet on a rails app - ruby-on-rails

I'm trying to determine the best way to go about doing something for a project I have where I rely on an external API/service which takes ~2.5-4 seconds for a reply.
Currently I'm using JavaScript to load the API data after the DOM has loaded, then jQuery updates a partial on the page. Pretty as my loader is, the request still locks up a server process, so I'd like to move it out into a Heroku worker using delayed_job or something similar. The info from the API is user-specific, so it is not something that could live in a cron job.
The data I'm pulling only needs to be updated every few hours and is recorded locally in the DB, so I'm guessing an all-out web socket such as the one provided by Pusherapp.com would be overkill?
I'm leaning towards polling using delayed_job and waiting for a status update to determine its completeness. Has anyone done this with delayed_job? Hints or caveats?
Thanks

Yeah you can definitely do something like that with delayed_job... but ultimately it sounds like you need something cron-like for scheduling, right? Alternatively, can't you use cron on heroku to just run a rake task every couple of hours?
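The "refresh every few hours, poll for completion" idea from the question can be sketched in plain Ruby. This is only an illustration under stated assumptions: CachedFeed, REFRESH_INTERVAL, request_refresh and perform_refresh are hypothetical names, the array stands in for a delayed_job queue, and none of this is delayed_job's actual API.

```ruby
# Hypothetical in-memory stand-in for a DB row holding a user's cached API data.
CachedFeed = Struct.new(:user_id, :payload, :fetched_at, :status)

# Assumption: "every few hours" is taken to mean three hours here.
REFRESH_INTERVAL = 3 * 60 * 60 # seconds

# True when the row has never been fetched or is older than the interval.
def stale?(feed, now = Time.now)
  feed.fetched_at.nil? || (now - feed.fetched_at) > REFRESH_INTERVAL
end

# What the page-load action would do: mark the row pending and hand the slow
# call to a queue instead of calling the external API inline.
def request_refresh(feed, queue, now = Time.now)
  return :fresh unless stale?(feed, now)
  return :pending if feed.status == :pending # already queued, don't enqueue twice

  feed.status = :pending
  queue << feed
  :pending
end

# What the worker would do for each queued row. The block stands in for the
# slow external API call.
def perform_refresh(feed, now = Time.now)
  feed.payload = yield(feed.user_id)
  feed.fetched_at = now
  feed.status = :complete
  feed
end

queue = []
feed = CachedFeed.new(42, nil, nil, :idle)
request_refresh(feed, queue)                                  # row is queued
perform_refresh(queue.shift) { |id| "data for user #{id}" }   # worker side
request_refresh(feed, queue)                                  # data is now fresh
```

The page's Ajax poll would then only need to read the status field until it changes from pending to complete, which never touches the external API from a web process.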

Related

Background process in Rails 3

I am writing a Web app that will need to run a background process that will poll a web service every minute or so and then query my Rails db and send out alerts to users of the app via Twitter. I have researched this a lot but I feel I am just going around in circles. I have come across delayed_job, background_job and a few other options like creating a custom daemon suggested in a Railscast. Does anyone have any suggestions for the best way to do this? The process will have to run constantly in the background and won't be triggered by an event in the front end. Any help or guidance would be appreciated.
Why don't you just create a rake task and add it to your CRON execution?
You can even use Whenever to configure this for you.
I used Beanstalkd for this and can recommend it.
http://railscasts.com/episodes/243-beanstalkd-and-stalker
You can simply use cron for tasks that have to be executed every X minutes, hours, etc.
The whenever gem is useful for setting this up with Rails: https://github.com/javan/whenever
I don't know much about delayed_job. But you can check out some tutorials, for example this article on heroku: http://devcenter.heroku.com/articles/delayed-job
I used delayed_job for our application.
While working on this, we researched many sites and were finally able to get it working.
We wrote up our experience at the following link:
http://www.kyybaventures.com/blog/rails-delayed-job#more-2916
Hope this will help to get started with background process in rails 3.
We can use either backgroundrb or Unix crontab.
Crontab will do the job if you don't need to hand off a heavy process to run asynchronously during the application's request cycle.
Backgroundrb consumes a lot of memory and CPU in a production environment if any of its processes hangs. We also need to configure a monitoring tool to make sure that the background process is running.

Real time Alternative to polling in ruby on rails (Jruby)

I have a long running operation that I activate by placing a message on a ruby resque queue. The endpoint does the work which might take many minutes.
I was going to poll the database every few seconds, but I think the world has moved on from polling. Is there a better way using Ruby on Rails to get the results as a push, perhaps using something like web sockets or comet?
Can anyone suggest anything to start my research?
WebSockets is the thing to research, and Pusher is a good way to get started with it.
Or look at em-websocket or websocket-rails.

send delayed emails rails on heroku

I have a table in my database with a list of emails to be sent, each at a specific time (precision down to the minute).
I'm on Heroku, and I don't want to spend anything right now. Is there a way to do this? The only way I could think of was to create a daemon/cron somewhere else and make it call a private URL every minute. Any other ideas? Any way to have some background process or something that can handle this (on Heroku and without paying extra for addons)?
thanks!
Heroku's free cron addon runs only once a day, so it is not suitable. Their paid cron addon runs only once an hour, so it is also not suitable. Running a daemon/cron elsewhere is a hack that will become problematic very quickly. It's fundamentally bad architecture.
Using delayed_job with a single Heroku Worker makes sense. Plus, delayed_job lets you specify exactly when each job should be run, down to a 5-second granularity. Yes, it is $36/mo to do this. But it frees you from doing things the wrong way. Plus, if you expect that you will not need the Worker most of the time, you can look into auto-scaling delayed_job on Heroku so the Worker is only turned on when you need it.
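As a rough illustration of what the worker ends up doing with a table of timed emails, here is a plain-Ruby sketch. It does not use delayed_job's API; ScheduledEmail, due_emails and deliver_due are hypothetical names, and the block stands in for the real mail delivery call.

```ruby
# Hypothetical stand-in for a row in the "emails to send" table.
ScheduledEmail = Struct.new(:to, :send_at, :sent_at)

# Returns the unsent emails whose scheduled time has passed, oldest first.
def due_emails(emails, now = Time.now)
  emails.select { |e| e.sent_at.nil? && e.send_at <= now }.sort_by(&:send_at)
end

# One pass of a worker loop: deliver everything that is due and mark it sent,
# so the same email is not picked up again on the next pass.
def deliver_due(emails, now = Time.now)
  due_emails(emails, now).each do |email|
    yield email
    email.sent_at = now
  end
end

now = Time.now
emails = [
  ScheduledEmail.new("a@example.com", now - 60, nil),   # due a minute ago
  ScheduledEmail.new("b@example.com", now + 3600, nil)  # due in an hour
]
deliver_due(emails, now) { |e| puts "sending to #{e.to}" }
```

Marking each row as sent right after delivery is what makes the pass safe to repeat every minute.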
There is a whole bunch of free online services that would be more than happy to request your web page on a schedule that you set. You don't need to spend or code anything. Just Google :)

Is it worth using a daemon?

Hey guys, I have a program that uses ajax to send a post to multiple social networks via their APIs based on user form input. I was wondering if this process (which doesn't take more than 2-3 seconds when I test it myself) is worth daemonizing with something like BackgroundRB? In other words, were this program to become used by 100+ people, would the simple call to an action via AJAX slow the entire application down?
Yeah I'd recommend using DelayedJob to accomplish this task. You want to avoid unnecessary HTTP requests to your app. With DelayedJob, it connects to your database and makes third party connections without initiating any HTTP requests to your app.
I wouldn't recommend BackgroundRB.
Short answer: you have to move this into the background; use delayed_job.
Longer answer:
The problem is that although the call takes only 2-3 seconds, it completely locks an application server process while it runs. So if you have, let's say, 5 Mongrel or Passenger app server processes running, and 5 people trigger this action within the same 2-3 second interval, no other requests can be processed until one of them finishes.
So while it's OK to do this inline during development, it's a must to move it to the background in production.
I wouldn't recommend BackgroundRB. For what you need it seems you need delayed_job
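The capacity argument above can be put into numbers. This is a back-of-the-envelope sketch, assuming every request holds a worker for the same length of time and all requests arrive at once; the method names are made up for illustration.

```ruby
# Upper bound on requests per second when every request holds a worker
# for `seconds_per_request` seconds.
def max_requests_per_second(workers, seconds_per_request)
  workers / seconds_per_request.to_f
end

# How long the last of `concurrent` simultaneous requests waits before a
# worker is free to pick it up.
def wait_for_last_request(concurrent, workers, seconds_per_request)
  full_rounds_ahead = (concurrent - 1) / workers # integer division
  full_rounds_ahead * seconds_per_request
end

max_requests_per_second(5, 2.5)   # => 2.0
wait_for_last_request(5, 5, 2.5)  # => 0.0
wait_for_last_request(6, 5, 2.5)  # => 2.5
```

With 5 workers and a 2.5 second call, the sixth simultaneous visitor waits the full 2.5 seconds before their request is even started, whatever page they asked for.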
There are a lot of solutions for this:
bj
delayed_job
resque

Best practice for Rails App to run a long task in the background?

I have a Rails application that, unfortunately, has to do some crunching that takes a while after a request to a controller. What are the best practices in Rails for providing feedback or progress on a long-running task or request? These controller methods usually last 60+ seconds.
I'm not concerned with the client side... I was planning on having an Ajax request every second or so and displaying a progress indicator. I'm just not sure on the Rails best practice, do I create an additional controller? Is there something clever I can do? I want answers to focus on the server side using Rails only.
Thanks in advance for your help.
Edit:
If it matters, the http request are for PDFs. I then have Rails in conjunction with Ruport generate these PDFs. The problem is, these PDFs are very large and contain a lot of data. Does it still make sense to use a background task? Let's assume an average PDF takes about one minute to two minutes, will this make my Rails application unresponsive to any other server request during this time?
Edit 2:
Ok, after further investigation, it seems my Rails application is indeed unresponsive to any other HTTP requests after a request comes in for a large PDF. So, I guess the question now becomes: What is the best threading/background mechanism to use? It must be stable and maintained. I'm very surprised Rails doesn't have something like this built in.
Edit 3:
I have read this page: http://wiki.rubyonrails.org/rails/pages/HowToRunBackgroundJobsInRails. I would love to read about various experiences with these tools.
Edit 4:
I'm using Phusion Passenger ("mod_rails"), if it matters.
Edit 5:
I'm using Windows Vista 64 bit for my development machine; however, my production machine is Ubuntu 8.04 LTS. Should I consider switching to Linux for my development machine? Will the solutions presented work on both?
The Workling plugin allows you to schedule background tasks in a queue (they would perform the lengthy task). As of version 0.3 you can ask a worker for its status, which would allow you to display some nifty progress bars.
Another cool feature of Workling is that the asynchronous backend can be switched: you can use DelayedJob, Spawn (classic fork), Starling...
I have a very large volume site that generates lots of large CSV files. These sometimes take several minutes to complete. I do the following:
I have a jobs table with details of the requested file. When the user requests a file, the request goes in that table and the user is taken to a "jobs status" page that lists all of their jobs.
I have a rake task that runs all outstanding jobs (a class method on the Job model).
I have a separate install of rails on another box that handles these jobs. This box just does jobs, and is not accessible to the outside world.
On this separate box, a cron job runs all outstanding jobs every 60 seconds, unless jobs are still running from the last invocation.
The user's job status page auto-refreshes to show the status of the job (which is updated by the jobs box as the job is started, running, then finished). Once the job is done, a link appears to the results file.
It may be too heavy-duty if you just plan to have one or two running at a time, but if you want to scale... :)
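The "unless jobs are still running from the last invocation" guard is commonly done with a lock file. A minimal sketch using only Ruby's standard library follows; run_exclusively and the lock file name are hypothetical, and this is not necessarily how the poster implemented it.

```ruby
require "tmpdir"

# Runs the block only if no other open handle holds the lock file.
# Returns :ran if the block was executed, :skipped otherwise.
def run_exclusively(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return :skipped unless lock.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    :ran
  end
  # The lock is released when File.open closes the file.
end

lock_path = File.join(Dir.tmpdir, "jobs_runner.lock")
result = run_exclusively(lock_path) { puts "processing outstanding jobs" }
puts "this invocation: #{result}"
```

A cron entry can then fire every 60 seconds without worrying about overlap: an invocation that finds the lock taken simply exits.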
Calling ./script/runner in the background worked best for me. (I was also doing PDF generation.) It seems like the lowest common denominator, while also being the simplest to implement. Here's a write-up of my experience.
A simple solution that doesn't require any extra Gems or plugins would be to create a custom Rake task for handling the PDF generation. You could model the PDF generation process as a state machine with states such as submitted, processing and complete that are stored in the model's database table. The initial HTTP request to the Rails application would simply add a record to the table with a submitted state and return.
There would be a cron job that runs your custom Rake task as a separate Ruby process, so the main Rails application is unaffected. The Rake task can use ActiveRecord to find all the models that have the submitted state, change the state to processing and then generate the associated PDFs. Finally, it should set the state to complete. This enables your AJAX calls within the Rails app to monitor the state of the PDF generation process.
If you put your Rake task within your_rails_app/lib/tasks then it has access to the models within your Rails application. The skeleton of such a pdf_generator.rake would look like this:
namespace :pdfgenerator do
  desc 'Generates PDFs etc.'
  task :run => :environment do
    # Code goes here...
  end
end
As noted in the wiki, there are a few downsides to this approach. You'll be using cron to regularly create a fairly heavyweight Ruby process and the timing of your cron jobs would need careful tuning to ensure that each one has sufficient time to complete before the next one comes along. However, the approach is simple and should meet your needs.
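The submitted / processing / complete flow described in this answer might look like the following in plain Ruby. PdfRequest, TRANSITIONS, advance! and process_submitted are hypothetical names; a Struct stands in for the ActiveRecord model and the block stands in for the actual PDF generation.

```ruby
# Hypothetical stand-in for an ActiveRecord model with a `state` column.
PdfRequest = Struct.new(:id, :state, :output)

# The only legal moves: submitted -> processing -> complete.
TRANSITIONS = { submitted: :processing, processing: :complete }.freeze

# Moves a request to its next state, or raises if it has none.
def advance!(request)
  next_state = TRANSITIONS.fetch(request.state) do
    raise ArgumentError, "no transition from #{request.state}"
  end
  request.state = next_state
end

# Roughly what the body of the Rake task would do on each cron run.
def process_submitted(requests)
  requests.select { |r| r.state == :submitted }.each do |request|
    advance!(request)                # submitted -> processing
    request.output = yield(request)  # block stands in for PDF generation
    advance!(request)                # processing -> complete
  end
end

requests = [PdfRequest.new(1, :submitted, nil), PdfRequest.new(2, :complete, "old.pdf")]
process_submitted(requests) { |r| "report-#{r.id}.pdf" }
requests.map(&:state) # => [:complete, :complete]
```

Because the state lives on the record, the Ajax poll in the web app only has to read one column to know whether to show a spinner or a download link.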
This looks like quite an old thread. However, what I have done in my app, which needed to run multiple countdown timers for different pages, was to use Ruby threads. The timer must continue running even if the user closes the page. Ruby makes it easy to write multi-threaded programs with the Thread class, and threads are a lightweight way to get concurrency in your code (note that on MRI the global interpreter lock keeps Ruby code in different threads from running truly in parallel). I hope this helps other wanderers who are looking to run background or concurrent services in their app. Likewise, Ajax makes it a lot easier to call a specific (custom) Rails action every second.
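A countdown timer on its own thread could be sketched like this. CountdownTimer is a hypothetical class written for illustration, not the poster's code, and the short tick is only there so the example finishes quickly.

```ruby
# A countdown that keeps running on its own thread, independent of any request.
class CountdownTimer
  def initialize(seconds, tick: 1.0)
    @remaining = seconds
    @tick = tick            # length of one countdown step, in seconds
    @lock = Mutex.new       # guards @remaining across threads
  end

  # Starts the background thread and returns immediately.
  def start
    @thread = Thread.new do
      while remaining > 0
        sleep @tick
        @lock.synchronize { @remaining -= 1 }
      end
    end
    self
  end

  # Safe to call from any thread, e.g. from the action an Ajax poll hits.
  def remaining
    @lock.synchronize { @remaining }
  end

  def finished?
    remaining <= 0
  end

  # Blocks until the countdown thread has ended (mainly useful in scripts).
  def wait
    @thread&.join
    self
  end
end

timer = CountdownTimer.new(3, tick: 0.01).start
timer.wait
timer.finished? # => true
```

One caveat with this approach: the thread lives inside a single server process, so the timer is lost if that process restarts, and with several app server processes each one has its own copy.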
This really does sound like something that should run in a background process rather than in an application instance (Passenger or Mongrel, whichever you use). That way your application can keep doing what it's supposed to be doing, serving requests, while a background task of some kind (Workling is good) handles the number crunching. I know that this doesn't deal with the issue of progress, but unless progress reporting is absolutely essential I think that is a small price to pay.
You could have a user click the action required, have that action pass the request to the Workling queue, and have it send some kind of notification to the user when it is completed, maybe an email or something. I'm not sure about the practicality of that, just thinking out loud, but my point is that it really seems like that should be a background task of some kind.
"I'm using Windows Vista 64 bit for my development machine; however, my production machine is Ubuntu 8.04 LTS. Should I consider switching to Linux for my development machine? Will the solutions presented work on both?"
Have you considered running Linux in a VM on top of Vista?
I recommend using the Resque gem with its resque-status plug-in for your heavy background processes.
Resque
Resque is a Redis-backed Ruby library for creating background jobs,
placing them on multiple queues, and processing them later.
Resque-status
resque-status is an extension to the resque queue system that provides
simple trackable jobs.
Once you run a job on a Resque worker using the resque-status extension, you can get information about its progress and kill a specific job very easily. See examples:
status.pct_complete #=> 0
status.status #=> 'queued'
status.queued? #=> true
status.working? #=> false
status.time #=> Time object
status.message #=> "Created at ..."
Resque and resque-status also have a nice web interface for interacting with your jobs.
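To show the general idea behind trackable jobs without depending on Redis, here is a plain-Ruby sketch. ProgressStore is a hypothetical class and is not the resque-status API; it only mirrors the notion of a job publishing a percentage that something else can read.

```ruby
# Hypothetical in-memory status store keyed by job id.
class ProgressStore
  DEFAULT = { pct_complete: 0, state: :queued, message: nil }.freeze

  def initialize
    @statuses = {}
    @lock = Mutex.new
  end

  # Called by the worker as it makes progress.
  def update(job_id, done, total, message = nil)
    pct = total.zero? ? 100 : (done * 100 / total) # integer percentage
    state = pct >= 100 ? :completed : :working
    @lock.synchronize do
      @statuses[job_id] = { pct_complete: pct, state: state, message: message }
    end
  end

  # Called by whatever answers the client's status poll.
  def fetch(job_id)
    @lock.synchronize { @statuses.fetch(job_id, DEFAULT).dup }
  end
end

store = ProgressStore.new
pages = 4
pages.times do |i|
  # ... render page i of the PDF here ...
  store.update("job-1", i + 1, pages, "rendered page #{i + 1}")
end
store.fetch("job-1")[:state] # => :completed
```

In a real deployment the store has to live somewhere both the worker and the web processes can reach, which is the role Redis plays for resque-status.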
There is the brand new Growl4Rails ... that is for this specific use case (among others as well).
http://www.writebetterbits.com/2009/01/update-to-growl4rails.html
I use Background Job (http://codeforpeople.rubyforge.org/svn/bj/trunk/README) to schedule tasks. I am building a small administration site that allows Site Admins to run all sorts of things you and I would run from the command line from a nice web interface.
I know you said you were not worried about the client side but I thought you might find this interesting: Growl4Rails - Growl style notifications that were developed for pretty much what you are doing judging by the example they use.
I've used spawn before and definitely would recommend it.
Incredibly simple to set up (which many other solutions aren't), and works well.
Check out BackgrounDRb, it is designed for exactly the scenario you are describing.
I think it has been around for a while and is pretty mature. You can monitor the status of the workers.
It's a pretty good idea to develop on the same platform as your production environment, especially when working with Rails. The suggestion to run Linux in a VM is a good one. Check out Sun xVM for open source virtualization software.
I personally use the active_messaging plugin with an ActiveMQ server (STOMP or REST protocol). This has been extremely stable for us, processing millions of messages a month.
