rails periodic task - ruby-on-rails

I have a Ruby on Rails app in which I'm trying to find a way to run some code every few seconds.
I've found lots of info and ideas using cron, or cron-like implementations, but these are only accurate down to the minute, and/or require external tools. I want to kick the task off every 15 seconds or so, and I want it to be entirely self-contained within the application (if the app stops, the tasks stop, and no external setup is needed).
This is being used for background generation of cache data. Every few seconds, the task will assemble some data, and then store it in a cache which gets used by all the client requests. The task is pretty slow, so it needs to run in the background and not block client requests.
I'm fairly new to Ruby, but have a strong Perl background, and the way I'd solve this there would be to create an interval timer and handler which forks, runs the code, and then exits when done.
It might be even nicer to just simulate a client request and have the rails controller fork itself. This way I could kick off the task by hitting the URI for it (though since the task will be running every few seconds, I doubt I'll ever need to, but might have future use). Though it would be trivial to just have the controller call whatever method is being called by the periodic task scheduler (once I have one).

I'd suggest the whenever gem https://github.com/javan/whenever
It allows you to specify a schedule like:
every 15.minutes do
  MyClass.do_stuff
end
You don't have to write crontab entries by hand or monkey with external services; whenever generates the crontab from this schedule for you.

Generally speaking, there's no built-in way that I know of to create a periodic task within the application. Rails is built on Rack and it expects to receive HTTP requests, do something, and then return. So you just have to manage the periodic task externally yourself.
I think given the frequency that you need to run the task, a decent solution could be just to write yourself a simple rake task that loops forever, and to kick it off at the same time that you start your application, using something like Foreman. Foreman is often used like this to manage starting up/shutting down background workers along with their apps. On production, you may want to use something else to manage the processes, like Monit.
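To make the "loops forever" idea concrete, here is a minimal sketch in plain Ruby. The method name run_every and the max_runs option are my own inventions for illustration (max_runs exists only so the loop can be stopped in a demo). It measures each run so that a slow task does not push the schedule later and later.

```ruby
# Runs the given block every `interval` seconds. The sleep is shortened by
# however long the block took, so a slow task does not make the schedule drift.
# `max_runs` is a hypothetical option for demos; nil means loop forever.
def run_every(interval, max_runs: nil)
  runs = 0
  loop do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    begin
      yield
    rescue StandardError => e
      # Report and keep going; one failed run should not kill the loop.
      warn "periodic task failed: #{e.class}: #{e.message}"
    end
    runs += 1
    break if max_runs && runs >= max_runs

    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    sleep([interval - elapsed, 0].max)
  end
  runs
end
```

Inside a rake task that depends on :environment, the body could then be something like run_every(15) { CacheBuilder.rebuild }, where CacheBuilder is a placeholder for whatever builds your cache.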

You can either write your own method, something like
class MyWorker
  def self.work
    loop do
      # do your work
      sleep 15
    end
  end
end
and run it with rails runner MyWorker.work
It will then run as a separate process in the background.
Or you can use something like Resque, but that's a different approach. It works like this: something adds a job to the queue, while a worker fetches whatever job is in the queue and tries to finish it.
So it depends on your own needs.
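To illustrate the queue-and-worker pattern described above without pulling in Resque or Redis, here is a rough sketch using only Ruby's standard Queue and Thread classes. TinyWorker is a made-up name, and this is not how Resque is implemented; it only shows the producer/consumer shape.

```ruby
# Stand-in for a Resque-style queue, using only Ruby's standard library.
class TinyWorker
  def initialize
    @queue = Queue.new
    @results = Queue.new
    @thread = Thread.new { work_loop }
  end

  # Producer side: anything can push a job (here, any callable object).
  def enqueue(job)
    @queue << job
  end

  # Asks the worker to stop once earlier jobs are done, waits for it,
  # and returns the collected results in the order the jobs were queued.
  def shutdown
    @queue << :stop
    @thread.join
    Array.new(@results.size) { @results.pop }
  end

  private

  # Consumer side: block until a job arrives, run it, repeat.
  def work_loop
    loop do
      job = @queue.pop
      break if job == :stop

      @results << job.call
    end
  end
end
```

Usage would look like worker = TinyWorker.new, then worker.enqueue(-> { expensive_call }) from anywhere, where expensive_call is a placeholder for your own code.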

I know it is an old question. But maybe for someone this answer could be helpful. There is a gem called crono.
Crono is a time-based background job scheduler daemon (just like Cron) for Ruby on Rails.
Crono is pure Ruby. It doesn't use Unix Cron and other platform-dependent things. So you can use it on all platforms supported by Ruby. It persists job states to your database using Active Record. You have full control of jobs performing process. It's Ruby, so you can understand and modify it to fit your needs.
The awesome thing about crono is that its code is self-explanatory. In order to run a task periodically you can just do:
Crono.perform(YourJob).every 2.days
Maybe you can also do:
Crono.perform(YourJob).every 30.seconds
Anyway you really can do a lot of things. Another example could be:
Crono.perform(TestJob).every 1.week, on: :monday, at: "15:30"
I suggest this gem instead of whenever because whenever relies on the Unix cron table, which is not always available.

Throwing out a solution just because it looks somewhat elegant and answers the question without any extra gems. In my scenario I wanted to run some code, but only after all my Sidekiq workers were done doing their thing.
First I defined a method to check if any workers were working...
def workers_working?
  workers = Sidekiq::Workers.new.map do |_process_id, _thread_id, work|
    work
  end
  workers.size > 0
end
Then we just call the method with a loop which sleeps between calls.
sleep 5 while workers_working?
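One thing the one-liner above cannot do is give up. If that matters, the same polling idea can be wrapped in a small helper with a timeout. This is only a sketch; wait_until is a name I made up, not something Sidekiq provides.

```ruby
# Polls the block every `interval` seconds until it returns a truthy value.
# Returns true if that happened, false if `timeout` seconds passed first.
def wait_until(interval:, timeout:)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
  true
end
```

With the method from this answer, that would be called as wait_until(interval: 5, timeout: 600) { !workers_working? }.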

Use something like delayed job, and requeue it every so often?

Use Thin or another server built on EventMachine, then just use the timers that are part of EventMachine. Example, in config/application.rb:
EM.add_periodic_timer(2) do
  do_this_every_2_sec
end
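If your server is not built on EventMachine, a similar effect can be had with a plain background thread. The sketch below is a swapped-in technique, not EventMachine's API: PeriodicTimer is a hypothetical class using only the standard library.

```ruby
# Calls the block every `interval` seconds on a background thread until stopped.
class PeriodicTimer
  def initialize(interval, &block)
    @interval = interval
    @block = block
    @stopped = false
    @thread = Thread.new { tick_loop }
  end

  # Stops the timer and waits for the background thread to finish.
  def stop
    @stopped = true
    @thread.join
  end

  private

  def tick_loop
    until @stopped
      sleep @interval
      @block.call unless @stopped
    end
  end
end
```

Be aware that with forking servers, a thread started in the master process does not carry over into the forked workers, so where you start the timer matters.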

Related

Run a background job every few seconds

Say I have an application that needs to pull data from an API, but that there is a limit to how often I can send a query (i.e., caps at X requests / minute). In order to ensure I don't hit this limit, I want to add requests to a queue, and have a background job that will pull X requests and execute it every minute. I'm not sure what's the best method for this in Rails, however. From what I gather, DelayedJob is the better library for my needs, but I don't see any support for only running X jobs a minute. Does anyone know if there is a preferred way of implementing functionality like this?
I'm a little late but I would like to warn against using the whenever gem in your situation:
Since you're using Ruby on Rails, whenever will load the whole Rails environment every time cron runs one of its jobs.
Give rufus-scheduler a try.
Place the code below, for example, in config/initializers/cron_stuff.rb
require 'rufus/scheduler'
# start_new is the rufus-scheduler 2.x API; with 3.x use Rufus::Scheduler.new
scheduler = Rufus::Scheduler.start_new
scheduler.every '20m' do
  puts 'hello'
end
First, I would recommend using Sidekiq for processing background jobs. It's well supported and very simple to use. If you do use Sidekiq, then there is another gem, called Sidetiq, that will allow you to run recurring jobs.
Maybe you can try whenever: https://github.com/javan/whenever
Then you can add your tasks to a schedule.rb file as below:
every 3.hours do
  runner "MyModel.some_process"
  rake "my:rake:task"
  command "/usr/bin/my_great_command"
end
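None of the schedulers above enforces the "at most X requests per minute" part of the question by itself. One way to get it is to keep the pending requests in your own queue and have the scheduled job release a capped batch on each tick. A minimal sketch, assuming a hypothetical ThrottledQueue class held in memory (a real app would more likely keep the pending requests in the database or Redis):

```ruby
# Holds pending requests and releases at most `limit` of them per tick.
class ThrottledQueue
  def initialize(limit)
    @limit = limit
    @pending = []
    @lock = Mutex.new
  end

  def push(request)
    @lock.synchronize { @pending << request }
  end

  # Call once per minute from whatever scheduler you use.
  # Yields up to `limit` requests, oldest first, and returns how many were sent.
  def drain
    batch = @lock.synchronize { @pending.shift(@limit) }
    batch.each { |request| yield request }
    batch.size
  end

  def size
    @lock.synchronize { @pending.size }
  end
end
```

A scheduler block that fires every minute would then just call drain with a block that sends each request to the API.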

Are rake tasks suitable for long running processes in production?

I'm planning to use a rake task to develop a long-running background process for my Rails application. Are rake tasks appropriate for this kind of process? Ideally, I would like to wrap it inside a Linux daemon to be able to start and stop the process easily.
If it's not the best option, what are the alternatives? I'm trying to avoid a cron-based solution so that I don't have to worry about the schedule and the possibility of different running instances of the same process overlapping.
Thanks!
You can try delayed job with this extension.
class MyJob
  include Delayed::ScheduledJob
  run_every 1.day
  def display_name
    "MyJob"
  end
  def perform
    # code to run ...
  end
end
Or, inside the perform method, manually enqueue another job for Time.now + 5.minutes (for example) once the current job has finished.
Have you looked at the delayed_job gem?
https://github.com/collectiveidea/delayed_job
From their documentation:
Delayed::Job (or DJ) encapsulates the common pattern of asynchronously executing longer tasks in the background.
It is a direct extraction from Shopify where the job table is responsible for a multitude of core tasks. Amongst those tasks are:
sending massive newsletters
image resizing
http downloads
updating smart collections
updating solr, our search server, after product changes
batch imports
spam checks
It might depend on the kind of background jobs you need to run.
Basically, if you need some sort of post-processing on data the users enter, like rendering images for posts or doing async integration with third-party resources, then you're better off using Sidekiq (yeah, it's better than DelayedJob, as people suggested).
But if you need to run something on a schedule, like nightly downloads or cleaning up blocked users, then writing a rake task and kicking it off with a cron job might be a perfectly good option, because you can also run those tasks from the CLI whenever you need them on demand.

Running delayed_job inside the main web process

Can I run delayed_job or similar scheduling frameworks inside the web server, e.g. thin or unicorn?
If yes, how do I start it? (A code example would be very cool!)
The reason is that I want to save money while my application is still in a build-up phase; it is hosted on Heroku.
Officially
No, there is no supported way to run delayed_jobs asynchronously within the web framework. From the documentation on running jobs, it looks like the only supported way to run a job is to run a rake task or the delayed job script. Also, it seems conceptually wrong to bend a Rack server, which was designed to handle incoming client requests, to support pulling tasks off of some queue somewhere.
The Kludge
That said, I understand that saving money sometimes trumps being conceptually perfect. Take a look at these rake tasks. My kludge is to create a special endpoint in your Rails server that you hit periodically from some remote location. Inside this endpoint, instantiate a Delayed::Worker and call .start on it with the exit_on_complete option. This way, you won't need a new dyno or command.
Be warned, it's kind of a kludgy solution and it will tie up one of your rails processes until all delayed jobs are complete. That means unless you have other rails processes, all incoming requests will block until this queue request is finished. Unicorn provides facilities to spawn worker processes. Whether or not this solution will work will also depend on your jobs and how long they take to run and your application's delay tolerances.
Edit
With the spawn gem, you can wrap your instantiation of the Delayed::Worker with a spawn block, which will cause your jobs to be run in a separate process. This means your rails process will be available to serve web requests immediately instead of blocking while delayed jobs are run. However, the spawn gem has some dependencies on ActiveRecord and I do not know what DB/ORM you are using.
Here is some example code, because it's becoming a bit hazy:
class JobsController < ApplicationController
  def run
    spawn do
      @options = {} # you'll have to get these from that rake file
      Delayed::Worker.new(@options.merge(exit_on_complete: true)).start
    end
  end
end
Here's a link to a similar question:
Is it feasible to run multiple processeses on a Heroku dyno?
Bear in mind, as the post says, if you're only using one web dyno, it will be shut down if there's no traffic going to it.
In a similar vein, you might look into:
http://blog.codeship.io/2012/05/06/Unicorn-on-Heroku.html
To save on the need for multiple web dynos whilst you're building your app (although it's still subject to the above shutdown issue).
I would suggest you might look at running on a VPS directly, rather than Heroku (check out the railscast):
http://railscasts.com/episodes/337-capistrano-recipes
Once set up, it's pretty easy to deploy to. Heroku cuts out the devops part for you.
You can run it inside a separate Unicorn worker, so it shares memory with the master process and gets restarted together with the app.
See https://gist.github.com/brauliobo/11298486

Some questions about using resque

I am using Resque to run a background process. This is how my background process works:
Scans through all the rows in an ActiveRecord model
Checks for a condition
Updates the row if the condition is met
And this needs to go on infinitely.
This is how I am trying to use Resque for my purpose, here's my worker class:
class ThumbnailMaker
  @queue = :thumbnail_queue
  def self.perform()
    MyObj.check_thumbnails(root_url)
  end
end
I understand the perform() method keeps a task in a queue, which is run periodically. In my case, I need a task that scans the whole table, so it runs for a longer time. Is it a good solution to my requirements?
On another note, I need the root URL for my Rails application, which is easily obtained with root_url in a Rails controller. But I need it in a class I have created; can you suggest how I can get it there?
Resque is for queueing tasks to be run in the background; each item in the queue runs once and then is removed. What you want is more like a scheduled task, for example a custom Rake task or other script that runs from time to time. There are many scheduling gems available for this kind of thing (whenever is very popular), or you can just use cron. There is a great RailsCasts episode about this very topic.
You might want to try putting your code in a rake task and running it periodically through a cron job. Resque/Redis seems a bit too much for your needs.
You may consider passing the root URL in as a parameter if you are calling your class from your controller. Otherwise, you may want to set it as an ENV setting and configure each of your deployments accordingly.
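Both suggestions can be combined in one constructor: accept the URL as a parameter, and fall back to an environment variable when none is given. This is only a sketch; ThumbnailChecker, the APP_ROOT_URL variable name and the localhost default are all assumptions for illustration.

```ruby
class ThumbnailChecker
  # APP_ROOT_URL is a hypothetical variable name; set it per deployment.
  # The default argument is evaluated on each call, so ENV is read at that time.
  def initialize(root_url = ENV.fetch("APP_ROOT_URL", "http://localhost:3000"))
    @root_url = root_url.chomp("/")
  end

  # Joins the root URL and a path without doubling the slash between them.
  def url_for(path)
    "#{@root_url}/#{path.sub(%r{\A/}, "")}"
  end
end
```

From a controller you would pass root_url explicitly; from a background worker you would call ThumbnailChecker.new and rely on the environment.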

Regular delayed jobs

I'm using Delayed Job to manage background work.
However I have some tasks that need to be executed at regular interval. Every hour, every day or every week for example.
For now, when I execute the task, I create a new one to be executed in one day/week/month.
However I don't really like it. If for any reason, the task isn't completely executed, we don't create the next one and we might lose the execution of the task.
How do you manage that kind of thing (with delayed job) in your Rails apps to be sure your list of regular tasks remains correct?
If you have access to Cron, I highly recommend Whenever
http://github.com/javan/whenever
You specify what you want to run and at what frequency in dead simple ruby, and whenever supplies rake tasks to convert this into a crontab and to update your system's crontab.
If you don't have access to frequent cron (like I don't, since we're on Heroku), then DJ is the way to go.
You have a couple options.
Do what you're doing. DJ will retry each task a certain number of times, so you have some leniency there
Put the code that creates the next DJ job in an ensure block, to make sure it gets created even after an exception or other bad event
Create another DJ that runs periodically, checks to make sure the appropriate DJs exist, and creates them if they don't. Of course, this is just as error prone as the other options, since the monitor and the actual DJ are both running in the same env, but it's something.
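The ensure-block option could look roughly like the sketch below. FakeQueue is an in-memory stand-in I made up so the example runs on its own, and HourlyReport is a placeholder job; with delayed_job you would enqueue through Delayed::Job.enqueue with a run_at option instead.

```ruby
# In-memory stand-in for the delayed_job table, so the sketch runs on its own.
FakeQueue = Struct.new(:entries) do
  def enqueue(job, run_at:)
    entries << [job, run_at]
  end
end

class HourlyReport
  INTERVAL = 3600 # seconds between runs

  def initialize(queue, work)
    @queue = queue
    @work = work # any callable that does the real work
  end

  def perform
    @work.call
  ensure
    # Runs whether or not @work raised, so the chain of jobs is not broken.
    @queue.enqueue(self.class.new(@queue, @work), run_at: Time.now + INTERVAL)
  end
end
```

One thing to watch for: if DJ also retries the failed job, each retry may enqueue its own successor, so you would probably want to check for an existing pending job before creating another.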
Is there any particular reason why you wouldn't use cron for this type of things?
Or maybe something more rubyish like rufus-scheduler, which is quite easy to use and very reliable.
If you don't need queuing, these tools are a way to go, I think.

Resources