Run a background job every few seconds - ruby-on-rails

Say I have an application that needs to pull data from an API, but there is a limit on how often I can send a query (i.e., it caps at X requests / minute). To ensure I don't hit this limit, I want to add requests to a queue and have a background job that pulls X requests and executes them every minute. I'm not sure what the best method for this is in Rails, however. From what I gather, DelayedJob is the better library for my needs, but I don't see any support for running only X jobs a minute. Does anyone know if there is a preferred way of implementing functionality like this?

I'm a little late, but I would like to warn against using the whenever gem in your situation: since you're using Ruby on Rails, whenever loads the whole Rails environment each time cron invokes it.
Give rufus-scheduler a try.
Place the code below, for example, in config/initializers/cron_stuff.rb
require 'rufus/scheduler'
# rufus-scheduler 3.x; on 2.x this was Rufus::Scheduler.start_new
scheduler = Rufus::Scheduler.new
scheduler.every '20m' do
  puts 'hello'
end

First, I would recommend using Sidekiq for processing background jobs. It's well supported and very simple to use. If you do use Sidekiq, then there is another gem, called Sidetiq, that will allow you to run recurring jobs.

Maybe you can try whenever: https://github.com/javan/whenever
Then you can add your tasks to a schedule.rb file as below:
every 3.hours do
  runner "MyModel.some_process"
  rake "my:rake:task"
  command "/usr/bin/my_great_command"
end
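For the "at most X requests per minute" part of the question, here is a minimal sketch of a queue that is drained in batches. It uses only plain Ruby; `RateLimitedQueue`, `push`, and `drain` are hypothetical names made up for this example, and the in-memory array is just a stand-in for wherever the pending requests would really be stored.

```ruby
# Minimal sketch: drain at most `limit` queued requests per run.
# RateLimitedQueue and its methods are hypothetical names, not part of any gem.
class RateLimitedQueue
  def initialize(limit)
    @limit = limit   # X = maximum number of requests allowed per minute
    @pending = []    # stand-in for a database table or Redis list
    @lock = Mutex.new
  end

  # Called by the application: remember the request instead of sending it.
  def push(request)
    @lock.synchronize { @pending << request }
  end

  # Called once a minute by whichever scheduler you pick.
  # Takes up to `limit` requests off the front and yields each one.
  # Returns the number of requests handled in this run.
  def drain
    batch = @lock.synchronize { @pending.shift(@limit) }
    batch.each { |request| yield request }
    batch.size
  end

  def size
    @lock.synchronize { @pending.size }
  end
end
```

The scheduler block would then call something like `queue.drain { |request| ... }` once a minute. An in-memory array is not shared between processes, so in a real deployment the pending list would probably have to live in the database or in Redis.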

Related

How to run an indefinite loop in a Rails application

I'm trying to short poll an external API which doesn't support websockets, so I need to constantly make API requests every few seconds or maybe multiple times per second.
I also need to be in control of calling the next poll run. For example I wouldn’t want to run the poll again while my request is pending or I might want to increase the timeout if I get a 500 error.
I'm currently considering doing this in a separate Node process and only notifying the Rails server when there's new data. But I'd rather do everything in the Rails codebase.
I don't think Active Job is built for this purpose, but I could be wrong. I think what I really need is a separate entry point in the Rails app repository that loads all the models but doesn't start the server, and then to write the short-polling loop there. But I'm not sure if that's best practice or trivial to do with Rails.
So should I proceed with the node approach or is there an easy Rails solution I'm missing? Any suggestion or guidance is appreciated.
Maybe you can try whenever.
It helps you run a method or a rake task with crontab.
# schedule.rb
every 1.minute do
  runner "YourClass.your_method"
end
every 1.minute do
  rake "polling:task"
end
After finishing the schedule.rb file, you'll need to execute whenever --update-crontab in your deploy pipeline in order to update the crontab.
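Since the question asks for control over when the next poll happens (no overlapping requests, a longer wait after a 500), a self-paced loop may fit better than a fixed cron schedule. Below is a minimal sketch in plain Ruby. `Poller`, `base_delay`, `max_delay`, and `sleeper` are hypothetical names for this example, and `fetch` is assumed to be any callable that performs the HTTP request and returns the numeric status code.

```ruby
# Minimal sketch of a self-paced polling loop with backoff on server errors.
# Poller and its parameters are hypothetical names, not a library API.
class Poller
  def initialize(fetch, base_delay: 2, max_delay: 60, sleeper: ->(s) { sleep(s) })
    @fetch = fetch            # callable returning an HTTP status code
    @base_delay = base_delay  # seconds to wait after a successful poll
    @max_delay = max_delay    # upper bound for the backoff
    @sleeper = sleeper        # injectable so the loop can be tested without waiting
    @delay = base_delay
    @running = true
  end

  def stop
    @running = false
  end

  # Double the wait after a 5xx response (capped), reset it otherwise.
  def next_delay(status)
    @delay = status >= 500 ? [@delay * 2, @max_delay].min : @base_delay
  end

  # The next request is only sent after the previous one has returned,
  # so two polls never overlap.
  def run
    while @running
      status = @fetch.call
      @sleeper.call(next_delay(status))
    end
  end
end
```

A loop like this could be started from a rake task or with `rails runner`, which loads the application and its models without starting the web server.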

Are rake tasks suitable for long running processes in production?

I'm planning to use a rake task to develop a long-running background process for my Rails application. Are rake tasks appropriate for this kind of process? Ideally, I would like to wrap it inside a Linux daemon to be able to start and stop the process easily.
If it's not the best option, what are the alternatives? I'm trying to avoid a cron-based solution so that I don't have to worry about the schedule and the possibility of different running instances of the same process overlapping.
Thanks!
You can try delayed job with this extension.
class MyJob
  include Delayed::ScheduledJob
  run_every 1.day
  def display_name
    "MyJob"
  end
  def perform
    # code to run ...
  end
end
Or manually enqueue another job with Time.now + 5.minutes for example after current job is finished inside perform method.
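The "enqueue another job when the current one is finished" idea can be sketched roughly as follows. To keep the example runnable on its own, `InMemoryQueue` is a hypothetical stand-in for the real job backend and `RecurringJob` is a made-up name; with Delayed Job, the corresponding call would be `Delayed::Job.enqueue(job, run_at: time)`.

```ruby
# Minimal sketch of a job that re-enqueues itself when it finishes.
# InMemoryQueue is a hypothetical stand-in for the real job backend.
class InMemoryQueue
  attr_reader :entries

  def initialize
    @entries = []
  end

  def enqueue(job, run_at:)
    @entries << { job: job, run_at: run_at }
  end
end

class RecurringJob
  INTERVAL = 5 * 60 # seconds; the plain Ruby equivalent of 5.minutes

  def initialize(queue)
    @queue = queue
  end

  def perform
    # code to run ...

    # Schedule the next run only after this one has finished, so two runs
    # of the same job never overlap. If the code above raises, nothing is
    # re-enqueued here and the backend's own retry handling takes over.
    @queue.enqueue(self.class.new(@queue), run_at: Time.now + INTERVAL)
  end
end
```

Because the next run is scheduled relative to when the current one ends, the start times drift by the job's duration. That is usually fine for polling, but less suitable for "every night at 1am" style schedules.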
Have you looked at the delayed_job gem?
https://github.com/collectiveidea/delayed_job
From their documentation:
Delayed::Job (or DJ) encapsulates the common pattern of asynchronously executing longer tasks in the background.
It is a direct extraction from Shopify where the job table is responsible for a multitude of core tasks. Amongst those tasks are:
sending massive newsletters
image resizing
http downloads
updating smart collections
updating solr, our search server, after product changes
batch imports
spam checks
It might depend on the kind of background jobs you need to run.
Basically, if you need some sort of post-processing on data the users enter, like rendering images for posts or doing some async integration with third-party resources, then you're better off using Sidekiq (yeah, it's better than DelayedJob, which other people suggested).
But if you need to run something on a schedule, like nightly downloads or cleaning up blocked users, then writing a rake task and kicking it off with a cron job might be a perfectly good option, because you can also run those tasks from the CLI whenever you need them on demand.

rails periodic task

I have a ruby on rails app in which I'm trying to find a way to run some code every few seconds.
I've found lots of info and ideas using cron, or cron-like implementations, but these are only accurate down to the minute, and/or require external tools. I want to kick the task off every 15 seconds or so, and I want it to be entirely self contained within the application (if the app stops, the tasks stop, and no external setup).
This is being used for background generation of cache data. Every few seconds, the task will assemble some data, and then store it in a cache which gets used by all the client requests. The task is pretty slow, so it needs to run in the background and not block client requests.
I'm fairly new to ruby, but have a strong perl background, and the way I'd solve this there would be to create an interval timer & handler which forks, runs the code, and then exits when done.
It might be even nicer to just simulate a client request and have the rails controller fork itself. This way I could kick off the task by hitting the URI for it (though since the task will be running every few seconds, I doubt I'll ever need to, but might have future use). Though it would be trivial to just have the controller call whatever method is being called by the periodic task scheduler (once I have one).
I'd suggest the whenever gem: https://github.com/javan/whenever
It allows you to specify a schedule like:
every 15.minutes do
  runner "MyClass.do_stuff"
end
There's no writing crontab entries by hand or monkeying with external services; whenever generates the cron entries for you.
Generally speaking, there's no built in way that I know of to create a periodic task within the application. Rails is built on Rack and it expects to receive http requests, do something, and then return. So you just have to manage the periodic task externally yourself.
I think given the frequency that you need to run the task, a decent solution could be just to write yourself a simple rake task that loops forever, and to kick it off at the same time that you start your application, using something like Foreman. Foreman is often used like this to manage starting up/shutting down background workers along with their apps. On production, you may want to use something else to manage the processes, like Monit.
You can either write your own method, something like
class MyWorker
  def self.work
    loop do
      # do your work
      sleep 15
    end
  end
end
and run it with rails runner MyWorker.work
It will then run as a separate process in the background.
Or you can use something like Resque, but that's a different approach. It works like this: something adds a task to the queue, and meanwhile a worker fetches whatever job is in the queue and tries to finish it.
So that depends on your own need.
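If the task really has to live inside the application process, as the question asks, a background thread is one way to do it. Here is a minimal sketch using only Ruby's built-in `Thread`; `PeriodicTask` is a hypothetical name for this example, not something Rails provides.

```ruby
# Minimal sketch of an in-process periodic task.
# PeriodicTask is a hypothetical name; it uses only Ruby's built-in Thread.
class PeriodicTask
  def initialize(interval, &block)
    @interval = interval # seconds to sleep between runs
    @block = block
    @running = false
  end

  # Start the background thread and return self so calls can be chained.
  def start
    @running = true
    @thread = Thread.new do
      while @running
        begin
          @block.call
        rescue StandardError => e
          # Keep the loop alive if a single run fails.
          warn "periodic task failed: #{e.message}"
        end
        sleep @interval
      end
    end
    self
  end

  # Ask the loop to finish and wait for it; this can take up to one interval
  # because the thread may be in the middle of its sleep.
  def stop
    @running = false
    @thread&.join
  end
end
```

In a Rails app this could be started from an initializer, for example `PeriodicTask.new(15) { CacheBuilder.rebuild }.start`, where `CacheBuilder` is again just a placeholder. The thread stops when the app stops. Be aware that a multi-process server can end up running the initializer in more than one process, so the task should be safe to run concurrently.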
I know it is an old question. But maybe for someone this answer could be helpful. There is a gem called crono.
Crono is a time-based background job scheduler daemon (just like Cron) for Ruby on Rails.
Crono is pure Ruby. It doesn't use Unix Cron and other platform-dependent things. So you can use it on all platforms supported by Ruby. It persists job states to your database using Active Record. You have full control of jobs performing process. It's Ruby, so you can understand and modify it to fit your needs.
The awesome thing with crono is that its code is self explained. In order to do a task periodically you can just do:
Crono.perform(YourJob).every 2.days
Maybe you can also do:
Crono.perform(YourJob).every 30.seconds
Anyway you really can do a lot of things. Another example could be:
Crono.perform(TestJob).every 1.week, on: :monday, at: "15:30"
I suggest this gem instead of whenever because whenever uses the Unix cron table, which is not always available.
Throwing out a solution just because it looks somewhat elegant and answers the question without any extra gems. In my scenario I wanted to run some code, but only after all my Sidekiq workers were done doing their thing.
First I defined a method to check if any workers were working...
def workers_working?
  workers = Sidekiq::Workers.new.map do |_process_id, _thread_id, work|
    work
  end
  workers.size > 0
end
Then we just call the method with a loop which sleeps between calls.
sleep 5 while workers_working?
Use something like delayed job, and requeue it every so often?
Use thin or other server which uses eventmachine, then just use timers that are part of eventmachine. Example: in config/application.rb
EM.add_periodic_timer(2) do
do_this_every_2_sec
end

Some questions about using resque

I am using Resque to run a background process. This is how my background process works:
Scans through all the rows in an ActiveRecord model
Checks for a condition
Updates the row if the condition is met
And this needs to go on infinitely.
This is how I am trying to use Resque for my purpose, here's my worker class:
class ThumbnailMaker
  @queue = :thumbnail_queue
  def self.perform()
    MyObj.check_thumbnails(root_url)
  end
end
I understand the perform() method keeps a task in a queue, which is run periodically. In my case, I need a task that scans the whole table, so it runs for a longer time. Is it a good solution to my requirements?
On another note, I need the root url for my Rails application, which is easily obtained with the root_url in Rails Controller. But I need it in a class I have created, can you suggest me how I can get it here?
Resque is for queueing tasks to be run in the background; each item in the queue runs once and then is removed. What you want is more like a scheduled task, for example a custom Rake task or other script that runs from time to time. There are many scheduling gems available for this kind of thing (whenever is very popular), or you can just use cron. There is a great RailsCasts episode about this very topic.
You might want to try putting your code in a rake task and running it periodically through a cron job. Resque/Redis seems a bit too much for your needs.
You may consider passing the root URL in as a parameter if you are calling your class from your controller. Otherwise, you may want to set it as an ENV setting and configure each of your deployments accordingly.
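Both options can be combined, roughly like this. It is a minimal sketch: `APP_ROOT_URL`, `AppConfig`, and the localhost default are hypothetical names and values chosen for the example, and the body of `perform` is only a placeholder for the real work.

```ruby
# Minimal sketch: read the application's root URL from the environment,
# with the option to pass it in explicitly.
# APP_ROOT_URL and AppConfig are hypothetical names chosen for this example.
module AppConfig
  DEFAULT_ROOT_URL = "http://localhost:3000"

  # `env` defaults to the process environment; a plain Hash works for testing.
  def self.root_url(env = ENV)
    env.fetch("APP_ROOT_URL", DEFAULT_ROOT_URL)
  end
end

class ThumbnailMaker
  @queue = :thumbnail_queue

  # A controller can pass its root_url when enqueueing; otherwise the
  # value from the environment is used.
  def self.perform(root_url = AppConfig.root_url)
    root_url # placeholder for MyObj.check_thumbnails(root_url)
  end
end
```

Each deployment would then set `APP_ROOT_URL` to its own address, and code running outside a request, where the controller's `root_url` helper is not available, still gets a usable value.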

Is Rails's "delayed_job" for cron task really?

delayed_job is at http://github.com/collectiveidea/delayed_job
Can delayed_job handle cron-style tasks, such as running a script every night at 1am, or running a script every hour?
If not, what are the suitable gems that can do that? And can it be monitored remotely using a browser, and have logging of success and error?
I worked on a project that tried to use DelayedJob to schedule future items. It sucked.
Instead I recommend you use the whenever gem:
http://github.com/javan/whenever
Whenever is a Ruby gem that provides a clear syntax for defining cron jobs. It outputs valid cron syntax and can even write your crontab file for you. It is designed to work well with Rails applications and can be deployed with Capistrano. Whenever works fine independently as well.
Code looks like this (from github)
every 3.hours do
  runner "MyModel.some_process"
  rake "my:rake:task"
  command "/usr/bin/my_great_command"
end
every 1.day, :at => '4:30 am' do
  runner "MyModel.task_to_run_at_four_thirty_in_the_morning"
end
every :hour do # Many shortcuts available: :hour, :day, :month, :year, :reboot
  runner "SomeModel.ladeeda"
end
every :sunday, :at => '12pm' do # Use any day of the week or :weekend, :weekday
  runner "Task.do_something_great"
end
Here's a RailsCast video on how to use it.
And the corresponding ASCIICast.
I think cron is a better tool for this than delayed_job. I've used delayed_job in a project before, and it really excels at running a task in the background or at a particular time. But for recurring tasks that happen at regular times, I think cron is the best tool.
Check out whenever (and its Railscast) to easily schedule cron jobs that can run rake tasks (or thor, or shell scripts, or anything else.) You can use the rake tasks to update your models and then have some sort of dashboard controller that looks at the various statuses.
You can also use the ClockWork gem:
https://github.com/adamwiggins/clockwork-rails-dj
Clockwork runs as a separate daemon and can be used to trigger jobs of any sort, which either get added to a job queueing system or run right away.
Use Delayed_Job for what it's good for, a job queueing system which can be distributed over multiple nodes (or not).
Use something else to add jobs to the queue at the right time.
I was using rake(or runner)/cron/whenever gem to schedule background tasks but was finding my server load was just so high because I would be getting hit constantly with rake/runner loading up the rails environment.
Delayed_Job workers are your rails daemons that stay running so you aren't constantly firing up Rails every time a background task is required.
Whenever works great.
I also like rufus-scheduler
Create /config/initializers/task_scheduler.rb
Then in that file:
require 'rufus/scheduler'
# rufus-scheduler 3.x; on 2.x this was Rufus::Scheduler.start_new
scheduler = Rufus::Scheduler.new
scheduler.every("1m") do
  DailyDigest.send_digest!
end
I originally found this posted here
I've tried it and it works well.
update
Now that I look back at that link, it's pretty much the only Rails company that I would want to work for. They have made so many gems and added so much to the community. Not to mention they have a huge team!
I run multiple cron delayed_jobs for nightly statistics and report generation, and also for data scraping at certain intervals. Here's how I do it:
https://aaronvb.com/articles/recurring-delayed-job-with-cron.html
I created a gem for this:
https://github.com/sellect/delayed_cron
It works with sidekiq and delayed_job currently. Looking to add resque soon. I know this is a bit late, but it does pretty much exactly what you were looking for.
