Can I run a rake task to simulate sidekiq running?

I have a rails app that I want to use sidekiq on.
Before I figure out how to set up sidekiq on the server etc., is it possible for me to create a rake task that will run the sidekiq workers for any pending jobs?
This isn't a production solution, I know, but I just want to make sure everything else is working on the server; for the time being, running a rake task on the server myself is fine, as this is more for QA at this point.

Log into the server, cd to the app's dir and run 'bundle exec sidekiq' manually.

Referencing the SortedEntry and ScheduledSet APIs, I managed to come up with the following:
# ScheduledSet returns "pending" (scheduled) jobs
scheduled_jobs = Sidekiq::ScheduledSet.new
# puts scheduled_jobs.class
# => Sidekiq::ScheduledSet
scheduled_jobs.each do |scheduled_job|
  # puts scheduled_job.class
  # => Sidekiq::SortedEntry
  job_class = scheduled_job.klass.constantize
  job_args = scheduled_job.args
  # you can also filter to only jobs in a certain queue, e.g.:
  # if scheduled_job.queue == 'mailers' ...
  # run the job inline
  job_class.new.perform(*job_args)
end
Then just wrap this into a rake task.
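For instance, a minimal wrapper might look like this (file and task names are illustrative; note that jobs enqueued for immediate execution sit in Sidekiq::Queue rather than the ScheduledSet, so you may want to drain that the same way):
# lib/tasks/run_pending_jobs.rake -- hypothetical name
require 'sidekiq/api'

desc 'Run pending Sidekiq jobs inline (QA only)'
task run_pending_jobs: :environment do
  Sidekiq::ScheduledSet.new.each do |job|
    job.klass.constantize.new.perform(*job.args)
    job.delete # remove the entry so it isn't run twice
  end
end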

Related

Rails: managing multiple sidekiqs without upstart script

Once upon a time, I had one app - Cashyy. It used sidekiq. I deployed it and used this upstart script to manage sidekiq (start/restart).
I decided to deploy another app to the same server. The app (let's call it Giimee) also uses sidekiq.
And here is the issue: sometimes I need to restart sidekiq for Cashyy, but not for Giimee. As I understand it, I would need to hack something together using the index option (in the upstart script, and managing the sidekiqs with sudo restart sidekiq index=1).
BUT!
I have zero desire to dabble with these indexes (a nightmare to support? You need to know how many apps are using sidekiq, make sure each sidekiq gets a unique index, and know the assigned index when you want to restart a specific one).
So here is the question: how can I isolate each sidekiq (so I would not need to maintain the index) and still get the stability and usability of upstart (starting the process, restarting, etc.)?
Or maybe I don't understand something and the index approach is the state of the art?
You create two services:
cp sidekiq.conf /etc/init/cashyy.conf
cp sidekiq.conf /etc/init/giimee.conf
Edit each as necessary. sudo start cashyy, sudo stop giimee, etc. Now you'll have two completely separate Sidekiq processes running.
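For reference, each conf file can stay quite small. A minimal sketch (the deploy user, app path, and environment here are assumptions):
# /etc/init/cashyy.conf
description "Sidekiq for Cashyy"
start on runlevel [2345]
stop on runlevel [06]
respawn
setuid deploy

script
  cd /var/www/cashyy/current
  exec bundle exec sidekiq -e production
end script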
As an alternative to an upstart script, you can use Capistrano and Capistrano-Sidekiq to manage those Sidekiqs.
We have Sidekiq running on 3 machines and have had a good experience with these two libraries/tools.
Note: we currently use an older version of Capistrano (2.15.5)
In our architecture, the three machines are customized slightly on deploy. This led us to break up our capistrano deploy scripts by machine so that we could customize some classes, manage Sidekiq, etc. Our capistrano files are structured something like this:
- config/
  - deploy.rb
  - deploy/
    - gandalf.rb
    - gollum.rb
    - legolas.rb
With capistrano-sidekiq, we are able to control, well, Sidekiq :) at any time (during a deploy or otherwise). We set up the Sidekiq aspects of our deploy scripts in the following way:
# config/deploy.rb
# global sidekiq settings
set :sidekiq_default_hooks, false
set :sidekiq_cmd, "#{fetch(:bundle_cmd, 'bundle')} exec sidekiq"
set :sidekiqctl_cmd, "#{fetch(:bundle_cmd, 'bundle')} exec sidekiqctl"
set :sidekiq_role, :app
set :sidekiq_pid, "#{current_path}/tmp/pids/sidekiq.pid"
set :sidekiq_env, fetch(:rack_env, fetch(:rails_env, fetch(:default_stage)))
set :sidekiq_log, File.join(shared_path, 'log', 'sidekiq.log')
# config/deploy/gandalf.rb
# Custom Sidekiq settings
set :sidekiq_timeout, 30
set :sidekiq_processes, 1
namespace :sidekiq do
  # .. code omitted from methods and tasks for brevity
  def for_each_process(&block)
  end

  desc 'Quiet sidekiq (stop accepting new work)'
  task :quiet, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end

  desc 'Stop sidekiq'
  task :stop, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end

  desc 'Start sidekiq'
  task :start, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end

  desc 'Restart sidekiq'
  task :restart, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end
end
When I need to restart one of my Sidekiq instances, I can just go to my terminal and execute the following:
$ bundle exec cap gandalf sidekiq:restart
$ bundle exec cap gollum sidekiq:stop
It's made Sidekiq management quite painless for our team, and we thought it would be worth sharing in case something similar could help you out.

Where do I enqueue jobs into ActiveJob in Rails 4.2?

I am a beginner when it comes to Rails. I am trying to follow this example:
http://ryanselk.com/2014/09/25/using-background-jobs-in-rails-42-with-active-job/
It says:
"Jobs can be added to the job queue from anywhere. We can add a job to the queue by: ResizeImage.perform_later 'http://example.com/ex.png' "
[UPDATE] I came up with this task:
namespace :simple do
  # call from the command line:
  #   rake simple:resize_images
  desc "Resize images"
  task resize_images: :environment do
    Dir.foreach('storage') do |next_image|
      puts next_image
      next if next_image == '.' or next_image == '..'
      ResizeImage.perform_later next_image
    end
  end
end
but now I do:
rake simple:resize_images
and I get:
zacek2_phpP9JGif.jpg
rake aborted!
NameError: uninitialized constant ResizeImage
I've tried:
require ResizeImage
but that did not fix the problem.
I am afraid I don't understand how loading works in Rails. How do I load ResizeImage?
Do I set it up as a cron job?
No.
From the rails guides:
Active Job is a framework for declaring jobs and making them run on a variety of queueing backends.
Active Job is an interface to queueing backends such as sidekiq, delayed_job or resque. It's simply a way for you to write background jobs where you don't have to care about which of the queueing backends will be used.
How do I start ActiveJob?
So ActiveJob doesn't run background jobs on its own. You're still missing one of the backends. Say you have decided to use delayed_job: get it installed and then start it via:
script/delayed_job start
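For Rails 4.2, wiring delayed_job up as the Active Job backend would look roughly like this (the standard delayed_job_active_record setup):
# Gemfile
gem 'delayed_job_active_record'

# config/application.rb
config.active_job.queue_adapter = :delayed_job

# then, from the command line:
#   rails generate delayed_job:active_record
#   rake db:migrate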
I don't understand where "anywhere" is.
That means anywhere in your code, you could write something like:
# app/models/user.rb
def send_registration_email
  UserRegistrationMailJob.perform_later self
end
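As for the NameError in the update: Rails autoloads job classes from app/jobs, so defining the job there should make the constant resolve without any require. A minimal sketch (the resize logic is elided):
# app/jobs/resize_image.rb
class ResizeImage < ActiveJob::Base
  queue_as :default

  def perform(image_name)
    # actual resize logic goes here
  end
end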

Logging in delayed_job?

I can't get any log output from delayed_job, and I'm not sure my jobs are starting.
Here's my Procfile:
web: bundle exec rails server
worker: bundle exec rake jobs:work
clock: bundle exec clockwork app/clock.rb
And here's the job:
class ScanningJob
  def perform
    logger.info "logging from delayed_job"
  end

  def after(job)
    Rails.logger.info "logging from after delayed_job"
  end
end
I see that clockwork writes to standard out, and I can see the worker executor starting, but my log statements never show up. I tried puts as well, to no avail.
My clock file is pretty simple:
every(3.seconds, 'refreshlistings') { Delayed::Job.enqueue ScanningJob.new }
I just want to see this working, and lack of logging means I can't. What's going on here?
When I needed log output from Delayed Jobs, I found this question to be fairly helpful.
In config/initializers/delayed_job.rb I add the line:
Delayed::Worker.logger = Logger.new(File.join(Rails.root, 'log', 'dj.log'))
Then, in my job, I output to this log with:
Delayed::Worker.logger.info("Log Entry")
This results in the file log/dj.log being written to. This works in development, staging, and production environments.
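Applied to the job above, that would look something like this (only the logger call changes):
class ScanningJob
  def perform
    # goes to log/dj.log via the worker's logger
    Delayed::Worker.logger.info "logging from delayed_job"
  end
end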

Rake tasks and rails initializers

Kinda new to Rails, so please bear with me. What I'm doing now is background-processing some Ruby code using Resque. To get the Resque rake task started on Heroku, I have a resque.rake file with the recommended code to attach into Heroku's magical (or strange) threading architecture:
require "resque/tasks"
require 'resque_scheduler/tasks'
task "resque:setup" => :environment do
ENV['QUEUE'] = '*'
end
desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "resque:work"
Since I need access to the Rails code, I reference :environment. If I set at least one worker dyno in the background on Heroku, my Resque does great: the queue gets cleared, everything is happy. Until I try to automate stuff...
So I wanted to evolve the code and automatically fill the queue with relevant tasks every minute or so. To do that (without using cron, because Heroku's cron support is inadequate), I declared an initializer named task_scheduler.rb that uses the Rufus scheduler to run tasks:
scheduler = Rufus::Scheduler.start_new

scheduler.in '5s' do
  autoprocessor_method
end

scheduler.every '1m' do
  autoprocessor_method
end
Things appear to work great for a while... then the rake process just stops picking up from the queue, inexplicably. The queue just gets larger and larger. Even if I have multiple worker dynos running, they all eventually get tired and stop processing it. I'm not sure what I am doing wrong, but I suspect that referencing the Rails environment in my rake task causes the task_scheduler.rb code to run again, leading to duplicate scheduling. I'm wondering how to solve that problem if someone knows, and I'm also curious whether that is the reason for the rake task to stop working.
You should not be booting the scheduler in an initializer, you should have a daemon process running the scheduler and filling up your queue. It would be something like this ("script/scheduler"):
#!/usr/bin/env ruby
root = File.expand_path(File.join(File.dirname(__FILE__), '..'))
Dir.chdir(root)

require 'rubygems'
gem 'daemons'
require 'daemons'

options = {
  :dir_mode   => :normal,
  :dir        => File.join(root, 'log'),
  :log_output => true,
  :backtrace  => true,
  :multiple   => false
}

Daemons.run_proc("scheduler", options) do
  Dir.chdir(root)
  require(File.join(root, 'config', 'environment'))

  scheduler = Rufus::Scheduler.start_new
  scheduler.in '5s' do
    autoprocessor_method
  end
  scheduler.every '1m' do
    autoprocessor_method
  end
end
And you can call this script as a usual daemon from your app:
script/scheduler start
This is going to make sure you have only one process sending work for the resque workers instead of one for each mongrel that you're running.
First of all, if you are not running on Heroku, I would not recommend this approach. Look at Mauricio's answer instead, or consider a classic cron job, or the whenever gem to schedule the cron job.
But if you are stuck with running this on Heroku, here is how I got it to work.
I kept the same original resque.rake code in place, as pasted in the original question. In addition, I created another rake task and attached it to the jobs:work task, just as in the first case:
desc "Scheduler processor"
task :scheduler => :environment do
autoprocess_method
scheduler = Rufus::Scheduler.start_new
scheduler.every '1m' do
twitter_autoprocess
end
end
desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "scheduler"
A couple of notes:
This will be imperfect once you use more than one worker dyno, because the scheduler will run in more than one place. You can solve that by saving state somewhere (see the Redis-lock sketch after these notes), but it's not as clean as I would like.
I found the original reason why the process would hang. It was this line of code:
scheduler.in '5s' do
  autoprocessor_method
end
I'm not sure why, but when I removed that, it never hung again.
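As for saving state so that multiple dynos don't double-schedule, one option is a short-lived Redis lock around the scheduled block. A sketch, assuming redis-rb 3+ (the key name and TTL are arbitrary):
scheduler.every '1m' do
  # SET with NX/EX: only the dyno that wins the lock runs the job this minute
  if Resque.redis.set('scheduler:lock', '1', nx: true, ex: 55)
    autoprocessor_method
  end
end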

Start or ensure that Delayed Job runs when an application/server restarts

We have to use delayed_job (or some other background-job processor) to run jobs in the background, but we're not allowed to change the boot scripts/boot-levels on the server. This means that the daemon is not guaranteed to remain available if the provider restarts the server (since the daemon would have been started by a capistrano recipe that is only run once per deployment).
Currently, the best way I can think of to ensure the delayed_job daemon is always running, is to add an initializer to our Rails application that checks if the daemon is running. If it's not running, then the initializer starts the daemon, otherwise, it just leaves it be.
The question, therefore, is: how do we detect from inside a script that the delayed_job daemon is running? (We should be able to start up a daemon fairly easily, but I don't know how to detect whether one is already active.)
Anyone have any ideas?
Based on the answer below, this is what I came up with. Just put it in config/initializers and you're all set:
# config/initializers/delayed_job.rb
DELAYED_JOB_PID_PATH = "#{Rails.root}/tmp/pids/delayed_job.pid"

def start_delayed_job
  Thread.new do
    `ruby script/delayed_job start`
  end
end

def process_is_dead?
  begin
    pid = File.read(DELAYED_JOB_PID_PATH).strip
    Process.kill(0, pid.to_i)
    false
  rescue
    true
  end
end

# note: || rather than && -- the daemon should also be started when the
# pid file exists but the process behind it has died
if !File.exist?(DELAYED_JOB_PID_PATH) || process_is_dead?
  start_delayed_job
end
Some more cleanup ideas: The "begin" is not needed. You should rescue "no such process" in order not to fire new processes when something else goes wrong. Rescue "no such file or directory" as well to simplify the condition.
DELAYED_JOB_PID_PATH = "#{Rails.root}/tmp/pids/delayed_job.pid"

def start_delayed_job
  Thread.new do
    `ruby script/delayed_job start`
  end
end

def daemon_is_running?
  pid = File.read(DELAYED_JOB_PID_PATH).strip
  Process.kill(0, pid.to_i)
  true
rescue Errno::ENOENT, Errno::ESRCH # file or process not found
  false
end

start_delayed_job unless daemon_is_running?
Keep in mind that this code won't work if you start more than one worker. And check out the "-m" argument of script/delayed_job which spawns a monitor process along with the daemon(s).
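For example (the -m and -n flags are documented delayed_job options; the worker count here is illustrative):
script/delayed_job -n 2 -m start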
Check for the existence of the daemon's PID file (File.exist? ...). If it's there, assume it's running; otherwise start it up.
Thank you for the solution provided in the question (and the answer that inspired it :-) ), it works for me, even with multiple workers (Rails 3.2.9, Ruby 1.9.3p327).
It worries me that I might forget to restart delayed_job after making some changes to lib, for example, causing me to debug for hours before realizing that.
I added the following to my script/rails file in order to allow the code provided in the question to execute every time we start Rails, but not every time a worker starts:
puts "cleaning up delayed job pid..."
dj_pid_path = File.expand_path('../../tmp/pids/delayed_job.pid', __FILE__)
begin
File.delete(dj_pid_path)
rescue Errno::ENOENT # file does not exist
end
puts "delayed_job ready."
A little drawback I'm facing with this, though, is that it also gets called by rails generate, for example. I did not spend much time looking for a solution to that, but suggestions are welcome :-)
Note that if you're using unicorn, you might want to add the same code to config/unicorn.rb before the before_fork call.
EDIT: After playing around a little more with the solutions above, I ended up doing the following.
I created a file script/start_delayed_job.rb with the content:
puts "cleaning up delayed job pid..."
dj_pid_path = File.expand_path('../../tmp/pids/delayed_job.pid', __FILE__)
def kill_delayed(path)
begin
pid = File.read(path).strip
Process.kill(0, pid.to_i)
false
rescue
true
end
end
kill_delayed(dj_pid_path)
begin
File.delete(dj_pid_path)
rescue Errno::ENOENT # file does not exist
end
# spawn delayed
env = ARGV[1]
puts "spawing delayed job in the same env: #{env}"
# edited, next line has been replaced with the following on in order to ensure delayed job is running in the same environment as the one that spawned it
#Process.spawn("ruby script/delayed_job start")
system({ "RAILS_ENV" => env}, "ruby script/delayed_job start")
puts "delayed_job ready."
Now I can require this file anywhere I want, including script/rails and config/unicorn.rb, by doing:
# at the top of script/rails
START_DELAYED_PATH = File.expand_path('../start_delayed_job', __FILE__)
require "#{START_DELAYED_PATH}"

# in config/unicorn.rb, before the before_fork call (note the different expand_path)
START_DELAYED_PATH = File.expand_path('../../script/start_delayed_job', __FILE__)
require "#{START_DELAYED_PATH}"
Not great, but it works.
Disclaimer: I say "not great" because this causes a periodic restart, which for many will not be desirable. And simply trying to start can cause problems, because the implementation of DJ can lock up the queue if duplicate instances are created.
You could schedule cron tasks that run periodically to start the job(s) in question. Since DJ treats start commands as no-ops when the job is already running, it just works. This approach also takes care of the case where DJ dies for some reason other than a host restart.
# crontab example
0 * * * * /bin/bash -l -c 'cd /var/your-app/releases/20151207224034 && RAILS_ENV=production bundle exec script/delayed_job --queue=default -i=1 restart'
If you are using a gem like whenever this is pretty straightforward.
every 1.hour do
  script "delayed_job --queue=default -i=1 restart"
  script "delayed_job --queue=lowpri -i=2 restart"
end
