How to automate testing successful boot of Sidekiq Process - ruby-on-rails

We've had a couple of issues recently in our Rails 4 app that could have been prevented with a simple check to see that both the main app and a worker process correctly boot up our application.
Our Procfile for Heroku looks like so:
web: bundle exec unicorn -p $PORT -c config/heroku/unicorn.rb
sidekiq: bundle exec sidekiq -q high -q medium -q low
Right now, we run a very large test suite that takes 30 mins to complete, even when bisecting the suite across 10+ containers.
We use CircleCI and are considering adding a test like follows that simulates the Sidekiq bootup process.
require 'spec_helper'

RSpec.describe 'Booting processes' do
  context 'Sidekiq server' do
    it 'loads successfully' do
      boot_sidekiq_sequence = <<-SEQ
        require 'sidekiq/cli'
        Sidekiq::CLI.instance.environment = '#{ENV['RAILS_ENV']}'
        Sidekiq::CLI.instance.send(:boot_system)
      SEQ
      expect(system(%{echo "#{boot_sidekiq_sequence}" | ruby > /dev/null})).to be_truthy,
        "The Sidekiq process could not boot up properly. Run `sidekiq` to troubleshoot"
    end
  end
end
The problem with this is that:
It's not quite a complete test of our application boot process, which would require reading from our Procfile (we also want to protect devs from making breaking changes there)
It's slow and would add approx 30 secs to our test runtime
It's a bit low-level for my liking.
What makes testing the Sidekiq boot process particularly challenging is that it only exits on its own if there is an exception on boot; on success it just keeps running.
Can anyone recommend a better, faster, more thorough approach to automating this kind of test?

For integration testing, it's always best to run the process as close to production as possible, so bundle exec sidekiq .... I wouldn't use the CLI API to boot it in-process.
I would first enqueue a special job which just does this:
def perform
  puts "Sidekiq booted"
  Process.kill('TERM', $$) # terminate ourselves
end
Then execute Sidekiq by running the binary and monitoring STDOUT. If you see /booted/ within 30 seconds, PASS. If you see nothing within 30 seconds, FAIL and kill the child.
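A rough sketch of that watchdog (not the poster's code; the command comes from the Procfile above and the 30-second window from this answer, and it assumes the special job above has already been enqueued):
require 'open3'

def sidekiq_boots?(command = 'bundle exec sidekiq -q high -q medium -q low', timeout = 30)
  Open3.popen2e(command) do |_stdin, out, wait_thr|
    booted   = false
    deadline = Time.now + timeout
    while Time.now < deadline
      break unless IO.select([out], nil, nil, deadline - Time.now) # no output before the deadline
      line = out.gets
      break if line.nil?        # child exited or closed its output
      if line =~ /booted/       # printed by the special job above
        booted = true
        break
      end
    end
    begin
      Process.kill('TERM', wait_thr.pid) # always clean up the child
    rescue Errno::ESRCH
    end
    booted
  end
end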

Thanks to the inspiration here I ended up with a test that executes the actual command from the Procfile:
require 'foreman/procfile'
require 'open3'

# Utilizes a Procfile like this:
#   web: bundle exec puma -C config/puma.rb
#   workers: bundle exec sidekiq
#
describe Sidekiq do
  let(:procfile_name){ 'Procfile' }
  let(:command_name){ 'workers' }
  let(:command){ Foreman::Procfile.new(procfile_name)[command_name] || raise("'#{command_name}' not defined in '#{procfile_name}'") }

  it 'launches without errors' do
    expect(output_from(command)).to match(/Starting processing/)
  end

  def output_from(command)
    Open3.popen2e(*command.split) do |_, out, wait_thr|
      pid = wait_thr.pid
      output_pending = IO.select([out], nil, nil, 10.seconds)
      Process.kill 'TERM', pid
      if output_pending
        out.read
      else
        raise Timeout::Error
      end
    end
  end
end

Related

Using foreman, can we run db migration once the database is started?

I want to run both the database and the migration under foreman. However, they start at the same time when I launch foreman, and since the database has not fully started by the time the migration runs, the migration fails.
On Heroku, the Procfile can define a release phase that runs before the other process types start. Can I do the same using foreman on my computer?
Heroku does not rely on the Procfile to maintain the release process; the Heroku build stack does.
foreman is designed to run multiple processes at the same time, not to run them in order, so sequencing them is not foreman's responsibility.
However, there are other ways to do it.
Simple: since foreman starts each process with a shell command, you can use the shell's sleep (in seconds) to delay a process:
db_process: start_db_script.sh
migration_process: sleep 5; bundle exec rake db:migrate --trace
Full control: instead of running the default migration rake task, write another rake task that checks the database connection before executing the migration (refer to this answer); a rake-task wrapper is sketched after the snippet below.
retried = 0
begin
  # Establish the connection
  ActiveRecord::Base.establish_connection
  # Try to reconnect; this raises if the database cannot be reached
  ActiveRecord::Base.connection.reconnect!
  Rake::Task["db:migrate"].invoke if ActiveRecord::Base.connected?
rescue => e
  retried += 1
  if retried <= 5 # Retry only 5 times
    sleep 1       # Wait 1 second before retrying
    retry
  end
  puts "#{e} -- could not connect to the database within 5 seconds"
end
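For the "full control" option, that retry loop would live in a rake task of its own; a minimal sketch, with made-up file and task names:
# lib/tasks/db_migrate_when_ready.rake -- file and task names are illustrative
namespace :db do
  desc "Wait until the database accepts connections, then run db:migrate"
  task :migrate_when_ready => :environment do
    # ... the establish_connection / reconnect! retry loop shown above,
    # ending with Rake::Task["db:migrate"].invoke ...
  end
end
The Procfile entry would then call bundle exec rake db:migrate_when_ready instead of db:migrate directly.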

Rufus Scheduler not running when rails server runs as daemon

I have a Rails app that uses Rufus Scheduler. When I turn on the rails server with:
rails s --port=4000
Rufus scheduler runs its tasks. If I run the rails server with:
rails s --port=4000 --daemon
Rufus no longer does its tasks. I added a couple of log messages. Here is the schedule code:
class AtTaskScheduler
  def self.start
    scheduler = Rufus::Scheduler.new
    p "Starting Attask scheduler"

    scheduler.every('5m') do
      # test sending hip chat message
      issue = Issue.new
      issue.post_to_hipchat("Starting sync with AtTask", "SYNC")
      p "Launching Sync"
      Issue.synchronize
    end
  end
end
Hipchat never gets the message from the scheduler and the log never gets the statement "Launching Sync".
Any ideas on what may be causing this?
There is documentation of this issue in the rufus-scheduler docs:
There is the handy rails server -d that starts a development Rails as
a daemon. The annoying thing is that the scheduler as seen above is
started in the main process that then gets forked and daemonized. The
rufus-scheduler thread (and any other thread) gets lost, no scheduling
happens.
I avoid running -d in development mode and bother about daemonizing
only for production deployment.
These are two well crafted articles on process daemonization, please
read them:
http://www.mikeperham.com/2014/09/22/dont-daemonize-your-daemons/
http://www.mikeperham.com/2014/07/07/use-runit/
If anyway, you need something like rails server -d, why not try bundle exec unicorn -D
instead? In my (limited) experience, it worked out of the box (well,
had to add gem 'unicorn' to Gemfile first).
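If you do end up with a forking server (for example unicorn with preload_app), a common workaround, offered here as an assumption rather than something from the quoted docs, is to start the scheduler after the worker process forks:
# config/unicorn.rb -- sketch, reusing the AtTaskScheduler class from the question
after_fork do |server, worker|
  # scheduler threads do not survive the fork, so start one in each worker;
  # note that with several workers the job then runs once per worker
  AtTaskScheduler.start
end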

Logging in delayed_job?

I can't get any log output from delayed_job, and I'm not sure my jobs are starting.
Here's my Procfile:
web: bundle exec rails server
worker: bundle exec rake jobs:work
worker: bundle exec clockwork app/clock.rb
And here's the job:
class ScanningJob
  def perform
    logger.info "logging from delayed_job"
  end

  def after(job)
    Rails.logger.info "logging from after delayed_job"
  end
end
I see that clockwork writes to standard out, and I can see the worker executor starting, but my log statements never show up. I tried puts as well, to no avail.
My clock file is pretty simple:
every(3.seconds, 'refreshlistings') { Delayed::Job.enqueue ScanningJob.new }
I just want to see this working, and lack of logging means I can't. What's going on here?
When I needed log output from Delayed Jobs, I found this question to be fairly helpful.
In config/initializers/delayed_job.rb I add the line:
Delayed::Worker.logger = Logger.new(File.join(Rails.root, 'log', 'dj.log'))
Then, in my job, I output to this log with:
Delayed::Worker.logger.info("Log Entry")
This results in the file log/dj.log being written to. This works in development, staging, and production environments.

Start or ensure that Delayed Job runs when an application/server restarts

We have to use delayed_job (or some other background-job processor) to run jobs in the background, but we're not allowed to change the boot scripts/boot-levels on the server. This means that the daemon is not guaranteed to remain available if the provider restarts the server (since the daemon would have been started by a capistrano recipe that is only run once per deployment).
Currently, the best way I can think of to ensure the delayed_job daemon is always running, is to add an initializer to our Rails application that checks if the daemon is running. If it's not running, then the initializer starts the daemon, otherwise, it just leaves it be.
The question, therefore, is: how do we detect from inside a script that the delayed_job daemon is running? (We should be able to start up a daemon fairly easily, but I don't know how to detect whether one is already active.)
Anyone have any ideas?
Regards,
Bernie
Based on the answer below, this is what I came up with. Just put it in config/initializers and you're all set:
# config/initializers/delayed_job.rb
DELAYED_JOB_PID_PATH = "#{Rails.root}/tmp/pids/delayed_job.pid"

def start_delayed_job
  Thread.new do
    `ruby script/delayed_job start`
  end
end

def process_is_dead?
  begin
    pid = File.read(DELAYED_JOB_PID_PATH).strip
    Process.kill(0, pid.to_i)
    false
  rescue
    true
  end
end

if !File.exist?(DELAYED_JOB_PID_PATH) || process_is_dead?
  start_delayed_job
end
Some more cleanup ideas: the "begin" is not needed. You should rescue only "no such process" (Errno::ESRCH) so that you don't fire new processes when something else goes wrong. Rescue "no such file or directory" (Errno::ENOENT) as well, to simplify the condition.
DELAYED_JOB_PID_PATH = "#{Rails.root}/tmp/pids/delayed_job.pid"

def start_delayed_job
  Thread.new do
    `ruby script/delayed_job start`
  end
end

def daemon_is_running?
  pid = File.read(DELAYED_JOB_PID_PATH).strip
  Process.kill(0, pid.to_i)
  true
rescue Errno::ENOENT, Errno::ESRCH # file or process not found
  false
end

start_delayed_job unless daemon_is_running?
Keep in mind that this code won't work if you start more than one worker. And check out the "-m" argument of script/delayed_job which spawns a monitor process along with the daemon(s).
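For example, starting the worker together with the monitor process would look something like this (assuming the stock script/delayed_job wrapper):
ruby script/delayed_job -m start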
Check for the existence of the daemon's PID file (File.exist? ...). If it's there, assume it's running; otherwise, start it up.
Thank you for the solution provided in the question (and the answer that inspired it :-) ); it works for me, even with multiple workers (Rails 3.2.9, Ruby 1.9.3p327).
It worried me that I might forget to restart delayed_job after making some changes to lib, for example, and end up debugging for hours before realizing it.
I added the following to my script/rails file so that the code provided in the question executes every time we start Rails, but not every time a worker starts:
puts "cleaning up delayed job pid..."
dj_pid_path = File.expand_path('../../tmp/pids/delayed_job.pid', __FILE__)
begin
File.delete(dj_pid_path)
rescue Errno::ENOENT # file does not exist
end
puts "delayed_job ready."
A little drawback I'm facing with this, though, is that it also gets called by rails generate, for example. I did not spend much time looking for a solution to that, but suggestions are welcome :-)
Note that if you're using unicorn, you might want to add the same code to config/unicorn.rb before the before_fork call.
-- EDITED:
After playing around a little more with the solutions above, I ended up doing the following:
I created a file script/start_delayed_job.rb with the content:
puts "cleaning up delayed job pid..."
dj_pid_path = File.expand_path('../../tmp/pids/delayed_job.pid', __FILE__)
def kill_delayed(path)
begin
pid = File.read(path).strip
Process.kill(0, pid.to_i)
false
rescue
true
end
end
kill_delayed(dj_pid_path)
begin
File.delete(dj_pid_path)
rescue Errno::ENOENT # file does not exist
end
# spawn delayed
env = ARGV[1]
puts "spawing delayed job in the same env: #{env}"
# edited, next line has been replaced with the following on in order to ensure delayed job is running in the same environment as the one that spawned it
#Process.spawn("ruby script/delayed_job start")
system({ "RAILS_ENV" => env}, "ruby script/delayed_job start")
puts "delayed_job ready."
Now I can require this file anywhere I want, including 'script/rails' and 'config/unicorn.rb' by doing:
# at the top of script/rails
START_DELAYED_PATH = File.expand_path('../start_delayed_job', __FILE__)
require "#{START_DELAYED_PATH}"

# in config/unicorn.rb, before the before_fork block, with a different expand_path
START_DELAYED_PATH = File.expand_path('../../script/start_delayed_job', __FILE__)
require "#{START_DELAYED_PATH}"
not great, but works
disclaimer: I say not great because this causes a periodic restart, which for many will not be desirable. And simply trying to start can cause problems because the implementation of DJ can lock up the queue if duplicate instances are created.
You could schedule cron tasks that run periodically to start the job(s) in question. Since DJ treats start commands as no-ops when the job is already running, it just works. This approach also takes care of the case where DJ dies for some reason other than a host restart.
# crontab example
0 * * * * /bin/bash -l -c 'cd /var/your-app/releases/20151207224034 && RAILS_ENV=production bundle exec script/delayed_job --queue=default -i=1 restart'
If you are using a gem like whenever this is pretty straightforward.
every 1.hour do
  script "delayed_job --queue=default -i=1 restart"
  script "delayed_job --queue=lowpri -i=2 restart"
end

Keeping a rake job running

I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
Currently I have an issue where the rake jobs:work task, currently started with 'nohup rake jobs:work &' manually, is randomly exiting.
While God seems to be a solution to some people, the extra memory overhead is rather annoying and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server, with some horrid hacks to give the unprivileged account the site deploys under the ability to restart it?
For me the daemons gem was unreliable with delayed_job. It could have been a poorly written script (I was using the one on collectiveidea's delayed_job GitHub page) rather than daemons' fault; I'm not really sure. But for whatever reason, it would restart inconsistently on deployments.
I read somewhere this was due to it not waiting for the process to actually exit, so the pid files would get overwritten or something. But I didn't really bother to investigate. I switched to the daemons-spawn gem using these instructions and it seems to be much more reliable now.
The delayed_job docs suggest that you use a monitoring service to manage the rake worker job(s). I use runit--works well.
(You can install it in the mode where it does not replace init.)
Added:
Re: restart by Capistrano: yes, runit enables that. Just do a
sudo sv kill delayed_job
in your Capistrano recipe to kill the delayed_job worker. Runit will then restart it with your newly deployed code base.
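In a Capistrano 2-style recipe that could look roughly like this (the task name and hook are assumptions, not from the answer above):
# config/deploy.rb -- sketch; runit brings delayed_job back up with the new code
namespace :delayed_job do
  task :restart, :roles => :app do
    run "sudo sv kill delayed_job"
  end
end

after "deploy:restart", "delayed_job:restart"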
I have implemented a small rake task that restarts the jobs task over and over again:
desc "Start a delayed_job worker in a endless loop to prevent exits."
task :jobs => :environment do
while true
begin
Delayed::Worker.new(:min_priority => ENV['MIN_PRIORITY'],
:max_priority => ENV['MAX_PRIORITY'],
:quiet => false).start
rescue Exception => e
puts "Exception occured (#{e})"
end
puts "Task jobs:work exited, clearing queue and restarting"
sleep 1
Delayed::Job.delete_all
end
end
Apparently it did not work, so I ended up with this simple solution:
for (( ;; )); do rake jobs:work --trace; done
get rid of delayed job and use either whenever or resque
