Sidekiq jobs not persisting data on a Rails app

I have the following setup: an Ubuntu server with nginx and Passenger, a Postgres database, Redis and Sidekiq.
I have set up an Upstart job to keep an eye on the Sidekiq process.
description "Sidekiq Background Worker"
# no "start on", we don't want to automatically start
stop on (stopping workers or runlevel [06])
# change to match your deployment user
setuid deploy
setgid deploy
respawn
respawn limit 3 30
# TERM and USR1 are sent by sidekiqctl when stopping sidekiq. Without declaring these as normal exit codes, it just respawns.
normal exit 0 TERM USR1
script
# this script runs in /bin/sh by default
# respawn as bash so we can source in RVM
exec /bin/bash <<EOT
# use syslog for logging
exec &> /dev/kmsg
# pull in the deploy user's RVM
export HOME=/home/deploy
export RAILS_ENV=production
source /home/deploy/.rvm/scripts/rvm
cd /home/deploy/myapp/current
bundle exec sidekiq -c 25 -L log/sidekiq.log -P tmp/pids/sidekiq.pid -q default -q payment -e production
EOT
end script
This actually works: I can start, stop and restart the process, and I can see the process in the Sidekiq web monitor. When I run these workers, they do get processed. Here are both workers:
class EventFinishedWorker
include Sidekiq::Worker
def perform(event_id)
event = Event.find(event_id)
score_a = event.team_A_score
score_b = event.team_B_score
event.bets.each do |bet|
if (score_a == bet.team_a_score) && (score_b == bet.team_b_score)
bet.guessed = true
bet.save
end
end
end
end
and
class PayPlayersBetsWorker
include Sidekiq::Worker
sidekiq_options :queue => :payment
def perform(event_id)
event = Event.find(event_id)
bets = event.bets.where(guessed: true)
total_winning_bets = bets.count
unless total_winning_bets == 0
pot = event.pot
amount_to_pay = pot / total_winning_bets
unless bets.empty?
bets.each do |bet|
account = bet.user.account
user = User.find_by_id(bet.user.id)
bet.payed = true
bet.save
account.wincoins += amount_to_pay
account.accum_wincoins.increment(amount_to_pay)
account.save
user.set_score(user.current_score)
end
end
end
end
end
2014-05-27T18:48:34Z 5985 TID-7agy8 EventFinishedWorker JID-33ef5d7e7de51189d698c1e7 INFO: start
2014-05-27T18:48:34Z 5985 TID-5s4rg PayPlayersBetsWorker JID-e8473aa1bc59f0b958d23509 INFO: start
2014-05-27T18:48:34Z 5985 TID-5s4rg PayPlayersBetsWorker JID-e8473aa1bc59f0b958d23509 INFO: done: 0.07 sec
2014-05-27T18:48:35Z 5985 TID-7agy8 EventFinishedWorker JID-33ef5d7e7de51189d698c1e7 INFO: done: 0.112 sec
I get no errors running either worker, but ...
Checking my data, EventFinishedWorker does its job with no problems whatsoever. The second job, PayPlayersBetsWorker, doesn't work. But when I go into the console and do something like:
worker = PayPlayersBetsWorker.new
worker.perform(1)
it works; it runs the job flawlessly. I have also noticed that if I run Sidekiq directly from the console, without Upstart, it works too.
bundle exec sidekiq -P tmp/pids/sidekiq.pid -q default -q payment -e production
This works. The problem seems to be running Sidekiq with Upstart. Help would be appreciated :D

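It looks like you have a race condition between the two jobs. EventFinishedWorker has not finished when PayPlayersBetsWorker runs its query, so the query finds no guessed bets and the job exits immediately. Your log shows this: PayPlayersBetsWorker reports done at 18:48:34, before EventFinishedWorker reports done at 18:48:35.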
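One way to remove such a race is to stop enqueuing both jobs at the same time and let the first job trigger the second once it has finished. The sketch below is plain Ruby with in-memory stand-ins: `Bet`, `BETS`, `finish_event` and `pay_players` are hypothetical names used only for illustration. In the real app, the assumption is that the last line of `EventFinishedWorker#perform` would call `PayPlayersBetsWorker.perform_async(event_id)`.

```ruby
# In-memory stand-in for the ActiveRecord model (hypothetical, illustration only).
Bet = Struct.new(:team_a_score, :team_b_score, :guessed, :payed)

BETS = [
  Bet.new(2, 1, false, false), # matches the final score used below
  Bet.new(0, 0, false, false)  # does not match
]

# Second step: pay every bet that is already marked as guessed.
# Returns the number of bets paid.
def pay_players(bets)
  winners = bets.select(&:guessed)
  winners.each { |bet| bet.payed = true }
  winners.size
end

# First step: mark the winning bets, then trigger the payout.
# With Sidekiq the last line would instead be
#   PayPlayersBetsWorker.perform_async(event_id)
# so the payout job is only enqueued after the marking has been saved.
def finish_event(bets, score_a, score_b)
  bets.each do |bet|
    bet.guessed = true if bet.team_a_score == score_a && bet.team_b_score == score_b
  end
  pay_players(bets)
end

# The race reproduced: paying before marking finds no winners.
puts pay_players(BETS)
# Chained order: marking runs first, so the payout sees the winning bet.
puts finish_event(BETS, 2, 1)
```

Chaining keeps the ordering explicit in the code instead of relying on which worker thread happens to pick up which job first.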

Related

How to restart Resque workers automatically when Redis server restarts

I have a Rails application that runs jobs in the background using the Resque adapter. I have noticed that once every couple of days my workers disappear (just stop), my jobs get stuck in the queue, and I have to restart the workers every time they stop.
I check using ps -e -o pid,command | grep [r]esque and launch workers in the background using
(RAILS_ENV=production PIDFILE=./resque.pid BACKGROUND=yes bundle exec rake resque:workers QUEUE='*' COUNT='12') 2>&1 | tee -a log/resque.log.
Then I stopped redis-server using /etc/init.d/redis-server stop and checked the worker processes again. They had disappeared.
This suggests that the worker processes stop because the Redis server restarts for some reason.
Is there a Rails/Ruby solution to this problem? What comes to mind is writing a simple Ruby script that checks the worker processes periodically, say every 5 seconds, and restarts them if they have stopped.
UPDATE:
I don't want to use tools such as Monit, God, eye, etc. They are not reliable, so I would need to watch them too: install God to manage the Resque workers, then install Monit to watch God, ...
UPDATE
This is what I am using, and it really works. I manually stopped redis-server and then started it again. This script successfully relaunched the workers.
require 'logger'
module Watch
def self.workers_dead?
processes = `ps -e -o pid,command | grep [r]esque`
return true if processes.empty?
false
end
def self.check(time_interval)
logger = Logger.new('watch.log', 'daily')
logger.info("Starting watch")
while(true) do
if workers_dead?
logger.warn("Workers are dead")
restart_workers(logger)
end
sleep(time_interval)
end
end
def self.restart_workers(logger)
logger.info("Restarting workers...")
`cd /var/www/agts-api && (RAILS_ENV=production PIDFILE=./resque.pid BACKGROUND=yes rake resque:workers QUEUE='*' COUNT='12') 2>&1 | tee -a log/resque.log`
end
end
Process.daemon(true,true)
# write the PID file next to this script, e.g. /path/to/watch.rb.pid
pid_file = "#{File.expand_path(__FILE__)}.pid"
File.open(pid_file, 'w') { |f| f.write Process.pid }
Watch.check 10
You can use process monitoring tools such as monit, god, eye etc. These tools can check the Resque PID and memory usage at an interval you specify. You can also have them restart background processes when memory usage exceeds a limit you set. Personally, I use the eye gem.
You could do it much simpler. Start Resque in foreground. When it exits, start it again. No pid files, no monitoring, no sleep.
require 'logger'
class Restarter
def initialize(cmd:, logger: Logger.new(STDOUT))
@cmd = cmd
@logger = logger
end
def start
loop do
@logger.info("Starting #{@cmd}")
system(@cmd)
@logger.warn("Process exited: #{@cmd}")
end
end
end
restarter = Restarter.new(
cmd: "cd /var/www/agts-api && (RAILS_ENV=production rake resque:workers QUEUE='*' COUNT='12') 2>&1 | tee -a log/resque.log",
logger: Logger.new('watch.log', 'daily')
)
restarter.start
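A restart loop like the one above starts the command again immediately, so a command that fails right away (for example while Redis is still down) would be relaunched in a tight loop. The following is a minimal sketch of the same idea with a pause between runs. `BackoffRestarter`, `delay` and `max_runs` are illustrative names and parameters, not part of Resque, and `false` is used only as a stand-in for a command that exits at once.

```ruby
require 'logger'

# Variation on the restart loop: pause between runs so a command that
# fails immediately does not get relaunched in a tight loop.
class BackoffRestarter
  def initialize(cmd:, delay: 5, max_runs: nil, logger: Logger.new($stdout))
    @cmd = cmd
    @delay = delay        # seconds to wait before starting the command again
    @max_runs = max_runs  # nil means restart forever
    @logger = logger
  end

  # Runs the command in the foreground and starts it again whenever it exits.
  # Returns the number of runs once max_runs is reached.
  def start
    runs = 0
    loop do
      @logger.info("Starting #{@cmd}")
      system(@cmd)
      runs += 1
      @logger.warn("Process exited: #{@cmd}")
      return runs if @max_runs && runs >= @max_runs
      sleep(@delay)
    end
  end
end

# Demo with a command that exits immediately; logging is disabled with nil.
demo = BackoffRestarter.new(cmd: "false", delay: 0.1, max_runs: 3, logger: Logger.new(nil))
puts demo.start
```

For real workers, `max_runs` would be left at `nil` and `cmd` would be the Resque command from the answer.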

Capistrano not restarting Sidekiq

I have Capistrano deploying my app to an Ubuntu remote server on a cloud host. It works, except that Sidekiq does not get restarted. After a deploy, new Sidekiq jobs are stuck in the queue until it finally does get restarted. Currently I SSH into the machine manually and run sudo initctl stop/start workers, which works. I am not very strong with Capistrano, and my research so far has failed to find a solution. I am hoping I am missing something obvious to someone more familiar with it than me. Here is the relevant portion of my /config/deploy.rb file:
namespace :deploy do
namespace :sidekiq do
task :quiet do
on roles(:app) do
puts capture("pgrep -f 'workers' | xargs kill -USR1")
end
end
task :restart do
on roles(:app) do
execute :sudo, :initctl, :stop, :workers
execute :sudo, :initctl, :start, :workers
end
end
end
after 'deploy:starting', 'sidekiq:quiet'
after 'deploy:reverted', 'sidekiq:restart'
after 'deploy:published', 'sidekiq:restart'
end
UPDATE
From my deploy logs:
DEBUG [268bc235] Running /usr/bin/env kill -0 $( cat /home/ubuntu/staging/shared/tmp/pids/sidekiq-0.pid ) as ubuntu@159.203.8.242
DEBUG [268bc235] Command: cd /home/ubuntu/staging/releases/20160806065537 && ( export RBENV_ROOT="$HOME/.rbenv" RBENV_VERSION="2.2.3" ; /usr/bin/env kill -0 $( cat /home/ubuntu/staging/shared/tmp/pids/sidekiq-0.pid ) )
DEBUG [268bc235] Finished in 0.471 seconds with exit status 1 (failed).
I don't believe you need those configs in your deploy.rb if you have the capistrano-sidekiq gem installed and called in your Capfile.
Make sure you have require 'capistrano/sidekiq' in your Capfile or it won't know to call the default tasks.
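For reading the log in the question: `kill -0` delivers no signal. It only tests whether the process named in the PID file exists, so the failed exit status most likely means the PID file was missing or stale at that moment. A minimal Ruby sketch of the same check follows; `process_alive?` and `pidfile_alive?` are hypothetical helper names, and the path in the demo is made up.

```ruby
# Returns true if a process with this PID exists.
# Signal 0 performs the existence/permission check without delivering a signal.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH # no such process
  false
rescue Errno::EPERM # the process exists but belongs to another user
  true
end

# Reads a PID file and reports whether that process is running.
# A missing or empty PID file counts as "not running".
def pidfile_alive?(path)
  return false unless File.exist?(path)

  pid = File.read(path).to_i
  pid > 0 && process_alive?(pid)
end

puts process_alive?(Process.pid)                    # the current process
puts pidfile_alive?("tmp/pids/does-not-exist.pid")  # hypothetical missing PID file
```

If Sidekiq is started by Upstart with a different PID file than the one the deploy task checks, a check like this would fail even while Sidekiq is running.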

How to kill sidekiq job in Rails 4 with ActiveJob

Let's say we have test job:
class TestJob < ActiveJob::Base
queue_as :default
def perform(video)
video.process
end
end
After that we run bundle exec sidekiq start
and, in a new terminal window, run this in the Rails console:
Video.first.pending!; TestJob.perform_later(Video.first)
The job is running, ffmpeg is in the background, and everything is fine, but when I follow the official Sidekiq wiki docs and try:
require 'sidekiq/api'
Sidekiq::Queue.new
=> #<Sidekiq::Queue:0x00000006406978 @name="default", @rname="queue:default">
Sidekiq::Queue.new.each {|job| puts job}
=> nil
Sidekiq::Queue.new.size
=> 0
ss = Sidekiq::ScheduledSet.new
=> #<Sidekiq::ScheduledSet:0x000000064a4330 @_size=0, @name="schedule">
ss.size
=> 0
Why are there no jobs? The job is running successfully (I can see it in the first window, where Sidekiq was started), but I can't see or delete it from the Rails console.
I am using Ubuntu 14 if it helps
With best regards, Ruslan.
UPD:
It looks like
ps = Sidekiq::ProcessSet.new; ps.each(&:quiet!)
works, but it doesn't stop my ffmpeg process. This is the relevant code inside process:
cmd = "ffmpeg -i #{input_file.shellescape} #{options} -threads 0 -y #{self.path + outfile}"
pid = spawn(cmd, :out => output_file, :err => output_file)
Process.wait(pid)
How to stop it?
Once a job is being processed, it is no longer enqueued, which is why Sidekiq::Queue and Sidekiq::ScheduledSet are empty.
I use the Sidekiq web UI and separate queues for my jobs. If I realize that a particular job is failing, I can clear the queue of subsequent jobs of the same class.
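As for the spawned ffmpeg process from the question: a child started with `spawn` keeps running until it exits or receives a signal, so stopping it means keeping its PID and signalling it. A minimal sketch, assuming a Unix system and using `sleep 30` as a stand-in for the ffmpeg command:

```ruby
# Start a long-running child process. `sleep 30` stands in for ffmpeg.
pid = spawn("sleep", "30")

# Elsewhere (another thread, a signal handler, a cancellation check):
# ask the child to terminate, then reap it so it does not linger as a zombie.
Process.kill("TERM", pid)
_, status = Process.wait2(pid)

puts status.signaled? # whether the child was ended by a signal
puts status.termsig   # number of the signal that ended it
```

One caveat: when the command is passed as a single string containing shell syntax, Ruby runs it through a shell, and the PID may then belong to the shell instead of ffmpeg. Passing the program and its arguments as separate strings avoids that.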

Keep getting 504 Gateway Time-out after deploying on ec2 using rubber

I used the rubber gem to deploy my application on ec2.
I followed the instructions here: http://ramenlab.wordpress.com/2011/06/24/deploying-your-rails-app-to-aws-ec2-using-rubber/.
The process seems to finish successfully but when I try to use the app I keep getting 504 gateway time-out.
Why is this happening and how do I fix it?
Answer from Matthew Conway (reposted below): https://groups.google.com/forum/?fromgroups#!searchin/rubber-ec2/504/rubber-ec2/AtEoOf-T9M0/zgda0Fo1qeIJ
Note: Even with this code you need to do something like:
> cap deploy:update
> FILTER=app01,app02 cap deploy:restart
> FILTER=app03,app04 cap deploy:restart
I assume this is a Rails application? The Rails stack is notoriously slow to load, so this delay in load time is probably what you are seeing. Passenger was supposed to make this better with the zero-downtime feature in v3, but they seem to have reneged on that and are only going to offer it as part of some undefined paid version at some undefined point in the future.
What I do is run multiple app server instances and restart them serially, so that I can continue to serve traffic on one while the others are restarting. This doesn't work with a single instance, but most production setups need multiple instances for redundancy/reliability anyway. This isn't currently part of rubber, but I have deploy scripts set up for it in my app and will merge them in at some point. My config looks something like the below.
Matt
rubber-passenger.yml:
roles:
passenger:
rolling_restart_port: "#{passenger_listen_port}"
web_tools:
rolling_restart_port: "#{web_tools_port}"
deploy-apache.rb:
on :load do
rubber.serial_task self, :serial_restart, :roles => [:app, :apache] do
rsudo "service apache2 restart"
end
rubber.serial_task self, :serial_reload, :roles => [:app, :apache] do
# remove file checked by haproxy to take server out of pool, wait some
# secs for haproxy to realize it
maybe_sleep = " && sleep 5" if RUBBER_ENV == 'production'
rsudo "rm -f #{previous_release}/public/httpchk.txt #{current_release}/public/httpchk.txt#{maybe_sleep}"
rsudo "if ! ps ax | grep -v grep | grep -c apache2 &> /dev/null; then service apache2 start; else service apache2 reload; fi"
# Wait for passenger to startup before adding host back into haproxy pool
logger.info "Waiting for passenger to startup"
opts = get_host_options('rolling_restart_port') {|port| port.to_s}
rsudo "while ! curl -s -f http://localhost:$CAPISTRANO:VAR$/ &> /dev/null; do echo .; done", opts
# Touch the file so that haproxy adds this server back into the pool.
rsudo "touch #{current_path}/public/httpchk.txt#{maybe_sleep}"
end
end
after "deploy:restart", "rubber:apache:reload"
desc "Starts the apache web server"
task :start, :roles => :apache do
rsudo "service apache2 start"
opts = get_host_options('rolling_restart_port') {|port| port.to_s}
rsudo "while ! curl -s -f http://localhost:$CAPISTRANO:VAR$/ &> /dev/null; do echo .; done", opts
rsudo "touch #{current_path}/public/httpchk.txt"
end
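The serial restart described in this answer can be sketched outside of rubber as a plain Ruby loop: restart one host, poll until it answers again, then move on to the next. `rolling_restart` and its `restart:`, `healthy:`, `attempts:` and `interval:` parameters are hypothetical names for illustration, not part of rubber or Capistrano, and the demo uses fake callables instead of real servers.

```ruby
# Restart hosts one at a time, waiting for each to report healthy before
# moving on, so the remaining hosts keep serving traffic.
# `restart` and `healthy` are caller-supplied callables.
def rolling_restart(hosts, restart:, healthy:, attempts: 30, interval: 1)
  hosts.each do |host|
    restart.call(host)

    ok = false
    attempts.times do
      if healthy.call(host)
        ok = true
        break
      end
      sleep(interval)
    end

    raise "#{host} did not come back after restart" unless ok
  end
end

# Demo with fakes: each host reports healthy on its second poll.
log = []
polls = Hash.new(0)
rolling_restart(
  %w[app01 app02],
  restart: ->(host) { log << "restart #{host}" },
  healthy: ->(host) { (polls[host] += 1) >= 2 },
  interval: 0
)
puts log.inspect
```

Raising when a host never becomes healthy stops the rollout before the next host is taken down, which is the point of restarting serially.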
I got the same error and solved the problem.
It was a HAProxy timeout. HAProxy is the load balancer installed by Rubber.
The timeout is set to 30000 ms; you should change it in the Rubber configuration file.
Good luck!

Managing unicorn instances / rails deployment

my head hurts today! :)
I need some help with rails deployment.
I migrated from cherokee to nginx and well, I migrated my django apps easily.
I just have to launch uwsgi to get a tcp socket and run my app. So I use supervisord to start / stop uwsgi sockets for every app.
I want something similar for Rails. I have just started with Rails, but I want to be able to deploy now so I won't have problems in the future.
I have read almost the whole internet and, well, I have to ask here :)
My app lives in "/srv/http/hello/"
I use unicorn with a fancy config/unicorn.rb
worker_processes 2
working_directory "/srv/http/hello/"
# This loads the application in the master process before forking
# worker processes
# Read more about it here:
# http://unicorn.bogomips.org/Unicorn/Configurator.html
preload_app true
timeout 30
# This is where we specify the socket.
# We will point the upstream Nginx module to this socket later on
listen "/srv/http/hello/tmp/sockets/unicorn.sock", :backlog => 64
pid "/srv/http/hello/tmp/pids/unicorn.pid"
# Set the path of the log files inside the log folder of the testapp
stderr_path "/var/log/unicorn/hello-stderr.log"
stdout_path "/var/log/unicorn/hello-stdout.log"
before_fork do |server, worker|
# This option works together with the preload_app true setting.
# It prevents the master process from holding
# the database connection
defined?(ActiveRecord::Base) and
ActiveRecord::Base.connection.disconnect!
end
after_fork do |server, worker|
# Here we are establishing the connection after forking worker
# processes
defined?(ActiveRecord::Base) and
ActiveRecord::Base.establish_connection
end
I just adapted an example from the internet.
If I run something like:
unicorn_rails -c config/unicorn.rb -D
It works like a charm. I tried to put that command in supervisord but hehe, I asked too much for it.
So, with some research I discovered god, so I took the example from GitHub and put it in "config/god.rb" (is that the right place?)
# http://unicorn.bogomips.org/SIGNALS.html
rails_env = ENV['RAILS_ENV'] || 'development'
rails_root = ENV['RAILS_ROOT'] || "/srv/http/hello"
God.watch do |w|
w.name = "unicorn"
w.interval = 30.seconds # default
# unicorn needs to be run from the rails root
w.start = "cd #{rails_root} && /srv/http/.rvm/gems/ruby-1.9.3-p0@hello/bin/unicorn_rails -c #{rails_root}/config/unicorn.rb -E #{rails_env} -D"
# QUIT gracefully shuts down workers
w.stop = "kill -QUIT `cat #{rails_root}/tmp/pids/unicorn.pid`"
# USR2 causes the master to re-create itself and spawn a new worker pool
w.restart = "kill -USR2 `cat #{rails_root}/tmp/pids/unicorn.pid`"
w.start_grace = 10.seconds
w.restart_grace = 10.seconds
w.pid_file = "#{rails_root}/tmp/pids/unicorn.pid"
#w.uid = 'http'
#w.gid = 'webgroup'
w.behavior(:clean_pid_file)
w.start_if do |start|
start.condition(:process_running) do |c|
c.interval = 5.seconds
c.running = false
end
end
w.restart_if do |restart|
restart.condition(:memory_usage) do |c|
c.above = 300.megabytes
c.times = [3, 5] # 3 out of 5 intervals
end
restart.condition(:cpu_usage) do |c|
c.above = 50.percent
c.times = 5
end
end
# lifecycle
w.lifecycle do |on|
on.condition(:flapping) do |c|
c.to_state = [:start, :restart]
c.times = 5
c.within = 5.minute
c.transition = :unmonitored
c.retry_in = 10.minutes
c.retry_times = 5
c.retry_within = 2.hours
end
end
end
NOTE: I commented out the uid and gid since I launch it as the http user; otherwise I get a permission error writing the pid. I also set "development" because it is just a "rails new hello".
Ok, so this works:
god -c config/god.rb -D
god launches unicorn fine, and in another terminal I can run "god stop unicorn" and it works.
So questions...
1 - Is this the correct way?
2 - Do I need one god config for every project and launch a god process for every project?
3 - How can I manage those god processes? Something like supervisord's "supervisorctl restart djangoproject"
4 - Will I die if I put "killall god" 3 times in a row? :P
5 - NEW QUESTION: Am I going too far if I say that I just need one god config with ALL unicorn instances, launch it somehow, and manage everything with god? god start blah, god start bleh...
Thanks a lot, I just need to start rails development with a good system administration.
If you already have experience with uWSGI, why not use it for Rails too?
http://projects.unbit.it/uwsgi/wiki/RubyOnRails
If you plan to host a lot of apps consider using the Emperor
http://projects.unbit.it/uwsgi/wiki/Emperor

Resources