I have a problem where a Sidekiq process is using almost all of my server's CPU.
Today I removed Sidekiq from the app, but when I run htop it still shows heavy CPU usage from /usr/local/bin/bundle exec sidekiq.
The weirdest thing is that Sidekiq was added about 5 months ago, but the problem only started recently (around two weeks ago).
I'm using Sidekiq for only one background job.
I have tried running killall sidekiq on the server to kill all Sidekiq processes, but nothing happens.
Here is my pretty simple Sidekiq worker:
require 'net/http' # stdlib HTTP client
require 'json'

class UserWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)

    # Fetch the user's age from an external API
    url = "http://url_to_external_api"
    uri = URI.parse(url)
    request = Net::HTTP::Get.new(uri.request_uri)
    response = Net::HTTP.start(uri.host, uri.port) do |http|
      http.request(request)
    end

    user.age = JSON.parse(response.body)['user']['age']
    user.save!
  rescue ActiveRecord::RecordNotFound
    # The user no longer exists; nothing to do here
  end
end
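For context, a worker like this is enqueued with Sidekiq's standard perform_async call:
# Enqueue the job; it runs later on the Sidekiq process
UserWorker.perform_async(user.id)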
At this point I'm pretty desperate, because this Sidekiq process is killing my server every other day and I can't even remove it completely.
EDIT:
When I run kill -9 PID to kill the process, it gets killed, but right afterwards it starts up again with another PID.
QUESTION
I just manually ran the command /usr/bin/ruby1.9.1 /usr/local/bin/bundle exec sidekiq -r /var/www/path to my application on my Amazon EC2 instance and got this error: Error fetching message: Error connecting to Redis on 127.0.0.1:6379 (ECONNREFUSED)
So do I need to install Redis on the server to run Sidekiq?
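Yes: Sidekiq keeps its queues in Redis, so it can't run without a reachable Redis server. Once Redis is installed and running (redis-cli ping should answer PONG), pointing Sidekiq at it looks roughly like the sketch below; the URL shown is the default local one and is an assumption, so adjust it to your setup.
# config/initializers/sidekiq.rb -- sketch; adjust the Redis URL to your environment
Sidekiq.configure_server do |config|
  config.redis = { url: 'redis://127.0.0.1:6379/0' }
end

Sidekiq.configure_client do |config|
  config.redis = { url: 'redis://127.0.0.1:6379/0' }
end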
I built a simple test job for Sidekiq and added it to my schedule.yml file for Sidekiq Cron.
Here's my test job:
module Slack
  class TestJob < ApplicationJob
    queue_as :default

    def perform(*args)
      begin
        SLACK_NOTIFIER.post(attachments: {"pretext": "test", "text": "hello"})
      rescue Exception => error
        puts error
      end
    end
  end
end
The SLACK_NOTIFIER here is a simple API client for Slack that I initialize on startup.
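(For illustration only, since the question doesn't show the actual client: such an initializer is commonly built with the slack-notifier gem, roughly as below; the gem choice and the SLACK_WEBHOOK_URL variable are assumptions.)
# config/initializers/slack_notifier.rb -- illustrative sketch, not the app's real code
require 'slack-notifier'

# Assumes a webhook URL is available in the environment.
SLACK_NOTIFIER = Slack::Notifier.new(ENV.fetch('SLACK_WEBHOOK_URL'))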
And in my schedule.yml:
test_job:
  cron: "* * * * *"
  class: "Slack::TestJob"
  queue: default
  description: "Test"
So I wanted to have it run every minute, and it worked exactly as I expected.
However, I've now deleted the job file and removed the job from schedule.yml, and it still tries to run the job every minute. I've gone into my sidekiq dashboard, and I see a bunch of retries for that job. No matter how many times I kill them all, they just keep coming.
I've tried shutting down both the redis server and sidekiq several times. I've tried turning off my computer (after killing the servers, of course). It still keeps scheduling these jobs and it's interrupting my other jobs because it raises the following exception:
NameError: uninitialized constant Slack::TestJob
I've done a project-wide search for "TestJob", but get no results.
I only had the redis server open with this job for roughly 10 minutes...
Is there maybe something lingering in the redis database? I've looked into the redis-cli documentation, but I don't think any of it helps me.
Try running FLUSHALL in redis-cli (note that this wipes everything stored in that Redis instance)...
Other than that...
The sidekiq-cron documentation seems to expect that you check for the existence of schedule.yml explicitly...
# config/initializers/sidekiq.rb
schedule_file = "config/schedule.yml"

if File.exist?(schedule_file) && Sidekiq.server?
  Sidekiq::Cron::Job.load_from_hash YAML.load_file(schedule_file)
end
The default for Sidekiq is to retry a failed job 25 times. Because the job doesn't exist, it fails... every time; Sidekiq only knows that the job fails (because attempting to perform a job that doesn't exist would raise an exception), so it flags it for retry... 25 times. Because you were scheduling that job every minute, you probably had a TON of these failed jobs queued up for retry. You can:
Wait ~3 weeks to hit the maximum number of retries
If you have a UI page set up for Sidekiq, you can see and clear these jobs in the Retries tab
Dig through the Redis CLI documentation for a way to identify these specific jobs and delete them (or flush the whole thing if you're not worried about losing other jobs); see the sketch after this list
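For reference, here is a sketch of doing that cleanup from a Rails console using Sidekiq's own API rather than raw Redis commands (the cron job name matches the schedule.yml key above; double-check the sidekiq-cron call against the version you're running):
# In `rails console` on the app that loads Sidekiq:
require 'sidekiq/api'

# Remove the leftover cron entry registered by sidekiq-cron
# (the name is the schedule.yml key, "test_job" above).
Sidekiq::Cron::Job.destroy("test_job")

# Inspect and clear the retry set. Note that `clear` wipes ALL retries,
# so only do this if you don't mind losing other queued retries.
retries = Sidekiq::RetrySet.new
puts retries.size
retries.clear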
I want to run both the database and the migration under foreman. However, they start at the same time when I run foreman, and since the database hasn't fully started by the time the migration runs, the migration fails.
Heroku's Procfile can include a release phase, which runs before the new processes are started. Can I do the same with foreman on my own machine?
Heroku does not rely on the Procfile alone to run the release phase; the Heroku build stack handles it.
foreman is designed to run multiple processes at the same time, not to run them in a particular order, so ordering is not something foreman will solve for you.
However, there are other ways to do it.
Simple: since foreman starts each process with a shell command, you can use the shell's sleep command (in seconds) to delay a process:
db_process: start_db_script.sh
migration_process: sleep 5; bundle exec rake db:migrate --trace
Full control: instead of running the default migration rake task, you can write another rake task that checks the connection to the database before executing the migration (refer to this answer):
retried = 0
begin
  # Establish the connection
  ActiveRecord::Base.establish_connection
  # Try to reconnect; this raises if the database can't be reached
  ActiveRecord::Base.connection.reconnect!
  Rake::Task["db:migrate"].invoke if ActiveRecord::Base.connected?
rescue => e
  retried += 1
  if retried <= 5 # retry up to 5 times
    sleep 1       # wait 1 second before retrying
    retry
  end
  puts "#{e}: could not connect to the database after 5 retries"
end
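For example, that snippet could live in a custom rake task that foreman calls instead of db:migrate (the file path and task name below are just illustrative):
# lib/tasks/db_migrate_with_retry.rake -- example name, adjust to taste
namespace :db do
  desc "Run db:migrate once the database accepts connections"
  task migrate_with_retry: :environment do
    retried = 0
    begin
      ActiveRecord::Base.establish_connection
      ActiveRecord::Base.connection.reconnect! # raises if the DB is unreachable
      Rake::Task["db:migrate"].invoke if ActiveRecord::Base.connected?
    rescue => e
      retried += 1
      if retried <= 5
        sleep 1
        retry
      end
      puts "#{e}: could not connect to the database after 5 retries"
    end
  end
end
The Procfile entry then becomes something like migration_process: bundle exec rake db:migrate_with_retry --trace.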
We've had a couple of issues recently in our Rails 4 app that could have been prevented with a simple check to see that both the main app and a worker process correctly boot up our application.
Our Procfile for Heroku looks like so:
web: bundle exec unicorn -p $PORT -c config/heroku/unicorn.rb
sidekiq: bundle exec sidekiq -q high -q medium -q low
Right now, we run a very large test suite that takes 30 mins to complete, even when bisecting the suite across 10+ containers.
We use CircleCI and are considering adding a test like the following, which simulates the Sidekiq boot process.
require 'spec_helper'

RSpec.describe 'Booting processes' do
  context 'Sidekiq server' do
    it 'loads successfully' do
      boot_sidekiq_sequence = <<-SEQ
        require 'sidekiq/cli'
        Sidekiq::CLI.instance.environment = '#{ENV['RAILS_ENV']}'
        Sidekiq::CLI.instance.send(:boot_system)
      SEQ

      expect(system(%{echo "#{boot_sidekiq_sequence}" | ruby > /dev/null})).to be_truthy,
        "The Sidekiq process could not boot up properly. Run `sidekiq` to troubleshoot"
    end
  end
end
The problem with this is that:
It's not quite a complete test of our application boot process which would require reading from our Procfile (we also want to protect devs from making breaking changes there)
It's slow and would add approx 30 secs to our test runtime
It's a bit low-level for my liking.
What makes testing the Sidekiq boot process in particular challenging is that it will only exit if there is an exception on boot.
Can anyone recommend a better, faster, more thorough approach to automating this kind of test?
For integration testing, it's always best to run the process as close to production as possible, so bundle exec sidekiq .... I wouldn't use the CLI API to boot it in-process.
I would first enqueue a special job which just does this:
def perform
  puts "Sidekiq booted"
  Process.kill('TERM', $$) # terminate ourselves (signal first, then our own pid)
end
Then execute Sidekiq by running the binary and monitoring STDOUT. If you see /booted/ within 30 seconds, PASS. If you see nothing within 30 seconds, FAIL and kill the child.
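A rough sketch of that watcher is below; the helper name and the 30-second budget are illustrative, and it assumes the special job above is already enqueued before Sidekiq starts.
require 'open3'

# Spawn the real Sidekiq binary and scan its combined stdout/stderr for /booted/.
def sidekiq_boots_cleanly?(timeout = 30)
  Open3.popen2e('bundle', 'exec', 'sidekiq') do |_stdin, out, wait_thr|
    begin
      deadline = Time.now + timeout
      while Time.now < deadline
        break unless IO.select([out], nil, nil, deadline - Time.now)
        line = out.gets
        break if line.nil?               # child exited without printing the marker
        return true if line =~ /booted/  # the special job printed its marker
      end
      false
    ensure
      # Stop the child Sidekiq whether we passed or timed out.
      Process.kill('TERM', wait_thr.pid) rescue nil
    end
  end
end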
Thanks to the inspiration here I ended up with a test that executes the actual command from the Procfile:
require 'foreman/procfile'
require 'open3'
require 'timeout'

# Utilizes a Procfile like this:
#   web: bundle exec puma -C config/puma.rb
#   workers: bundle exec sidekiq
#
describe Sidekiq do
  let(:procfile_name) { 'Procfile' }
  let(:command_name)  { 'workers' }
  let(:command) do
    Foreman::Procfile.new(procfile_name)[command_name] ||
      raise("'#{command_name}' not defined in '#{procfile_name}'")
  end

  it 'launches without errors' do
    expect(output_from(command)).to match(/Starting processing/)
  end

  def output_from(command)
    Open3.popen2e(*command.split) do |_, out, wait_thr|
      pid = wait_thr.pid
      output_pending = IO.select([out], nil, nil, 10.seconds)
      Process.kill 'TERM', pid
      if output_pending
        out.read
      else
        raise Timeout::Error
      end
    end
  end
end
I have a Rails app that uses rufus-scheduler. When I start the Rails server with:
rails s --port=4000
rufus-scheduler runs its tasks. If I run the Rails server with:
rails s --port=4000 --daemon
rufus-scheduler no longer runs its tasks. I added a couple of log messages. Here is the scheduler code:
class AtTaskScheduler
  def self.start
    scheduler = Rufus::Scheduler.new
    p "Starting Attask scheduler"

    scheduler.every('5m') do
      # test sending a HipChat message
      issue = Issue.new
      issue.post_to_hipchat("Starting sync with AtTask", "SYNC")

      p "Launching Sync"
      Issue.synchronize
    end
  end
end
HipChat never gets the message from the scheduler, and the log never shows the statement "Launching Sync".
Any ideas on what may be causing this?
There is documentation of this issue in the rufus-scheduler docs:
There is the handy rails server -d that starts a development Rails as a daemon. The annoying thing is that the scheduler as seen above is started in the main process that then gets forked and daemonized. The rufus-scheduler thread (and any other thread) gets lost, no scheduling happens.
I avoid running -d in development mode and bother about daemonizing only for production deployment.
These are two well crafted articles on process daemonization, please read them:
http://www.mikeperham.com/2014/09/22/dont-daemonize-your-daemons/
http://www.mikeperham.com/2014/07/07/use-runit/
If anyway, you need something like rails server -d, why not try bundle exec unicorn -D instead? In my (limited) experience, it worked out of the box (well, had to add gem 'unicorn' to Gemfile first).
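If you'd rather not depend on how the web server daemonizes at all, another option is to run the scheduler in its own foreground process. A sketch of that (the script path and the rails runner invocation are my assumptions, not something from the rufus-scheduler docs):
# script/scheduler.rb -- hypothetical file, started separately, e.g.:
#   bundle exec rails runner script/scheduler.rb
scheduler = Rufus::Scheduler.new

scheduler.every('5m') do
  issue = Issue.new
  issue.post_to_hipchat("Starting sync with AtTask", "SYNC")
  Issue.synchronize
end

# Block here so the process (and the scheduler threads) stay alive.
scheduler.join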
I am initiating long-running processes from a browser and showing the results after they complete. I have defined this in my controller:
def runthejob
  pid = Process.fork
  if pid.nil?
    # Child
    output = execute_one_hour_job()
    update_output_in_database(output)
    # Exit here, as this child process shouldn't continue any further
    exit
  else
    # Parent
    Process.detach(pid)
    # send response - job started...
  end
end
The request in the parent completes correctly. But in the child there is always a "500 Internal Server Error"; Rails reports "Completed 500 Internal Server Error in 227192ms". I am guessing this happens because the request/response cycle of the child process is never completed, as there is an exit in the child. How do I fix this?
Is this the correct way to execute long-running processes? Is there a better way to do it?
When the child is running, if I do "ps -e | grep rails", I see that there are two instances of "rails server". (I use the command "rails server" to start my Rails server.)
ps -e | grep rails
75678 ttys002 /Users/xx/ruby-1.9.3-p194/bin/ruby script/rails server
75696 ttys002 /Users/xx/ruby-1.9.3-p194/bin/ruby script/rails server
Does this mean that there are two servers running? How are the requests handled now? Won't the request go to the second server?
Thanks for helping me.
Try your code in production and see if the same error comes up. If not, your error may be from the development environment being cleared when the first request completes, whilst your forks still need the environment to exist. I haven't verified this with forks, but that is what happens with threads.
This line in your config/environments/development.rb will retain the environment, but any code changes will require a server restart.
config.cache_classes = true
There are a lot of frameworks for doing this in Rails: DelayedJob, Resque, Sidekiq, etc.
You can find a whole lot of examples on RailsCasts: http://railscasts.com/
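As a rough illustration of that approach with Sidekiq (the worker name and the two helper methods below are placeholders mirroring the code in the question, not real API):
# app/workers/one_hour_job_worker.rb -- hypothetical worker mirroring the fork example
class OneHourJobWorker
  include Sidekiq::Worker

  def perform
    # These two calls stand in for the question's own helpers.
    output = execute_one_hour_job()
    update_output_in_database(output)
  end
end

# In the controller, enqueue the job instead of forking:
def runthejob
  OneHourJobWorker.perform_async
  # send response - job started...
end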