Sidekiq Timeout - ruby-on-rails

I'm using sidekiq in my Rails App to asynchronously execute a stored procedure in my SQL Server database when the user moves from one specific page to another.
The problem is that the stored procedure takes up to 4 minutes to complete and sidekiq returns a timeout message (and then retry).
I don't want to change my application's global database timeout setting in database.yml (I don't even know if that would fix it, and in any case I can't do that).
Is there any way to tell Sidekiq that my method can take a long time, so that I stop getting timeout errors?
I really appreciate any help.
UPDATE #1
2013-06-03T17:14:18Z 6136 TID-1ac4ik GeneratePropertyProfile JID-de571df94f21b9159c74db6b INFO: start
2013-06-03T17:19:03Z 6136 TID-1ac4ik GeneratePropertyProfile JID-de571df94f21b9159c74db6b INFO: fail: 285.218 sec
2013-06-03T17:19:03Z 6136 TID-1ac4ik WARN: {"retry"=>true, "queue"=>"default", "class"=>"GeneratePropertyProfile", "args"=>[{"id"=>41915658}], "jid"=>"de571df94f21b9159c74db6b", "error_message"=>"TinyTds::Error: Adaptive Server connection timed out: EXEC gaiainc.sp_wait", "error_class"=>"ActiveRecord::StatementInvalid", "failed_at"=>2013-06-03 17:19:03 UTC, "retry_count"=>0}
2013-06-03T17:19:03Z 6136 TID-1ac4ik WARN: TinyTds::Error: Adaptive Server connection timed out: EXEC gaiainc.sp_wait
UPDATE #2
I got it to work without changing my database.yml. However, I had to add the following code to my config/initializers/sidekiq.rb:
Sidekiq.configure_server do |config|
  ActiveRecord::Base.connection.disconnect!
  ActiveRecord::Base.configurations[Rails.env]['timeout'] = 300000
  ActiveRecord::Base.establish_connection
end
I know it's an ugly solution, but I had no time to find a better one.
If anyone has a cleaner solution, please reply to this topic.
Thank you!

TERM signals that Sidekiq should shut down within the -t timeout option. Any workers that do not finish within the timeout are forcefully terminated and their messages are lost. The timeout defaults to 8 seconds.
The Sidekiq TERM timeout is set in config/sidekiq.yml or with the -t parameter. Sidekiq will do its best to exit by this timeout when it receives TERM.
:timeout: 300
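Note that this -t / :timeout value only controls how long Sidekiq waits for running jobs at shutdown; the error in the log above is a TinyTds database timeout, which is what the initializer in Update #2 addresses. If you want a slightly tidier version of that workaround, one option is to merge the longer timeout into the connection settings instead of mutating the global configuration hash in place. This is only a sketch and assumes a Rails 3/4-era ActiveRecord where configurations[Rails.env] is a plain hash; the 300000 value simply mirrors Update #2.

# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  # Reconnect only the Sidekiq server process with a longer database timeout;
  # the web processes keep whatever timeout database.yml defines.
  db_config = ActiveRecord::Base.configurations[Rails.env].merge('timeout' => 300000)
  ActiveRecord::Base.establish_connection(db_config)
end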

Related

- Rails & cron & whenever / a few minutes delay when the server is heavily loaded

I use a gem called whenever to manage my cron jobs.
In cronfile, I have every 1 minute cron job which call a task XXXX. My config/schedule.rb is like this:
every '* * * * *' do
  rake "XXXXXXXX"
end
This cron job works fine, apart from a slight delay: task XXXX starts running its first line a few seconds after the process is created. Since the task finishes in less than 1 minute, I should never have multiple processes at the same time.
However, when the server is heavily loaded, this delay grows to a few minutes.
As a result, many unfinished processes pile up in my process list, because cron creates a new process every minute.
This makes the load even heavier and, in the worst case, brings the server down completely.
Why does this happen? How can I prevent the cron job from delaying the task?
You can add a dependent task that checks whether the server is running and fails fast (a.k.a. fails early). For example, to verify that my Rails server is already started on port 3000 before calling the rake :test task:
# Rakefile
task :check_localhost do
  pid = system("lsof -i tcp:3000 -t")
  fail unless pid # or you can use `abort('message')`
end

task :test => :check_localhost do
  puts "****** THIS IS TEST ******"
end
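If the underlying worry is slow runs piling up on a loaded server, another option is to guard the task itself with a non-blocking lock, so a new cron invocation exits immediately while the previous one is still running. A rough sketch (the :guarded_run task name and the lock file path are hypothetical; "XXXXXXXX" is the task from the question):

# Rakefile
task :guarded_run do
  File.open("/tmp/xxxx_task.lock", File::RDWR | File::CREAT) do |lock|
    # flock with LOCK_NB returns false right away if another process holds the lock
    abort("previous run still in progress") unless lock.flock(File::LOCK_EX | File::LOCK_NB)
    Rake::Task["XXXXXXXX"].invoke
  end
end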

Sidekiq/Redis queuing a job that doesn't exist

I built a simple test job for Sidekiq and added it to my schedule.yml file for Sidekiq Cron.
Here's my test job:
module Slack
  class TestJob < ApplicationJob
    queue_as :default

    def perform(*args)
      begin
        SLACK_NOTIFIER.post(attachments: {"pretext": "test", "text": "hello"})
      rescue Exception => error
        puts error
      end
    end
  end
end
The SLACK_NOTIFIER here is a simple API client for Slack that I initialize on startup.
And in my schedule.yml:
test_job:
  cron: "* * * * *"
  class: "Slack::TestJob"
  queue: default
  description: "Test"
So I wanted to have it run every minute, and it worked exactly as I expected.
However, I've now deleted the job file and removed the job from schedule.yml, and it still tries to run the job every minute. I've gone into my sidekiq dashboard, and I see a bunch of retries for that job. No matter how many times I kill them all, they just keep coming.
I've tried shutting down both the redis server and sidekiq several times. I've tried turning off my computer (after killing the servers, of course). It still keeps scheduling these jobs and it's interrupting my other jobs because it raises the following exception:
NameError: uninitialized constant Slack::TestJob
I've done a project-wide search for "TestJob", but get no results.
I only had the redis server open with this job for roughly 10 minutes...
Is there maybe something lingering in the redis database? I've looked into the redis-cli documentation, but I don't think any of it helps me.
Try running FLUSHALL in the Redis CLI...
Other than that...
The sidekiq-cron documentation seems to expect that you check for the existence of schedule.yml explicitly...
# config/initializers/sidekiq.rb
schedule_file = "config/schedule.yml"

if File.exist?(schedule_file) && Sidekiq.server?
  Sidekiq::Cron::Job.load_from_hash YAML.load_file(schedule_file)
end
The default for Sidekiq is to retry a failed job 25 times. Because the job class no longer exists, the job fails every time: attempting to perform a job whose class is missing raises an exception, so Sidekiq flags it for retry... 25 times. And because you were scheduling that job every minute, you probably had a ton of these failed jobs queued up for retry. You can:
- Wait ~3 weeks to hit the maximum number of retries
- If you have a UI page set up for Sidekiq, see and clear these jobs in the Retries tab
- Dig through the Redis CLI documentation for a way to identify these specific jobs and delete them (or flush the whole thing if you're not worried about losing other jobs)
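For example, from a Rails console you could clear both the leftover cron entry and the queued retries through Sidekiq's API. This is a sketch: it assumes sidekiq-cron registered the job under the name "test_job" from schedule.yml.

require 'sidekiq/api'

# Remove the leftover cron entry so it stops enqueueing every minute
Sidekiq::Cron::Job.destroy("test_job")

# Delete any retries queued up for the deleted class
Sidekiq::RetrySet.new.select { |job| job.klass == "Slack::TestJob" }.each(&:delete)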

PG::TRDeadlockDetected: ERROR: deadlock detected

I am restarting 8 Puma workers via bundle exec pumactl -F config/puma.rb phased-restart, which works fine. Now I am getting more and more Postgres errors:
PG::TRDeadlockDetected: ERROR: deadlock detected
I found about 50 idle postgres processes running:
postgres: myapp myapp_production 127.0.0.1(59950) idle
postgres: myapp myapp_production 127.0.0.1(60141) idle
...
They disappear when I am running bundle exec pumactl -F config/puma.rb stop.
After starting the app with bundle exec pumactl -F config/puma.rb start, I get exactly 16 idle processes. (Eight too many in my opinion.)
How can I manage these processes better? Thanks for your help!
Update
My puma.rb:
environment 'production'
daemonize true
pidfile 'tmp/pids/puma.pid'
state_path 'tmp/pids/puma.state'
threads 0, 1
bind 'tcp://0.0.0.0:3010'
workers 8
quiet
I might have found a solution to my question: I had some queries outside of my controllers (in custom middleware), which seem to have caused the problem.
If you have queries outside of controllers (ActionMailer could also cause this problem), put your code in an ActiveRecord::Base.connection_pool.with_connection block:
ActiveRecord::Base.connection_pool.with_connection do
  # code
end
ActiveRecord’s with_connection method yields a database connection from its pool to the block. When the block finishes, the connection is automatically checked back into the pool, avoiding connection leaks.
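For instance, if the custom middleware touches the database, a wrapped version of it could look like the sketch below (the middleware and the RequestLog model are made up for illustration):

# A hypothetical Rack middleware: the query runs on a connection checked out
# from the pool, and the connection is returned as soon as the block ends.
class TrackRequests
  def initialize(app)
    @app = app
  end

  def call(env)
    ActiveRecord::Base.connection_pool.with_connection do
      RequestLog.create!(path: env["PATH_INFO"]) # hypothetical model
    end
    @app.call(env)
  end
end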
I hope this helps some of you!
Looks like this may be due to the database connections not getting closed on server shutdown (see https://github.com/puma/puma/issues/59). A lot of people in that issue use ActiveRecord::ConnectionAdapters::ConnectionManagement to handle this, or you may be able to roll your own using Puma's on_restart hook.
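A minimal sketch of the on_restart idea, assuming ActiveRecord is what holds the connections:

# config/puma.rb
on_restart do
  # Drop every pooled ActiveRecord connection before the restart so the old
  # workers don't leave idle postgres processes behind.
  ActiveRecord::Base.connection_pool.disconnect!
end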

Override 30 seconds timeout on gem class timeout

My Thin server is timing out after 30 seconds. I would like to override DEFAULT_TIMEOUT in this Ruby file, changing it from 30 seconds to 120 seconds. How can I do that? Please let me know.
The code is here:
https://github.com/macournoyer/thin/blob/master/lib/thin/server.rb
I would like to override it without getting "already initialized constant" warnings.
See the help
➜ ~/app ✓ thin --help | grep timeout
-t, --timeout SEC Request or command timeout in sec (default: 30)
So you can change it from the command line when starting the server
➜ ~/app ✓ thin --timeout 60 start
or you can set a config file somewhere like /etc/thin/your_app.yml with something like this
---
timeout: 60
and then run thin, pointing it at this YAML file with
thin -C /etc/thin/your_app.yml start
As a side note, you should consider whether increasing your timeout is really necessary. Typically, long-running requests should be queued up and run later through a service like delayed_job or resque.
After seeing your comment and learning that you're using Heroku, I suggest you read the documentation:
Occasionally a web request may hang or take an excessive amount of time to process by your application. When this happens the router will terminate the request if it takes longer than 30 seconds to complete. The timeout countdown begins when the request leaves the router. The request must then be processed in the dyno by your application, and then a response delivered back to the router within 30 seconds to avoid the timeout.
I even more strongly suggest looking into delayed_job, resque, or similar if you're using Heroku. You will need at least one worker running to handle the queue. HireFire is an excellent service that can save you money by only spinning up workers when your queue actually has jobs to process.
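As a sketch of that approach with delayed_job (the controller, the ReportBuilder class, and its build method are hypothetical), the request enqueues the slow work and returns immediately instead of holding the connection past Heroku's 30-second limit:

class ReportsController < ApplicationController
  def create
    # delayed_job's delay proxy serializes the call and a worker runs it later
    ReportBuilder.new(params[:report_id]).delay.build
    head :accepted
  end
end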

ruby on rails, fork, long running background process

I am initiating long-running processes from a browser and showing the results after they complete. I have defined this in my controller:
def runthejob
  pid = Process.fork
  if pid.nil?
    # Child
    output = execute_one_hour_job()
    update_output_in_database(output)
    # Exit here as this child process shouldn't continue anymore
    exit
  else
    # Parent
    Process.detach(pid)
    # send response - job started...
  end
end
The request in the parent completes correctly. But in the child there is always a "500 Internal Server Error"; Rails reports "Completed 500 Internal Server Error in 227192ms". I am guessing this happens because the request/response cycle of the child process is never completed, as there is an exit in the child. How do I fix this?
Is this the correct way to execute long-running processes? Is there a better way to do it?
When the child is running, if I do "ps -e | grep rails", I see that there are two instances of "rails server". (I use the command "rails server" to start my Rails server.)
ps -e | grep rails
75678 ttys002 /Users/xx/ruby-1.9.3-p194/bin/ruby script/rails server
75696 ttys002 /Users/xx/ruby-1.9.3-p194/bin/ruby script/rails server
Does this mean that there are two servers running? How are requests handled now? Won't the request go to the second server?
Thanks for helping me.
Try your code in production and see if the same error comes up. If not, your error may be caused by the development environment being cleared when the first request completes, while your forks still need that environment to exist. I haven't verified this with forks, but that is what happens with threads.
Setting this line in your config/development.rb will retain the environment, but any code changes will then require a server restart:
config.cache_classes = true
There are a lot of frameworks for doing this in Rails: DelayedJob, Resque, Sidekiq, etc.
You can find plenty of examples on RailsCasts: http://railscasts.com/
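For example, with Sidekiq the fork disappears entirely: the controller just enqueues a job and returns. A sketch reusing the (hypothetical) helper methods from the question:

class OneHourJobWorker
  include Sidekiq::Worker

  def perform(record_id)
    output = execute_one_hour_job()
    update_output_in_database(output)
  end
end

# in the controller:
def runthejob
  OneHourJobWorker.perform_async(params[:id])
  # send response - job started...
end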
