PG::TRDeadlockDetected: ERROR: deadlock detected - ruby-on-rails

I am restarting 8 Puma workers via bundle exec pumactl -F config/puma.rb phased-restart, which works fine. But now I am getting more and more Postgres errors:
PG::TRDeadlockDetected: ERROR: deadlock detected
I found about 50 idle postgres processes running:
postgres: myapp myapp_production 127.0.0.1(59950) idle
postgres: myapp myapp_production 127.0.0.1(60141) idle
...
They disappear when I am running bundle exec pumactl -F config/puma.rb stop.
After starting the app with bundle exec pumactl -F config/puma.rb start, I get exactly 16 idle processes. (Eight too many in my opinion.)
How can I manage these processes better? Thanks for your help!
Update
My puma.rb:
environment 'production'
daemonize true
pidfile 'tmp/pids/puma.pid'
state_path 'tmp/pids/puma.state'
threads 0, 1
bind 'tcp://0.0.0.0:3010'
workers 8
quiet

I might have found a solution to my question: I had some queries outside of my controllers (custom middleware), which seem to have caused the problem.
If you have queries outside of controllers (ActionMailer could also cause this problem), put your code in an ActiveRecord::Base.connection_pool.with_connection block:
ActiveRecord::Base.connection_pool.with_connection do
  # code that needs a database connection
end
ActiveRecord’s with_connection method yields a database connection from its pool to the block. When the block finishes, the connection is automatically checked back into the pool, avoiding connection leaks.
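For instance, a minimal sketch of this pattern in a custom Rack middleware (the TrackRequests class and the RequestLog model here are made up purely for illustration):

class TrackRequests
  def initialize(app)
    @app = app
  end

  def call(env)
    # Check a connection out of the pool only for the duration of the block;
    # it is returned automatically instead of lingering as an idle process.
    ActiveRecord::Base.connection_pool.with_connection do
      RequestLog.create!(path: env['PATH_INFO']) # hypothetical query
    end
    @app.call(env)
  end
end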
I hope this helps some of you!

Looks like this may be due to the database connections not getting closed on server shutdown. https://github.com/puma/puma/issues/59 A lot of people in that issue are using ActiveRecord::ConnectionAdapters::ConnectionManagement to handle this, or you may be able to roll your own with Puma's on_restart hook.
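For the roll-your-own route, a minimal sketch of that hook in puma.rb might look like this (assuming ActiveRecord is loaded in the Puma process; adapt to your setup):

on_restart do
  # Close all pooled database connections before the restart so the old
  # server does not leave idle postgres processes behind.
  ActiveRecord::Base.connection_pool.disconnect!
end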

Related

Using foreman, can we run db migration once the database is started?

I want to run both the database and the migration in foreman. However, I found that they usually run at the same time when I start foreman. Since the database has not fully started by the time I run the migration, the migration fails.
Heroku's Procfile supports a release phase that runs before the app processes start. Can I do the same using foreman on my computer?
Heroku does not rely on the Procfile to run the release phase; the Heroku build stack handles that.
foreman is designed to run multiple processes at the same time, not to run them in a particular order, so this ordering problem is not foreman's responsibility.
However, you have some other ways to do so.
Simple: since foreman starts your processes with shell commands, you can use the basic shell command sleep (in seconds) to delay a process:
db_process: start_db_script.sh
migration_process: sleep 5; bundle exec rake db:migrate --trace
Full control: instead of running the default migration rake task, you can write another rake task that checks the connection to the database before executing the migration (refer to this answer); a sketch of such a task follows the snippet below.
retried = 0
begin
  # Establish the connection
  ActiveRecord::Base.establish_connection
  # Try to reconnect; this raises an error if the database cannot be reached
  ActiveRecord::Base.connection.reconnect!
  Rake::Task["db:migrate"].invoke if ActiveRecord::Base.connected?
rescue => e
  retried += 1
  if retried <= 5 # Retry at most 5 times
    sleep 1       # Wait 1 second before retrying
    retry
  end
  puts "#{e} - could not connect to the database within 5 seconds"
end
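For completeness, here is one way that snippet could be wrapped in its own rake task; the file path and task name are arbitrary, and the Procfile would then point at it instead of db:migrate:

# lib/tasks/migrate_when_ready.rake
namespace :db do
  desc "Run db:migrate once the database accepts connections"
  task migrate_when_ready: :environment do
    retried = 0
    begin
      ActiveRecord::Base.establish_connection
      ActiveRecord::Base.connection.reconnect! # raises if the database is unreachable
      Rake::Task["db:migrate"].invoke if ActiveRecord::Base.connected?
    rescue => e
      retried += 1
      if retried <= 5
        sleep 1
        retry
      end
      puts "#{e} - could not connect to the database within 5 seconds"
    end
  end
end

The Procfile entry would then be something like: migration_process: bundle exec rake db:migrate_when_ready --trace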

Sidekiq Timeout

I'm using sidekiq in my Rails App to asynchronously execute a stored procedure in my SQL Server database when the user moves from one specific page to another.
The problem is that the stored procedure takes up to 4 minutes to complete, and Sidekiq reports a timeout error (and then retries).
I don't want to change my application's global database timeout setting in database.yml (I don't even know if that would solve it, and I can't do it anyway).
Is there any way to tell sidekiq that my method can take long and then stop getting timeout errors?
I really appreciate any help.
UPDATE #1
2013-06-03T17:14:18Z 6136 TID-1ac4ik GeneratePropertyProfile JID-de571df94f21b9159c74db6b INFO: start
2013-06-03T17:19:03Z 6136 TID-1ac4ik GeneratePropertyProfile JID-de571df94f21b9159c74db6b INFO: fail: 285.218 sec
2013-06-03T17:19:03Z 6136 TID-1ac4ik WARN: {"retry"=>true, "queue"=>"default", "class"=>"GeneratePropertyProfile", "args"=>[{"id"=>41915658}], "jid"=>"de571df94f21b9159c74db6b", "error_message"=>"TinyTds::Error: Adaptive Server connection timed out: EXEC gaiainc.sp_wait", "error_class"=>"ActiveRecord::StatementInvalid", "failed_at"=>2013-06-03 17:19:03 UTC, "retry_count"=>0}
2013-06-03T17:19:03Z 6136 TID-1ac4ik WARN: TinyTds::Error: Adaptive Server connection timed out: EXEC gaiainc.sp_wait
UPDATE #2
I got it to work without changing my database.yml. However, I had to add the following code in my initializers/sidekiq.rb:
Sidekiq.configure_server do |config|
  # Drop the default connection, raise the adapter timeout for this process only,
  # and reconnect, so the longer timeout applies just to Sidekiq.
  ActiveRecord::Base.connection.disconnect!
  ActiveRecord::Base.configurations[Rails.env]['timeout'] = 300000
  ActiveRecord::Base.establish_connection
end
I know it's an ugly solution but I had no time to find another solution to make it work.
If anyone has a cleaner solution, please reply to this topic.
Thank you!
A TERM signal tells Sidekiq to shut down within the timeout given by the -t option. Any workers that do not finish within that timeout are forcefully terminated and their messages are lost. The timeout defaults to 8 seconds.
The Sidekiq TERM timeout is set in config/sidekiq.yml or with the -t parameter. Sidekiq will do its best to exit by this timeout when it receives TERM.
:timeout: 300

Sidekiq worker not getting triggered

I am using Sidekiq for my background jobs:
I have a worker app/workers/data_import_worker.rb
class DataImportWorker
  include Sidekiq::Worker
  sidekiq_options retry: false

  def perform(job_id, file_name)
    begin
      # Some logic in it .....
    end
  end
end
Called from a file lib/parse_excel.rb
def parse_raw_data
  # job_id and filename are defined before this point
  DataImportWorker.perform_async(job_id, filename)
end
As soon as I trigger it from my action, the worker is not getting called. Redis is running on localhost:6379.
Any idea why this might be happening? The environment is Linux.
I had a similar problem where Sidekiq was running, but when I called perform_async it didn't do anything except return true.
The problem was that rspec-sidekiq was added to my ":development, :test" group. I fixed it by moving rspec-sidekiq to the ":test" group only.
Start sidekiq from the root directory of your Rails app. For example,
bundle exec sidekiq -e staging -C config/sidekiq.yml
I encountered the same problem. It turned out that the argument I passed to perform_async was not appropriate: you should not pass a query result to perform_async; do the query inside perform instead.
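In other words, pass plain identifiers and load the record inside perform. A rough sketch (ImportJob is a hypothetical model standing in for whatever you queried):

# Problematic: passing a query result (an ActiveRecord object) to the worker
DataImportWorker.perform_async(ImportJob.find(job_id), filename)

# Better: pass the id and do the lookup inside perform
DataImportWorker.perform_async(job_id, filename)

class DataImportWorker
  include Sidekiq::Worker

  def perform(job_id, file_name)
    job = ImportJob.find(job_id) # hypothetical model lookup
    # ... process the file for this job ...
  end
end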
You need to specify the name of the queue that the worker is for.
Example:
sidekiq_options retry: false, :queue => :data_import_worker
data_import_worker can be any name you want to give it.
Then when you go to the web interface: yoursite.com/sidekiq, you'll be able to see the current workers for the queue "data_import_worker"
For me, when using perform_later, jobs would be enqueued but never picked up from the queue. I needed to add my queue name to the sidekiq.yml file:
---
:concurrency: 25
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:queues:
- default
- my_queue
Lost a good 15 minutes on this. To check whether Sidekiq is correctly loading your config file (with the queue names), go to the Busy tab in the web interface: you'll find your process ID and, below it, the queues it is polling.
In our case, we had misspelled mailer (the correct ActiveJob queue for Mailers is mailers, in plural).
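If you'd rather check from a Rails console than the web UI, a sketch using Sidekiq's API (assuming a reasonably recent Sidekiq where sidekiq/api provides Sidekiq::ProcessSet):

require 'sidekiq/api'

Sidekiq::ProcessSet.new.each do |process|
  # Each entry describes one running Sidekiq process and the queues it is polling
  puts "#{process['hostname']} pid=#{process['pid']} queues=#{process['queues'].inspect}"
end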
My issue was simply having the worker file in the wrong path.
It needs to be in "project_root/app/worker/worker.rb", not "project_root/worker/worker.rb".
Check the file path!
Is it really possible to run multiple workers with standalone Sidekiq?
For example, I have 2 workers:
ProccessWorker
CallbackWorker
When I run Sidekiq:
bundle exec sidekiq -r ./workers/proccess_worker.rb -C ./config/sidekiq.yml
only one worker is picked up at a time.
I was calling perform_async(23) in a production console, but my Sidekiq process had been started in staging mode.
After I started Sidekiq in production mode, things started working.

Resque: worker status is not right

Resque is currently showing me that I have a worker doing work on a queue. That worker was shut down by me in the middle of the queue (it's just for testing), yet it still shows as running. I've confirmed the process ID has been killed and bluepill is no longer monitoring it. I can't find any way in the UI to force-clear its working status.
What's the best way to update the status for the number of workers that are currently up? (I have 2; the web UI reports 3.)
You may have a lingering pid file. This file is independent of the process running; in other words, when you killed the process, it didn't delete the pid file.
If you're using a typical Rails and Resque setup, Resque will store the pid in the Rails ./tmp directory.
Some Resque start scripts specify the pid file in a different location, something like this:
PIDFILE=foo/bar/resque/pid bundle exec rake resque:work
Wherever the script puts the pid file, look there, then delete it, then restart.
Also on the command line, you can ask redis for the running workers:
redis-cli keys "*worker:*"
If there are workers that you don't expect, you can delete them with:
redis-cli del <keyname>
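Alternatively, roughly the same cleanup can be done from a Rails console with Resque's Ruby API. A sketch, assuming your Resque version exposes Resque.workers and unregister_worker; it only touches workers registered for the current host:

require 'socket'

Resque.workers.each do |worker|
  host, pid, _queues = worker.to_s.split(':', 3)
  next unless host == Socket.gethostname # only consider workers registered on this machine
  begin
    Process.kill(0, pid.to_i)            # signal 0 only checks whether the pid is alive
  rescue Errno::ESRCH
    worker.unregister_worker             # the process is gone; remove the stale entry from redis
  end
end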
Try restarting the application.
For future reference: also have a look at https://github.com/resque/resque/issues/299

Run multiple delayed_job instances per RAILS_ENV

I'm working on a Rails app with multiple RAILS_ENVs:
env_name1:
  adapter: mysql
  username: root
  password:
  host: localhost
  database: db_name_1

env_name2:
  adapter: mysql
  username: root
  password:
  host: localhost
  database: db_name_2

...
And I'm using the delayed_job (2.0.5) plugin to manage asynchronous background work.
I would like to start multiple delayed_job instances, one per RAILS_ENV:
RAILS_ENV=env_name1 script/delayed_job start
RAILS_ENV=env_name2 script/delayed_job start
..
I noticed that I can run only one delayed_job instance; for the 2nd, I get this error: "ERROR: there is already one or more instance(s) of the program running".
My question: is it possible to run multiple delayed_job instances, one per RAILS_ENV?
Thanks
You can have multiple instances of delayed_job running as long as they have different process names. As Slim mentioned in his comment, you can use the -i flag to add a unique numerical identifier to the process name. So the commands would look like:
RAILS_ENV=env_name1 script/delayed_job -i 1 start
RAILS_ENV=env_name2 script/delayed_job -i 2 start
This creates two separate delayed_job instances, named delayed_job.1 and delayed_job.2.
One gotcha: when you do this, you also have to pass the same flags when stopping them, e.g. RAILS_ENV=env_name1 script/delayed_job -i 1 stop. Omitting the -i 1 or -i 2 when calling stop won't stop them, because delayed_job won't be able to find the corresponding process.
Not sure if it'll solve your problem but... I often need to run multiple versions of script/server - and those don't play nice with each other either. The way to get them running is to use different ports. eg:
RAILS_ENV=env_name1 script/server -p 3000
RAILS_ENV=env_name2 script/server -p 3002
Perhaps this'll work for delayed_job too?
(though I'd avoid port 3000 as it's the standard Rails port) :)
