Limit the number of workers per queue in Sidekiq - ruby-on-rails

I've been trying to limit the number of workers per queue using the sidekiq-limit_fetch gem. Sidekiq seems to "see" the imposed limits in the log, but when I watch the workers the limits are ignored.
Here's the part from the log where Sidekiq sees the limits:
2013-04-02T05:47:19Z 748 TID-11ilcw DEBUG: {:queues=>
["recommendvariations",
"recommendvariations",
"recommendvariations",
"recommendphenotypes",
"recommendphenotypes",
"recommendphenotypes",
"preparse",
"preparse",
"preparse",
"parse",
"parse",
"parse",
"zipgenotyping",
"zipgenotyping",
"zipfulldata",
"deletegenotype",
"fitbit",
"frequency",
"genomegov",
"mailnewgenotype",
"mendeley_details",
"mendeley",
"pgp",
"plos_details",
"plos",
"snpedia",
"fixphenotypes"],
:concurrency=>5,
:require=>".",
:environment=>"production",
:timeout=>8,
:profile=>false,
:verbose=>true,
:pidfile=>"/tmp/sidekiq.pid",
:logfile=>"./log/sidekiq.log",
:limits=>
{"recommendvariations"=>1,
"recommendphenotypes"=>1,
"preparse"=>2,
"parse"=>2,
"zipgenotyping"=>1,
"zipfulldata"=>1,
"fitbit"=>3,
"frequency"=>10,
"genomegov"=>1,
"mailnewgenotype"=>1,
"mendeley_details"=>1,
"mendeley"=>1,
"pgp"=>1,
"plos_details"=>1,
"plos"=>1,
"snpedia"=>1,
"fixphenotypes"=>1},
:strict=>false,
:config_file=>"config/sidekiq.yml",
:tag=>"snpr"}
And here's the sidekiq.yml. Judging from Sidekiq's web interface, the limits are ignored: right now I have 2 workers on the "recommendvariations" queue, but there should be only 1.
I start the workers with bundle exec sidekiq -e production -C config/sidekiq.yml.
Has anyone else ever encountered this?

Did you try to set the limit in a sidekiq.rb initializer file?
Like this:
Sidekiq::Queue['recommend'].limit = 1
It worked for me.
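If all the limits are kept in one hash, the initializer can apply them in a loop. This is a minimal sketch, assuming the sidekiq-limit_fetch gem is loaded so that Sidekiq::Queue[name].limit= (the call shown above) is available. The names apply_queue_limits, FakeQueue and FakeRegistry are hypothetical and exist only so the logic can be tried without Sidekiq or Redis:

```ruby
# Hypothetical helper: applies a {queue_name => limit} hash to a queue registry.
# In a real initializer, `registry` would be Sidekiq::Queue (with
# sidekiq-limit_fetch loaded); any object answering [] works for a dry run.
def apply_queue_limits(limits, registry)
  limits.each do |name, limit|
    registry[name].limit = limit
  end
end

# Stand-in registry for trying the helper without Sidekiq or Redis.
FakeQueue = Struct.new(:name, :limit)

class FakeRegistry
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = FakeQueue.new(name, nil) }
  end

  def [](name)
    @queues[name]
  end
end

# A few of the limits from the question's log output.
LIMITS = { 'recommendvariations' => 1, 'preparse' => 2, 'fitbit' => 3 }.freeze

registry = FakeRegistry.new
apply_queue_limits(LIMITS, registry)
puts registry['preparse'].limit
```

In config/initializers/sidekiq.rb the call would then presumably be apply_queue_limits(LIMITS, Sidekiq::Queue).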

Related

Couldn't call app. Bad request to "curl 'http://localhost:3564/' -s --fail 2>&1" derailed_benchmarks gem

I am getting error messages as pasted below:
% USE_SERVER=puma bundle exec derailed exec perf:mem_over_time
Booting: production
docking_dev already exists
Endpoint: "/"
Port: 3857
Server: "puma"
[4990] Puma starting in cluster mode...
[4990] * Version 3.7.0 (ruby 2.3.3-p222), codename: Snowy Sagebrush
[4990] * Min threads: 5, max threads: 5
[4990] * Environment: production
[4990] * Process workers: 2
[4990] * Preloading application
[4990] * Listening on tcp://0.0.0.0:3000
[4990] Use Ctrl-C to stop
[4990] - Worker 0 (pid: 5013) booted, phase: 0
[4990] - Worker 1 (pid: 5014) booted, phase: 0
PID: 4990
149.67578125
Couldn't call app.
Bad request to "curl 'http://localhost:3857/' -s --fail 2>&1"
***RESPONSE***:
""
[5014] ! Detected parent died, dying
[5013] ! Detected parent died, dying
I checked that RAILS_ENV=production rails server and RAILS_ENV=production rails console both work as expected. What else do I need to check to make this work? Is it because my http://localhost:3000/ URL has authentication enabled? I made sure force_ssl is set to false. I also checked this post, but what it suggested did not help.
I also don't know why it picks a random port every time I run it; in the run pasted above it is 3857, but my app runs on port 3000 locally. Is there something I need to do so that it uses port 3000?
P.S. I found out from the gem's code why the port is random.
OK, I fixed it. In my case, the root URL redirects to the login page when the user is not signed in. That is what caused the bad requests; the gem does not seem to handle a 302 correctly at the moment. So I pointed it at the login page instead:
PATH_TO_HIT="/login" bundle exec derailed exec perf:mem_over_time
I also had to remove USE_SERVER=puma, as it was causing errors too.
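Before choosing a PATH_TO_HIT, it can help to check which paths answer with a 2xx status rather than a redirect. This is a minimal sketch using Ruby's standard Net::HTTP, whose get_response does not follow redirects on its own. classify_status and probe are hypothetical helper names, and the host, port and path in the usage comment are assumptions for a locally running server:

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: buckets an HTTP status code into success, redirect,
# or anything else.
def classify_status(code)
  case code.to_i
  when 200..299 then :ok
  when 300..399 then :redirect
  else :error
  end
end

# Hypothetical helper: requests one path without following redirects and
# returns the classification plus the Location header (nil if absent).
def probe(host, port, path)
  response = Net::HTTP.get_response(URI("http://#{host}:#{port}#{path}"))
  [classify_status(response.code), response['location']]
end

# Only probe when run directly with arguments against a running server, e.g.:
#   ruby probe.rb localhost 3000 /login
if __FILE__ == $PROGRAM_NAME && ARGV.size == 3
  host, port, path = ARGV
  p probe(host, Integer(port), path)
end
```

A path that comes back as :redirect (such as the root URL in the case above) is a poor benchmark target; one that comes back as :ok is a better candidate.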

Why are we out of database connections on Heroku?

We have a Rails app on Heroku with Sidekiq and are running out of database connections.
ActiveRecord::ConnectionTimeoutError: could not obtain a database
connection within 5.000 seconds (waited 5.000 seconds)
Heroku stuff:
Database plan: Standard0 (120 connections)
Web dynos: 2 Standard-2X
Worker dynos: 1 Standard-2X
heroku config:
MAX_THREADS: 5
(DB_POOL not set)
(WEB_CONCURRENCY not set)
Procfile:
web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq
database.yml:
...
production:
url: <%= ENV["DATABASE_URL"] %>
pool: <%= ENV["DB_POOL"] || ENV['MAX_THREADS'] || 5 %>
puma.rb:
# https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#adding-puma-to-your-application
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
threads_count = Integer(ENV['MAX_THREADS'] || 2)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV['PORT'] || 3000
environment ENV['RACK_ENV'] || 'development'
on_worker_boot do
# Worker specific setup for Rails 4.1+
# See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
ActiveRecord::Base.establish_connection
end
sidekiq.yml:
---
:concurrency: 25
:queues:
- [default]
We also have a couple of rake tasks that fire every 10 minutes, and they finish within a second or two.
The problem seems to happen when we do a lot of message processing in sidekiq. We do something like:
1. Get article headlines from a 3rd party web service.
2. Insert each headline into the db inside a single transaction.
3. Create a message in Sidekiq for each headline (worker.perform_async).
4. Each message is processed: it hits an endpoint to get the body and updates the body (can take 0.5 to 3 seconds).
While number 4 is happening we see the connection issue.
My understanding is we are way, way, way below the connection limit with our configuration above, but did we do something incorrectly? Is something just consuming the pool? Any help would be great, thanks.
Sources:
https://devcenter.heroku.com/articles/concurrency-and-database-connections
https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server
https://github.com/mperham/sidekiq/wiki/Advanced-Options
You are sharing 5 DB connections among 25 Sidekiq threads. Set DB_POOL to 25 or Sidekiq's concurrency to 5.
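The arithmetic behind that answer can be checked per process, because every Puma worker and every Sidekiq process gets its own ActiveRecord pool. This is a rough sketch using the numbers from the question. The helper names are made up for illustration, and it assumes each thread holds at most one connection at a time and that connections are only opened when a thread needs one:

```ruby
# Hypothetical helper: how many threads in one process could be left waiting
# for a connection if all of them need the database at the same moment.
def pool_shortfall(pool:, threads:)
  [threads - pool, 0].max
end

# Hypothetical helper: upper bound on connections opened across dynos, given
# that a process cannot use more connections than it has threads.
def max_connections(dynos:, processes_per_dyno:, pool:, threads:)
  dynos * processes_per_dyno * [pool, threads].min
end

pool = 5 # database.yml falls back to MAX_THREADS (5) because DB_POOL is unset

puts pool_shortfall(pool: pool, threads: 5)  # one Puma worker, 5 threads  => 0
puts pool_shortfall(pool: pool, threads: 25) # Sidekiq, concurrency 25     => 20

# With DB_POOL raised to 25: 2 web dynos x 2 Puma workers x 5 threads,
# plus 1 worker dyno x 1 Sidekiq process x 25 threads.
web    = max_connections(dynos: 2, processes_per_dyno: 2, pool: 25, threads: 5)
worker = max_connections(dynos: 1, processes_per_dyno: 1, pool: 25, threads: 25)
puts web + worker # => 45
```

Under these assumptions the app stays far below the plan's 120 connections either way; the timeout comes from the per-process pool of 5, not from the database limit.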

Rake Task killed probably by out-of-memory issue

I have a rake task, and when I run it in the console it is killed. The task works on a table of about 40,000 rows, so I suspect an out-of-memory problem.
I believed that the query I use is optimized for dealing with large tables:
MyModel.where(:processed => false).pluck(:attribute_for_analysis).find_each(:batch_size => 100) do |a|
# deal with 40000 rows and only attribute `attribute_for_analysis`.
end
This task will not be run on a regular basis in the future, so I want to avoid job monitoring solutions like God etc., but I am considering background jobs, e.g. a Resque job.
I work with Ubuntu, ruby 2.0 and rails 3.2.14
My free memory is as follows:
             total       used       free     shared    buffers     cached
Mem:       3891076    1901532    1989544          0       1240     368128
-/+ buffers/cache:    1532164    2358912
Swap:      4035580     507108    3528472
QUESTIONS:
How to investigate why the rake task is always killed (answered)
How to make this rake task run (not answered, it is still killed)
What is the difference between total-vm, anon-rss, and file-rss (not answered)
UPDATE 1
Can someone explain the difference between the following?
total-vm
anon-rss
file-rss
$ grep "Killed process" /var/log/syslog
Dec 25 13:31:14 Lenovo-G580 kernel: [15692.810010] Killed process 10017 (ruby) total-vm:5605064kB, anon-rss:3126296kB, file-rss:988kB
Dec 25 13:56:44 Lenovo-G580 kernel: [17221.484357] Killed process 10308 (ruby) total-vm:5832176kB, anon-rss:3190528kB, file-rss:1092kB
Dec 25 13:56:44 Lenovo-G580 kernel: [17221.498432] Killed process 10334 (ruby-timer-thr) total-vm:5832176kB, anon-rss:3190536kB, file-rss:1092kB
Dec 25 15:03:50 Lenovo-G580 kernel: [21243.138675] Killed process 11586 (ruby) total-vm:5547856kB, anon-rss:3085052kB, file-rss:1008kB
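For reading those kernel lines: total-vm is the process's total virtual address space, anon-rss is the resident memory that is not backed by a file (heap and stack), and file-rss is the resident memory mapped from files. Below is a small sketch that pulls the three figures out of such a line and converts them to MiB. parse_oom_line is a hypothetical helper name, and the regular expression assumes only the line format shown above:

```ruby
# Matches the tail of a kernel "Killed process" line as shown in the log above.
OOM_LINE = /Killed process (?<pid>\d+) \((?<name>[^)]+)\) total-vm:(?<total_vm>\d+)kB, anon-rss:(?<anon_rss>\d+)kB, file-rss:(?<file_rss>\d+)kB/

# Hypothetical helper: returns a hash of the parsed fields with sizes in MiB
# (rounded to one decimal), or nil when the line does not match.
def parse_oom_line(line)
  match = OOM_LINE.match(line)
  return nil unless match

  to_mib = ->(kb) { (kb.to_i / 1024.0).round(1) }
  {
    pid: match[:pid].to_i,
    name: match[:name],
    total_vm_mib: to_mib.call(match[:total_vm]),
    anon_rss_mib: to_mib.call(match[:anon_rss]),
    file_rss_mib: to_mib.call(match[:file_rss])
  }
end

line = 'Dec 25 13:31:14 Lenovo-G580 kernel: [15692.810010] Killed process 10017 (ruby) ' \
       'total-vm:5605064kB, anon-rss:3126296kB, file-rss:988kB'
p parse_oom_line(line)
```

For the first log line this gives roughly 3 GiB of anonymous resident memory on a machine with under 4 GiB of RAM, which suggests the Ruby process itself is holding the memory.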
UPDATE 2
I modified the query like this, and the rake task is still killed.
MyModel.where(:processed => false).find_in_batches do |group|
p system("free -k")
group.each do |row| # process
end
end

How to debug/fix random occurring Redis::TimeoutError?

I have a Rails app that uses Redis quite a lot. However, I'm seeing quite a few Redis::TimeoutError exceptions from time to time, with no pattern in the circumstances. They occur both in the web app and in the background jobs (which are processed using Sidekiq), not often, but now and then.
I have no idea how to track down the root cause, and hence no idea how to fix it.
Here is a little background on my setup:
The redis instance is running on a separate physical server which is connected to both my web server and background server in a private local 1Gbit network. All servers are running ubuntu 12.04. The redis version is 2.6.10. I'm connecting from my rails app (which is 3.2) using an initializer like so:
require 'redis'
require 'redis/objects'
REDIS = Redis.new(:url => APP_CONFIG['REDIS_URL'])
Redis.current = REDIS
This is the output of redis-cli INFO:
# Server
redis_version:2.6.10
redis_git_sha1:00000000
redis_git_dirty:0
redis_mode:standalone
os:Linux 3.2.0-38-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.6.3
process_id:28475
run_id:d89bbb1b81d3169c4228cf23c0988ae437d496a1
tcp_port:6379
uptime_in_seconds:14913365
uptime_in_days:172
lru_clock:1507056
# Clients
connected_clients:233
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:19
# Memory
used_memory:801637360
used_memory_human:764.50M
used_memory_rss:594706432
used_memory_peak:4295394784
used_memory_peak_human:4.00G
used_memory_lua:31744
mem_fragmentation_ratio:0.74
mem_allocator:jemalloc-3.3.0
# Persistence
loading:0
rdb_changes_since_last_save:23166
rdb_bgsave_in_progress:0
rdb_last_save_time:1378219310
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
# Stats
total_connections_received:932395
total_commands_processed:3088408103
instantaneous_ops_per_sec:837
rejected_connections:0
expired_keys:31428
evicted_keys:3007
keyspace_hits:124093049
keyspace_misses:53060192
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:17651
# Replication
role:master
connected_slaves:1
slave0:192.168.0.2,6379,online
# CPU
used_cpu_sys:54000.21
used_cpu_user:73692.52
used_cpu_sys_children:36229.79
used_cpu_user_children:420655.84
# Keyspace
db0:keys=1498962,expires=1310
In my redis config I have the following set:
daemonize yes
pidfile /var/run/redis/redis-server.pid
timeout 0
loglevel notice
databases 1
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slave-serve-stale-data yes
slave-read-only yes
slave-priority 100
maxclients 1000
maxmemory 4GB
maxmemory-policy volatile-lru
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
That could come from several issues:
Because you save snapshots to disk (the save directives in your conf), which generates a lot of I/O and hammers the server, especially if you use EBS volumes on Amazon.
Because you have a Redis slave (same as before: a save is done before mirroring).
Because you use KEYS *, which is very slow when there are a lot of keys.
Try the "slowlog" command on the Redis server to see whether there are any slow queries.
Write some logs when a TimeoutError happens, to see whether the failing Redis command shows up in the slow log.
Adjust your timeout setting on the client side.
The problem might be on the client side if the server performs normally. Each Redis client instance, not just the server, has its own timeout setting. If the server does not respond within that time, the client raises a Redis::TimeoutError.
First thing you can try is to set a longer timeout value, and see if things get better.
redis_url = 'redis://user:password@host:port/'
redis = Redis.new(:url => redis_url, :timeout => 0.7)
Even with a longer timeout setting, there is no guarantee that timeouts will not happen, but at that point it is a problem with the design of your system.
Are you rolling your own code to connect to redis or just letting sidekiq handle it? I think you should really just design your connection code to reconnect if the connection has been lost. You can rescue Redis::BaseConnectionError and reconnect.
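The rescue-and-retry idea can be sketched without a Redis server. The helper below is an illustration only: with_retries and FakeTimeout are hypothetical names. In a real app the error class passed in would be Redis::BaseConnectionError (the parent class of Redis::TimeoutError in redis-rb), and the block would contain the actual Redis call:

```ruby
# Hypothetical helper: runs the block, retrying up to `attempts` times when
# one of the listed error classes is raised, sleeping `wait` seconds between
# tries. Re-raises the last error once the attempts are used up.
def with_retries(attempts: 3, wait: 0.1, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    raise if tries >= attempts

    sleep wait
    retry
  end
end

# Stand-in for Redis::TimeoutError so the sketch runs without the redis gem.
class FakeTimeout < StandardError; end

calls = 0
result = with_retries(attempts: 3, wait: 0, on: [FakeTimeout]) do
  calls += 1
  raise FakeTimeout, 'simulated timeout' if calls < 3

  'PONG'
end
puts "#{result} after #{calls} calls"
```

Retrying hides occasional timeouts but does not explain them, so it is worth logging each retry alongside the slow log checks suggested above.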

Are there console commands to look at what's in the queue and to clear the queue in Sidekiq?

I'm used to delayed_job's approach of going into the console to see what's in the queue, and the ease of clearing the queue when needed. Are there similar commands in Sidekiq? Thanks!
There is an ergonomic API for viewing and managing queues.
It is not required by default.
require 'sidekiq/api'
Here's the excerpt:
# get a handle to the default queue
default_queue = Sidekiq::Queue.new
# get a handle to the mailer queue
mailer_queue = Sidekiq::Queue.new("mailer")
# How many jobs are in the default queue?
default_queue.size # => 1001
# How many jobs are in the mailer queue?
mailer_queue.size # => 50
# Delete all jobs in a queue by removing the queue.
default_queue.clear
You can also get some summary statistics.
stats = Sidekiq::Stats.new
# Get the number of jobs that have been processed.
stats.processed # => 100
# Get the number of jobs that have failed.
stats.failed # => 3
# Get the queues with name and number enqueued.
stats.queues # => { "default" => 1001, "email" => 50 }
# Get the number of jobs enqueued in all queues (does NOT include retries and scheduled jobs).
stats.enqueued # => 1051
I haven't ever used Sidekiq, so it's possible that there are methods just for viewing the queued jobs, but they would really just be wrappers around Redis commands, since that's basically all Sidekiq (and Resque) is:
# See workers
Sidekiq::Client.registered_workers
# See queues
Sidekiq::Client.registered_queues
# See all jobs for one queue
Sidekiq.redis { |r| r.lrange "queue:app_queue", 0, -1 }
# See all jobs in all queues
Sidekiq::Client.registered_queues.each do |q|
Sidekiq.redis { |r| r.lrange "queue:#{q}", 0, -1 }
end
# Remove a queue and all of its jobs
Sidekiq.redis do |r|
r.srem "queues", "app_queue"
r.del "queue:app_queue"
end
Unfortunately, removing a specific job is a little more difficult as you'd have to copy its exact value:
# Remove a specific job from a queue
Sidekiq.redis { |r| r.lrem "queue:app_queue", -1, "the payload string stored in Redis" }
You could do all of this even more easily via redis-cli:
$ redis-cli
> select 0 # (or whichever database Sidekiq is using)
> keys * # (just to get an idea of what you're working with)
> smembers queues
> lrange queue:app_queue 0 -1
> lrem queue:app_queue -1 "payload"
If there are any scheduled jobs, you can delete all of them with the following command:
Sidekiq::ScheduledSet.new.clear
If you want to delete all jobs in a queue (here, the default queue), you can use the following command:
Sidekiq::Queue.new.clear
Retry jobs can be removed with the following command as well:
Sidekiq::RetrySet.new.clear
There is more information at the following link:
https://github.com/mperham/sidekiq/wiki/API
There is an API for accessing real-time information about workers, queues and jobs.
See https://github.com/mperham/sidekiq/wiki/API
A workaround is to use the testing module (require 'sidekiq/testing') and to drain the worker (MyWorker.drain).
There were hung 'workers' in the default queue, and I was able to see them through the web interface. But they weren't visible from the console when I used Sidekiq::Queue.new.size:
irb(main):002:0> Sidekiq::Queue.new.size
2014-03-04T14:37:43Z 17256 TID-oujb9c974 INFO: Sidekiq client with redis options {:namespace=>"sidekiq_staging"}
=> 0
Using redis-cli I was able to find them
redis 127.0.0.1:6379> keys *
1) "sidekiq_staging:worker:ip-xxx-xxx-xxx-xxx:7635c39a29d7b255b564970bea51c026-69853672483440:default"
2) "sidekiq_staging:worker:ip-xxx-xxx-xxx-xxx:0cf585f5e93e1850eee1ae4613a08e45-70328697677500:default:started"
3) "sidekiq_staging:worker:ip-xxx-xxx-xxx-xxx:7635c39a29d7b255b564970bea51c026-69853672320140:default:started"
...
The solution was:
irb(main):003:0> Sidekiq.redis { |r| r.del "workers", 0, -1 }
=> 1
Also, in Sidekiq v3 there is a command:
Sidekiq::Workers.new.prune
But for some reason it didn't work for me that day.
And if you want to clear the sidekiq retry queue, it's this: Sidekiq::RetrySet.new.clear
$ redis-cli
> select 0 # (or whichever database Sidekiq is using)
> keys * # (just to get an idea of what you're working with)
> smembers queues
> lrange queue:queue_name 0 -1 # (queue_name must be your relevant queue)
> lrem queue:queue_name -1 "payload"
Rake task to clear all Sidekiq queues:
namespace :sidekiq do
desc 'Clear sidekiq queue'
task clear: :environment do
require 'sidekiq/api'
Sidekiq::Queue.all.each(&:clear)
end
end
Usage:
rake sidekiq:clear
This is not a direct solution for the Rails console, but for quick monitoring of the Sidekiq job counts and queue sizes, you can use the sidekiqmon binary that ships with Sidekiq 6+:
$ sidekiqmon
Sidekiq 6.4.2
2022-07-25 11:05:56 UTC
---- Overview ----
Processed: 20,313,347
Failed: 57,120
Busy: 9
Enqueued: 17
Retries: 0
Scheduled: 37
Dead: 2,382
---- Processes (1) ----
36f993209f93:15:a498f85c6a12 [server]
Started: 2022-07-25 10:49:43 +0000 (16 minutes ago)
Threads: 10 (9 busy)
Queues: default, elasticsearch, statistics
---- Queues (3) ----
NAME SIZE LATENCY
default 0 0.00
elasticsearch 17 0.74
statistics 0 0.00

Resources