I'm using Sidekiq with Rails and Heroku to execute asynchronous workers but I am not able to get it working. I've tried launching those workers manually (executing the perform method) and they work, however when you schedule them and check the Sidekiq webpage they appear as enqueued and it doesn't process them.
I'm using an Heroku Worker and Redis To Go and this is the config:
sidekiq.rb
require 'sidekiq'
Sidekiq.configure_client do |config|
config.redis = { :size => 1 }
end
Sidekiq.configure_server do |config|
config.redis = { :size => 2 }
end
sidekiq.yml
:concurrency: 2
Worker launch
Sidekiq::Client.enqueue(MyWorker)
Sidekiq and Redis seems to be properly connected because I get the following log on Heroku:
015-04-27T21:46:51Z 3 TID-ovbydc96k INFO: Sidekiq client with redis options {:size=>1, :url=>"redis://redistogo:REDACTED#soapfish.redistogo.com:xxxx/"}
I'm missing something but I don't know what. Any help would be appreciated. Thanks!
Have you run this command so that Sidekiq knows to use the Redis-To-Go URL?
heroku config:set REDIS_PROVIDER=REDISTOGO_URL
Related
I have a Rails API deployed onto heroku, which serves to a React.js Static page. There both deployed on heroku, and they comunicate through an API link. My struggle is comming when using Redis and Sidekiq.
On my Rails API I got the RedisToGo link Configured with no issues, but when I go to my React app and try to send an email invite I get this message Redis::CannotConnectError: Error connecting to Redis on 127.0.0.1:6379 (ECONNREFUSED).
I thought if I have it configured on my backend then it would work to my react static page app.
Sidekiq.yml
---
:verbose: false
:concurrency: 3
staging:
:concurrency: 1
production:
:concurrency: 5
:queues:
- [mailers,2]
- slack_notifications
- mixpanel
- invoices
- default
- rollbar
Redis.rb
uri = URI.parse(ENV["REDISTOGO_URL"])
REDIS = Redis.new(:url => uri)
sidekiq.rb
Sidekiq::Extensions.enable_delay!
unless Rails.env == 'development' || Rails.env == 'test'
Sidekiq.configure_server do |config|
config.redis = {
url: Rails.application.credentials.redis_url,
password: Rails.application.credentials.redis_password
}
end
Sidekiq.configure_client do |config|
config.redis = {
url: Rails.application.credentials.redis_url,
password: Rails.application.credentials.redis_password
}
end
end
# Turn off backtrace if at all memory issues are popping up as
# backtrace occupies to much memory on redis
# number of lines of backtrace and number of re-tries
Sidekiq.default_worker_options = { backtrace: false, retry: 3 }
Sidekiq.configure_server do |config|
# runs after your app has finished initializing
# but before any jobs are dispatched.
config.on(:startup) do
puts 'Sidekiq is starting...'
# make_some_singleton
end
config.on(:quiet) do
puts 'Got USR1, stopping further job processing...'
end
config.on(:shutdown) do
puts 'Got TERM, shutting down process...'
# stop_the_world
end
end
So my question is the next one:
If I have already configured a REDISTOGO_LINK on my rails app, do I need the same on my react config vars?
Whats the best way to configure sidekiq and Redis on a Rails API using react as a front-end in heroku? I haven't seen something that covers this on the internet.
I would appreciate your help! ;)
You need to run heroku config:set REDIS_PROVIDER=REDISTOGO_URL to tell Sidekiq to use REDISTOGO_URL to connect to Redis.
https://github.com/mperham/sidekiq/wiki/Using-Redis#using-an-env-variable
I am experiencing a strange issue from Passenger docker ruby 2.3 with aws redis:
bad URI(is not URI?): 'redis://redis.xxx.ng.0001.apse1.cache.amazonaws.com:6379/1' (URI::InvalidURIError)
/usr/local/rvm/rubies/ruby-2.3.8/lib/ruby/2.3.0/uri/rfc3986_parser.rb:67:in `split'
/usr/local/rvm/rubies/ruby-2.3.8/lib/ruby/2.3.0/uri/rfc3986_parser.rb:73:in `parse'
/usr/local/rvm/rubies/ruby-2.3.8/lib/ruby/2.3.0/uri/common.rb:227:in `parse'
/usr/local/rvm/gems/ruby-2.3.8/gems/sidekiq-5.2.7/lib/sidekiq/redis_connection.rb:97:in `log_info'
/usr/local/rvm/gems/ruby-2.3.8/gems/sidekiq-5.2.7/lib/sidekiq/redis_connection.rb:31:in `create'
/usr/local/rvm/gems/ruby-2.3.8/gems/sidekiq-5.2.7/lib/sidekiq.rb:126:in `redis_pool'
/usr/local/rvm/gems/ruby-2.3.8/gems/sidekiq-5.2.7/lib/sidekiq.rb:94:in `redis'
My sidekiq config:
# ENV['REDIS_URL']= redis://redis.xxx.ng.0001.apse1.cache.amazonaws.com:6379/1
Sidekiq.configure_server do |config|
config.redis = { url: ENV['REDIS_URL'] }
end
Sidekiq.configure_client do |config|
config.redis = { url: ENV['REDIS_URL'] }
end
However if I run:
docker exec -it container_id bash
and then rails console everything seems to work just fine.
I also tried this:
redis_url = ENV['REDIS_URL']
# The statement below parsed successfully thus the redis_url is correct
uri = URI.parse(redis_url)
redis_options = {
host: uri.host,
port: 6379,
db: uri.path[1..-1]
}
Sidekiq.configure_server do |config|
config.redis = redis_options
end
Sidekiq.configure_client do |config|
config.redis = redis_options
end
But it raised the exact same error. I could run the docker locally connected to local redis just fine. I am wondering there might be something wrong with the ENV['REDIS_URL'] value.
Is there anyone experiencing this issue or any clues?
My env is
- passenger-docker ruby 2.3.8
- aws elastic cache redis: 5.0.5
- sidekiq 5.2.7
After many hours, I just realize that the redis_url from the container was pointed to the deleted Redis cluster in aws. My first launch of Redis was set to default to r5 16GM instance type which is very big and costly for testing. I decided to launch a new one with just t2 small and deleted the r5 one to save some money, however in passenger-config I have the REDIS_URL set and Nginx overridden the env value over the one I set in the ECS task.
The error raised by ruby Redis is not straightforward and having two places to set ENV (in Nginx config and in ECS-task ) make the debugging painful.
We have a Rails app on Heroku with Sidekiq and are running out of database connections.
ActiveRecord::ConnectionTimeoutError: could not obtain a database
connection within 5.000 seconds (waited 5.000 seconds)
Heroku stuff:
Database plan: Standard0 (120 connections)
Web dynos: 2 Standard-2X
Worker dynos: 1 Standard-2X
heroku config:
MAX_THREADS: 5
(DB_POOL not set)
(WEB_CONCURRENCY not set)
Procfile:
web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq
database.yml:
...
production:
url: <%= ENV["DATABASE_URL"] %>
pool: <%= ENV["DB_POOL"] || ENV['MAX_THREADS'] || 5 %>
puma.rb:
# https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#adding-puma-to-your-application
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
threads_count = Integer(ENV['MAX_THREADS'] || 2)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV['PORT'] || 3000
environment ENV['RACK_ENV'] || 'development'
on_worker_boot do
# Worker specific setup for Rails 4.1+
# See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
ActiveRecord::Base.establish_connection
end
sidekiq.yml:
---
:concurrency: 25
:queues:
- [default]
We also have a couple of rake tasks that fire every 10 minutes, and they finish within a second or two.
The problem seems to happen when we do a lot of message processing in sidekiq. We do something like:
get article headlines from a 3rd party web service
insert each headline into the db inside a single transaction
create a message in sidekiq for each headline (worker.perform_async)
each message is processed, hits an endpoint to get the body and updates the body (can take .5 - 3 seconds)
While number 4 is happening we see the connection issue.
My understanding is we are way, way, way below the connection limit with our configuration above, but did we do something incorrectly? Is something just consuming the pool? Any help would be great, thanks.
Sources:
https://devcenter.heroku.com/articles/concurrency-and-database-connections
https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server
https://github.com/mperham/sidekiq/wiki/Advanced-Options
You are sharing 5 DB connections among 25 Sidekiq threads. Set DB_POOL to 25 or Sidekiq's concurrency to 5.
So I'm trying to process a CSV file via Sidekiq background job processing on a Heroku Worker instance. Whilst I can complete the process, I feel it could certainly be done quicker/more efficiently than I'm currently doing it. This question contains two parts - firstly are the database pools setup correctly and secondly how can I optimise the process.
Application environment:
Rails 4 application
Unicorn
Sidekiq
Redis-to-go (Mini plan, 50 connections max)
CarrierWave S3 implementation
Heroku Postgres (Standard Yanari, 60 connections max)
1 Heroku Web dyno
1 Heroku Worker dyno
NewRelic monitoring
config/unicorn.rb
worker_processes 3
timeout 15
preload_app true
before_fork do |server, worker|
Signal.trap 'TERM' do
puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
Process.kill 'QUIT', Process.pid
end
if defined?(ActiveRecord::Base)
ActiveRecord::Base.connection.disconnect!
end
end
after_fork do |server, worker|
Signal.trap 'TERM' do
puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
end
if defined?(ActiveRecord::Base)
config = ActiveRecord::Base.configurations[Rails.env] ||
Rails.application.config.database_configuration[Rails.env]
config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
config['pool'] = ENV['DB_POOL'] || 2
ActiveRecord::Base.establish_connection(config)
end
end
config/sidekiq.yml
---
:concurrency: 5
staging:
:concurrency: 5
production:
:concurrency: 35
:queues:
- [default, 1]
- [imports, 10]
- [validators, 10]
- [send, 5]
- [clean_up_tasks, 30]
- [contact_generator, 20]
config/initializers/sidekiq.rb
ENV["REDISTOGO_URL"] ||= "redis://localhost:6379"
Sidekiq.configure_server do |config|
config.redis = { url: ENV["REDISTOGO_URL"] }
database_url = ENV['DATABASE_URL']
if database_url
ENV['DATABASE_URL'] = "#{database_url}?pool=50"
ActiveRecord::Base.establish_connection
end
end
Sidekiq.configure_client do |config|
config.redis = { url: ENV["REDISTOGO_URL"] }
end
The database connection pools are worked out as such:
I have 3 Web processes (unicorn worker_processes), to each of these I am allocating 2 ActiveRecord connections via the after_fork hook (config/unicorn.rb) for 6 total (maximum) of my 60 available Postgres connections assigned to the Web dyno. In the Sidekiq initialiser, I'm allocating 50 Postgres connections via the ?pool=50 param appended to ENV['DATABASE_URL'] as described (somewhere) in the docs. I'm keeping my Sidekiq concurrency value at 35 (sidekiq.yml) to ensure I stay under both the 50 Redis connections and 60 Postgres connection limits. This still needs more finely grained tuning, but I'd rather get the data processing itself sorted before going any further with this.
Now, assuming the above is correct (and it wouldn't surprise me at all if it weren't) I'm handling the following scenario:
A user uploads a CSV file to be processed via their browser. This file can be anywhere between 50 rows and 10 million rows. The file is uploaded to S3 via the CarrierWave gem.
The user then configures a couple of settings for the import via the UI,the culmination of which adds an FileImporter job to the Sidekiq queue to start creating various models based on the rows.
The Import worker looks something like:
class FileImporter
include Sidekiq::Worker
sidekiq_options :queue => :imports
def perform(import_id)
import = Import.find_by_id import_id
CSV.foreach(open(import.csv_data), headers: true) do |row|
# import.csv_data is the S3 URL of the file
# here I do some validation against a prebuilt redis table
# to validate the row without making any activerecord calls
# (business logic validations rather than straight DB ones anyway)
unless invalid_record # invalid_record being the product of the previous validations
# queue another job to actually create the AR models for this row
ImportValidator.perform_async(import_id, row)
# increment some redis counters
end
end
end
This is slow - I've tried to limit the calls to ActiveRecord in the FileImporter worker so I'm assuming it's because I'm streaming the file from S3. It's not processing rows fast enough to build up a queue so I'm never utilising all of my worker threads (usually somewhere between 15-20 of the 35 available threads are active. I've tried splitting this job up and feeding rows a 100 at a time into an intermediary worker which then creates the ImportValidator jobs in a more parallel fashion but that didn't fare much better.
So my question is, what's the best/most efficient method to accomplish a task like this?
It's possible you are at 100% CPU with 20 threads. You need another dyno.
(See below for my detailed config, which is the result of Henley Chiu's answer).
I've been trying to wrap my brain around Sidekiq deploys, and I am not really getting it. I have an app with a staging environment, and a production environment, on the same server. Everything I see about sidekiq deploys basically say "just add sidekiq/capistrano to your deploy file", so I did that. And then the instructions are "here's a yml file with options" but nothing seems to be explained. Do I need namespaces? I see that in an initialize file, but that seems to be to point outside the server.
I deployed earlier, and each stage seems to boot sidekiq up with the proper environment, but they both process from the same queues. My emails from production were trying to be processed by the stage sidekiq, and failing. I stopped my stage for now, but eventually I will need to use it again. I hope I'm not being dense, I've really tried to understand this and am just having a hard time with finding a definitive "here's how it's done".
For what it's worth, here is config/sidekiq.yml (which is loaded fine during the deploy):
:concurrency: 5
:verbose: false
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:queues:
- [carrierwave, 7]
- [client_emails, 5]
- [default, 3]
staging:
:concurrency: 10
production:
:concurrency: 25
Log files, and pids seem to be in the right spot, but the queues are just merged. Any help would be GREAT!
Also, if it matters:
Rails 3.2.11, passenger, nginx, rvm, Ubuntu 12.10, and Ruby 1.9.3
Detailed Configuration (answer):
First I set up a new redis server at port 7777 (or whatever port you please besides the default 6379). Pretty much followed the redis quickstart guide that I used the first time around.
Then I made the initilizer file; this has both the client and the server config. Both are required to make sidekiq work multistage.
Note that I am using an external YAML file for the settings. I am using SettingsLogic for this to make things easier, but you can just as easily do this yourself by including the file. By using a yaml file, we don't have to touch our environments/staging or production files.
# config/initializers/sidekiq.rb
server = Settings.redis.server
port = Settings.redis.port
db_num = Settings.redis.db_num
namespace = Settings.redis.namespace
Sidekiq.configure_server do |config|
config.redis = { url: "redis://#{server}:#{port}/#{db_num}", namespace: namespace }
end
I am using passenger - the troubleshooting page of the sidekiq wiki recommends a change for the setup when using unicorn or passenger, so I added the code there for the client setup:
# config/initializers/sidekiq.rb (still)
if defined?(PhusionPassenger)
PhusionPassenger.on_event(:starting_worker_process) do |forked|
Sidekiq.configure_client do |config|
config.redis = { url: "redis://#{server}:#{port}/#{db_num}", namespace: namespace }
end if forked
end
end
This is my Settings file (obviously values changed):
#config/settings.yml
defaults: &defaults
redis: &redis_defaults
server: 'localhost'
port: 6379
db_num: 0
namespace: 'sidekiq_development'
development:
<<: *defaults
test:
<<: *defaults
staging:
<<: *defaults
redis:
<<: *redis_defaults
port: 8888
namespace: 'sidekiq_staging'
production:
<<: *defaults
redis:
<<: *redis_defaults
port: 7777
namespace: 'sidekiq_production'
I found that adding the namespace to the config/sidekiq.yml file didn't seem to work - sidekiq would boot on deploy using the right port, but wouldn't actually process anything. But since the wiki recommends using a namespace, I ended up just adding it to the init file.
I hope this helpful for others, because this was really hard for me to understand, having not done a lot of this kind of setup before.
If all environments(development, staging and production) are on same server then use namespace. In your initializers/sidekiq.rb file,
Sidekiq.configure_server do |config|
config.redis = { url: 'redis://localhost:6379/0', namespace: "sidekiq_app_name_#{Rails.env}" }
end
Sidekiq.configure_client do |config|
config.redis = { url: 'redis://localhost:6379/0', namespace: "sidekiq_app_name_#{Rails.env}" }
end
In your initializers/sidekiq.rb file, you specify the Redis queue all environments boot up with.
For mine it is:
redisServer = "localhost"
Sidekiq.configure_server do |config|
config.redis = { :url => 'redis://' + redisServer + ':6379/0' }
end
If you want each environment to process from separate queues, you can have specific sidekiq.rb files in the environments folder for each environment. Each with different redis servers.
In addition to the namespace, it will be good if you also separate out DBs for each Rails environment in Redis too i.e.:
env_num = Rails.env == 'staging' ? 0 : 1
Redis.new(db: env_num) # existing DB is selected if already present
Sidekiq.configure_server do |config|
config.redis = { url: "redis://localhost:6379/#{env_num}", namespace: "app_name_#{Rails.env}" }
end
Sidekiq.configure_client do |config|
config.redis = { url: "redis://localhost:6379/#{env_num}", namespace: "app_name_#{Rails.env}" }
end