Rails Puma running out of Redis connections - ruby-on-rails

I've looked around at other similar questions on SO but can't quite piece things together well enough. I have a Rails app (on Heroku) that uses Puma with both multiple processes and multiple threads. My app also uses Redis as a secondary data store (in addition to a SQL database), querying Redis directly (well, through the connection_pool gem). Here's my Puma config file:
workers Integer(ENV["WEB_CONCURRENCY"] || 4)
threads_count = Integer(ENV["MAX_THREADS"] || 5)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV["PORT"] || 3000
environment ENV["RACK_ENV"] || "development"
on_worker_boot do
  # Worker specific setup for Rails 4.1+
  ActiveRecord::Base.establish_connection
  redis_connections_per_process = Integer(ENV["REDIS_CONNS_PER_PROCESS"] || 5)
  $redis = ConnectionPool.new(size: redis_connections_per_process) do
    Redis.new(url: ENV["REDIS_URL"] || "redis://localhost:6379/0")
  end
end
My Redis instance has a connection limit of 20, and I find myself regularly going over this limit, despite having what should be (as far as I can tell) only 5 connections per process spread across 4 worker processes.
In fact, I even get max number of clients reached Redis errors when I set REDIS_CONNS_PER_PROCESS to 1. Is on_worker_boot called for each thread rather than each process?
I've also tried having a separate redis.rb initializer, which still gives me errors even when REDIS_CONNS_PER_PROCESS is 1. This seems odd, since by my math I should be able to set it as high as 4: (4 worker processes + 1 master process) * 4 connections per process = 20. (Note that for the purposes of this question I'm ignoring errors that occur around deploying, since I'm assuming Heroku might be connecting both old and new processes during a deploy, even though I'm not using Preboot.)
Where am I misunderstanding how this all fits together?
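For reference, the connection budget the question reasons about can be written out as a small calculation. This is only a sketch: the helper name redis_connection_budget is hypothetical, and it assumes that each Puma process (optionally including the master) owns one pool and that every pool is completely filled.

```ruby
# Hypothetical helper: worst-case Redis connections for a clustered Puma app,
# assuming one pool per process and every pool fully used.
def redis_connection_budget(workers:, pool_size:, count_master: false)
  processes = workers + (count_master ? 1 : 0)
  processes * pool_size
end

limit = 20 # connection limit of the Redis plan described in the question

# Defaults from the Puma config above: 4 workers, 5 connections per process.
workers_only = redis_connection_budget(workers: 4, pool_size: 5)

# The question's own arithmetic: 4 workers + 1 master, 4 connections each.
with_master = redis_connection_budget(workers: 4, pool_size: 4, count_master: true)

puts "workers only: #{workers_only} of #{limit}"
puts "with master:  #{with_master} of #{limit}"
```

Both cases land exactly on the limit, so on paper one connection per process should leave plenty of headroom. Errors at that setting point to connections this arithmetic does not count.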

I had a similar problem. At first I was using redis-togo and had no problems, but after I switched from redis-togo to Heroku Redis, I got "ERR max number of clients reached" errors.
My app's code had not changed; the Redis provider was the only thing that changed.
I opened a ticket with Heroku support, and they advised me to change the default timeout setting.
https://devcenter.heroku.com/articles/heroku-redis#configuring-your-instance
After I changed the default timeout value of Heroku Redis, everything was solved.
I guess the default Redis timeout differs between providers, and Heroku Redis's default setting is 0.
"A value of zero means that connections will not be closed."
I hope my experience is helpful.

After further reading and testing, I ended up moving my Redis connection pool code into a separate initializer. Unfortunately, this didn't solve my problem at all—despite lots of tinkering with process and connection numbers, I was still getting max number of clients reached errors way before I should have been.
The answer, it turns out, was to switch Redis providers from Heroku Redis to Redis Cloud. I'm not sure why Heroku Redis wasn't allowing the number of connections it advertises, but upon some investigation Redis Cloud actually appears to allow more connections than advertised (or at least limit connections transparently and without errors) without any issues whatsoever. Wow. They've certainly earned my business.

I ran into this problem also, and while the Heroku Redis dashboard showed only a few connections, I was running out of connections.
Then I contacted Heroku support, and they told me that the dashboard only shows active clients/connections, and not the idle ones.
Because the Redis timeout is 0 (never time out), on every restart the old Redis connections sit idle while new ones are opened, so the situation gets worse with every restart.
A solution, as mentioned by others on this page, is to set the timeout to something other than 0:
heroku redis:timeout -s 10 -a APPLICATION_NAME
This closes connections after 10 seconds of idleness, which shouldn't be a problem because a connection stays open while it is being used (no unnecessary closes).
If you have only a little traffic, you might consider setting this a bit higher.
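The effect described above can be illustrated with a toy model. This is a sketch under stated assumptions, not measured Heroku behaviour: the function name is hypothetical, each boot is assumed to open a fixed number of connections, connections from earlier boots are assumed never to be closed by the client, and any non-zero timeout is assumed to expire between restarts.

```ruby
# Toy model: connections counted by the server after a number of boots.
# With idle_timeout == 0 the server never closes connections left over from
# earlier boots; with any other value only the current boot's remain.
def open_connections_after(boots:, conns_per_boot:, idle_timeout:)
  stale_boots = idle_timeout.zero? ? boots - 1 : 0
  (stale_boots + 1) * conns_per_boot
end

# 4 Puma workers with one Redis connection each, as in the original question.
never_reaped = open_connections_after(boots: 5, conns_per_boot: 4, idle_timeout: 0)
reaped       = open_connections_after(boots: 5, conns_per_boot: 4, idle_timeout: 10)

puts "timeout 0:  #{never_reaped} connections"
puts "timeout 10: #{reaped} connections"
```

Under these assumptions, a 20-connection plan is exhausted by the fifth boot even with a single connection per worker, while a non-zero timeout keeps the count at one boot's worth.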

Related

(Heroku + Sidekiq) Is my understanding of how Connection Pooling works correct?

Assume I have the below setup on Heroku + Rails, with one web dyno and two worker dynos.
Below is what I believe to be true, and I'm hoping that someone can confirm these statements or point out an assumption that is incorrect.
I'm confident in most of this, but I'm a bit confused by the usage of client and server, "connection pool" referring to both DB and Redis connections, and "worker" referring to both puma and heroku dyno workers.
I wanted to be crystal clear, and I hope this can also serve as a consolidated guide for any other beginners having trouble with this
Thanks!
How everything interacts
A web dyno (where the Rails application runs)
only interacts with the DB when it needs to query it to serve a page request
only interacts with Redis when it is pushing jobs onto the Sidekiq queue (stored in Redis). It is the Sidekiq client
A Worker dyno
only interacts with the DB if the Sidekiq job it's running needs to query the DB
only interacts with Redis to pull jobs from the Sidekiq queue (stored in Redis). It is the Sidekiq server
ActiveRecord Pool Size
An ActiveRecord pool size of 25 means that each dyno has 25 connections to work with. (This is what I'm most unsure of. Is it each dyno or each Puma/Sidekiq worker?)
For the web dyno, it can only run 10 things (threads) at once (2 puma x 5 threads), so it will only consume a maximum of 10 connections. 25 is above and beyond what it needs.
For worker dynos, the Sidekiq concurrency of 15 means 15 Sidekiq threads can run jobs at a time. Again, 25 connections is beyond what it needs, but it's a nice buffer to have in case there are stale or dead connections that won't clear.
In total, my Postgres DB can expect 10 connections from the web dyno and 15 connections from each worker dyno, for a total of 40 connections maximum.
Redis Pool Size
The web dyno (Sidekiq client) will use the connection pool size specified in the Sidekiq.configure_client block. Generally ~3 is sufficient because the client isn't constantly adding jobs to the queue. (Is it 3 per dyno, or 3 per Puma worker?)
Each worker dyno (Sidekiq server) will use the connection pool size specified in the Sidekiq.configure_server block. By default it's sidekiq concurrency + 2, so here 17 redis connections will be taken up by each dyno
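The totals in the question can be checked with a few lines of arithmetic. This is a sketch only: the numbers come from the question, while treating every pool as per-process and running one Sidekiq process per worker dyno are assumptions the question itself leaves open.

```ruby
# Numbers taken from the question above.
puma_workers        = 2
threads_per_puma    = 5
worker_dynos        = 2   # assumed: one Sidekiq process per worker dyno
sidekiq_concurrency = 15
client_redis_pool   = 3   # assumed to apply per Puma worker process

# Database connections: one per thread that can run at the same time.
db_from_web     = puma_workers * threads_per_puma       # 10
db_from_workers = worker_dynos * sidekiq_concurrency     # 30
db_total        = db_from_web + db_from_workers          # 40

# Redis connections: client pool per web process, concurrency + 2 per Sidekiq server.
redis_per_sidekiq = sidekiq_concurrency + 2              # 17
redis_from_web    = puma_workers * client_redis_pool     # 6
redis_total       = redis_from_web + worker_dynos * redis_per_sidekiq

puts "DB connections (max):    #{db_total}"
puts "Redis connections (max): #{redis_total}"
```

If the client pool turned out to be shared per dyno rather than per Puma worker, redis_from_web would be 3 instead of 6; the rest of the calculation would be unchanged.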
I don't know Heroku + Rails but believe I can answer some of the more generic questions.
From the client's perspective, the setup/teardown of any connection is very expensive. The concept of connection pooling is to have a set of connections which are kept alive and can be used for some period of time. The JDK HttpUrlConnection does the same (assuming HTTP 1.1) so that - assuming you're going to the same server - the HTTP connection stays open, waiting for the next expected request. Same thing applies here - instead of closing a JDBC connection each time, the connection is maintained - assuming same server and authentication credentials - so the next request skips the unnecessary work and can immediately move forward in sending work to the database server.
There are many ways to maintain a client-side pool of connections: it may be part of the JDBC driver itself, or you might need to implement pooling using something like Apache Commons Pooling. Whatever you do, it's going to improve performance and reduce errors caused by network hiccups that could prevent your client from connecting to the server.
Server-side, most database providers are configured with a pool of n possible connections that the database server may accept. Usually each additional connection has a footprint - usually quite small - so based on the memory available you can figure out the maximum number of available connections.
In most cases, you're going to want to have larger-than-expected connections available. For example, in postgres, the configured connection pool size is for all connections to any database on that server. If you have development, test, and production all pointed at the same database server (obviously different databases), then connections used by test might prevent a production request from being fulfilled. Best not to be stingy.

Benefits of connection pooling with Redis and Unicorn

Background: I have a Ruby/Rails + Nginx/Unicorn web app with connections to multiple Redis DBs (i.e. I am not using Redis.current and am instead using global variables for my different connections). I understand that I need to create a new connection in the after_fork block when a new Unicorn worker is created, as explained here and here.
My question is about the need for connection pooling. According to this SO thread, "In Unicorn each process establishes its own connection pool, so if your db pool setting is 5 and you have 5 Unicorn workers then you can have up to 25 connections. However, since each unicorn worker can handle only one connection at a time, then unless your app uses threading internally each worker will only actually use one db connection... Having a pool size greater than 1 means each Unicorn worker has access to connections it can't use, but it won't actually open the connections, so that doesn't matter."
Since I am NOT using Sidekiq, do I even need to use connection pools for my Redis connections? Is there any benefit of a connection pool with a pool size of 1? Or should I simply use variables with single connections -- e.g. Redis.new(url: ENV["MY_CACHE"])?
A connection pool is only used when ActiveRecord talks to the SQL databases defined in your database.yml config file. It is not related to Redis at all, and the SO answer that you cite is not actually relevant to Redis.
So, unless you want to use some custom connection pool solution for Redis, you don't have to deal with it at all, as there is no pool for Redis in Rails by default. I guess a custom pool might be suitable if you had multiple threads in your application, which is not your case.
Update: Does building a connection pool make sense in your scenario? I doubt it. Connection pool is a way to reuse open connections (typically among multiple threads / requests). But you say that you:
use unicorn, the workers of which are separate, independent processes, not threads,
open a stable connection (or two) during after_fork, a connection which is then open all the time the unicorn worker lives
do not use threads in your application anywhere (I'd check if this is true again - it's not only Sidekiq but it might be any gem that tends to do things in the background).
In such a scenario, pooling connections to Redis makes no sense to me, as there seems to be no code that would benefit from reusing the connection: it is open all the time anyway.
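The point that a single-threaded worker never needs more than one connection can be demonstrated with a minimal lazy pool built from the standard library. This is a sketch, not the connection_pool gem's implementation: TinyPool and FakeConnection are hypothetical names, and FakeConnection merely stands in for a Redis client so that the number of connections created can be counted.

```ruby
# Hypothetical stand-in for a Redis client; counts how many were created.
class FakeConnection
  @created = 0
  class << self
    attr_accessor :created
  end

  def initialize
    self.class.created += 1
  end
end

# A tiny lazy pool: a connection is created only when a checkout finds none
# idle and the pool is still below its size limit.
class TinyPool
  def initialize(size:, &factory)
    @size = size
    @factory = factory
    @idle = []
    @created = 0
    @mutex = Mutex.new
    @returned = ConditionVariable.new
  end

  # Check a connection out, yield it, and always return it to the pool.
  def with
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  private

  def checkout
    @mutex.synchronize do
      loop do
        return @idle.pop unless @idle.empty?
        if @created < @size
          @created += 1
          return @factory.call
        end
        @returned.wait(@mutex) # pool exhausted: wait for a checkin
      end
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @idle.push(conn)
      @returned.signal
    end
  end
end

pool = TinyPool.new(size: 5) { FakeConnection.new }

# A single-threaded worker (like one Unicorn process) uses one connection at a time.
100.times { pool.with { |conn| conn.object_id } }
single_threaded_connections = FakeConnection.created

puts "connections created: #{single_threaded_connections}"
```

Even with a pool size of 5, only one connection is ever opened here, which matches the answer's conclusion that a plain long-lived connection per worker does the same job.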

Reason to use a global resource to connect to a redis-server

So, recently I moved all the session-related information in my app to the redis. Everything is running fine and now I am not facing the cookie-related issues (especially from IE).
In doing that, I read some blogs and all of them defined a redis-connector as a global variable in the config like
$redis = Redis.new(:host => 'localhost', :port => 6379)
Now there are a few things that are bugging me:
Defining a global resource means that I have just a single connection to Redis. Will it create a bottleneck in my system when I have to serve multiple requests?
Also, when multiple requests arrive, will Rails queue the requests to Redis if the connection, being a global resource, is already in use?
Redis supports multiple instances. Wouldn't creating multiple instances boost the performance?
There is no standard connection pool included in the Redis gem. If we consider Rails as a single-threaded execution model, that doesn't sound too problematic.
It might be evil when used in multi-threaded environment (think of background jobs as an example). So connection pooling is a good idea in general.
You can implement it for Redis using connection_pool gem.
Sidekiq also uses this gem for connecting to Redis. It can be seen here and here. Also, sidekiq author is the same person as connection_pool author, https://github.com/mperham.
As to your questions:
Multiple requests still don't mean multi-threading, so this approach might work well before you use threads;
Rails is not going to play the role of connection pool for your database;
It will boost performance (and avoid certain errors) if used in multi-threaded environment.
1) No, it's not a bottleneck. Opening a new TCP connection to Redis for every query or request would hurt performance.
3) Yes, if you have more than one core/thread.
Simply measure the number of Redis connections to confirm that no new connection is created before each Rails request is processed. The connection is established on the Rails application server side (Unicorn, Puma, Passenger, etc.) while the application loads.
echo info | redis-cli | grep connected_clients
Try running this command before starting your application locally and again while it is running.

Proper activerecord connection pool size with sidekiq and postgres for multiple sidekiq processes?

I'm running 7 sidekiq processes (concurrency set to 40) plus a passenger webserver, connecting to a postgres database. The Rails pool setting is 100, and the postgres max_connections setting is also the default 100.
I just added a new job class where each job makes multiple postgres requests, and I started getting this error on many sidekiq jobs and sometimes on my webserver: PG::ConnectionBad: FATAL: remaining connection slots are reserved for non-replication superuser connections
I tried increasing postgres max_connections to 200, and the error still occurs. Then I tried reducing the activerecord pool setting to 25 (25 connections for each process = 200 total connections), figuring I might start getting DB connection timeout errors but at least it would stop the "no remaining connection slots" errors.
But I'm still getting the remaining connection slots are reserved error.
The smarter way to deal with this issue might be to load the important postgres data that I keep reusing into redis, and then access it from redis, which obviously plays much more nicely and quickly with sidekiq. But even as I do that, I'd like to understand what's going on here with the postgres connections:
Am I likely leaking connections, and is that something I should be managing inside the sidekiq jobs? (see Releasing ActiveRecord connection before the end of a Sidekiq job)
Should I look into more obscure things like locking/contention issues or threading issues with the PG driver? (see https://github.com/mperham/sidekiq/issues/594. I think I'm using ActiveRecord pretty simply without much obscure or abnormal logic for a rails app...)
Or maybe I'm just not understanding how the ActiveRecord pool setting and postgres max_connection settings work together...?
My situation may be too specific to help many others running into this error, but I'll share what I've found out in case it helps to point you in the right direction.
Am I likely leaking connections, and is that something I should be managing inside the sidekiq jobs?
No, not likely. Sidekiq's default middleware includes a hook to close connections even if a job fails. It took me a long time to understand what the heck that means, so if you're not sure what that means, tl;dr: Sidekiq won't leak connections if you're using it normally.
Should I look into more obscure things like locking/contention issues or threading issues with the PG driver?
Unless you're using a very obscure setup, it's probably something simpler.
Or maybe I'm just not understanding how the ActiveRecord pool setting and postgres max_connection settings work together...?
Anyone can feel free to correct me if I'm wrong, but here are the guidelines I'm going by for pool settings, max_connections, and sidekiq processes:
Minimum DB pool size = sidekiq concurrency setting
Maximum DB pool size* = postgres max_connections / total sidekiq processes (+ leave a few connections for web processes)
*note that active record will only create a new connection when a new thread needs one, so if 95% of your threads don't use postgres at the same time, you should be able to get away with far fewer max_connections than if every thread is trying to check out a connection at the same time.
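Applied to the numbers in the question, these guidelines can be sketched as a small calculation. The helper name pool_bounds is hypothetical, and reserving 10 connections for the web server is an assumed figure chosen only for illustration.

```ruby
# Hypothetical helper applying the guidelines above:
#   minimum pool = sidekiq concurrency
#   maximum pool = (max_connections - connections kept for web) / sidekiq processes
def pool_bounds(max_connections:, sidekiq_processes:, concurrency:, reserved_for_web: 0)
  min_pool = concurrency
  max_pool = (max_connections - reserved_for_web) / sidekiq_processes
  { min: min_pool, max: max_pool, feasible: min_pool <= max_pool }
end

# 7 sidekiq processes at concurrency 40 against max_connections = 200,
# keeping an assumed 10 connections back for the web server.
bounds = pool_bounds(
  max_connections: 200,
  sidekiq_processes: 7,
  concurrency: 40,
  reserved_for_web: 10
)

puts bounds.inspect
```

With these inputs the minimum pool size exceeds the maximum, so the setup only works if most threads are not holding a connection at the same moment, as the note above describes. Otherwise the concurrency or the number of processes has to come down, or max_connections has to go up.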
What fixed my problem:
On my Ubuntu machine, I had changed the vm.overcommit_memory setting to 1 as recommended by redis, so that it can spawn its write-to-disk process without breaking the machine.
This is the right way to go, but leaves postgres vulnerable to being killed by OOM (out of memory) Killer if memory usage gets too high. Turns out that postgres will stop allowing new connections if it receives a kill signal from the OOM Killer.
Once I restarted postgres, sidekiq was able to connect again. The longer term solution is simply to work on memory leaks and make sure memory usage doesn't get too high. Also it's possible to configure the OOM killer to prioritize killing my sidekiqs before killing postgres.

Puma Cluster configuration on Heroku

I need some help with my configuration of Puma (Multi-Thread+Multi-Core Server) on my RoR4 Heroku app.
The Heroku docs on that are not quite up-to-date. I followed this one: Concurrency and Database Connections for the configuration, which does not mention the configuration for a Cluster, so I had to use both types together (threaded and multicore).
My current configuration:
./Procfile
web: bundle exec puma -p $PORT -C config/puma.rb
./config/puma.rb
environment 'production'
threads 0,16
workers 4
preload_app!
on_worker_boot do
  ActiveRecord::Base.connection_pool.disconnect!
  ActiveSupport.on_load(:active_record) do
    config = Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool'] = ENV['DB_POOL'] || 5
    # Pass the modified config; without the argument the changes above are ignored
    ActiveRecord::Base.establish_connection(config)
  end
end
Questions:
a) Do I need the before_fork / after_fork configuration like in Unicorn, since the Cluster workers are forked?
b) How do I tune my thread count depending on my application - what would be the reason to drop it down? / In what cases would it make a difference? Isn't 0:16 already optimized?
c) The Heroku database allows 500 connections. What would be a good value for DB_POOL depending on thread, worker and dyno count? - Does every thread per worker per dyno require its own DB connection when working in parallel?
In general: What should my configuration look like for concurrency and performance?
a) Do I need the before_fork / after_fork configuration like in Unicorn, since the Cluster workers are forked?
Normally no, but since you're using preload_app, yes. Preloading the app gets an instance up and running and then forks the memory space for the workers; the result is that your initializers only get run once (possibly allocating db connections and such). In this instance, your on_worker_boot code is appropriate. If you're not using preload_app, then each worker boots itself, in which case using an initializer would be ideal for setting up the custom connection like you're doing. In fact, without preload_app, your on_worker_boot block would error out because at that point ActiveRecord and friends aren't even loaded.
b) How do I tune my thread count depending on my application - what would be the reason to drop it down? / In what cases would it make a difference? Isn't 0:16 already optimized?
On Heroku (and in my testing) you're best off matching your min/max threads, with max <= DB_POOL setting. The min threads setting allows your application to spin down resources when not under load, which is normally great to free up resources on the server, but likely less needed on Heroku; that dyno is already dedicated to serving web requests, so you may as well have the threads up and ready. While setting your max threads <= your DB_POOL environment variable isn't required, without it you run the risk of consuming all the database connections in the pool; then a thread wants a connection but can't get one, and you get the old "ActiveRecord::ConnectionTimeoutError - could not obtain a database connection within 5 seconds." error. This depends on your application though; you very well could have max > DB_POOL and be fine. I would say your DB_POOL should be at least the same as your min threads value, even though your connections are not eagerly loaded (5:5 threads won't open 5 connections if your app never hits the database).
c) The Heroku database allows 500 connections. What would be a good value for DB_POOL depending on thread, worker and dyno count? - Does every thread per worker per dyno require a sole DB connection when working parallely?
The Production Tier allows 500, to be clear :)
Every thread per worker per dyno could consume a connection, depending on whether they're all trying to access the database at the same time. Usually the connections are reused once they're done, but as I mentioned in b), if your threads outnumber your pool you can have a bad time. The connections will be reused, and all of this is handled by ActiveRecord, but sometimes not ideally. Sometimes connections go idle, or die, and that's why turning on the Reaper is suggested, to detect and reclaim dead connections.
You don't want fewer DB connections than threads. Remember that each separate process has its own connection pool, so if your DB supports 20 connections and you want to run 2 processes, the most threads you can run without risking timeouts is 10 threads each with a pool of 10 connections.
You want to leave a few connections for rails console sessions. Also be aware of background workers, and whether they are threaded.
If your workers are in a separate process (sidekiq), they will have their own pool. If your workers' threads are spawned from the web process (girl_friday or sucker_punch), you will want the DB_POOL to be larger than the max number of web threads, since they will be sharing a connection pool.
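The sizing rule in this answer (one pool per process, no more threads than pool slots) can be sketched as two small helpers. Both function names are hypothetical, and the 10 connections held back for console sessions and background workers is an assumed figure.

```ruby
# Hypothetical helper: the most threads each process can run when every
# thread may need its own connection and each process has its own pool.
def max_threads_per_process(db_connections:, processes:, reserved: 0)
  (db_connections - reserved) / processes
end

# Hypothetical helper: how many dynos fit under the database limit for a
# given Puma cluster shape (workers per dyno x threads per worker).
def max_dynos(db_connections:, workers_per_dyno:, threads_per_worker:, reserved: 0)
  (db_connections - reserved) / (workers_per_dyno * threads_per_worker)
end

# The example from the answer: 20 connections shared by 2 processes.
threads = max_threads_per_process(db_connections: 20, processes: 2)

# The question's setup: 500 connections, 4 workers, up to 16 threads each,
# with an assumed 10 connections reserved for consoles and other clients.
dynos = max_dynos(db_connections: 500, workers_per_dyno: 4, threads_per_worker: 16, reserved: 10)

puts "threads per process: #{threads}"
puts "dynos under the limit: #{dynos}"
```

This is a worst-case budget. Since connections are opened lazily, a real application may sit well below it, but sizing against the worst case avoids the timeout errors described above.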

Resources