Is there any reason to use a database connection pool with ActiveRecord? - ruby-on-rails

What are the benefits to using an external connection pool?
I've heard that most other applications will open up a connection for each unit of work. In Rails, for example, I'd take that to mean that each request could open a new connection. I'm assuming a connection pool would make that possible.
The only benefit I can think of is that it allows you to have 1,000 frontend processes without having 1,000 postgres processes running.
Are there any other benefits?

Rails has connection pooling built in:
Simply use ActiveRecord::Base.connection as with Active Record 2.1 and earlier (pre-connection-pooling). Eventually, when you’re done with the connection(s) and wish it to be returned to the pool, you call ActiveRecord::Base.clear_active_connections!. This will be the default behavior for Active Record when used in conjunction with Action Pack’s request handling cycle.
Manually check out a connection from the pool with ActiveRecord::Base.connection_pool.checkout. You are responsible for returning this connection to the pool when finished by calling ActiveRecord::Base.connection_pool.checkin(connection).
Use ActiveRecord::Base.connection_pool.with_connection(&block), which obtains a connection, yields it as the sole argument to the block, and returns it to the pool after the block completes.
This has been available since version 2.2. You'll see a pool parameter in your database.yml for controlling it:
pool: number indicating size of connection pool (default 5)
I don't think there would be much point in layering another pooling system underneath it and it could even confuse AR's pooling if you tried it.

Related

Are there reasons to keep Active Record pool size low when you use pgbouncer?

If I use pgbouncer, what are the reasons I should not just set Active Record's pool size to 99999, effectively disabling it, and leaving pgbouncer in charge of all pooling?
In my case, this is with Rails 5.2. pgbouncer uses transaction pooling.
I can think of some possible reasons:
If a runaway process somehow tries to open a high number of threads/connections, the AR pool would set a ceiling, preventing it from exhausting all connections.
Similarly, if AR doesn't close connections to pgbouncer correctly (e.g. if some code opens connections in threads without closing them), and AR's reaper does not run or does not run often enough, that code could exhaust all connections.
If Active Record itself has costly overhead per connection (does it?), perhaps it's preferable to wait and reuse connections instead of opening a higher number of connections, in situations where the same process tries to open a lot of connections.
Are those valid reasons? Are they the only reasons?
(I've seen Disabling Connection Pooling in Rails to use PgBouncer and think this is related but not quite the same question.)

Can I / Should I disable Ruby On Rails Database Connection Pooling when using PgBouncer?

Can I disable Ruby on Rails connection pooling completely?
And would this be okay considering PgBouncer already handles database connection pooling?
No.
PgBouncer advertises itself to clients as a Postgres server and then manages connections to the actual Postgres server. We don't need to get into any further detail than that for PgBouncer -- from the Rails side of things, the PgBouncer instance is the Postgres server, so let's explain why starting from there.
In Rails there are two primary factors to consider for concurrency: the number of inbound client requests that can be made to its web server and the size of the connection pool for the database.
If your connection pool size is 1 then your application is effectively single-threaded: every time an inbound client request is made that must make a database query, the request must check out a connection to the database from the pool. When the pool size is 1 and the number of concurrent inbound requests is greater than 1, every request after the first request must be put on hold while the first request completes its query and returns the connection to the pool. If the first request takes a long time to complete then subsequent requests may timeout while waiting for an available connection from the pool.
So from the Rails side, you want to have a connection pool with a size larger than 1 to allow for concurrency. If you set it to a size of 1 then it doesn't matter how you have PgBouncer configured; Rails will always make at most one connection to the database and all threads must share that single connection.
You can read more about connection pools in Rails at ActiveRecord::ConnectionAdapters::ConnectionPool < Object.

Connection pools explained in a practical context

Rails' database.yml file has a setting, pool: 5. I understand what a database connection pool is but I'm being tripped by a few subtleties:
A connection is used then returned to its pool. The next request can then use a connection from the pool rather than creating a new connection.
How is it determined which request gets which connection?
Suppose I have a concurrent connections limit of 5 and one of my web pages needs to make 10 queries to the database:
Is each query a separate connection or all 10 queries are considered one connection?
In terms of queries, connections, or speed, what can be an example of a situation that would overwhelm that 5 concurrent connections limit?
And suppose that, in a different database, I set the database connection pool size to 5.
How are pool size and concurrent connections related, if at all?
In terms of queries, connections, or speed, what can be an example of a situation that would overwhelm this pool size?
1) ActiveRecord::Base loads a connection when required (lazily on a request or it's current one is closed/disconnected)
2) No, The same connection will be used to make multiple queries
3) No way to answer that without using diagnostic utilities which your db vendor supplied with your db
4) That is db vendor/adapter specific
5) same answer as 3.
If you are experiencing a slow down, the only way to solve them is by using diagnostic tools to inform you where your bottleneck is concurring. 90% of the time, it's not your db or the connections to it (It's usually the indexing, n+1, etc... )
If you are NOT experiencing any slow down, then keep the defaults and move on. Premature optimization will lead to an over engineered solution

How to workaround an CachedConnectionManager error on JBoss with JRuby?

I am having an issue deploying an JRuby Rails application into JBoss, using JNDI to manage database connections.
After the first request I have this error:
[CachedConnectionManager] Closing a connection for you. Please close them yourself
I think this is because JBoss uses a connection pool and expect that rails (jruby) release the connection after each use, what is wrong, because rails (ActiveRecord) has its own connection pool.
I've tried to call
ActiveRecord::Base.clear_active_connections!
after each request, in a after_filter, but this haven't worked.
Does somebody have some idea?
Thanks in advance.
I am also having connection pool problems trying to run a Sinatra/ActiveRecord app in multithreaded mode in a Glassfishv3 container. The app includes a "before" block with ActiveRecord::Base.connection to force acquisition of a thread-owned connection, and an "after" block with ActiveRecord::Base.clear_active_connections! to release that connection.
Multi threaded mode
I have tried a great many variations of ActiveRecord 3.0.12 and 3.2.3, JNDI with a Glassfish connection pool, simply using ActiveRecord's connection pool, or even monkey patching the ActiveRecord connection pool to bypass it completely, using the Glassfish pool directly.
When testing with a simple multi-threaded HTTP fetcher, all variations I have tried have resulted errors, with a greater percentage of failing requests as I increase the worker threads in the HTTP fetcher.
With AR 3.0.12, the typical error is the Glassfish pool throwing a timeout exception (for what it's worth, my AR connection pool is larger than my Glassfish pool; I understand AR will pool the connection adapter object, and arjdbc will acquire and release actual connections behind the scenes).
With AR 3.2.3, I get a more ominous error. The error below came from a test not using JNDI but simply using the ActiveRecord connection pool. This is one of the better configurations in that about 95% of requests complete OK. The error requests fail with this exception:
org.jruby.exceptions.RaiseException: (ConcurrencyError) Detected invalid hash contents due to unsynchronized modifications with concurrent users
at org.jruby.RubyHash.keys(org/jruby/RubyHash.java:1356)
at ActiveRecord::ConnectionAdapters::ConnectionPool.release(/Users/pat/app/glassfish/glassfish3/glassfish/domains/domain1/applications/lookup_service/WEB-INF/gems/gems/activerecord-3.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:294)
at ActiveRecord::ConnectionAdapters::ConnectionPool.checkin(/Users/pat/app/glassfish/glassfish3/glassfish/domains/domain1/applications/lookup_service/WEB-INF/gems/gems/activerecord-3.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:282)
at MonitorMixin.mon_synchronize(classpath:/META-INF/jruby.home/lib/ruby/1.9/monitor.rb:201)
at MonitorMixin.mon_synchronize(classpath:/META-INF/jruby.home/lib/ruby/1.9/monitor.rb:200)
at ActiveRecord::ConnectionAdapters::ConnectionPool.checkin(/Users/pat/app/glassfish/glassfish3/glassfish/domains/domain1/applications/lookup_service/WEB-INF/gems/gems/activerecord-3.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:276)
at ActiveRecord::ConnectionAdapters::ConnectionPool.release_connection(/Users/pat/apps/glassfish/glassfish3/glassfish/domains/domain1/applications/lookup_service/WEB-INF/gems/gems/activerecord-3.2.3/lib/active_record/connection_adapters/abstrac/connection_pool.rb:110)
at ActiveRecord::ConnectionAdapters::ConnectionHandler.clear_active_connections!(/Users/pat/apps/glassfish/glassfish3/glassfish/domains/domain1/applications/lookup_service/WEB-INF/gems/gems/activerecord-3.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:375)
...
Single threaded mode
Losing confidence in the general thread safety of ActiveRecord (or arjdbc?), I gave up on using ActiveRecord multithreaded and configured warbler and JRuby-Rack do JRuby runtime pooling, emulating multiple single-threaded Ruby processes much like Unicorn, Thin, and other typical Ruby servers.
In config/warble.rb:
config.webxml.jruby.min.runtimes = 10
config.webxml.jruby.max.runtimes = 10
I reran my test, using JNDI this time, and all requests completed with no errors!
My Glassfish connection pool is size 5. Note the number of jruby runtimes is greater than the connection pool size. There is probably little point in making the JRuby runtime pool larger than the database connection pool since in this application, each request consumes a database connection, but I just wanted to make sure that even with contention for database connections I did not get the time-out errors I had seen in multithreaded mode.
These concurrency problems are disappointing to say the least. Is anyone successfully using ActiveRecord under moderate concurrency? I know that in a Rails app, one must call config.threadsafe! to avoid the global lock. Looking at the code, it does not seem to modify any ActiveRecord setting; is there some configuration of ActiveRecord I'm not doing?
The Rails connection pool does hold connections open, even after they are returned to the pool, so the underlying JDBC connection does not get closed. Are you using a JNDI connection with activerecord-jdbc-adapter (adapter: jndi or jndi: location in database.yml)?
If so, this should happen automatically, and would be a bug.
If not, you can use the code in ar-jdbc and apply the appropriate connection pool callbacks yourself.
https://github.com/jruby/activerecord-jdbc-adapter/blob/master/lib/arjdbc/jdbc/callbacks.rb
class ActiveRecord::ConnectionAdapters::JdbcAdapter
# This should force every connection to be closed when it gets checked back
# into the connection pool
include ActiveRecord::ConnectionAdapters::JndiConnectionPoolCallbacks
end

Connection Timeout and Connection Lifetime

What is the advantage and disadvantage of connection timeout=0?
And what is the use of Connection Lifetime=0?
e.g
(Database=TestDB;
port=3306;
Uid=usernameID;
Pwd=myPassword;
Server=192.168.10.1;
Pooling=false;
Connection Lifetime=0;
Connection Timeout=0)
and what is the use of Connection Pooling?
Timeout is how long you wait for a response from a request before you give up. TimeOut=0 means you will keep waiting for the connection to occur forever. Good I guess if you are connecting to a really slow server that it is normal if it takes 12 hours to respond :-). Generally a bad thing. You want to put some kind of reasonable timeout on a request, so that you can realize your target is down and move on with your life.
Connection Lifetime = how long a connection lives before it is killed and recreated. A lifetime of 0 means never kill and recreate. Normally not a bad thing, because killing and recreating a connection is slow. Through various bugs your connections may get stuck in an unstable state (like when dealing with weird 3 way transactions).. but 99% of the time it is good to keep connection lifetime as infinite.
Connection pooling is a way to deal with the fact that creating a connection is very slow. So rather than make a new connection for every request, instead have a pool of say, 10, premade connections. When you need one, you borrow one, use it, and return in. You can adjust the size of the pool to change how your app behaves. Bigger pool = more connections = more threads doing stuff at a time, but this could also overwhelm whatever you are doing.
In summary:
ConnectionTimeout=0 is bad, make it something reasonable like 30 seconds.
ConnectionLifetime=0 is okay
ConnectionPooling=disabled is bad, you will likely want to use it.
I know this is an old thread but I think it is important to point out an instance in which you may want to disable Connection Pooling or use Connection Lifetime.
In some environments (especially when using Oracle, or at least in my experience) the web application is designed so that it connects to the database using the user's credentials vs a fixed connection string located in the server's configuration file. In this case enabling connection pooling will cause the server to create a connection pool for each user accessing the website (See Pool Fragmentation). Depending on the scenario this could either be good or bad.
However, connection pooling becomes a problem when the database server is configured to kill database connections that exceed a maximum idle time due to the fact that the database server could kill connections that may still reside in the connection pool. In this scenario the Connection Lifetime may come in handy to throw away these connections since they have been closed by the server anyway.

Resources