So I have a connection pool set up, which is great since I have an application that really needs it. What I would like to know is whether it is possible to share this connection pool with other J2SE apps. Would this even be worth it, as opposed to creating a connection pool based on each app's needs? If it would be prudent, how can I accomplish this?
It is not hard to have connection pools in a single JVM doing multiple things - that is what application servers do every day (using JNDI to throw objects across classloaders).
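For the single-JVM case, the lookup side is just a few lines. The sketch below assumes a container-provided JNDI context (as in an application server) with a pooled DataSource already bound under a well-known name; the name "java:comp/env/jdbc/SharedPool" is hypothetical.

    import javax.naming.InitialContext;
    import javax.naming.NamingException;
    import javax.sql.DataSource;
    import java.sql.Connection;
    import java.sql.SQLException;

    public class SharedPoolClient {
        public static void main(String[] args) throws NamingException, SQLException {
            // Look up the pooled DataSource that the container (or another
            // component in the same JVM) bound under a well-known JNDI name.
            // The name below is hypothetical.
            InitialContext ctx = new InitialContext();
            DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/SharedPool");

            // Borrow a connection from the shared pool; close() returns it.
            try (Connection con = ds.getConnection()) {
                System.out.println("Got pooled connection: " + con);
            }
        }
    }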
The interesting part is when the connection pool lives in a separate JVM from the client code that needs it, because then you cannot simply ask the pool for a connection, use it, and return it afterwards.
Basically you have two options:
Make remote requests for all your JDBC commands over the network. This most likely means the data travels over the network twice: from the database to the connection pool, and then from the connection pool to your application. If the database connections are very expensive objects, this might be a viable solution (a sketch of such a remote facade follows this list).
Use RMI to transfer the connection object from the connection pool JVM to your own. This is a very expensive operation, but as far as I know it can also include the actual driver classes, allowing your connection pool to provide connections to databases not known to your application JVM. To me this would only make sense if the database connections were ridiculously expensive, or if it were a requirement to support additional databases after deployment without changing the original deployments.
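To make the first option concrete, here is a minimal sketch of a hypothetical RMI facade; the interface name and methods are illustrative, not any standard API. The pool JVM would implement this interface against its pooled connections, so only result data crosses the network.

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.util.List;

    // Hypothetical RMI facade for the first option: the pool JVM runs the
    // JDBC work against its pooled connections and ships results back.
    public interface RemoteJdbcService extends Remote {

        // Execute a query in the pool JVM; rows come back as plain object
        // arrays because java.sql.ResultSet is not serializable.
        List<Object[]> executeQuery(String sql, Object... params)
                throws RemoteException;

        // Execute an update and return the affected-row count.
        int executeUpdate(String sql, Object... params)
                throws RemoteException;
    }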
Note that the primary reason for having connection pools at all is that connections are expensive to create, use briefly, and then discard. This varies by database: MySQL connections, for example, are (or were when I tried) very cheap to create, so the simplest approach might be to just create them on demand.
So. First of all: measure what your connection pool buys you in time, and then consider whether it is worth your while to centralize this further.
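A minimal measurement sketch, assuming a JDBC driver on the classpath; the URL and credentials are hypothetical. It compares creating a fresh physical connection per use against reusing one connection, which approximates what a warm pool gives you.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ConnectionCostBenchmark {
        // Hypothetical connection details; substitute your own.
        private static final String URL  = "jdbc:postgresql://localhost:5432/test";
        private static final String USER = "app";
        private static final String PASS = "secret";

        public static void main(String[] args) throws SQLException {
            int iterations = 50;

            // Cost of creating a fresh physical connection every time.
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                try (Connection con = DriverManager.getConnection(URL, USER, PASS)) {
                    // no-op: we only measure connect/disconnect overhead
                }
            }
            long freshNanos = System.nanoTime() - start;

            // Cost of reusing a single connection, which is roughly what a
            // pool buys you once it is warm.
            start = System.nanoTime();
            try (Connection con = DriverManager.getConnection(URL, USER, PASS)) {
                for (int i = 0; i < iterations; i++) {
                    con.isValid(1); // cheap liveness check standing in for "borrow"
                }
            }
            long reusedNanos = System.nanoTime() - start;

            System.out.printf("fresh: %d ms, reused: %d ms%n",
                    freshNanos / 1_000_000, reusedNanos / 1_000_000);
        }
    }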
Looking at the node-postgres documentation on connecting to a database server, it looks like the Client and Pool constructors are functionally equivalent.
My understanding is that using the Pool constructor provides you with the same functionality as using the Client constructor except that connections are made from a connection pool.
Isn't this always desirable? What are the conditions that I would choose to use the Client constructor over the Pool constructor?
One fairly good explanation can be found here: https://gist.github.com/brianc/f906bacc17409203aee0. From that post:
I would definitely use a single pool of clients throughout the application. node-postgres ships with a pool implementation that has always met my needs, but it's also fine to just use the require('pg').Client prototype and implement your own pool if you know what you're doing & have some custom requirements on the pool.
The drawback to using a pool for each piece of middleware, or multiple pools in your application, is that you need to control how many open clients you have connected to the backend, it's more code to maintain, and it likely won't improve performance over using a single pool. If you find requests often waiting on available clients from the pool, you can increase the size of the built-in pool with pg.defaults.poolSize = 100 or something. I've set the default at 20, which I think is a sane default. If you have long-running queries during web requests, you probably have bigger problems than increasing your pool size is going to solve.
There is a legacy implementation (partly company-proprietary, written in Pascal and C with some Java macros) which processes TCP-socket-based requests from TCP client applications. It supports multiple client applications (around 5K) connecting over TCP sockets; however, it only supports a single socket connection to the backend (database). There are two instances of the server, so in total it supports 10K client applications over two TCP socket connections to the database. All database-related communication happens synchronously over that single socket connection. There are massive issues in this application, especially high RTT (round-trip time) and occasional outages due to back-pressure. We have an ops team for such issues; they mostly resolve them by restarting the server. Hardly anyone on our team knows the coding details of this application, and there is not much documentation. As this is a critical application we cannot afford to mess with it, so we don't want to touch the code, at least for now. This becomes even more critical due to a shift in business priorities: another 30K client applications from another business need to be added to this setup.
The task before us is to integrate it with another application based on a microservice architecture with RabbitMQ middleware. This is a customer-facing application with strict QoS requirements; we cannot afford outages or downtime in it. As part of this integration, request messages coming from the above legacy application over the TCP socket need to be processed before being passed to the database. In other words, we want to introduce a component which processes the legacy application's requests before handing them over to the database. This additional processing is part of our client's requirements. Some of the processing is very intensive and resource-hungry in terms of CPU cycles, memory, and socket I/O. As a result, there is a chance such processing may lead to server downtime and higher RTT. Our layer itself is very flexible; we can easily add more servers or replace faulty ones. But that doesn't help much in this integration, since we are limited by the legacy application's single socket connection: in total we can have at most 2 (+6 for the new 30K client applications) servers. This is our cause for concern.
I want to know what options are available to address the high-availability, scalability, and latency issues of such an integration. Especially given the limitation of a single TCP socket connection, how can we make this integration efficient - something that can handle back-pressure, offer better application uptime, etc.?
We were thinking of leveraging RabbitMQ, a layer-4 load balancer (like HAProxy or NGINX), IPVS, NAT, etc., but all of these lead toward making changes to the legacy code (or are not very efficient techniques), which we want to avoid.
In my application, I need to configure 2 databases during startup. They are created as Tomcat JDBC pools - org.apache.tomcat.jdbc.pool - with separate pool properties. If I configure them such that both database URLs, user names, and passwords are the same, i.e., both point to the same database server, how will the connection pools be created? Will it create 2 pools with different properties, or only one? If it's only one, which pool properties will be applied - those of the pool created first, or the next one?
Also, please suggest whether there is any tool which can be used to see the connections to a database and the pools created on it.
My guess is that they would create two separate pools (see the sketch below). You can check the SQL server for the "active connections", of which the pool should keep a few alive.
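A minimal sketch with the Tomcat JDBC pool API illustrating this (URL and credentials are hypothetical): each DataSource instance holds its own pool with its own properties, even when both point at the same server.

    import org.apache.tomcat.jdbc.pool.DataSource;
    import org.apache.tomcat.jdbc.pool.PoolProperties;

    public class TwoPoolsDemo {
        private static DataSource buildPool(int maxActive) {
            // Identical URL and credentials (hypothetical values): each
            // DataSource instance still manages its own independent pool.
            PoolProperties p = new PoolProperties();
            p.setUrl("jdbc:mysql://localhost:3306/app");
            p.setDriverClassName("com.mysql.cj.jdbc.Driver");
            p.setUsername("app");
            p.setPassword("secret");
            p.setMaxActive(maxActive);

            DataSource ds = new DataSource();
            ds.setPoolProperties(p);
            return ds;
        }

        public static void main(String[] args) {
            DataSource poolA = buildPool(10);
            DataSource poolB = buildPool(20);
            // Two separate pools against the same server: the database
            // simply sees connections from both, each pool keeping its
            // own configured limits.
            System.out.println(poolA.getPoolProperties().getMaxActive()); // 10
            System.out.println(poolB.getPoolProperties().getMaxActive()); // 20
        }
    }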
My suggestion, though, is to use HikariCP as your connection pool. I've found it to be the most robust (it survives even if the SQL server goes down) and the fastest (the lightest and smallest lib).
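For reference, a minimal HikariCP setup looks like this (URL and credentials are hypothetical; it assumes a reachable database, since the pool connects eagerly on construction):

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    import java.sql.Connection;
    import java.sql.SQLException;

    public class HikariExample {
        public static void main(String[] args) throws SQLException {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:mysql://localhost:3306/app"); // hypothetical
            config.setUsername("app");
            config.setPassword("secret");
            config.setMaximumPoolSize(10);

            // The pool is the DataSource; closing it shuts the pool down.
            try (HikariDataSource ds = new HikariDataSource(config);
                 Connection con = ds.getConnection()) {
                System.out.println("valid: " + con.isValid(1));
            }
        }
    }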
I'm in the process of upgrading our client-server ERP app to multi-tier. We want to offer our customers the possibility of having their databases in the cloud (hosted on our server). Clients are written in Delphi; the server is an HTTP IOCP server also written in Delphi (from the mORMot framework); for the databases we use Firebird Embedded.
Our customers (let's say 200) can each have 25-30 Firebird databases (5000-6000 databases in total), accessed by 4-5 users per customer. This does not all happen at once: one user can work in one db, two other users in another db, but all the dbs should be available and online. So I can have 800-1000 users working on 700-900 dbs. The databases are not big - typically 20-30 MB each, but they can grow to 200 MB.
This is not data sharding, so please don't suggest merging all the databases together; I really need them individually, with the possibility to backup/restore/replace each one of them.
So I need multiple pools of connections - for every database, a pool of, let's say, 2 connections. I read about FireDAC connection pooling; it seems that TFDManager should be perfect for me. I define multiple "ConnectionDef"s with "Pooled=true" and it can maintain multiple pools of connections (each connection living until some minutes of inactivity).
Questions:
Do I have to create all "ConnectionDef"s before the server starts serving requests?
Can TFDManager handle requests (and time out connections on inactivity) while, in another thread, I create a new database - which means I need to create a new pool of connections and start serving requests from the newly created database? Practically speaking: can I call FDManager.AddConnectionDef(..) while other pools are in use?
AFAIK Firebird Embedded does not have any "connection". As its name states, it is embedded within the same process, so no connection pool is needed. Connection pools are needed when several clients connect/disconnect to the same DB over the network, whereas here everything is embedded and you have direct access to the Firebird engine.
As such, you may:
Define one "connection" per Firebird embedded database;
Protect your SOA code via a mutex (aka critical section) per DB. In fact, mORMot's HTTP IOCP server runs incoming requests from a thread pool, so ensure that all DB access is made thread-safe.
Ensure you use at least Firebird 2.5, since the embedded version is reported to be thread-safe only since version 2.5 (see the release notes).
Instead of FireDAC, consider using ZDBC/Zeos (in the latest 7.2/7.3 branch), which has nice features, together with mORMot's native SynDB libraries.
Looking at the FireDAC sources, it seems that everything about adding connection definitions and acquiring connections in pooled mode is thread-safe.
Adding a connection definition, or matching one, is guarded by a TMultiReadExclusiveWriteSynchronizer, and acquiring a connection from the pool is guarded by a TCriticalSection.
So, answers:
I don't have to create all "ConnectionDef"s before the server starts serving requests.
Yes, I can safely call FDManager.AddConnectionDef(..) while other pools are in use.
Using FireDAC, acquiring a connection for any of those databases is guarded by a single TCriticalSection. The solution proposed by Arnaud Bouchez provides more fine-grained access by creating one TCriticalSection per database, and I think it will scale better; but you should be aware of a bug affecting multiple TCriticalSection instances, especially when they are all allocated at once:
https://www.delphitools.info/2011/11/30/fixing-tcriticalsection/
That article presents a very simple fix for this bug.
Why is a database connection expensive to create? Exactly what finite resource (bandwidth, network round trips, CPU) is it consuming?
Typically, "expensive to create" means something is consuming a resource like CPU, disk, or I/O, but in the case of a connection I can only think of the time the TCP handshake (SYN/ACK, etc.) takes.
You didn't say what database you are asking about, so this answer is pretty generic.
Database connections are much more than just a TCP/IP socket. Each connection consumes memory that associates the user with various resources in the database; it will likely use up memory blocks from a shared memory pool, etc. Just authorizing the connection will run several queries, depending on the connection string: first the user is authenticated; if an "initial catalog" is specified, an authorization check is performed as well; and if some sort of auditing is going on, the connection will be logged somewhere.
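A small sketch to make the cost visible, assuming a reachable database and a JDBC driver on the classpath (URL and credentials are hypothetical): it times the initial connect - which includes the handshake, authentication, and session setup described above - against a trivial query on the already-open connection.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class ConnectVsQuery {
        public static void main(String[] args) throws SQLException {
            // Hypothetical URL/credentials; substitute your own database.
            String url = "jdbc:postgresql://localhost:5432/test";

            long t0 = System.nanoTime();
            try (Connection con = DriverManager.getConnection(url, "app", "secret")) {
                long connectNanos = System.nanoTime() - t0;

                // A trivial query on the already-open connection: one round
                // trip, no handshake, no authentication, no session setup.
                long t1 = System.nanoTime();
                try (Statement st = con.createStatement()) {
                    st.execute("SELECT 1");
                }
                long queryNanos = System.nanoTime() - t1;

                System.out.printf("connect: %.1f ms, trivial query: %.1f ms%n",
                        connectNanos / 1e6, queryNanos / 1e6);
            }
        }
    }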