Database connection expensive to create?

Why is a database connection expensive to create? What finite resource (bandwidth, network round trips, CPU) exactly is it consuming?
Typically, "expensive to create" means something consumes a resource such as CPU, disk, or I/O, but in the case of a connection I can only think of the time it takes for the TCP handshake (SYN/ACK) and so on.

You didn't say what database you are asking about, so this answer is pretty generic.
Database connections are much more than just a TCP/IP socket. Each connection consumes memory on the server that associates the user with various resources in the database, and it will likely use some memory blocks from a shared memory pool as well. Just establishing the connection can run several queries, depending on the connection string. First the user is authenticated. If an "initial-catalog" is specified, an authorization check is performed as well. And if some sort of auditing is enabled, the connection is logged somewhere.

Related

How many Redis connections are used with ActionCable?

I'm looking at moving my app's deployment to Heroku, and I'd like to determine if it can correctly run there on the basic plan before putting in the effort to migrate. The basic plan limits Redis to 20 connections.
I don't fundamentally understand the Rails/Redis connection architecture. Is there a single connection to ActionCable, which then distributes the data, or is there one connection per actual client (i.e. one connection for every browser tab)?
As per the docs:
"An individual user will create one consumer-connection pair per browser tab, window, or device they have open."
ActionCable lets you identify a connection using a connection identifier, typically an object called current_user. With this approach, you can later retrieve all open connections for a given user (and potentially disconnect them all if the user is deleted, is unauthorized, or has too many connections open).
Also, note that ActionCable uses a worker pool to run connection callbacks and channel actions in isolation from your server's main thread.

Cowboy web server - improve performance

Cowboy is a web server written in Erlang. It spawns a new process for each request and then reuses that process for subsequent requests if the client uses HTTP pipelining (sending multiple requests on the same socket one after the other without waiting for the responses, assuming that the responses will be sent back in the same order as the requests were sent).
This is fine, but if you want to use this web server for building a real-time web app, it has one problem: when the socket is closed, for instance because of client network problems, the process representing that socket on the server is terminated. That means you can't use that process for storing session data. In a real-time web app you probably want state that outlives the HTTP request (if long polling is used, for instance): some state associated with the connected client, so that you can still think of them as "online" even after the HTTP request has ended.
In sock.js, this is solved by spawning one more process for each client (each session id).
So if you have 2000 clients using websockets, you will have around 4000 processes: one Cowboy process that represents the socket, and one more that keeps the session state alive in case the Cowboy process is terminated (for instance because of network problems).
THE QUESTION IS: I am relatively new to Erlang, so I don't know whether this makes much sense as a performance improvement, but I am thinking about rewriting Cowboy a bit so that the process representing a real-time connection does not end until I want it to (the process stays alive even when the underlying websocket is closed).
This would eliminate the need for one more session process per client, so instead of 4000 processes you would have just 2000. Can that be a huge performance boost in Erlang?
Erlang is pretty good with processes, but too much of anything ain't good. Using processes as direct mappings to sessions is not a good idea. Why not do it logically? I assume you can have some in-memory storage, say ETS, or even Mnesia.
If I am using websockets to communicate, each user is connected via one such process; however, you simply map a random unique session key to each individual process, and hence to each individual user.
-record(client,{web_sock_pid, session_key,username}).
If the process exits and the client end has a way of reconnecting, then once it re-identifies itself as the same user, the session key still holds. The pid of the attached process has changed, but that does not matter.
If it is NOT websockets and it is just HTTP REST/JSON/JSONP/XML services, then it is even easier. Use ETS tables in RAM. A new session is stored, together with the parameters defining it, in RAM; then for each request the session key comes along with the other parameters. Message delivery is by comet or by frequent checks from the client end.
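To make the session-key mapping concrete, here is a minimal sketch. It is written in Java rather than Erlang, with a ConcurrentHashMap standing in for the ETS table and a plain string (handlerId) standing in for the pid of the socket process. SessionTable, ClientSession, connect and reconnect are names made up for illustration, not part of Cowboy or sock.js.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Mirrors the fields of the client record above.
// handlerId is a hypothetical stand-in for web_sock_pid.
class ClientSession {
    final String sessionKey;
    final String username;
    volatile String handlerId;   // changes on every reconnect

    ClientSession(String sessionKey, String username, String handlerId) {
        this.sessionKey = sessionKey;
        this.username = username;
        this.handlerId = handlerId;
    }
}

// In-memory table keyed by session key (the role ETS plays in the answer).
class SessionTable {
    private final ConcurrentMap<String, ClientSession> sessions = new ConcurrentHashMap<>();

    // First connection: create a session and hand the key back to the client.
    String connect(String username, String handlerId) {
        String key = UUID.randomUUID().toString();
        sessions.put(key, new ClientSession(key, username, handlerId));
        return key;
    }

    // Reconnect: same key, new handler. Returns false if the key is unknown.
    boolean reconnect(String sessionKey, String newHandlerId) {
        ClientSession s = sessions.get(sessionKey);
        if (s == null) {
            return false;
        }
        s.handlerId = newHandlerId;
        return true;
    }

    ClientSession lookup(String sessionKey) {
        return sessions.get(sessionKey);
    }
}

class SessionTableDemo {
    public static void main(String[] args) {
        SessionTable table = new SessionTable();
        String key = table.connect("alice", "handler-1");
        table.reconnect(key, "handler-2");   // socket dropped, client came back
        ClientSession s = table.lookup(key);
        System.out.println(s.username + " is now served by " + s.handlerId);
        // prints "alice is now served by handler-2"
    }
}
```

The session survives the handler being replaced, which is the point of the answer: the state is attached to the key, not to the process that happens to hold the socket.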
Sounds like you are doing some premature optimizations if you ask me.
Erlang processes are very inexpensive. You shouldn't really have to worry about spawning too many processes.
Write it with two processes per websocket, then do some measurements to see where it is using the most memory and wasting the most cpu cycles.

Java: Sharing a connection pool across other J2SE apps...?

So I have a connection pool set up, which is great since I have an application that really needs it. However, what I would like to know is whether it is possible to share this connection pool with other J2SE apps. Would this even be worth it, as opposed to creating a connection pool based on each app's needs? If it would be prudent, how can I accomplish this?
It is not hard to have connection pools in a single JVM doing multiple things. That is what application servers do every day (using JNDI to throw objects across classloaders).
The interesting part is when you have the connection pool in a separate JVM from the client code needing it, as this does not immediately allow simply asking for and getting a connection from the pool and returning it afterwards.
Basically you have two options:
1. Do remote requests for all your JDBC commands over the network. This will most likely mean that the data travels over the network twice: from the database to the connection pool, and then from the connection pool to your application. If the database connections are very expensive objects, then this might be a viable solution.
2. Use RMI to get the connection object from the connection pool JVM to your own machine. This is a very expensive operation, but as far as I know it can include the actual driver classes, allowing your connection pool to provide connections to databases not known to your application JVM. To me this would only make sense if the database connections were ridiculously expensive, or if it was a requirement to be able to support additional databases after deployment without changing the original deployments.
Note that the primary reason for having connection pools at all is that it is expensive to create a connection, use it briefly, and then discard it. This is more true of some databases than others; e.g. MySQL connections are (or were when I tried) very cheap, so it might be simplest just to create them on demand.
So. First of all: Measure what your connection pool buys you in time, and then consider if it is worth your while to centralize this further.

Which strategy about connection management should we use when developing an application?

Which connection management strategy is better when developing a Windows-based application that uses a database as its data store? What about web-based applications?
1. When the user loads the first form of the application, a global connection opens, and on closing the last form of the application the connection closes and is disposed.
2. Each form within the application has a local connection (form scope). When the user wants to perform an operation such as insert, update, delete, or search, the application uses that connection, and when the form unloads the connection closes and is disposed.
3. Every operation within a form has a local connection (procedure scope). When the user wants to perform an operation such as insert, update, delete, or search, the application uses the procedure's connection, and at the end of every procedure the connection closes and is disposed.
Go with #3
You should try to only ever keep connections open for just as long as is required.
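As a rough illustration of option 3, here is a sketch in Java, where try-with-resources plays the role of a scoped connection. FakeDatabase and ScopedConnection are hypothetical stand-ins so that the example runs without a real database or driver; with real JDBC the same shape applies to java.sql.Connection, which is also AutoCloseable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a database server.
class FakeDatabase {
    int openConnections = 0;                  // connections currently open
    final List<String> rows = new ArrayList<>();

    ScopedConnection connect() {
        openConnections++;
        return new ScopedConnection(this);
    }
}

// Hypothetical stand-in for a connection; closing it releases it.
class ScopedConnection implements AutoCloseable {
    private final FakeDatabase db;
    private boolean closed = false;

    ScopedConnection(FakeDatabase db) {
        this.db = db;
    }

    void insert(String row) {
        if (closed) {
            throw new IllegalStateException("connection is closed");
        }
        db.rows.add(row);
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            db.openConnections--;
        }
    }
}

class CustomerForm {
    private final FakeDatabase db;

    CustomerForm(FakeDatabase db) {
        this.db = db;
    }

    // Option 3: the connection lives only for the duration of this operation.
    void saveCustomer(String name) {
        try (ScopedConnection conn = db.connect()) {
            conn.insert(name);
        } // closed here, even if insert() throws
    }
}

class ProcedureScopeDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        CustomerForm form = new CustomerForm(db);
        form.saveCustomer("Ada");
        form.saveCustomer("Grace");
        System.out.println(db.rows + ", open connections: " + db.openConnections);
        // prints "[Ada, Grace], open connections: 0"
    }
}
```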
Also have a look at:
Understanding Connection Pooling
SQL Server Connection Pooling (ADO.NET)
Connecting to a database server typically consists of several time-consuming steps. A physical channel such as a socket or a named pipe must be established, the initial handshake with the server must occur, the connection string information must be parsed, the connection must be authenticated by the server, checks must be run for enlisting in the current transaction, and so on.
In practice, most applications use only one or a few different configurations for connections. This means that during application execution, many identical connections will be repeatedly opened and closed. To minimize the cost of opening connections, ADO.NET uses an optimization technique called connection pooling.
Connection pooling reduces the number of times that new connections must be opened. The pooler maintains ownership of the physical connection. It manages connections by keeping alive a set of active connections for each given connection configuration. Whenever a user calls Open on a connection, the pooler looks for an available connection in the pool. If a pooled connection is available, it returns it to the caller instead of opening a new connection. When the application calls Close on the connection, the pooler returns it to the pooled set of active connections instead of closing it. Once the connection is returned to the pool, it is ready to be reused on the next Open call.
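The Open/Close behaviour described in the quote can be sketched roughly as follows. This is not ADO.NET's implementation, just a single-threaded toy in Java: SketchPool and PhysicalConnection are made-up names, the configuration string is an arbitrary example, and a real pooler also handles limits, validation, and concurrency.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an expensive physical connection.
class PhysicalConnection {
    final String config;

    PhysicalConnection(String config) {
        this.config = config;
    }
}

// Single-threaded sketch: one idle set per connection configuration.
class SketchPool {
    private final Map<String, Deque<PhysicalConnection>> idleByConfig = new HashMap<>();
    private int physicalOpens = 0;

    // "Open": reuse an idle connection for this configuration if one exists.
    PhysicalConnection open(String config) {
        Deque<PhysicalConnection> idle = idleByConfig.get(config);
        if (idle != null && !idle.isEmpty()) {
            return idle.pop();
        }
        physicalOpens++;                      // the expensive path
        return new PhysicalConnection(config);
    }

    // "Close": hand the connection back instead of tearing it down.
    void close(PhysicalConnection c) {
        idleByConfig.computeIfAbsent(c.config, k -> new ArrayDeque<>()).push(c);
    }

    int physicalOpens() {
        return physicalOpens;
    }
}

class SketchPoolDemo {
    public static void main(String[] args) {
        SketchPool pool = new SketchPool();
        for (int i = 0; i < 1000; i++) {
            PhysicalConnection c = pool.open("server=db1;user=app");
            pool.close(c);
        }
        System.out.println("logical opens: 1000, physical opens: " + pool.physicalOpens());
        // prints "logical opens: 1000, physical opens: 1"
    }
}
```

This is also why option 3 is cheap in practice when pooling is on: the application opens and closes a connection per operation, but the expensive physical open happens only once per configuration.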
This is quite a broad question. But usually, for any database server and application environment, opening a new connection and keeping it open is expensive. That's why you definitely don't want to open multiple connections from a single client, and should stick to process scope for connections.
In a desktop application using a database server, the strategy for handling its single connection depends a lot on the DB usage pattern. Say the app reads or writes a lot within 5 minutes and then does nothing with the DB for hours: it makes no sense to keep the connection open all the time (assuming there are many other clients). You may introduce some kind of timeout for closing an idle connection.
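One way such an idle timeout could look, as a sketch only: IdleClosingHolder and AppConnection are hypothetical names, and time is passed in explicitly as milliseconds so the behaviour is easy to follow. A real application would read a clock and call closeIfIdle from a timer.

```java
// Hypothetical stand-in for the application's single database connection.
class AppConnection {
    boolean open = true;

    void close() {
        open = false;
    }
}

// Keeps at most one connection and closes it after a period of inactivity.
class IdleClosingHolder {
    private final long idleTimeoutMillis;
    private AppConnection current;      // null when no connection is open
    private long lastUsedMillis;
    private int opens = 0;

    IdleClosingHolder(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    // Called before each database operation; reopens on demand.
    AppConnection acquire(long nowMillis) {
        closeIfIdle(nowMillis);
        if (current == null) {
            current = new AppConnection();
            opens++;
        }
        lastUsedMillis = nowMillis;
        return current;
    }

    // Called periodically, e.g. from a timer.
    void closeIfIdle(long nowMillis) {
        if (current != null && nowMillis - lastUsedMillis >= idleTimeoutMillis) {
            current.close();
            current = null;
        }
    }

    boolean isOpen() {
        return current != null;
    }

    int opens() {
        return opens;
    }
}

class IdleClosingDemo {
    public static void main(String[] args) {
        long minute = 60_000L;
        IdleClosingHolder holder = new IdleClosingHolder(5 * minute);
        holder.acquire(0);                 // burst of activity
        holder.acquire(2 * minute);        // still within the timeout: reused
        holder.closeIfIdle(10 * minute);   // idle for 8 minutes: closed
        holder.acquire(180 * minute);      // next burst hours later: reopened
        System.out.println("opens: " + holder.opens());
        // prints "opens: 2"
    }
}
```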
The web server situation depends a lot on the technology used. In PHP, say, every request is typically a "fresh start" with respect to the database connection: you open and close a connection for each mouse click. Popular Java application servers, on the other hand, have a DB connection pool and reuse the same connection instances across many HTTP request handling threads.

Difference between shareable and unshareable connections in a JDBC connection pool?

We noticed something strange in our Struts web application, which is hosted on Sun App Server Enterprise Edition 8.1.
The NumConnUsed value in the JDBC resource monitoring stays at over 100 connections even though there was relatively little user activity.
I did some research and found the following links:
http://j2ee-performance.blogspot.com/
http://www.ibm.com/developerworks/websphere/library/techarticles/0506_johnsen/0506_johnsen.html
"When the application closes a shareable connection, the connection is not truly closed, nor is it returned to the free pool. Rather, it remains in the Shared connection pool, ready for another request within the same LTC for a connection to the same resource."
Based on the above, is it true that if the resource ref scope in my web.xml is set to shareable, then when the application closes a connection it remains in the shared connection pool, and that is why NumConnUsed is always so high?
If I interpret the links in my own special way (;)), shared vs. unshared connections is about different connections in the same page.
java.sql.Connection connectionOne = DriverManager.getConnection(...);
...
java.sql.Connection connectionTwo = DriverManager.getConnection(...);
These two, at a glance, seem to be separate connections, but if your application server (AS) is set to use shareable connections, the second one will be created as a pointer to the first connection instead of a new connection. When the page finishes, the connection should be sent back to the pool.
The AS is probably keeping the pool filled with connections to enhance performance.
This is not fact, only my own interpretation of the links.

Resources