Connection acquisition timing out despite idle connections - database-connection

We are using r2dbc-pool for our application, along with jOOQ.
The ConnectionFactory is created as follows:
ConnectionFactoryOptions.Builder connectionFactoryBuilder = ConnectionFactoryOptions.builder();
connectionFactoryBuilder.option(ConnectionFactoryOptions.HOST, ....)
        .option(ConnectionFactoryOptions.DRIVER, "pool")
        .option(ConnectionFactoryOptions.PROTOCOL, "postgres")
        .option(ConnectionFactoryOptions.DATABASE, ....)
        .option(ConnectionFactoryOptions.USER, username)
        .option(ConnectionFactoryOptions.PASSWORD, password);
return ConnectionFactories.get(connectionFactoryBuilder.build());
The ConnectionPoolConfiguration looks something like this
ConnectionPoolConfiguration configuration = ConnectionPoolConfiguration.builder(<connection-factory>)
        .initialSize(10)
        .maxCreateConnectionTime(Duration.ofSeconds(30))
        .maxAcquireTime(Duration.ofSeconds(30))
        .acquireRetry(3)
        .maxSize(20)
        .build();
We were constantly getting "Connection acquisition timed out after 30000ms". We suspected that the load might be too much for the pool and decided to log the PoolMetrics exposed by ConnectionPool.getMetrics().
The code we use to get a connection looks something like this:
public Single<Connection> getConnection(ConnectionPool connectionPool) {
    Optional<PoolMetrics> poolMetricsOptional = connectionPool.getMetrics();
    poolMetricsOptional.ifPresent(
            poolMetrics -> log.info("Connection Pool before acquiring connection: {}", poolMetrics)
    );
    Single<Connection> connectionSingle = Single.fromPublisher(connectionPool.create());
    poolMetricsOptional = connectionPool.getMetrics();
    poolMetricsOptional.ifPresent(
            poolMetrics -> log.info("Connection Pool after acquiring connection: {}", poolMetrics)
    );
    return connectionSingle;
}
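One caveat about the snippet above: Single.fromPublisher(...) is lazy, so the "after acquiring connection" metrics are read before anything has subscribed, i.e. before an acquisition has actually been attempted. A minimal sketch of logging the metrics only once a connection has really been emitted, using the per-counter accessors on io.r2dbc.pool.PoolMetrics (this is not our production code), would be:

// Sketch: log pool metrics only after a connection has actually been emitted.
public Single<Connection> getConnection(ConnectionPool connectionPool) {
    logPoolMetrics(connectionPool, "before acquiring connection");
    return Single.fromPublisher(connectionPool.create())
            .doOnSuccess(connection -> logPoolMetrics(connectionPool, "after acquiring connection"));
}

private void logPoolMetrics(ConnectionPool connectionPool, String stage) {
    connectionPool.getMetrics().ifPresent(metrics -> log.info(
            "Connection Pool {}: acquired={}, allocated={}, idle={}, pendingAcquire={}",
            stage, metrics.acquiredSize(), metrics.allocatedSize(),
            metrics.idleSize(), metrics.pendingAcquireSize()));
}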
When we started hitting the timeout, the logs looked something like this:
Connection Pool before acquiring connection: Acquire Size: 1, Allocated Size: 20, Idle Size: 9, Pending Acquire Size: 0
Connection Pool after acquiring connection: Acquire Size: 1, Allocated Size: 20, Idle Size: 9, Pending Acquire Size: 0
I have two questions:
Shouldn't acquire size + idle size be equal to allocated size?
Any idea why the connection isn't being acquired despite the idle connections?
Version details are as follows
r2dbc-pool: 0.9.1.RELEASE
r2dbc-postgresql: 0.9.0.RELEASE

Related

AWS SQS SimpleMessageListenerContainer failing while polling queue

I have one SqsListener in my Spring Boot app, as below:
@SqsListener(value = "QUEUE-FQN", deletionPolicy = SqsMessageDeletionPolicy.NEVER)
private void receiveNotifications(String payload, MessageHeaders headers, Acknowledgment acknowledgment)
        throws IOException, ParseException, InterruptedException {
    // process message here
}
And the following two beans defined:
@Bean
public AmazonSQSAsync amazonSQSAsync(AWSCredentialsProvider awsCredentialsProvider) {
return AmazonSQSAsyncClientBuilder
.standard()
.withCredentials(awsCredentialsProvider)
.withRegion(Regions.US_EAST_1.getName())
.build();
}
@Bean
public SimpleMessageListenerContainerFactory simpleMessageListenerContainerFactory(AmazonSQSAsync amazonSqs) {
SimpleMessageListenerContainerFactory factory = new SimpleMessageListenerContainerFactory();
factory.setAmazonSqs(amazonSqs);
factory.setMaxNumberOfMessages(10);
factory.setAutoStartup(false);
return factory;
}
After calling simpleMessageListenerContainer.start(QUEUE_NAME), I constantly see the exception below and the listener is never able to poll for new messages (it never makes any progress).
Is there anything I am doing wrong here? How do I get past the error below?
WARN 1297 --- [enerContainer-2] i.a.c.m.l.SimpleMessageListenerContainer : An Exception occurred while polling queue 'https://sqs.us-east-1.amazonaws.com/ACCOUNTID/QUEUE_NAME'. The failing operation will be retried in 10000 milliseconds
org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor@3cee2db[Running, pool size = 11, active threads = 11, queued tasks = 0, completed tasks = 1188]] did not accept task: io.awspring.cloud.messaging.listener.SimpleMessageListenerContainer$SignalExecutingRunnable@681b4433
at org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.execute(ThreadPoolTaskExecutor.java:363)
at io.awspring.cloud.messaging.listener.SimpleMessageListenerContainer$AsynchronousMessageListener.run(SimpleMessageListenerContainer.java:343)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.RejectedExecutionException: Task io.awspring.cloud.messaging.listener.SimpleMessageListenerContainer$SignalExecutingRunnable@681b4433 rejected from java.util.concurrent.ThreadPoolExecutor@3cee2db[Running, pool size = 11, active threads = 11, queued tasks = 0, completed tasks = 1188]
at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
at org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.execute(ThreadPoolTaskExecutor.java:360)
... 6 more
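For what it's worth, the rejection comes from the container's internal ThreadPoolTaskExecutor (pool size 11, zero queue capacity, abort policy), which appears to be sized from maxNumberOfMessages. One possible mitigation, sketched below under the assumption that this version of SimpleMessageListenerContainerFactory exposes setTaskExecutor, and with purely illustrative sizes, is to hand the factory a larger executor with some queue capacity:

// Sketch only: sizes and thread name prefix are illustrative, not recommendations.
@Bean
public SimpleMessageListenerContainerFactory simpleMessageListenerContainerFactory(AmazonSQSAsync amazonSqs) {
    SimpleMessageListenerContainerFactory factory = new SimpleMessageListenerContainerFactory();
    factory.setAmazonSqs(amazonSqs);
    factory.setMaxNumberOfMessages(10);
    factory.setAutoStartup(false);

    // Larger executor with a bounded queue so bursts of received messages
    // are buffered instead of being rejected outright.
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setThreadNamePrefix("sqs-listener-");
    taskExecutor.setCorePoolSize(15);
    taskExecutor.setMaxPoolSize(30);
    taskExecutor.setQueueCapacity(50);
    taskExecutor.initialize();
    factory.setTaskExecutor(taskExecutor);

    return factory;
}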

gRPC streaming call that takes longer than 2 minutes is killed by hardware (routers, etc.) between client and server

Grpc.Net client:
a gRPC client sends a large amount of data to a gRPC server
after the gRPC server receives the data from the client, the HTTP/2 channel becomes idle (but stays open) until the server returns the response to the client
the gRPC server receives the data and starts processing it. If the processing takes longer than 2 minutes (the default idle timeout for HTTP calls), the response never reaches the client: the channel has actually been disconnected, but the client does not know this because the connection was shut down by intermediate hardware due to the long idle time.
Solution:
when the channel is created on the gRPC client side, it must have an HttpClient set on it
the HttpClient must be created from a SocketsHttpHandler with the following properties set (PooledConnectionIdleTimeout, PooledConnectionLifetime, KeepAlivePingPolicy, KeepAlivePingTimeout, KeepAlivePingDelay)
Code snippet:
SocketsHttpHandler socketsHttpHandler = new SocketsHttpHandler()
{
    PooledConnectionIdleTimeout = TimeSpan.FromMinutes(180),
    PooledConnectionLifetime = TimeSpan.FromMinutes(180),
    KeepAlivePingPolicy = HttpKeepAlivePingPolicy.Always,
    KeepAlivePingTimeout = TimeSpan.FromSeconds(90),
    KeepAlivePingDelay = TimeSpan.FromSeconds(90)
};
socketsHttpHandler.SslOptions.RemoteCertificateValidationCallback = (sender, cert, chain, sslPolicyErrors) => { return true; };
HttpClient httpClient = new HttpClient(socketsHttpHandler);
httpClient.Timeout = TimeSpan.FromMinutes(180);
var channel = GrpcChannel.ForAddress(_agentServerURL, new GrpcChannelOptions
{
    Credentials = ChannelCredentials.Create(new SslCredentials(), credentials),
    MaxReceiveMessageSize = null,
    MaxSendMessageSize = null,
    MaxRetryAttempts = null,
    MaxRetryBufferPerCallSize = null,
    MaxRetryBufferSize = null,
    HttpClient = httpClient
});
A workaround is to package your message in a oneof and then send a KeepAlive from a separate thread every x seconds, for the duration of the calculations.
For example:
message YourData {
…
}

message KeepAlive {}

message DataStreamPacket {
  oneof payload {
    YourData data = 1;
    KeepAlive ka = 2;
  }
}
Then in your code:
stream <-
StartThread() {
    each 5 seconds:
        Send KeepAlive
}
doCalculations()
StopThread()
SendData()
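A rough Java (grpc-java) rendering of that pseudocode, purely as an illustration: the stub method sendData, the doCalculations() call, and the generated DataStreamPacket/KeepAlive/YourData classes are assumed from the proto above, and onNext is synchronized because StreamObserver is not thread-safe:

// Illustrative sketch only: keep the stream busy with KeepAlive packets while
// the long-running work executes, then send the real payload and stop the timer.
StreamObserver<DataStreamPacket> requestObserver = stub.sendData(responseObserver); // hypothetical RPC

ScheduledExecutorService keepAliveExecutor = Executors.newSingleThreadScheduledExecutor();
ScheduledFuture<?> keepAliveTask = keepAliveExecutor.scheduleAtFixedRate(() -> {
    synchronized (requestObserver) {
        requestObserver.onNext(DataStreamPacket.newBuilder()
                .setKa(KeepAlive.getDefaultInstance())
                .build());
    }
}, 5, 5, TimeUnit.SECONDS);

try {
    YourData result = doCalculations(); // the long-running part
    synchronized (requestObserver) {
        requestObserver.onNext(DataStreamPacket.newBuilder().setData(result).build());
    }
} finally {
    keepAliveTask.cancel(false);
    keepAliveExecutor.shutdown();
    requestObserver.onCompleted();
}

The keep-alive interval just needs to stay comfortably below whatever idle timeout the intermediate hardware enforces.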
This is what I needed. I'd had this problem for months, and my only workaround was to decrease the volume of data.

Flush data added to websocket

I'm writing a speed test, but I'm having trouble on the client side with uploading.
I have the following setup, which basically keeps writing data into the socket while a condition is true, and then closes the socket:
var ws = await createWebSocket(sb.serverAddress, sb.authToken);
while (condition) {
var bytes = generateRandomBytes(_BUFFER_SIZE_BYTES);
ws.add(bytes);
print('added');
var megabits = (bytes.length * 8) / 1000000;
channel.sink.add(megabits);
}
await ws.close();
My problem is that I can't work out how to wait for the bytes to be accepted by the underlying buffer. Even if I set _BUFFER_SIZE_BYTES to a huge size, it still loops at breakneck speed printing "added", whereas I really want to wait until all the bytes are accepted by the send buffer (having been accepted by the server) before adding a new list of bytes.
With an HTTP POST request you can do await postReq.flush(), but I don't see any such method for WebSockets.
OK, so I think I have a reasonable solution to this problem.
The client side has to wait for a response from the server before sending more bytes:
var bytes = generateRandomBytes(_CHUNK_SIZE_BYTES);
ws.listen((data) async {
  ws.add(bytes);
  var megabits = (bytes.length * 8) / 1000000;
  channel.sink.add(megabits);
});
The server (Go) sends a message to the client signalling that it can send a chunk, then reads the entire message from the client before signalling that it is ready to accept another one:
for start := time.Now(); time.Since(start) < time.Second*maxDuration; {
    err := conn.WriteMessage(websocket.TextMessage, []byte("next"))
    if err != nil {
        break
    }
    // we will get an error if we try writing to a closed socket
    _, bytes, err := conn.ReadMessage()
    if err != nil {
        fmt.Println(err)
        break
    }
    fmt.Println(len(bytes))
}
I think this solution is OK. I've set the chunk size to 10 MB, which seems to work fine. Let me know if anyone has a better idea.

connections were in use and max pool size was reached

I am working on .NET 5.0 and I am facing this error message:
Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
I set these parameters in my connection string:
"DefaultConnection": "Data Source=.;Initial Catalog=TestDB;User Id=sa;password=admin123;Pooling=true;Max Pool Size=100;MultipleActiveResultSets=true"
This is how I work with the database:
using (CD_DataToolContext objCDContext = new CD_DataToolContext())
{
    List<CdStandardFile> standardFiles = new List<CdStandardFile>();
    foreach (var item in data)
    {
        CdStandardFile cdStandardFile = new CdStandardFile();
        standardFiles.Add(cdStandardFile);
    }
    objCDContext.AddRange(standardFiles);
    objCDContext.SaveChanges();
}

Glassfish Connection Pool - java.sql.SQLException: Connection closed

For a web project I'm using JSF 2 with Oracle GlassFish Server Open Source Edition 4.0 and Oracle Database 11g (version 11.2.0-1.0).
The server and the database are running on the same Windows machine.
A connection pool manages the connections to the database.
Does anybody know why I sometimes get the following exception:
java.sql.SQLException: Connection closed
at com.sun.gjc.spi.base.ConnectionHolder.checkValidity(ConnectionHolder.java:766)
at com.sun.gjc.spi.base.ConnectionHolder.commit(ConnectionHolder.java:243)
at de.mydomain.myproject.Hl7MessageHandler.run(Hl7MessageHandler.java:123)
...
Or sometimes this one:
java.sql.SQLRecoverableException: Closed connection
at oracle.jdbc.driver.PhysicalConnection.commit(PhysicalConnection.java:5675)
at oracle.jdbc.driver.PhysicalConnection.commit(PhysicalConnection.java:5735)
at com.sun.gjc.spi.base.ConnectionHolder.commit(ConnectionHolder.java:244)
at de.mydomain.myproject.Hl7MessageHandler.run(Hl7MessageHandler.java:123)
...
The Database Class:
public static Connection getConnection() throws NamingException, SQLException {
    Context initContext = new InitialContext();
    DataSource dataSource = (DataSource) initContext.lookup("jdbc/Oracle");
    Connection connection = dataSource.getConnection();
    return connection;
}
Request handling in the servlet:
public IResponseSendable<String> run(String hl7MsgString, boolean publishErrorToDB) {
    // ... do something
    try {
        con = Database.getConnection();
    } catch (NamingException | SQLException conExc) {
        return generateAck(true, conExc.getMessage(), hl7MsgString);
    }
    try {
        con.setAutoCommit(false);
        process();
        con.commit();
    } catch (HL7Exception | SQLException pe) {
        logger.error(...);
        // Exception handling...
        try {
            con.rollback();
        } catch (SQLException rollbackExc) {
            logger.error(...);
        }
        return generateAck(true, pe.getMessage(), hl7MsgString, _log);
    } finally {
        try {
            con.setAutoCommit(true);
            con.close();
        } catch (SQLException e) {
            logger.error(...);
        }
    }
    return generateAck(false, "", hl7MsgString);
}
The process method:
private void process() throws HL7Exception, SQLException {
    // Do something...
    String sql = "BEGIN save_patient_data(?,?,?,?,?,?,?); END;";
    CallableStatement stmt = (CallableStatement) con.prepareCall(sql);
    stmt.setString(1, ...);
    // ...
    stmt.registerOutParameter(6, java.sql.Types.VARCHAR);
    stmt.registerOutParameter(7, java.sql.Types.NUMERIC);
    stmt.execute();
    // ...
    stmt.close();
    // More database stored procedures can be called ...
}
Connection Pool Settings:
Initial and Minimum Pool Size: 10 Connections
Maximum Pool Size: 60 Connections
Pool Resize Quantity: 2 Connections
Idle Timeout: 600 Seconds
Max Wait Time: 0 Milliseconds
Validate At Most Once: 0 Seconds
Connection Leak Timeout: 10 Seconds
Connection Leak Reclaim: enabled
Statement Leak Timeout: 6 Seconds
Statement Leak Reclaim: enabled
Creation Retry Attempts: 0
Retry Interval: 10 Seconds
Connection Validation: Required
Validation Method: meta-data
The database IDLE-Timeout setting is "UNLIMITED".
Notice:
The exception occurs either when calling con.prepareCall(sql) (not necessarily on the first call), when I try to commit the connection, or later when trying to turn autocommit back on.
Does anybody know the reason, or what is the best way to debug the application to find it out?
Thank you.
Edit:
Maybe this is important:
I can find many warnings about connection leaks in the server log:
2014-07-28T14:49:17.961+0200|Warnung: A potential connection leak detected for connection pool OraclePool. The stack trace of the thread is provided below :
com.sun.enterprise.resource.pool.ConnectionPool.setResourceStateToBusy(ConnectionPool.java:324)
com.sun.enterprise.resource.pool.ConnectionPool.getResourceFromPool(ConnectionPool.java:758)
com.sun.enterprise.resource.pool.ConnectionPool.getUnenlistedResource(ConnectionPool.java:632)
com.sun.enterprise.resource.pool.AssocWithThreadResourcePool.getUnenlistedResource(AssocWithThreadResourcePool.java:200)
com.sun.enterprise.resource.pool.ConnectionPool.internalGetResource(ConnectionPool.java:526)
com.sun.enterprise.resource.pool.ConnectionPool.getResource(ConnectionPool.java:381)
com.sun.enterprise.resource.pool.PoolManagerImpl.getResourceFromPool(PoolManagerImpl.java:245)
com.sun.enterprise.resource.pool.PoolManagerImpl.getResource(PoolManagerImpl.java:170)
com.sun.enterprise.connectors.ConnectionManagerImpl.getResource(ConnectionManagerImpl.java:360)
com.sun.enterprise.connectors.ConnectionManagerImpl.internalGetConnection(ConnectionManagerImpl.java:307)
com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:196)
com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:171)
com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:166)
com.sun.gjc.spi.base.AbstractDataSource.getConnection(AbstractDataSource.java:114)
de.mydomain.myproject.utilities.Database.getConnection(Database.java:17)
...
You have connection leak reclaim enabled and the connection leak timeout is 10 seconds. This means that if you hold onto a logical connection for longer than 10 seconds, it is forcibly revoked and closed by the connection pool manager (and the physical connection is returned to the pool). Subsequent attempts to use that logical connection will then result in an SQLException because the connection is closed.
Find out which operation takes longer than 10 seconds and try to reduce the time it takes, or configure a longer connection leak timeout (10 seconds is, IMHO, a bit short for connection leak detection). The same, by the way, applies to your statement leak detection (6 seconds is also pretty short).
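If you do end up tightening how long the connection is checked out, a minimal sketch of that (assuming process() can be changed to take the connection as a parameter rather than using a field) might look like this:

// Sketch: hold the connection only around the actual database work and let
// try-with-resources return it to the pool promptly, even on exceptions.
try (Connection con = Database.getConnection()) {
    con.setAutoCommit(false);
    try {
        process(con);   // assumes process() accepts the connection instead of using a field
        con.commit();
    } catch (HL7Exception | SQLException e) {
        con.rollback();
        throw e;        // or map to the negative ACK as in the original run() method
    }
}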
