I am trying to debug a connection leak in my app: the connection manager is using more connections than I would expect. One lead I have is the following:
According to the Apache HttpClient quick start, if the response content is not fully consumed for any reason, the pooling connection manager cannot safely reuse the underlying connection and will discard it.
Can anyone point me to the code block that checks a connection for unconsumed content and discards the connection?
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://httpbin.org/get");
// Please note that if response content is not fully consumed the underlying
// connection cannot be safely re-used and will be shut down and discarded
// by the connection manager.
try (CloseableHttpResponse response1 = httpclient.execute(httpGet)) {
    System.out.println(response1.getCode() + " " + response1.getReasonPhrase());
    HttpEntity entity1 = response1.getEntity();
    // do something useful with the response body
    // and ensure it is fully consumed
    EntityUtils.consume(entity1);
}
HttpClient versions 4.x and 5.x wrap the HTTP response entity with a proxy that releases the underlying connection back to the pool upon reaching the end of the message stream. In all other cases HttpClient assumes the message has not been fully consumed and that the underlying connection cannot be re-used.
https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/main/java/org/apache/hc/client5/http/impl/classic/ResponseEntityProxy.java
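For intuition, here is a simplified sketch of the idea (not the actual ResponseEntityProxy code; ConnectionHolder and its methods are made up for illustration): the entity's content stream is wrapped so that reading through to end-of-stream releases the connection back to the pool, while closing the stream before end-of-stream discards the connection.

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical handle to a pooled connection, for illustration only.
interface ConnectionHolder {
    void releaseToPool(); // connection is clean and may be reused
    void discard();       // connection state is unknown; close it
}

class EntityStreamProxy extends FilterInputStream {
    private final ConnectionHolder conn;
    private boolean eof;

    EntityStreamProxy(InputStream in, ConnectionHolder conn) {
        super(in);
        this.conn = conn;
    }

    // The real proxy would also override read(byte[], int, int) the same way.
    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b == -1) {
            eof = true;
            conn.releaseToPool(); // fully consumed: safe to reuse
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        if (!eof) {
            conn.discard(); // not fully consumed: drop the connection
        }
        super.close();
    }
}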
Related
I'm using java.net.http.HttpClient.newHttpClient() under Java 19 (Temurin) and perform sendAsync(...) requests from different threads on the same instance. I assume this is OK, as the javadoc states:
Once built, an HttpClient is immutable...
However, some requests fail with:
java.io.IOException: HTTP/1.1 header parser received no bytes
The weird thing is, it depends on the speed of my requests:
Requests every 5 seconds: 30% failure
Requests every 3 seconds: 0% failure
I've written a test for it:
private final HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create("https://..."))
.setHeader("Content-Type", "application/json")
.POST(HttpRequest.BodyPublishers.ofByteArray("[]".getBytes()))
.build();
@ParameterizedTest
@ValueSource(ints = {3, 5})
void httpClientTest(int intervalSeconds) throws Exception {
HttpClient httpClient = HttpClient.newHttpClient();
httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofByteArray()).get();
Thread.sleep(Duration.ofSeconds(intervalSeconds));
httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofByteArray()).get();
Thread.sleep(Duration.ofSeconds(intervalSeconds));
httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofByteArray()).get();
Thread.sleep(Duration.ofSeconds(intervalSeconds));
httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofByteArray()).get();
Thread.sleep(Duration.ofSeconds(intervalSeconds));
httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofByteArray()).get();
}
I've already tried the following:
Doing the same with curl on the command line. No requests fail, whatever interval I try. So it's probably not a problem with the server.
Running the tests multiple times in parallel. The 5-second intervals still fail (multiple times in parallel, in that case). So it's probably not a problem with the server.
Creating an HttpClient.newHttpClient() for every request. No requests fail, whatever the interval. So it's probably not a problem with the server, but with some internal state of the HttpClient (although it claims to be immutable?).
Do you have an idea what I could do, without needing to create a new HttpClient for every request?
Here is the answer, for the record: java.net.http.HttpClient has a long default HTTP/1.1 keep-alive time, longer than what usual servers are configured with. This often results in the server closing idle HTTP/1.1 connections before the client does. If the server closes the connection at about the same time as the client tries to reuse it, an IOException may get raised.
If such exceptions are observed too frequently, applications should consider adapting the default keep-alive time in the client to a value shorter than what the servers it connects to are using.
A default value for the HttpClient HTTP/1.1 keepAlive time can be specified on the command line with: -Djdk.httpclient.keepalive.timeout=duration-in-seconds
So, for instance, if a server is configured with a keep-alive time of 5 seconds, you could consider supplying -Djdk.httpclient.keepalive.timeout=3 or -Djdk.httpclient.keepalive.timeout=4 on the client's java command line.
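The same timeout can also be set programmatically, as long as it happens before the first HttpClient is created, since the implementation reads the property when it initializes. A minimal sketch (the URL is a placeholder):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveTimeoutExample {
    public static void main(String[] args) throws Exception {
        // Must run before the first HttpClient is created.
        System.setProperty("jdk.httpclient.keepalive.timeout", "3"); // seconds
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/")) // placeholder URL
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}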
I am using the TIdHTTP component and its GET function.
The GET function sends a complete request, which is fine.
However, I would like to save some traffic on the GET response and only receive the response code, which is in the first line of an HTTP response.
Is there a possibility of disconnecting the connection in order to avoid receiving any further content?
As mentioned, I only need the response code from a website.
I have alternatively thought about using Indy's TCP component (with an SSL IOHandler), crafting my own HTTP request header, and then reading the response code and disconnecting on success, but I don't know how to do that.
TIdHTTP has an OnHeadersAvailable event that is intended for this very task. It is triggered after the response headers have been read and before the body content is read, if any. It has a VContinue output parameter that you can set to False to cancel any further reading.
Update: Something I just discovered: when setting VContinue=False in the OnHeadersAvailable event, TIdHTTP will set Response.KeepAlive=False and skip reading the response body (OK so far). But after the response is done being processed, TIdHTTP checks the KeepAlive value, and the property getter returns True if the socket hasn't been closed on the server's end (HTTP 1.1 uses keep-alives by default). This causes TIdHTTP to not close its end of the socket, leaving any response body unread. If you then re-use the same TIdHTTP object for a new HTTP request, it will end up processing the unread body data from the previous response before it sees the headers of the new response.
You can work around this issue by setting the Request.Connection property to 'close' before calling TIdHTTP.Get(). That tells the server to close its end of the socket connection after sending the response (although I just found that when requesting an HTTPS URL, especially after an HTTP request redirects to HTTPS, TIdHTTP clears the Request.Connection value!). Or, simply call TIdHTTP.Disconnect() after TIdHTTP.Get() exits.
I have now updated TIdHTTP to:
no longer clear the Request.Connection when preparing an HTTPS request.
close its end of the socket connection if either:
OnHeadersAvailable returns VContinue=False
the Request.Connection property (or, if connected to a proxy, the Request.ProxyConnection property) has been set to 'close', regardless of the server's response.
Usually you would use TIdHttp.Head, because HEAD requests are intended for doing just that.
If the server does not accept HEAD requests, as in the OP's case, you can assign the OnWorkBegin event of your TIdHTTP instance and call TIdHTTP(Sender).Disconnect; there. This immediately closes the connection so the download does not continue, but you still have the metadata, like the response code, content length, etc.
I'm using a RabbitMQ server with AMQP.
I am having a difficult problem: after leaving the connection idle for about 10 minutes, it is lost.
What could be causing this?
If you look at the Erlang client documentation (http://www.rabbitmq.com/erlang-client-user-guide.html) you will see a section titled Connecting To A Broker.
It describes a few options you can specify when setting up your connection to the RabbitMQ server. One of these options is the heartbeat; as you can see, the default is 0, so no heartbeat is specified.
I don't know the exact Erlang notation, but you will need to do something like:
{ok, Connection} = amqp_connection:start(#amqp_params_network{heartbeat = 5})
The heartbeat timeout is specified in seconds, so this would cause your consumer to send a heartbeat back to the server every 5 seconds.
Also take a look at this discussion: https://groups.google.com/forum/?fromgroups=#!topic/rabbitmq-discuss/u227xzvqOr8
The default connection timeout for the RabbitMQ connection factory is 600 seconds (at least in the Java client API), hence your 10 minutes. You can change this by specifying your timeout of choice to the connection factory.
It is good practice to ensure your connection is released and recreated after a specific amount of time, to prevent eventual leaks and excessive resource usage. Your code should check that a connection is still valid and not close to being timed out before using it, and re-establish a new connection for any that did time out. Overall, adopt a connection-pooling approach.
Java example:
ConnectionFactory factory = new ConnectionFactory();
factory.setHost(this.serverName);
factory.setPort(this.serverPort);
factory.setUsername(this.userName);
factory.setPassword(this.userPassword);
factory.setConnectionTimeout(YOUR_TIMEOUT_IN_MILLISECONDS); // note: the Java client takes milliseconds
Connection connection = factory.newConnection();
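If the drops turn out to be caused by missing heartbeats rather than by the connection timeout, the Java client exposes the same heartbeat option shown in the Erlang snippet above. A minimal sketch, assuming a broker on localhost and a 5-second interval (Connection is Closeable in recent versions of the client):

import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class HeartbeatExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");     // placeholder broker host
        factory.setRequestedHeartbeat(5); // heartbeat interval in seconds
        try (Connection connection = factory.newConnection()) {
            // The client and broker now exchange heartbeats, so an
            // otherwise idle connection is kept alive.
            System.out.println("connected: " + connection.isOpen());
        }
    }
}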
I have written a web service using Erlang and MochiWeb. The web service returns a lot of results and takes some time to finish the computation.
I'd like to return results as soon as the program finds them, instead of waiting until it has found them all.
Edit:
I found that I can use a chunked response to stream the results, but it seems I can't find a way to close the connection. So, any idea on how to close a MochiWeb request?
To stream data of yet unknown size over HTTP 1.1 you can use HTTP chunked transfer encoding. In this encoding, each chunk of data is prepended by its size in hexadecimal. The last chunk is a zero-length chunk, with the chunk size coded as 0 and without any data.
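For illustration, a chunked body carrying the nine bytes of "Wikipedia" looks like this on the wire (every line ends with CRLF, written out explicitly here):

4\r\n
Wiki\r\n
5\r\n
pedia\r\n
0\r\n
\r\n

The receiver concatenates the chunk payloads ("Wiki" + "pedia") and knows the message is complete when the zero-length chunk arrives.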
If the client doesn't support HTTP 1.1, the server can send the data as plain binary chunks and close the connection at the end of the stream.
In MochiWeb it all works as follows:
The HTTP response should be started with Response = Request:respond({Code, ResponseHeaders, chunked}). (By the way, look at the code comments.)
Chunks can then be sent to the client with Response:write_chunk(Data). To signal the end of the stream to the client, a chunk of zero length should be sent: Response:write_chunk(<<>>).
When handling of the current request is over, MochiWeb decides whether the connection should be closed or can be reused as an HTTP persistent connection.
I have read that HttpURLConnection supports persistent connections, so that a connection can be reused for multiple requests. I tried it, and the only way to send a second POST was by calling openConnection a second time. Otherwise I got an IllegalStateException("Already connected").
I used the following:
URL url = new URL("http://someconection.com"); // declared in scope so it can be reused below
HttpURLConnection con = (HttpURLConnection) url.openConnection();
// set output, input etc.
// send POST
// receive response
// read whole response
// close input stream
con.disconnect(); // have also tested commenting this out
con = (HttpURLConnection) url.openConnection();
// send new POST
The second request is sent over the same TCP connection (I verified it with Wireshark), but I cannot understand why (although this is what I want), since I have called disconnect.
I checked the source code of HttpURLConnection, and the implementation does keep a keep-alive cache of connections to the same destinations. My problem is that I cannot see how the connection is placed back in the cache after I have sent the first request. The disconnect closes the connection, and without the disconnect I still cannot see how the connection is placed back in the cache. I saw that the cache has a run method to go through all idle connections (I am not sure how it is called), but I cannot find where the connection is placed back into the cache. The only place where that seems to happen is in the finished method of HttpClient, but this is not called for a POST with a response.
Can anyone help me on this?
EDIT
My interest is: what is the proper handling of an HttpURLConnection object for TCP connection reuse? Should the input/output streams be closed, followed by a url.openConnection() each time, to send the new request (avoiding disconnect())? If yes, I cannot see how the connection is being reused when I call url.openConnection() for the second time, since the connection has been removed from the cache for the first request, and I cannot find how it is returned.
Is it possible that the connection is not returned to the keep-alive cache (a bug?), but the OS has not released the TCP connection yet, and on a new connection the OS returns the buffered (not yet released) connection, or something similar?
EDIT2
The only related information I found was from JDK_KeepAlive:
...when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK's HTTP protocol handler will try to clean up the connection and if successful, put the connection into a connection cache for reuse by future HTTP requests.
But I am not sure which handler this is; sun.net.www.protocol.http.Handler does not do any caching, as far as I saw.
Thanks!
Should input/output stream be closed followed by a url.openConnection(); each time to send the new request (avoiding disconnect())?
Yes.
If yes, I can not see how the connection is being reused when I call url.openConnection() for the second time, since the connection has been removed from the cache for the first request and can not find how it is returned back.
You are confusing the HttpURLConnection with the underlying Socket and its underlying TCP connection. They aren't the same. The HttpURLConnection instances are GC'd, the underlying Socket is pooled, unless you call disconnect().
From the javadoc for HttpURLConnection (my emphasis):
Each HttpURLConnection instance is used to make a single request but the underlying network connection to the HTTP server may be transparently shared by other instances. Calling the close() methods on the InputStream or OutputStream of an HttpURLConnection after a request may free network resources associated with this instance but has no effect on any shared persistent connection. Calling the disconnect() method may close the underlying socket if a persistent connection is otherwise idle at that time.
I found that the connection is indeed cached when the InputStream is closed. Once the InputStream has been closed, the underlying connection is placed back in the cache. The HttpURLConnection object is unusable for further requests, though, since the object is considered still "connected": its connected flag is set to true and is not cleared when the connection is placed back in the cache. So a new HttpURLConnection should be instantiated for each new POST, but the underlying TCP connection will be reused if it has not timed out.
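To make that pattern concrete, here is a minimal sketch (the URL is a placeholder): one fresh HttpURLConnection per request, the body fully drained and closed, and no disconnect(), so the underlying socket can go back to the keep-alive cache.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReusePattern {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/"); // placeholder URL
        for (int i = 0; i < 2; i++) {
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            try (InputStream in = con.getInputStream()) {
                while (in.read() != -1) {
                    // drain the body completely so the socket can be pooled
                }
            }
            // no con.disconnect(): that would close the pooled socket
        }
    }
}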
So EJP's answer was the correct description. Maybe the behavior I saw (reuse of the TCP connection despite explicitly calling disconnect()) was due to caching done by the OS? I do not know. I hope someone who knows can explain.
Thanks.
How do you "force use of HTTP 1.0" using the HttpURLConnection of the JDK?
According to the section "Persistent Connections" of the Java 1.5 networking guide, support for HTTP 1.1 persistent connections can be turned on or off using the Java property http.keepAlive (the default is true). Furthermore, the Java property http.maxConnections indicates the maximum number of (concurrent) connections per destination to be kept alive at any given time.
Therefore, a "force use of HTTP 1.0" could be applied for the whole application at once by setting the Java property http.keepAlive to false.
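For instance, a minimal sketch (example.com is a placeholder; the property must be set before the first HTTP connection is made, because the protocol handler may read it only once, at initialization):

import java.net.HttpURLConnection;
import java.net.URL;

public class NoKeepAlive {
    public static void main(String[] args) throws Exception {
        // Disable persistent connections for the whole JVM.
        System.setProperty("http.keepAlive", "false");
        URL url = new URL("http://example.com/"); // placeholder URL
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        System.out.println(con.getResponseCode()); // request carries Connection: close
        con.disconnect();
    }
}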
Hmmh. I may be missing something here (since this is an old question), but as far as I know, there are two well-known ways to force closing of the underlying TCP connection:
Force use of HTTP 1.0 (1.1 introduced persistent connections); this is indicated by the HTTP request line.
Send a 'Connection' header with the value 'close'; this will force closing as well (see the sketch below).
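A minimal sketch of the second option with HttpURLConnection (the URL is a placeholder):

import java.net.HttpURLConnection;
import java.net.URL;

public class ConnectionClose {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/"); // placeholder URL
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        // Ask the server to close the TCP connection after this exchange
        // instead of keeping it alive for reuse.
        con.setRequestProperty("Connection", "close");
        System.out.println(con.getResponseCode());
    }
}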
Abandoning streams will cause idle TCP connections. The response stream should be read completely. Another thing I overlooked initially, and have seen overlooked in most answers on this topic, is forgetting to deal with the error stream in case of exceptions. Code similar to this fixed one of my apps that wasn't releasing resources properly:
HttpURLConnection connection = (HttpURLConnection) new URL(uri).openConnection();
InputStream stream = null;
BufferedReader reader = null;
try {
    stream = connection.getInputStream();
    reader = new BufferedReader(new InputStreamReader(stream, Charset.forName("UTF-8")));
    // do work on part of the input stream
} catch (IOException e) {
    // drain the error stream so the underlying connection can be cleaned up
    InputStream es = connection.getErrorStream();
    if (es != null) {
        BufferedReader esReader = new BufferedReader(new InputStreamReader(es, Charset.forName("UTF-8")));
        while (esReader.readLine() != null) {
            // discard
        }
        esReader.close();
    }
    // do something with the IOException
} finally {
    // finish reading the input stream if it was not read completely in the try block, then close
    if (reader != null) {
        while (reader.readLine() != null) {
            // discard the remainder
        }
        reader.close();
    }
    // closing the BufferedReader already closes the wrapped stream,
    // so this extra close is redundant but harmless
    if (stream != null) {
        stream.close();
    }
    // release the connection; note that disconnect() closes the underlying
    // socket, so omit it if you want the connection to be pooled for reuse
    if (connection != null) {
        connection.disconnect();
    }
}
The buffered reader isn't strictly necessary; I chose it because my use case required reading one line at a time.
See also: http://docs.oracle.com/javase/1.5.0/docs/guide/net/http-keepalive.html