Connection reset by peer on Azure - neo4j

I have a web application running on an App Service on Azure cloud.
On the back-end I'm using a tcp connection to our database (Neo4j graph db), the best practice is to open the tcp connection and keep it alive in order to be more reactive when we perform queries.
The issue I encountered is that the database is logging the exception "Connection reset by peer";
reading on the web I found out that maybe Azure has a TCP timeout configured by default, I read it to be set up to 4 minutes, which could be my issue root cause.
Someone knows how to configure the tcp KEEP ALIVE to always for App Services on Azure?
I found on the web how to do it in Google cloud but nothing about Azure cloud.
Thank you in advance.
OaicStef

From everything I can find that is not an adjustable setting. Here is the forum link that says it will not be changing and that is a couple years old at this point. https://social.msdn.microsoft.com/Forums/en-US/32b76114-67a4-4e6b-ac45-61b0f0a0829f/changing-the-4-minute-request-time-out-for-app-services?forum=windowsazurewebsitespreview
I think you are going to have to add logic to your app that tests the connection, if it has been closed then either reopen it or create a new one. I don't know what language you are using to make any suggestions there.
Edit
I will add that the total number of TCP connections that can be open on a single App Service is about 6k, at least on the S1. Keep that in mind because if you don't have pooling on the server side or you are not disposing of those then you will exhaust that the TCP pool and you will start getting errors. I recommend you configure an alert for that.

Related

Kubernetes Service Selector change doesn't take effect on connected Clients

I want to display a maintenance page on an application running under Kubernetes whilst a deployment is in progress, in this “maintenance” window, I backup the database and then apply schema changes and then deploy the new version.
I thought maybe what I could do is change the service selector so that it would point to a nginx container serving up a simple maintenance page whilst the deployment progressed. Once the deployment had succeeded, I would switch back the selector to point to the pods that do the actual work.
My problem with this is approach is that unless I close and reopen the browser that is currently looking at the site then I never see the maintenance page; I’m guessing the browser is keeping a connection open. The public service address doesn’t change throughout this process.
I’m testing this locally on a Docker Kubernetes installation using a type of NodePort .
Any ideas on how to get it working or am I flogging a dead horse with this approach?
Regards
Lee
This happens due to a combination of how browsers and k8s services work.
Browsers cache TCP connections to servers: when requesting a page they will leave the TCP connection open, and if the user later requests more pages from the same domain, the browser will reuse the already-open TCP connection to save time.
The k8s service load balancing operates at the TCP layer. When a new TCP connection is received, it will be assigned to a pod from the Service, and it will keep talking to that pod for the entire TCP connection's lifetime.
So, the issue is your browser is keeping TCP connections open to your old pods, even if you modify the service.
How can we fix this?
Non-solution #1: have the browser not cache connections. As far as I know there's no way to do this, and you don't want it anyway because it'll make your site slower. Also, HTTP caching headers have no impact on this. Browsers always cache TCP connections. A no-cache header will make the browser request the page again, but over the already-open connection.
Non-solution #2: have k8s kill TCP connections when updating the service. This is not possible and is not desirable either because this behavior is what makes "graceful shutdown / request draining" deployment strategies work. See issue.
Solution #1: Use Layer 7 (HTTP) load balancing instead of Layer 4 (TCP) load balancing, such as nginx-ingress. L7 load balancing routes traffic to pods "per HTTP request", instead of "per TCP connection", so you won't have this problem even if browsers keep TCP connections open.
Solution #2: do this from your application instead of from k8s. For example, have an "in-maintenance" DB flag, check it on every request and serve the maintenance page if it's set.
Here is how services in Kubernetes work, they are basically a dummy loadbalancers forwarding requests to pods in a round robin fashion, and they select which pods to forward the requests to based on the labels as you have already figured out.
Now here is how http/tcp work, I open the browser to visit your website www.example.com the tcp takes it's round of syn,ack,syn-ack and I receive the data.
In your case once I open your website I get a reply from a certain pod based on how the service routed me, and that's it, no further communication is made.
Afterwards you remove the functional pods from the service and add the maintenance page, this will be only shown to the new clients connecting to your website.
I.E if I requested your website, and then you changed all the code and restarted NGINX, if I didn't refresh I would not receive new content
First of all make sure the content you are serving is not cached.
Second, make sure to close all open TCP connections when you shut down your pods. The steps should be as follows:
Change service selector to route traffic to maintenance pods
Gracefully shutdown running pods (this includes closing all open TCP connections)
Do maintenance
Change service selector back
As an alternative approach, you can use an ingress controller. That won't have this problem, because it doesn't maintain an open TCP connection to the pods.

Notify backend that container has been deployed

I want to create containers which start small webservices. Developers of our team should then upload small images which contain different services. A main backend system then uses these services.
My problem is: When a developer uploads a new service, how does the backend service know there is a new service it can use? Beforehand, when service X should be used and there was no service for that functionality, it returned just a simple message. When there is a service uploaded to do X, the main backend should use that service. But how does it know it is there and should be used?
You can add some notification to your small web service. But what do you do when the service will be down unexpectedly or network will be down at the short time? You need add some logic to clients for refreshing connection.
And docker recommends to use this way.
The problem of waiting for a database (for example) to be ready is
really just a subset of a much larger problem of distributed systems.
In production, your database could become unavailable or move hosts at
any time. Your application needs to be resilient to these types of
failures.
To handle this, your application should attempt to re-establish a
connection to the database after a failure. If the application retries
the connection, it should eventually be able to connect to the
database.
The best solution is to perform this check in your application code,
both at startup and whenever a connection is lost for any reason.

What's the meaning of expiry timeout in Activemq PooledConnectionFactory?

I don't understand the application of expiryTimeout field in Activemq PooledConnectionFactory. The java doc said "allow connections to expire, irrespective of load or idle time. This is useful with failover to force a reconnect from the pool, to reestablish load balancing or use of the master post recovery". please give me an example, a real scenario which expiryTimeout field effect in it.
The expiry timeout option is a bit of a legacy feature of the Pool that isn't all that useful in most applications these days. The way it works is that if you configure an expiration time then the Connection that is loaned out and is later closed will be completely closed and dropped should there be no other active users of the Connection, otherwise it stays alive until all active instances are closed, then the underlying Connection object is closed.
This works slightly differently than the Idle timeout which applies to Connection instances that are sitting unused in the pool and are closed after some length of time to release resources on the Broker side.
These days you are better off using a failover URI in the PooledConnectionFactory with broker support for rebalance of cluster clients enabled which would then dynamically redistribute the load in the broker cluster as opposed to the expiry timeout which only closes down Connection instances once everyone that is currently actively using them has released them by calling close on them.

TCP/IP long-term connections

I have a server application which runs on a Linux machine. I can connect this application from Windows/Linux machines and can send/recieve data. After a few hours, something occurs and I get following error on the client side.
On Windows: An existing connection was forcibly closed by the remote host
On Linux: Connection timed out
I have made a search on the web and found some posts which suggest to increase/decrease OS's keep alive time. However, it didin't work for me.
Can I found a soultion to this problem or should I simply try to reconnect to the server when the connection is forcibly closed?
EDIT: I have tracked the situation. I sent a data to the remote node and sent another data after waiting 5 hours. Sending side sent the first data, but whet the sender sent the second data it didn't response. TCP/IP stack of the sender repeated this 5 times by incrementing the times between retries. Finally, sender reset the connection. I can't be sure why this is happening (Maybe because of a firewall or NAT - see Section 2.4) but I applied two different approach to solve this problem:
Use TCP/IP keep alive using setsockopt (Section 4.2)
Make an application level keep alive. This is more reliable since the first approach is OS related.
It depends on what your application is supposed to do. A little more information and perhaps the code you use for listening and handling connections could be of help.
Regardless, technically a longer keep alive time, should prevent the OS from cutting you off. So perhaps it is something else causing the trouble.
Such a thing could be router malfunction or traffic causing your keep-alive packet to get lost.
If you aren't already testing it on a LAN (without heavy trafic) I suggest doing so.
It might also be due to how your socket is handled (which I can't determine from your question)
This article might help.
Non blocking socket with timeout
I'm not used to how connections are handled on Linux, but I expect the OS won't cut off a connection unnecessary.
You can re-establish connection as a recovery, but you need to take into account that not all disconnects are gentle, and therefore you could end up making recovery on a connection you actually wish to be closed.
Since it is TCP, it will do its best to make a gentle disconnect, but you can send a custom message telling the server or client not to re-establish the connection right before disconnecting. That way you be absolutely sure, despite that it should be unnecessary to do so.

Best practice: Keep TCP/IP connection open or close it after each transfer?

My Server-App uses a TIdTCPServer, several Client apps use TIdTCPClients to connect to the server (all computers are in the same LAN).
Some of the clients only need to contact the server every couple of minutes, others once every second and one will do this about 20 times a second.
If I keep the connection between a Client and the Server open, I'll save the re-connect, but have to check if the connection is lost.
If I close the connection after each transfer, it has to re-connect every time, but there's no need to check if the connection is still there.
What is the best way to do this?
At which frequency of data transfers should I keep the connection open in general?
What are other advantages / disadvantages for both scenarios?
I would suggest a mix of the two. When a new connection is opened, start an idle timer for it. Whenever data is exchanged, reset the timer. If the timer elapses, close the connection (or send a command to the client asking if it wants the connection to remain open). If the connection has been closed when data needs to be sent, open a new connection and repeat. This way, less-often-used connections can be closed periodically, while more-often-used connections can stay open.
Two Cents from experiment...
My first TCP/IP client/server application was using a new connection and a new thread for each request... years ago...
Then I discovered (using ProcessExplorer) that it consummed some network resources because all closed connection are indeed not destroyed, but remain in a particular state for some time. A lot of threads were created...
I even had some connection problems with a lot of concurent requests: I didn't have enough ports on my server!
So I rewrote it, following the HTTP/1.1 scheme, and the KeepAlive feature. It's much more efficient, use a small number of threads, and ProcessExplorer likes my new server. And I never run out of port again. :)
If the client has to be shutdown, I'll use a ThreadPool to, at least, don't create a thread per client...
In short: if you can, keep your client connections alive for some minutes.
While it may be fine to connect and disconnect for an application that is active once every few minutes, the application that is communicating several times a second will see a performance boost by leaving the connection open.
Additionally, your code will be much simple if you aren't trying to constantly open, close, or diagnose an open connection. With the proper open and close logic, and SEH around your read and writes, there's no reason to test if the socket is still connected before using, just use it. It will tell you when there is a problem.
I'd lean towards keeping a single connection open in most enterprise applications. It generally will lead to cleaner code, that is easier to maintain.
/twocents
I guess it all depends on your goal and the amount of requests made on the server in a given time not to mention the available bandwidth and the hardware on the server.
You need to think for the future as well, is there any chance that in the future you will need connections to be left open? if so, then you've answered your own question.
I've implemented a chat system for a project in which ~50 people(the number is growing with each 2 months) are always connected and besides chatting it also includes data transfer, database manipulation using certain commands, etc. My implementation is keeping the connection to the server open from the application startup until the application is closed, no issues so far, however if a connection is lost for some reason it is automatically reestablished and everything continues flawlessly.
Overall I suggest you try both(keeping the connection open and closing it after it's being used) and see which fits your needs best.
Unless you are scaling to many hundreds of concurrent connections I would definitely keep it open - this is by far the better of the two options. Once you scale past hundreds into thousands of concurrent connections you may have to drop and reconnect. I have architected my entire framework around this (http://www.csinnovations.com/framework_overview.htm) since it allows me to "push" data to the client from the server whenever required. You need to write a fair bit of code to ensure that the connection is up and working (network drop-outs, timed pings, etc), but if you do this in your "framework" then your application code can be written in such a way that you can assume that the connection is always "up".
The problem is the limit of threads per application, around 1400 threads. So max 1300 clients connected at the same time +-.
When closing connections as a client the port you used will be unavailable for a while. So at high volume you’re using loads of different ports. For anything repetitive i’d keep it open.

Resources