'docker stop' impact on established tcp connection - docker

I'm running two Docker containers (TcpServer and TcpClient). In each container, an init.sh script launches the application. The script traps SIGTERM, but the handler does nothing; in particular, the signal is not forwarded to my application:
trap 'true' SIGTERM
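For contrast with the trap above, which swallows the signal, here is a minimal Python sketch of a launcher that relays SIGTERM to the application it started. The function name run_and_forward_sigterm is hypothetical and not part of the original setup; it only illustrates what "passing the signal on" would look like.

```python
import signal
import subprocess
import sys


def run_and_forward_sigterm(cmd):
    """Run cmd as a child process and relay SIGTERM to it.

    Hypothetical helper for illustration; returns the child's exit status.
    Must be called from the main thread (signal.signal requires that).
    """
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Pass the signal on to the child instead of swallowing it.
        child.send_signal(signum)

    previous = signal.signal(signal.SIGTERM, forward)
    try:
        # wait() is retried automatically after the handler has run.
        return child.wait()
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    # Example: python launcher.py ./my_app --some-flag
    sys.exit(run_and_forward_sigterm(sys.argv[1:]))
```

With a launcher like this, the application gets the chance to close its sockets during the grace period instead of being killed at the end of it.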
After startup, a TCP connection is established between TcpServer and TcpClient.
TcpClient is a multi-threaded application in which one thread (the receiver) runs:
while(true) {
//blocking tcp receive function call
//process received data or received error code.
}
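The receiver loop above can be sketched in Python as follows. This is only an illustration under assumptions: the function name receive_until_closed and the buffer size are made up, and the real client may be written in another language. The point it shows is that an orderly shutdown by the peer (a FIN) surfaces as recv() returning an empty byte string, while other failures surface as exceptions.

```python
import socket


def receive_until_closed(sock, bufsize=4096):
    """Block on recv() until the peer closes or an error occurs.

    Returns (received_bytes, reason). reason is "closed" on an orderly
    shutdown (FIN seen) or "error: <ExceptionName>" otherwise.
    """
    received = bytearray()
    while True:
        try:
            data = sock.recv(bufsize)  # blocking receive call
        except OSError as exc:  # e.g. ConnectionResetError or a timeout
            return bytes(received), "error: " + type(exc).__name__
        if not data:
            # recv() returning b"" means the peer sent FIN.
            return bytes(received), "closed"
        received.extend(data)  # stand-in for real processing
```

If no FIN ever arrives and the client never sends anything, this loop simply stays blocked in recv(), which matches the behaviour described in the question.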
The idea is that the receiver thread always finds out when the server goes down 'cleanly'.
Most of the time, when I issue 'docker stop serverContainer', the client application receives a TCP FIN packet after about 10 seconds. This matches my expectations: docker first sends SIGTERM, and since that is trapped, it sends SIGKILL once the grace period of about 10 seconds has elapsed.
My current understanding is that whenever a process receives SIGKILL or an unhandled SIGTERM, the kernel terminates it and closes all file descriptors it had open. If this is true, I should always see a FIN packet going from server to client as soon as the process is killed.
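The understanding above can be checked locally with a small experiment. The following Python sketch (all names are hypothetical, and it uses the loopback interface rather than two containers) starts a child process that connects and then sleeps, kills it with SIGKILL, and reads from the accepted connection to see what the surviving side observes.

```python
import signal
import socket
import subprocess
import sys

# Child: connect, announce readiness with one byte, then sleep without
# ever closing the socket itself.
CHILD_CODE = (
    "import socket, sys, time\n"
    "s = socket.create_connection(('127.0.0.1', int(sys.argv[1])))\n"
    "s.sendall(b'r')\n"
    "time.sleep(3600)\n"
)


def peer_view_after_sigkill():
    """Return what recv() yields on the peer after the child is SIGKILLed."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
        listener.listen(1)
        listener.settimeout(10)
        port = listener.getsockname()[1]
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(port)]
        )
        try:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(10)
                conn.recv(1)  # wait until the child is connected
                child.send_signal(signal.SIGKILL)
                child.wait()
                # The kernel closed the child's descriptors on exit, so
                # an empty result here means a FIN was received.
                return conn.recv(1024)
        finally:
            if child.poll() is None:
                child.kill()
                child.wait()
```

One caveat worth keeping in mind: a FIN is only sent once the last file descriptor referring to the socket is closed, so a surviving process that inherited the descriptor would delay it.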
However, occasionally no FIN packet appears in the traces captured at both the client and the server end. As a result, the client does not learn that the server has gone down until much later (when it tries to send data on that connection, or when TCP's keepalive mechanism kicks in).
I'm not sure how this happens, because if I explicitly send SIGKILL to PID 1 of the server's container (from outside), I always see the FIN packet. So why does the server sometimes fail to send a TCP FIN, and only when 'docker stop' is used?
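Since the keepalive mechanism is what eventually unblocks the client when no FIN arrives, it may help to tune it per socket rather than rely on system defaults. The Python sketch below is an assumption-laden illustration: the function name enable_keepalive and the timer values are invented, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are Linux-specific, so they are applied only where the socket module exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=30, interval_s=10, probes=3):
    """Turn on TCP keepalive with illustrative (not recommended) timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options below exist on Linux; skip them on
    # platforms where the constants are missing.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Idle seconds before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is reported dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

With these example values, a silently vanished peer would be detected after roughly idle_s + interval_s * probes seconds, at which point a blocked recv() fails with an error instead of waiting indefinitely.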
Basically I want to ask two things:
1. In Linux, when SIGKILL is sent to a TCP server process, is it guaranteed that the kernel/TCP stack sends a TCP FIN packet to the client before the process is terminated?
2. When I use 'docker stop', how exactly are the processes spawned by the main process (PID 1 inside the container) terminated? From what I have read, SIGTERM/SIGKILL is delivered only to PID 1. So why are its child processes not adopted by init/systemd, as happens when a parent process created outside the container is killed?
Operating System: Red Hat Enterprise Linux Server release 7.7 (Maipo)
Docker version: Docker version 19.03.14, build 5eb3275d40

Related

Checking for port exhaustion with Netstat HttpClientFactory WPF

I want to check that I have properly implemented HttpClientFactory. I have a desktop application that pings my server every 20 seconds. When I open command prompt and run "netstat -ano | findstr {My server IP}" I can see there are always 2 or 3 connections. As time goes on and I continue to check, the ports will slowly change (go up in their port #'s, older ports disappear) but there are never more than 2 or 3 connections. Does this mean that the old ports are being released and I am not at risk for port exhaustion? Thanks.
As mentioned above. I am going to begin selling my application very soon and need to be sure that I am not going to exhaust my client's ports and hinder their network.

How to reliably kill a port on crash in Erlang?

What is the best practice to reliably kill a port created by open_port?
Port = open_port({spawn,"yes"},[binary]),
% use Port and crash
% leak process
mykill(Port),
You can link/1 to ports from any process (both have to be node-local). The port is automatically linked to the process that started it, so if this process exits or any linked process crashes, the port closes.

Preventing uwsgi_response_write_body_do() TIMEOUT

We use uwsgi with the python3 plugin, under nginx, to serve potentially hundreds of megabytes of data per query. Sometimes, when nginx is queried by a client over a slow network connection, a uwsgi worker dies with "uwsgi_response_write_body_do() TIMEOUT !!!".
I understand the uwsgi python plugin reads from the iterator our app returns as fast as it can, trying to send the data over the uwsgi protocol unix socket to nginx. The HTTPS/TCP connection to the client from nginx will get backed up from a slow network connection and nginx will pause reading from its uwsgi socket. uwsgi will then fail some writes towards nginx, log that message and die.
Normally we run nginx with uwsgi buffering disabled. I tried enabling buffering, but it doesn't help as the amount of data it might need to buffer is 100s of MBs.
Our data is not simply read out of a file, so we can't use file offload.
Is there a way to configure uwsgi to pause reading from our Python iterator if that unix socket backs up?
The existing question here "uwsgi_response_write_body_do() TIMEOUT - But uwsgi_read_timeout not helping" doesn't help, as we have buffering off.
To answer my own question, adding socket-timeout = 60 is helping for all but the slowest client connection speeds.
That's sufficient so this question can be closed.

WSL2 + Docker - Keep Alive Bug in TCP stack

I wonder if others noticed this issue with the WSL2 Debian implementation of TCP.
I am connecting from a Docker container running WSL2 Debian v. 20
The TCP client sends a Keep-Alive packet every second which is kind of overkill. Then after roughly 5 minutes, the client terminates the connection without any reason. Is anybody seeing this behavior?
You can reproduce this by just opening a telnet session to another host. But the behavior happens on other types of sockets too.
And before you ask, this issue is not caused by the server, it does not occur when opening the same tcp connection from other hosts.
wireshark dump of the last few seconds of the idle TCP connection
I had the same problem with Ubuntu on WSL2. An outbound ssh connection closed after a period of time if there was no activity on that connection. This was particularly annoying when running an application that produced no screen output.
I suspect that the internal router that connects wsl to the local network dropped the idle TCP connection.
The solution was to shorten the TCP keep-alive timers in /proc/sys/net/ipv4, the following worked for me:
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 45 > /proc/sys/net/ipv4/tcp_keepalive_intvl
So I figured this out. Unfortunately, the WSL2 implementation of Debian seems to have this hardcoded in the stack. I tried to change the parameters of the socket open call and they didn't cause a change in the behavior.

Ctrl-Z sent to Fedora server over telnet does not stop the process

Objective:
I am working on an iOS terminal emulator for accessing my Unix server through the telnet protocol. I am testing against both AIX and Fedora Linux.
Problem:
If I send Ctrl-Z (ASCII 26) to the AIX server, it behaves as expected: I get back a string like stopped programname, and then any further characters I send get echoed back.
When I send it to the Fedora server, I get no echo-back until I send Ctrl-Z a second time. The program is running under Bash on the Fedora machine.
Why am I seeing this difference in behavior?
You have to make two calls:
Stopping the process:
kill -SIGSTOP $(pgrep process_name)
Continuing the process:
kill -SIGCONT $(pgrep process_name)
SIGSTOP tells a process to "hold on" and SIGCONT tells a process to "pick up where you left off".
See if that helps.
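The stop/continue pair described in this answer can also be exercised from Python. This is a minimal sketch under assumptions: the function name stop_and_continue is hypothetical, and the target is a throwaway child process rather than a program found via pgrep. It uses os.waitpid with WUNTRACED and WCONTINUED to confirm that each signal took effect.

```python
import os
import signal
import subprocess
import sys


def stop_and_continue():
    """Stop a child with SIGSTOP, resume it with SIGCONT, report both."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)  # "hold on"
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        os.kill(child.pid, signal.SIGCONT)  # "pick up where you left off"
        # WCONTINUED makes waitpid report the resumption.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
        return stopped, continued
    finally:
        # Clean up the throwaway child.
        child.kill()
        child.wait()
```

Unlike Ctrl-Z, which makes the terminal driver deliver SIGTSTP to the foreground job, SIGSTOP cannot be caught or ignored by the target process.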
