docker overlay network problems connecting containers - docker

We are running an environment of 6 engines each with 30 containers.
Two engines are running containers with nginx proxy. These two containers are the only way into the network.
It is now the second time that we are facing a major problem with a set of containers in this environment:
Both nginx containers are unable to reach some of the containers on other machines. Only one physical engine has this problem; all others are fine. It started with timeouts for some containers, and now, after 24 hours, all containers on that machine are affected.
Some more details:
Nginx is running on machine prod-3.
Second Nginx is running on machine prod-6.
Containers with problems are running on prod-7.
Neither nginx can reach the containers, but the containers can reach the nginx instances via "ping".
At the beginning, and still this morning, we could reach some of the containers but not others. It started with timeouts; now we cannot ping the containers in the overlay network at all. This time we are able to look at the traffic using tcpdump:
On the nginx container (10.10.0.37 on prod-3) we start a ping and, as you can see, get 100% packet loss:
root@e89c16296e76:/# ping ew-engine-evwx-intro
PING ew-engine-evwx-intro (10.10.0.177) 56(84) bytes of data.
--- ew-engine-evwx-intro ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7056ms
root@e89c16296e76:/#
On the target machine prod-7 (not inside the container) we see that all ping packets are received, so the overlay network is routing correctly to prod-7:
wurzel@rv_____:~/eventworx-admin$ sudo tcpdump -i ens3 dst port 4789 | grep 10.10.0.177
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
IP 10.10.0.37.35270 > 10.10.0.177.http: Flags [S], seq 2637350294, win 28200, options [mss 1410,sackOK,TS val 1897214191 ecr 0,nop,wscale 7], length 0
IP 10.10.0.37.35270 > 10.10.0.177.http: Flags [S], seq 2637350294, win 28200, options [mss 1410,sackOK,TS val 1897214441 ecr 0,nop,wscale 7], length 0
IP 10.10.0.37.35326 > 10.10.0.177.http: Flags [S], seq 2595436822, win 28200, options [mss 1410,sackOK,TS val 1897214453 ecr 0,nop,wscale 7], length 0
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 1, length 64
IP 10.10.0.37.35326 > 10.10.0.177.http: Flags [S], seq 2595436822, win 28200, options [mss 1410,sackOK,TS val 1897214703 ecr 0,nop,wscale 7], length 0
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 2, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 3, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 4, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 5, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 6, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 7, length 64
IP 10.10.0.37 > 10.10.0.177: ICMP echo request, id 83, seq 8, length 64
^C304 packets captured
309 packets received by filter
0 packets dropped by kernel
wurzel@_______:~/eventworx-admin$
First, you can see that there is no ICMP answer (the firewall is not responsible, nor is AppArmor).
Inside the target container (evwx-intro = 10.10.0.177) nothing is received; the interface eth0 (on the 10.10.0.0 overlay subnet) is just silent:
root@ew-engine-evwx-intro:/home/XXXXX# tcpdump -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
root@ew-engine-evwx-intro:/home/XXXXX#
It's really strange.
Is there any other Docker tool that can help us see what's going on?
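One thing we are considering (a sketch; "backend" is only a placeholder for the name of our overlay network) is comparing what each engine believes about the overlay:
# check the overlay network is known on every engine
docker network ls
# list the endpoints and IPs this engine believes are attached to it
docker network inspect backend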
We did not change anything in the firewall, and there were no automatic system updates (apart perhaps from security updates).
The only activity was that some old containers were reactivated after a long period (maybe 1-2 months) of inactivity.
We are really lost. If you have experienced something comparable, it would be very helpful to understand the steps you took.
Many thanks for any help with this.
=============================================================
6 hours later
After trying nearly everything for a full day, we made a final attempt:
(1) stop all the containers
(2) stop docker service
(3) stop docker socket service
(4) restart machine
(5) start the containers
... now it looks good at the moment.
To conclude:
(1) We have no clue what was causing the problem. This is bad.
(2) We have learned that the overlay network is not the problem, because the traffic is reaching the target machine where the container lives.
(3) We are able to trace the network traffic until it reaches the target machine. Somehow it is not "entering" the container: inside the container the network interface shows no activity at all.
We have no knowledge of the VXLAN virtual network that Docker uses for the overlay, so if anybody has a hint, could you help us with a link or a tool for it?
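In case it helps anyone following along, this is what we plan to look at next (a sketch; the "1-<network-id>" name is our assumption about how libnetwork names overlay namespaces, and <network-id> is a placeholder):
# Docker keeps the overlay bridge and the VXLAN device in a separate network namespace on each host
sudo ls /var/run/docker/netns
# enter that namespace and look at the VXLAN device and the bridge forwarding table
sudo nsenter --net=/var/run/docker/netns/1-<network-id> ip -d link show
sudo nsenter --net=/var/run/docker/netns/1-<network-id> bridge fdb show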
Many many thanks in advance.
Andre
======================================================
4 days later...
Just had the same situation again after updating docker-ce 18.06 to 18.09.
We have two machines using docker-ce 18 in combination with Ubuntu 18.04, and I had just updated docker-ce to 18.09 because of this problem (Docker containers could not resolve DNS on Ubuntu 18.04 ... the new resolved service).
I stopped everything, updated Docker, restarted the machine, and started everything again.
Problem: Same problem as described in this post. The ping was received by the target host operating system but not forwarded to the container.
Solution:
1. stop all containers and Docker
2. consul leave
3. clean up all entries in the consul key store on the other machines (they were not deleted by the leave; see the sketch after this list)
4. start consul
5. restart all engines
6. restart the nginx container ... gotcha, the network is working now.
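Roughly how the cleanup in step 3 can be done through the consul HTTP API (a sketch; the docker/network/v1.0/ key prefix is our assumption about where libnetwork stores its state, and 127.0.0.1:8500 assumes a local consul agent on the default port):
# list the network-related keys that are left behind
curl -s "http://127.0.0.1:8500/v1/kv/docker/network/v1.0/?keys"
# delete them recursively on each remaining server
curl -s -X DELETE "http://127.0.0.1:8500/v1/kv/docker/network/v1.0/?recurse"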

Once again the same problem hit us.
We have 7 servers (each running docker as described above), two nginx entry points.
It looks like errors in the consul key store are the real problem causing the Docker network to show the strange behaviour described above.
In our configuration, all 7 servers have their own local consul instance, which synchronises with the others. For the network setup, each Docker engine does a lookup in its local consul key store.
Last week we noticed that, at the same time as the network reachability problem, the consul clients also reported synchronisation problems (leader election problems, retries, etc.).
The final solution was to stop the Docker engines and the consul clients, delete the consul database on some servers, rejoin them to the others, and start the Docker engines again.
It looks like the consul service is a critical part of the network configuration...
In progress...

I faced the exact same issue with an overlay network in a Docker Swarm setup.
I've found that it's not an OS or Docker problem. The affected servers use Intel X-series NICs; other servers with I-series NICs are working fine.
Do you use on-premises servers, or a cloud provider?
We use OVH and it might be caused by some datacenter network misconfiguration.

Related

Dockerized Zabbix: Server Can't Connect to the Agents by IP

Problem:
I'm trying to configure a fully containerized Zabbix 6.0 monitoring system on Ubuntu 20.04 LTS using Zabbix's Docker-Compose repo found HERE.
The command I used to raise the Zabbix server and also a Zabbix Agent is:
docker-compose -f docker-compose_v3_ubuntu_pgsql_latest.yaml --profile all up -d
Although the Agent comes up in a broken state and shows a "red" status, when I change its IP address FROM 127.0.0.1 TO 172.16.239.6 (the default IP Docker-Compose assigns to it) the Zabbix Server can successfully connect and monitoring is established. HOWEVER: the Zabbix Server cannot connect to any other Dockerized Zabbix Agents on REMOTE hosts, which are raised with the docker run command:
docker run --add-host=zabbix-server:172.16.238.3 -p 10050:10050 -d --privileged --name DockerHost3-zabbix-agent -e ZBX_SERVER_HOST="zabbix-server" -e ZBX_PASSIVE_ALLOW="true" zabbix/zabbix-agent:ubuntu-6.0-latest
NOTE: I looked at other Stack Exchange sites to post this question, but Stack Overflow appeared to be the go-to place for these Docker/Zabbix issues, having over 30 such questions.
Troubleshooting:
Comparative Analysis:
Agent Configuration:
Comparative analysis of the working ("green") Agent on the same host as the Zabbix Server against the Agents on other hosts showing "red" statuses (not contactable by the Zabbix Server), using the following commands, shows the configurations have parity.
docker exec -u root -it (ID of agent container returned from "docker ps") bash
And then execute:
grep -Ev ^'(#|$)' /etc/zabbix/zabbix_agentd.conf
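A compact way to do that comparison in one step (a sketch; green-agent and red-agent are placeholders for the real container names from docker ps):
diff <(docker exec green-agent grep -Ev '^(#|$)' /etc/zabbix/zabbix_agentd.conf) \
     <(docker exec red-agent grep -Ev '^(#|$)' /etc/zabbix/zabbix_agentd.conf)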
Ports:
The correct ports showed as open on the "red" Agents, just as they were on the "green" Agent running on the same host as the Zabbix Server, per the output of the command:
ss -luntu
NOTE: This command was issued from the HOST, not the Docker container for the Agent.
Firewalling:
Review of the iptables rules from the HOST (not container) using the following command didn't reveal anything of concern:
iptables -nvx -L --line-numbers
But to exclude firewalling, I nonetheless allowed everything in the iptables FORWARD chain on both the Zabbix Server and an Agent in a "red" status used for testing.
I also allowed everything on the MikroTik GW router connecting the Zabbix Server to the different physical hosts running the Zabbix Agents.
Routing:
The Zabbix server can ping remote Agent interfaces proving there's a route to the Agents.
AppArmor:
I also stopped AppArmor to exclude it as being causal:
sudo systemctl stop apparmor
sudo systemctl status apparmor
Summary:
So everything is wide open, the Zabbix Server can route to the Agents, and the config of the "red" Agents has parity with the config of the "green" Agent living on the same host as the Zabbix Server itself.
I've set up non-containerized Zabbix installations in production environments successfully, so I'm otherwise familiar with Zabbix.
Why can't the containerized Zabbix Server connect to the containerized Zabbix Agents on different hosts?
Short Answer:
There was NOTHING wrong with the Zabbix config; this was a Docker-induced problem.
docker logs <hostname of Zabbix server> revealed that there appeared to be NAT'ing happening on the Zabbix SERVER, and indeed there was.
Docker was modifying the iptables NAT table on the host running the Zabbix Server container, causing the source address of the Zabbix Server to present as the IP of the physical host itself, not the Docker-Compose-assigned IP address of 172.16.238.3.
Thus, the Agent was not expecting this address and refused the connection. My experience of Dockerized apps is that they are mostly good at modifying iptables to create the correct connectivity, but not in this particular case ;-).
I then reviewed the NAT table by executing the following command on the HOST (not the container):
iptables -t nat -nvx -L --line-numbers
This revealed that Docker was being, erm, "helpful" and NAT'ing the Zabbix Server's traffic.
I deleted the offending rules by their rule number:
iptables -t nat -D <chain> <rule #>
After this, the Zabbix Server's IP address was presented correctly to the Agents, which accepted the connections, and their statuses turned "green".
The problem is reproducible if you execute:
docker-compose -f docker-compose_v3_ubuntu_pgsql_latest.yaml down
If you then run the up command to raise the containers again, you'll see the offending iptables rule restored to the NAT table of the host running the Zabbix Server's container, breaking connectivity with the Agents.
Longer Answer:
Below are the steps required to identify and resolve the problem of the Zabbix Server NAT'ing its traffic out of the host's IP:
Identify If the HOST of the Zabbix Server container is NAT'ing:
We need to see how the IP of the Zabbix Server's container is presenting to the Agents, so we have to get the container ID of a Zabbix AGENT to review its logs:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b2fcf38d601f zabbix/zabbix-agent:ubuntu-6.0-latest "/usr/bin/tini -- /u…" 5 hours ago Up 5 hours 0.0.0.0:10050->10050/tcp, :::10050->10050/tcp DockerHost3-zabbix-agent
Next, supply the Agent's container ID to the docker logs command:
docker logs b2fcf38d601f
Then review the rejected IP address in the log output to determine whether it is NOT the Zabbix Server's IP:
81:20220328:000320.589 failed to accept an incoming connection: connection from "NAT'ed IP" rejected, allowed hosts: "zabbix-server"
The fact that you can see this error proves that there are no routing or connectivity issues: the connection is going through, it's just being rejected by the application, NOT the firewall.
If NAT'ing is proven, continue to the next step.
On Zabbix SERVER's Host:
The remediation happens on the Zabbix Server's host itself, not the Agents, which is good because we can fix the problem in one place instead of many.
Execute the command below on the Host running the Zabbix Server's container:
iptables -t nat -nvx -L --line-numbers
Output of command:
Chain POSTROUTING (policy ACCEPT 88551 packets, 6025269 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 MASQUERADE all -- * !br-abeaa5aad213 192.168.24.128/28 0.0.0.0/0
2 73786 4427208 MASQUERADE all -- * !br-05094e8a67c0 172.16.238.0/24 0.0.0.0/0
Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 RETURN all -- br-abeaa5aad213 * 0.0.0.0/0 0.0.0.0/0
2 95 5700 RETURN all -- br-05094e8a67c0 * 0.0.0.0/0 0.0.0.0/0
We can see the counters are incrementing for the "POSTROUTING" and "DOCKER" chains (both rule #2 in their respective chains).
These rules are clearly matching and have effect.
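To confirm those counters really are moving while the Agents retry (a sketch using standard tools; the 5-second interval is arbitrary):
# zero the NAT counters for the POSTROUTING chain, then watch them climb on each retry
sudo iptables -t nat -Z POSTROUTING
sudo watch -n 5 'iptables -t nat -nvx -L POSTROUTING --line-numbers'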
Delete the offending rules on the HOST of the Zabbix Server container, which is NAT'ing its traffic to the Agents:
sudo iptables -t nat -D POSTROUTING 2
sudo iptables -t nat -D DOCKER 2
Wait a few moments and the Agents should go "green", assuming there are no other configuration or firewalling issues. If the Agents remain "red" after applying the fix, please work through the troubleshooting steps I documented in the Question section.
Conclusion:
I've tested this: restarting the Zabbix Server container does not recreate the deleted rules. But again, please note that a docker-compose down followed by a docker-compose up WILL recreate the deleted rules and break Agent connectivity.
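Since a docker-compose down/up cycle brings the MASQUERADE rule back, one option I have not battle-tested (a sketch to be adapted to your own subnets; 172.16.238.0/24 is the Compose network from above and 192.168.0.0/24 merely stands in for the Agents' LAN) is to insert an exception ahead of Docker's rule so that traffic towards the Agents is not masqueraded:
# in the nat table, ACCEPT stops rule traversal, so matching packets never reach the MASQUERADE rule
sudo iptables -t nat -I POSTROUTING 1 -s 172.16.238.0/24 -d 192.168.0.0/24 -j ACCEPT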
Hope this saves other folks some wasted cycles. I'm both a Linux and a network engineer and this hurt my head, so it would be near impossible to resolve if you're not a dab hand with networking.

Docker - after uninstalling keeps pinging host - host.docker.internal

I tested Docker Desktop on Windows 10 and then uninstalled it the standard way; it reported "Docker removed". But I can still see, in a TCP/UDP watcher (LiveTcpUdpWatch), requests to the IP address that was automatically added to the hosts file during Docker setup:
192.168.0.115 host.docker.internal
The TCP/UDP watcher shows this record:
Process ID Process Name Protocol Local Port Local Address Remote Port Remote Port Name Remote Address Received Bytes Sent Bytes Received Packets Sent Packets
4 TCP IPv4 2869 192.168.0.115 37591 192.168.0.115 12/04/2022 11:30:46.231 12/04/2022 11:30:46.227 1
So the process has PID 4; running tasklist /fi "pid eq 4" shows:
Image Name PID Session Name Session# Mem Usage
========================= ======== ================ =========== ============
System 4 Services 0 148 K
I cannot kill the System process.
So how do I remove this leftover artefact and stop the useless pinging?
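If the goal is just to clean up the leftover hosts entry, the lines Docker Desktop typically adds to C:\Windows\System32\drivers\etc\hosts can simply be deleted from that file (a guess at the full set; only the first one appears above):
192.168.0.115 host.docker.internal
192.168.0.115 gateway.docker.internal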

Wrong TCP source port in docker container

I'm running a Docker container with Asterisk inside. Asterisk is listening on TCP port 5061, but for connections I see a different port in tcpdump.
docker run --log-driver none --name=asterisk-docker -dt --net=host --restart=always asterisk-docker
netstat output:
tcp 0 0 0.0.0.0:5061 0.0.0.0:* LISTEN 58731/asterisk
tcp 0 0 srv-tk:46315 217.0.26.101:sip ESTABLISHED
tcpdump output:
12:24:18.230394 IP 217.0.26.101.sip > srv-tk.46315: Flags [.], ack 3051, win 1217, options [nop,nop,TS val 391595046 ecr 3262544919], length 0
12:24:18.292636 IP 217.0.26.101.sip > srv-tk.46315: Flags [P.], seq 3444:4092, ack 3051, win 1217, options [nop,nop,TS val 391595061 ecr 3262544919], length 648
Why is this not 5061 as source/destination?
This is because behind Docker networking there is a lot of magic done with iptables: packet forwarding, different network layers, etc.
To check all this magic, run this:
sudo iptables -S
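To narrow that output down to the port in question (a sketch; conntrack comes from the conntrack-tools package and may need to be installed first):
# NAT rules that mention the Asterisk port
sudo iptables -t nat -S | grep 5061
# live connection-tracking entries, which show the original vs. translated ports
sudo conntrack -L -p tcp | grep 5061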
To see a high-level layout of your containers, execute the following command:
docker ps
If you want to understand high-level Docker networking, I recommend reading this documentation: https://docs.docker.com/network/
You can find more about the system-level networking behind Docker networks here:
https://argus-sec.com/docker-networking-behind-the-scenes/

Having problems sending data to statsd / graphite database on docker from outside container

I'm having issues getting data sent into a statsd container. I can successfully send data from the command line inside the container itself, but I need to be able to send statistics to it from the host machine or from another Docker container.
I'm using Kitematic, and I can see that the 'bridge' network option is checked on both containers. Using a bridge network was a suggestion I found for this issue.
I also tried passing '-P' to docker run when creating the container, as that is supposed to publish the exposed ports. I didn't notice any difference in behaviour when sending data from the other container.
Example command to create fake statistics using port 8125 on localhost (taken from the Docker image page https://hub.docker.com/r/graphiteapp/graphite-statsd):
Let's fake some stats with a random counter to prove things are working.
while true; do echo -n "example:$((RANDOM % 100))|c" | nc -w 1 -u 127.0.0.1 8125; done
The container is created using the following command:
docker run -d --name graphite --restart=always -p 80:80 -p 2003-2004:2003-2004 -p 2023-2024:2023-2024 -p 8125:8125/udp -p 8126:8126 graphiteapp/graphite-statsd
I've tried making sure both are on the same 'bridge' network. I'm running Docker Desktop on Windows 10 Enterprise. I've found several commands dealing with iptables and networking on Linux, but I feel like I'm missing something. I should also mention that statsd uses a UDP connection on port 8125 by default.
If I run the example command from another container on the bridge network, I don't get any result. I know the data (from the other container) is not getting through because I can't see it in the metrics received on the statsd dashboard.
I can ping localhost:8125 and get a response from within another container. From the outside (a PowerShell window on the host machine) it won't resolve.
PING localhost:8125 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.024 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.052 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.031 ms
^C
--- localhost:8125 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.024/0.035/0.052 ms
If I run docker container ls, both containers show up as running (the screenshot of the output is not reproduced here).
I found that I needed to get the specific IP address of the target container, which can be found by running docker inspect (name of network), in this case bridge.
Then I needed to specify the container's IP address instead of localhost. I replaced the suggested 127.0.0.1 address with that IP address and it worked.
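For reference, roughly what the working sequence looks like from the other container (a sketch; 172.17.0.2 is only an example address, use whatever address docker inspect reports for the graphite container):
# find the graphite container's address on the bridge network
docker network inspect bridge
# then send the test counters to that address instead of 127.0.0.1
while true; do echo -n "example:$((RANDOM % 100))|c" | nc -w 1 -u 172.17.0.2 8125; done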

Capture pid of process using port 6881 only once every 15 min

I can see from a tcpdump that an internal Linux server is trying to contact an outside computer approximately every 15 minutes: one UDP packet to port 6881 (BitTorrent), that's all.
As this server isn't supposed to contact anyone, I want to find out which evil soul generated this packet, i.e. I need some information about the process (e.g. PID, file, ...).
Because the timespan is so short, I can't use netstat or lsof.
The process is likely active for only about half a microsecond; then it gets a destination unreachable (port unreachable) from the firewall.
I have ssh access to the machine.
The question "How can I capture network packets per PID?" suggests using the tcpdump option -k; however, Linux tcpdump has no such option.
You can't do this with tcpdump, obviously, but you can do it from the host itself. Especially since it's UDP with no state, and since you can't predict when the process will be active, you should look into using the kernel audit capabilities. For example:
auditctl -a exit,always -F arch=b64 -F a0=2 -F a1\&=2 -S socket -k SOCKET
This instructs the kernel to generate an audit event whenever there is a socket call. With this done, you can then wait until you see the suspicious packet leave the machine and then use ausearch to track down not only the process, but the binary that made the call.
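A sketch of the follow-up once the suspicious packet has been seen again (the SOCKET key matches the rule above; the "recent" time window is just an example):
# show recent socket() calls recorded under that key, with the fields decoded
sudo ausearch -i -k SOCKET -ts recent
# remove the audit rule again when you are done
sudo auditctl -d exit,always -F arch=b64 -F a0=2 -F a1\&=2 -S socket -k SOCKET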
