Docker swarm overlay network with vxlan routing over openvpn - docker

I have set up a docker swarm with 3 nodes (docker 18.03). These nodes use an overlay network to communicate.
node1:
laptop
host tun0 172.16.0.6 --> openvpn --> nat-gateway
container n1
ip = 192.169.1.10
node2:
aws ec2
host eth2 10.0.30.62
container n2
ip = 192.169.1.9
node3:
aws ec2
host eth2 10.0.140.122
container n3
ip = 192.169.1.12
nat-gateway:
aws ec2
tun0 172.16.0.1 --> openvpn --> laptop
eth0 10.0.30.198
The scheme is partly working:
1. Containers can ping each other by name (n1, n2, n3)
2. Docker swarm commands are working, services can be deployed
The overlay is only partly working, though: some nodes cannot reach each other over either tcp/ip or udp. I tried all combinations of the 3 nodes with udp and tcp/ip.
I did a tcpdump on the nat gateway to monitor overlay vxlan network activity (port 4789):
tcpdump -l -n -i eth0 "port 4789"
tcpdump -l -n -i tun0 "port 4789"
Then I tried tcp/ip communication from node1 to node3. On node3:
nc -l -s 0.0.0.0 -p 8999
On node1:
telnet 192.169.1.12 8999
Node1 will then try to connect to node3. I see packets coming in on the nat-gateway over the tun0 interface, and corresponding traffic on the nat-gateway eth0 interface, but it seems that the nat-gateway is not sending the replies back over the tun0 interface.
The iptables configuration of the nat-gateway:
The routing table of the nat-gateway:
Can you help me solve this issue?

I have been able to fix the issue by adjusting the iptables rules and the routing configuration on the NAT gateway.
No masquerading of 172.16.0.0/22 is needed. All the workers and managers will route their traffic for 172.16.0.0/22 via the NAT gateway, and it knows how to send the packets over tun0.
Masquerading of eth0 was just wrong...
All the containers can now ping and establish tcp/ip connections to each other.
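The exact rules and routes are not reproduced here, but based on the description the working setup boils down to plain forwarding between tun0 and eth0 on the gateway, plus a route for the VPN range on the AWS nodes. A rough sketch of what that could look like (interface names and addresses are taken from the topology above; treat this as an illustration, not the exact configuration used):
# On the NAT gateway: forward between tun0 and eth0, with no masquerading of 172.16.0.0/22
sysctl -w net.ipv4.ip_forward=1
iptables -A FORWARD -i tun0 -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o tun0 -j ACCEPT
# On a worker in the same subnet as the gateway (e.g. node2): route the VPN range via the gateway
# (for node3, which sits in a different subnet, this would normally go into the VPC route table instead)
ip route add 172.16.0.0/22 via 10.0.30.198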

Related

Docker swarm worker behind NAT

I want to have a worker node on a server I have that is behind a NAT (i.e. it can't expose ports publicly). I thought this wasn't a problem, but it turns out to be one:
On this server behind the NAT I run:
docker swarm join --token SWMTKN-1... X.X.X.X:2377
Which in turn adds the server to the swarm. I am not sure where the "internal" IP address comes from, but in traefik I then have a new server, http://10.0.1.126:8080 (10.0.1.126 is definitely not the public IP). If I exec inside the traefik container:
docker exec -it 80f9cb33e24c sh
I can ping every server/node/worker in the list on traefik apart from the new one. Why?
When joining the swarm like this on the worker behind the VPN:
docker swarm join --advertise-addr=tun0 --token SWMTKN-1-... X.X.X.X:2377
I can see a new peer on my network from the manager:
$ docker network inspect traefik
...
"Peers": [
...
{
"Name": "c2f01f1f1452",
"IP": "12.0.0.2"
}
]
where 12.0.0.2 and tun0 are the VPN interface from the manager to the server behind the NAT. Unfortunately, when I then run:
$ nmap -p 2377,2376,4789,7946 12.0.0.2
Starting Nmap 7.70 ( https://nmap.org ) at 2020-05-04 11:01 EDT
Nmap scan report for 12.0.0.2
Host is up (0.017s latency).
PORT STATE SERVICE
2376/tcp closed docker
2377/tcp closed swarm
4789/tcp closed vxlan
7946/tcp open unknown
I can see that the ports are closed for the docker worker, which is weird. Why?
Also if I use nmap -p 8080 10.0.1.0/24 inside the traefik container on the manager I get:
Nmap scan report for app.6ysph32io2l9q74g6g263wed3.mbnlnxusxv2wz0pa2njpqg2u1.traefik (10.0.1.62)
Host is up (0.00033s latency).
PORT STATE SERVICE
8080/tcp open http-proxy
on a successful swarm worker, which has the internal network IP 10.0.1.62,
but I get:
Nmap scan report for app.y7odtja923ix60fg7madydia3.jcfbe2ke7lzllbvb13dojmxzq.traefik (10.0.1.126)
Host is up (0.00065s latency).
PORT STATE SERVICE
8080/tcp filtered http-proxy
on the new swarm node. Why is it filtered? What am I doing wrong?
I'm adding this here as it's a bit longer.
I don't think it's enough for only the manager and the remote node to be able to communicate; nodes need to be able to communicate between themselves.
Try to configure the manager (which is connected to the VPN) to route packets to and from the remote worker through the VPN, and add the needed routes on all nodes (including the remote one).
Something like:
# Manager
sysctl -w net.ipv4.ip_forward=1 # if you use systemd you might need extra steps
# Remote node
ip route add LOCAL_NODES_SUBNET via MANAGER_TUN_IP dev tun0
# Local nodes
ip route add REMOTE_NODE_TUN_IP/32 via MANAGER_IP dev eth0
If the above works correctly, you need to make the routing changes permanent.
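One way to make them permanent, assuming a systemd-based distribution (the unit name and the placeholders below are only illustrative), is a small oneshot service that re-applies the route at boot:
# /etc/systemd/system/swarm-vpn-route.service (on the local nodes; the remote node
# would get the equivalent unit with its LOCAL_NODES_SUBNET route over tun0)
[Unit]
Description=Static route to the swarm worker behind the VPN
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/ip route replace REMOTE_NODE_TUN_IP/32 via MANAGER_IP dev eth0

[Install]
WantedBy=multi-user.target
Enable it with systemctl enable --now swarm-vpn-route.service.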
To find the IP addresses for all your nodes run this command on the manager:
for NODE in $(docker node ls --format '{{.Hostname}}'); do echo -e "${NODE} - $(docker node inspect --format '{{.Status.Addr}}' "${NODE}")"; done

Docker network macvlan driver: gateway unreachable

I have a macvlan network created with the following command:
docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.2 -o parent=wlp2s0 pub_ne
Where wlp2s0 is the name of the wireless interface of my laptop.
The LAN gateway is 192.168.1.1 and the subnet is 192.168.1.0/24.
Then I have created and attached a container to this network:
docker run --rm -itd --network pub_ne --name myAlpine alpine:latest sh
In addition, I have created a virtual machine (VirtualBox provider) with a bridged network interface.
If I use the ping command:
- docker container -> ubuntu vm (IP of the vm: 192.168.1.200): ping works
But if I use the ping command:
- docker container -> gateway 192.168.1.1
or
- docker container -> external world (google.com): ping does not work
Any suggestions?
edit 1:
On the docker host, if I run tcpdump (tcpdump -i wlp2s0 icmp) I see:
14:53:30.015822 IP 192.168.1.56 > 216.58.205.142: ICMP echo request, id 5376, seq 29, length 64
14:53:31.016143 IP 192.168.1.56 > 216.58.205.142: ICMP echo request, id 5376, seq 30, length 64
14:53:32.016426 IP 192.168.1.56 > 216.58.205.142: ICMP echo request, id 5376, seq 31, length 64
14:53:33.016722 IP 192.168.1.56 > 216.58.205.142: ICMP echo request, id 5376, seq 32, length 64
Where 192.168.1.56 is my docker container and 216.58.205.142 should be Google's IP address. No echo reply is received.
Macvlan is unlikely to work with IEEE 802.11.
Your wifi access point, and/or your host network stack, are not going to be thrilled.
You might want to try ipvlan instead: add -o ipvlan_mode=l2 to your network creation call and see if that helps.
That might very well still not work... (e.g. if you rely on DHCP and your DHCP server uses MAC addresses rather than client IDs)
And your only (reasonable) solution might be to drop the wifi entirely and wire the device up instead... (or move away from macvlan and use host / bridge - whichever is the most convenient)
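If you want to try that, the ipvlan equivalent of the network from the question could be created roughly like this (note that the driver itself changes to ipvlan, not just the mode option, and the gateway should presumably be the real LAN gateway 192.168.1.1; the network name here is arbitrary):
docker network create -d ipvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=wlp2s0 -o ipvlan_mode=l2 pub_ne_ipvlan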

Docker container hits iptables to proxy

I have two VPSs: the first machine (proxy from now on) is for the proxy and the second machine (dock from now on) is the docker host. I want to redirect all traffic generated inside a docker container over the proxy, so as not to expose the dock machine's public IP.
As the connection between the VPSs goes over the internet (there is no local connection), I created a tunnel between them with ip tunnel as follows:
On proxy:
ip tunnel add tun10 mode ipip remote x.x.x.x local y.y.y.y dev eth0
ip addr add 192.168.10.1/24 peer 192.168.10.2 dev tun10
ip link set dev tun10 mtu 1492
ip link set dev tun10 up
On dock:
ip tunnel add tun10 mode ipip remote y.y.y.y local x.x.x.x dev eth0
ip addr add 192.168.10.2/24 peer 192.168.10.1 dev tun10
ip link set dev tun10 mtu 1492
ip link set dev tun10 up
PS: I do not know whether ip tunnel can be used in production; that is another question. I am planning to use Libreswan or OpenVPN as the tunnel between the VPSs anyway.
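A quick way to confirm that the tunnel itself carries traffic in both directions, before looking at the NAT rules, would be:
# On dock: the proxy end of the tunnel should answer
ping -c 3 192.168.10.1
# On proxy: the dock end of the tunnel should answer
ping -c 3 192.168.10.2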
Afterwards, I added SNAT rules to iptables on both VPSs and some routing rules as follows:
On proxy:
iptables -t nat -A POSTROUTING -s 192.168.10.2/32 -j SNAT --to-source y.y.y.y
On dock:
iptables -t nat -A POSTROUTING -s 172.27.10.0/24 -j SNAT --to-source 192.168.10.2
ip route add default via 192.168.10.1 dev tun10 table rt2
ip rule add from 192.168.10.2 table rt2
And last but not least, I created a docker network with one test container attached to it as follows:
docker network create --attachable --opt com.docker.network.bridge.name=br-test --opt com.docker.network.bridge.enable_ip_masquerade=false --subnet=172.27.10.0/24 testnet
docker run -it --network testnet alpine:latest /bin/sh
Unfortunately, all of this ended with no success. So the question is: how do I debug this? Is this the correct approach? How would you do the redirection over the proxy?
Some words about the theory: traffic coming from the 172.27.10.0/24 subnet hits the iptables SNAT rule on dock and its source IP changes to 192.168.10.2. The routing rule then sends it over the tun10 device, i.e. the tunnel. On proxy it hits another iptables SNAT rule that changes the source IP to y.y.y.y, and it finally goes to the destination.
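Since the question is explicitly about how to debug this, a rough checklist following that packet path could look like the following (a sketch based on the setup above, not a verified fix):
# On dock: is the policy routing in place? (a named table such as rt2 must also
# exist in /etc/iproute2/rt_tables)
ip rule show
ip route show table rt2
# On dock: does the container traffic leave over the tunnel with source 192.168.10.2?
tcpdump -n -i tun10
# On proxy: do the packets arrive, and is forwarding enabled so they can be
# SNATed out via eth0?
sysctl net.ipv4.ip_forward
tcpdump -n -i tun10
tcpdump -n -i eth0
# On both machines: are the SNAT and FORWARD rules actually matched (packet counters increase)?
iptables -t nat -L POSTROUTING -v -n
iptables -L FORWARD -v -n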

Docker container with macvlan can't be pinged by other host

I know I can't ping the macvlan interface from the same host, but I can't ping my container's macvlan interface from hosts on a different subnet (even though they're connected via a router).
Host IP: 10.8.2.132/22
Macvlan container IP: 10.8.2.250/22
Other host IP: 10.4.16.141/22
Ping FROM 10.8.2.132 TO 10.4.16.141 is successful
Ping FROM 10.8.2.250 TO 10.4.16.141 is successful
Ping FROM 10.4.16.141 TO 10.8.2.132 is successful
Ping FROM 10.4.16.141 TO 10.8.2.250 fails with 100% packet loss
ip route get 10.8.2.250 shows that there is a known route:
10.8.2.250 via 10.4.16.1 dev eth0 src 10.4.16.141
cache mtu 1500 hoplimit 64
How can I go about debugging this?
The docker macvlan network was created with:
docker network create -d macvlan --subnet=10.8.0.0/22 --gateway=10.8.0.1 -o parent=em1 macnet
and when I run the container I specifically add "--ip=10.8.2.250"
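For completeness, the full run command implied here would look something like this (the container name and image are placeholders):
docker run --rm -itd --network macnet --ip=10.8.2.250 --name mactest alpine:latest sh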

Can't ping docker IPv6 container

I ran the docker daemon so that containers get global IPv6 addresses:
docker daemon --ipv6 --fixed-cidr-v6="xxxx:xxxx:xxxx:xxxx::/64"
After it I ran docker container:
docker run -d --name my-container some-image
It successfully got a global IPv6 address (I checked with docker inspect my-container). But I can't ping my container at this IP:
Destination unreachable: Address unreachable
But I can successfully ping the docker0 bridge by its IPv6 address.
The output of route -n -6 contains the following lines:
Destination Next Hop Flag Met Ref Use If
xxxx:xxxx:xxxx:xxxx::/64 :: U 256 0 0 docker0
xxxx:xxxx:xxxx:xxxx::/64 :: U 1024 0 0 docker0
fe80::/64 :: U 256 0 0 docker0
The docker0 interface has a global IPv6 address:
inet6 addr: xxxx:xxxx:xxxx:xxxx::1/64 Scope:Global
xxxx:xxxx:xxxx:xxxx:: is the same everywhere, and it is the global IPv6 address of my eth0 interface.
Does docker require some additional configuration to access my containers via IPv6?
Assuming IPv6 in your guest OS is properly configured, you are probably pinging the container not from the host OS but from outside, and NDP (neighbor discovery) proxying is not configured, so other hosts do not know that your container sits behind your host. I do the following after starting a container with IPv6 (on the host OS, in the ExecStartPost clauses of a systemd .service file):
/usr/sbin/sysctl net.ipv6.conf.interface_name.proxy_ndp=1
/usr/bin/ip -6 neigh add proxy $(docker inspect --format '{{.NetworkSettings.GlobalIPv6Address}}' container_name) dev interface_name
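For example, with eth0 as the external interface and a container named my-container (both are placeholders), that expands to:
sysctl net.ipv6.conf.eth0.proxy_ndp=1
ip -6 neigh add proxy $(docker inspect --format '{{.NetworkSettings.GlobalIPv6Address}}' my-container) dev eth0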
Beware of IPv6: docker developers say in replies to bug reports that they do not have enough time to make IPv6 production-ready in version 1.10, and they say nothing about 1.11.
Maybe you are using the wrong ping command. For IPv6 it is ping6:
$ ping6 2607:f0d0:1002:51::4