Docker container hits iptables to proxy - docker

I have two VPSs: the first machine (proxy from now on) is the proxy, and the second machine (dock from now on) is the docker host. I want to redirect all traffic generated inside a docker container over the proxy, so as not to expose the dock machine's public IP.
Since the connection between the VPSs goes over the internet (there is no local connection), I created a tunnel between them with ip tunnel as follows:
On proxy:
ip tunnel add tun10 mode ipip remote x.x.x.x local y.y.y.y dev eth0
ip addr add 192.168.10.1/24 peer 192.168.10.2 dev tun10
ip link set dev tun10 mtu 1492
ip link set dev tun10 up
On dock:
ip tunnel add tun10 mode ipip remote y.y.y.y local x.x.x.x dev eth0
ip addr add 192.168.10.2/24 peer 192.168.10.1 dev tun10
ip link set dev tun10 mtu 1492
ip link set dev tun10 up
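As a quick sanity check at this point (assuming the commands above succeeded), it helps to confirm the tunnel itself works before adding any NAT or routing rules:
On proxy:
ping -c 3 192.168.10.2
On dock:
ping -c 3 192.168.10.1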
PS: I do not know whether ip tunnel is suitable for production use; that is another question. In any case I am planning to use libreswan or openvpn for the tunnel between the VPSs.
Afterwards, I added SNAT rules to iptables on both VPSs and some routing rules as follows:
On proxy:
iptables -t nat -A POSTROUTING -s 192.168.10.2/32 -j SNAT --to-source y.y.y.y
On dock:
iptables -t nat -A POSTROUTING -s 172.27.10.0/24 -j SNAT --to-source 192.168.10.2
ip route add default via 192.168.10.1 dev tun10 table rt2
ip rule add from 192.168.10.2 table rt2
And last but not least, I created a docker network with one test container attached to it as follows:
docker network create --attachable --opt com.docker.network.bridge.name=br-test --opt com.docker.network.bridge.enable_ip_masquerade=false --subnet=172.27.10.0/24 testnet
docker run -it --network testnet alpine:latest /bin/sh
Unfortunately, all of this ended with no success. So the question is: how do I debug this? Is this the correct approach? How would you do the redirection over the proxy?
Some words about the theory: traffic coming from the 172.27.10.0/24 subnet hits the iptables SNAT rule on dock, and the source IP changes to 192.168.10.2. The routing rule then sends it over the tun10 device, which is the tunnel. On proxy it hits another iptables SNAT rule that changes the source IP to y.y.y.y, and finally it goes out to the destination.
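One way to debug a chain like this is to confirm forwarding is enabled and then follow a test request hop by hop with tcpdump; a minimal sketch (interface names taken from the setup above; the rp_filter relaxation is an assumption that may or may not be needed):
# forwarding must be enabled on both machines, otherwise forwarded packets are dropped
sysctl -w net.ipv4.ip_forward=1
# strict reverse-path filtering can silently drop tunnelled/asymmetric traffic
sysctl -w net.ipv4.conf.all.rp_filter=0
sysctl -w net.ipv4.conf.tun10.rp_filter=0
# then watch each hop while the container makes a request (the destination is a placeholder)
tcpdump -ni br-test                 # on dock: container traffic reaching the bridge
tcpdump -ni tun10                   # on dock and on proxy: traffic entering/leaving the tunnel
tcpdump -ni eth0 host <destination> # on proxy: SNATed traffic leaving via y.y.y.y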

Related

Putting a QEMU guest onto a network created by docker compose

I am trying to emulate a hardware configuration between two Intel NUC servers and a Raspberry Pi server using docker containers and docker compose. Since the latter is ARM, and the test host is x86, I decided to run the RPi image within QEMU encapsulated in one of the docker containers. The docker compose file is pretty simple:
services:
  rpi:
    build: ./rpi
    networks:
      iot:
        ipv4_address: 192.168.1.3
      ether:
        ipv4_address: 192.168.2.3
  nuc1:
    build: ./nuc
    networks:
      - "iot"
  nuc2:
    build: ./nuc
    networks:
      - "ether"
networks:
  iot:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.168.1.0/24
          gateway: 192.168.1.1
  ether:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.168.2.0/24
          gateway: 192.168.2.1
You'll notice that there are two separate networks: I have two VLANs I am trying to emulate, and the RPi server is a part of both, hence the docker container is a part of both. I have QEMU running fine on the rpi container, and the docker containers themselves are behaving on the network as intended.
The problem I am having is trying to set up the networking so that the QEMU image appears to be just another node on the network at the addresses 192.168.1.3/192.168.2.3. My requirements are that:
The QEMU guest knows that its own IP on each network is 192.168.1.3 and 192.168.2.3 respectively.
The other NUC docker containers can reach the QEMU image at those IP addresses (i.e. I don't want to give the docker container running the QEMU image its own IP address, have the other containers hit that IP address, and then NAT the address to the QEMU address).
I tried the steps listed in this gist with no luck. Additionally, I tried creating a TAP with an address of 10.0.0.1 on the QEMU docker container, bound the QEMU guest to that TAP, and then created an iptables rule to NAT traffic to the TAP. However, the issue is that the destination address then becomes 10.0.0.1 while the QEMU guest thinks its own address is 192.168.1.3, so it won't accept the packet.
As you can see, I am a bit stuck conceptually, and need some help with a direction to take this.
Right now, this is the network configuration that I set up to handle traffic on the QEMU container (excuse the lack of consistent ip usage, iproute2 is not installed on the image I am using and I can't seem to get the containers to hit the internet):
brctl addbr br0
ip addr flush dev eth0
ip addr flush dev eth1
brctl addif br0 eth0
brctl addif br0 eth1
tunctl -t tap0 -u $(whoami)
brctl addif br0 tap0
ifconfig br0 up
ifconfig tap0 up
ip addr add 192.168.1.3 dev br0
ip addr add 192.168.2.3 dev br0
ip route add 192.168.1.1/24 dev br0
ip route add 192.168.2.1/24 dev br0
ip addr add 10.0.0.1 dev tap0
Then I set up the following forwarding rules:
iptables -t nat -A PREROUTING -i br0 -d 192.168.1.1 -j DNAT --to 10.0.0.1
iptables -t nat -A POSTROUTING -o br0 -s 10.0.0.1 -j SNAT --to 192.168.1.1
iptables -t nat -A PREROUTING -i br0 -d 192.168.2.1 -j DNAT --to 10.0.0.1
iptables -t nat -A POSTROUTING -o br0 -s 10.0.0.1 -j SNAT --to 192.168.2.1
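For what it's worth, with tap0 enslaved to br0 the usual bridged approach is to hand the tap straight to QEMU and configure 192.168.1.3/192.168.2.3 inside the guest, with no address on tap0 and no DNAT/SNAT on the host. This is a rough sketch only; the QEMU binary, machine type and image path are assumptions, not taken from the setup above:
qemu-system-aarch64 \
  -machine virt -m 1024 \
  -drive file=rpi.img,format=raw,if=virtio \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=net0
# inside the guest: ip addr add 192.168.1.3/24 dev eth0 (and a second NIC/tap for 192.168.2.3)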

How to grant internet access to application servers through load balancer

I have set up an environment in Jelastic including a load balancer (tested both Apache and Nginx with the same results) with a public IP, and an application server running the Univention UCS DC Master docker image (I have also tried a simple Ubuntu 20.04 install).
The application server has a private IP address and is correctly reachable from the internet, and I can SSH into both the load balancer and the app server.
The one thing I can't seem to achieve is to have the app server access the internet (outbound traffic).
I have tried setting up the network on the app server and tried a few Nginx load-balancing configurations, but to be honest I've never used a load balancer before, and I suspect that configuring load balancing will not resolve my issue (I might be wrong).
Of course my intention is to learn load balancing but if someone could just point me in the right direction I would be so grateful.
Question: what needs to be configured in Jelastic or in the servers to have the machines behind the load balancer access the internet?
Thank you for your time.
Cristiano
I was able to resolve the issue by simply detaching and re-attaching the public IP address to the server, so it was not a setup problem; something in Jelastic had just got stuck.
Thanks all!
Edit: Actually, to effectively resolve the issue I have to detach the public IP address from the univention/ucs docker image, attach it to another node in the environment (i.e. an Ubuntu server I have), then attach the public IP back to the univention docker image. I can't really figure out why, but it works for me.
To have the machines access the internet you should add a route on them using your load balancer as the gateway, like this:
Destination    Gateway    Genmask
0.0.0.0        <LB IP>    0.0.0.0
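The equivalent iproute2 command would be along these lines (the load balancer's private address is a placeholder):
sudo ip route replace default via <LB-private-IP> dev eth0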
Your VMs' firewalls should not block ports 80 and 443 for in/out traffic; using iptables:
sudo iptables -A INPUT -p tcp -m multiport --dports 80,443 -m conntrack --ctstate NEW,ESTABLISHED -j ACCEPT
sudo iptables -A OUTPUT -p tcp -m multiport --dports 80,443 -m conntrack --ctstate NEW,ESTABLISHED -j ACCEPT
On your load balancer you should masquerade outgoing traffic (change the source IP) and forward incoming traffic to your VMs' subnet using the LB interface connected to that subnet:
sudo iptables -t nat -A POSTROUTING --out-interface eth0 -j MASQUERADE
sudo iptables -A FORWARD -p tcp --dport 80 -i eth0 -o eth1 -j ACCEPT
sudo iptables -A FORWARD -p tcp --dport 443 -i eth0 -o eth1 -j ACCEPT
You should also enable IP forwarding on your load balancer:
echo 1 > /proc/sys/net/ipv4/ip_forward
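To make that setting persistent across reboots, the standard sysctl approach should also work here (not Jelastic-specific):
echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-ip-forward.conf
sudo sysctl --system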

Set IP for all incoming and outgoing traffic

I have a host with 5 IPs assigned to it.
I can access the host by any of these IPs.
Any connection made from that host, including from Docker containers, is detected as coming from IP1.
I have a docker container on that host that I want to use IP2. How can I set up that container so that when any connection is made from it to external servers, they see IP2 instead of IP1?
Thanks!
To achieve this you need to edit the routes on your machine. You can start by running this command to find out the current routes:
$ ip route show
default via 10.1.73.254 dev eth0 proto static metric 100
10.1.73.0/24 dev eth0 proto kernel scope link src 10.1.73.17 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
Then you need to change the default route like this:
ip route replace default via ${YOUR_IP} dev eth0 proto static metric 100
What worked for me in the end was to create a docker network and add an iptables POSTROUTING rule for that local range:
docker network create bridge-smtp --subnet=192.168.1.0/24 --gateway=192.168.1.1
iptables -t nat -I POSTROUTING -s 192.168.1.0/24 -j SNAT --to-source MYIP_ADD_HERE
docker run --rm --network bridge-smtp byrnedo/alpine-curl http://www.myip.ch
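Note that -I (insert) is used so that the SNAT rule lands before Docker's own MASQUERADE rule for that subnet in the nat POSTROUTING chain; the rule order can be checked with:
iptables -t nat -L POSTROUTING -n --line-numbers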

TPROXY compatibility with Docker

I'm trying to understand how TPROXY works in an effort to build a transparent proxy for Docker containers.
After lots of research I managed to create a network namespace, inject a veth interface into it and add TPROXY rules. The following script worked on a clean Ubuntu 18.04.3 install:
ip netns add ns0
ip link add br1 type bridge
ip link add veth0 type veth peer name veth1
ip link set veth0 master br1
ip link set veth1 netns ns0
ip addr add 192.168.3.1/24 dev br1
ip link set br1 up
ip link set veth0 up
ip netns exec ns0 ip addr add 192.168.3.2/24 dev veth1
ip netns exec ns0 ip link set veth1 up
ip netns exec ns0 ip route add default via 192.168.3.1
iptables -t mangle -A PREROUTING -i br1 -p tcp -j TPROXY --on-ip 127.0.0.1 --on-port 1234 --tproxy-mark 0x1/0x1
ip rule add fwmark 0x1 tab 30
ip route add local default dev lo tab 30
After that I launched a toy Python server from a Cloudflare blog post:
import socket
IP_TRANSPARENT = 19
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
s.bind(('127.0.0.1', 1234))
s.listen(32)
print("[+] Bound to tcp://127.0.0.1:1234")
while True:
    c, (r_ip, r_port) = s.accept()
    l_ip, l_port = c.getsockname()
    print("[ ] Connection from tcp://%s:%d to tcp://%s:%d" % (r_ip, r_port, l_ip, l_port))
    c.send(b"hello world\n")
    c.close()
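Note that setting IP_TRANSPARENT on a socket requires elevated privileges (CAP_NET_ADMIN), so the server has to be started as root, for example (the file name is just an example):
sudo python3 tproxy_server.py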
And finally by running ip netns exec ns0 curl 1.2.4.8 I was able to observe a connection from 192.168.3.2 to 1.2.4.8 and receive the "hello world" message.
The problem is that it seems to have compatibility issues with Docker. Everything worked well in a clean environment, but once I started Docker, things started to go wrong. It seemed like the TPROXY rule was no longer working. Running ip netns exec ns0 curl 192.168.3.1 gave "Connection reset" and running ip netns exec ns0 curl 1.2.4.8 timed out (both should have produced the "hello world" message). I tried restoring all iptables rules, deleting the ip routes and rules generated by Docker, and shutting down Docker, but none of that worked, even though I hadn't configured any networks or containers.
What is happening behind the scenes and how can I get TPROXY working normally?
I traced all processes created by Docker using strace -f dockerd, and looked for lines containing exec. Most commands are iptables commands, which I have already excluded, and the lines with modprobe looked interesting. I loaded these modules one by one and figured out that the module causing the trouble is br_netfilter.
The module enables filtering of bridged packets through iptables, ip6tables and arptables. The iptables part can be disabled by executing echo "0" | sudo tee /proc/sys/net/bridge/bridge-nf-call-iptables. After executing the command, the script worked again without impacting Docker containers.
I am still confused though. I haven't understood the consequences of such a setting. I enabled packet tracing, but it seems that the packets matched the exact same set of rules before and after enabling bridge-nf-call-iptables, but in the former case the first TCP SYN packet got delivered to the Python server, in the latter case the packet got dropped for unknown reasons.
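If you rely on this workaround, the setting can be made persistent across reboots with a sysctl drop-in (a common approach; br_netfilter must be loaded for the key to exist):
echo "net.bridge.bridge-nf-call-iptables = 0" | sudo tee /etc/sysctl.d/99-bridge-nf.conf
sudo sysctl --system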
Try running docker with -p 1234
"By default, when you create a container, it does not publish any of its ports to the outside world. To make a port available to services outside of Docker, or to Docker containers which are not connected to the container’s network, use the --publish or -p flag."
https://docs.docker.com/config/containers/container-networking/

Docker swarm overlay network with vxlan routing over openvpn

I have set up a docker swarm with 3 nodes (docker 18.03). These nodes use an overlay network to communicate.
node1 (laptop):
  host tun0 172.16.0.6 --> openvpn --> nat-gateway
  container n1: ip = 192.169.1.10
node2 (aws ec2):
  host eth2 10.0.30.62
  container n2: ip = 192.169.1.9
node3 (aws ec2):
  host eth2 10.0.140.122
  container n3: ip = 192.169.1.12
nat-gateway (aws ec2):
  tun0 172.16.0.1 --> openvpn --> laptop
  eth0 10.0.30.198
The scheme is partly working:
1. Containers can ping each other by name (n1, n2, n3)
2. Docker swarm commands are working; services can be deployed
The overlay is only partly working, though: some nodes cannot communicate with each other over either TCP/IP or UDP. I tried all combinations of the 3 nodes with UDP and TCP/IP.
I did a tcpdump on the nat gateway to monitor overlay vxlan network activity (port 4789):
tcpdump -l -n -i eth0 "port 4789"
tcpdump -l -n -i tun0 "port 4789"
Then I tried TCP/IP communication from node1 to node3. On node3:
nc -l -s 0.0.0.0 -p 8999
On node1:
telnet 192.169.1.12 8999
Node1 then tries to connect to node3. I see the packets coming in on the nat-gateway over the tun0 interface and going out on the eth0 interface, but it seems that the nat-gateway is not sending the replies back over the tun0 interface.
The iptables configuration and the routing table of the nat-gateway are not reproduced here.
Can you help me solve this issue?
I have been able to fix the issue by adjusting the iptables rules and routing configuration on the NAT gateway.
No masquerading of 172.16.0.0/22 is needed. All the workers and managers will route their traffic for 172.16.0.0/22 via the NAT gateway, and it knows how to send the packets over tun0.
Masquerading of eth0 was just wrong...
All the containers can now ping and establish tcp/ip connections to each other.
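For completeness, the routing part described above would look roughly like this (a sketch based on the description, with addresses taken from the diagram; on AWS this is often done via the VPC route table, since a per-host route only works where the gateway is directly reachable):
# on a worker/manager in the gateway's subnet (e.g. node2): reach the VPN subnet via the NAT gateway
ip route add 172.16.0.0/22 via 10.0.30.198
# on the NAT gateway: forwarding between eth0 and tun0 must be enabled
sysctl -w net.ipv4.ip_forward=1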
