Leverage iptables to drop packets between docker containers

I have three containers C1, C2 and C3, forming a cluster, and a DNS instance running. The containers resolve their IPs using the DNS and can already communicate with each other, as they expose the needed ports using vanilla Docker configuration.
How can I leverage iptables from the host to drop packets between say C1 and C2 at any point in time?

It's not clear from your question exactly what your goal is, so here are a few options.
Disabling ICC
If you run the Docker daemon with --icc=false, then containers will by default not be able to communicate unless you explicitly link them with --link.
If you follow this route, note this issue (tl;dr: you must ensure that the br_netfilter module is loaded on recent kernels).
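If you prefer a configuration file over the daemon flag, the equivalent entry in /etc/docker/daemon.json would be the following (a sketch; restart the daemon afterwards for it to take effect):
{
  "icc": false
}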
Modifying iptables inside a container
You can use the nsenter tool to run iptables commands inside a container and add DROP rules to the container's INPUT chain. For example, if you know (a) the PID of container C1 and (b) the IP address of container C2 (both of which you can get with docker inspect), you could run:
nsenter -t <pid_of_C1> --net iptables -A INPUT -s <ip_of_c2> -j DROP
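As a sketch of how to fill in those two placeholders, using the container names from the question:
docker inspect -f '{{.State.Pid}}' C1
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' C2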
Modifying iptables on the host
You can modify the FORWARD chain on your host to block traffic between particular containers. For example, to drop packets from C1 to C2:
iptables -I FORWARD 1 -s <ip_of_c1> -d <ip_of_c2> -j DROP
This inserts the rule at position 1 (-I FORWARD 1) of the FORWARD chain. This is necessary because the rule must come before the -i docker0 -o docker0 -j ACCEPT rule that Docker adds to the FORWARD chain when --icc=true (the default).
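To confirm that the rule actually landed ahead of Docker's ACCEPT rule, you can list the chain with rule numbers:
iptables -L FORWARD -n --line-numbers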

Related

access docker container in different subnet (bridge)

I would like to reach a container that is in another subnet (a different bridge). The source and destination bridges are connected via a veth pair.
This is needed for a test setup in which I would like to manipulate the connection properties (rate, latency, etc.) between those bridges. My VMs on these bridges are able to ping each other, but the containers are unreachable (neither the VMs nor the containers on the other bridge can reach them).
First I started the containers without any network configuration and tried to connect their veth counterparts on the host to bridges I had created manually.
In the end I created the bridges indirectly with
docker network create --subnet 192.168.1.0/26 \
-o "com.docker.network.bridge.enable_icc"="true" \
-o "com.docker.network.driver.mtu"="1500" \
-o "com.docker.network.bridge.name"="br-side-a" \
br-side-a
docker network create --subnet 192.168.1.64/29 \
-o "com.docker.network.bridge.enable_icc"="true" \
-o "com.docker.network.driver.mtu"="1500" \
-o "com.docker.network.bridge.name"="br-side-b" \
br-side-b
and connected them with
ip link add dev vsidea type veth peer name vsideb
brctl addif br-side-a vsidea
brctl addif br-side-b vsideb
ip addr add 192.168.1.10/26 dev vsidea
ip addr add 192.168.1.66/29 dev vsideb
ip link set vsidea up
ip link set vsideb up
VMs that I connected to those bridges (with IPs of the connected subnets) are able to ping each other.
My containers are started like this:
docker run -ti --network br-side-a --ip 192.168.1.20 -p 10001:10000 --name csidea --privileged debian bash
docker run -ti --network br-side-b --ip 192.168.1.67 -p 10002:10000 --name csideb --privileged debian bash
From both containers I can ping everything in each subnet (gateway IPs, vsidea/b, ...) except the IPs assigned to the containers themselves; nor can the VMs reach the container IPs.
I think Docker does some routing/filtering that I must turn off, but I have no idea how.
So I found a solution to my problem. As I suspected, Docker does do filtering, as I now know.
Docker automatically creates iptables rules to restrict network access for the bridges it creates. To show them, just use iptables [-L|-S]; there should be three Docker-specific chains: DOCKER-USER, DOCKER-ISOLATION-STAGE-1 and DOCKER-ISOLATION-STAGE-2.
The rules in those isolation-stage chains are what prevent networking between my networks. They have the form:
-I DOCKER-ISOLATION-STAGE-1 -i <my_network> ! -o <my_network> -j DOCKER-ISOLATION-STAGE-2
-I DOCKER-ISOLATION-STAGE-2 -i <my_network> ! -o <my_network> -j DROP
I first changed the final rule from DROP to ACCEPT just to verify my discovery, and indeed networking between those nets then worked.
So I searched for a way to stop Docker from creating those rules, but you can only disable the creation of all iptables entries by Docker, not just some of them. It is also not recommended to modify the isolation chains; the DOCKER-USER chain exists for exactly this purpose. It is evaluated before any other Docker rules, so you can specify that such packets be accepted instead of dropped. Add the following rule for every subnet you want to allow to communicate: iptables -I DOCKER-USER -i <my_bridge_network> ! -o <my_bridge_network> -j ACCEPT.
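Concretely, for the two bridges created above (br-side-a and br-side-b), that would be:
iptables -I DOCKER-USER -i br-side-a ! -o br-side-a -j ACCEPT
iptables -I DOCKER-USER -i br-side-b ! -o br-side-b -j ACCEPT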
PS: Sorry for my English. I hope it is understandable; if there are unbearable mistakes, feel free to give me a hint on how to do better.

Docker IP-TABLES Error

Hey, I'm quite new to this Docker stuff. I tried to start a Docker container with Bitbucket, but I get this output.
root@rv1175:~# docker run -v bitbucketVolume:/var/atlassian/application-data/bitbucket --name="bitbucket" -d -p 7990:7990 -p 7999:7999 atlassian/bitbucket-server
6da32052deeba204d5d08518c93e887ac9cc27ac10ffca60fa20581ff45f9959
docker: Error response from daemon: driver failed programming external connectivity on endpoint bitbucket (55d12e0e4d76ad7b7e8ae59d5275f6ee85c8690d9f803ec65fdc77a935a25110): (iptables failed: iptables --wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.2 --dport 7999 -j ACCEPT: iptables: No chain/target/match by that name.
(exit status 1)).
root@rv1175:~#
I get the same output every time I try to start any Docker container. Can someone help me?
P.S. One more question: what does 172.17.0.2 mean? I can only say that this is not my IP.
172.17.0.2 would be the IP assigned to the container within the default Docker bridge network (the docker0 virtual interface). Such addresses are not reachable from the outside, though you are instructing the Docker engine to "publish" (in Docker terminology) two ports.
To do so, the engine creates port forwarding rules with iptables, which forward (in your case) all incoming traffic to ports tcp/7990 and tcp/7999 on all interfaces of the host to the same ports at 172.17.0.2 on the docker0 interface (where the process in the container is hopefully listening).
It looks like the DOCKER iptables chain where this happens is not present. Maybe you have other tools manipulating iptables that might be erasing what the Docker engine is doing. Try to identify them and restart the Docker engine (it should re-create everything on startup).
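A minimal sketch of that recovery, assuming a systemd-based host:
systemctl restart docker
iptables -t filter -L DOCKER -n
If the restart succeeded, the second command should list the re-created DOCKER chain instead of erroring out.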
You can also instruct the engine not to manipulate iptables by configuring the Docker daemon appropriately. You would then need to set things up yourself if you want to use the network bridge driver (though you could also use the host driver). Here is a good example of doing so.

docker with public IP as a client

I have a host with 10.1.1.2 and I'd like to create a Docker container on it that will have the IP address 10.1.1.3 and that will be able to ping (and later send its syslog to) an external machine on the same network (e.g. 10.1.1.42). I'd also like the packets to arrive from 10.1.1.3, so as far as I understand, no NAT.
I am not interested in inbound network connections to the docker container but outbound.
There is apparently an unresolved issue for this feature right now, so the only current solution is to manually create the necessary iptables rules after launching your container. E.g., something like:
iptables -t nat -I POSTROUTING 1 -s <container_ip> -j SNAT --to-source 10.1.1.3
You will also need to add that address to an interface on your host:
ip addr add 10.1.1.3/24 dev eth0
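To check that outgoing packets really carry the new source address, you could watch the host's outgoing interface while pinging 10.1.1.42 from the container (a sketch; eth0 is taken from the answer above):
tcpdump -ni eth0 icmp and host 10.1.1.42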

How to map a docker container IP to a host IP (NAT instead of NAPT)?

The main goal is to do real NAT instead of NAPT. Note that the normal docker run -p ip:port2:port1 command actually does NAPT (address + port translation) instead of NAT (address translation). Is it possible to map the address only, keeping every exposed port the same as in the container, like docker run -p=ip1:*:* ..., instead of one by one or as a range?
ps.1. My port range is rather big (22-50070, ssh to hdfs), so the port-range approach won't work.
ps.2. Maybe I need a swarm of virtual machines and to join the host into the swarm.
ps.3. I raised a feature request on GitHub. Not sure if they will accept it, but currently there are 2000+ open issues (it's that popular).
Solution
On Linux, you can access any container by IP and port without any binding (no -p) out of the box. Docker version: CE 17+.
If your host is Windows and Docker is running on a Linux VM (as in my case), the only thing you need to do to access the containers is add a route on Windows: route add -p 172.16.0.0 mask 255.240.0.0 ip_of_your_vm. Now you can access all containers by IP:port without any port mapping, from both the Windows host and the Linux VM.
There are a few options. One is to decide which port range you want to map and then use it in your docker run:
docker run -p 192.168.33.101:80-200:80-200 <your image>
The above will map ports 80 to 200 of your container, assuming a spare IP on your host is 192.168.33.101. Unfortunately it is not possible to map a larger port range, as Docker forks an iptables process for each port to set up the rules, which exhausts memory. It raises an error like the one below:
docker: Error response from daemon: driver failed programming external connectivity on endpoint zen_goodall (0ae6cec360831b46fe3668d6aad9f5f72b6dac5d26cc6c817452d1402d12f02c): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8513 -j DNAT --to-destination 172.17.0.3:8513 ! -i docker0: (fork/exec /sbin/iptables: resource temporarily unavailable)).
This is not the intended way of using Docker port mapping, and it is not a use case the maintainers are likely to support, so the above issue may never be fixed. The next option is to run your container without publishing any ports and to use the iptables rules below:
DOCKER_IP=172.17.0.2   # IP of the running container
ACTION=A               # A appends the rules, D deletes them
IP=192.168.33.101      # host IP to map to the container
sudo iptables -t nat -$ACTION DOCKER -d $IP -j DNAT --to-destination $DOCKER_IP ! -i docker0
sudo iptables -t filter -$ACTION DOCKER ! -i docker0 -o docker0 -p tcp -d $DOCKER_IP -j ACCEPT
sudo iptables -t nat -$ACTION POSTROUTING -p tcp -s $DOCKER_IP -d $DOCKER_IP -j MASQUERADE
ACTION=A adds the rules and ACTION=D deletes them. This routes all traffic for your IP to DOCKER_IP. It is only suitable for a testing server, not for staging or production: Docker normally adds many more rules to prevent other containers from poking into your container, and this setup offers no such protection whatsoever.
I don't think there is a direct way to do what you are asking.
If you use the -P option with docker run, all ports that are exposed using EXPOSE in the Dockerfile will automatically be published on random host ports. With the -p option, the only way is to specify the option multiple times for multiple ports.
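For example (a sketch; <your image> stands in for an image whose Dockerfile EXPOSEs the ports in question):
docker run -d -P --name test <your image>
docker port test
The second command shows which random host ports were bound to the exposed container ports.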

Communicating between Docker containers in different networks on the same host

Is there any possibility to make containers in different networks on the same host communicate? Please note that I am not using docker-compose at the moment.
The following is a summary of what I did. I created two networks using the commands:
docker network create --driver bridge mynetwork1
docker network create --driver bridge mynetwork2
Then I ran two containers on each of these created networks using the commands:
docker run --net=mynetwork1 -it --name=mynet1container1 mycontainerimage
docker run --net=mynetwork1 -it --name=mynet1container2 mycontainerimage
docker run --net=mynetwork2 -it --name=mynet2container1 mycontainerimage
docker run --net=mynetwork2 -it --name=mynet2container2 mycontainerimage
I then identified the IP Addresses of each of the containers from the networks created using
docker network inspect mynetwork1
docker network inspect mynetwork2
Using those I was able to communicate between the containers in the same network, but I could not communicate between the containers across the networks. Communication was possible only by adding the containers to the same network.
Much thanks...
Containers in different networks cannot communicate with each other, because iptables drops such packets. This is shown in the DOCKER-ISOLATION-STAGE-1 and DOCKER-ISOLATION-STAGE-2 chains of the filter table.
sudo iptables -t filter -vL
Rules can be added to the DOCKER-USER chain to allow communication between different networks. In the above scenario, the following commands will allow ANY container in mynetwork1 to communicate with ANY container in mynetwork2.
The bridge interface names of the networks (mynetwork1 and mynetwork2) need to be found first. They usually look like br-07d0d51191df or br-85f51d1cfbf6 and can be found with "ifconfig" or "ip link show". Since there are multiple bridge interfaces, identify the correct ones for the networks of interest by matching the inet address of the bridge interface (shown in ifconfig) against the subnet address shown by 'docker network inspect mynetwork1'.
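Alternatively, since Docker by default names the bridge br- followed by the first 12 characters of the network ID, you can derive the name directly (a sketch using a Go template; it assumes no custom bridge name was set):
docker network inspect -f 'br-{{printf "%.12s" .Id}}' mynetwork1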
sudo iptables -I DOCKER-USER -i br-########1 -o br-########2 -j ACCEPT
sudo iptables -I DOCKER-USER -i br-########2 -o br-########1 -j ACCEPT
The rules can be fine-tuned to allow only communication between specific IPs. E.g.,
sudo iptables -I DOCKER-USER -i br-########1 -o br-########2 -s 172.17.0.2 -d 172.19.0.2 -j ACCEPT
sudo iptables -I DOCKER-USER -i br-########2 -o br-########1 -s 172.19.0.2 -d 172.17.0.2 -j ACCEPT
Issue
Two containers cannot communicate because they are not on the same network.
Solution a)
Connect one container to the other container's network (this may not meet the constraints you have).
Solution b)
Create a third network and plug both containers into this network.
How to
The command docker run accepts only one occurrence of the --net option, so what you have to do is docker start the containers and then docker network connect them to a shared network.
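For example, a sketch reusing the container names from the question:
docker network create shared-bridge
docker network connect shared-bridge mynet1container1
docker network connect shared-bridge mynet2container1
After this, both containers are attached to shared-bridge and can reach each other there.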
The answer you are looking for is here: https://stackoverflow.com/a/34038381/5321002
According to the Docker docs, containers can only communicate within networks, not across them. You can, however, attach a container to two networks and communicate that way.
edit: Although at that point, why have two networks in the first place?
Here's the link:
https://docs.docker.com/engine/userguide/networking/dockernetworks/
-Bruce
