Enable forwarding from Docker containers to the outside world - docker

I've been wondering why docker installation does not enable by default port forwarding to containers.
To save you a click, what I mean is:
$ sysctl net.ipv4.conf.all.forwarding=1
$ sudo iptables -P FORWARD ACCEPT
I assume it is some sort of security risk, but I just wonder what the risk it is.
Basically I want to create some piece of code that enables this by default, but I want to know what is the bad that can happen.
I googled this and couldn't find anything.
Generally FORWARD ACCEPT seems to be considered too permissive (?)
If so, what can I change to make this more secure?
My network is rather simple, it is a bunch of pcs in a local lan (10.0.0.0/24) with an openvpn server and those pcs may deploy docker hosts (I'm doing this by hand, not using docker compose or swarm or anything because nodes change) that need to see each other. So no real outside access. Another detail is that I am not using network overlay which I could do without swarm, but the writer of the post warns it could be deprecated soon, so also wonder if I should just start using docker-swarm straight away.
EDIT: My question here is maybe more theoretical I guess than what it may seem at first. I want to know why they decided not to do this. I pretty much need/want full communication between docker instances, they need to be ssh'd into and open up a bunch of different ports to talk to each other (and this is the limitation of my networking knowledge, I don't know how this really works, I suppose they are all high ports, but are those also blocked by docker?). I am not sure docker-swarm would help me much here either. They aimed at micro-services I maybe need interactive sessions from time to time, but this is probably asking too much in a single question.
Maybe the simplest version of this question is: "if I put that code up there as a script to load each time my computer boots up, how can someone abuse it".

Each docker container runs on a local bridge network with IPs generally in the range of 172.1x.xx.xx. You can get the ip address running:
docker inspect <container name> | jq -r ".[].NetworkSettings.Networks[].IPAddress"
You should either run your container exposing and publishing the specific container ports on the host running the containers.
Alternatively, you can use iptables to redirect traffic to a specific port from outside:
iptables -t nat -I PREROUTING -i <incoming interface> -p tcp -m tcp --dport <host listening port> --j DNAT --to-destination <container ip address>:<container port>
Change tcp for udp if the port is listening on a udp socket.
If you want to redirect all traffic you can still use the same approach, but may need to specify a secondary ip address on your host (e.g., 192.168.1.x) and redirect any traffic coming to that address to your container.

Related

How to make IP traffic fowarding for Docker user-defined networks permanent?

When creating a user-defined Docker bridged network (using docker network create myname -d bridge), containers connected to this network may not be able to communicate with the outside world unless IP traffic fowarding is enabled on the Docker host using
$ sysctl net.ipv4.conf.all.forwarding=1
$ sudo iptables -P FORWARD ACCEPT
as explained in Enable forwarding from Docker containers to the outside world
However, the docs also state
These settings do not persist across a reboot, so you may need to add them to a start-up script.
but do not explain how to do this?
I'm running Docker on Ubuntu 22.04.1 server if that matters.
If you are going to use iptables make sure to also download iptables-peristent. This command helps to keep the iptables rules you created after a reboot.
Make sure you save the rules with the command iptables-save. Also, make sure you have enabled ip forwarding in the kernel level. Since you are using Ubuntu 22.04.1 I would recommend to use nftables instead. The iptables command is deprecated.
You can also visit their web site at: https://www.nftables.org/
to read their documentation for more information.

Docker server networking - reject incoming connections but allow outgoing

We use Docker containers to deploy multiple small applications on our servers that are reachable on the public internet. Some of the services need to communicate to each other, but are deployed on different servers, due to different hardware requirements (the servers are on different network and different IP).
Q: What would be the best way to configure blocking of incoming requests to SERVER:PORT except for some allowed IPs and at the same time allow all outgoing connections of the Docker containers?
Two major things we played with and tried out to get them working:
Bound Docker port mappings to 127.0.0.1 and route every traffic through an nginx. This is really config heavy and some infrastructure components aren't possible to proxy via http(s), so we need to add them to nginx.conf stream-server block and therefore open a port on the server (that is accessible by everyone).
Use iptables to restrict access to the published ports. So something like this: iptables -A INPUT -I DOCKER-USER -p tcp -i eth0 -j DROP. But this also have 2 major downfalls. First it seems that it's quite hard to allow multiple IP adresses in such a construct and on the other hand this approach seems to block our docker outgoing connections (to the internet) as well. E. g.: After we activated it a ping google.com from within a docker container was rejected.
Not sure I get this. In term of design, what is available to the external world is in a DMZ or published through an API gateway.
Your docker swarm/kubernetes cluster shall not be accessible directly through the internet or only the API gateway or the application on the DMZ.
So quite likely your docker server shall not be accessible directly. And even if that is the case, if you don't explicitely export a port to the host/outside of the cluster, it stay restricted to the virtuals networks of docker to allow cross container communication.

What are docker-proxy processes for

When I look at all running processes on my Linux machine, there are quite a few docker-proxy processes. It seems like every running container (port) results in one docker-proxy!
Problem is I cannot find any documentation which processes docker actually starts and how their relationship/usage is.
Does anyone know if there is any documentation on that?
A full explanation of the docker-proxy is available here.
The summary is that the proxy is used to handle connections originating from the local machine that might otherwise not pass through the iptables rules that Docker configures to handle port forwarding, or when Docker has been configured such that it does not manipulate iptables at all.

docker routing no iptables

I have a docker nginx image running with the command
sudo docker run -p 8002:80 nginx
What I am wondering is how does the routing work to go from hostmachine:8002 to get to the container listening on port 80. Usually there are iptables that very explicitly do that, but if you disable iptables it still works. Then I noticed that there is a docker-proxy process listening on each exposed port that I would assume does the proxy/nat. So I disabled the userland proxy with --userland-proxy=false. After doing that I now only see one process docker-current, still listening on all exposed ports. I can only assume that the docker-current process is doing the nat, but that makes me wondering why the userland proxy and/or iptables are ever there? And is there a way that I can see/prove to myself where the nating is happening (ie turn something on/off and not be able to curl my nginx container and then be able to)?
Just for completeness I will answer my own questions.
The iptables rules are there to hopefully save time by nat'ing in kernel space rather than sending it up to userspace to do the nat'ing. You can kill the process that is listening on the port and remove the iptables nat rules and not get a response from the traffic, which is how I (among other tests) was able to prove to myself what was going on.

Coreos security

I'm playing with coreos and digitalocean, and I'd like to start allowing internal communication between my containers.
I've got private networking set up for all the hosts, and now I'd like to ensure that some containers only open ports to localhost and to the internal interface.
I've explored a lot of options for this, but none of them seem satisfactory:
Using the '-p', I can ensure docker binds to the local interface, but this has two downsides:
I can't easily test services by SSHing in, because that traffic originates from localhost
I need to write somewhat hacky shell scripts to start my services, in order to inject the address of the machine that the container is running on
I tried using flannel, but it doesn't make the traffic private (or I didn't set it up right)
I considered using iptables on the containers to prevent external access, but that doesn't seem as secure
I tried using iptables on the coreos hosts, but ... it's tricky, and I couldn't get it working.
When I tried to configured iptables on the host, I used the method here: https://docs.docker.com/articles/networking/#communication-between-containers-and-the-wider-world, by adding a DROP rule to the docker chain, but it didn't work, and packets still got through
So what's the best approach, and I'll invest time in making it work.
Overall, I guess I need to find something that I can:
Roll out to all the hosts reliably
Something that is reasonably flexible going forward
Something that allows for 'edge machines' which are accessible from the wider internet.
Solution
I'll go into how I ended up solving this. Thanks to larsks for their help. In the end, their approach was the correct one. It's tricky on coreos, because there aren't really stable addresses, like larsks assumes. The whole point of coreos it to be able to forget about ip addresses.
I solved this by finding a not-too-bad way to inject the ip address into the command in the service file. The tricky thing about this is that it doesn't really support a lot of the shell features I expected. What I wanted to do was to assign the ip address of the machine to a variable then inject it into the command:
ip=$(ifconfig eth1 | grep -o 'inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*');
/usr/bin/docker run -p $ip:7000:7000 ...
But, as mentioned, that doesn't work. So what to do? Get the shell!
ExecStart=/usr/bin/sh -c "\
export ip=$(ifconfig eth1 | grep -o 'inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*');\
echo $ip;\
/usr/bin/docker run -p $ip:7000:7000"
I hit a few problems along the way.
I'm pretty sure there aren't newlines in that command, so I had to add the ';' characters
when you test the above bash -c command in a shell, it'll have very different effects to when systemd does it. In the shell you need to escape the '$' characters, while in systemd config files, you don't.
I included the echo so that I could see what the command thought the ip was.
When I was doing all this, I actually inserted a small webserver to the docker image, so that I could just test using curl.
Downsides of this approach is that it's tied to the way ifconfig works, and ipv4. In fact, this approach doesn't work on my linux mint laptop, where ifconfig produces differently formatted output. The important lesson here is to output things in yaml or json, so that shell json tools can access things more easily.
Instead of grep-ping the IP address, you can use the environment files to get the IP address (both public and private) of the host the service gets scheduled on. This allows you to bind your container ports to either public or private ports in a simple way.
Like so:
[Service]
EnvironmentFile=/etc/environment
ExecStart=/usr/bin/docker run --name myservice -p \
${COREOS_PUBLIC_IPV4}:80:80 \
${COREOS_PRIVATE_IPV4}:3306:3306 \
ubuntu /bin/bash
I've got private networking set up for all the hosts, and now I'd like
to ensure that some containers only open ports to localhost and to the
internal interface.
This is exactly the behavior that you get with the -p option when you specify an ip address. Let's say I have a host with two external interfaces, eth0 (with address 10.0.0.10) and eth1 (with address 192.168.0.10), and the docker0 bridge at 172.17.42.1/16.
If I start a container like this:
docker run -p 192.168.0.10:80:80 -d larsks/mini-httpd
This will start a container that is accessible over the eth1 interface at 192.168.0.10, port 80. This service is also accessible -- from the host on which the container is located -- at the address assigned to the container on the docker0 network. This would be something like 172.17.0.39, port 80.
This seems to meet your goals:
The container port is exposed over the "private" eth1 interface.
The container port is accessible from the host.
I can't easily test services by SSHing in, because that traffic originates from localhost.
If you were running ssh inside a container, you would ssh to it at the "internal" address assigned by Docker. But if you are running ssh inside your containers, you may want to consider not doing that and rely on tools like docker exec instead.
I need to write somewhat hacky shell scripts to start my services, in order to inject the address of the machine that the container is running on
With this solution, there is no need to inject the machine ip into the container.

Resources