When deploying docker-compose with multiple networks, only the first interface has access to the outside world
version: "3.9"
services:
speedtest:
build:
context: .
dockerfile: speedtest.Dockerfile
tty: true
networks:
- eth0
- eth1
networks:
eth0:
eth1:
Running ping inside the container, for example ping -I eth0 google.com, works fine.
However, running ping -I eth1 google.com gives this result:
PING google.com (142.250.200.238) from 172.21.0.2 eth1: 56(84) bytes of data.
From c4d3b238f9a1 (172.21.0.2) icmp_seq=1 Destination Host Unreachable
From c4d3b238f9a1 (172.21.0.2) icmp_seq=2 Destination Host Unreachable
Any idea how to get egress to the internet on both networks?
I've tried multiple combinations for creating the networks: external, bridge with a custom config, etc.
Update
Following larsks' answer, after adding a route with ip route add for eth1, running tcpdump -i any shows packets coming in correctly:
11:26:12.098918 eth1 Out IP 8077ec32b69d > dns.google: ICMP echo request, id 3, seq 1, length 64
11:26:12.184195 eth1 In IP dns.google > 8077ec32b69d: ICMP echo reply, id 3, seq 1, length 64
But still 100% packet loss...
The problem here is that while there are two interfaces inside the container, there is only a single default route. Given a container with two interfaces, like this:
/ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
70: eth0@if71: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:c0:a8:10:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.16.2/20 brd 192.168.31.255 scope global eth0
valid_lft forever preferred_lft forever
72: eth1@if73: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:c0:a8:30:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.48.2/20 brd 192.168.63.255 scope global eth1
valid_lft forever preferred_lft forever
The routing table looks like this:
/ # ip route
default via 192.168.16.1 dev eth0
192.168.16.0/20 dev eth0 proto kernel scope link src 192.168.16.2
192.168.48.0/20 dev eth1 proto kernel scope link src 192.168.48.2
When you run ping google.com or ping -I eth0 google.com, in both
cases your ICMP request egresses through eth0, goes to the
appropriate default gateway, and eventually works its way to
google.com.
But when you run ping -I eth1 google.com, there's no way to reach
the default gateway from that address; the gateway is only reachable
via eth0. Since the kernel can't find a useful route, it attempts to
connect directly. If we run tcpdump on the host interface that is
the other end of eth1, we see:
23:47:58.035853 ARP, Request who-has 142.251.35.174 tell 192.168.48.2, length 28
23:47:59.083553 ARP, Request who-has 142.251.35.174 tell 192.168.48.2, length 28
[...]
That's the kernel saying, "I've been told to connect to this address
using this specific interface, but there's no route, so I'm going to
assume the address is on the same network and just ARP for it".
Of course that fails.
We can make this work by adding an appropriate route. You need to run
a privileged container to do this (or at least have
CAP_NET_ADMIN):
ip route add default via 192.168.48.1 metric 101
(The gateway address is the .1 address of the network associated with eth1.)
We need the metric setting to differentiate this from the existing
default route; without that the command would fail with RTNETLINK answers: File exists.
After running that command, we have:
/ # ip route
default via 192.168.16.1 dev eth0
default via 192.168.48.1 dev eth1 metric 101
192.168.16.0/20 dev eth0 proto kernel scope link src 192.168.16.2
192.168.48.0/20 dev eth1 proto kernel scope link src 192.168.48.2
And we can successfully ping google.com via eth1:
/ # ping -c2 -I eth1 google.com
PING google.com (142.251.35.174) from 192.168.48.2 eth1: 56(84) bytes of data.
64 bytes from lga25s78-in-f14.1e100.net (142.251.35.174): icmp_seq=1 ttl=116 time=8.87 ms
64 bytes from lga25s78-in-f14.1e100.net (142.251.35.174): icmp_seq=2 ttl=116 time=8.13 ms
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 8.127/8.497/8.868/0.370 ms
Having gone through all that, I'll add that I don't see many
situations in which it would be necessary: typically you use
additional networks in order to isolate things like database servers,
etc, while using the "primary" interface (the one with which the
default route is associated) for outbound requests.
I tested all this using the following docker-compose.yaml:
version: "3"
services:
sleeper:
image: alpine
cap_add:
- NET_ADMIN
command:
- sleep
- inf
networks:
- eth0
- eth1
networks:
eth0:
eth1:
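With that file, a sketch of applying the extra route once the container is up (assuming the eth1 network was assigned 192.168.48.0/20 as in the output above; the gateway on your machine may differ):
docker-compose up -d
# add the second default route inside the container (possible because of cap_add: NET_ADMIN)
docker-compose exec sleeper ip route add default via 192.168.48.1 metric 101
# verify egress through eth1
docker-compose exec sleeper ping -c2 -I eth1 google.com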
Related
When I create a docker network, a bridge is added:
docker network create DUMMY
Now executing ifconfig gives:
br-8a429249b4d9: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.255.136.1 netmask 255.255.255.0 broadcast 10.255.136.255
inet6 fe80::42:88ff:fe9b:9a33 prefixlen 64 scopeid 0x20<link>
ether 02:42:88:9b:9a:33 txqueuelen 0 (Ethernet)
RX packets 9 bytes 388 (388.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 74 bytes 11136 (11.1 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Is it possible to retrieve that IP 10.255.136.1 from a container running inside the DUMMY network?
I am dockerizing an application which requires the source IP (in this case, that of another application running on the host) to be whitelisted in a configuration file, and I believe that should be the bridge's IP. Hence my question about retrieving that IP from within the actual container. Or alternatively, is there a way to provide that IP to the container via an environment variable?
The ip address of the bridge will be the default gateway inside the container. In other words, you can just parse the output from e.g. ip route to find the bridge address.
For example, if I create a DUMMY network, I get:
6: br-ea7804d337bc: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:8e:97:b4:02 brd ff:ff:ff:ff:ff:ff
inet 172.20.0.1/16 brd 172.20.255.255 scope global br-ea7804d337bc
valid_lft forever preferred_lft forever
inet6 fe80::42:8eff:fe97:b402/64 scope link
valid_lft forever preferred_lft forever
If I start a container on that network:
docker run -it --rm --net DUMMY alpine sh
I have the following routing table:
/ # ip route
default via 172.20.0.1 dev eth0
172.20.0.0/16 dev eth0 scope link src 172.20.0.2
And I can get the ip address itself by running that output through awk:
/ # ip route | awk '$1 == "default" {print $3}'
172.20.0.1
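If you would rather have that value in an environment variable (the alternative the question mentions), a minimal sketch is to resolve it at container start, for example in a shell entrypoint (BRIDGE_IP is just an illustrative name):
#!/bin/sh
# entrypoint sketch: resolve the bridge address from the default route and expose it to the app
BRIDGE_IP=$(ip route | awk '$1 == "default" {print $3}')
export BRIDGE_IP
exec "$@"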
On a KVM guest of my RHEL8 host, with the guest running CentOS7, I was expecting firewalld to block, by default, outside access to an ephemeral port published by a Docker container running nginx. To my surprise the access ISN'T blocked.
Again, the host (myhost) is running RHEL8, and it has a KVM guest (myguest) running CentOS7.
The firewalld configuration on myguest is standard, nothin' fancy:
[root@myguest ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0 eth1
  sources:
  services: http https ssh
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
Here are the eth0 and eth1 interfaces that fall under the firewalld public zone:
[root@myguest ~]# ip a s dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:96:9c:fc brd ff:ff:ff:ff:ff:ff
inet 192.168.100.111/24 brd 192.168.100.255 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe96:9cfc/64 scope link noprefixroute
valid_lft forever preferred_lft forever
[root@myguest ~]# ip a s dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:66:6c:a1 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.111/24 brd 192.168.1.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe66:6ca1/64 scope link noprefixroute
valid_lft forever preferred_lft forever
On myguest I'm running Docker, and the nginx container is publishing its Port 80 to an ephemeral port:
[me#myguest ~]$ docker container ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
06471204f091 nginx "/docker-entrypoint.…" About an hour ago Up About an hour 0.0.0.0:49154->80/tcp focused_robinson
Notice that in the prior firewall-cmd output I was not permitting access via this ephemeral TCP Port 49154 (or to any other ephemeral ports for that matter). So, I was expecting that unless I did so, outside access to nginx would be blocked. But to my surprise, from another host in the home network running Windows, I was able to access it:
C:\Users\me>curl http://myguest:49154
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
.
.etc etc
If a container publishes its container port to an ephemeral one on the host (myguest in this case), shouldn't the host firewall utility protect access to that port in the same manner as it would a standard port? Am I missing something?
But I also noticed that in fact the nginx container is listening on a TCP6 socket:
[root@myguest ~]# netstat -tlpan | grep 49154
tcp6 0 0 :::49154 :::* LISTEN 23231/docker-proxy
It seems, then, that firewalld may not be blocking tcp6 sockets? I'm confused.
This is obviously not a production issue, nor something to lose sleep over. I'd just like to make sense of it. Thanks.
The integration between Docker and firewalld has changed over the years, but based on your OS versions and CLI output I think you can get the behavior you expect by setting AllowZoneDrifting=no in /etc/firewalld/firewalld.conf on the RHEL-8 host.
Due to zone drifting, it is possible for packets received in a zone with --set-target=default (e.g. the public zone) to drift to a zone with --set-target=accept (e.g. the trusted zone). This means FORWARDed packets received in zone public will be forwarded to zone trusted. If your docker containers are using a real bridge interface, then this issue may apply to your setup. Docker defaults to SNAT, so usually this problem is hidden.
Newer firewalld releases have completely removed this behavior, because, as you have found, it's both unexpected and a security issue.
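A minimal sketch of applying that setting (assuming a stock /etc/firewalld/firewalld.conf where the AllowZoneDrifting line is present):
# switch AllowZoneDrifting from yes to no
sudo sed -i 's/^AllowZoneDrifting=.*/AllowZoneDrifting=no/' /etc/firewalld/firewalld.conf
# restart firewalld so the option is re-read
sudo systemctl restart firewalld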
I've moved my MongoDB from a container to a local service (it was really flaky when containerised). The problem is I cannot connect from a Node API to the locally running MongoDB service. I can get this working on my Mac, but not on Ubuntu. I've tried:
- DB_HOST=mongodb://172.17.0.1:27017/proto?authSource=admin
- DB_HOST=mongodb://localhost:27017/proto?authSource=admin
// this works locally, but not on my Ubuntu server
- DB_HOST=mongodb://host.docker.internal:27017/proto?authSource=admin
Tried adding this to my Dockerfile:
ip -4 route list match 0/0 | awk '{print $3 " host.docker.internal"}' >> /etc/hosts && \
Also tried a bridge network, to no avail. Example docker-compose:
version: '3.3'
services:
  search-api:
    build: ../search-api
    environment:
      - PORT=3333
      - DB_HOST=mongodb://host.docker.internal:27017/search?authSource=admin
      - DB_USER=dbuser
      - DB_PASS=password
    ports:
      - 3333:3333
    restart: always
The problem can be caused by MongoDB not listening on the correct IP address and therefore blocking your access.
Either make sure it's listening on a specific IP or listening on all interfaces: 0.0.0.0
On Linux the config file is installed by default at /etc/mongod.conf
Configuration for a specific IP address:
net:
  bindIp: 172.17.0.1  # being your host's IP address
  port: 27017
Configuration open to all connections:
net:
  bindIp: 0.0.0.0
  port: 27017
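Either way, mongod has to be restarted to pick up the new bindIp; on a systemd-based install that is typically (assuming the service is named mongod, as with the stock packages):
sudo systemctl restart mongod
# confirm it is now listening on the expected address
sudo ss -tlnp | grep 27017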
To get your host's IP address from within a container: on Docker for Mac and Docker for Windows you can use host.docker.internal, while on Linux you need to run ip route show in the container.
When running Docker natively on Linux, you can access host services using the IP address of the docker0 interface. From inside the container, this will be your default route.
For example, on my system:
$ ip addr show docker0
7: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::f4d2:49ff:fedd:28a0/64 scope link
valid_lft forever preferred_lft forever
And inside a container:
# ip route show
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 src 172.17.0.4
(copied from here: How to access host port from docker container)
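Putting that together, a minimal sketch of building the Mongo connection string inside the container at startup (assuming the container is on the default bridge network, so the default gateway is the host's docker0 address; HOST_IP is just an illustrative name):
# resolve the host's docker0 address from the container's default route
HOST_IP=$(ip route | awk '$1 == "default" {print $3}')
export DB_HOST="mongodb://${HOST_IP}:27017/search?authSource=admin"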
So I'm trying to create a network (docker network create) so that its traffic will pass through a specific physical network interface (NIC); I have two: <iface1> (internal) and <iface2> (external).
I need the traffic of the two NICs to be physically separated.
METHOD 1:
I think macvlan is the driver I should use to create such a network.
Most of the solutions I found on the internet refer to Pipework (now deprecated) and temporary docker plugins (also deprecated).
What has most closely helped me is this:
docker network create -d macvlan \
--subnet 192.168.0.0/16 \
--ip-range 192.168.2.0/24 \
-o parent=wlp8s0.1 \
-o macvlan_mode=bridge \
macvlan0
Then, in order for the container to be visible from the host, I need to do this on the host:
sudo ip link add macvlan0 link wlp8s0.1 type macvlan mode bridge
sudo ip addr add 192.168.2.10/16 dev macvlan0
sudo ifconfig macvlan0 up
Now the container and the host see each other :) BUT the container can't access the local network.
The idea is that the container can access the internet.
METHOD 2:
As I will use <iface2> manually, I'm ok if by default the traffic goes through <iface1>.
But no matter in which order I bring the NICs up (I also tried removing the LKM for <iface2> temporarily), all traffic is always taken over by the external NIC <iface2>.
And I found that this happens because the routing table updates automatically at some "random" time.
In order to force the traffic to go through <iface1>, I have to (on the host):
sudo route del -net <net> gw 0.0.0.0 netmask 255.0.0.0 dev <iface2>
sudo route del default <iface2>
Now, I can verify (in several ways) that the traffic just goes through <iface1>.
But the moment the routing table updates (automatically), all traffic moves to <iface2>. Damn!
I'm sure there's a way to make the routing table "static" or "persistent".
EDIT (18/Jul/2018):
The main idea is to be able to access the internet through a docker container using only one of the two available physical network interfaces.
My environment:
On the host I created a virbr0 bridge for the VM with IP address 192.168.122.1, and brought up a VM instance with interface ens3 and IP address 192.168.122.152.
192.168.122.1 is the gateway for the 192.168.122.0/24 network.
Inside the VM:
Create network:
# docker network create --subnet 192.168.122.0/24 --gateway 192.168.122.1 --driver macvlan -o parent=ens3 vmnet
Create docker container:
# docker run -ti --network vmnet alpine ash
Check:
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
12: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether 02:42:c0:a8:7a:02 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.2/24 brd 192.168.122.255 scope global eth0
valid_lft forever preferred_lft forever
/ # ping 192.168.122.152
PING 192.168.122.152 (192.168.122.152): 56 data bytes
^C
--- 192.168.122.152 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
/ # ping 192.168.122.1
PING 192.168.122.1 (192.168.122.1): 56 data bytes
64 bytes from 192.168.122.1: seq=0 ttl=64 time=0.471 ms
^C
--- 192.168.122.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.471/0.471/0.471 ms
OK, I bring up another VM with IP address 192.168.122.73 and check from the Docker container:
/ # ping 192.168.122.73 -c2
PING 192.168.122.73 (192.168.122.73): 56 data bytes
64 bytes from 192.168.122.73: seq=0 ttl=64 time=1.630 ms
64 bytes from 192.168.122.73: seq=1 ttl=64 time=0.984 ms
--- 192.168.122.73 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.984/1.307/1.630 ms
From the Docker instance I can't ping the interface on the VM, but I can access the local network.
/ # ip n|grep 192.168.122.152
192.168.122.152 dev eth0 used 0/0/0 probes 6 FAILED
On the VM I add a macvlan0 NIC:
# ip link add macvlan0 link ens3 type macvlan mode bridge
# ip addr add 192.168.122.100/24 dev macvlan0
# ip l set macvlan0 up
From the Docker container I can ping 192.168.122.100:
/ # ping 192.168.122.100 -c2
PING 192.168.122.100 (192.168.122.100): 56 data bytes
64 bytes from 192.168.122.100: seq=0 ttl=64 time=0.087 ms
64 bytes from 192.168.122.100: seq=1 ttl=64 time=0.132 ms
--- 192.168.122.100 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.087/0.109/0.132 ms
Problem: the Internet isn't accessible within a docker container.
on my bare metal Ubuntu 17.10 box...
$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=52 time=10.8 ms
but...
$ docker run --rm debian:latest ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
92 bytes from 7911d89db6a4 (192.168.220.2): Destination Host Unreachable
I think the root cause is that I had to set up a non-default network for docker0 because the default one 172.17.0.1 was already in use within my organization.
My /etc/docker/daemon.json file needs to look like this in order for docker to start successfully.
$ cat /etc/docker/daemon.json
{
  "bip": "192.168.220.1/24",
  "fixed-cidr": "192.168.220.0/24",
  "fixed-cidr-v6": "0:0:0:0:0:ffff:c0a8:dc00/120",
  "mtu": 1500,
  "default-gateway": "192.168.220.10",
  "default-gateway-v6": "0:0:0:0:0:ffff:c0a8:dc0a",
  "dns": ["10.0.0.69","10.0.0.70","10.1.1.11"],
  "debug": true
}
Note that the default-gateway setting looks wrong. However, if I correct it to read 192.168.220.1 the docker service fails to start. Running dockerd at the command line directly produces the most helpful logging, thus:
With "default-gateway": 192.168.220.1 in daemon.json...
$ sudo dockerd
-----8<-----
many lines removed
----->8-----
Error starting daemon: Error initializing network controller: Error creating default "bridge" network: failed to allocate secondary ip address (DefaultGatewayIPv4:192.168.220.1): Address already in use
Here's the info for docker0...
$ ip addr show docker0
10: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:10:bc:66:fd brd ff:ff:ff:ff:ff:ff
inet 192.168.220.1/24 brd 192.168.220.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:10ff:febc:66fd/64 scope link
valid_lft forever preferred_lft forever
And routing table...
$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.62.131.1 0.0.0.0 UG 100 0 0 enp14s0
10.62.131.0 0.0.0.0 255.255.255.0 U 100 0 0 enp14s0
169.254.0.0 0.0.0.0 255.255.0.0 U 1000 0 0 enp14s0
192.168.220.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
Is this the root cause? How do I achieve the seemingly mutually exclusive states of:
- docker0 interface address is x.x.x.1
- gateway address is the same, x.x.x.1
- dockerd runs OK
?
Thanks!
Longer answer to Wedge Martin's question. I made the changes to daemon.json as you suggested:
{
  "bip": "192.168.220.2/24",
  "fixed-cidr": "192.168.220.0/24",
  "fixed-cidr-v6": "0:0:0:0:0:ffff:c0a8:dc00/120",
  "mtu": 1500,
  "default-gateway": "192.168.220.1",
  "default-gateway-v6": "0:0:0:0:0:ffff:c0a8:dc0a",
  "dns": ["10.0.0.69","10.0.0.70","10.1.1.11"],
  "debug": true
}
so at least the daemon starts, but I still don't have internet access within a container...
$ docker run -it --rm debian:latest bash
root@bd9082bf70a0:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:c0:a8:dc:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.220.3/24 brd 192.168.220.255 scope global eth0
valid_lft forever preferred_lft forever
root@bd9082bf70a0:/# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
92 bytes from bd9082bf70a0 (192.168.220.3): Destination Host Unreachable
It turned out that less is more. Simplifying daemon.json to the following resolved my issues.
{
  "bip": "192.168.220.2/24"
}
If you don't set the gateway, Docker will set it to the first non-network address in the network, i.e. .1; but if you do set it, Docker will hit a conflict when allocating the bridge, because the .1 address is already in use. You should only set default-gateway if it's outside of the network range.
The bip setting tells Docker to use a different address than .1, so setting bip can avoid the conflict, but I am not sure it will end up doing what you want. It will probably cause routing issues, as the non-network route will go to an address that has no host responding.
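To see which gateway Docker actually ended up allocating for the default bridge after a change like this, one way (a sketch) is:
# IPAM config of the default bridge network, as seen by the daemon
docker network inspect bridge -f '{{json .IPAM.Config}}'
# and the effective default gateway as seen from inside a container
docker run --rm alpine ip route | awk '$1 == "default" {print $3}'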