Bad Gateway when using Traefik in docker swarm - docker-swarm

I'm currently struggling a lot to spin up a small traefik example on my docker swarm instance.
I started first with an docker-compose file for local development and everything is working as expected.
But when I define this as swarm file to bring that environment into production I always get an Bad Gateway from traefik.
After searching a lot about this it seems to be related to an networking issue from traefik since it tries to request between two different networks, but I'm not able to find the issue.
After certain iterations I tried to reproduce the Issue with "official" containers to provide an better example for other people.
So this is my traefik.yml
version: "3.7"
networks:
external:
external: true
services:
traefik:
image: "traefik:v2.8.1"
command:
- "--log.level=INFO"
- "--accesslog=true"
- "--api.insecure=true"
- "--providers.docker=true"
- "--providers.docker.swarmMode=true"
- "--providers.docker.exposedbydefault=false"
- "--providers.docker.network=external"
- "--entrypoints.web.address=:80"
- "--entrypoints.web.forwardedHeaders.insecure"
ports:
- "80:80"
- "8080:8080"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
networks:
- external
deploy:
placement:
constraints: [node.role == manager]
host-app:
image: traefik/whoami
ports:
- "9000:80"
networks:
- external
deploy:
labels:
- "traefik.enable=true"
- "traefik.http.routers.host-app.rule=PathPrefix(`/whoami`)"
- "traefik.http.services.host-app.loadbalancer.server.port=9000"
- "traefik.http.routers.host-app.entrypoints=web"
- "traefik.http.middlewares.host-app-stripprefix.stripprefix.prefixes=/"
- "traefik.http.routers.host-app.middlewares=host-app-stripprefix#docker"
- "traefik.docker.network=external"
The network is created with: docker network create -d overlay external
and I deploy the stack with docker stack deploy -c traefik.yml server
Until here no issues and everything spins up fine.
When I curl localhost:9000 I get the correct response:
curl localhost:9000
Hostname: 7aa77bc62b44
IP: 127.0.0.1
IP: 10.0.0.8
IP: 172.25.0.4
IP: 10.0.4.6
RemoteAddr: 10.0.0.2:35068
GET / HTTP/1.1
Host: localhost:9000
User-Agent: curl/7.68.0
Accept: */*
but on
curl localhost/whoami
Bad Gateway%
I always get the bad Gateway issue.
So I checked my network with docker network inspect external to ensure that both are running in the same network and this is the case.
[
{
"Name": "external",
"Id": "iianul6ua9u1f1bb8ibsnwkyc",
"Created": "2022-08-09T19:32:01.4491323Z",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.4.0/24",
"Gateway": "10.0.4.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"7aa77bc62b440e32c7b904fcbd91aea14e7a73133af0889ad9e0c9f75f2a884a": {
"Name": "server_host-app.1.m2f5x8jvn76p2ssya692f4ydp",
"EndpointID": "5d5175b73f1aadf2da30f0855dc0697628801a31d37aa50d78a20c21858ccdae",
"MacAddress": "02:42:0a:00:04:06",
"IPv4Address": "10.0.4.6/24",
"IPv6Address": ""
},
"e23f5c2897833f800a961ab49a4f76870f0377b5467178a060ec938391da46c7": {
"Name": "server_traefik.1.v5g3af00gqpulfcac84rwmnkx",
"EndpointID": "4db5d69e1ad805954503eb31c4ece5a2461a866e10fcbf579357bf998bf3490b",
"MacAddress": "02:42:0a:00:04:03",
"IPv4Address": "10.0.4.3/24",
"IPv6Address": ""
},
"lb-external": {
"Name": "external-endpoint",
"EndpointID": "ed668b033450646629ca050e4777ae95a5a65fa12a5eb617dbe0c4a20d84be28",
"MacAddress": "02:42:0a:00:04:04",
"IPv4Address": "10.0.4.4/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4100"
},
"Labels": {},
"Peers": [
{
"Name": "3cb3e7ba42dc",
"IP": "192.168.65.3"
}
]
}
]
and by checking the traefik logs I get the following
10.0.0.2 - - [09/Aug/2022:19:42:34 +0000] "GET /whoami HTTP/1.1" 502 11 "-" "-" 4 "host-app#docker" "http://10.0.4.9:9000" 0ms
which is the correct server:port for the whoami service. And even connecting into the traefik container and ping 10.0.4.9 works fine.
PING 10.0.4.9 (10.0.4.9): 56 data bytes
64 bytes from 10.0.4.9: seq=0 ttl=64 time=0.066 ms
64 bytes from 10.0.4.9: seq=1 ttl=64 time=0.057 ms
^C
--- 10.0.4.9 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.057/0.061/0.066 ms
This logs and snippets are all on my local swarm on Docker for Windows with wsl2 Ubuntu distribution. But I tested this on an CentOS Swarm which can be requested within my company and also with https://labs.play-with-docker.com/ and leads all to the same error.
So please can anybody tell me what configuration I'm missing or what mistake I made to get this running?

After consulting a coworker and creating another example we finally found the solution by our self.
Its just my own failure that I used the published port for loadbalancing the traefik to the service which is wrong.
host-app:
image: traefik/whoami
ports:
- "9000:80"
networks:
- external
deploy:
labels:
- "traefik.enable=true"
- "traefik.http.routers.host-app.rule=PathPrefix(`/whoami`)"
- "traefik.http.services.host-app.loadbalancer.server.port=80" # <--- this was wrong
- "traefik.http.routers.host-app.entrypoints=web"
- "traefik.http.middlewares.host-app-stripprefix.stripprefix.prefixes=/"
- "traefik.http.routers.host-app.middlewares=host-app-stripprefix#docker"
- "traefik.docker.network=external"
and that's the reason for the Bad Gateway since traefik tries to reach the published port from the server which is not present internally.

Related

Docker containers in the same network can't communicate (ARP is possible but not upper layers messages)

I have been trying to split a simple API-rest into different services using docker. Unfortunately, I have not been able to make it work. I have read the docker docs several times and have followed multiple stack-over-flow and docker forum threads but non of the answers worked for me. I am new to Docker, so I might be missing something.
I detected that the communication host-container was ok but container-container wasn't, so in order to see what was going on I installed ping on get and post services (which run on a debian:bullseye-slim based image) and also wireshark in my host machine. What I have detected is that I can ping the host (172.22.0.1) and also the name resolution is okay (when I run ping post its IP is displayed) but for some reason when I send a ping request from post to get no reply is received.
My docker-compose.yaml file is the following:
version: '3.9'
services:
mydb:
image: mariadb:latest
environment:
MYSQL_DATABASE: 'cars'
MYSQL_ALLOW_EMPTY_PASSWORD: 'true'
ports:
- "3306:3306"
container_name: mydb
networks:
- mynw
post:
build: ./post-service
ports:
- "8081:8081"
container_name: post
networks:
- mynw
privileged: true
get:
build: ./get-service
ports:
- "8080:8080"
container_name: get
networks:
- mynw
privileged: true
nginx2:
build: ./nginx2
ports:
- "80:80"
container_name: nginx2
networks:
- mynw
networks:
mynw:
external: true
Initially, I was using the default network, but I read that this might cause internal DNS problems I changed it. I created the network by CLI without any special parameters (docker network create mynw). The JSON displayed when running docker network inspect mynw is the following:
[
{
"Name": "mynw",
"Id": "f925467f7efee99330f0eaaa82158006ac645cc92e7abda693f052c10da485bd",
"Created": "2022-10-14T18:42:14.145569533+02:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.22.0.0/16",
"Gateway": "172.22.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"4eb6e348d84b2433199e6581b4406eb74fb93c1fc2269691b81b34c13c723db5": {
"Name": "nginx2",
"EndpointID": "b19fab264c1489b616d919f09a5b80a1774561ea6f2538beb86157065c1e787b",
"MacAddress": "02:42:ac:16:00:03",
"IPv4Address": "172.22.0.3/16",
"IPv6Address": ""
},
"5f20802a59708bf4a592e137f52fca29dc857734983abc1c61548783e2e61896": {
"Name": "mydb",
"EndpointID": "3ef7b5d619b5b9ad9441dbc2efabd5a0e5a6bb2ea68bbd58fae8f7dfd2ac36ed",
"MacAddress": "02:42:ac:16:00:02",
"IPv4Address": "172.22.0.2/16",
"IPv6Address": ""
},
"dee816dd62aa08773134bb7a7a653544ab316275ec111817e11ba499552dea5b": {
"Name": "post",
"EndpointID": "cca2cbe801160fa6c35b3a34493d6cc9a10689cd33505ece36db9ca6dcf43900",
"MacAddress": "02:42:ac:16:00:04",
"IPv4Address": "172.22.0.4/16",
"IPv6Address": ""
},
"e23dcd0cecdb609e4df236fd8aed0999c12e1adc7b91b505fc88c53385a81292": {
"Name": "get",
"EndpointID": "83b73045887827ecbb1779cd27d5c4dac63ef3224ec42f067cfc39ba69b5484e",
"MacAddress": "02:42:ac:16:00:05",
"IPv4Address": "172.22.0.5/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
Curiously, when sniffing the network using wireshark I see that the ARP messages between the containers are exchanged without problem (get service asks for post's MAC adress and this one replies with its MAC, and then this information is processed correctly to send the ICMP request).
I thought that maybe the network layer was dropping the replies for some reason and installed iptables to both services and added a ACCEPT rule for icmp messages to both INPUT and OUTPUT, but also didn't change anything. If someone knows what else could I do or what am I missing it would be very helpful.
Finally the solution was to delete everything and reinstall Docker and Docker Compose.

container running on docker swarm not accessible from outside

I am running my containers on the docker swarm. asset-frontend service is my frontend application which is running Nginx inside the container and exposing port 80. now if I do
curl http://10.255.8.21:80
or
curl http://127.0.0.1:80
from my host where I am running these containers I am able to see my asset-frontend application but it is not accessible outside of the host. I am not able to access it from another machine, my host machine operating system is centos 8.
this is my docker-compose file
version: "3.3"
networks:
basic:
services:
asset-backend:
image: asset/asset-management-backend
env_file: .env
deploy:
replicas: 1
depends_on:
- asset-mongodb
- asset-postgres
networks:
- basic
asset-mongodb:
image: mongo
restart: always
env_file: .env
ports:
- "27017:27017"
volumes:
- $HOME/asset/mongodb:/data/db
networks:
- basic
asset-postgres:
image: asset/postgresql
restart: always
env_file: .env
ports:
- "5432:5432"
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=password
- POSTGRES_DB=asset-management
volumes:
- $HOME/asset/postgres:/var/lib/postgresql/data
networks:
- basic
asset-frontend:
image: asset/asset-management-frontend
restart: always
ports:
- "80:80"
environment:
- ENV=dev
depends_on:
- asset-backend
deploy:
replicas: 1
networks:
- basic
asset-autodiscovery-cron:
image: asset/auto-discovery-cron
restart: always
env_file: .env
deploy:
replicas: 1
depends_on:
- asset-mongodb
- asset-postgres
networks:
- basic
this is my docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
auz640zl60bx asset_asset-autodiscovery-cron replicated 1/1 asset/auto-discovery-cron:latest
g6poofhvmoal asset_asset-backend replicated 1/1 asset/asset-management-backend:latest
brhq4g4mz7cf asset_asset-frontend replicated 1/1 asset/asset-management-frontend:latest *:80->80/tcp
rmkncnsm2pjn asset_asset-mongodb replicated 1/1 mongo:latest *:27017->27017/tcp
rmlmdpa5fz69 asset_asset-postgres replicated 1/1 asset/postgresql:latest *:5432->5432/tcp
My 80 port is open in firewall
following is the output of firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: eth0
sources:
services: cockpit dhcpv6-client ssh
ports: 22/tcp 2376/tcp 2377/tcp 7946/tcp 7946/udp 4789/udp 80/tcp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
if i inspect my created network the output is following
[
{
"Name": "asset_basic",
"Id": "zw73vr9xigfx7hy16u1myw5gc",
"Created": "2019-11-26T02:36:38.241352385-05:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.3.0/24",
"Gateway": "10.0.3.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"9348f4fc6bfc1b14b84570e205c88a67aba46f295a5e61bda301fdb3e55f3576": {
"Name": "asset_asset-frontend.1.zew1obp21ozmg8r1tzmi5h8g8",
"EndpointID": "27624fe2a7b282cef1762c4328ce0239dc70ebccba8e00d7a61595a7a1da2066",
"MacAddress": "02:42:0a:00:03:08",
"IPv4Address": "10.0.3.8/24",
"IPv6Address": ""
},
"943895f12de86d85fd03d0ce77567ef88555cf4766fa50b2a8088e220fe1eafe": {
"Name": "asset_asset-mongodb.1.ygswft1l34o5vfaxbzmnf0hrr",
"EndpointID": "98fd1ce6e16ade2b165b11c8f2875a0bdd3bc326c807ba6a1eb3c92f4417feed",
"MacAddress": "02:42:0a:00:03:04",
"IPv4Address": "10.0.3.4/24",
"IPv6Address": ""
},
"afab468aefab0689aa3488ee7f85dbc2cebe0202669ab4a58d570c12ee2bde21": {
"Name": "asset_asset-autodiscovery-cron.1.5k23u87w7224mpuasiyakgbdx",
"EndpointID": "d3d4c303e1bc665969ad9e4c9672e65a625fb71ed76e2423dca444a89779e4ee",
"MacAddress": "02:42:0a:00:03:0a",
"IPv4Address": "10.0.3.10/24",
"IPv6Address": ""
},
"f0a768e5cb2f1f700ee39d94e380aeb4bab5fe477bd136fd0abfa776917e90c1": {
"Name": "asset_asset-backend.1.8ql9t3qqt512etekjuntkft4q",
"EndpointID": "41587022c339023f15c57a5efc5e5adf6e57dc173286753216f90a976741d292",
"MacAddress": "02:42:0a:00:03:0c",
"IPv4Address": "10.0.3.12/24",
"IPv6Address": ""
},
"f577c539bbc3c06a501612d747f0d28d8a7994b843c6a37e18eeccb77717539e": {
"Name": "asset_asset-postgres.1.ynrqbzvba9kvfdkek3hurs7hl",
"EndpointID": "272d642a9e20e45f661ba01e8731f5256cef87898de7976f19577e16082c5854",
"MacAddress": "02:42:0a:00:03:06",
"IPv4Address": "10.0.3.6/24",
"IPv6Address": ""
},
"lb-asset_basic": {
"Name": "asset_basic-endpoint",
"EndpointID": "142373fd9c0d56d5a633b640d1ec9e4248bac22fa383ba2f754c1ff567a3502e",
"MacAddress": "02:42:0a:00:03:02",
"IPv4Address": "10.0.3.2/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4100"
},
"Labels": {
"com.docker.stack.namespace": "asset"
},
"Peers": [
{
"Name": "8170c4487a4b",
"IP": "10.255.8.21"
}
]
}
]
Ran into this same issue and it turns out it was a clash between my local networks subnet and the subnet of the automatically created ingress network. This can be verified using docker network inspect ingress and checking if the IPAM.Config.Subnet value overlaps with your local network.
To fix you can update the configuration of the ingress network as specified in Customize the default ingress network; in summary:
Remove services that publish ports
Remove existing network: docker network rm ingress
Recreate using non-conflicting subnet:
docker network create \
--driver overlay \
--ingress \
--subnet 172.16.0.0/16 \ # Or whatever other subnet you want to use
--gateway 172.16.0.1 \
ingress
Restart services
You can avoid a clash to begin with by specifying the default subnet pool when initializing the swarm using the --default-addr-pool option.
docker service update your-service --publish-add 80:80
You can publish ports by updating the service.
Can you try this url instead of the ip adres? host.docker.internal so something like http://host.docker.internal:80
I suggest you verify the "right" behavior using docker-compose first. Then, try to use docker swarm without network specification just to verify there are no network interface problems.
Also, you could use the below command to verify your LISTEN ports:
netstat -tulpn
EDIT: I faced this same issue but I was able to access my services through 127.0.0.1
While running docker provide an port mapping, like
docker run -p 8081:8081 your-docker-image
Or, provide the port mapping in the docker desktop while starting the container.
I got into this same issue. It turns out that's my iptables filter causes external connections not work.
In docker swarm mode, docker create a virtual network bridge device docker_gwbridge to access to overlap network. My iptables has following line to drop packet forwards:
:FORWARD DROP
That makes network packets from physical NIC can't reach the docker ingress network, so that my docker service only works on localhost.
Change iptables rule to
:FORWARD ACCEPT
And problem solved without touching the docker.

docker-compose hostname to communicate between containers works with postgres but not app

I have the following docker-compose.yml:
services:
postgres:
image: "postgres:11.0-alpine"
app:
build: .
ports:
- "4000:4000"
depends_on:
- postgres
- nuxt
nuxt:
image: node:latest
ports:
- "3000:3000"
I need nuxt service to communicate with app.
Within the nuxt service (docker-compose run --rm --service-ports nuxt bash), if I run
root#62cafc299e8a:/app# ping postgres
PING postgres (172.18.0.2) 56(84) bytes of data.
64 bytes from avril_postgres_1.avril_default (172.18.0.2): icmp_seq=1 ttl=64 time=0.283 ms
64 bytes from avril_postgres_1.avril_default (172.18.0.2): icmp_seq=2 ttl=64 time=0.130 ms
64 bytes from avril_postgres_1.avril_default (172.18.0.2): icmp_seq=3 ttl=64 time=0.103 ms
but if I do:
root#62cafc299e8a:/app# ping app
ping: app: No address associated with hostname
Why does it work for postgres but not with app?
If I do docker network inspect 4fcb63b4b1c9, they appear to all be on the same network:
[
{
"Name": "myapp_default",
"Id": "4fcb63b4b1c9fe37ebb26e9d4d22c359c9d5ed6153bd390b6f0b63ffeb0d5c37",
"Created": "2019-05-16T16:46:27.820758377+02:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
},
"Internal": false,
"Attachable": true,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"53b726bdd01159b5f18e8dcb858e979e6e2f8ef68c62e049b824899a74b186c3": {
"Name": "myapp_app_run_c82e91ca4ba0",
"EndpointID": "b535b6ca855a5dea19060b2f7c1bd82247b94740d4699eff1c8669c5b0677f78",
"MacAddress": "02:42:ac:12:00:03",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"62cafc299e8a90fd39530bbe4a6af8b86098405e54e4c9e61128539ffd5ba928": {
"Name": "myapp_nuxt_run_3fb01bb2f778",
"EndpointID": "7eb8f5f8798baee4d65cbbfe5f0f5372790374b48f599f32490700198fa6d54c",
"MacAddress": "02:42:ac:12:00:04",
"IPv4Address": "172.18.0.4/16",
"IPv6Address": ""
},
"9dc1c848b2e347876292650c312e8aaf3f469f2efa96710fb50d033b797124b4": {
"Name": "myapp_postgres_1",
"EndpointID": "a925438ad5644c03731b7f7c926cff095709b2689fd5f404e9ac4e04c2fbc26a",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {
"com.docker.compose.network": "default",
"com.docker.compose.project": "myapp",
"com.docker.compose.version": "1.23.2"
}
}
]
So why is that? Also tried with aliases, without success. :(
Your app container is most likely not running. Its appearance in docker network inspect means that the container exists but it may be exited (i.e. is not running). You can check with docker ps -a, for example:
$ docker ps -a
CONTAINER ID ... STATUS ... NAMES
fe908e014fdd Exited (0) Less than a second ago so_app_1
3b2ca418c051 Up 2 minutes so_postgres_1
container app exists but is not running: you won't be able to ping it even if it exists in the network
container postgres exists and is running: you will be able to ping it
It's probably due to the fact that docker-compose run --rm --service-ports nuxt bash will only create and run the nuxt container, it won't run app nor postgres. You are able to ping postgres because it was already running before you used docker-compose run nuxt bash
To be able to ping other containers after running docker-compose run nuxt ..., you should either:
Have the other containers already running before (such as by using docker-compose up -d)
Have the other containers depends_on the container you are trying to run, for example:
nuxt:
image: node:latest
ports:
- "3000:3000"
# this will ensure posgres and app are run as well when using docker-compose run
depends_on:
- app
- nuxt
Even with that, your container may fail to start (or exit right after start) and you won't be able to ping it. Check with docker ps -a that it is running and docker logs to see why it may have exited.
As #Pierre said, most probably your container is not running.
From below docker-compose from your question, it seems you are not doing anything in that container such as running a server or uwsgi to keep it alive.
app:
build: .
ports:
- "4000:4000"
depends_on:
- postgres
- nuxt
To keep it running in docker compose, add command directive like below.
app:
build: .
command: tail -f /dev/null #trick to stop immediate exit and keep the container alive.
ports:
- "4000:4000"
depends_on:
- postgres
- nuxt
It should be 'pingable' container now.
If you wish to run via docker run, use -t, which creates a psuedo-tty
docker run -t -d <image> <command>

Two services cannot see each other through a swarm overlay

I feel like this is simple, but I can't figure it out. I have two services, consul and traefik up in a single node swarm on the same host.
> docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
3g1obv9l7a9q consul_consul replicated 1/1 progrium/consul:latest
ogdnlfe1v8qx proxy_proxy global 1/1 traefik:alpine *:80->80/tcp, *:443->443/tcp
> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
090f1ed90972 progrium/consul:latest "/bin/start -server …" 12 minutes ago Up 12 minutes 53/tcp, 53/udp, 8300-8302/tcp, 8400/tcp, 8500/tcp, 8301-8302/udp consul_consul.1.o0j8kijns4lag6odmwkvikexv
20f03023d511 traefik:alpine "/entrypoint.sh -c /…" 12 minutes ago Up 12 minutes 80/tcp
Both containers have access to the "consul" overlay network, which was created as such.
> docker network create --driver overlay --attachable consul
ypdmdyx2ulqt8l8glejfn2t25
Traefik is complaining that it can't reach consul.
time="2019-03-18T18:58:08Z" level=error msg="Load config error: Get http://consul:8500/v1/kv/traefik?consistent=&recurse=&wait=30000ms: dial tcp 10.0.2.2:8500: connect: connection refused, retrying in 7.492175404s"
I can go into the traefik container and confirm that I can't reach consul through the overlay network, although it is pingable.
> docker exec -it 20f03023d511 ash
/ # nslookup consul
Name: consul
Address 1: 10.0.2.2
/ # curl consul:8500
curl: (7) Failed to connect to consul port 8500: Connection refused
# ping consul
PING consul (10.0.2.2): 56 data bytes
64 bytes from 10.0.2.2: seq=0 ttl=64 time=0.085 ms
However, if I look a little deeper, I find that they are connected, just that the overlay network isn't transmitting traffic to the actual destination for some reason. If I go directly to the actual consul ip, it works.
/ # nslookup tasks.consul
Name: tasks.consul
Address 1: 10.0.2.3 0327c8e1bdd7.consul
/ # curl tasks.consul:8500
Moved Permanently.
I could workaround this, technically there will only ever be one copy of consul running, but I'd like to know why the data isn't routing in the first place before I get deeper into it. I can't think of anything else to try. Here is various information related to this setup.
> docker --version
Docker version 18.09.2, build 6247962
> docker network ls
NETWORK ID NAME DRIVER SCOPE
cee3cdfe1194 bridge bridge local
ypdmdyx2ulqt consul overlay swarm
5469e4538c2d docker_gwbridge bridge local
5fd928ea1e31 host host local
9v22k03pg9sl ingress overlay swarm
> docker network inspect consul
[
{
"Name": "consul",
"Id": "ypdmdyx2ulqt8l8glejfn2t25",
"Created": "2019-03-18T14:44:27.213690506-04:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.2.0/24",
"Gateway": "10.0.2.1"
}
]
},
"Internal": false,
"Attachable": true,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"0327c8e1bdd7ebb5a7871d16cf12df03240996f9e590509984783715a4c09193": {
"Name": "consul_consul.1.8v4bshotrco8fv3sclwx61106",
"EndpointID": "ae9d5ef1d19b67e297ebf40f6db410c33e4e3c0266c56e539e696be3ed4c81a5",
"MacAddress": "02:42:0a:00:02:03",
"IPv4Address": "10.0.2.3/24",
"IPv6Address": ""
},
"c21f5dfa93a2f43b747aedc64a343d94d6c1c2e6558d81bd4a52e2ba4b5fa90f": {
"Name": "proxy_proxy.sb6oindhmfukq4gcne6ynb2o2.4zvco02we58i3ulbyrsw1b2ok",
"EndpointID": "7596a208e0b05ba688f318814e24a2a1a3401765ed53ca421bf61c73e65c235a",
"MacAddress": "02:42:0a:00:02:06",
"IPv4Address": "10.0.2.6/24",
"IPv6Address": ""
},
"lb-consul": {
"Name": "consul-endpoint",
"EndpointID": "23e74716ef54f3fb6537b305176b790b4bc4132dda55f20588d7ce4ca71d7372",
"MacAddress": "02:42:0a:00:02:04",
"IPv4Address": "10.0.2.4/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4099"
},
"Labels": {},
"Peers": [
{
"Name": "e11b9bd30b31",
"IP": "10.8.0.1"
}
]
}
]
> cat consul/docker-compose.yml
version: '3.1'
services:
consul:
image: progrium/consul
command: -server -bootstrap
networks:
- consul
volumes:
- consul:/data
deploy:
labels:
- "traefik.enable=false"
networks:
consul:
external: true
> cat proxy/docker-compose.yml
version: '3.3'
services:
proxy:
image: traefik:alpine
command: -c /traefik.toml
networks:
# We need an external proxy network and the consul network
# - proxy
- consul
ports:
# Send HTTP and HTTPS traffic to the proxy service
- 80:80
- 443:443
configs:
- traefik.toml
volumes:
- /var/run/docker.sock:/var/run/docker.sock
deploy:
# Deploy the service to all nodes that match our constraints
mode: global
placement:
constraints:
- "node.role==manager"
- "node.labels.proxy==true"
labels:
# Traefik uses labels to configure routing to your services
# Change the domain to your own
- "traefik.frontend.rule=Host:proxy.mcwebsite.net"
# Route traffic to the web interface hosted on port 8080 in the container
- "traefik.port=8080"
# Name the backend (not required here)
- "traefik.backend=traefik"
# Manually set entrypoints (not required here)
- "traefik.frontend.entryPoints=http,https"
configs:
# Traefik configuration file
traefik.toml:
file: ./traefik.toml
# This service will be using two external networks
networks:
# proxy:
# external: true
consul:
external: true
There were two optional kernel configs CONFIG_IP_VS_PROTO_TCP and CONFIG_IP_VS_PROTO_UDP disabled in my kernel which, you guessed it, enable tcp and udp load balancing.
I wish I'd checked that about four hours sooner than I did.

host name not working with docker swarm mode

I am using docker version 18.06.1-ce and compose version 1.22.0.
As per docker, it should be possible to call services using service names. This is working for me with docker compose without swarm mode, but on swarm mode it is not working. I have even tried setting aliases in my compose but no result.
Below is my docker-compose.yml
version: "3"
networks:
my_network:
external:
name: new_network
services:
config-service:
image: com.test/config-service:0.0.1
deploy:
placement:
constraints: [node.role == manager]
resources:
limits:
memory: 1024M
reservations:
memory: 768M
restart_policy:
condition: on-failure
healthcheck:
test: ["CMD", "curl", "-f", "http://config-service:8888/health"]
interval: 5s
timeout: 3s
retries: 5
ports:
- 8888:8888
networks:
my_network:
aliases:
- config-service
eureka-service:
image: com.test/eureka-service:0.0.1
deploy:
placement:
constraints: [node.role == manager]
resources:
limits:
memory: 1536M
reservations:
memory: 1024M
restart_policy:
condition: on-failure
healthcheck:
test: ["CMD", "curl", "-I", "http://eureka-service:8761/health"]
interval: 5s
timeout: 3s
retries: 5
ports:
- 8761:8761
depends_on:
- config-service
networks:
my_network:
aliases:
- eureka-service
When I inspect into my network I found
[
{
"Name": "new_network",
"Id": "s2m7yq7tz4996w7eg229l59nf",
"Created": "2018-08-30T13:58:59.75070753Z",
"Scope": "swarm",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.20.0.0/16",
"Gateway": "172.20.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"355efe27067ee20868455dabbedd859b354d50fb957dcef4262eac6f25d10686": {
"Name": "test_eureka-service.1.a4pjb3ntez9ly5zhu020h0tva",
"EndpointID": "50998abdb4cd2cd2f747fadd82be495150919531b81a3d6fb07251a940ef2749",
"MacAddress": "02:42:ac:14:00:02",
"IPv4Address": "172.20.0.2/16",
"IPv6Address": ""
},
"5cdb398c598c1cea6b9032d4c696fd1581e88f0644896edd958ef59895b698a4": {
"Name": "test_config-service.1.se8ajr73ajnjhvxt3rq31xzlm",
"EndpointID": "5b3c41a8df0054e1c115d93c32ca52220e2934b6f763f588452c38e60c067054",
"MacAddress": "02:42:ac:14:00:03",
"IPv4Address": "172.20.0.3/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
Now if I connect into containers terminal and ping using the long name 'test_config-service.1.se8ajr73ajnjhvxt3rq31xzlm' it is able to ping but not 'config-service'.
I believe the issue you are experiencing is because you are using a swarm scoped bridge network, instead of an overlay network. I'm not sure if this configuration is supported. The DNS entry for the service when deployed in swarm mode is at the service level, not the individual containers. From my testing, that DNS entry, along with the code to setup a VIP, only appear to work with overlay networks. You may want to follow this issue if you really need your network to be configured as a bridge: https://github.com/moby/moby/issues/37672
Otherwise, the easy fix is to replace your network with an overlay network. You can remove your network aliases since they are redundant. And if you have other containers on the host that need to also be on this network, from outside of swarm mode, be sure to configure your overlay network as "attachable". If you have other applications currently attached to the network, you can replace that with a new network, or if you need to keep the same network name, swap it out in two phases:
# create a temporary network to free up the new_network name
docker network create -d overlay --attachable temp_network
docker network connect temp_network $container_id # repeat for each container
# finish the above step for all containers before continuing
docker network disconnect new_network $container_id #repeat for each container
# remove the old bridge network
docker network rm new_network
# now create a new_network as overlay
docker network create -d overlay --attachable new_network
docker network connect new_network $container_id # repeat for each container
# finish the above step for all containers before continuing
docker network disconnect temp_network $container_id #repeat for each container
# cleanup the temporary network
docker network rm temp_network
If everything is running in swarm mode, then there's no need for --attachable. After that, you should be able to start your swarm mode stack.
Try to list your services with a docker service ls command. Because if you use stack and give a name to your stack the service name will be nameofstack_config-service
And I see in your inspect test_eureka-service.1xxxxxx so the service name should be test_eureka-service
This is a known issue with version 18.06:
https://github.com/docker/for-win/issues/2327
https://github.com/docker/for-linux/issues/375
Try 18.03

Resources