How does service discovery work with modern docker/docker-compose?

I'm using Docker 1.11.1 and docker-compose 1.8.0-rc2.
In the good old days (so, last year), you could set up a docker-compose.yml file like this:
app:
  image: myapp
frontend:
  image: myfrontend
  links:
    - app
And then start up the environment like this:
docker-compose scale app=3 frontend=1
And your frontend container could inspect the environment variables
for variables named APP_1_PORT, APP_2_PORT, etc to discover the
available backend hosts and configure itself accordingly.
Times have changed. Now, we do this...
version: '2'
services:
  app:
    image: myapp
  frontend:
    image: myfrontend
    links:
      - app
...and instead of environment variables, we get DNS. So inside the
frontend container, I can ask for app_app_1 or app_app_2 or
app_app_3 and get the corresponding ip address. I can also ask for
app and get the address of app_app_1.
But how do I discover all of the available backend containers? I
guess I could loop over getent hosts ... until it fails:
counter=1
while :; do
    getent hosts app_$counter || break
    backends="$backends app_$counter"
    let counter++
done
But that seems ugly and fragile.
I've heard rumors about round-robin dns, but (a) that doesn't seem to
be happening in my test environment, and (b) that doesn't necessarily
help if your frontend needs simultaneous connections to the backends.
How is simple container and service discovery meant to work in the
modern Docker world?

Docker's built-in Nameserver & Loadbalancer
Docker comes with a built-in nameserver. By default, it is reachable at 127.0.0.11:53.
Every container has a nameserver entry in /etc/resolv.conf by default, so you do not need to specify the nameserver address from within the container. That is why you can find your service from within the network by its service name or by task_service_n.
If you query task_service_n, you get the address of the corresponding service replica.
If you query only the service name, Docker performs internal load balancing between containers in the same network and external load balancing to handle requests from outside.
When Swarm is used, Docker additionally uses two special networks.
The ingress network, which is an overlay network that handles incoming traffic to the swarm. It allows any service to be queried from any node in the swarm.
The docker_gwbridge, a bridge network that connects the overlay networks (including ingress) of the individual hosts to their physical network.
When using Swarm to deploy services, the behavior described in the examples below will not work unless endpoint_mode is set to dnsrr (DNS round-robin) instead of vip.
endpoint_mode: vip - Docker assigns the service a virtual IP (VIP) that acts as the front end for clients to reach the service on a network. Docker routes requests between the client and available worker nodes for the service, without client knowledge of how many nodes are participating in the service or their IP addresses or ports. (This is the default.)
endpoint_mode: dnsrr - DNS round-robin (DNSRR) service discovery does not use a single virtual IP. Docker sets up DNS entries for the service such that a DNS query for the service name returns a list of IP addresses, and the client connects directly to one of these. DNS round-robin is useful in cases where you want to use your own load balancer, or for Hybrid Windows and Linux applications.
Example
For example deploy three replicas from dig/docker-compose.yml
version: '3.8'
services:
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
DNS Lookup
You can use tools such as dig or nslookup to do a DNS lookup against the nameserver in the same network.
docker run --rm --network dig_default tutum/dnsutils dig whoami
; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> whoami
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58433
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;whoami. IN A
;; ANSWER SECTION:
whoami. 600 IN A 172.28.0.3
whoami. 600 IN A 172.28.0.2
whoami. 600 IN A 172.28.0.4
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Mon Nov 16 22:36:37 UTC 2020
;; MSG SIZE rcvd: 90
If you are only interested in the IP, you can provide the +short option
docker run --rm --network dig_default tutum/dnsutils dig +short whoami
172.28.0.3
172.28.0.4
172.28.0.2
Or look up a specific replica:
docker run --rm --network dig_default tutum/dnsutils dig +short dig_whoami_2
172.28.0.4
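The same lookup can be done from application code instead of dig. The following is a minimal Python sketch, assuming the code runs in a container attached to the same network and that the service name returns one A record per replica (as in the Compose example above). The function name resolve_all and the injectable lookup parameter are illustrative choices, not part of Docker.

```python
import socket


def resolve_all(name, port=80, lookup=socket.getaddrinfo):
    """Return every distinct IPv4 address that `name` resolves to."""
    try:
        infos = lookup(name, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # The name does not resolve (for example, the service is not running).
        return []
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})


# Example call, inside a container on the dig_default network:
#   backends = resolve_all("whoami")
```

This replaces the getent loop from the question with a single query, so the frontend does not need to guess container names or count upwards until a lookup fails.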
Load balancing
The default load balancing happens on the transport layer (layer 4 of the OSI model), so it is TCP/UDP based. That means it is not possible to inspect and manipulate HTTP headers with this method. In the enterprise edition it is apparently possible to use labels similar to the ones Traefik uses in the example a bit further down.
docker run --rm --network dig_default curlimages/curl -Ls http://whoami
Hostname: eedc94d45bf4
IP: 127.0.0.1
IP: 172.28.0.3
RemoteAddr: 172.28.0.5:43910
GET / HTTP/1.1
Host: whoami
User-Agent: curl/7.73.0-DEV
Accept: */*
Here are the hostnames returned by 10 consecutive curl requests:
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: eedc94d45bf4
Hostname: d922d86eccc6
Hostname: d922d86eccc6
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: d922d86eccc6
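To see how the requests were spread across the replicas, the Hostname lines can be tallied. This is a small Python sketch that counts the ten responses listed above; the helper name count_backends is hypothetical.

```python
from collections import Counter

# The ten "Hostname:" lines from the curl runs above.
SAMPLE = """\
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: eedc94d45bf4
Hostname: d922d86eccc6
Hostname: d922d86eccc6
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: d922d86eccc6
"""


def count_backends(curl_output):
    """Count how often each backend answered, based on 'Hostname:' lines."""
    hosts = [
        line.split(":", 1)[1].strip()
        for line in curl_output.splitlines()
        if line.startswith("Hostname:")
    ]
    return Counter(hosts)


print(count_backends(SAMPLE))
```

For this sample the three replicas answered 3, 4, and 3 times, so all of them received traffic even though the order is not strictly rotating.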
Health Checks
Health checks, by default, are done by checking the process id (PID) of the container on the host kernel. If the process is running successfully, the container is considered healthy.
Oftentimes other health checks are required. The container may be running but the application inside has crashed. In many cases a TCP or HTTP check is preferred.
It is possible to bake custom health checks into images, for example using curl to perform L7 health checks.
FROM traefik/whoami
HEALTHCHECK CMD curl --fail http://localhost || exit 1
It is also possible to specify the health check via the CLI when starting the container.
docker run \
    --health-cmd "curl --fail http://localhost || exit 1" \
    --health-interval=5s \
    --health-timeout=3s \
    traefik/whoami
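If an image ships a Python interpreter but no curl, the same kind of HTTP check could be written as a small script. The sketch below is only an illustration: the function name is_healthy, the default timeout, and the injectable opener parameter are assumptions made for this example.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all end up here
        # (HTTPError is a subclass of URLError).
        return False


# As a health command, the script would exit 0 when healthy and 1 otherwise:
#   import sys; sys.exit(0 if is_healthy("http://localhost") else 1)
```

Docker treats exit code 0 from the health command as healthy and exit code 1 as unhealthy, which is the same contract that the curl --fail ... || exit 1 idiom relies on.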
Example with Swarm
As mentioned initially, Swarm's behavior differs in that it assigns a virtual IP to services by default. Strictly speaking it is not different: plain docker or docker-compose does not create real services. It only imitates Swarm's behavior while still running the containers normally, because services can only be created by manager nodes.
Keep in mind that we are on a swarm manager, so the default mode is VIP.
Create an overlay network that can be used by regular containers too:
$ docker network create --driver overlay --attachable testnet
Create a service with 2 replicas:
$ docker service create --network testnet --replicas 2 --name digme nginx
Now let's use dig again, making sure we attach the container to the same network:
$ docker run --network testnet --rm tutum/dnsutils dig digme
digme. 600 IN A 10.0.18.6
We see that indeed we only got one IP address back, so it appears that this is the virtual IP that has been assigned by docker.
Swarm actually lets us get the individual IPs in this case without explicitly setting the endpoint mode.
We can query tasks.<servicename>, in this case tasks.digme:
$ docker run --network testnet --rm tutum/dnsutils dig tasks.digme
tasks.digme. 600 IN A 10.0.18.7
tasks.digme. 600 IN A 10.0.18.8
This returns 2 A records pointing to the individual replicas.
Now let's create another service with endpoint_mode set to dnsrr:
$ docker service create --endpoint-mode dnsrr --network testnet --replicas 2 --name digme2 nginx
$ docker run --network testnet --rm tutum/dnsutils dig digme2
digme2. 600 IN A 10.0.18.21
digme2. 600 IN A 10.0.18.20
This way we get both IPs without adding the prefix tasks.
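The two lookups shown above can be summarized as a small rule: with the default vip mode the per-task addresses are behind tasks.<servicename>, while with dnsrr the plain service name already returns them. A minimal Python sketch of that rule follows; the function name discovery_name is hypothetical.

```python
def discovery_name(service, endpoint_mode="vip"):
    """Return the DNS name that yields one A record per task in Swarm."""
    if endpoint_mode == "vip":
        # The plain name resolves to the virtual IP, so ask for the tasks.
        return "tasks." + service
    if endpoint_mode == "dnsrr":
        # The plain name already returns every task address.
        return service
    raise ValueError("unknown endpoint mode: %r" % (endpoint_mode,))


print(discovery_name("digme"))            # tasks.digme
print(discovery_name("digme2", "dnsrr"))  # digme2
```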
Service Discovery & Loadbalancing Strategies
If the built-in features are not sufficient, some strategies can be implemented to achieve better control. Below are some examples.
HAProxy
HAProxy can use the Docker nameserver in combination with dynamic server templates to discover the running containers. The traditional proxy features can then be leveraged to achieve powerful layer 7 load balancing with HTTP header manipulation and resilience features such as retries.
version: '3.8'
services:
  loadbalancer:
    image: haproxy
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - 80:80
      - 443:443
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
...
resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s
...
backend whoami
    balance leastconn
    option httpchk
    option redispatch 1
    retry-on all-retryable-errors
    retries 2
    http-request disable-l7-retry if METH_POST
    dynamic-cookie-key MY_SERVICES_HASHED_ADDRESS
    cookie MY_SERVICES_HASHED_ADDRESS insert dynamic
    server-template whoami- 6 whoami:80 check resolvers docker init-addr libc,none
...
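As a rough mental model, server-template whoami- 6 reserves six named server slots (whoami-1 to whoami-6) and fills them with whatever addresses the resolver returns; slots without an address stay out of rotation. The Python sketch below only illustrates that idea under simplified assumptions. It is not HAProxy's actual implementation, and fill_server_slots is a hypothetical name.

```python
def fill_server_slots(prefix, slots, addresses):
    """Map resolved addresses onto a fixed number of named server slots.

    Simplified model: a slot without an address is reported as None,
    standing in for a server that is kept out of rotation.
    """
    names = [f"{prefix}{i}" for i in range(1, slots + 1)]
    used = list(addresses)[:slots]  # replicas beyond the slot count are ignored
    padded = used + [None] * (slots - len(used))
    return dict(zip(names, padded))


for name, address in fill_server_slots(
    "whoami-", 6, ["172.28.0.2", "172.28.0.3", "172.28.0.4"]
).items():
    print(name, address)
```

The model also shows the limitation of this approach: the slot count is an upper bound chosen in advance, so scaling beyond six replicas would require changing the configuration.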
Traefik
The previous method is already pretty decent. However, you may have noticed that it requires knowing which services should be discovered, and the number of replicas to discover is hard-coded. Traefik, a container-native edge router, solves both problems. As long as we enable Traefik via a label, the service will be discovered. This decentralizes the configuration: it is as if each service registers itself.
The labels can also be used to inspect and manipulate HTTP headers.
version: "3.8"
services:
  traefik:
    image: "traefik:v2.3"
    command:
      - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
  whoami:
    image: "traefik/whoami"
    labels:
      - "traefik.enable=true"
      - "traefik.http.services.whoami.loadbalancer.server.port=80"
      - "traefik.http.routers.whoami.entrypoints=web"
      - "traefik.http.routers.whoami.rule=PathPrefix(`/`)"
      - "traefik.http.services.whoami.loadbalancer.sticky=true"
      - "traefik.http.services.whoami.loadbalancer.sticky.cookie.name=MY_SERVICE_ADDRESS"
    deploy:
      replicas: 3
Consul
Consul is a tool for service discovery and configuration management. Services have to be registered via API requests. It is a more complex solution that probably only makes sense in bigger clusters, but it can be very powerful. It is usually recommended to run it on bare metal rather than in a container. You could install it alongside the Docker host on each server in your cluster.
In this example it is paired with the registrator image, which takes care of registering the Docker services in Consul's catalog.
The catalog can be leveraged in many ways. One of them is to use consul-template.
Note that Consul comes with its own DNS resolver, so in this setup the Docker DNS resolver plays only a minor role.
version: '3.8'
services:
  consul:
    image: gliderlabs/consul-server:latest
    command: "-advertise=${MYHOST} -server -bootstrap"
    container_name: consul
    hostname: ${MYHOST}
    ports:
      - 8500:8500
  registrator:
    image: gliderlabs/registrator:latest
    command: "-ip ${MYHOST} consul://${MYHOST}:8500"
    container_name: registrator
    hostname: ${MYHOST}
    depends_on:
      - consul
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock
  proxy:
    build: .
    ports:
      - 80:80
    depends_on:
      - consul
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
    ports:
      - "80"
Dockerfile for a custom proxy image with consul-template baked in:
FROM nginx
RUN curl https://releases.hashicorp.com/consul-template/0.25.1/consul-template_0.25.1_linux_amd64.tgz \
> consul-template_0.25.1_linux_amd64.tgz
RUN gunzip -c consul-template_0.25.1_linux_amd64.tgz | tar xvf -
RUN mv consul-template /usr/sbin/consul-template
RUN rm /etc/nginx/conf.d/default.conf
ADD proxy.conf.ctmpl /etc/nginx/conf.d/
ADD consul-template.hcl /
CMD [ "/bin/bash", "-c", "/etc/init.d/nginx start && consul-template -config=consul-template.hcl" ]
Consul-template takes a template file and renders it according to the content of Consul's catalog.
upstream whoami {
    {{ range service "whoami" }}
    server {{ .Address }}:{{ .Port }};
    {{ end }}
}
server {
    listen 80;
    location / {
        proxy_pass http://whoami;
    }
}
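To make the effect of the range loop concrete, here is a small Python sketch that renders the same upstream block from a list of (address, port) pairs. It only mimics what consul-template would produce for the template above; the function name render_upstream and the sample addresses and ports are made up for illustration.

```python
def render_upstream(name, services):
    """Render an nginx upstream block from (address, port) pairs."""
    lines = [f"upstream {name} {{"]
    for address, port in services:
        lines.append(f"    server {address}:{port};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical catalog entries for two registered whoami instances.
print(render_upstream("whoami", [("10.0.0.5", 32768), ("10.0.0.6", 32769)]))
```

Each registered instance becomes one server line, so the proxy configuration follows the catalog without anyone editing it by hand.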
After the template has been rendered again, the reload command is executed.
consul {
  address = "consul:8500"
  retry {
    enabled = true
    attempts = 12
    backoff = "250ms"
  }
}
template {
  source = "/etc/nginx/conf.d/proxy.conf.ctmpl"
  destination = "/etc/nginx/conf.d/proxy.conf"
  perms = 0600
  command = "/etc/init.d/nginx reload"
  command_timeout = "60s"
}
Feature Table
| Feature | Built In | HAProxy | Traefik | Consul-Template |
|---|---|---|---|---|
| Resolver | Docker | Docker | Docker | Consul |
| Service Discovery | Automatic | Server Templates | Label System | KV Store + Template |
| Health Checks | Yes | Yes | Yes | Yes |
| Load Balancing | L4 | L4, L7 | L4, L7 | L4, L7 |
| Sticky Session | No | Yes | Yes | Depends on proxy |
| Metrics | No | Stats Page | Dashboard | Dashboard |
You can view some of the code samples in more detail on github.

Related

How to properly configure HAProxy in Docker Swarm to automatically route traffic to replicated services (via SSL)?

I'm trying to deploy a Docker Swarm of three host nodes with a single replicated service and put an HAProxy in front of it. I want the clients to be able to connect via SSL.
My docker-compose.yml:
version: '3.9'
services:
  proxy:
    image: haproxy
    ports:
      - 443:8080
    volumes:
      - haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    deploy:
      placement:
        constraints: [node.role == manager]
    networks:
      - servers-network
  node-server:
    image: glusk/hackathon-2021:latest
    ports:
      - 8080:8080
    command: npm run server
    deploy:
      mode: replicated
      replicas: 2
    networks:
      - servers-network
networks:
  servers-network:
    driver: overlay
My haproxy.cfg (based on the official example):
# Simple configuration for an HTTP proxy listening on port 80 on all
# interfaces and forwarding requests to a single backend "servers" with a
# single server "server1" listening on 127.0.0.1:8000
global
    daemon
    maxconn 256
defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
frontend http-in
    bind *:80
    default_backend servers
backend servers
    server server1 127.0.0.1:8000 maxconn 32
My hosts are Lightsail VPS Ubuntu instances and share the same private network.
node-server runs each https server task inside its own container on 0.0.0.0:8080.
The way I'm trying to make this work at the moment is to ssh into the manager node (which also has a static and public IP), copy over my configuration files from above, and run:
docker stack deploy --compose-file=docker-compose.yml hackathon-2021
but it doesn't work.

Well, first of all, regarding SSL (since it's the first thing you mention): you need to configure it with the certificate and listen on port 443, not port 80.
With that modification, your proxy configuration would change to:
global
    daemon
    maxconn 256
defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
frontend http-in
    bind *:80
    default_backend servers
frontend https-in
    bind *:443 ssl crt /etc/ssl/certs/hackaton2021.pem
    default_backend servers
That would be a really simplified configuration for allowing SSL connections.
Now, let's move on to accessing the different services.
First of all, you cannot reach the service on localhost, and you shouldn't even publish the service's ports on the host. The reason is that those applications are already in the same network as HAProxy, so the ideal is to take advantage of the Docker DNS to reach them directly.
To do this, we first need to be able to resolve the service names. For that, add the following section to your configuration:
resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s
The Docker Swarm DNS service is always available at 127.0.0.11.
Now, to your existing configuration, we add the servers using service-name discovery:
backend servers
    balance roundrobin
    server-template node- 2 node-server:8080 check resolvers docker init-addr libc,none
What this does is create a server for each of the containers discovered in the Swarm for the node-server service (that is, the replicas), naming each one with the prefix node-.
Basically, that is equivalent to looking up the actual IPs of each replica and adding them as individual server lines.
The deployment also has some errors: we are not interested in publishing the node-server ports on the host, only in creating the two replicas and letting HAProxy handle the networking.
For that, we should use the following Docker Compose file:
version: '3.9'
services:
  proxy:
    image: haproxy
    ports:
      - 80:80
      - 443:443
    volumes:
      - ./hackaton2021.pem:/etc/ssl/certs/hackaton2021.pem
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    deploy:
      placement:
        constraints: [node.role == manager]
  node-server:
    image: glusk/hackathon-2021:latest
    command: npm run server
    deploy:
      mode: replicated
      replicas: 2
Remember to copy your haproxy.cfg and the self-signed (or real) certificate for your application to the instance before deploying the Stack.
Also, when you create the stack, it automatically creates a network named <STACK_NAME>_default, so you don't need to define a network just to connect both services.

How to deploy elasticsearch with docker swarm?

I created 3 virtual machines using docker-machine:
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
cluster - virtualbox Running tcp://192.168.99.101:2376 v18.09.5
cluster2 - virtualbox Running tcp://192.168.99.102:2376 v18.09.5
master - virtualbox Running tcp://192.168.99.100:2376 v18.09.5
and then I created a Docker swarm on the master machine:
docker-machine ssh master "docker swarm init --advertise-addr 192.168.99.100"
and joined cluster and cluster2 to it:
docker-machine ssh cluster "docker swarm join --advertise-addr 192.168.99.101 --token xxxx 192.168.99.100:2377"
docker-machine ssh cluster2 "docker swarm join --advertise-addr 192.168.99.102 --token xxxx 192.168.99.100:2377"
The docker node ls output:
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
r4a6y9wie4zp3pl4wi4e6wqp8 cluster Ready Active 18.09.5
sg9gq6s3k6vty7qap7co6eppn cluster2 Ready Active 18.09.5
xb6telu8cn3bfmume1kcektkt * master Ready Active Leader 18.09.5
Here is the deploy config, swarm.yml:
version: "3.3"
services:
  elasticsearch:
    image: elasticsearch:7.0.0
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      - cluster.name=elk
      - network.host=_eth1:ipv4_
      - network.bind_host=_eth1:ipv4_
      - network.publish_host=_eth1:ipv4_
      - discovery.seed_hosts=192.168.99.100,192.168.99.101
      - cluster.initial_master_nodes=192.168.99.100,192.168.99.101
      - bootstrap.memory_lock=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    networks:
      - backend
    deploy:
      mode: replicated
      replicas: 3
      #endpoint_mode: dnsrr
      restart_policy:
        condition: none
      resources:
        limits:
          cpus: "1.0"
          memory: "1024M"
        reservations:
          memory: 20M
networks:
  backend:
    # driver: overlay
    # attachable: true
I pulled the elasticsearch image onto each virtual machine:
docker-machine ssh master "docker image pull elasticsearch:7.0.0"
docker-machine ssh cluster "docker image pull elasticsearch:7.0.0"
docker-machine ssh cluster2 "docker image pull elasticsearch:7.0.0"
Before deploying, I ran this command to fix some Elasticsearch bootstrap errors:
docker-machine ssh master "sudo sysctl -w vm.max_map_count=262144"
docker-machine ssh cluster "sudo sysctl -w vm.max_map_count=262144"
docker-machine ssh cluster2 "sudo sysctl -w vm.max_map_count=262144"
and then I ran docker stack deploy -c swarm.yml es, but the Elasticsearch cluster does not work.
docker-machine ssh master
docker service logs es_elasticsearch -f
show:
es_elasticsearch.1.uh1x0s9qr7mb#cluster | {"type": "server", "timestamp": "2019-04-25T16:28:47,143+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elk", "node.name": "e8dba5562417", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [192.168.99.100, 192.168.99.101] to bootstrap a cluster: have discovered []; discovery will continue using [192.168.99.100:9300, 192.168.99.101:9300] from hosts providers and [{e8dba5562417}{Jy3t0AAkSW-jY-IygOCjOQ}{z7MYIf5wTfOhCX1r25wNPg}{10.255.0.46}{10.255.0.46:9300}{ml.machine_memory=1037410304, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
es_elasticsearch.2.swswlwmle9e9#cluster2 | {"type": "server", "timestamp": "2019-04-25T16:28:47,389+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elk", "node.name": "af5d88a04b42", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [192.168.99.100, 192.168.99.101] to bootstrap a cluster: have discovered []; discovery will continue using [192.168.99.100:9300, 192.168.99.101:9300] from hosts providers and [{af5d88a04b42}{zhxMeNMAQN2evKDlsA33qA}{fpYPTvJ6STmyqrgxlMkD_w}{10.255.0.47}{10.255.0.47:9300}{ml.machine_memory=1037410304, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
es_elasticsearch.3.x8ouukovhh80#master | {"type": "server", "timestamp": "2019-04-25T16:28:48,818+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elk", "node.name": "0e7e4d96b31a", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [192.168.99.100, 192.168.99.101] to bootstrap a cluster: have discovered []; discovery will continue using [192.168.99.100:9300, 192.168.99.101:9300] from hosts providers and [{0e7e4d96b31a}{Xs9966RjTEWvEbuj4-ySYA}{-eV4lvavSHq6JhoW0qWu6A}{10.255.0.48}{10.255.0.48:9300}{ml.machine_memory=1037410304, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
I guess the cluster formation failed because of a network configuration error. I don't know how to fix it; I have tried modifying the config many times, and it fails every time.

Try this docker-compose.yml, it is working :)
version: "3.7"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    hostname: "{{.Node.Hostname}}"
    environment:
      - node.name={{.Node.Hostname}}
      - cluster.name=my-cluster
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
      - discovery.seed_hosts=elasticsearch
      - cluster.initial_master_nodes=node1,node2,node3
      - node.ml=false
      - xpack.ml.enabled=false
      - xpack.monitoring.enabled=false
      - xpack.security.enabled=false
      - xpack.watcher.enabled=false
      - bootstrap.memory_lock=false
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          memory: 4G
  nginx:
    image: nginx:1.17.1-alpine
    ports:
      - 9200:9200
    deploy:
      mode: global
    command: |
      /bin/sh -c "echo '
      user nobody nogroup;
      worker_processes auto;
      events {
        worker_connections 1024;
      }
      http {
        client_max_body_size 4g;
        resolver 127.0.0.11 ipv6=off;
        server {
          listen *:9200;
          location / {
            proxy_set_header Connection keep-alive;
            set $$url http://elasticsearch:9200;
            proxy_pass $$url;
            proxy_set_header Host $$http_host;
            proxy_set_header X-Real-IP $$remote_addr;
            proxy_set_header X-Forwarded-For $$proxy_add_x_forwarded_for;
          }
        }
      }' | tee /etc/nginx/nginx.conf && nginx -t && nginx -g 'daemon off;'"
volumes:
  elasticsearch-data:

Trying to manually specify all the IPs and bindings is tricky because of the Swarm overlay network.
Instead, simply make your ES nodes discoverable and let Swarm take care of the node discovery and communication. To make them discoverable, we can use a predictable name like the Swarm node hostname.
Try changing the environment settings in your swarm.yml file as follows:
environment:
  - network.host=0.0.0.0
  - discovery.seed_hosts=elasticsearch #Service name, to let Swarm handle discovery
  - cluster.initial_master_nodes=master,cluster,cluster2 #Swarm nodes host names
  - node.name={{.Node.Hostname}} #To create a predictable node name
This of course assumes that we already know the Swarm hostnames, which you listed above. Without knowing these values, we would have no way of having a predictable set of node names to look for. In that case, you could create one ES node entry with a particular node name, and then another entry which references the first entry's node name in cluster.initial_master_nodes.

Use dnsrr mode without ports. Expose elasticsearch with nginx ;)
See my docker-compose.yml

In my experience https://github.com/shazChaudhry/docker-elastic works perfectly, and just one file from the entire repo is enough. I downloaded https://github.com/shazChaudhry/docker-elastic/blob/master/docker-compose.yml and removed the logstash bits, I didn't need that. Then added the following to .bashrc
export ELASTICSEARCH_HOST=$(hostname)
export ELASTICSEARCH_PASSWORD=foobar
export ELASTICSEARCH_USERNAME=elastic
export ELASTIC_VERSION=7.4.2
export INITIAL_MASTER_NODES=$ELASTICSEARCH_HOST
And docker stack deploy --compose-file docker-compose.yml elastic works.

Ideas I gleaned from Ahmet Vehbi Olgaç's docker-compose.yml, which worked for me:
1. Use deploy / mode: global. This will cause the swarm to deploy one replica to each swarm worker, for each node that is configured like this.
2. Use deploy / endpoint_mode: dnsrr. This will let all containers in the swarm access the nodes by the service name.
3. Use hostname: {{.Node.Hostname}} or a similar template-based expression. This ensures a unique name for each deployed container.
4. Use environment / node.name={{.Node.Hostname}}. Again, you can vary the pattern. The point is that each ES node should get a unique name.
5. Use cluster.initial_master_nodes=*hostname1*,*hostname2*,.... This assumes you know the hostnames of your Docker worker machines. Use whatever pattern you used in #3, but substitute the whole hostname, and include all the hostnames.
If you don't know your hostnames, you can do what Andrew Cachia's answer suggests: set up one container (do not replicate it) to act solely as the master seed and give it a predictable hostname, then have all other nodes refer to that node as the master seed. However, this introduces a single point of failure.
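The discovery-related settings from the points above can be generated from the list of Swarm node hostnames. This Python sketch only builds the strings for the environment section; the function name elasticsearch_environment is hypothetical, and the hostnames in the example are the ones from the question.

```python
def elasticsearch_environment(service_name, node_hostnames):
    """Build the discovery-related environment entries for the ES service."""
    return [
        # Docker expands this template per node, giving each ES node a unique name.
        "node.name={{.Node.Hostname}}",
        # The service name resolves to all replicas when endpoint_mode is dnsrr.
        f"discovery.seed_hosts={service_name}",
        # Must list the node names, i.e. the Swarm node hostnames.
        "cluster.initial_master_nodes=" + ",".join(node_hostnames),
    ]


for entry in elasticsearch_environment("elasticsearch", ["master", "cluster", "cluster2"]):
    print("- " + entry)
```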

Elasticsearch 8.5.0 answer.
For my needs, I didn't want to add a reverse-proxy/load balancer, but I do want to expose port 9200 on the swarm nodes where Elasticsearch replicas are running (using just swarm), so that external clients can access the Elasticsearch REST API. So I used endpoint mode dnsrr (ref) and exposed port 9200 on the hosts where the replicas run.
If you don't need to expose port 9200 (i.e., nothing will connect to the elasticsearch replicas outside of swarm), remove the ports: config from the elasticsearch service.
I also only want elasticsearch replicas to run on a subset of my swarm nodes (3 of them). I created docker node label elasticsearch on those three nodes. Then mode: global and constraint node.labels.elasticsearch==True will ensure 1 replica runs on each of those nodes.
I run kibana on one of those 3 nodes too: swarm can pick which one, since port 5601 is exposed on swarm's ingress overlay network.
Lines you'll likely need to edit are marked with ######.
# docker network create -d overlay --attachable elastic-net
# cat elastic-stack-env
#!/bin/bash
export STACK_VERSION=8.5.0 # Elasticsearch and Kibana version
export ES_PORT=9200 # port to expose Elasticsearch HTTP API to the host
export KIBANA_PORT=5601 # port to expose Kibana to the host
read -p "Enter elastic user password: " ELASTIC_PASSWORD
read -p "Enter kibana_system user password: " KIBANA_PASSWORD
export KIBANA_URL=https://kibana.my-domain.com:$KIBANA_PORT #######
export SHARED_DIR=/some/nfs/or/shared/storage/elastic #######
export KIBANA_SSL_KEY_PATH=config/certs/kibana.key
export KIBANA_SSL_CERT_PATH=config/certs/kibana.crt
export ELASTIC_NODES=swarm_node1,swarm_node2,swarm_node3 #######
# ELASTIC_NODES must match what docker reports from {{.Node.Hostname}}
export KIBANA_SSL_CERT_AUTH_PATH=config/certs/My_Root_CA.crt #######
export CLUSTER_NAME=docker-cluster
export MEM_LIMIT=4294967296 # 4 GB; increase or decrease based on the available host memory (in bytes)
# cat elastic-stack.yml
version: "3.8"
services:
  elasticsearch:
    image: localhost:5000/elasticsearch:${STACK_VERSION:?} ####### I have a local registry
    deploy:
      endpoint_mode: dnsrr
      mode: global # but note constraints below
      placement:
        constraints:
          - node.labels.elasticsearch==True
      resources:
        limits:
          memory:
            ${MEM_LIMIT}
    dns: 127.0.0.11 # use docker DNS only (may not be required)
    networks:
      - elastic-net
    volumes:
      - ${SHARED_DIR:?}/certs:/usr/share/elasticsearch/config/certs
      - /path/to/some/local/storage/elasticsearch:/usr/share/elasticsearch/data
    ports: ##### remove if nothing outside of swarm needs to access port 9200
      - target: 9200
        published: ${ES_PORT} # we publish this port so that external clients can access the ES REST API
        protocol: tcp
        mode: host # required when using dnsrr
    environment: # https://www.elastic.co/guide/en/elasticsearch/reference/master/settings.html
      # https://www.elastic.co/guide/en/elasticsearch/reference/master/docker.html#docker-configuration-methods
      - node.name={{.Node.Hostname}} # see Andrew Cachia's answer
      - cluster.name=${CLUSTER_NAME}
      - discovery.seed_hosts=elasticsearch # use service name here, since (docker's) DNS is used:
      # https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#unicast.hosts
      - cluster.initial_master_nodes=${ELASTIC_NODES} # use node.names here
      # https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#initial_master_nodes
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/elasticsearch/elasticsearch.key
      - xpack.security.http.ssl.certificate=certs/elasticsearch/elasticsearch.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/elasticsearch/elasticsearch.key
      - xpack.security.transport.ssl.certificate=certs/elasticsearch/elasticsearch.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=basic
    healthcheck:
      test:
        [ "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120
    logging: # we use rsyslog
      driver: syslog
      options:
        syslog-facility: "local2"
  kibana:
    # this service depends on the setup service (defined below), but docker stack has no
    # way to specify dependencies, but more importantly, there's been a move away from this:
    # https://stackoverflow.com/a/47714157/215945
    image: localhost:5000/kibana:${STACK_VERSION:?} ######
    hostname: kibana
    deploy:
      placement:
        constraints:
          - node.labels.elasticsearch==True # run KB on any one of the ES nodes
      resources:
        limits:
          memory:
            ${MEM_LIMIT}
    dns: 127.0.0.11 # use docker DNS only (may not be required)
    networks:
      - elastic-net
    volumes:
      - ${SHARED_DIR:?}/kibana:/usr/share/kibana/data
      - ${SHARED_DIR:?}/certs:/usr/share/kibana/config/certs
    ports:
      - ${KIBANA_PORT}:5601
    environment: # https://www.elastic.co/guide/en/kibana/master/settings.html
      # https://www.elastic.co/guide/en/kibana/master/docker.html#environment-variable-config
      # CAPS_WITH_UNDERSCORES must be used with Kibana
      - SERVER_NAME=kibana
      - ELASTICSEARCH_HOSTS=["https://elasticsearch:9200"]
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
      - SERVER_PUBLICBASEURL=${KIBANA_URL}
      # if you don't want to use https/TLS with Kibana, comment-out
      # the next four lines
      - SERVER_SSL_ENABLED=true
      - SERVER_SSL_KEY=${KIBANA_SSL_KEY_PATH}
      - SERVER_SSL_CERTIFICATE=${KIBANA_SSL_CERT_PATH}
      - SERVER_SSL_CERTIFICATEAUTHORITIES=${KIBANA_SSL_CERT_AUTH_PATH}
      - TELEMETRY_OPTIN=false
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -sIk https://localhost:5601 | grep -q 'HTTP/1.1 302 Found'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120
    logging:
      driver: syslog
      options:
        syslog-facility: "local2"
  setup:
    image: localhost:5000/elasticsearch:${STACK_VERSION:?} #######
    deploy:
      placement:
        constraints:
          - node.labels.elasticsearch==True
      restart_policy: # https://docs.docker.com/compose/compose-file/compose-file-v3/#restart_policy
        condition: none
    volumes:
      - ${SHARED_DIR:?}/certs:/usr/share/elasticsearch/config/certs
    dns: 127.0.0.11 # use docker DNS only (may not be required)
    networks:
      - elastic-net
    command: >
      bash -c '
        until curl -s --cacert config/certs/ca/ca.crt https://elasticsearch:9200 | grep -q "missing authentication credentials"
        do
          echo "waiting 30 secs for Elasticsearch availability..."
          sleep 30
        done
        echo "setting kibana_system password"
        until curl -s -X POST --cacert config/certs/ca/ca.crt -u elastic:${ELASTIC_PASSWORD} -H "Content-Type: application/json" https://elasticsearch:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"
        do
          echo "waiting 10 secs before trying to set password again..."
          sleep 10
        done
        echo "done"
      '
    logging:
      driver: syslog
      options:
        syslog-facility: "local2"
networks:
  elastic-net:
    external: true
Deploy:
# . ./elastic-stack-env
# docker stack deploy -c elastic-stack.yml elastic
# # ... after Kibana comes up, you can remove the setup service if you want:
# docker service rm elastic_setup
Here's how I created the Elasticsearch CA and cert:
# cat elastic-certs.yml
version: "3.8"
services:
setup:
image: localhost:5000/elasticsearch:${STACK_VERSION:?} #######
volumes:
- ${SHARED_DIR:?}/certs:/usr/share/elasticsearch/config/certs
user: "0:0"
command: >
bash -c '
if [ ! -f config/certs/ca.zip ]; then
echo "Creating CA";
bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
unzip config/certs/ca.zip -d config/certs;
fi;
if [ ! -f config/certs/certs.zip ]; then
echo "Creating certs";
echo -ne \
"instances:\n"\
" - name: elasticsearch\n"\
" dns:\n"\
" - elasticsearch\n"\
" - localhost\n"\
" ip:\n"\
" - 127.0.0.1\n"\
> config/certs/instances.yml;
bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
unzip config/certs/certs.zip -d config/certs;
echo "Setting file permissions"
chown -R root:root config/certs;
find . -type d -exec chmod 750 \{\} \;;
find . -type f -exec chmod 640 \{\} \;;
fi;
sleep infinity
'
healthcheck:
test: ["CMD-SHELL", "[ -f config/certs/elasticsearch/elasticsearch.crt ]"]
interval: 1s
timeout: 5s
retries: 120
# . ./elastic-stack-env
# docker stack deploy -c elastic-certs.yml elastic-certs
# # ... ensure files are created under $SHARED_DIR/certs, then
# docker stack rm elastic-certs
How I created the Kibana cert is outside the scope of this question.
I run a Fluent Bit swarm service (mode: global, docker network elastic-net) to send logs to the elasticsearch service. Although outside the scope of this question, here's the salient config:
[OUTPUT]
name es
match <whatever is appropriate for you here>
host elasticsearch
port 9200
index my-index-default
http_user fluentbit
http_passwd ${FLUENTBIT_PASSWORD}
tls on
tls.ca_file /certs/ca/ca.crt
tls.crt_file /certs/elasticsearch/elasticsearch.crt
tls.key_file /certs/elasticsearch/elasticsearch.key
retry_limit false
suppress_type_name on
# trace_output on
Host elasticsearch will be resolved by docker's DNS server to the three IP addresses of the elasticsearch replicas, so there is no single point of failure.
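To check what the name actually resolves to from inside a container attached to elastic-net, a small helper along these lines may be useful. It is only a sketch: resolve_all is a hypothetical name, and it assumes a getent binary is present (the ahostsv4 database is a glibc feature, so the function falls back to getent hosts elsewhere). Keep in mind that a swarm service using the default VIP endpoint mode typically answers with one virtual IP for the service name, while tasks.<service> lists the individual task addresses.

```shell
#!/bin/sh
# Print every address a name resolves to, one per line.
# Assumption: getent exists; "ahostsv4" is glibc-only, so fall back to "hosts".
resolve_all() {
    { getent ahostsv4 "$1" 2>/dev/null || getent hosts "$1"; } \
        | awk '{ print $1 }' | sort -u
}
```

For example, the output of resolve_all elasticsearch and resolve_all tasks.elasticsearch can be compared to see whether clients get a virtual IP or the replica addresses.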

Pihole and Unbound in Docker Containers - Unbound Not Receiving Requests

I'm trying to run two Docker containers on a Raspberry Pi 3, one for Unbound and one for Pihole. The idea is that Pihole first blocks any unwanted requests and then uses Unbound as its upstream DNS server. I've been following Pihole's documentation to get this running, and I have both containers starting and Pihole working. However, when running docker exec pihole dig pi-hole.net @127.0.0.1 -p 5333 (or -p 5354) I get this response:
; <<>> DiG 9.10.3-P4-Debian <<>> pi-hole.net @127.0.0.1 -p 5354
;; global options: +cmd
;; connection timed out; no servers could be reached
I theorized this could be because the pihole container cannot communicate with the Unbound container through localhost, so I updated my docker-compose file to try to correct this using a bridge network. However, I still get the same error, no matter which ports I try. I'm new to Docker and Unbound, so this has been a bit of a dive in at the deep end! My docker-compose.yml and unbound.conf are below.
docker-compose.yml
version: "3.7"
services:
  unbound:
    cap_add:
      - NET_ADMIN
      - SYS_ADMIN
    container_name: unbound
    image: masnathan/unbound-arm
    ports:
      - 8953:8953/tcp
      - 5354:53/udp
      - 5354:53/tcp
      - 5333:5333/udp
      - 5333:5333/tcp
    volumes:
      - ./config/unbound.conf:/etc/unbound/unbound.conf
      - ./config/root.hints:/var/unbound/etc/root.hints
    restart: always
    networks:
      - unbound-pihole
  pihole:
    cap_add:
      - NET_ADMIN
      - SYS_ADMIN
    container_name: pihole
    image: pihole/pihole:latest
    ports:
      - 53:53/udp
      - 53:53/tcp
      - 67:67/udp
      - 80:80
      - 443:443
    volumes:
      - ./config/pihole/:/etc/pihole/
    environment:
      - ServerIP=10.0.0.20
      - TZ=UTC
      - WEBPASSWORD=random
      - DNS1=127.0.0.1#5333
      - DNS2=no
    restart: always
    networks:
      - unbound-pihole
networks:
  unbound-pihole:
    driver: bridge
unbound.conf
server:
# If no logfile is specified, syslog is used
# logfile: "/var/log/unbound/unbound.log"
verbosity: 0
port: 5333
do-ip4: yes
do-udp: yes
do-tcp: yes
# May be set to yes if you have IPv6 connectivity
do-ip6: no
# Use this only when you downloaded the list of primary root servers!
root-hints: "/var/unbound/etc/root.hints"
# Trust glue only if it is within the servers authority
harden-glue: yes
# Require DNSSEC data for trust-anchored zones, if such data is absent, the zone becomes BOGUS
harden-dnssec-stripped: yes
# Don't use capitalization randomization as it is known to cause DNSSEC issues sometimes
# see https://discourse.pi-hole.net/t/unbound-stubby-or-dnscrypt-proxy/9378 for further details
use-caps-for-id: no
# Reduce EDNS reassembly buffer size.
# Suggested by the unbound man page to reduce fragmentation reassembly problems
edns-buffer-size: 1472
# TTL bounds for cache
cache-min-ttl: 3600
cache-max-ttl: 86400
# Perform prefetching of close to expired message cache entries
# This only applies to domains that have been frequently queried
prefetch: yes
# One thread should be sufficient, can be increased on beefy machines
num-threads: 1
# Ensure kernel buffer is large enough to not lose messages in traffic spikes
so-rcvbuf: 1m
# Ensure privacy of local IP ranges
private-address: 192.168.0.0/16
private-address: 169.254.0.0/16
private-address: 172.16.0.0/12
private-address: 10.0.0.0/8
private-address: fd00::/8
private-address: fe80::/10
Thanks!

From the docs https://nlnetlabs.nl/documentation/unbound/unbound.conf/ under the access-control section:
By default only localhost is allowed, the rest is refused. The default is refused, because that is protocol-friendly. The DNS protocol is not designed to handle dropped packets due to policy, and dropping may result in (possibly excessive) retried queries.
By default, the Unbound server accepts queries from localhost only. In this case, that means a request can only be accepted from inside the Docker container running Unbound.
Therefore, to let the other containers in the docker-compose setup resolve through Unbound, add the following to unbound.conf:
server:
access-control: 0.0.0.0/0 allow
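Allowing 0.0.0.0/0 admits every client that can reach the port. As a narrower variant, the sketch below prints access-control lines for loopback plus whatever subnets are passed in. unbound_acl is a hypothetical helper name, and 172.20.0.0/16 is only a placeholder for the subnet of the unbound-pihole network; the access-control: <netblock> <action> form is the one documented for unbound.conf.

```shell
#!/bin/sh
# Emit a server: clause that allows loopback plus each subnet given as an argument.
unbound_acl() {
    printf 'server:\n'
    printf '    access-control: 127.0.0.0/8 allow\n'
    for subnet in "$@"; do
        printf '    access-control: %s allow\n' "$subnet"
    done
}

# Placeholder subnet (assumption); substitute the real subnet of the compose network.
unbound_acl 172.20.0.0/16
```

The real subnet can be read with docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}' <network>, where <network> is the name Compose gave the unbound-pihole network.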

Why does Traefik not proxy new services in a Docker Swarm?

I am trying to set up traefik with a Docker Swarm. I have two VMs: one manager node and one worker node.
In addition, I have created an external network with:
docker network create --driver=overlay proxy-net
I start traefik as a service within my manager-node with the following docker-compose.yml file:
version: '3'
services:
  traefik:
    image: traefik:v1.4.4
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - $PWD/management/traefik/traefik.toml:/etc/traefik/traefik.toml
    ports:
      - 80:80
      - 8100:8080
    deploy:
      placement:
        constraints:
          - node.role == manager
networks:
  default:
    external:
      name: proxy-net
My traefik.toml file looks like this:
Debug : "DEBUG"
defaultEntryPoints = ["http"]
[entryPoints]
[entryPoints.http]
address = ":80"
[web]
address = ":8080"
[docker]
watch = true
swarmmode = true
domain = "mydomain.com"
exposedbydefault = true
When I now start a new service (e.g. emilevauge/whoami) with:
docker service create \
--name whoami1 \
--publish mode=host,target=80,published=8002 \
--network proxy-net \
--label traefik.docker.network=proxy-net \
--label traefik.frontend.rule=Host:whoami.mydomain.com \
--label traefik.port=8002 \
emilevauge/whoami
The service is seen by the traefik web frontend, so at first everything looks fine. I can access the service directly on my worker node on port 8002.
But traefik does not seem to be able to proxy this service. When I browse my endpoint URL (whoami.mydomain.com) I get the answer:
Bad Gateway
The traefik logfile (logLevel=DEBUG) shows messages like this:
proxy_traefik.1.zl50yv6got5f@tocidoc001 time="2017-12-03T20:09:28Z" level=debug msg="Filtering container without port and no traefik.port label swarmpit_app.1 : strconv.Atoi: parsing "": invalid syntax"
proxy_traefik.1.zl50yv6got5f@tocidoc001 time="2017-12-03T20:09:28Z" level=debug msg="Filtering container without port and no traefik.port label proxy_traefik.1 : strconv.Atoi: parsing "": invalid syntax"
proxy_traefik.1.zl50yv6got5f@tocidoc001 time="2017-12-03T20:09:28Z" level=debug msg="Filtering container without port and no traefik.port label swarmpit_db.1 : strconv.Atoi: parsing "": invalid syntax"
proxy_traefik.1.zl50yv6got5f@tocidoc001 time="2017-12-03T20:09:28Z" level=debug msg="Validation of load balancer method for backend backend-whoami1-whoami1-whoami1 failed: invalid load-balancing method ''. Using default method wrr."
proxy_traefik.1.zl50yv6got5f@tocidoc001 time="2017-12-03T20:09:28Z" level=debug msg="Configuration received from provider docker: {"backends":{"backend-whoami1-whoami1-whoami1":{"servers":{"service-0":{"url":"http://10.0.1.5:8002","weight":0}},"loadBalancer":{"method":"wrr"}}},"frontends":{"frontend-whoami1-whoami1-whoami1":{"entryPoints":["http"],"backend":"backend-whoami1-whoami1-whoami1","routes":{"service-whoami1":{"rule":"Host:whoami.mydomain.com"}},"passHostHeader":true,"priority":0,"basicAuth":[],"headers":{}}}}"
I played around for several hours with different configurations. I also read the very concise documentation about traefik and docker-swarm, but I have no idea what I'm doing wrong.
Can anybody give me some tips on how to better understand the problem?

I think it is not working because your Træfik service is not on the same Docker network as your whoami1 service.
You should try adding the proxy-net network to your Træfik service in your compose file.
There is a warning in Træfik documentation at the end of this page https://docs.traefik.io/configuration/backends/docker/
when running inside a container, Træfik will need network access through:
docker network connect <network> <traefik-container>

As already mentioned, they need to be in the same overlay network which is not ingress. The ingress network is only for manager nodes.
Furthermore, your traefik service is not assigned to the proxy-net network. You declare proxy-net in the networks section of your traefik compose file, but you don't assign it to the service:
version: '3'
services:
  traefik:
    image: traefik:v1.4.4
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - $PWD/management/traefik/traefik.toml:/etc/traefik/traefik.toml
    ports:
      - 80:80
      - 8100:8080
    networks:
      - proxy-net
    deploy:
      placement:
        constraints:
          - node.role == manager
networks:
  proxy-net:
    driver: overlay
Furthermore, you should create a config with docker config create. Otherwise, with $PWD/management/traefik/traefik.toml you need to copy the traefik.toml file to every manager node.
Append this to your compose file:
configs:
  traefik_conf_v1:
    file: ./traefik.toml
and this to your traefik service:
    configs:
      - source: traefik_conf_v1
        target: /etc/traefik/traefik.toml
Now back to your problem.
What your service is missing is the backend label. Without it, traefik doesn't know where the service is running (network assignment isn't enough!).
docker service create \
--name whoami1 \
--publish mode=host,target=80,published=8002 \
--network proxy-net \
--label traefik.backend=whoami1 \
--label traefik.docker.network=proxy-net \
--label traefik.frontend.rule=Host:whoami.mydomain.com \
--label traefik.port=8002 \
emilevauge/whoami
This should work. And when it does, stop publishing the ports of your services. That makes everything complicated when you're in a hurry and need to scale. Remember, load balancing is handled by the swarm itself.
And yeah, dynamic, flexible reverse proxies are still a problem nowadays :)
Remember, you have your entry points on the manager nodes with traefik, but not on the worker nodes.

I finally solved this issue. It was actually not a Traefik problem.
The problem was that both VMs from my provider have the same private IPv4 address.
When initializing and joining the swarm, it is important to provide the public IPv4 addresses with the --advertise-addr option.
To initialize the swarm I have to run:
docker swarm init --advertise-addr [manager-ip-address]
When a worker node joins the swarm, its public IPv4 address also needs to be set explicitly:
docker swarm join \
--token SWMTKN-1-xxxxxxxxxxxxxxxxxxxx-xxxxxxxx \
--advertise-addr [worker-ip-address] \
[manager-ip-address]:2377

I would say that your service labels were set up incorrectly. Traefik forwards requests to the port the service listens on inside the container, so traefik.port should be 80, not the published port 8002. I think the correct service create command would be:
docker service create \
--name whoami1 \
--publish mode=host,target=80,published=8002 \
--network proxy-net \
--label traefik.docker.network=proxy-net \
--label traefik.frontend.rule=Host:whoami.mydomain.com \
--label traefik.port=80 \
emilevauge/whoami
Publishing port 80 of the whoami service is not needed either.

Using the host ip in docker-compose

I want to create a docker-compose file that is able to run on different servers.
For that I have to be able to specify the host-ip or hostname of the server (where all the containers are running) in several places in the docker-compose.yml.
E.g. for a consul container where I want to define how the server can be found by fellow consul containers.
consul:
  image: progrium/consul
  command: -server -advertise 192.168.1.125 -bootstrap
I don't want to hardcode 192.168.1.125 obviously.
I could use env_file: to specify the hostname or IP and adapt it on every server, so I have that information in one place and can use it in docker-compose.yml. But this can only be used to specify environment variables, not the advertise parameter.
Is there a better solution?

docker-compose allows you to use environment variables from the environment in which the compose command runs.
See documentation at https://docs.docker.com/compose/compose-file/#variable-substitution
Assuming you can create a wrapper script, like @balver suggested, you can set an environment variable called EXTERNAL_IP that holds the value of $(docker-machine ip).
Example:
#!/bin/sh
export EXTERNAL_IP=$(docker-machine ip)
exec docker-compose "$@"
and
# docker-compose.yml
version: "2"
services:
  consul:
    image: consul
    environment:
      - "EXTERNAL_IP=${EXTERNAL_IP}"
    command: agent -server -advertise ${EXTERNAL_IP} -bootstrap
Unfortunately if you are using random port assignment, there is no way to add EXTERNAL_PORT, so the ports must be linked statically.
PS: Something very similar is enabled by default in HashiCorp Nomad, also includes mapped ports. Doc: https://www.nomadproject.io/docs/jobspec/interpreted.html#interpreted_env_vars
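One caveat with the wrapper: if docker-machine ip prints nothing, Compose substitutes an empty string for ${EXTERNAL_IP} and consul is started with a broken -advertise argument. A minimal guard could look like the sketch below. is_ipv4 is a hypothetical helper, and it assumes a rough dotted-quad check is enough (it checks the shape only, not that each octet is at most 255).

```shell
#!/bin/sh
# Succeed only if the argument looks like a dotted-quad IPv4 address.
# Shape check only: 999.1.1.1 would still pass.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}
```

In the wrapper script, a line such as is_ipv4 "$EXTERNAL_IP" || { echo "EXTERNAL_IP is not set" >&2; exit 1; } placed before the exec would stop the run early instead of starting consul with an empty address.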

I've used the Docker internal network IP, which seems to be static: 172.17.0.1

Is there a better solution?
Absolutely! You don't need the host ip at all for communication between containers. If you link containers in your docker-compose.yaml file, you will have access to a number of environment variables that you can use to discover the ip addresses of your services.
Consider, for example, a docker-compose configuration with two containers: one using consul, and one running some service that needs to talk to consul.
consul:
  image: progrium/consul
  command: -server -bootstrap
webserver:
  image: larsks/mini-httpd
  links:
    - consul
First, by starting consul with just -server -bootstrap, consul figures out its own advertise address, for example:
consul_1 | ==> Consul agent running!
consul_1 | Node name: 'f39ba7ef38ef'
consul_1 | Datacenter: 'dc1'
consul_1 | Server: true (bootstrap: true)
consul_1 | Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 53, RPC: 8400)
consul_1 | Cluster Addr: 172.17.0.4 (LAN: 8301, WAN: 8302)
consul_1 | Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
consul_1 | Atlas: <disabled>
In the webserver container, we find the following environment variables available to pid 1:
CONSUL_PORT=udp://172.17.0.4:53
CONSUL_PORT_8300_TCP_START=tcp://172.17.0.4:8300
CONSUL_PORT_8300_TCP_ADDR=172.17.0.4
CONSUL_PORT_8300_TCP_PROTO=tcp
CONSUL_PORT_8300_TCP_PORT_START=8300
CONSUL_PORT_8300_UDP_END=udp://172.17.0.4:8302
CONSUL_PORT_8300_UDP_PORT_END=8302
CONSUL_PORT_53_UDP=udp://172.17.0.4:53
CONSUL_PORT_53_UDP_ADDR=172.17.0.4
CONSUL_PORT_53_UDP_PORT=53
CONSUL_PORT_53_UDP_PROTO=udp
CONSUL_PORT_8300_TCP=tcp://172.17.0.4:8300
CONSUL_PORT_8300_TCP_PORT=8300
CONSUL_PORT_8301_TCP=tcp://172.17.0.4:8301
CONSUL_PORT_8301_TCP_ADDR=172.17.0.4
CONSUL_PORT_8301_TCP_PORT=8301
CONSUL_PORT_8301_TCP_PROTO=tcp
CONSUL_PORT_8301_UDP=udp://172.17.0.4:8301
CONSUL_PORT_8301_UDP_ADDR=172.17.0.4
CONSUL_PORT_8301_UDP_PORT=8301
CONSUL_PORT_8301_UDP_PROTO=udp
CONSUL_PORT_8302_TCP=tcp://172.17.0.4:8302
CONSUL_PORT_8302_TCP_ADDR=172.17.0.4
CONSUL_PORT_8302_TCP_PORT=8302
CONSUL_PORT_8302_TCP_PROTO=tcp
CONSUL_PORT_8302_UDP=udp://172.17.0.4:8302
CONSUL_PORT_8302_UDP_ADDR=172.17.0.4
CONSUL_PORT_8302_UDP_PORT=8302
CONSUL_PORT_8302_UDP_PROTO=udp
CONSUL_PORT_8400_TCP=tcp://172.17.0.4:8400
CONSUL_PORT_8400_TCP_ADDR=172.17.0.4
CONSUL_PORT_8400_TCP_PORT=8400
CONSUL_PORT_8400_TCP_PROTO=tcp
CONSUL_PORT_8500_TCP=tcp://172.17.0.4:8500
CONSUL_PORT_8500_TCP_ADDR=172.17.0.4
CONSUL_PORT_8500_TCP_PORT=8500
CONSUL_PORT_8500_TCP_PROTO=tcp
There is a set of variables for each port EXPOSEd by the consul image. For example, in that second container, we could interact with the consul REST API by connecting to:
http://${CONSUL_PORT_8500_TCP_ADDR}:8500/
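Put together, a start-up script in the webserver container might build that URL as in the sketch below. consul_url is a hypothetical helper, and the fallback to the hostname consul assumes the link alias is resolvable inside the container, as it normally is for linked containers.

```shell
#!/bin/sh
# Build the consul HTTP API base URL from the link variables,
# falling back to the link alias and the default HTTP port.
consul_url() {
    addr="${CONSUL_PORT_8500_TCP_ADDR:-consul}"
    port="${CONSUL_PORT_8500_TCP_PORT:-8500}"
    printf 'http://%s:%s/\n' "$addr" "$port"
}
```

A caller could then run something like curl "$(consul_url)v1/status/leader" to query the API, assuming curl is available in the image.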

With the new version of Docker Compose (1.4.0) you should be able to do something like this:
docker-compose.yml
consul:
  image: progrium/consul
  command: -server -advertise HOSTIP -bootstrap
bash
$ sed -e "s/HOSTIP/${HOSTIP}/g" docker-compose.yml | docker-compose --file - up
This is thanks to the new feature:
Compose can now read YAML configuration from standard input, rather than from a file, by specifying - as the filename. This makes it easier to generate configuration dynamically:
$ echo 'redis: {"image": "redis"}' | docker-compose --file - up

Environment variables, as suggested in the earlier solution, are created by Docker when containers are linked. But the environment variables are not automatically updated if the container is restarted, so using them in production is not recommended.
In addition to creating the environment variables, Docker also updates the host entries in the /etc/hosts file. In fact, the Docker documentation recommends using the host entries from /etc/hosts instead of the environment variables.
Reference: https://docs.docker.com/userguide/dockerlinks/
Unlike host entries in the /etc/hosts file, IP addresses stored in the environment variables are not automatically updated if the source container is restarted. We recommend using the host entries in /etc/hosts to resolve the IP address of linked containers.
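In practice a plain name lookup (for example getent hosts consul) is the usual way to use those entries. To make it visible that the answer comes from /etc/hosts, the sketch below reads the file directly. host_ip is a hypothetical helper, and the optional second argument exists only so it can be pointed at a different file.

```shell
#!/bin/sh
# Print the first address that /etc/hosts (or the file in $2) maps to hostname $1.
host_ip() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # ignore trailing comments
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}
```

Inside the linked webserver container, host_ip consul would print the address Docker wrote for the consul alias, and the value follows the file when Docker rewrites it after a restart.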

extra_hosts works. It is hard-coded into docker-compose.yml, but for my current static setup that's all I need at the moment.
version: '3'
services:
  my_service:
    container_name: my-service
    image: my-service:latest
    extra_hosts:
      - "myhostname:192.168.0.x"
    ...
    networks:
      - host
networks:
  host:

Create a script that sets your host IP in an environment variable (scripts in /etc/profile.d are sourced by login shells):
sudo vi /etc/profile.d/docker-external-ip.sh
Then put this line in it:
export EXTERNAL_IP=$(hostname -I | awk '{print $1}')
Now you can use it in your docker-compose.yml file:
version: '3'
services:
  my_service:
    container_name: my-service
    image: my-service:latest
    environment:
      - EXTERNAL_IP=${EXTERNAL_IP}
    extra_hosts:
      - my.external-server.net:${EXTERNAL_IP}
    ...
environment --> sets the value as an environment variable inside your Docker container
extra_hosts --> adds these hosts to your Docker container
