FreeIPA Docker Compose WEB UI - docker

After spending hours searching why I cannot access to my webUI, I turn to you.
I setup freeipa on docker using docker-compose. I opened some port to gain remote access using host-ip:port on my own computer. Freeipa is supposed to be run on my server (lets say 192.168.1.2) and the webui accessible with any other local computer on port 80 / 443 (192.168.1.4:80 or 192.168.1.4:443)
When I run my .yaml file, freeipa get setup with a "the ipa-server-install command was successful" message.
I thought it could come from my tight iptables rules and tried to put all policies to ACCEPT to debug. It didn't do it.
I'm a bit lost to how I could debbug this or find how to fix it.
OS : ubuntu 20.04.3
Docker version: 20.10.12, build e91ed57
freeipa image: freeipa/freeipa:centos-8-stream
Docker-compose version: 1.29.2, build 5becea4c
My .yaml file:
version: "3.8"
services:
freeipa:
image: freeipa/freeipa-server:centos-8-stream
hostname: sanctuary
domainname: serv.sanctuary.local
container_name: freeipa-dev
ports:
- 80:80
- 443:443
- 389:389
- 636:636
- 88:88
- 464:464
- 88:88/udp
- 464:464/udp
- 123:123/udp
dns:
- 10.64.0.1
- 1.1.1.1
- 1.0.0.1
restart: unless-stopped
tty: true
stdin_open: true
environment:
IPA_SERVER_HOSTNAME: serv.sanctuary.local
IPA_SERVER_IP: 192.168.1.100
TZ: "Europe/Paris"
command:
- -U
- --domain=sanctuary.local
- --realm=sanctuary.local
- --admin-password=pass
- --http-pin=pass
- --dirsrv-pin=pass
- --ds-password=pass
- --no-dnssec-validation
- --no-host-dns
- --setup-dns
- --auto-forwarders
- --allow-zone-overlap
- --unattended
cap_add:
- SYS_TIME
- NET_ADMIN
restart: unless-stopped
volumes:
- /etc/localtime:/etc/localtime:ro
- /sys/fs/cgroup:/sys/fs/cgroup:ro
- ./data:/data
- ./logs:/var/logs
sysctls:
- net.ipv6.conf.all.disable_ipv6=0
- net.ipv6.conf.lo.disable_ipv6=0
security_opt:
- "seccomp:unconfined"
labels:
- dev
I tried to tinker with the deployment file (add or remove conf found on internet such as add/remove IPA_SERVER_IP, add/remove an external bridge network)
Thank you very much for any help =)

Alright, for those who might have the same problem, I will explain everything I did to debug this.
I extensively relieded on the answers found here : https://floblanc.wordpress.com/2017/09/11/troubleshooting-freeipa-pki-tomcatd-fails-to-start/
First, I checked the status of each services with ipactl status. Depending of the problem, you might have different output but mine was like this :
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
named Service: RUNNING
httpd Service: RUNNING
ipa-custodia Service: RUNNING
pki-tomcatd Service: STOPPED
ipa-otpd Service: RUNNING
ipa-dnskeysyncd Service: RUNNING
ipa: INFO: The ipactl command was successful
I therefore checked the logs for tomcat /var/log/pki/pki-tomcat/ca/debug-xxxx. I realised I had connection refused with something related to the certificates.
Here, I first checked that my certificate was present in /etc/pki/pki-tomcat/alias using sudo certutil -L -d /etc/pki/pki-tomcat/alias -n 'subsystemCert cert-pki-ca'.
## output :
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 4 (0x4)
...
...
Then I made sure that the private key can be read using the password found in /var/lib/pki/pki-tomcat/conf/password.conf (with the tag internal=…)
grep internal /var/lib/pki/pki-tomcat/conf/password.conf | cut -d= -f2 > /tmp/pwdfile.txt
certutil -K -d /etc/pki/pki-tomcat/alias -f /tmp/pwdfile.txt -n 'subsystemCert cert-pki-ca'
I still had nothings strange so I assumed that at this point :
pki-tomcat is able to access the certificate and the private key
The issue is likely to be on the LDAP server side
I tried to read the user entry in the LDAP to compare it to the certificate using ldapsearch -LLL -D 'cn=directory manager' -W -b uid=pkidbuser,ou=people,o=ipaca userCertificate description seeAlso but had an error after entering the password. Because my certs were OK and LDAP service running, I assumed something was off with the certificates date.
Indeed, during the install freeipa setup the certs using your current system date as base. But it also install chrony for server time synchronization. After reboot, my chrony conf were wrong and set my host date 2 years ahead.
I couldnt figure out the problem with the chrony conf so I stopped the service and set the date manually using timedatectl set-time "yyyy-mm-dd hh:mm:ss".
I restarted freeipa services amd my pki-tomcat service was working again.
After that, I set the freeipa IP in my router as DNS. I restarted services and computer in the local network so DNS config were refreshed. After that, the webUI was accessible !

Related

Connecting to XDEBUG on remote Docker Image

I am trying to debug using NetBeans 11 as my client, Xdebug 3 on the Docker image. The Docker container is on a remote host. I am unable to make a connection. Indicator at the bottom of the NetBeans screen scrolls forever with "waiting for connection (netbeans-xdebug)". I am not sure what I am doing wrong. I have had this work in the past without Docker and with Xdebug 2, I am not sure if I messed up the Xdebug 3, the Docker or both.
My configurations:
Dockerfile adds Xdebug properly and I can see it in my container.
docker-compose.yml
---
services:
drupal:
container_name: intranet-finkenb2
ports:
- "8082:80"
- "9092:9003"
volumes:
- /home/finkenb2/intranet/custom_themes:/opt/drupal/web/themes/custom
- /home/finkenb2/intranet/custom_modules:/opt/drupal/web/modules/custom
environment:
XDEBUG_MODE: debug,develop
XDEBUG_SESSION: netbeans-xdebug
XDEBUG_CONFIG: >
client_host = localhost
client_port = 9003
discover_client_host=true
start_with_request=yes
db:
container_name: intranet-finkenb2-db
solr:
container_name: intranet-finkenb2-solr
ports:
- "8982:8983"
volumes:
public_files:
private_files:
site_settings:
SSH Tunnel via PuTTY: R9092 localhost:9092
NetBeans PHP Debug config:
- Debugger Port: 9092
- Session ID: netbeans-xdebug
- Maximum Data Length: 8192
- Check: Stop at first line
NetBeans Project config (run configuration):
- Run As: Remote Web Site
- Project URL: http://intranet-finkenb2.devel.lib.msu.edu
- index file: index.php
- remote connection
- hostname intranet8.devel.lib.msu.edu /*docker host server*/
- user/pwd correct
- initial directory /var/www/
This is not correct:
ports
- "9092:9003"
Xdebug connects to your IDE, so you don't need to expose the port. These ports are for external exposures anyway, and you're SSH-ing into the container with -R already, so this makes no sense.
This is not correct:
XDEBUG_CONFIG: >
client_host = localhost
client_port = 9003
discover_client_host=true
start_with_request=yes
You can't use all of these as part of the XDEBUG_CONFIG variable, start_with_request, for example.
discover_client_host with Docker doesn't work, as it gets the wrong IP through the gateway
localhost is normally also not correct, as it needs to be the IP/hostname of the machine where your IDE listens. But you're SSH-ing into that container, so that should fine fine.
With that SSH tunnel, you need to set client_port to the port of your remote SSH end point (9092).
If anything else is unclear, make a log file, or try to debug a page with xdebug_info() in it, and it will tell you what Xdebug has tried.

Certificate of K3S cluster

I'm using a K3S Cluster in a docker(-compose) container in my CI/CD pipeline, to test my application code. However I have problem with the certificate of the cluster. I need to communicate on the cluster using the external addres. My docker-compose script looks as follows
version: '3'
services:
server:
image: rancher/k3s:v0.8.1
command: server --disable-agent
environment:
- K3S_CLUSTER_SECRET=somethingtotallyrandom
- K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml
- K3S_KUBECONFIG_MODE=666
volumes:
- k3s-server:/var/lib/rancher/k3s
# get the kubeconfig file
- .:/output
ports:
# - 6443:6443
- 6080:6080
- 192.168.2.110:6443:6443
node:
image: rancher/k3s:v0.8.1
tmpfs:
- /run
- /var/run
privileged: true
environment:
- K3S_URL=https://server:6443
- K3S_CLUSTER_SECRET=somethingtotallyrandom
ports:
- 31000-32000:31000-32000
volumes:
k3s-server: {}
accessing the cluster from python gives me
MaxRetryError: HTTPSConnectionPool(host='192.168.2.110', port=6443): Max retries exceeded with url: /apis/batch/v1/namespaces/mlflow/jobs?pretty=True (Caused by SSLError(SSLCertVerificationError("hostname '192.168.2.110' doesn't match either of 'localhost', '172.19.0.2', '10.43.0.1', '172.23.0.2', '172.18.0.2', '172.23.0.3', '127.0.0.1', '0.0.0.0', '172.18.0.3', '172.20.0.2'")))
Here are my two (three) question
how can I add additional IP adresses to the cert generation? I was hoping the --bind-address in the server command triggers taht
how can I fall back on http providing an --http-listen-port didn't achieve the expected result
any other suggestion how I can enable communication with the cluster
changing the python code is not really an option as I would like o keep the code unaltered for testing. (Fallback on http works via kubeconfig.
The solution is to use the parameter tls-san
server --disable-agent --tls-san 192.168.2.110

docker-compose UnknownHostException : but docker run works

I have a docker image (lfs-service:latest) that I'm trying to run as part of a suite of micro services.
RHELS 7.5
Docker version: 1.13.1
docker-compose version 1.23.2
Postgres 11 (installed on RedHat host machine)
The following command works exactly as I would like:
docker run -d \
-p 9000:9000 \
-v "$PWD/lfs-uploads:/lfs-uploads" \
-e "SPRING_PROFILES_ACTIVE=dev" \
-e dbhost=$HOSTNAME \
--name lfs-service \
[corp registry]/lfs-service:latest
This successfully:
creates/starts a container with my Spring Boot Docker image on port
9000
writes the uploads to disk into the lfs-uploads directory
and connects to a local Postgres DB that's running on the host
machine (not in a Docker container).
My service works as expected. Great!
Now, my problem:
I'm tring to run/manage my services using Docker Compose with the following content (I have removed all other services and my api gateway from docker-compose.yaml to simplify the scenario):
version: '3'
services:
lfs-service:
image: [corp registry]/lfs-service:latest
container_name: lfs-service
stop_signal: SIGINT
ports:
- 9000:9000
expose:
- 9000
volumes:
- "./lfs-uploads:/lfs-uploads"
environment:
- SPRING_PROFILES_ACTIVE=dev
- dbhost=$HOSTNAME
Relevant entries in application.yaml:
spring:
profiles: dev
datasource:
url: jdbc:postgresql://${dbhost}:5432/lfsdb
username: [dbusername]
password: [dbpassword]
jpa:
properties:
hibernate:
dialect: org.hibernate.dialect.PostgreSQLDialect
hibernate:
ddl-auto: update
Execution:
docker-compose up
...
The following profiles are active: dev
...
Tomcat initialized with port(s): 9000 (http)
...
lfs-service | Caused by: java.net.UnknownHostException: [host machine hostname]
lfs-service | at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184) ~[na:1.8.0_181]
lfs-service | at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_181]
lfs-service | at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_181]
lfs-service | at org.postgresql.core.PGStream.<init>(PGStream.java:70) ~[postgresql-42.2.5.jar!/:42.2.5]
lfs-service | at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:91) ~[postgresql-42.2.5.jar!/:42.2.5]
lfs-service | at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:192) ~[postgresql-42.2.5.jar!/:42.2.5]
...
lfs-service | 2019-01-11 18:46:54.495 WARN [lfs-service,,,] 1 --- [ main] o.s.b.a.orm.jpa.DatabaseLookup : Unable to determine jdbc url from datasource
lfs-service |
lfs-service | org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta-data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is org.postgresql.util.PSQLException: The connection attempt failed.
lfs-service | at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:328) ~[spring-jdbc-5.1.2.RELEASE.jar!/:5.1.2.RELEASE]
lfs-service | at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:356) ~[spring-jdbc-5.1.2.RELEASE.jar!/:5.1.2.RELEASE]
...
Both methods of starting should be equivalent but obviously there's a functional difference... Any ideas on how to resolve this issue / write a comperable docker-compose file which is functionally identical to the "docker run" command at the top?
NOTE: I've also tried the following values for dbhost: localhost, 127.0.0.1 - this won't work as it attempts to find the DB in the container, and not on the host machine.
CORRECTION:
Unfortunately, while this solution works in the simplest use case - it will break Eureka & API Gateways from functioning, as the container will be running on a separate network. I'm still looking for working solution.
To anyone looking for a solution to this question, this worked for me:
docker-compose.yaml:
lfs-service:
image: [corp repo]/lfs-service:latest
container_name: lfs-service
stop_signal: SIGINT
ports:
- 9000:9000
expose:
- 9000
volumes:
- "./lfs-uploads:/lfs-uploads"
environment:
- SPRING_PROFILES_ACTIVE=dev
- dbhost=localhost
network_mode: host
Summary of changes made to docker-compose.yaml:
change $HOSTNAME to "localhost"
Add "network_mode: host"
I have no idea if this is the "correct" way to resolve this, but since it's only for our remote development server the solution is working for me. I'm open to suggestions if you have a better solution.
Working solution
The simple solution is to just provide the host machine IP address (vs hostname).
environment:
- SPRING_PROFILES_ACTIVE=dev
- dbhost=172.18.0.1
Setting this via an environment variable would probably be more portable:
export DB_HOST_IP=172.18.0.1
docker-compose.yaml
environment:
- SPRING_PROFILES_ACTIVE=dev
- dbhost=${DB_HOST_IP}

Pihole and Unbound in Docker Containers - Unbound Not Receiving Requests

I'm trying to run 2 Docker containers on Raspberry pi 3, one for Unbound and one for Pihole. The idea is that Pihole will first block any requests before using Unbound as its DNS server. I've been following Pihole's documentation to get this running found here and have got both containers starting, and pihole working. However, when running docker exec pihole dig pi-hole.net #127.0.0.1 -p 5333 or -p 5354 I get a response of
; <<>> DiG 9.10.3-P4-Debian <<>> pi-hole.net #127.0.0.1 -p 5354
;; global options: +cmd
;; connection timed out; no servers could be reached
I theorized this could be to do with the pihole container not being able to communicate with the Unbound container through localhost, so updated my docker-compose to try and correct this using the netowkr bridge. However after that I still get the same error, no matter what ports I try. I'm new Docker and Unbound so this has been a bit of a dive in at the deep end! My docker-compose.yml and unbound.conf are below.
docker-compose.yml
version: "3.7"
services:
unbound:
cap_add:
- NET_ADMIN
- SYS_ADMIN
container_name: unbound
image: masnathan/unbound-arm
ports:
- 8953:8953/tcp
- 5354:53/udp
- 5354:53/tcp
- 5333:5333/udp
- 5333:5333/tcp
volumes:
- ./config/unbound.conf:/etc/unbound/unbound.conf
- ./config/root.hints:/var/unbound/etc/root.hints
restart: always
networks:
- unbound-pihole
pihole:
cap_add:
- NET_ADMIN
- SYS_ADMIN
container_name: pihole
image: pihole/pihole:latest
ports:
- 53:53/udp
- 53:53/tcp
- 67:67/udp
- 80:80
- 443:443
volumes:
- ./config/pihole/:/etc/pihole/
environment:
- ServerIP=10.0.0.20
- TZ=UTC
- WEBPASSWORD=random
- DNS1=127.0.0.1#5333
- DNS2=no
restart: always
networks:
- unbound-pihole
networks:
unbound-pihole:
driver: bridge
unbound.conf
server:
# If no logfile is specified, syslog is used
# logfile: "/var/log/unbound/unbound.log"
verbosity: 0
port: 5333
do-ip4: yes
do-udp: yes
do-tcp: yes
# May be set to yes if you have IPv6 connectivity
do-ip6: no
# Use this only when you downloaded the list of primary root servers!
root-hints: "/var/unbound/etc/root.hints"
# Trust glue only if it is within the servers authority
harden-glue: yes
# Require DNSSEC data for trust-anchored zones, if such data is absent, the zone becomes BOGUS
harden-dnssec-stripped: yes
# Don't use Capitalization randomization as it known to cause DNSSEC issues sometimes
# see https://discourse.pi-hole.net/t/unbound-stubby-or-dnscrypt-proxy/9378 for further details
use-caps-for-id: no
# Reduce EDNS reassembly buffer size.
# Suggested by the unbound man page to reduce fragmentation reassembly problems
edns-buffer-size: 1472
# TTL bounds for cache
cache-min-ttl: 3600
cache-max-ttl: 86400
# Perform prefetching of close to expired message cache entries
# This only applies to domains that have been frequently queried
prefetch: yes
# One thread should be sufficient, can be increased on beefy machines
num-threads: 1
# Ensure kernel buffer is large enough to not loose messages in traffic spikes
so-rcvbuf: 1m
# Ensure privacy of local IP ranges
private-address: 192.168.0.0/16
private-address: 169.254.0.0/16
private-address: 172.16.0.0/12
private-address: 10.0.0.0/8
private-address: fd00::/8
private-address: fe80::/10
Thanks!
From the docs https://nlnetlabs.nl/documentation/unbound/unbound.conf/ under the access-control section:
By default only localhost is allowed, the rest is refused. The
is refused, because that is protocol-friendly. The DNS
protocol is not designed to handle dropped packets due to pol-
icy, and dropping may result in (possibly excessive) retried
queries.
The unbound server, by default listen for connections from localhost only. in this case, the request to the DNS server can allow be accepted from inside the docker container running unbound.
Therefore, to allow the DNS to be resolved by the unbound in the docker-compose, add the following to the unbound.conf
server:
access-control: 0.0.0.0/0 allow

How does service discovery work with modern docker/docker-compose?

I'm using Docker 1.11.1 and docker-compose 1.8.0-rc2.
In the good old days (so, last year), you could set up a docker-compose.yml file like this:
app:
image: myapp
frontend:
image: myfrontend
links:
- app
And then start up the environment like this:
docker scale app=3 frontend=1
And your frontend container could inspect the environment variables
for variables named APP_1_PORT, APP_2_PORT, etc to discover the
available backend hosts and configure itself accordingly.
Times have changed. Now, we do this...
version: '2'
services:
app:
image: myapp
frontend:
image: myfrontend
links:
- app
...and instead of environment variables, we get DNS. So inside the
frontend container, I can ask for app_app_1 or app_app_2 or
app_app_3 and get the corresponding ip address. I can also ask for
app and get the address of app_app_1.
But how do I discover all of the available backend containers? I
guess I could loop over getent hosts ... until it fails:
counter=1
while :; do
getent hosts app_$counter || break
backends="$backends app_$counter"
let counter++
done
But that seems ugly and fragile.
I've heard rumors about round-robin dns, but (a) that doesn't seem to
be happening in my test environment, and (b) that doesn't necessarily
help if your frontend needs simultaneous connections to the backends.
How is simple container and service discovery meant to work in the
modern Docker world?
Docker's built-in Nameserver & Loadbalancer
Docker comes with a built-in nameserver. The server is, by default, reachable via 127.0.0.11:53.
Every container has by default a nameserver entry in /etc/resolv.conf, so it is not required to specify the address of the nameserver from within the container. That is why you can find your service from within the network with service or task_service_n.
If you do task_service_n then you will get the address of the corresponding service replica.
If you only ask for the service docker will perform internal load balancing between container in the same network and external load balancing to handle requests from outside.
When swarm is used, docker will additionally use two special networks.
The ingress network, which is actually an overlay network and handles incomming trafic to the swarm. It allows to query any service from any node in the swarm.
The docker_gwbridge, a bridge network, which connects the overlay networks of the individual hosts to an their physical network. (including ingress)
When using swarm to deploy services, the behavior as described in the examples below will not work unless endpointmode is set to dns roundrobin instead of vip.
endpoint_mode: vip - Docker assigns the service a virtual IP (VIP) that acts as the front end for clients to reach the service on a network. Docker routes requests between the client and available worker nodes for the service, without client knowledge of how many nodes are participating in the service or their IP addresses or ports. (This is the default.)
endpoint_mode: dnsrr - DNS round-robin (DNSRR) service discovery does not use a single virtual IP. Docker sets up DNS entries for the service such that a DNS query for the service name returns a list of IP addresses, and the client connects directly to one of these. DNS round-robin is useful in cases where you want to use your own load balancer, or for Hybrid Windows and Linux applications.
Example
For example deploy three replicas from dig/docker-compose.yml
version: '3.8'
services:
whoami:
image: "traefik/whoami"
deploy:
replicas: 3
DNS Lookup
You can use tools such as dig or nslookup to do a DNS lookup against the nameserver in the same network.
docker run --rm --network dig_default tutum/dnsutils dig whoami
; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> whoami
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58433
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;whoami. IN A
;; ANSWER SECTION:
whoami. 600 IN A 172.28.0.3
whoami. 600 IN A 172.28.0.2
whoami. 600 IN A 172.28.0.4
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Mon Nov 16 22:36:37 UTC 2020
;; MSG SIZE rcvd: 90
If you are only interested in the IP, you can provide the +short option
docker run --rm --network dig_default tutum/dnsutils dig +short whoami
172.28.0.3
172.28.0.4
172.28.0.2
Or look for specific service
docker run --rm --network dig_default tutum/dnsutils dig +short dig_whoami_2
172.28.0.4
Load balancing
The default loadbalancing happens on the transport layer or layer 4 of the OSI Model. So it is TCP/UDP based. That means it is not possible to inpsect and manipulate http headers with this method. In the enterprise edition it is apparently possible to use labels similar to the ones treafik is using in the example a bit further down.
docker run --rm --network dig_default curlimages/curl -Ls http://whoami
Hostname: eedc94d45bf4
IP: 127.0.0.1
IP: 172.28.0.3
RemoteAddr: 172.28.0.5:43910
GET / HTTP/1.1
Host: whoami
User-Agent: curl/7.73.0-DEV
Accept: */*
Here is the hostname from 10 times curl:
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: eedc94d45bf4
Hostname: d922d86eccc6
Hostname: d922d86eccc6
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: d922d86eccc6
Health Checks
Health checks, by default, are done by checking the process id (PID) of the container on the host kernel. If the process is running successfully, the container is considered healthy.
Oftentimes other health checks are required. The container may be running but the application inside has crashed. In many cases a TCP or HTTP check is preferred.
It is possible to bake a custom health checks into images. For example, using curl to perform L7 health checks.
FROM traefik/whoami
HEALTHCHECK CMD curl --fail http://localhost || exit 1
It is also possible to specify the health check via cli when starting the container.
docker run \
--health-cmd "curl --fail http://localhost || exit 1" \
--health-interval=5s \
--timeout=3s \
traefik/whoami
Example with Swarm
As initially mentioned, swarms behavior is different in that it will assign a virtual IP to services by default. Its actually not different its just docker or docker-compose doesn't create real services, it just imitates the behavior of swarm but still runs the container normally, as services can, in fact, only be created by manager nodes.
Keeping in mind we are on a swarm manager and thus the default mode is VIP
Create a overlay network that can be used by regular containers too
$ docker network create --driver overlay --attachable testnet
create some service with 2 replicas
$ docker service create --network testnet --replicas 2 --name digme nginx
Now lets use dig again and making sure we attach the container to the same network
$ docker run --network testnet --rm tutum/dnsutils dig digme
digme. 600 IN A 10.0.18.6
We see that indeed we only got one IP address back, so it appears that this is the virtual IP that has been assigned by docker.
Swarm allows actually to get the single IPs in this case without explicitly setting the endpoint mode.
We can query for tasks.<servicename> in this case that is tasks.digme
$ docker run --network testnet --rm tutum/dnsutils dig tasks.digme
tasks.digme. 600 IN A 10.0.18.7
tasks.digme. 600 IN A 10.0.18.8
This has brought us 2 A records pointing to the individual replicas.
Now lets create another service with endpointmode set to dns roundrobin
docker service create --endpoint-mode dnsrr --network testnet --replicas 2 --name digme2 nginx
$ docker run --network testnet --rm tutum/dnsutils dig digme2
digme2. 600 IN A 10.0.18.21
digme2. 600 IN A 10.0.18.20
This way we get both IPs without adding the prefix tasks.
Service Discovery & Loadbalancing Strategies
If the built in features are not sufficent, some strategies can be implemented to achieve better control. Below are some examples.
HAProxy
Haproxy can use the docker nameserver in combination with dynamic server templates to discover the running container. Then the traditional proxy features can be leveraged to achieve powerful layer 7 load balancing with http header manipulation and chaos engeering such as retries.
version: '3.8'
services:
loadbalancer:
image: haproxy
volumes:
- ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
ports:
- 80:80
- 443:443
whoami:
image: "traefik/whoami"
deploy:
replicas: 3
...
resolvers docker
nameserver dns1 127.0.0.11:53
resolve_retries 3
timeout resolve 1s
timeout retry 1s
hold other 10s
hold refused 10s
hold nx 10s
hold timeout 10s
hold valid 10s
hold obsolete 10s
...
backend whoami
balance leastconn
option httpchk
option redispatch 1
retry-on all-retryable-errors
retries 2
http-request disable-l7-retry if METH_POST
dynamic-cookie-key MY_SERVICES_HASHED_ADDRESS
cookie MY_SERVICES_HASHED_ADDRESS insert dynamic
server-template whoami- 6 whoami:80 check resolvers docker init-addr libc,none
...
Traefik
The previous method is already pretty decent. However, you may have noticed that it requires knowing which services should be discovered and also the number of replicas to discover is hard coded. Traefik, a container native edge router, solves both problems. As long as we enable Traefik via label, the service will be discovered. This decentralized the configuration. It is as if each service registers itself.
The label can also be used to inspect and manipulate http headers.
version: "3.8"
services:
traefik:
image: "traefik:v2.3"
command:
- "--log.level=DEBUG"
- "--api.insecure=true"
- "--providers.docker=true"
- "--providers.docker.exposedbydefault=false"
- "--entrypoints.web.address=:80"
ports:
- "80:80"
- "8080:8080"
volumes:
- "/var/run/docker.sock:/var/run/docker.sock:ro"
whoami:
image: "traefik/whoami"
labels:
- "traefik.enable=true"
- "traefik.port=80"
- "traefik.http.routers.whoami.entrypoints=web"
- "traefik.http.routers.whoami.rule=PathPrefix(`/`)"
- "traefik.http.services.whoami.loadbalancer.sticky=true"
- "traefik.http.services.whoami.loadbalancer.sticky.cookie.name=MY_SERVICE_ADDRESS"
deploy:
replicas: 3
Consul
Consul is a tool for service discovery and configuration management. Services have to be registered via API request. It is a more complex solution that probably only makes sense in bigger clusters, but can be very powerful. Usually it recommended running this on bare metal and not in a container. You could install it alongside the docker host on each server in your cluster.
In this example it has been paired with the registrator image, which takes care of registering the docker services in consuls catalog.
The catalog can be leveraged in many ways. One of them is to use consul-template.
Note that consul comes with its own DNS resolver so in this instance the docker DNS resolver is somewhat neglected.
version: '3.8'
services:
consul:
image: gliderlabs/consul-server:latest
command: "-advertise=${MYHOST} -server -bootstrap"
container_name: consul
hostname: ${MYHOST}
ports:
- 8500:8500
registrator:
image: gliderlabs/registrator:latest
command: "-ip ${MYHOST} consul://${MYHOST}:8500"
container_name: registrator
hostname: ${MYHOST}
depends_on:
- consul
volumes:
- /var/run/docker.sock:/tmp/docker.sock
proxy:
build: .
ports:
- 80:80
depends_on:
- consul
whoami:
image: "traefik/whoami"
deploy:
replicas: 3
ports:
- "80"
Dockerfile for custom proxy image with consul template backed in.
FROM nginx
RUN curl https://releases.hashicorp.com/consul-template/0.25.1/consul-template_0.25.1_linux_amd64.tgz \
> consul-template_0.25.1_linux_amd64.tgz
RUN gunzip -c consul-template_0.25.1_linux_amd64.tgz | tar xvf -
RUN mv consul-template /usr/sbin/consul-template
RUN rm /etc/nginx/conf.d/default.conf
ADD proxy.conf.ctmpl /etc/nginx/conf.d/
ADD consul-template.hcl /
CMD [ "/bin/bash", "-c", "/etc/init.d/nginx start && consul-template -config=consul-template.hcl" ]
Consul template takes a template file and renders it according to the content of consuls catalog.
upstream whoami {
{{ range service "whoami" }}
server {{ .Address }}:{{ .Port }};
{{ end }}
}
server {
listen 80;
location / {
proxy_pass http://whoami;
}
}
After the template has been changed, the restart command is executed.
consul {
address = "consul:8500"
retry {
enabled = true
attempts = 12
backoff = "250ms"
}
}
template {
source = "/etc/nginx/conf.d/proxy.conf.ctmpl"
destination = "/etc/nginx/conf.d/proxy.conf"
perms = 0600
command = "/etc/init.d/nginx reload"
command_timeout = "60s"
}
Feature Table
Built In
HAProxy
Traefik
Consul-Template
Resolver
Docker
Docker
Docker
Consul
Service Discovery
Automatic
Server Templates
Label System
KV Store + Template
Health Checks
Yes
Yes
Yes
Yes
Load Balancing
L4
L4, L7
L4, L7
L4, L7
Sticky Session
No
Yes
Yes
Depends on proxy
Metrics
No
Stats Page
Dashboard
Dashboard
You can view some of the code samples in more detail on github.

Resources