How to allow both host networking and ingress in Docker Swarm? - docker-swarm

I have a small network of computers, bundled into a docker swarm. I'm running a service that needs host networking, which is fine when it runs on a dedicated node.
But I want to let it run on any node - yet maintaining the ability to access its web UI by connecting to a fixed hostname/IP address, regardless of which node the service is actually running on.
This is normally handled by Docker's ingress network, which lets me connect to a published port on any node's IP address and routes the connection to the proper node. However, this apparently doesn't work with host networking, and if I specify the ingress network explicitly, it gets rejected.
So, is there a way to both have host networking, while keeping ingress routing? Or what would be the recommended way to let me connect to the service without worrying about which node it's running on at any given moment?
EDIT:
My stack file is the following:
version: '3'
services:
  app:
    image: ghcr.io/home-assistant/home-assistant:stable
    volumes:
      - ...
    privileged: true
    deploy:
      replicas: 1
      restart_policy:
        condition: any
      placement:
        constraints:
          - node.hostname==nas
    networks:
      - host
networks:
  host:
    external: true

Related

How to properly configure HAProxy in Docker Swarm to automatically route traffic to replicated services (via SSL)?

I'm trying to deploy a Docker Swarm of three host nodes with a single replicated service and put an HAProxy in front of it. I want the clients to be able to connect via SSL.
My docker-compose.yml:
version: '3.9'
services:
  proxy:
    image: haproxy
    ports:
      - 443:8080
    volumes:
      - haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    deploy:
      placement:
        constraints: [node.role == manager]
    networks:
      - servers-network
  node-server:
    image: glusk/hackathon-2021:latest
    ports:
      - 8080:8080
    command: npm run server
    deploy:
      mode: replicated
      replicas: 2
    networks:
      - servers-network
networks:
  servers-network:
    driver: overlay
My haproxy.cfg (based on the official example):
# Simple configuration for an HTTP proxy listening on port 80 on all
# interfaces and forwarding requests to a single backend "servers" with a
# single server "server1" listening on 127.0.0.1:8000
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend servers

backend servers
    server server1 127.0.0.1:8000 maxconn 32
My hosts are Lightsail VPS Ubuntu instances and share the same private network.
node-server runs each HTTPS server task inside its own container on 0.0.0.0:8080.
The way I'm trying to make this work at the moment is to ssh into the manager node (which also has a static and public IP), copy over my configuration files from above, and run:
docker stack deploy --compose-file=docker-compose.yml hackathon-2021
but it doesn't work.
First, regarding SSL (since it's the first thing you mention): you need to configure HAProxy with the certificate and have it listen on port 443, not only on port 80.
With that modification, your proxy configuration changes to:
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend servers

frontend https-in
    bind *:443 ssl crt /etc/ssl/certs/hackaton2021.pem
    default_backend servers
That is a very simplified configuration for allowing SSL connections.
Now, let's look at access to the different services.
First, you cannot reach the service on localhost from inside the proxy container, and you shouldn't even publish the ports of your services on the host. The reason is that those applications are already on the same network as HAProxy, so the ideal approach is to use Docker's DNS to reach them directly.
To do this, we first need to be able to resolve the service names. For that, add the following section to your configuration:
resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s
The Docker Swarm DNS service is always available at 127.0.0.11.
Now, to your existing configuration, we add the servers using service-name discovery:
backend servers
    balance roundrobin
    server-template node- 2 node-server:8080 check resolvers docker init-addr libc,none
Here we create a server slot for each container discovered in the Swarm for the node-server service (that is, for each replica), and each slot's name gets the prefix node-.
Basically, this is equivalent to looking up the actual IP of each replica and listing them as plain server lines.
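To make the naming concrete, here is a minimal Python sketch of the server lines that a server-template with a plain count stands for. It only illustrates the slot names; HAProxy itself fills in the addresses from DNS at runtime. The helper name expand_server_template is hypothetical and not part of HAProxy.

```python
def expand_server_template(prefix: str, count: int, fqdn: str, port: int) -> list[str]:
    """Return the `server` lines that `server-template <prefix> <count> <fqdn>:<port>` stands for.

    Illustration only: HAProxy numbers the slots from 1 to <count> and
    resolves <fqdn> itself; this helper just spells out the slot names.
    """
    return [f"server {prefix}{i} {fqdn}:{port}" for i in range(1, count + 1)]


# The template from the backend above: prefix "node-", 2 slots, node-server:8080.
for line in expand_server_template("node-", 2, "node-server", 8080):
    print(line)
```

With two replicas, both slots receive an address. If the service is later scaled beyond the count, the extra replicas are not used until the count in the template is raised.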
The deployment file also has some errors: we don't want to publish the node-server ports on the host, but to create the two replicas and let HAProxy handle the networking.
For that, use the following Docker Compose file:
version: '3.9'
services:
  proxy:
    image: haproxy
    ports:
      - 80:80
      - 443:443
    volumes:
      - hackaton2021.pem:/etc/ssl/certs/hackaton2021.pem
      - haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    deploy:
      placement:
        constraints: [node.role == manager]
  node-server:
    image: glusk/hackathon-2021:latest
    command: npm run server
    deploy:
      mode: replicated
      replicas: 2
Remember to copy your haproxy.cfg and the self-signed (or real) certificate for your application to the instance before deploying the Stack.
Also, when you create that stack it will automatically create a network named <STACK_NAME>_default, so you don't need to define a network just to connect both services.

Use docker-compose to replicate a single service from a stack?

I'm a bit confused as I'm used to use docker-compose in a single-server environment. Now I have the idea to use a Docker Swarm cluster with docker-compose (as it's what I know better) but I'm a bit confused on how to make it work against my app's needs. For instance:
My app is made up by a manager app and multiple workers. My idea is to have the manager app run in the Docker Swarm manager's server (is that possible?) and then use docker-compose to replicate the workers only through the rest of the Swarm cluster nodes.
A small map would be something like:
Server A -> manager
Server B -> worker1, worker2, worker3
Server C -> worker4, worker5
The workers connect to the manager through a defined IP & port in the environment section in the docker-compose.yml file.
My question is: How do I start up the manager only on a single server, and how do I replicate the workers only in the other nodes, without having a manager per cluster node? (as I don't want/need that). Thanks in advance!
You can define this with placement constraints:
version: '3.8'
services:
  manager:
    hostname: 'manager'
    image: traefik
    deploy:
      placement:
        max_replicas_per_node: 1
        constraints: [node.role == manager]
  service:
    image: service
    deploy:
      mode: replicated
      replicas: 5
      placement:
        constraints: [node.role == worker]

ClusterJ cannot connect to dockerized Mysql cluster from outside the container

I have setup MySQL cluster on my PC using mysql/mysql-cluster image on docker hub, and it starts up fine. However when I try to connect to the cluster from outside docker (via the host machine) using clusterJ it doesn't connect.
Initially I was getting the following error: Could not alloc node id at 127.0.0.1 port 1186: No free node id found for mysqld(API)
So I created a custom mysql-cluster.cnf, very similar to the one distributed with the docker image, but with a new api endpoint:
[ndbd default]
NoOfReplicas=2
DataMemory=80M
IndexMemory=18M
[ndb_mgmd]
NodeId=1
hostname=192.168.0.2
datadir=/var/lib/mysql
[ndbd]
NodeId=2
hostname=192.168.0.3
datadir=/var/lib/mysql
[ndbd]
NodeId=3
hostname=192.168.0.4
datadir=/var/lib/mysql
[mysqld]
NodeId=4
hostname=192.168.0.10
[api]
This is the configuration used for clusterJ setup:
com.mysql.clusterj.connect:
  host: 127.0.0.1:1186
  database: my_db
Here is the docker-compose config:
version: '3'
services:
  # Sets up the MySQL cluster ndb_mgmd process
  database-manager:
    image: mysql/mysql-cluster
    networks:
      database_net:
        ipv4_address: 192.168.0.2
    command: ndb_mgmd
    ports:
      - "1186:1186"
    volumes:
      - /c/Users/myuser/conf/mysql-cluster.cnf:/etc/mysql-cluster.cnf
  # Sets up the first MySQL cluster data node
  database-node-1:
    image: mysql/mysql-cluster
    networks:
      database_net:
        ipv4_address: 192.168.0.3
    command: ndbd
    depends_on:
      - database-manager
  # Sets up the second MySQL cluster data node
  database-node-2:
    image: mysql/mysql-cluster
    networks:
      database_net:
        ipv4_address: 192.168.0.4
    command: ndbd
    depends_on:
      - database-manager
  # Sets up the first MySQL server process
  database-server:
    image: mysql/mysql-cluster
    networks:
      database_net:
        ipv4_address: 192.168.0.10
    environment:
      - MYSQL_ALLOW_EMPTY_PASSWORD=true
      - MYSQL_DATABASE=my_db
      - MYSQL_USER=my_user
    command: mysqld
networks:
  database_net:
    ipam:
      config:
        - subnet: 192.168.0.0/16
When I try to connect to the cluster I get the following error: '127.0.0.1:1186' nodeId 0; Return code: -1 error code: 0 message: .
I can see that the app running ClusterJ is registered to the cluster, but then it disconnects. Here is an excerpt from the Docker MySQL manager logs:
database-manager_1 | 2018-05-10 11:18:43 [MgmtSrvr] INFO -- Node 3: Communication to Node 4 opened
database-manager_1 | 2018-05-10 11:22:16 [MgmtSrvr] INFO -- Alloc node id 6 succeeded
database-manager_1 | 2018-05-10 11:22:16 [MgmtSrvr] INFO -- Nodeid 6 allocated for API at 10.0.2.2
Any help solving this issue would be much appreciated.
Here is how ndb_mgmd handles the request to start the ClusterJ application.
You connect to the MGM server on port 1186. In this connection you
will get the configuration. This configuration contains the IP addresses
of the data nodes. To connect to the data nodes ClusterJ will try to
connect to 192.168.0.3 and 192.168.0.4. Since ClusterJ is outside Docker,
I presume those addresses point to some different place.
The management server will also provide a dynamic port to use when
connecting to the NDB data node. It is a lot easier to manage this
by setting ServerPort for NDB data nodes. I usually use 11860 as
ServerPort, 2202 is also popular to use.
I am not sure how you mix a Docker environment with an external
environment. I assume it is possible to solve somehow by setting
up proper IP translation tables in the correct places.

Linking Containers in POD in K8S

I want to link my selenium/hub container to my chrome and firefox node containers in a POD.
In docker, it was easily defined in the docker compose yaml file.
I want to know how to achieve this linking in kubernetes.
This is what appears in the log (the error screenshot is not included here), followed by my Pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: mytestingpod
spec:
  containers:
    - name: seleniumhub
      image: selenium/hub
      ports:
        - containerPort: 4444
          hostPort: 4444
    - name: chromenode
      image: selenium/node-chrome-debug
      ports:
        - containerPort: 5901
      links: seleniumhub:hub
    - name: firefoxnode
      image: selenium/node-firefox-debug
      ports:
        - containerPort: 5902
      links: seleniumhub:hub
You don't need to link them. The way Kubernetes works, all the containers in the same Pod are already on the same networking namespace, meaning that they can just talk to each other through localhost and the right port.
The applications in a pod all use the same network namespace (same IP and port space), and can thus “find” each other and communicate using localhost. Because of this, applications in a pod must coordinate their usage of ports. Each pod has an IP address in a flat shared networking space that has full communication with other physical computers and pods across the network.
If you want to access the chromenode container from the seleniumhub container, just send a request to localhost:5901.
If you want to access the seleniumhub container from the chromenode container, just send a request to localhost:4444.
Simply use kompose, described in "Translate a Docker Compose File to Kubernetes Resources": it will translate your docker-compose.yml file into Kubernetes YAML files.
You will then see how the selenium/hub container declaration is translated into Kubernetes config files.
Note, though, that Docker links are obsolete.
Try instead to follow the Kubernetes examples/selenium.
The way you connect applications with Kubernetes is through a service:
See "Connecting Applications with Services".

How does service discovery work with modern docker/docker-compose?

I'm using Docker 1.11.1 and docker-compose 1.8.0-rc2.
In the good old days (so, last year), you could set up a docker-compose.yml file like this:
app:
  image: myapp
frontend:
  image: myfrontend
  links:
    - app
And then start up the environment like this:
docker-compose scale app=3 frontend=1
And your frontend container could inspect the environment variables
for variables named APP_1_PORT, APP_2_PORT, etc to discover the
available backend hosts and configure itself accordingly.
Times have changed. Now, we do this...
version: '2'
services:
  app:
    image: myapp
  frontend:
    image: myfrontend
    links:
      - app
...and instead of environment variables, we get DNS. So inside the
frontend container, I can ask for app_app_1 or app_app_2 or
app_app_3 and get the corresponding ip address. I can also ask for
app and get the address of app_app_1.
But how do I discover all of the available backend containers? I
guess I could loop over getent hosts ... until it fails:
counter=1
while :; do
    getent hosts app_$counter || break
    backends="$backends app_$counter"
    let counter++
done
But that seems ugly and fragile.
I've heard rumors about round-robin dns, but (a) that doesn't seem to
be happening in my test environment, and (b) that doesn't necessarily
help if your frontend needs simultaneous connections to the backends.
How is simple container and service discovery meant to work in the
modern Docker world?
Docker's built-in Nameserver & Loadbalancer
Docker comes with a built-in nameserver. The server is, by default, reachable via 127.0.0.11:53.
Every container has by default a nameserver entry in /etc/resolv.conf, so it is not required to specify the address of the nameserver from within the container. That is why you can find your service from within the network with service or task_service_n.
If you do task_service_n then you will get the address of the corresponding service replica.
If you only ask for the service name, Docker performs internal load balancing between the containers in the same network, and external load balancing to handle requests from outside.
When swarm is used, docker will additionally use two special networks.
The ingress network, which is actually an overlay network, handles incoming traffic to the swarm. It allows any service to be queried from any node in the swarm.
The docker_gwbridge, a bridge network, connects the overlay networks (including ingress) of the individual hosts to their physical network.
When using swarm to deploy services, the behavior described in the examples below will not work unless endpoint_mode is set to dnsrr (DNS round-robin) instead of vip.
endpoint_mode: vip - Docker assigns the service a virtual IP (VIP) that acts as the front end for clients to reach the service on a network. Docker routes requests between the client and available worker nodes for the service, without client knowledge of how many nodes are participating in the service or their IP addresses or ports. (This is the default.)
endpoint_mode: dnsrr - DNS round-robin (DNSRR) service discovery does not use a single virtual IP. Docker sets up DNS entries for the service such that a DNS query for the service name returns a list of IP addresses, and the client connects directly to one of these. DNS round-robin is useful in cases where you want to use your own load balancer, or for Hybrid Windows and Linux applications.
Example
For example deploy three replicas from dig/docker-compose.yml
version: '3.8'
services:
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
DNS Lookup
You can use tools such as dig or nslookup to do a DNS lookup against the nameserver in the same network.
docker run --rm --network dig_default tutum/dnsutils dig whoami
; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> whoami
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58433
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;whoami. IN A
;; ANSWER SECTION:
whoami. 600 IN A 172.28.0.3
whoami. 600 IN A 172.28.0.2
whoami. 600 IN A 172.28.0.4
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Mon Nov 16 22:36:37 UTC 2020
;; MSG SIZE rcvd: 90
If you are only interested in the IP, you can provide the +short option
docker run --rm --network dig_default tutum/dnsutils dig +short whoami
172.28.0.3
172.28.0.4
172.28.0.2
Or look for specific service
docker run --rm --network dig_default tutum/dnsutils dig +short dig_whoami_2
172.28.0.4
Load balancing
The default load balancing happens on the transport layer (layer 4 of the OSI model), so it is TCP/UDP based. That means it is not possible to inspect or manipulate HTTP headers with this method. In the Enterprise Edition it is apparently possible to use labels similar to the ones Traefik uses in the example further down.
docker run --rm --network dig_default curlimages/curl -Ls http://whoami
Hostname: eedc94d45bf4
IP: 127.0.0.1
IP: 172.28.0.3
RemoteAddr: 172.28.0.5:43910
GET / HTTP/1.1
Host: whoami
User-Agent: curl/7.73.0-DEV
Accept: */*
Here is the hostname from 10 times curl:
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: 42312c03a825
Hostname: eedc94d45bf4
Hostname: d922d86eccc6
Hostname: d922d86eccc6
Hostname: eedc94d45bf4
Hostname: 42312c03a825
Hostname: d922d86eccc6
Health Checks
Health checks, by default, are done by checking the process id (PID) of the container on the host kernel. If the process is running successfully, the container is considered healthy.
Oftentimes other health checks are required. The container may be running but the application inside has crashed. In many cases a TCP or HTTP check is preferred.
It is possible to bake custom health checks into images, for example by using curl to perform L7 health checks.
FROM traefik/whoami
HEALTHCHECK CMD curl --fail http://localhost || exit 1
It is also possible to specify the health check via cli when starting the container.
docker run \
    --health-cmd "curl --fail http://localhost || exit 1" \
    --health-interval=5s \
    --health-timeout=3s \
    traefik/whoami
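If an image has no curl, the same kind of check can be written as a small script. The following is a minimal Python sketch, assuming a Python interpreter is available inside the image (which is not a given for the images used here). The function name http_health_exit_code is hypothetical, and the opener parameter exists only so the function can be tried without a running server.

```python
import urllib.error
import urllib.request


def http_health_exit_code(url: str, timeout: float = 3.0,
                          opener=urllib.request.urlopen) -> int:
    """Return 0 if `url` answers successfully, 1 otherwise.

    Mirrors the intent of `curl --fail <url> || exit 1`: urlopen raises
    HTTPError (a URLError subclass) for 4xx/5xx responses, and URLError or
    OSError when the connection cannot be made.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 400 else 1
    except (urllib.error.URLError, OSError):
        return 1
```

A script that ends with sys.exit(http_health_exit_code("http://localhost")) could then be used as the health command in place of the curl call.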
Example with Swarm
As initially mentioned, Swarm's behavior is different in that it assigns a virtual IP to services by default. Actually, it is not really different: plain docker or docker-compose doesn't create real services; it just imitates Swarm's behavior while running the containers normally, since services can only be created on manager nodes.
Keep in mind that we are on a swarm manager, so the default mode is VIP.
Create an overlay network that can also be used by regular containers:
$ docker network create --driver overlay --attachable testnet
Create a service with 2 replicas:
$ docker service create --network testnet --replicas 2 --name digme nginx
Now let's use dig again, making sure we attach the container to the same network:
$ docker run --network testnet --rm tutum/dnsutils dig digme
digme. 600 IN A 10.0.18.6
We see that indeed we only got one IP address back, so it appears that this is the virtual IP that has been assigned by docker.
Swarm actually lets us get the individual IPs in this case without explicitly setting the endpoint mode.
We can query for tasks.<servicename>, which in this case is tasks.digme:
$ docker run --network testnet --rm tutum/dnsutils dig tasks.digme
tasks.digme. 600 IN A 10.0.18.7
tasks.digme. 600 IN A 10.0.18.8
This has brought us 2 A records pointing to the individual replicas.
Now let's create another service with the endpoint mode set to DNS round-robin:
$ docker service create --endpoint-mode dnsrr --network testnet --replicas 2 --name digme2 nginx
$ docker run --network testnet --rm tutum/dnsutils dig digme2
digme2. 600 IN A 10.0.18.21
digme2. 600 IN A 10.0.18.20
This way we get both IPs without adding the prefix tasks.
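From application code, the same lookups can be done with the standard library instead of dig. Here is a minimal Python sketch, assuming it runs in a container attached to testnet. The helper name discover_backends is hypothetical, and the resolver parameter is there only so the function can be exercised without a swarm.

```python
import socket


def discover_backends(name: str, port: int = 80,
                      resolver=socket.getaddrinfo) -> list[str]:
    """Return the de-duplicated IPv4 addresses that `name` resolves to.

    For a VIP service, "tasks.<service>" yields one address per task;
    for a dnsrr service the plain service name already does.
    An unresolvable name gives an empty list.
    """
    try:
        infos = resolver(name, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

Calling discover_backends("tasks.digme") or discover_backends("digme2") from inside the network should return the addresses dig printed above. This is a one-time lookup, so a frontend that needs an up-to-date list would have to repeat it periodically.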
Service Discovery & Loadbalancing Strategies
If the built-in features are not sufficient, some strategies can be implemented to achieve better control. Below are some examples.
HAProxy
HAProxy can use the Docker nameserver in combination with dynamic server templates to discover the running containers. The traditional proxy features can then be leveraged to achieve powerful layer 7 load balancing, with HTTP header manipulation and chaos engineering features such as retries.
version: '3.8'
services:
  loadbalancer:
    image: haproxy
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - 80:80
      - 443:443
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
...
resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s
...
backend whoami
    balance leastconn
    option httpchk
    option redispatch 1
    retry-on all-retryable-errors
    retries 2
    http-request disable-l7-retry if METH_POST
    dynamic-cookie-key MY_SERVICES_HASHED_ADDRESS
    cookie MY_SERVICES_HASHED_ADDRESS insert dynamic
    server-template whoami- 6 whoami:80 check resolvers docker init-addr libc,none
...
Traefik
The previous method is already pretty decent. However, you may have noticed that it requires knowing which services should be discovered, and the number of replicas to discover is hard-coded. Traefik, a container-native edge router, solves both problems. As long as we enable Traefik via a label, the service will be discovered. This decentralizes the configuration: it is as if each service registers itself.
The label can also be used to inspect and manipulate http headers.
version: "3.8"
services:
  traefik:
    image: "traefik:v2.3"
    command:
      - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
  whoami:
    image: "traefik/whoami"
    labels:
      - "traefik.enable=true"
      - "traefik.port=80"
      - "traefik.http.routers.whoami.entrypoints=web"
      - "traefik.http.routers.whoami.rule=PathPrefix(`/`)"
      - "traefik.http.services.whoami.loadbalancer.sticky=true"
      - "traefik.http.services.whoami.loadbalancer.sticky.cookie.name=MY_SERVICE_ADDRESS"
    deploy:
      replicas: 3
Consul
Consul is a tool for service discovery and configuration management. Services have to be registered via API request. It is a more complex solution that probably only makes sense in bigger clusters, but it can be very powerful. It is usually recommended to run it on bare metal and not in a container. You could install it alongside the Docker host on each server in your cluster.
In this example it has been paired with the registrator image, which takes care of registering the Docker services in Consul's catalog.
The catalog can be leveraged in many ways. One of them is to use consul-template.
Note that consul comes with its own DNS resolver so in this instance the docker DNS resolver is somewhat neglected.
version: '3.8'
services:
  consul:
    image: gliderlabs/consul-server:latest
    command: "-advertise=${MYHOST} -server -bootstrap"
    container_name: consul
    hostname: ${MYHOST}
    ports:
      - 8500:8500
  registrator:
    image: gliderlabs/registrator:latest
    command: "-ip ${MYHOST} consul://${MYHOST}:8500"
    container_name: registrator
    hostname: ${MYHOST}
    depends_on:
      - consul
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock
  proxy:
    build: .
    ports:
      - 80:80
    depends_on:
      - consul
  whoami:
    image: "traefik/whoami"
    deploy:
      replicas: 3
    ports:
      - "80"
Dockerfile for a custom proxy image with consul-template baked in:
FROM nginx
RUN curl https://releases.hashicorp.com/consul-template/0.25.1/consul-template_0.25.1_linux_amd64.tgz \
    > consul-template_0.25.1_linux_amd64.tgz
RUN gunzip -c consul-template_0.25.1_linux_amd64.tgz | tar xvf -
RUN mv consul-template /usr/sbin/consul-template
RUN rm /etc/nginx/conf.d/default.conf
ADD proxy.conf.ctmpl /etc/nginx/conf.d/
ADD consul-template.hcl /
CMD [ "/bin/bash", "-c", "/etc/init.d/nginx start && consul-template -config=consul-template.hcl" ]
Consul-template takes a template file and renders it according to the content of Consul's catalog.
upstream whoami {
    {{ range service "whoami" }}
    server {{ .Address }}:{{ .Port }};
    {{ end }}
}
server {
    listen 80;
    location / {
        proxy_pass http://whoami;
    }
}
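To see what the range loop produces, here is a minimal Python sketch that renders the same upstream block from a list of instances. It is not consul-template itself, only an illustration of the output shape. The function name render_upstream and the addresses in the example are hypothetical.

```python
def render_upstream(name: str, instances: list[tuple[str, int]]) -> str:
    """Render an nginx upstream block with one `server` line per instance.

    `instances` stands in for what `range service "<name>"` iterates over
    in the template: one (address, port) pair per registered instance.
    """
    lines = [f"upstream {name} {{"]
    lines += [f"    server {address}:{port};" for address, port in instances]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical catalog entries for three whoami replicas.
print(render_upstream("whoami", [
    ("10.0.0.5", 32768),
    ("10.0.0.5", 32769),
    ("10.0.0.5", 32770),
]))
```

Each time the catalog changes, consul-template re-renders the file, so the number of server lines follows the number of registered instances without a hard-coded count.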
After the template has been re-rendered, the reload command is executed.
consul {
    address = "consul:8500"
    retry {
        enabled = true
        attempts = 12
        backoff = "250ms"
    }
}
template {
    source = "/etc/nginx/conf.d/proxy.conf.ctmpl"
    destination = "/etc/nginx/conf.d/proxy.conf"
    perms = 0600
    command = "/etc/init.d/nginx reload"
    command_timeout = "60s"
}
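Feature Table
|                   | Built In  | HAProxy          | Traefik      | Consul-Template     |
|-------------------|-----------|------------------|--------------|---------------------|
| Resolver          | Docker    | Docker           | Docker       | Consul              |
| Service Discovery | Automatic | Server Templates | Label System | KV Store + Template |
| Health Checks     | Yes       | Yes              | Yes          | Yes                 |
| Load Balancing    | L4        | L4, L7           | L4, L7       | L4, L7              |
| Sticky Session    | No        | Yes              | Yes          | Depends on proxy    |
| Metrics           | No        | Stats Page       | Dashboard    | Dashboard           |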
You can view some of the code samples in more detail on github.
