Docker Stack Swarm - Service Replicas are not spread for Multi Service Stack

I have deployed a stack of 4 services on two hosts (docker-compose file version 3).
The services are Elasticsearch, Kibana, Redis, Visualizer and finally my Web App. I haven't set any resource restrictions yet.
I spun up two virtual hosts via docker-machine, one with 2GB and one with 1GB of memory.
Then I increased the replicas of my web app to 2, which resulted in the following distribution:
Host1 (Master):
Kibana, Redis, Web App, Visualizer, Web App
Host2 (Worker):
Elasticsearch
Why is the Swarm manager scheduling both Web App containers on the same host? Wouldn't it be smarter if the Web App replicas were distributed across both hosts?
Besides node tagging, I couldn't find any other way in the docs to influence the distribution.
Am I missing something?
Thanks
Bjorn
docker-compose.yml
version: "3"
services:
visualizer:
image: dockersamples/visualizer:stable
ports:
- "8080:8080"
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
deploy:
placement:
constraints: [node.role == manager]
networks:
- webnet
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:5.4.3
environment:
ES_JAVA_OPTS: -Xms1g -Xmx1g
ulimits:
memlock: -1
nofile:
hard: 65536
soft: 65536
nproc: 65538
deploy:
resources:
limits:
cpus: "0.5"
memory: 1g
volumes:
- esdata:/usr/share/elasticsearch/data
ports:
- 9200:9200
- 9300:9300
networks:
- webnet
web:
# replace username/repo:tag with your name and image details
image: bjng/workinseason:swarm
deploy:
replicas: 2
restart_policy:
condition: on-failure
ports:
- "80:6000"
networks:
- webnet
kibana:
image: docker.elastic.co/kibana/kibana:5.4.3
deploy:
placement:
constraints: [node.role == manager]
ports:
- "5601:5601"
networks:
- webnet
redis:
image: "redis:alpine"
networks:
- webnet
volumes:
esdata:
driver: local
networks:
webnet:

Docker schedules tasks (containers) based on available resources; if two nodes have enough resources, the container can be scheduled on either one.
Recent versions of Docker use "HA" scheduling by default, which means that tasks for the same service are spread over multiple nodes, if possible (see this pull request: https://github.com/docker/swarmkit/pull/1446).
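If you want to push the scheduler toward spreading replicas rather than relying on the default behaviour, newer compose file versions also expose placement controls. Below is a minimal sketch, not taken from the question above; it assumes compose file version 3.8 or later (required for max_replicas_per_node) and reuses the web service name:
version: "3.8"
services:
  web:
    image: bjng/workinseason:swarm
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
      placement:
        # Allow at most one task of this service per node, so the two
        # replicas must land on different hosts (the second task stays
        # pending if only one eligible node exists).
        max_replicas_per_node: 1
        # Alternatively, spread tasks evenly across values of a node label:
        # preferences:
        #   - spread: node.labels.datacenter
    networks:
      - webnet
networks:
  webnet: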

Related

Docker swarm with reverse proxy, run requests based on request uri path to certain node

I have the following nodes with hostnames docker-php-pos-web-1, docker-php-pos-web-2, docker-php-pos-web-3, and docker-php-pos-web-4 in a Docker Swarm cluster, with Caddy proxy configured in distributed mode.
I want requests with cron anywhere in the URL path to run on docker-php-pos-web-4. An example request would be demo.phppointofsale.com/index.php/ecommerce/cron. If "cron" is not in the URL, it would route as normal.
I want to avoid having 2 copies of production_php_point_of_sale_app just for this.
I am already routing to docker-php-pos-web-4 from my load balancer when "cron" is in the request path, but in Docker Swarm the mesh network can decide which node actually runs the request. I always want docker-php-pos-web-4 to run these tasks.
Below is my docker-compose.yml file:
version: '3.9'
services:
  production_php_point_of_sale_app:
    logging:
      driver: "local"
    deploy:
      restart_policy:
        condition: any
      mode: global
      labels:
        caddy: "http://*.phppointofsale.com, http://*.phppos.com"
        caddy.reverse_proxy.trusted_proxies: "private_ranges"
        caddy.reverse_proxy: "{{upstreams}}"
    image: phppointofsale/production-app
    build:
      context: "production_php_point_of_sale_app"
    restart: always
    env_file:
      - production_php_point_of_sale_app/.env
      - .env
    networks:
      - app_network
      - mail
  caddy_server:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    ports:
      - 80:80
    networks:
      - caddy_controller
      - app_network
    environment:
      - CADDY_DOCKER_MODE=server
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
    volumes:
      - caddy_data:/data
    deploy:
      restart_policy:
        condition: any
      mode: global
      labels:
        caddy_controlled_server:
  caddy_controller:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    networks:
      - caddy_controller
      - app_network
    environment:
      - CADDY_DOCKER_MODE=controller
      - CADDY_CONTROLLER_NETWORK=10.200.200.0/24
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      restart_policy:
        condition: any
      placement:
        constraints: [node.role == manager]
networks:
  caddy_controller:
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: "10.200.200.0/24"
  app_network:
    driver: overlay
  mail:
    driver: overlay
volumes:
  caddy_data: {}
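For reference, the Swarm-native way to pin work to a specific node is a placement constraint on node.hostname. The sketch below only illustrates that mechanism on a hypothetical cron-only service, which is exactly the duplication the question hopes to avoid, so treat it as an illustration of the syntax rather than as the answer:
version: '3.9'
services:
  production_php_point_of_sale_app_cron:
    # Hypothetical cron-only variant of the app, not part of the question's stack.
    image: phppointofsale/production-app
    networks:
      - app_network
    deploy:
      replicas: 1
      placement:
        # Tasks of this service may only be scheduled on this exact node.
        constraints:
          - node.hostname == docker-php-pos-web-4
networks:
  app_network:
    driver: overlay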

Traefik 2 Gateway timeout when attempting to proxy to a container that is on two networks

There are plenty of questions on this front, but they use a more complex docker-compose.yml than I do, so I fear they may have misconfigurations in their compose files, such as this one:
Traefik 2 Gateway Timeout
Within a single docker-compose.yml, I am trying to keep a database container on its own network, an app container on both the database network and the Traefik network, with the Traefik network itself managed elsewhere by Traefik.
version: '3.9'
services:
  wordpress:
    image: wordpress:6.1
    container_name: dev-wp1
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 256M
    restart: always
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: dev
      WORDPRESS_DB_PASSWORD: dev
      WORDPRESS_DB_NAME: dev
    volumes:
      - /opt/container_config/exampledomain.local/wp:/var/www/html
    networks:
      - traefik-network
      - db-network
    labels:
      - traefik.enable=true
      - traefik.http.routers.dev-wp1.rule=Host(`exampledomain.local`)
      - traefik.http.routers.dev-wp1.entrypoints=websecure
      - traefik.http.routers.dev-wp1.tls=true
  db:
    image: mariadb:10.10
    container_name: dev-db1
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 256M
    restart: always
    environment:
      MYSQL_DATABASE: dev
      MYSQL_USER: dev
      MYSQL_PASSWORD: dev
      MYSQL_RANDOM_ROOT_PASSWORD: '1'
    volumes:
      - /opt/container_config/exampledomain.local/db:/var/lib/mysql
    networks:
      - db-network
networks:
  db-network:
    name: db-network
  traefik-network:
    name: traefik-network
    external: true
Attempting to hit exampledomain.local fails.
If I eliminate the db-network and place the database on the traefik network, resolution to exampledomain.local works fine. I do not wish to expose the ports of the wp1 container and would like Traefik's ports to be the only ones published on the host. I would also prefer not to have the db container on the traefik-network. What am I missing?
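One detail worth checking, offered as a hedged suggestion rather than a confirmed fix: when a routed container sits on more than one network, Traefik's Docker provider has to be told which network to use to reach it, otherwise it may pick db-network and time out. That is done with the traefik.docker.network label on the wordpress service above:
  wordpress:
    # ...same service definition as above...
    labels:
      - traefik.enable=true
      # Tell Traefik to reach this container over the shared proxy network,
      # not over db-network.
      - traefik.docker.network=traefik-network
      - traefik.http.routers.dev-wp1.rule=Host(`exampledomain.local`)
      - traefik.http.routers.dev-wp1.entrypoints=websecure
      - traefik.http.routers.dev-wp1.tls=true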

docker stack: Redis not working on worker node

I just worked through the Docker documentation and created two instances on AWS (http://13.127.150.218, http://13.235.134.73). The first one is the manager and the second one is the worker. The following is the compose file I used to deploy:
version: "3"
services:
web:
# replace username/repo:tag with your name and image details
image: username/repo:tag
deploy:
replicas: 5
restart_policy:
condition: on-failure
resources:
limits:
cpus: "0.1"
memory: 50M
ports:
- "80:80"
networks:
- webnet
visualizer:
image: dockersamples/visualizer:stable
ports:
- "8080:8080"
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
deploy:
placement:
constraints: [node.role == manager]
networks:
- webnet
redis:
image: redis
ports:
- "6379:6379"
volumes:
- "/home/docker/data:/data"
deploy:
placement:
constraints: [node.role == manager]
command: redis-server --appendonly yes
networks:
- webnet
networks:
webnet:
Here the redis service has a constraint that restricts it to run only on the manager node. Now my question is: how is the web service on the worker instance supposed to use the redis service?
You need to set the hostname parameter on every container, so you can use this value to reach services on the manager from the worker, and vice versa.
version: "3"
services:
web:
# replace username/repo:tag with your name and image details
image: username/repo:tag
hostname: "web"
deploy:
replicas: 5
restart_policy:
condition: on-failure
resources:
limits:
cpus: "0.1"
memory: 50M
ports:
- "80:80"
networks:
- webnet
visualizer:
image: dockersamples/visualizer:stable
hostname: "visualizer"
ports:
- "8080:8080"
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
deploy:
placement:
constraints: [node.role == manager]
networks:
- webnet
redis:
image: redis
hostname: "redis"
ports:
- "6379:6379"
volumes:
- "/home/docker/data:/data"
deploy:
placement:
constraints: [node.role == manager]
command: redis-server --appendonly yes
networks:
- webnet
networks:
webnet:
Additionally, if you use Portainer instead of the visualizer, you can control your Swarm stack with more options:
https://hub.docker.com/r/portainer/portainer
BR,
Carlos
Consider the stack file from the question above.
Regardless of where a service is placed (manager or worker), all the services in the stack file that are on the same network can use the embedded DNS functionality, which resolves each service by the service name defined in the stack file.
In this case the web service reaches the redis service by its service name.
For example, a ping run from within the container associated with the redis service is able to resolve the service name web.
Read more about Swarm native service discovery to understand this.
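To make the service-name lookup concrete, the application only has to be pointed at the name redis on the shared overlay network. A minimal sketch follows; the REDIS_HOST/REDIS_PORT variables are hypothetical stand-ins for however the web image actually reads its Redis address:
version: "3"
services:
  web:
    image: username/repo:tag
    environment:
      # The embedded DNS on the webnet overlay resolves "redis" to the
      # redis service's virtual IP from any node in the swarm.
      REDIS_HOST: redis
      REDIS_PORT: "6379"
    networks:
      - webnet
  redis:
    image: redis
    networks:
      - webnet
networks:
  webnet: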

Netdata in a docker swarm environment

I'm quite new to Netdata and also Docker Swarm. I ran Netdata for a while on single hosts, but I am now trying to stream Netdata from workers to a manager node in a Swarm environment, where the manager should also act as a central Netdata instance. I'm aiming to monitor the data from the manager only.
Here's my compose file for the stack:
version: '3.2'
services:
  netdata-client:
    image: titpetric/netdata
    hostname: "{{.Node.Hostname}}"
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    environment:
      - NETDATA_STREAM_DESTINATION=control:19999
      - NETDATA_STREAM_API_KEY=1x214ch15h3at1289y
      - PGID=999
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - netdata
    deploy:
      mode: global
      placement:
        constraints: [node.role == worker]
  netdata-central:
    image: titpetric/netdata
    hostname: control
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    environment:
      - NETDATA_API_KEY_ENABLE_1x214ch15h3at1289y=1
    ports:
      - '19999:19999'
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - netdata
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
networks:
  netdata:
    driver: overlay
    attachable: true
Netdata on the manager works fine, and the container runs on the one worker node I'm testing on. According to the log output it seems to run well and gathers the names of the running Docker containers, as it does in a local environment.
The problem is that it can't connect to the netdata-central service running on the manager.
This is the error message:
2019-01-04 08:35:28: netdata INFO : STREAM_SENDER[7] : STREAM 7 [send to control:19999]: connecting...,
2019-01-04 08:35:28: netdata ERROR : STREAM_SENDER[7] : Cannot resolve host 'control', port '19999': Name or service not known,
I'm not sure why it can't resolve the hostname; I thought it should work that way on the overlay network. Maybe there's a better way to connect that doesn't rely on the hostname?
Any help is appreciated.
EDIT: as this question might come up: the firewall (ufw) on the control host is inactive; also, I think the error message clearly points to a problem with name resolution.
Your API key is in the wrong format; it has to be a GUID. You can generate one with the uuidgen command:
https://github.com/netdata/netdata/blob/63c96aa96f96f3aea10bdcd2ecd92c889f26b3af/conf.d/stream.conf#L7
In the latest image the environment variables do not work.
The solution is to create a configuration file for the stream.
My working compose file is:
version: '3.7'
configs:
  netdata_stream_master:
    file: $PWD/stream-master.conf
  netdata_stream_client:
    file: $PWD/stream-client.conf
services:
  netdata-client:
    image: netdata/netdata:v1.21.1
    hostname: "{{.Node.Hostname}}"
    depends_on:
      - netdata-central
    configs:
      - mode: 444
        source: netdata_stream_client
        target: /etc/netdata/stream.conf
    security_opt:
      - apparmor:unconfined
    environment:
      - PGID=999
    volumes:
      - /proc:/host/proc:ro
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /sys:/host/sys:ro
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      mode: global
  netdata-central:
    image: netdata/netdata:v1.21.1
    hostname: control
    configs:
      - mode: 444
        source: netdata_stream_master
        target: /etc/netdata/stream.conf
    security_opt:
      - apparmor:unconfined
    environment:
      - PGID=999
    ports:
      - '19999:19999'
    volumes:
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
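The contents of the two referenced files are not shown above. As a rough sketch of what they typically contain in Netdata's stream.conf format (the GUID is only a placeholder; generate a real one with uuidgen, as noted in the previous answer):
# stream-client.conf (mounted into netdata-client): push metrics to the parent
[stream]
    enabled = yes
    destination = control:19999
    api key = 4b3f29a0-0d1c-4f8e-9b1a-7c2d5e6f8a90

# stream-master.conf (mounted into netdata-central): accept streams for that API key
[4b3f29a0-0d1c-4f8e-9b1a-7c2d5e6f8a90]
    enabled = yes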

docker swarm list dependencies of a service

Let's say we have the following stack file:
version: "3"
services:
ubuntu:
image: ubuntu
deploy:
replicas: 2
restart_policy:
condition: on-failure
resources:
limits:
cpus: "0.1"
memory: 50M
entrypoint:
- tail
- -f
- /dev/null
logging:
driver: "json-file"
ports:
- "80:80"
networks:
- webnet
web:
image: httpd
ports:
- "8080:8080"
hostname: "apache"
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
deploy:
placement:
constraints: [node.role == manager]
resources:
limits:
memory: 32M
reservations:
memory: 16M
depends_on:
- "ubuntu"
networks:
- webnet
networks:
webnet:
When I run docker service inspect mystack_web, the generated output does not show any reference to the depends_on entry.
Is that okay? And how can I print the dependencies of a given Docker service?
depends_on isn't used by Docker Swarm:
The depends_on option is ignored when deploying a stack in swarm mode with a version 3 compose file. - from Docker Docs
Another good explanation on GitHub:
depends_on is a no-op when used with docker stack deploy. Swarm mode services are restarted when they fail, so there's no reason to delay their startup. Even if they fail a few times, they will eventually recover. - from GitHub
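If your application still needs its dependency to be reachable before it does useful work, the usual Swarm-friendly substitute is a healthcheck combined with the restart policy, so a task that comes up too early fails its check and is simply replaced. A minimal sketch against the stack above; the probe command is illustrative only and assumes a shell plus wget inside the image, which the stock httpd image may not include:
services:
  web:
    image: httpd
    healthcheck:
      # Hypothetical probe: mark the task healthy only once Apache answers locally.
      test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:80/ || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
    deploy:
      restart_policy:
        condition: on-failure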
