Docker Swarm: traffic in assigned state - docker-swarm

When I scale a service up from 1 node (Node A) to 2 nodes (Node A and Node B), I see traffic immediately being routed to both nodes (including the new Node B even though it isn't ready).
As a result, an Nginx proxy will return 502s half the time (until Node B is ready).
Any suggestions on how to delay this traffic?
Note: this isn't about waiting for another container to come up, as covered in Docker Compose wait for container X before starting Y.
This is about delaying the network connection until the container is ready.

If you do not configure a healthcheck section, Docker assumes that the container is available as soon as it is started.
Note that the initial healthcheck is only done after the set interval.
So you could add something extremely basic, like testing whether port 80 accepts connections (you need nc in your Docker image):
healthcheck:
  test: nc -w 1 127.0.0.1 80 < /dev/null
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 5s
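If you want to confirm that Swarm really holds traffic back until the first successful check, you can watch the health state directly. This is only a minimal sketch; the service name web and the container id are hypothetical placeholders:
# list the service's tasks and where they run; with a healthcheck defined,
# a task only receives traffic once its container reports "healthy"
docker service ps web

# on the node running a task, check the container's health state
# (it moves from "starting" to "healthy" or "unhealthy")
docker inspect --format '{{.State.Health.Status}}' <container-id>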

Related

Don't spin up dependent containers if dependency container healthcheck results in unhealthy

I have 3 services defined in docker-compose: A, B and C. B and C depend on A. If A's healthcheck results in unhealthy, I don't want the other two containers to be spun up. Currently, containers B and C are started only when container A reaches healthy status, which is my expected behaviour. However, if container A becomes unhealthy after it is created, I don't want the dependent containers to be started; at the moment, when A is created and then becomes unhealthy, only one of the two dependent containers is blocked (sometimes B and other times C, but never both). Here's the output when A is unhealthy.
Creating A ... done
Creating C ... done
ERROR: for B  Container "1339a6d12091" is unhealthy.
ERROR: Encountered errors while bringing up the project.
As we can see in the ERROR message, the unhealthy container "1339a6d12091" (container A) is only reported for B, whereas the error should have been reported for both B and C.
docker-compose
version: '2.3'
services:
  B:
    image: base_image
    depends_on:
      A:
        condition: service_healthy
    entrypoint: bash -c "/app/entrypoint_scripts/execute_B.sh"
  C:
    image: base_image
    depends_on:
      A:
        condition: service_healthy
    entrypoint: bash -c "/app/entrypoint_scripts/execute_C.sh"
  A:
    image: base_image
    healthcheck:
      test: ["CMD-SHELL", "test -f /tmp/compliance/executedfetcher"]
      interval: 30s
      timeout: 3600s
      retries: 1
    entrypoint: bash -c "/app/entrypoint_scripts/execute_A.sh"
My expectation: B and C should wait for A to become healthy before starting (which is working fine for me). If A starts and becomes unhealthy without ever having been healthy, B and C should not start.
Container A was exiting immediately after becoming unhealthy; as a result, the unhealthy status was not available long enough for the other containers to read it. It was visible only for a fraction of a second, and in that fraction of a second only one of the dependent containers (either B or C) was able to read it.
I added a sleep 100 statement to the execute_A.sh entrypoint script, after the point where the container becomes unhealthy, so that both B and C can easily read A's status, and that fixed the issue.
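As a rough illustration of that fix (the actual contents of execute_A.sh are not shown in the question, so everything except the final sleep is a placeholder):
#!/bin/bash
# execute_A.sh (sketch)

# ... A's real work goes here; if it fails, /tmp/compliance/executedfetcher
# is never created and the healthcheck reports "unhealthy" ...

# keep the container alive long enough for B and C to observe A's
# "unhealthy" status before A exits
sleep 100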

haproxy not load balancing the test app in docker swarm

I have 3 VMs (VirtualBox), all of them set up to share a single VIP (192.168.100.200) via keepalived. I have one proxy and one test app on each VM (I am testing a high-availability scenario where losing one or two nodes keeps the setup going). keepalived is working correctly; it is just that the requests are not load-balanced, they always go to the same instance.
What is going wrong?
version: "3.8"
services:
# HAproxy
haproxy :
image : haproxy:2.3.2
container_name : haproxy
networks :
- app-net
ports :
- 80:80
volumes :
- /etc/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
deploy :
mode : global
restart_policy :
condition : on-failure
delay : 5s
max_attempts : 3
window : 120s
# Nginx test site
wwwsite :
image : nginxdemos/hello
networks :
- app-net
ports :
- 8080:80
deploy :
mode : global
networks :
app-net :
driver : overlay
name : app-net
attachable: true
haproxy.cfg
global
    stats socket /var/run/haproxy.stat mode 660 level admin
    stats timeout 30s
    user root
    group root

resolvers docker
    nameserver dns1 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 10s
    hold refused 10s
    hold nx 10s
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s

defaults
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    mode http

frontend fe_web
    mode http
    bind *:80
    default_backend nodes

backend nodes
    balance roundrobin
    server node1 192.168.100.201:8080 check
    server node2 192.168.100.202:8080 check
    server node3 192.168.100.203:8080 check

listen stats
    bind *:8081
    mode http
    stats enable
    stats uri /
    stats hide-version

How can I get my container to go from starting -> healthy

Background: my Docker container has a very long startup time, and it is hard to predict when it is done. When the health check kicks in, it may initially show 'unhealthy' because the startup is not yet finished. This may cause a restart or container removal by our automation tools.
My specific question is whether I can control my Docker container so that it shows 'starting' until the setup is ready, and have the health check somehow start immediately after that. Or is there any other recommendation on how to handle states in a good way using health checks?
Side question: I would love a reference on how state transitions are made and determined during container startup and health check initiation. I have tried googling how to determine Docker (container) states, but I can't find a good reference.
My specific question is if I can control my container so that it shows
'starting' until the setup is ready and that the health check can
somehow be started immediately after that?
I don't think that is possible with just Kubernetes or Docker.
Containers are not designed to tell the Docker daemon or Kubernetes that their internal setup is done.
If the application takes time to set up, you could play with the readiness and liveness probe options of Kubernetes.
You can indeed configure a readinessProbe to perform the initial check after a specific delay.
For example, to specify 120 seconds as the initial delay:
readinessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 120
  periodSeconds: 5
Same thing for livenessProbe:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
      - name: Custom-Header
        value: Awesome
  initialDelaySeconds: 120
  periodSeconds: 3
For Docker alone, which is less configurable, you can make it work with the --health-start-period parameter of the docker run subcommand:
--health-start-period : Start period for the container to initialize
before starting health-retries countdown
For example, you could specify a large value such as:
docker run --health-start-period=120s ...
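In practice the start period is combined with the other health flags of docker run. The sketch below is only illustrative: the image name is made up, and the check command simply reuses the nc test from the first answer (so it assumes nc exists inside the image):
docker run -d \
  --health-cmd='nc -w 1 127.0.0.1 80 < /dev/null' \
  --health-interval=30s \
  --health-timeout=10s \
  --health-retries=3 \
  --health-start-period=120s \
  my-slow-starting-image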
Here is my workaround. First, in docker-compose, set a long timeout; start_period + timeout should be greater than the maximum expected startup time, e.g.:
healthcheck:
  test: ["CMD", "python3", "appstatus.py", "500"]
  interval: 60s
  timeout: 900s
  retries: 2
  start_period: 30s
and then run a script which can wait (if needed) before returning a result. In the example above it is appstatus.py. The script contains something like:
import os
import sys
import time

timeout = int(sys.argv[1])   # maximum time (seconds) to wait for the app to become ready
t0 = time.time()
while True:
    time.sleep(2)
    if isReady():            # isReady() is the application-specific readiness test
        sys.exit(os.EX_OK)
    t = time.time() - t0
    if t > timeout:
        sys.exit(os.EX_SOFTWARE)

Docker healthcheck causes the container to crash

I have a customized RabbitMQ image that I am using with docker-compose (3.7) to launch a Docker cluster. This is necessary because of some peculiar issues when trying to deploy a cluster in Docker Swarm. The image has a shell script which runs on the primary and secondary nodes and makes the modifications needed to run a cluster. This involves stopping RabbitMQ and running rabbitmqctl commands to create the cluster between the two nodes. This configuration works flawlessly until I try to add a healthcheck. I have tried adding it to the image and adding it in the compose file; both cause the container to crash and constantly restart. I have the following shell script, which gets copied into the image:
#!/bin/bash
set -eo pipefail
# A RabbitMQ node is considered healthy if all the below are true:
# * the rabbit app finished booting & it's running
# * there are no alarms
# * there is at least 1 active listener
rabbitmqctl eval '
{ true, rabbit_app_booted_and_running } = { rabbit:is_booted(node()), rabbit_app_booted_and_running },
{ [], no_alarms } = { rabbit:alarms(), no_alarms },
[] /= rabbit_networking:active_listeners(),
rabbitmq_node_is_healthy.
' || exit 1
In an already running container this works and produces the correct result.
I tried the following in the compose file:
healthcheck:
  interval: 60s
  timeout: 60s
  retries: 10
  start_period: 600s
  test: ["CMD", "docker-healthcheck"]
It seems that the start_period is completely ignored; I can see the health status report an error right away. I have also tried the following native RabbitMQ diagnostics command:
rabbitmq-diagnostics -q check_running && rabbitmq-diagnostics -q check_local_alarms
This oddly fails with an "unable to find rabbitmq-diagnostics" error, despite the fact that the program is definitely in the path. I can execute the command successfully in an already running container.
If I create the container without the healthcheck and then add it in after the fact from the command line with:
docker service update --health-cmd docker-healthcheck --health-interval 60s --health-timeout 60s --health-retries 10 [container id]
it marks the container healthy. So it works, just not in a startup configuration. It seems to me that the healthcheck should not begin until 10 minutes have passed, but it doesn't seem to matter how long I let everything start up using the start_period parameter; the container still fails.
Is this a bug or is there something mysterious about the way start_period works?
Has anyone else ever had this problem?

How to run a redis cluster on a docker cluster?

Context
I am trying to set up a Redis cluster that runs on top of a Docker cluster, to achieve maximum auto-healing.
More precisely, I have a docker-compose file which defines a service with 3 replicas. Each service replica has a redis-server running on it.
Then I have a program inside each replica that listens for changes on the Docker cluster and starts the Redis cluster when the conditions are met (all 3 redis-servers know each other).
Setting up the Redis cluster works as expected: the cluster is formed and all the redis-servers communicate well, but this communication stays inside the Docker cluster.
The Problem
When I try to communicate from outside the Docker cluster, I am able to talk to a redis-server thanks to the ingress mode; however, when I try to add data (e.g. set foo bar) and the client is redirected to another redis-server, the communication hangs and eventually times out.
Code
This is the docker-compose file.
version: "3.3"
services:
redis-cluster:
image: redis-srv-instance
volumes:
- /var/run/:/var/run
deploy:
mode: replicated
#endpoint_mode: dnsrr
replicas: 3
resources:
limits:
cpus: '0.5'
memory: 512M
ports:
- target: 6379
published: 30000
protocol: tcp
mode: ingress
The sequence of commands that shows the problem.
Client
~ ./redis-cli -c -p 30000
127.0.0.1:30000>
Redis-server
OK
1506533095.032738 [0 10.255.0.2:59700] "COMMAND"
1506533098.335858 [0 10.255.0.2:59700] "info"
Client
127.0.0.1:30000> set ghb fki
OK
Redis-server
1506533566.481334 [0 10.255.0.2:59718] "COMMAND"
1506533571.315238 [0 10.255.0.2:59718] "set" "ghb" "fki"
Client
127.0.0.1:30000> set rte fgh
-> Redirected to slot [3830] located at 10.0.0.3:6379
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
(150.31s)
not connected>
Any ideas? I have also tried building my own proxy/load balancer, but that didn't work either.
Thank you! Have a nice day.
For this use case, Sentinel might help. Redis on its own is not capable of high availability; Sentinel, on the other hand, is a distributed system which can do the following for you:
Route the ingress traffic to the current Redis master.
Elect a new Redis master should the current one fail.
While I have previously done research on this topic, I have not yet managed to pull together a working example.
redis-cli gets the Redis server IP inside the ingress network and tries to access the remote Redis server by that IP directly. That is why redis-cli shows "Redirected to slot [3830] located at 10.0.0.3:6379". But this internal 10.0.0.3 is not reachable from redis-cli.
One solution is to run another proxy service attached to the same network as the Redis cluster. The application sends all requests to that proxy service, and the proxy service talks to the Redis cluster.
Alternatively, you could create 3 Swarm services that use the bridge network and expose the Redis port on the node. Your internal program would need to change accordingly.
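A quick way to check the premise behind the first suggestion (the 10.0.0.x addresses are only routable from inside the overlay network the Redis tasks share) is to run redis-cli from a throwaway container attached to that network. This is only a sketch under assumptions: the overlay network must be attachable, its name is a placeholder, and the service name redis-cluster is taken from the compose file above:
# attach a one-off container to the stack's overlay network and talk to
# Redis from inside it; cluster redirects to 10.0.0.x addresses can be
# followed here because those IPs are routable on the overlay
docker run --rm -it --network <stack-overlay-network> redis:alpine \
  redis-cli -c -h redis-cluster -p 6379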
