I have a problem using Docker swarm mode.
I want to have high availability with swarm mode.
I think I can do that with swarm's rolling updates.
Something like this...
docker service update --env-add test=test --update-parallelism 1 --update-delay 10s 6bwm30rfabq4
However, there is a problem.
My Docker image has an entrypoint, so there is a short delay before the service (I mean the Docker container) is really up. But the Docker service thinks the task is already running, because the container status is 'Up', even while the entrypoint is still doing its work. So some containers return errors when I try to connect to the service.
For example, I create a Docker service named 'test' with port 8080 and scale it up to 4 replicas; I can access test:8080 in a web browser. Then I start a rolling update with the --update-parallelism 1 --update-delay 10s options and try to connect to the service again: one container returns an error, because Docker thinks that container is already running even though it isn't up yet due to the entrypoint. After 10 seconds another container returns an error, because the update has moved on and Docker again assumes the new container is already up.
So, is there any solution to this problem?
Should I configure nginx to drop connections to the failing container and switch over to another one?
The HEALTHCHECK Dockerfile instruction works for this use case. You specify how Docker should check whether the container is available, and that check is used during updates as well as for monitoring service health in Swarm.
There's a good article about it here: Reducing Deploy Risk With Docker’s New Health Check Instruction.
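A minimal sketch of how that can look for the setup above (the /health endpoint and the internal port 8080 are assumptions, not details from the question):

HEALTHCHECK --interval=5s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

With a check like this in the image, Swarm only counts a new task as running once the check passes, so a rolling update with --update-parallelism 1 --update-delay 10s waits out the entrypoint delay instead of moving on while the container is still initialising, and the routing mesh doesn't send traffic to it until it is healthy.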
Related
I need some clarification regarding the use of HEALTHCHECK on a Docker service.
Context:
We are experimenting with a multi-node mariadb cluster and by utilizing HEALTHCHECK we would like the bootstrapping containers to remain unhealthy until bootstrapping is complete. We want this so that front-end users don’t access that particular container in the service until it is fully online and sync’d with the cluster. The issue is that bootstrapping relies on the network between containers in order to do a state transfer and it won’t work when a container isn’t accessible on the network.
Question:
When a container's status is either starting or unhealthy, does HEALTHCHECK completely kill network access to and from the container?
As an example, when a container is healthy I can run the command getent hosts tasks.<service_name> inside the container, which returns the IP addresses of the other containers in the service. However, when the same container is unhealthy, that command does not return anything… Hence my suspicion that HEALTHCHECK kills the network at the container level (as opposed to the service/load-balancer level) when the container isn't healthy.
Thanks in advance
I ran some more tests and found my own answer. Basically, Docker does not kill container networking while the container is in either the starting or the unhealthy phase. The reason the getent hosts tasks.<service_name> command does not work during those phases is that it resolves container IPs through the service's DNS entry, and the service does not have the starting/unhealthy container(s) assigned to it.
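A quick way to see both halves of that (container ID and service name are placeholders):

docker inspect --format '{{.State.Health.Status}}' <container_id>   # on the node: starting / healthy / unhealthy
getent hosts tasks.<service_name>                                    # inside a task: resolves only tasks currently marked healthy

The overlay network itself stays up for a starting or unhealthy container; it is simply left out of the service's DNS/VIP until the check passes.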
I have a Docker swarm setup with a typical web app stack (nginx & PHP). I need Redis as a service in the swarm. The swarm has 2 nodes, and each node should have the web stack and the Redis service. But only one Redis container should be active at a time (and be able to communicate with each web stack); the other one must be there in standby mode, so that if the first Redis fails, it can take over quickly.
When you work with Docker swarm, keeping a backup, standby container would be considered an anti-pattern. The recommended approach to deploying a reliable container with swarm is to add a HEALTHCHECK instruction to your Dockerfile. You can set a start period after which the healthcheck takes effect, giving your container time to warm up.
Now, combine the HEALTHCHECK functionality with the fact that Docker swarm always maintains the specified number of containers. Make your healthcheck script exit with code 1 when the container is unhealthy. Once swarm sees the check failing, it kills the container and, to maintain the number of containers, spins up a new one.
The whole replacement is quick and works seamlessly. Run multiple replicas in case the warm-up time is long; this prevents your service from becoming unavailable while one of the containers is down.
Example of a healthcheck command:
HEALTHCHECK --interval=5m --timeout=3s CMD curl -f http://localhost/ || exit 1
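If the container needs time to bootstrap, a start period can be added, and the same check can be supplied at the service level instead of the Dockerfile. A sketch for the Redis case above (the redis-cli command and redis:alpine image are assumptions, not from the original answer):

HEALTHCHECK --interval=10s --timeout=3s --start-period=30s --retries=3 \
  CMD redis-cli ping || exit 1

docker service create --name redis --replicas 2 \
  --health-cmd "redis-cli ping || exit 1" \
  --health-interval 10s --health-timeout 3s \
  --health-start-period 30s --health-retries 3 \
  redis:alpine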
We have a Docker swarm, and we normally run service containers using the Docker service create API. We are seeing that after a certain time interval the services stop responding (meaning the application running inside the container). For now the solution looks like restarting the service at a specific time interval, and it worked when we tried it manually.
(The top output from the host worker node and the docker stats output were attached here as screenshots.)
I wanted to know what the best approach is to fix it. Also, can we automate the solution?
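If the periodic restart stays in place as a stop-gap, one way to automate it (the service name and schedule are assumptions) is a cron job on a manager node that forces a rolling restart of the tasks:

# crontab on a manager node: recreate the service's tasks every 6 hours
0 */6 * * * docker service update --force my_service

--force recreates the tasks even though the service spec is unchanged, and it follows the service's update settings, so the restart is rolling rather than all at once.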
I am following part 3 of the Docker tutorial. Because my computer runs Windows, I use Docker Toolbox. Before part 3, I used the command docker run -p 8080:80 test and could connect to 192.168.99.100:8080 successfully.
But when I create a swarm and deploy the docker-compose.yml, the deployment itself succeeds:
ID NAME MODE REPLICAS IMAGE PORTS
uskmy4zkflhf testswarm_web replicated 5/5 ***/get-started:test *:6666->80/tcp
However, when I use 192.168.99.100:6666 to connect, the page cannot be displayed, even though ping shows that 192.168.99.100 is reachable.
Even when I uninstall the Toolbox, reinstall it, and deploy only once (so the port is set only once and no other container occupies it), it still doesn't work.
What is the problem here?
The port publishing mechanism works differently in standalone and swarm mode. If you're using a compose file in swarm mode, you should not be using docker-compose up but docker stack deploy instead.
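For the stack in the question, that would look roughly like this (assuming the stack was named testswarm, as the service name testswarm_web suggests):

docker stack deploy -c docker-compose.yml testswarm
docker stack services testswarm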
I would suggest taking it step by step: instead of using the stack deploy or compose approach, first learn to use the docker service create command, and take it one service at a time.
Try docker service create --name proxy --publish 8080:80 nginx and see if you can reach NGINX at 192.168.99.100:8080. Once you're there, try scaling it with docker service update --replicas=5 proxy.
Once you feel comfortable with this, you should be able to tell what's going on with more precision.
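Two commands that help with that, using the proxy service from above:

docker service ps proxy                                             # where each task runs and its current state
docker service inspect --format '{{json .Endpoint.Ports}}' proxy    # how the port is actually published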
If you want to delve deeper into how port publishing works in swarm mode, I suggest this docs article.
Brand spanking new to Docker here. I have Docker running on a remote VM and am running a single dummy container on it (I can verify the container is running by issuing a docker ps command).
I'd like to set up my Docker installation so that my non-root user can run Docker commands:
sudo usermod -aG docker myuser
But I'm afraid to muck around with Docker while any containers are running in case "hot deploys" create problems. So this has me wondering, in general: if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Same goes for the containers themselves. Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5; how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Usually, all containers are stopped first.
That typically happens when I upgrade Docker itself: I find all my containers stopped (except the data containers, which are just created, and remain so).
Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5; how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
That depends on the nature and requirements of your app: for a completely stateless app, you could even run 1.0.5 alongside 1.0.4 (with different host ports mapped to your app's exposed port), test it a bit, and stop 1.0.4 when you think 1.0.5 is ready.
But for an app with any kind of shared state or resource (mounted volumes, a shared data container, ...), you would need to stop and rm 1.0.4 before starting the new container from the 1.0.5 image.
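A minimal sketch of the stateless case (image names and ports are assumptions):

docker run -d --name myapp-1.0.5 -p 8081:8080 myapp:1.0.5   # new version on a second host port
# smoke-test http://host:8081, then retire the old one:
docker stop myapp-1.0.4 && docker rm myapp-1.0.4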
(1) why don't you stop them [the data containers] when upgrading Docker?
Because... they were never started in the first place.
In the lifecycle of a container, you can create, then start, then run a container. But a data container, by definition, has no process to run: it just exposes VOLUME(s) for other containers to mount (--volumes-from).
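A minimal sketch of that pattern (names and paths are illustrative):

docker create -v /dbdata --name dbdata busybox /bin/true     # created, never started
docker run --rm --volumes-from dbdata busybox ls /dbdata     # another container mounts and uses the volume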
(2) What's the difference between a data/volume container and a Docker container running, say, a full-bore MySQL server?
The difference is, again, that a data container doesn't run any process, so it never exits the way an ordinary container does when its process stops: there is no process to run in the first place.
The MySQL server container would be running as long as the server process doesn't stop.