How docker deamon retry to make docker container up? - docker

If the docker container is failing and I want that docker deamon should retry it at least 3 times without giving manual docker run command this to us without manual intervention what I should do ?

Make use of docker's restart policy.
https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart
If finding out whether a container is up or not is a bit more complex than just noticing that the process has exited, use docker's healthcheck.
https://scoutapm.com/blog/how-to-use-docker-healthcheck

Related

Cron job to kill all hanging docker containers

I am new to docker containers but we have containers being deployed and due to some internal application network bugs the process running in the container hangs and the docker container is not terminated. While we debug this issue I would like a way to find all those containers and setup a cron job to periodically check and kill those relevant containers.
So how would I determine from "docker ps -a" which containers should be dropped and how would I go about it? Any ideas? We are eventually moving to kubernetes which will help with these issues.
Docker already have a command to cleanup the docker environment, you can use it manually or maybe setup a job to run the following command:
$ docker system prune
Remove all unused containers, networks, images (both dangling and
unreferenced), and optionally, volumes.
refer to the documentation for more details on advanced usage.

How to disable/leave docker swarm mode when starting docker daemon?

is there any way to disable/leave the swarm mode of docker when starting the daemon manually, e.g. dockerd --leave-swarm, instead of starting the daemon and leave the swarm mode afterwards, e.g. using docker swarm leave?
Many thanks in advance,
Aljoscha
I don't think this is anticipated by docker developers. When node leaves swarm, it needs to notify swarm managers, that it will not be available anymore.
Leaving swarm is a one time action and passing this as an configuration option to the daemon is weird. You may try to suggest that on docker's github, but I don't think it will have much supporters.
Perhaps more intuitive option would be to have ability to start dockerd in a way that communication to docker swarm manager would be suspended - so your dockerd is running only locally, but if you start without that flag (--local?) it would reconnect to swarm that it was attached before.

How do you kill a docker containers default command without killing the entire container?

I am running a docker container which contains a node server. I want to attach to the container, kill the running server, and restart it (for development). However, when I kill the node server it kills the entire container (presumably because I am killing the process the container was started with).
Is this possible? This answer helped, but it doesn't explain how to kill the container's default process without killing the container (if possible).
If what I am trying to do isn't possible, what is the best way around this problem? Adding command: bash -c "while true; do echo 'Hit CTRL+C'; sleep 1; done" to each image in my docker-compose, as suggested in the comments of the linked answer, doesn't seem like the ideal solution, since it forces me to attach to my containers after they are up and run the command manually.
This is by design by Docker. Each container is supposed to be a stateless instance of a service. If that service is interrupted, the container is destroyed. If that service is requested/started, it is created. If you're using an orchestration platform like k8s, swarm, mesos, cattle, etc at least.
There are applications that exist to represent PID 1 rather than the service itself. But this goes against the design philosophy of microservices and containers. Here is an example of an init system that can run as PID 1 instead and allow you to kill and spawn processes within your container at will: https://github.com/Yelp/dumb-init
Why do you want to reboot the node server? To apply changes from a config file or something? If so, you're looking for a solution in the wrong direction. You should instead define a persistent volume so that when the container respawns the service would reread said config file.
https://docs.docker.com/engine/admin/volumes/volumes/
If you need to restart the process that's running the container, then simply run a:
docker restart $container_name_or_id
Exec'ing into a container shouldn't be needed for normal operations, consider that a debugging tool.
Rather than changing the script that gets run to automatically restart, I'd move that out to the docker engine so it's visible if your container is crashing:
docker run --restart=unless-stopped ...
When a container is run with the above option, docker will restart it for you, unless you intentionally run a docker stop on the container.
As for why killing pid 1 in the container shuts it down, it's the same as killing pid 1 on a linux server. If you kill init/systemd, the box will go down. Inside the namespace of the container, similar rules apply and cannot be changed.

DevOps: automatically restarting a failed container

What is a light-wight approach to restart a failed docker container automatically -- that is, without having to install and setup tools like Swarm or Kubernetes?
I am asking because I need to have some resilience for a running container in the event the container "stops" as a result of failure of the process that it's running.
Check first if you can add restart policies to your docker run command.
They are the built-in Docker mechanism for restarting containers when they exit.
If set, restart policies will be used when the Docker daemon starts up, as typically happens after a system boot.
For instance:
on-failure[:max-retries]
Restart only if the container exits with a non-zero exit status.
Optionally, limit the number of restart retries the Docker daemon attempts.
If not, see "Automatically start containers"

Deploying changes to Docker and its containers on the fly

Brand spanking new to Docker here. I have Docker running on a remote VM and am running a single dummy container on it (I can verify the container is running by issuing a docker ps command).
I'd like to secure my Docker installation by giving the docker user non-root access:
sudo usermod -aG docker myuser
But I'm afraid to muck around with Docker while any containers are running in case "hot deploys" create problems. So this has me wondering, in general: if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Same goes for the containers themselves. Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5, how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Usually, all containers are stopped first.
That happen typically when I upgrade docker itself: I find all my container stopped (except the data containers, which are just created, and remain so)
Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5, how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
That depend on the nature and requirements of your app: for a completely stateless app, you could even run 1.0.5 (with different host ports mapped to your app exposed port), test it a bit, and stop 1.0.4 when you think 1.0.5 is ready.
But for an app with any kind of shared state or resource (mounted volumes, shared data container, ...), you would need to stop and rm 1.0.4 before starting the new container from 1.0.5 image.
(1) why don't you stop them [the data containers] when upgrading Docker?
Because... they were never started in the first place.
In the lifecycle of a container, you can create, then start, then run a container. But a data container, by definition, has no process to run: it just exposes VOLUME(S), for other container to mount (--volumes-from)
(2) What's the difference between a data/volume container, and a Docker container running, say a full bore MySQL server?
The difference is, again, that a data container doesn't run any process, so it doesn't exit when said process stops. That never happens, since there is no process to run.
The MySQL server container would be running as long as the server process doesn't stop.

Resources