Docker: how to use a restart policy?

Documentation says:
unless-stopped Similar to always, except that when the container is stopped (manually or otherwise), it is not restarted even after Docker daemon restarts.
OK. I understand what "manually" means: docker stop container_name. But what does "or otherwise" stand for?

The paragraph after the table clarifies (emphasis mine):
configures it to always restart unless it is explicitly stopped or Docker is restarted.
One example is if the host reboots. Containers will be implicitly stopped (the container metadata and filesystems exist but the main container process does not), and at this point restart policies apply as well.
Event                    no       on-failure  unless-stopped  always
docker stop              Stopped  Stopped     Stopped         Stopped
Host reboot              Stopped  Stopped     Stopped         Restarted
Process exits (code=0)   Stopped  Stopped     Restarted       Restarted
Process exits (code≠0)   Stopped  Restarted   Restarted       Restarted
The documentation hints that this also applies if the Docker daemon is restarted, but this is a somewhat unusual case. As I recall, a daemon restart frequently does not seem to affect running containers at all.
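For experimenting with the combinations, the table above can be encoded as a small lookup. This is only a sketch of the table in this answer, not of Docker's actual implementation; the function name expected_state and the event labels (stop, reboot, exit0, exit1) are made up for illustration.

```shell
#!/bin/sh
# Sketch: encode the table above as a lookup.
# expected_state and the event labels are hypothetical names chosen for
# this example; they are not Docker terms.
expected_state() {
    policy=$1
    event=$2

    # Reject anything that is not one of the four policies in the table.
    case "$policy" in
        no|on-failure|unless-stopped|always) ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac

    case "$event:$policy" in
        stop:*)                            echo Stopped ;;
        reboot:always)                     echo Restarted ;;
        reboot:*)                          echo Stopped ;;
        exit0:always|exit0:unless-stopped) echo Restarted ;;
        exit0:*)                           echo Stopped ;;
        exit1:no)                          echo Stopped ;;
        exit1:*)                           echo Restarted ;;
        *) echo "unknown event: $event" >&2; return 1 ;;
    esac
}

# To see which policy a real container has (container_name is a placeholder):
#   docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' container_name
```

For example, expected_state on-failure exit1 prints the cell for a non-zero exit under on-failure.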

Related

Kubernetes Cluster - Containers do not restart after reboot

I have a kubernetes cluster setup at home on two bare metal machines.
I used kubespray to install both and it uses kubeadm behind the scenes.
The problem I encounter is that all containers within the cluster have a restartPolicy: no which makes my cluster break when I restart the main node.
I have to manually run "docker container start" for all containers in "kube-system" namespace to make it work after reboot.
Does anyone have an idea where the problem might be coming from?
Docker provides restart policies to control whether your containers start automatically when they exit, or when Docker restarts. Here your containers have the restart policy no, which means Docker will never restart them automatically under any circumstance.
You need to change the restart policy to always, which restarts the container if it stops. If it is manually stopped, it is restarted only when the Docker daemon restarts or the container itself is manually restarted.
You can change the restart policy of an existing container using docker update. Pass the name of the container to the command. You can find container names by running docker ps -a.
docker update --restart=always <CONTAINER NAME>
Restart policy details:
Keep the following in mind when using restart policies:
A restart policy only takes effect after a container starts successfully. In this case, starting successfully means that the container is up for at least 10 seconds and Docker has started monitoring it. This prevents a container which does not start at all from going into a restart loop.
If you manually stop a container, its restart policy is ignored until the Docker daemon restarts or the container is manually restarted. This is another attempt to prevent a restart loop.
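Before changing anything, it may help to see which policy each container currently has. The following is a minimal sketch that combines docker ps -aq with docker inspect; the helper name list_restart_policies is made up for this example.

```shell
#!/bin/sh
# Sketch: print each container's name and its restart policy.
# list_restart_policies is a hypothetical helper name.
list_restart_policies() {
    # docker ps -aq prints the IDs of all containers, running or not.
    for id in $(docker ps -aq); do
        docker inspect \
            --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}}' "$id"
    done
}
```

Calling list_restart_policies on the node prints one line per container, which makes it easy to spot the ones still set to no.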
I am answering my question:
It probably wasn't very clear, but I was talking about the kube-system pods that manage the whole cluster and that should automatically start when the machine restarts.
It turns out those pods (e.g. coredns, kube-proxy, etc.) intentionally have a restart policy of "no", and it is the kubelet service on the node that spins up the whole cluster when you restart your node.
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
In my case kubelet could not start due to missing cri-dockerd process.
Check the issue I opened at kubespray:
You can check the kubelet logs like so:
journalctl -u kubelet -f
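Alongside the logs, a quick check of whether the relevant systemd units are running can narrow things down. This is a sketch only: report_units is a made-up helper, and the unit names in the usage comment are assumptions that may differ on your installation.

```shell
#!/bin/sh
# Sketch: report whether the given systemd units are active.
# report_units is a hypothetical helper; pass whatever unit names your
# node actually uses.
report_units() {
    rc=0
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            rc=1
        fi
    done
    # Non-zero if at least one unit was not active.
    return "$rc"
}

# Example (unit names are assumptions):
#   report_units kubelet cri-dockerd
```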

Docker container not shutting down in Swarm cluster

On my docker swarm cluster, when I perform a docker stack deploy with a new version of my service's image or do a docker service update --force, the old containers of the service(s) get the desired state SHUTDOWN, but they remain in the current state running.
However, they don't seem to be actually running. I can't do anything with them: docker logs, docker inspect, docker exec, ... nothing.
The only way to get rid of them is to restart the docker daemon.
What would you look at to try to understand and fix this recurring issue?
We faced the same issue a few days ago: Turned out, we had a logging-driver configured, but the logging-server was not available. We stopped using this anyway, but forgot to remove the configuration from the service:
logging:
  driver: fluentd
  options:
    fluentd-address: localhost:24224
    fluentd-async-connect: "true"
Removing this configuration fixed the issue for future containers. Old instances were still hanging around, but restarting Docker helped.
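To check whether a deployed service still carries a logging driver like the one above, something along these lines may help. It is a sketch: service_log_driver is a made-up helper name, and the service name you pass is whatever docker service ls shows for your stack.

```shell
#!/bin/sh
# Sketch: show the logging driver configured on a Swarm service.
# service_log_driver is a hypothetical helper name.
service_log_driver() {
    docker service inspect \
        --format '{{json .Spec.TaskTemplate.LogDriver}}' "$1"
}

# Example (my_stack_web is a placeholder service name):
#   service_log_driver my_stack_web
```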

Does restarting the docker service kill all containers?

I'm having trouble with docker where docker ps won't return and is stuck.
I found a suggestion to restart the docker service with something like
sudo service docker restart (https://forums.docker.com/t/what-to-do-when-all-docker-commands-hang/28103/4)
However, I'm worried that it will kill all the running containers. (I guess the service is what allows docker containers to run?)
In the default configuration, your assumption is correct: if the docker daemon is stopped, all running containers are shut down. But, as outlined in the link, this behaviour can be changed on docker >= 1.12 by adding
{
  "live-restore": true
}
to /etc/docker/daemon.json. The catch: the daemon must be restarted for this change to take effect. Please take note of the limitations of live restore, e.g. only patch version upgrades are supported, not major version upgrades.
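After restarting the daemon, you may want to confirm that the setting was picked up. A minimal sketch, assuming docker info is reachable from your shell; live_restore_enabled is a made-up helper name.

```shell
#!/bin/sh
# Sketch: check whether the running daemon reports live restore as enabled.
# live_restore_enabled is a hypothetical helper name.
live_restore_enabled() {
    [ "$(docker info --format '{{.LiveRestoreEnabled}}')" = "true" ]
}

# Example:
#   if live_restore_enabled; then echo "live restore is on"; fi
```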
Another possibility is to define a restart policy when starting a container. To do so, pass one of the following values as value for the command line argument --restart when starting the container via docker run:
no              Do not automatically restart the container. (the default)
on-failure      Restart the container if it exits due to an error, which manifests
                as a non-zero exit code.
always          Always restart the container if it stops. If it is manually stopped,
                it is restarted only when Docker daemon restarts or the container
                itself is manually restarted.
                (See the second bullet listed in restart policy details)
unless-stopped  Similar to always, except that when the container is stopped
                (manually or otherwise), it is not restarted even after Docker
                daemon restarts.
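As a rough illustration of the accepted values, here is a small sketch that checks a string against the four policies listed above. It also accepts the on-failure:N form with a retry limit, which comes from the Docker documentation rather than from the list here. The helper name is_valid_restart_policy and the image name in the comments are made up.

```shell
#!/bin/sh
# Sketch: check a --restart value against the documented forms.
# is_valid_restart_policy is a hypothetical helper name.
is_valid_restart_policy() {
    case "$1" in
        no|always|unless-stopped|on-failure) return 0 ;;
        # on-failure: must be followed by digits only.
        on-failure:|on-failure:*[!0-9]*)     return 1 ;;
        on-failure:*)                        return 0 ;;
        *)                                   return 1 ;;
    esac
}

# Example invocations (my-image is a placeholder):
#   docker run -d --restart unless-stopped my-image
#   docker run -d --restart on-failure:5 my-image
```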
For your specific situation, this would mean that you could:
Restart all containers with --restart always (more on that further below)
Re-configure the docker daemon to allow for live restore
Restart the docker daemon (which is not yet configured for live restore, but will be after this restart)
This restart would shut down and then restart all your containers once. But from then on, you should be free to stop the docker daemon without your containers terminating.
Handling major version upgrades
As mentioned above, live restore cannot handle major version upgrades. For a major version upgrade, one has to tear down all running containers. With a restart policy of always, however, the containers will be restarted once the docker daemon comes back up after the upgrade.

Docker hangs and gets corrupted on reboot

We are running a scheduling engine with docker, chronos & mesos.
We run 2 mesos slaves on each node. Sometimes too many jobs get executed on a node, docker becomes unresponsive, and docker gets corrupted when the server is rebooted. Is there anything wrong with the setup? We are not sure why docker hangs and gets corrupted on reboot.
Thanks
Running two agents on the same node won't work properly with Docker containers, because restarting one agent
will cause the Docker containers managed by the other agent to be deleted.
Check out --cgroups_root flag in
https://github.com/apache/mesos/blob/master/docs/configuration/agent.md
This flag only applies to MesosContainerizer (can be used to launch Docker
containers).

How do you kill a docker container's default command without killing the entire container?

I am running a docker container which contains a node server. I want to attach to the container, kill the running server, and restart it (for development). However, when I kill the node server it kills the entire container (presumably because I am killing the process the container was started with).
Is this possible? This answer helped, but it doesn't explain how to kill the container's default process without killing the container (if possible).
If what I am trying to do isn't possible, what is the best way around this problem? Adding command: bash -c "while true; do echo 'Hit CTRL+C'; sleep 1; done" to each image in my docker-compose, as suggested in the comments of the linked answer, doesn't seem like the ideal solution, since it forces me to attach to my containers after they are up and run the command manually.
This is by design in Docker. Each container is supposed to be a stateless instance of a service. If that service is interrupted, the container is destroyed; if that service is requested or started, the container is created. That is the case at least if you're using an orchestration platform like k8s, swarm, mesos, cattle, etc.
There are applications that exist to represent PID 1 rather than the service itself. But this goes against the design philosophy of microservices and containers. Here is an example of an init system that can run as PID 1 instead and allow you to kill and spawn processes within your container at will: https://github.com/Yelp/dumb-init
Why do you want to reboot the node server? To apply changes from a config file or something? If so, you're looking for a solution in the wrong direction. You should instead define a persistent volume so that when the container respawns the service would reread said config file.
https://docs.docker.com/engine/admin/volumes/volumes/
If you need to restart the process that's running in the container, then simply run:
docker restart $container_name_or_id
Exec'ing into a container shouldn't be needed for normal operations; consider it a debugging tool.
Rather than changing the script that gets run to automatically restart, I'd move that out to the docker engine so it's visible if your container is crashing:
docker run --restart=unless-stopped ...
When a container is run with the above option, docker will restart it for you, unless you intentionally run a docker stop on the container.
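Since the point of moving restarts into the docker engine is visibility, the restart counter that Docker keeps per container is worth a look. A minimal sketch; restart_count is a made-up helper name and the container name is a placeholder.

```shell
#!/bin/sh
# Sketch: show how many times Docker has restarted a container.
# restart_count is a hypothetical helper name.
restart_count() {
    docker inspect --format '{{.RestartCount}}' "$1"
}

# Example (my_node_server is a placeholder container name):
#   restart_count my_node_server
```

A count that keeps climbing suggests the container is crash-looping rather than running normally.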
As for why killing pid 1 in the container shuts it down, it's the same as killing pid 1 on a linux server. If you kill init/systemd, the box will go down. Inside the namespace of the container, similar rules apply and cannot be changed.
