The Docker daemon isn't starting anymore on my computer (Linux / CentOS 7), and I strongly suspect that a container set to auto-restart is to blame. If I start the daemon manually, the last line I see is "Loading containers: start", and then it just hangs.
What I'd like to do is to start the daemon without starting any containers. But I can't find any option to do that. Is there any option in docker to start the daemon without also starting containers set to automatically restart? If not, is there a way to remove the containers manually that doesn't require the docker daemon running?
I wrote this little script to stop all the containers before Docker is started. It requires jq to be installed.
for i in /var/lib/docker/containers/*/config.v2.json; do
    # create the replacement file and copy the original's ownership/ACLs onto it
    touch "$i.new" && getfacl -p "$i" | setfacl --set-file=- "$i.new"
    # mark the container as not running so the daemon won't try to restore it
    jq -c '.State.Running = false' "$i" > "$i.new" && mv -f "$i.new" "$i"
done
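Before pointing the loop at /var/lib/docker, you can verify the jq expression on a throwaway copy of a container config. This is a sketch using a synthetic config.v2.json with only a minimal State object; the ACL copying is omitted because it only matters for the real files.

```shell
# build a fake containers directory with one minimal config
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/containers/abc123"
printf '{"State":{"Running":true,"Paused":false}}' > "$tmpdir/containers/abc123/config.v2.json"

# same transformation as the script above, against the fake tree
for i in "$tmpdir"/containers/*/config.v2.json; do
    jq -c '.State.Running = false' "$i" > "$i.new" && mv -f "$i.new" "$i"
done

cat "$tmpdir/containers/abc123/config.v2.json"
# {"State":{"Running":false,"Paused":false}}
```

If the output shows "Running":false, the expression is safe to run against the real directory (with the daemon stopped).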
I think we need to verify which storage driver you are using for Docker. devicemapper is known to have issues similar to what you are describing. I would suggest moving to the overlay2 storage driver.
If you are not running this on a production system, you can try the steps below to see whether the daemon comes up:
1. Stop the daemon process.
2. Clean the Docker home directory; the default is /var/lib/docker/*.
3. You may not be able to remove everything. In that case the safe bet is to stop Docker from autostarting (systemctl disable docker) and restart the system.
4. Once the system is up, repeat step 2 and try to restart the daemon. Hopefully everything will come up.
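As a sketch, the steps above look like this on a systemd-based system such as CentOS 7. Be warned that this is destructive: it wipes every image, container, and volume under /var/lib/docker.

```shell
# 1. stop the daemon
sudo systemctl stop docker

# 2. clean the Docker home directory (destroys all images/containers/volumes)
sudo rm -rf /var/lib/docker/*

# if some files cannot be removed, disable autostart and reboot first,
# then repeat the rm after the reboot:
#   sudo systemctl disable docker
#   sudo reboot

# 3. start the daemon again
sudo systemctl start docker
```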
I have been running an nvidia Docker image for 13 days, and it used to restart without any problems using the docker start -i <containerid> command. But today, while I was downloading PyTorch inside the container, the download got stuck at 5% and gave no response for a while.
I couldn't exit the container with either ctrl+d or ctrl+c, so I closed the terminal and in a new terminal ran docker start -i <containerid> again. But ever since, this particular container has not responded to any command. Be it start/restart/exec/commit ... nothing! Any command with this container ID or name is simply non-responsive, and I can only get out of it with ctrl+c.
I cannot restart the docker service since it will kill all running docker containers.
I cannot even stop the container with docker container stop <containerid>.
Please help.
You can make use of docker RestartPolicy:
docker update --restart=always <container>
while being mindful of caveats for the Docker version you are running.
Or explore the answer by @Yale Huang to a similar question: How to add a restart policy to a container that was already created
I had to restart the Docker process to revive my container; there was nothing else I could do to solve it. I used sudo service docker restart and then revived my container using docker run. I will try to build a Dockerfile out of it in order to avoid future mishaps.
I use Docker for developing a HW-SW system where restarts and power losses happen very often. I'd like to find a way to ignore/destroy all containers that were running at the time of shutdown and start Docker from scratch (using docker-compose).
The --force-recreate option of docker-compose, which seemed applicable, has some issues, such as "stuck" old port mappings preventing new container instances from being run.
You can use
docker system prune
to remove unused images, containers, and networks. That should help remove the old port mappings.
Typically you can start the container with the --rm flag, so the container will be removed once it stops or the system shuts down.
But I did not find a similar flag for docker-compose. You can try the command below, but it has to be run manually; it won't trigger on shutdown or reboot:
Usage: rm [options] [SERVICE...]
Options:
-f, --force Don't ask to confirm removal
-s, --stop Stop the containers, if required, before removing
-v Remove any anonymous volumes attached to containers
docker-compose rm
So the workaround is to run a cron job at boot that removes all containers.
remove_container.sh:
#!/bin/bash
# sleep so the docker daemon has time to come up
sleep 90
# remove all containers
docker rm -f $(docker ps -aq)
crontab entry:
@reboot sh /home/user/remove_container.sh
You can add a similar mechanism for Windows, etc.; the above was tested on an Ubuntu distribution.
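The cron entry can be installed non-interactively. A sketch, assuming the script lives at /home/user/remove_container.sh as above:

```shell
chmod +x /home/user/remove_container.sh

# append the @reboot line to the current crontab (keeping existing entries)
( crontab -l 2>/dev/null; echo "@reboot sh /home/user/remove_container.sh" ) | crontab -

# confirm it was installed
crontab -l | grep '@reboot'
```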
I read the Enable Live Restore documentation, but it didn't work when I tried it.
ubuntu@ip-10-0-0-230:~$ cat /etc/docker/daemon.json
{
"live-restore": true
}
I started an nginx container in detached mode.
sudo docker run -d nginx
c73a20d1bb620e2180bc1fad7d10acb402c89fed9846f06471d6ef5860f76fb5
$ sudo docker ps
CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS
c73a20d1bb62   nginx   "nginx -g 'daemon of…"   5 seconds ago   Up 4 seconds
Then I stopped dockerd:
sudo systemctl stop snap.docker.dockerd.service
and I checked that there was no container running
ps aux | grep nginx
After that, I restarted the Docker service, and still there wasn't any container.
Any idea? How does this "enable live restore" feature work?
From the documentation: after modifying daemon.json (adding "live-restore": true) you need to:
Restart the Docker daemon. On Linux, you can avoid a restart (and avoid any downtime for your containers) by reloading the Docker daemon. If you use systemd, then use the command systemctl reload docker. Otherwise, send a SIGHUP signal to the dockerd process.
You can also do the following, but it's not recommended:
If you prefer, you can start the dockerd process manually with the --live-restore flag. This approach is not recommended because it does not set up the environment that systemd or another process manager would use when starting the Docker process. This can cause unexpected behavior.
It seems that you skipped this step: you said that you made the modification to daemon.json, directly started a container, and then stopped dockerd.
To make the live restore functionality work, follow all the steps in the right order:
Modify the daemon.json by adding "live-restore": true
Reload the Docker daemon with the command:
sudo systemctl reload docker
Then try the functionality with your example (firing up a container and making the daemon unavailable).
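As a sketch, the whole verification sequence on a systemd-managed (non-snap) installation would look like this; nginx and the container name lr-test are just placeholders for the example:

```shell
# enable live restore
echo '{ "live-restore": true }' | sudo tee /etc/docker/daemon.json
sudo systemctl reload docker

# start a test container
sudo docker run -d --name lr-test nginx

# stop the daemon (and the socket too, or it will be reactivated on demand)
sudo systemctl stop docker.service docker.socket

# the container's processes should still be running without the daemon
ps aux | grep '[n]ginx'

# bring the daemon back; the container should still be listed as Up
sudo systemctl start docker.service docker.socket
sudo docker ps
```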
I've tested this, and it works if you follow the steps in order.
Tested with Docker version 19.03.2, build 6a30dfc, on Ubuntu 19.10 (Eoan Ermine).
You've installed Docker via snap (snap.docker.dockerd.service). Unfortunately, this is not recommended, since the snap model is not fully compatible with Docker. Furthermore, docker-snap is no longer maintained by Docker, Inc. Users have encountered issues after installing Docker via snap; see 1 2.
You should remove the snap Docker installation to avoid any potential overlapping-installation issues, via this command:
sudo snap remove docker --purge
Then install Docker with the official way and after that try the Live Restore functionality by following the above steps.
Also be careful when restarting the daemon; the documentation says:
Live restore upon restart
The live restore option only works to restore containers if the daemon options, such as bridge IP addresses and graph driver, did not change. If any of these daemon-level configuration options have changed, the live restore may not work and you may need to manually stop the containers.
Also about downtime :
Impact of live restore on running containers
If the daemon is down for a long time, running containers may fill up the FIFO log the daemon normally reads. A full log blocks containers from logging more data. The default buffer size is 64K. If the buffers fill, you must restart the Docker daemon to flush them.
On Linux, you can modify the kernel’s buffer size by changing /proc/sys/fs/pipe-max-size.
I run some containers with the option --restart always.
It works well; so well, in fact, that I now have difficulty stopping these containers :)
I tried :
sudo docker stop container && sudo docker rm -f container
But the container still restarts.
The docker documentation explains the restart policies, but I didn't find anything to resolve this issue.
Just
sudo docker rm -f container
will kill the process if it is running and remove the container, in one step.
That said, I couldn't replicate the symptoms you described. If I run with --restart=always, docker stop stops the process, and it remains stopped.
I am using Docker version 1.3.1.
docker update --restart=no <container>
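Putting it together, a sketch of the full sequence ("web" is a placeholder container name): turn the restart policy off first, then stop and remove as usual.

```shell
# disable the restart policy so the daemon stops reviving the container
docker update --restart=no web

# now a normal stop/remove sticks
docker stop web
docker rm web
```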
Many thanks to those who took the time to respond.
If you use Docker directly, Bryan is right: sudo docker rm -f container is enough.
My problem was mainly that I use Puppet to deploy Docker images and run containers. I use this module, and it creates entries in /etc/init for the upstart process manager.
I think my problem was some kind of incompatibility between the process manager and Docker.
In this situation, to halt a container, simply run sudo stop docker-container.
More information on managing Docker containers can be found on the Docker website.
I have been running docker processes (apps) via
docker run …
But under runit supervision (runit is like daemontools), so that runit ensures the process stays up, passes signals through, etc.
Is this reasonable? Docker seems to want to do its own daemonization, but it isn't as thorough as runit. Furthermore, when runit restarts the app, a new container is created each time (fine), but it leaves a trace of the old one around; this seems to imply I am doing it the wrong way.
Should docker not be run this way?
Should I instead set up a container from the image, just once, and then have runit run/supervise that container for all time?
Docker does do some management of daemonized containers: if the system shuts down, then when the Docker daemon starts it will also restart any containers that were running at the time the system shut down. But if the container exits on its own or the kernel (or a user) kills the container while it is running, the Docker daemon won't restart it. In cases where you do want a restart, a process manager makes sense.
I don't know runit, so I can't give specific configuration guidance. But you should probably make the process manager communicate with the Docker daemon and check whether a given container id is running (docker ps | grep container_id or equivalent, or use the Docker Remote API directly). If the container has stopped, use Docker to restart it (docker start container_id) instead of running a new container. Or, if you do want a new container each time, start with docker run --rm to automatically clean it up when it exits or stops.
If you don't want your process manager to poll docker, you could instead run something that watches docker events.
You can get the container_id when you start the container, as the return value of starting a daemonized container, or you can ask Docker to write it out to a file (docker run --cidfile myfilename, like a PID file).
I hope that helps or helps another runit guru offer more detailed advice.
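A sketch of the polling check described above, using the modern double-dash flags (--cidfile, --rm); the cidfile path is a hypothetical example:

```shell
CIDFILE=/var/run/myapp.cid
CID=$(cat "$CIDFILE")

# is that exact container id currently running?
if docker ps -q --no-trunc | grep -q "$CID"; then
    echo "container $CID is running"
else
    # restart the existing container rather than creating a new one
    docker start "$CID"
fi
```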
Yes, I think running Docker under runit makes sense. Typically when you start a process, there is a way to tell it not to daemonize if it does so by default, since the normal way to hand off from the runit run script to a process is via exec on the last line of your run script. For Docker this means making sure not to set the -d flag.
For example, with docker you probably want your run script to look something like this:
#!/bin/bash -e
# send stderr to stdout so the log service captures everything
exec 2>&1
# drop privileges and replace this script with the docker client process
exec chpst -u dockeruser docker run -a stdin -a stdout -i ...
Using exec and chpst should resolve most issues with processes not terminating correctly when you bring down a runit service.