Enabling live restore on Docker isn't keeping the containers alive

I read the "Enable live restore" documentation, but it did not work when I tried it. This is my configuration:
ubuntu@ip-10-0-0-230:~$ cat /etc/docker/daemon.json
{
"live-restore": true
}
I started an nginx container in detached mode.
sudo docker run -d nginx
c73a20d1bb620e2180bc1fad7d10acb402c89fed9846f06471d6ef5860f76fb5
sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
c73a20d1bb62 nginx "nginx -g 'daemon of…" 5 seconds ago Up 4 seconds
Then I stopped dockerd:
sudo systemctl stop snap.docker.dockerd.service
and found that no container was running any more:
ps aux | grep nginx
After that, I restarted the Docker service, and there still wasn't any container.
Any idea? How does "enable live restore" work?

According to the documentation, after modifying daemon.json (adding "live-restore": true) you need to:
Restart the Docker daemon. On Linux, you can avoid a restart (and avoid any downtime for your containers) by reloading the Docker daemon. If you use systemd, then use the command systemctl reload docker. Otherwise, send a SIGHUP signal to the dockerd process.
You can also do the following, but it's not recommended:
If you prefer, you can start the dockerd process manually with the --live-restore flag. This approach is not recommended because it does not set up the environment that systemd or another process manager would use when starting the Docker process. This can cause unexpected behavior.
It seems that you skipped this step. You said that you modified daemon.json, started a container right away, and then stopped dockerd.
To make live restore work, follow all the steps in the right order:
Modify daemon.json by adding "live-restore": true
Reload the Docker daemon with the command:
sudo systemctl reload docker
Then try the functionality with your example (firing up a container and making the daemon unavailable).
I've tested it, and it works if you follow the steps in order:
Tested with Docker version 19.03.2, build 6a30dfc and Ubuntu 19.10 (Eoan Ermine)
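The steps above can be sketched as a small check before reloading. This is only a sketch: has_live_restore is a hypothetical helper (not part of Docker), it assumes the default config path /etc/docker/daemon.json, and the grep is a naive text match rather than a real JSON parse.

```shell
# Hypothetical helper: succeed if the given daemon.json enables live restore.
# Naive text match, assumes the key and value sit on the same line.
has_live_restore() {
    # $1: path to daemon.json (assumed default: /etc/docker/daemon.json)
    config="${1:-/etc/docker/daemon.json}"
    [ -f "$config" ] || return 1
    grep -Eq '"live-restore"[[:space:]]*:[[:space:]]*true' "$config"
}

# Usage on a systemd host (not executed here):
#   has_live_restore && sudo systemctl reload docker
#   sudo docker info --format '{{.LiveRestoreEnabled}}'
```

After the reload, the docker info call in the usage comment should report whether the running daemon actually picked up the setting, which is a more reliable check than reading the file.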
You've installed Docker via snap : snap.docker.dockerd.service
Unfortunately, this is not recommended, because the snap model is not fully compatible with Docker. Furthermore, docker-snap is no longer maintained by Docker, Inc. Users have run into issues after installing Docker via snap.
You should remove the snap installation of Docker to avoid problems caused by overlapping installations:
sudo snap remove docker --purge
Then install Docker the official way, and after that try live restore by following the steps above.
Also be careful when restarting the daemon. The documentation says:
Live restore upon restart
The live restore option only works to restore containers if the daemon options, such as bridge IP addresses and graph driver, did not change. If any of these daemon-level configuration options have changed, the live restore may not work and you may need to manually stop the containers.
Also, regarding downtime:
Impact of live restore on running containers
If the daemon is down for a long time, running containers may fill up the FIFO log the daemon normally reads. A full log blocks containers from logging more data. The default buffer size is 64K. If the buffers fill, you must restart the Docker daemon to flush them.
On Linux, you can modify the kernel’s buffer size by changing /proc/sys/fs/pipe-max-size.

Related

Restart docker daemon in rootless mode (on Linux)

How can I restart docker daemon running in rootless mode on Linux?
Stopping it works fine with:
systemctl --user stop docker.service
but starting it back again fails when using:
systemctl --user start docker.service
The command doesn't return anything but when checking the docker info it says:
ERROR: Cannot connect to the Docker daemon at unix:///run/user/1000/docker.sock. Is the docker daemon running?
It doesn't give any further information...
I had this error a couple of times before, when I accidentally ran docker with sudo and ended up with mixed-up permissions in my data-root (defined in daemon.json). But this time chowning it back to $USER didn't help with the restart. Restarting the host machine didn't help either (as it had a couple of times previously).
Ok, it seems that "userns-remap" is not compatible with rootless mode:
Rootless mode executes the Docker daemon and containers inside a user namespace. This is very similar to userns-remap mode, except that with userns-remap mode, the daemon itself is running with root privileges, whereas in rootless mode, both the daemon and the container are running without root privileges. Rootless mode does not use binaries with SETUID bits or file capabilities, except newuidmap and newgidmap, which are needed to allow multiple UIDs/GIDs to be used in the user namespace.
I was trying to fix permission issues on shared volumes by experimenting with setting UIDs/GIDs and added "userns-remap" to the ~/.config/docker/daemon.json:
{
"data-root": "/home/me/docker/image-storage",
"userns-remap": "me"
}
So deleting userns-remap from the config file fixed the restarting issue... Man, docker, at least a hint to the config file would be great... Because the userns-remap option was mentioned on some official docker doc pages I didn't even consider it as the source of the trouble in the first place.
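Since the daemon gives no hint, a quick look at the config file before restarting can save time. The following is a minimal sketch: has_userns_remap is a hypothetical helper, the default path ~/.config/docker/daemon.json is the one mentioned above, and the check is a plain text match rather than a JSON parse.

```shell
# Hypothetical helper: succeed if the rootless daemon.json still sets userns-remap.
has_userns_remap() {
    # $1: path to the rootless config (assumed default: ~/.config/docker/daemon.json)
    config="${1:-$HOME/.config/docker/daemon.json}"
    [ -f "$config" ] && grep -q '"userns-remap"' "$config"
}

# Usage (not executed here):
#   if has_userns_remap; then
#       echo "remove userns-remap before starting the rootless daemon" >&2
#   else
#       systemctl --user restart docker.service
#   fi
```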

Start the docker daemon without starting containers that are set to restart automatically

The docker daemon isn't starting anymore on my computer (Linux / Centos 7), and I strongly suspect that a container that is set to auto-restart is to blame in this case. If I start the daemon manually, the last line I see is "Loading containers: start", and then it just hangs.
What I'd like to do is to start the daemon without starting any containers. But I can't find any option to do that. Is there any option in docker to start the daemon without also starting containers set to automatically restart? If not, is there a way to remove the containers manually that doesn't require the docker daemon running?
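I wrote this little script to mark all containers as stopped before docker is started. It requires jq to be installed.
# Mark every container as not running so the daemon does not start it again.
for i in /var/lib/docker/containers/*/config.v2.json; do
    # Create the new file with the same ownership and ACLs as the original.
    touch "$i.new" && getfacl -p "$i" | setfacl --set-file=- "$i.new"
    # Rewrite the state, and replace the original only if jq succeeded.
    jq -c '.State.Running = false' "$i" > "$i.new" && mv -f "$i.new" "$i"
done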
I think we need to verify the storage driver for docker that you are using. Devicemapper is known to have some issues similar to what you are describing. I would suggest moving to overlay2 as a storage driver.
If you are not running this on a production system, you can try the steps below to see whether the daemon comes up:
1. Stop the daemon process.
2. Clean the docker home directory (the default is /var/lib/docker/*).
3. You may not be able to remove everything. In that case, the safe bet is to keep docker from starting automatically with systemctl disable docker, and then restart the system.
4. Once the system is up, execute step 2 again and try to restart the daemon. Hopefully everything will come up.

How to pull layers one by one in Docker to avoid connection timeout?

I keep getting connection timeout while pulling an image:
First, it starts downloading the first 3 layers. After one of them finishes, the 4th layer tries to start downloading. The problem is that it won't start until the two remaining layers finish their downloads, and before that happens (I think) the fourth layer fails to start downloading and aborts the whole process.
So I was wondering whether downloading the layers one by one would solve this problem.
Or maybe there is a better way to solve this issue, which may occur when you don't have a very fast internet connection.
The Docker daemon has a --max-concurrent-downloads option.
According to the documentation, it sets the max concurrent downloads for each pull.
So you can start the daemon with dockerd --max-concurrent-downloads 1 to get the desired effect.
See the dockerd documentation for how to set daemon options on startup.
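As a sketch of the persistent form of this setting, the same option can go into daemon.json under the key max-concurrent-downloads. The helper print_pull_config below is hypothetical and only prints the fragment; it does not touch /etc/docker/daemon.json, so an existing config is not overwritten.

```shell
# Hypothetical helper: print a daemon.json fragment that limits concurrent
# layer downloads. $1 is the limit (assumed default: 1).
print_pull_config() {
    downloads="${1:-1}"
    printf '{\n  "max-concurrent-downloads": %d\n}\n' "$downloads"
}

# Usage (not executed here): review the output, merge it into
# /etc/docker/daemon.json by hand, then restart the daemon.
#   print_pull_config 1
```

Printing instead of writing is a deliberate choice here: daemon.json often already holds other keys, and merging by hand avoids clobbering them.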
Follow these steps if docker is already running on Ubuntu:
sudo service docker stop
sudo dockerd --max-concurrent-downloads 1
Download your images. After that, stop dockerd in this terminal and start the daemon again as it was running earlier.
sudo service docker start
There are 2 ways:
Permanent change. Add a docker settings file:
sudo vim /etc/docker/daemon.json
with the following JSON content:
{
"max-concurrent-uploads": 1,
"max-concurrent-downloads": 4
}
After adding the file, run
sudo service docker restart
Temporary change.
Stop docker with
sudo service docker stop
then run
sudo dockerd --max-concurrent-uploads 1
At this point, start the push in another terminal. It will transfer the files one by one. When you have finished, restart the service or the computer.
Building on the previous answers: in my case I couldn't use service stop, and I also wanted to make sure I would restart the docker daemon in the same state. I therefore followed these steps:
Record the command line used to start the docker daemon:
ps aux | grep dockerd
Stop the docker daemon:
sudo kill <process id retrieved from previous command>
Restart the docker daemon with the max-concurrent-downloads option: use the command recorded in the first step, and add --max-concurrent-downloads 1
Additionally
You might still run into a problem if, even with a single download at a time, your pull is aborted at some point and the layers that were already downloaded are erased. It's a bug, but it was my case.
A solution in that case is to deliberately keep the layers that have already been downloaded.
The way to do that is to regularly abort the pull manually, but NOT by killing the docker command, but BY KILLING THE DOCKER DAEMON.
It is actually the daemon that erases already downloaded layers when the pull fails. If you kill it, it can't erase these layers. The docker pull command does terminate, but once you restart the docker daemon and relaunch your docker pull command, the downloaded layers are still there.

How can I stop and delete a docker container launched with restart always option?

I run some containers with the option --restart always.
It works well, so well that I now have difficulty stopping these containers :)
I tried:
sudo docker stop container && sudo docker rm -f container
But the container still restarts.
The docker documentation explains the restart policies, but I didn't find anything to resolve this issue.
Just
sudo docker rm -f container
will kill the process if it is running and remove the container, in one step.
That said, I couldn't replicate the symptoms you described. If I run with --restart=always, docker stop will stop the process and it remains stopped.
I am using Docker version 1.3.1.
docker update --restart=no <container>
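The one-line answer above can be combined with stop and rm into a single routine. This is a minimal sketch under the assumption that the container is managed by Docker alone (no external process manager restarting it); remove_restarting_container is a hypothetical name, and the commands may need a sudo prefix on your system.

```shell
# Hypothetical helper: clear the restart policy first, then stop and remove.
# $1: container name or ID.
remove_restarting_container() {
    docker update --restart=no "$1" &&
        docker stop "$1" &&
        docker rm "$1"
}

# Usage (not executed here):
#   remove_restarting_container my-container
```

Clearing the policy before stopping means the daemon has no reason to bring the container back between the stop and the rm.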
Many thanks to those who took the time to respond.
If you use docker directly, Bryan is right: sudo docker rm -f container is enough.
My problem was mainly that I use puppet to deploy docker images and run containers. I use this module and it creates entries in /etc/init for the upstart process manager.
I think my problem was some kind of incompatibility between the process manager and docker.
In this situation, to halt a container, simply run sudo stop docker-container.
More information on managing docker container runs can be found on the docker website.

How to restart a container on docker restart (--restart=true doesn't work)?

I am using docker version 1.1.0, started by systemd using the command line /usr/bin/docker -d, and tried to:
run a container
stop the docker service
restart the docker service (either using systemd or manually, specifying --restart=true on the command line)
see if my container was still running
As I understand the docs, my container should be restarted. But it is not. Its public facing port doesn't respond, and docker ps doesn't show it.
docker ps -a shows my container with an empty status:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cb0d05b4e0d9 mildred/p2pweb:latest node server-cli.js - 7 minutes ago 0.0.0.0:8888->8888/tcp jovial_ritchie
...
And when I try to docker restart cb0d05b4e0d9, I get an error:
Error response from daemon: Cannot restart container cb0d05b4e0d9: Unit docker-cb0d05b4e0d9be2aadd4276497e80f4ae56d96f8e2ab98ccdb26ef510e21d2cc.scope already exists.
2014/07/16 13:18:35 Error: failed to restart one or more containers
I can always recreate a container from the same base image using docker run ..., but how do I make sure that my running containers will be restarted if docker is restarted? Is there a solution that works even when docker is not stopped properly (imagine I pull the power plug from the server)?
Thank you
As mentioned in a comment, the container flag you're likely looking for is --restart=always, which will instruct Docker that unless you explicitly docker stop the container, Docker should start it back up any time either Docker dies or the container does.
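A minimal sketch of that flag in use, with two hypothetical helper functions: run_with_restart starts a detached container with the policy set, and restart_policy_of reads the policy back through docker inspect. The image and container names in the usage comments are placeholders.

```shell
# Hypothetical helper: start a detached container that Docker restarts
# automatically. $1: image name.
run_with_restart() {
    docker run -d --restart=always "$1"
}

# Hypothetical helper: print the restart policy name of a container.
# $1: container name or ID.
restart_policy_of() {
    docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$1"
}

# Usage (not executed here):
#   id=$(run_with_restart nginx)
#   restart_policy_of "$id"
```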

Resources