I have been running an NVIDIA Docker image for 13 days, and it used to restart without any problems using the docker start -i <containerid> command. But today, while I was downloading PyTorch inside the container, the download got stuck at 5% and stopped responding.
I couldn't exit the container with either Ctrl+D or Ctrl+C. So I closed the terminal and, in a new terminal, ran docker start -i <containerid> again. Ever since, this particular container has not responded to any command, be it start, restart, exec, or commit. Any command with this container ID or name just hangs, and I can only get out of it with Ctrl+C.
I cannot restart the docker service since it will kill all running docker containers.
I cannot even stop the container using docker container stop <containerid>.
Please help.
You can make use of docker RestartPolicy:
docker update --restart=always <container>
while being mindful of caveats for the Docker version you are running.
Or explore the answer by @Yale Huang to a similar question: How to add a restart policy to a container that was already created
I had to restart the Docker process to revive my container; there was nothing else I could do. I used sudo service docker restart and then revived my container using docker run. I will try to build a Dockerfile out of it in order to avoid future mishaps.
How can I restart docker daemon running in rootless mode on Linux?
Stopping it works fine with:
systemctl --user stop docker.service
but starting it back again fails when using:
systemctl --user start docker.service
The command doesn't return anything but when checking the docker info it says:
ERROR: Cannot connect to the Docker daemon at unix:///run/user/1000/docker.sock. Is the docker daemon running?
It doesn't give any further information...
I have had this error a couple of times before, when I accidentally ran docker with sudo and ended up with mixed-up permissions in my data-root (defined in daemon.json). But this time chowning it back to $USER didn't help with the restart. Restarting the host machine didn't help either (as it had a couple of times previously).
Ok, it seems that "userns-remap" is not compatible with rootless mode:
Rootless mode executes the Docker daemon and containers inside a user namespace. This is very similar to userns-remap mode, except that with userns-remap mode, the daemon itself is running with root privileges, whereas in rootless mode, both the daemon and the container are running without root privileges. Rootless mode does not use binaries with SETUID bits or file capabilities, except newuidmap and newgidmap, which are needed to allow multiple UIDs/GIDs to be used in the user namespace.
I was trying to fix permission issues on shared volumes by experimenting with setting UIDs/GIDs and added "userns-remap" to the ~/.config/docker/daemon.json:
{
"data-root": "/home/me/docker/image-storage",
"userns-remap": "me"
}
So deleting userns-remap from the config file fixed the restarting issue... Man, docker, at least a hint to the config file would be great... Because the userns-remap option was mentioned on some official docker doc pages I didn't even consider it as the source of the trouble in the first place.
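A quick way to check for this before restarting is to look for the key in the rootless config file. This is only a sketch: the function name has_userns_remap is made up for illustration, the path is the one from the answer above, and the grep is a plain text match rather than real JSON parsing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): succeeds if the given file
# mentions the "userns-remap" key. Plain text match, no JSON parsing.
has_userns_remap() {
    grep -q '"userns-remap"' "$1"
}

# Rootless config location as used in the answer above.
config="$HOME/.config/docker/daemon.json"

if [ -f "$config" ] && has_userns_remap "$config"; then
    echo "userns-remap found in $config; remove it before starting rootless dockerd"
fi
```

If the message is printed, delete the key and then start the user service again.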
I read the Enable Live Restore documentation and tried it with this configuration:
ubuntu@ip-10-0-0-230:~$ cat /etc/docker/daemon.json
{
"live-restore": true
}
I started an nginx container in detached mode.
sudo docker run -d nginx
c73a20d1bb620e2180bc1fad7d10acb402c89fed9846f06471d6ef5860f76fb5
$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
c73a20d1bb62 nginx "nginx -g 'daemon of…" 5 seconds ago Up 4 seconds
Then I stopped the dockerd
sudo systemctl stop snap.docker.dockerd.service
and I checked that there was no container running
ps aux | grep nginx
After that, I restarted the docker service and still, there wasn't any container.
Any idea? How does "enable live restore" work?
From the documentation, after modifying the daemon.json (adding "live-restore": true) you need to :
Restart the Docker daemon. On Linux, you can avoid a restart (and avoid any downtime for your containers) by reloading the Docker daemon. If you use systemd, then use the command systemctl reload docker. Otherwise, send a SIGHUP signal to the dockerd process.
You can also do this but it's not recommended :
If you prefer, you can start the dockerd process manually with the --live-restore flag. This approach is not recommended because it does not set up the environment that systemd or another process manager would use when starting the Docker process. This can cause unexpected behavior.
It seems that you had not done this step. You said that you've made the modification to the daemon.json and directly started a container and then stopped the dockerd.
In order to make the Live Restore functionality work follow all steps in the right order :
Modify the daemon.json by adding "live-restore": true
Reload the Docker daemon with the command :
sudo systemctl reload docker
Then try the functionality with your example (firing up a container and making the daemon unavailable).
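The steps above can be sketched as a small shell function. It assumes Docker is managed by systemd under the unit name docker (not the snap unit) and that /etc/docker/daemon.json already contains "live-restore": true. The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reload the daemon configuration, then ask the daemon
# whether live restore is active.
enable_and_check_live_restore() {
    # Reload the configuration without restarting running containers.
    sudo systemctl reload docker || return 1
    # Prints "true" or "false" depending on the daemon's current setting.
    docker info --format '{{.LiveRestoreEnabled}}'
}
```

If calling enable_and_check_live_restore prints true, start a container, stop the daemon, and check with ps that the container process is still there.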
I've tested and it works if you follow the steps in order :
Tested with Docker version 19.03.2, build 6a30dfc and Ubuntu 19.10 (Eoan Ermine)
You've installed Docker via snap : snap.docker.dockerd.service
Unfortunately, this is not recommended, since the snap model is not fully compatible with Docker. Furthermore, docker-snap is no longer maintained by Docker, Inc. Users have run into issues after installing Docker via snap.
You should remove the snap Docker installation to avoid any potential overlapping-installation issues, using this command:
sudo snap remove docker --purge
Then install Docker the official way, and after that try the Live Restore functionality by following the steps above.
Also be careful when restarting the daemon. The documentation says:
Live restore upon restart
The live restore option only works to restore containers if the daemon options, such as bridge IP addresses and graph driver, did not change. If any of these daemon-level configuration options have changed, the live restore may not work and you may need to manually stop the containers.
Also about downtime :
Impact of live restore on running containers
If the daemon is down for a long time, running containers may fill up the FIFO log the daemon normally reads. A full log blocks containers from logging more data. The default buffer size is 64K. If the buffers fill, you must restart the Docker daemon to flush them.
On Linux, you can modify the kernel’s buffer size by changing /proc/sys/fs/pipe-max-size.
I am using docker for the first time and I was trying to implement this -
https://docs.docker.com/get-started/part2/#tag-the-image
At one stage I was trying to connect with localhost by this command -
$ curl http://localhost:4000
which showed this error-
curl: (7) Failed to connect to localhost port 4000: Connection refused
However, I solved this with the following commands -
$ docker-machine ip default
$ curl http://192.168.99.100:4000
After that everything was going fine, but in the last part, I was trying to run the app by using following line according to the tutorial...
$ docker run -p 4000:80 anibar/get-started:part1
But, I got this error
C:\Program Files\Docker Toolbox\docker.exe: Error response from daemon: driver failed programming external connectivity on endpoint goofy_bohr (63f5691ef18ad6d6389ef52c56198389c7a627e5fa4a79133d6bbf13953a7c98): Bind for 0.0.0.0:4000 failed: port is already allocated.
You need to make sure that the previous container you launched is killed, before launching a new one that uses the same port.
docker container ls
docker rm -f <container-name>
Paying tribute to IgorBeaz, you need to stop the container that is currently running. For that you need to know its CONTAINER ID:
$ docker container ls
You get something like:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
12a32e8928ef friendlyhello "python app.py" 51 seconds ago Up 50 seconds 0.0.0.0:4000->80/tcp romantic_tesla
Then you stop the container by:
$ docker stop 12a32e8928ef
Finally you try to do what you wanted to do, for example:
$ docker run -p 4000:80 friendlyhello
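If several containers are running, looking up the right ID by eye gets tedious. The following is a minimal sketch that filters the docker ps output by host port. The function name is hypothetical, and it assumes the Ports column has the usual 0.0.0.0:4000->80/tcp shape shown above (published port ranges are not handled).

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of running containers whose Ports
# column contains a mapping from the given host port.
containers_on_host_port() {
    docker ps --format '{{.ID}} {{.Ports}}' |
        grep -E ":$1->" |
        cut -d' ' -f1
}
```

For example, containers_on_host_port 4000 would print 12a32e8928ef for the listing above, and that ID can then be passed to docker stop.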
I tried all the above answers and none of them worked; in my case even docker container ls doesn't show any running container. It looks like the problem is that the Docker proxy is still holding the ports even though no containers are running. I was using Ubuntu. Here is what solved the problem for me; just run the following two commands:
sudo service docker stop
sudo rm -f /var/lib/docker/network/files/local-kv.db
I solved it this way:
First, I stopped all running containers:
docker-compose down
Then I ran an lsof command to find the process using the port (for me it was port 9000):
sudo lsof -i -P -n | grep 9000
Finally, I "killed" the process (in my case, it was a VSCode extension):
kill -9 <process id>
The quick fix is to just restart Docker:
sudo service docker stop
sudo service docker start
The two answers above are correct but didn't work for me.
I kept seeing an empty list for docker container ls.
Then I tried docker container ls -a, and it showed all the containers, both previously exited and running.
Then docker stop <container id> and docker container stop <container id> didn't work.
Then I tried docker rm -f <container id>, and it worked.
After that, docker container ls -a no longer listed the container.
When I used nginx docker image, I also got this error:
docker: Error response from daemon: driver failed programming external connectivity on endpoint recursing_knuth (9186f7d7f523732b99d3510029cde9679f3f3fe7b7eb5f612d54c4aacea58220): Bind for 0.0.0.0:8080 failed: port is already allocated.
And I solved it using following commands:
$ docker container ls
$ docker stop [CONTAINER ID]
Then running the container again, like this, works:
$ docker run -v $PWD/vueDemo:/usr/share/nginx/html -p 8080:80 -d nginx:alpine
You just need to stop the previous docker container.
I had the same problem with docker-compose. To fix it:
Kill the docker-proxy process
Restart Docker
Start docker-compose again
docker ps will show the list of containers running on Docker. Find the one using the port you need and note down its container ID.
Stop and remove that container using the following commands:
docker stop <container id>
docker rm <container id>
Now run docker-compose up and your services should run as you have freed the needed port.
On Linux, 'sudo systemctl restart docker' solved the issue for me.
For anyone having this problem with docker-compose.
When you have more than one project (i.e. in different folders) with similar services you need to run docker-compose stop in each of your other projects.
If you are using Docker-Desktop, you can quit Docker Desktop and then restart it. It solved the problem for me.
In my case, there was no process to kill.
Updating docker fixed the problem.
It might be a conflict with the same port specified in docker-compose.yml and docker-compose.override.yml or the same port specified explicitly and using an environment variable.
I had a docker-compose.yml with ports on a container specified using environment variables, and a docker-compose.override.yml with one of the same ports specified explicitly. Apparently docker tried to open both on the same container. docker container ls -a listed neither because the container could not start and list the ports.
For me the containers were not showing up as running, so NOTHING was using port 9010 (in my case), BUT Docker still complained.
I did not want to reset my Docker (for Windows), so what I did to resolve it was simply:
Remove the network that I knew a container had previously used with the port in question (9010):
docker network ls
docker network rm <network name or id>
I actually used a new network rather than the old (buggy) one, but that shouldn't be needed
Restart Docker
That was the only way it worked for me. I can't explain it, but somehow the "old" network was still bound to that port (9010) and Docker kept "blocking" it (whining about it).
For Windows:
I killed every process that Docker uses and restarted the Docker service in Services. My containers are working now.
The issue is about ports that are still held by Docker even though you are not using them at that moment.
On Linux, you can run sudo netstat -tulpn to see what is currently listening on that port. You can then choose to configure either that process or your Docker container to bind to a different port to avoid the conflict.
Stopping the container didn't work for me either. I changed the port in docker-compose.yml.
For me, the problem was mapping the same port twice.
Due to a parametric docker run, it ended up being something like
docker run -p 4000:80 -p 4000:80 anibar/get-started:part1
notice double mapping on port 4000.
The log is not informative enough in this case, as it doesn't state I was the cause of the double mapping, and that the port is no longer bound after the docker run command returns with a failure.
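Since the error message does not point at the duplicate, a check before calling docker run can catch it. This is a sketch only: duplicate_host_ports is a hypothetical helper, and it assumes mappings in plain HOST:CONTAINER form (mappings with a bind address such as IP:HOST:CONTAINER are not handled).

```shell
#!/bin/sh
# Hypothetical helper: given port mappings in HOST:CONTAINER form, print
# every host port that appears more than once.
duplicate_host_ports() {
    for mapping in "$@"; do
        # Keep only the part before the first colon (the host port).
        printf '%s\n' "${mapping%%:*}"
    done | sort | uniq -d
}

# The double mapping from the command above is reported as host port 4000.
duplicate_host_ports 4000:80 4000:80
```

An empty result means no host port is requested twice by the generated arguments.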
Don't forget the easiest fix of all....
Restart your computer.
I have tried most of the above and still couldn't fix it. Then just restart my Mac and then it's all back to normal.
For anyone still looking for a solution, just make sure you have bound your port the right way round in your docker-compose.yml
It goes:
- <EXTERNAL SERVER PORT>:<INTERNAL CONTAINER PORT>
Had the same problem. Went to Docker for Mac Dashboard and clicked restart. Problem solved.
My case was dumb XD I was exposing port 80 twice :D
ports:
  - '${APP_PORT:-80}:80'
  - '${APP_PORT:-8080}:8080'
APP_PORT is defined, thus 80 was exposed twice.
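The ${VAR:-default} syntax in the ports list behaves like the same expansion in a POSIX shell, so the effect can be reproduced there. The value APP_PORT=80 below is an assumption for illustration; the original post does not say what it was set to.

```shell
#!/bin/sh
# With APP_PORT set, both expressions resolve to the same host port.
APP_PORT=80   # assumed value, for illustration only
first="${APP_PORT:-80}:80"
second="${APP_PORT:-8080}:8080"
echo "$first"    # 80:80
echo "$second"   # 80:8080, so host port 80 is requested twice

# With APP_PORT unset, each default applies and the host ports differ.
unset APP_PORT
echo "${APP_PORT:-80}:80 ${APP_PORT:-8080}:8080"   # 80:80 8080:8080
```

Using a separate variable for the second mapping avoids the collision whenever APP_PORT is set.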
I tried almost all of the solutions and found a probable cause and fix. If you are using Traefik or any other networking server, it internally sets up a proxy for load balancing. Most people use the blueprint as is, and it works fine. It then passes load control entirely to nginx or a similar proxy server, so stopping, killing (the networking server), or pruning might not help.
Solution for traefik with nginx,
sudo /etc/init.d/nginx stop
# or
sudo service nginx stop
# or
sudo systemctl stop nginx
Credits
How to stop docker processes
Making Docker Stop Itself <- Safe and Fast
this is the best way to stop containers and all unstoppable processes: making docker do the job.
go to docker settings > resources. change any of the resources and click apply and restart.
docker will stop itself and its every process -- even the most stubborn ones that might not be killed by other commonly used commands such as kill or more wild commands like rm suggested by others.
i ran into a similar problem before and all the good - proper - tips from my colleagues somehow did not work out. i share this safe trick whenever someone in my team asks me about this.
Error response from daemon: driver failed programming external connectivity on endpoint foobar
Bind for 0.0.0.0:8000 failed: port is already allocated
hope this helps!
simply restart your computer, so the docker service gets restarted
I run some containers with the option --restart always.
It works well, so well that I now have difficulty stopping these containers :)
I tried :
sudo docker stop container && sudo docker rm -f container
But the container still restarts.
The docker documentation explains the restart policies, but I didn't find anything to resolve this issue.
Just
sudo docker rm -f container
will kill the process if it is running and remove the container, in one step.
That said, I couldn't replicate the symptoms you described. If I run with --restart=always, docker stop will stop the process and it remains stopped.
I am using Docker version 1.3.1.
docker update --restart=no <container>
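Combined with a stop, the command above can be wrapped as follows. This is a sketch with a hypothetical function name, and it assumes a Docker version recent enough to provide docker update (the 1.3.1 release mentioned earlier does not have it).

```shell
#!/bin/sh
# Hypothetical helper: clear the restart policy first, then stop the
# container so that Docker does not bring it back.
stop_for_good() {
    docker update --restart=no "$1" && docker stop "$1"
}
```

Call it with a container name or ID, for example stop_for_good <container>.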
Many thanks to those who took the time to respond.
If you use docker directly, Bryan is right sudo docker rm -f container is enough.
My problem was mainly that I use puppet to deploy docker images and run containers. I use this module and it creates entries in /etc/init for the upstart process manager.
I think my problem was some kind of incompatibility between the process manager and Docker.
In this situation, to halt a container, simply sudo stop docker-container.
More information on managing Docker container runs can be found on the Docker website.