Docker daemon cannot be started for some (hidden) reason - docker

I am trying to push a docker image and noticed that my docker daemon actually is probably not running.
If for example I run:
docker run hello-world
docker: Cannot connect to the Docker daemon at
unix:///var/run/docker.sock. Is the docker daemon running?.
If I try to restart the daemon using:
systemctl start docker
Job for docker.service failed because the control process exited with
error code. See "systemctl status docker.service" and "journalctl -xe"
for details.
Continuing running:
systemctl status docker.service
docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor
preset: enabled)
Active: failed (Result: start-limit-hit) since Wed 2021-05-12 14:45:09
EEST; 43s ago
Docs: https://docs.docker.com
Process: 4810 ExecStart=/usr/bin/dockerd -H fd://
--containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
Main PID: 4810 (code=exited, status=1/FAILURE)
May 12 14:45:07 iti-554 systemd[1]: docker.service: Unit entered
failed state.
May 12 14:45:07 iti-554 systemd[1]: docker.service: Failed with result
'exit-code'.
May 12 14:45:09 iti-554 systemd[1]: docker.service: Service hold-off
time over, scheduling restart.
May 12 14:45:09 iti-554 systemd[1]: Stopped Docker Application
Container Engine.
May 12 14:45:09 iti-554 systemd[1]: docker.service: Start request
repeated too quickly.
May 12 14:45:09 iti-554 systemd[1]: Failed to start Docker Application
Container Engine.
May 12 14:45:09 iti-554 systemd[1]: docker.service: Unit entered
failed state.
May 12 14:45:09 iti-554 systemd[1]: docker.service: Failed with result
'start-limit-hit'.
which as I understand it it means docker daemon is not loaded (it's in a failed state) and the last reason for this is the start-limit-hit has been reached. This on this side probably means another reason exists for this to happen.
SO, how do I find out which is the actual reason for my docker daemon refusing to start?
If I run to reset the failed attemps counter with:
systemctl reset-failed docker.service
it return without error so I assume it succeeds. And indeed when I check the status it has become:
Active: inactive (dead) since Wed 2021-05-12 14:45:09 EEST; 14min ago
Of course if I run docker daemon again it fails.
Can someone provide any workaround about this issue? I even tried to invoke the commands after restarting (didn't help).
Edit
Well, to my case the problem was a rather stupid one. I had added a daemon.json file with minimal content in it. Just this:
cat /etc/docker/daemon.json
{
"insecure-registries": [
"docker-server.com:10022",
"docker-server.com:10023"
],
}
The problem was that the dangling comma before } made docker search for another parameter. The relevant message shown using journalctl -u docker was:
unable to configure the Docker daemon with file
/etc/docker/daemon.json: invalid character '}' looking for beginning
of object key string
is quite obvious but the previous ones did not help much.

journalctl -u docker gives you docker daemon logs. Maybe u can find something there.

The unix:///var/run/docker.sock requires the correct permission to work. This a security feature for Docker.
Try sudo chmod 755 /var/run/docker.sock and re-run Docker command.
Note the permission number given here may not be suitable for everyone.

Related

Docker - error after moved storage to second disk and using overlay2

I just moved Docker default storage location to second disk setting up a /etc/docker/daemon.json as described in documentation, so far so goood.
The problem is that now I keep getting a bunch of volumes being continuously (re)mounted, ad obiously it is really annoying.
So I tried to set up overlay2 in /etc/docker/daemon.json, but now Docker doesn't event start
# sudo systemctl restart docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xeu docker.service" for details.
# systemctl status docker.service
× docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2022-12-15 11:06:36 CET; 10s ago
TriggeredBy: × docker.socket
Docs: https://docs.docker.com
Process: 17614 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status>
Main PID: 17614 (code=exited, status=1/FAILURE)
CPU: 54ms
dic 15 11:06:36 sgratani-OptiPlex-7060 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
dic 15 11:06:36 sgratani-OptiPlex-7060 systemd[1]: Stopped Docker Application Container Engine.
dic 15 11:06:36 sgratani-OptiPlex-7060 systemd[1]: docker.service: Start request repeated too quickly.
dic 15 11:06:36 sgratani-OptiPlex-7060 systemd[1]: docker.service: Failed with result 'exit-code'.
dic 15 11:06:36 sgratani-OptiPlex-7060 systemd[1]: Failed to start Docker Application Container Engine.
So, for now I give up using overlay2 since having all the Docker images and container on the seond disk is more important than getting rid of a bunch of volumes being mounted continuously, but can anyone tells me where the problem is and if there is a solution?
Update #1: strange permissions behaviour problem
Got a simple docker-compose.yml with a Wordpress service (official WP image) and a database service, and when I have the docker storage location on the second disk instead of default one the database (volume maybe?) seems inaccessible:
wordpress keep giving error on db connection
trying to run mysql interactive from db service result in error on login with root user
Obviously this is related to the docker storage location, but cannot find why, since new location is created by docker itself when started.

docker.service failed. See 'journalctl -xe' for details

docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2018-05-17 15:47:26 CEST; 17h ago
Docs: https://docs.docker.com
Main PID: 11843 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/docker.service
May 18 08:48:38 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 08:49:09 temp systemd[1]: Stopped Docker Application Container Engine.
May 18 08:49:09 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 08:49:09 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 08:49:15 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 08:49:15 temmp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 09:00:03 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 09:00:03 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 09:03:51 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 09:03:51 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Tried to uninstall docker and reinstalled it but it raises the same error is the docker daemon running can someone help me here.
There is a service that docker requires that is not running, thus, systemd won't launch docker.
Try launching journalctl -f (without -u) to see all unit logs, then start docker and read carefully the log, you will probably see some other units trying to start and failing.
You can find the reason why docker isn't starting by running
/usr/bin/dockerd -H unix://
In my case it was a fresh install of Centos7 with Docker 18.09
ERRO[2018-11-14T22:14:55.441548150+02:00] 'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded. storage-driver=overlay2
ERRO[2018-11-14T22:14:55.444930007+02:00] AUFS was not found in /proc/filesystems storage-driver=aufs
ERRO[2018-11-14T22:14:55.447984399+02:00] 'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded. storage-driver=overlay
To fix that, I had to upgrade to a newer kernel, and remove the current docker storage
rm -rf /var/lib/docker
Then docker started working
I have this problem on my machine. I don't have success to solve this issue.
But if you are in a hury you can do
/usr/bin/dockerd -H unix:///var/run/docker.sock
All classic commands will work (docker system, docker etc..)

Start Docker daemon as another user?

Installed Docker 17.x version in RHEL and we are getting below excetption.
-bash-4.2$ docker version
Client:
Version: 17.09.1-ce
API version: 1.32
Go version: go1.8.3
Git commit: 19e2cf6
Built: Thu Dec 7 22:23:40 2017
OS/Arch: linux/amd64
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.32/version: dial unix /var/run/docker.sock: connect: permission denied
-bash-4.2$
to solve this , we introduce another user group (docker-user) and we added all the users in this group. after that we ran this command and able to ran docker .
sudo systemctl stop docker
sudo systemctl start docker
cd /var/run
sudo chown :docker-user docker.sock
But we are facing another issue that whenever VM is getting restarted ,DOCKER is not running. So we decided to setup run docker as daemon process and we followed
below steps.
1. create docker.conf file under /etc/systemd/system/docker.service.d folder.
2. added this entry in docker.conf file
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
ExecStartPost=/bin/chown :docker-user /var/run/docker.sock
After adding this entry and we ran
1. sudo systemctl daemon-reload
2. sudo systemctl stop docker
3. sudo systemctl start docker
We are getting below exception
-bash-4.2$ sudo systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─docker.conf
Active: failed (Result: start-limit) since Wed 2018-03-28 09:10:50 PDT; 12s ago
Docs: https://docs.docker.com
Process: 23395 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock (code=exited, status=1/FAILURE)
Main PID: 23395 (code=exited, status=1/FAILURE)
Mar 28 09:10:50 hostname systemd[1]: Failed to start Docker Application Container Engine.
Mar 28 09:10:50 hostname systemd[1]: Unit docker.service entered failed state.
Mar 28 09:10:50 hostname systemd[1]: docker.service failed.
Mar 28 09:10:50 hostname systemd[1]: docker.service holdoff time over, scheduling restart.
Mar 28 09:10:50 hostname systemd[1]: start request repeated too quickly for docker.service
Mar 28 09:10:50 hostname systemd[1]: Failed to start Docker Application Container Engine.
Mar 28 09:10:50 hostname systemd[1]: Unit docker.service entered failed state.
Mar 28 09:10:50 hostname systemd[1]: docker.service failed.
Guide me how to setup docker as daemon process
So, you have already dug in pretty good.
However, this behavior is built into Docker. For user groups, the docker daemon will allow users with the docker group to access the server (important to remember, this is exactly the same as giving root access to any users in that group!). If you wanted to specify a different group, you can start the daemon with the -g option.
Installing docker also installs a systemd service unit to run the daemon. The correct way to enable that (so that it restarts automatically) is
sudo systemctl enable docker
At this point, you didn't include enough of the journalctl logs to actually say why the daemon is not starting for you- if it is an option, I would try starting over since I don't know everything you have messed with, but if that is not an option the journalctl logs will likely explain the problem (probably that the user 'docker' no longer has access to the socket after you chowned it, but that is just a guess)

Failed to start Docker Application Container Engine

I am new to Docker, so don't have much idea about it.
I tried restarting Docker service using command
service docker restart
As the command was taking too much of time I did a CTL+C
Now I am not able to start docker deamon
Any docker command gives following op
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
I tried starting Docker deamon using
systemctl start docker
But it outputs:
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
Output of command
**systemctl status docker.service**
`● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─docker.conf, http-proxy.conf, https-proxy.conf
Active: failed (Result: exit-code) since Mon 2018-03-05 17:17:54 IST; 2min 23s ago
Docs: https://docs.docker.com
Process: 11331 ExecStart=/usr/bin/dockerd --graph=/app/dockerRT (code=exited, status=1/FAILURE)
Main PID: 11331 (code=exited, status=1/FAILURE)
Memory: 76.9M
CGroup: /system.slice/docker.service
└─4593 docker-containerd-shim 3bda33eac892d14adda9f3b1fc8dc52173e26ce60ca949075227d903399c7517 /var/run/docker/libcontainerd/3bda33eac892d14adda9f3b1fc8dc52173e26c...
Mar 05 17:17:05 hj-fsbfsd9761.persistent.co.in systemd[1]: Starting Docker Application Container Engine...
Mar 05 17:17:05 hj-fsbfsd9761.persistent.co.in dockerd[11331]: time="2018-03-05T17:17:05.126009059+05:30" level=info msg="libcontainerd: new containerd process, pid: 11337"
Mar 05 17:17:06 hj-fsbfsd9761.persistent.co.in dockerd[11331]: time="2018-03-05T17:17:06.346599571+05:30" level=warning msg="devmapper: Usage of loopback devices is ...section."
Mar 05 17:17:10 hj-fsbfsd9761.persistent.co.in dockerd[11331]: time="2018-03-05T17:17:10.889378989+05:30" level=warning msg="devmapper: Base device already exists an...ignored."
Mar 05 17:17:10 hj-fsbfsd9761.persistent.co.in dockerd[11331]: time="2018-03-05T17:17:10.976695025+05:30" level=info msg="[graphdriver] using prior storage driver \"...mapper\""
Mar 05 17:17:54 hj-fsbfsd9761.persistent.co.in dockerd[11331]: time="2018-03-05T17:17:54.312812069+05:30" level=fatal msg="Error starting daemon: timeout"
Mar 05 17:17:54 hj-fsbfsd9761.persistent.co.in systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Mar 05 17:17:54 hj-fsbfsd9761.persistent.co.in systemd[1]: **Failed to start Docker Application Container Engine.**
Mar 05 17:17:54 hj-fsbfsd9761.persistent.co.in systemd[1]: Unit docker.service entered failed state.
Mar 05 17:17:54 hj-fsbfsd9761.persistent.co.in systemd[1]: docker.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
journalctl -xe
loop: Write error at byte offset 63585648640, length 4096.
How would I be able to start Docker without losing any containers and using previous configurations?
I had the same issue (Fedora 30 x86_64, kernel 5.2.9) and it turned out that being connected to a VPN was the problem. Apparently having a changed gateway address causes an "Error initializing network controller" error which I was able to see when I tried starting docker via sudo dockerd instead of sudo systemctl start docker.
I found the note here about the VPN being a possible problem, disconnecting immediately allowed me to start docker with systemctl start docker.
"Failed to start Docker Application Container Engine" is a general error message.
You should inspect journal for more details:
journalctl -eu docker
In my case it was: "error initializing graphdriver: /var/lib/docker contains several valid graphdrivers: devicemapper, overlay2"
Changing graphdriver to overlay2, fixed it:
$ sudo systemctl stop docker
$ vi /etc/docker/daemon.json # Create the file if it does not exist, and add:
{
"storage-driver": "overlay2"
}
$ sudo systemctl start docker
$ systemctl status docker.service # Hopefully it's running now
I removed the file /etc/docker/daemon.json and started it with sudo systemctl start docker and it worked!!
For the googlers:
For me what worked was doing a killall dockerd and a sudo rm /var/run/docker.pid
I got the error message that recommended that from running dockerd
I ran into this too but all I did after receiving the error was sudo systemctl start docker and then ran sudo systemctl status docker again and that took care of it while on a vpn.
You may have the issue with the current docker installation.
If you don't have much done already you may want to try to reinstall using the install script provided by Docker: https://docs.docker.com/install/linux/docker-ce/ubuntu/#install-using-the-convenience-script this will help you to investigate the errors.
I faced the same problem, in my case the disk was full. I removed docker volumes from /var/lib/docker/volumes/ and started docker with sudo systemctl start docker.

docker not responding when using different data directory

I want to change the image directory in docker. I tried the initial two methods mentioned here. Both methods work and change the directory for docker images. But the problem is that the images stop responding. I can run the hello world example but if I run the ubuntu container or the whalesay container, docker stops responding and I can't run it again.
docker run -it ubuntu bash
docker run docker/whalesay cowsay boo
On using the above commands, the images get downloaded and nothing happens. Then I enter the command again to run and the system stops responding. I used Ctrl + C to terminate it but after that I can not open any other terminal screen. Also, the system doesn't power off; it gets stuck at a black screen. On force restarting the system docker starts failing to run giving the following log:
docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2017-04-14 20:12:14 EDT; 10min ago
Docs: https://docs.docker.com
Process: 1160 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Main PID: 1160 (code=exited, status=1/FAILURE)
Apr 14 20:12:14 abmittal-linux systemd[1]: Starting Docker Application Container Engine...
Apr 14 20:12:14 abmittal-linux dockerd[1160]: unable to configure the Docker daemon with file /etc/docker/daemon.json: EOF
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Apr 14 20:12:14 abmittal-linux systemd[1]: Failed to start Docker Application Container Engine.
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Unit entered failed state.
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Failed with result 'exit-code'
Removing and reinstalling docker also doesn't work if the directory is same as before (even if the directory has been deleted and then made again). I have to change the directory in the configuration to get it to run again but again it stops responding.
The following is my daemon.json file:
{
"graph":"/mnt/other/docker_images"
}
EDIT: I think I may have found the error. The partition /mnt/other is using NTFS file system (and is on a different disk). Can someone please confirm if this might be the source of the error?
This is a known bug: Link
I tried creating a custom directory on an ext4 partition and it worked.

Resources