docker.service: Service lacks both ExecStart= and ExecStop= setting. Refusing - docker

My VM crashed because it was out of memory. After rebooting the machine docker was not running:
systemctl status docker
● docker.service
Loaded: error (Reason: Invalid argument)
Active: inactive (dead)
Dec 19 08:18:21 my-vm-single-instance systemd[1]: [/lib/systemd/system/docker.service:1] Assignment outside of section. Ignoring.
Dec 19 08:18:21 my-vm-single-instance systemd[1]: docker.service: Service lacks both ExecStart= and ExecStop= setting. Refusing.
I installed docker using the official documentation: https://docs.docker.com/engine/install/debian/
The VM is running:
Debian GNU/Linux 9 (stretch)
Docker version 19.03.14, build 5eb3275d40
docker-compose version 1.25.4, build 8d51620a
I got docker up and running again with
dockerd
However I would like to get it running again through systemctl.
The contents of /lib/systemd/system/docker.service are:
Environment="GOOGLE_APPLICATION_CREDENTIALS=/etc/docker/key.json"
Any ideas how to fix this problem?

If docker.service really contains only that one line, as mentioned, the file is bogus. As the error says
docker.service: Service lacks both ExecStart= and ExecStop= setting. Refusing.
at the very least the execution commands are missing.
Here is a sample service file:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
OOMScoreAdjust=-500
[Install]
WantedBy=multi-user.target
This is my default service file. I've never modified it after installation.
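To apply a restored unit file, reload systemd and start the service again. A minimal sketch, assuming you also want to keep the GOOGLE_APPLICATION_CREDENTIALS line from the broken file (the drop-in name credentials.conf is my own choice):
sudo mkdir -p /etc/systemd/system/docker.service.d
# Keep the custom environment variable in a drop-in instead of the unit file itself
printf '[Service]\nEnvironment="GOOGLE_APPLICATION_CREDENTIALS=/etc/docker/key.json"\n' | sudo tee /etc/systemd/system/docker.service.d/credentials.conf
# Pick up the restored unit file and the drop-in, then start docker
sudo systemctl daemon-reload
sudo systemctl enable --now docker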

Related

docker not starting and failing with error: Failed to start docker.service: Unit not found

When I am trying to start docker using the command:
sudo systemctl start docker
I am getting below error
Failed to start docker.service: Unit not found.
I tried some suggestions I found on the web, such as Cannot start docker daemon in CentOS7, but they didn't solve the issue.
This is my docker.socket file [which is just a copy-paste from one of the answers]:
[Unit]
Description=Docker Socket for the API
PartOf=docker.service
[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
This is the error I am getting while starting docker.socket:
sudo systemctl start docker.socket
See "systemctl status docker.socket" and "journalctl -xe" for details.
output of "systemctl status docker.socket"
systemctl status docker.socket
systemd[1]: Socket service docker.service not loaded, refusing.
systemd[1]: Failed to listen on Docker Socket for the API.
docker version details
Client: Docker Engine - Community
Version: 19.03.2
API version: 1.40
Go version: go1.12.8
Git commit: 6a30dfca03
Built: Thu Aug 29 05:26:30 2019
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
To me, it looks like docker.service is the issue. Could you please suggest how I can resolve it?
There should be a docker.service unit file at either /lib/systemd/system or /etc/systemd/system. Mine looks like what's shown below.
If you have one there, you can try to make sure it's enabled via:
sudo systemctl enable docker.service
Here's an example of the docker.service unit file:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
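If the unit file was missing and you had to recreate it, you can verify what systemd actually sees before starting the service (a sketch; systemctl cat prints the loaded unit file plus any drop-ins):
ls -l /lib/systemd/system/docker.service /etc/systemd/system/docker.service
systemctl cat docker.service        # the unit file systemd actually loaded
sudo systemctl daemon-reload        # re-read unit files after any change
sudo systemctl start docker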

docker.socket: Failed with result 'service-start-limit-hit' after protecting docker daemon socket

I followed the steps provided in the documentation here to add TLS security for the docker API. Certificates are located in the ~/.docker/ as well as the /etc/docker/ssl/ folder. I added override.conf to /etc/systemd/system/docker.service.d/ with the content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 --tlsverify --tlscacert=ca.pem --tlscert=server-cert.pem --tlskey=server-key.pem
Then I reloaded the daemon and started docker:
$ systemctl daemon-reload
$ service docker start
The errors in journalctl -xe are:
-- Unit docker.socket has finished starting up.
--
-- The start-up result is RESULT.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.service: Start request repeated too quickly.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.service: Failed with result 'exit-code'.
Jan 15 21:43:24 cynicalplyaground systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has failed.
--
-- The result is RESULT.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'.
Jan 15 21:45:01 cynicalplyaground CRON[12768]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 15 21:45:01 cynicalplyaground CRON[12769]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 15 21:45:01 cynicalplyaground CRON[12768]: pam_unix(cron:session): session closed for user root
How can I sort out this issue?
In my case the same error occurred after the latest Manjaro update (2020-01-20).
I tried changing the systemd docker service, as advised in other cases, but I reverted those changes and the problem was finally solved with:
a reboot of the system
(as advised here: https://www.reddit.com/r/archlinux/comments/7ya4ug/installing_docker_on_arch_linux/)
Getting to the root of the problem:
systemctl status docker.service
has this:
/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Trying to run that command, it complains about
unable to configure the Docker daemon with file /etc/docker/daemon.json: EOF
ls -l /etc/docker/daemon.json
-rw-r--r-- 1 root root 0 Jul 30 10:32 /etc/docker/daemon.json
NOTE that the JSON file is empty. Delete it.
For me it was because the docker installer uses iptables for NAT. Unfortunately Debian uses nftables. You can convert the entries over to nftables or just set up Debian to use the legacy iptables:
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
dockerd should start fine after switching to iptables-legacy.
I had the same issue and just changed "/usr/bin/dockerd" to "/usr/sbin/dockerd"; then it worked.
Check the dockerd path first.
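For example (a sketch; the two paths checked here are just the usual candidates and vary by distribution):
command -v dockerd                                     # the dockerd your shell would run
ls -l /usr/bin/dockerd /usr/sbin/dockerd 2>/dev/null   # which of the usual paths exists
Then make ExecStart= in docker.service point at the path that actually exists.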
In my case the host was part of a docker swarm, but the IPv6 address was no longer reachable or automatically assigned to the host.
I manually added the old IPv6 address:
ip -6 address add 28xx:xxxx:x:x:xx:ebff:fe14:xxx dev ens3x
journalctl -u docker.service mentioned:
level=fatal msg="Error starting cluster component: could not find local IP address: dial udp [2xxx:xxx:xxxx:xxx]:2377: connect: network is unreachable"
After adding the IPv6 address manually I was able to start docker, so with docker running I left the swarm and rebooted:
docker swarm leave --force
After the reboot the docker services ran as usual.
For me it was missing disk space. A reboot also helped, but I was still not able to build any container.
After pruning some outdated stuff from the docker volumes I was able to continue.
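A sketch of that kind of cleanup with standard docker CLI commands (prune deletes data, so read the confirmation prompts carefully):
df -h /var/lib/docker    # how full is the docker data directory?
docker system prune      # remove stopped containers, dangling images, unused networks
docker volume prune      # remove volumes not used by any container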
I faced a similar issue on Ubuntu because I added the hosts option to the /etc/docker/daemon.json file. That's fine in itself, but on systems that use systemd it can conflict with the arguments passed to dockerd on start.
The solution was to delete the hosts entry from /etc/docker/daemon.json and set this config in the file /etc/systemd/system/docker.service.d/options.conf.
$ cat /etc/systemd/system/docker.service.d/options.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix://
After that, restart the service.
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
You can check that your changes have been applied by running docker info. You will also see in the docker service status that the Drop-In field points at the options.conf just created, and that dockerd was executed with the specified host list.
$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset>
Drop-In: /etc/systemd/system/docker.service.d
└─options.conf
Active: active (running) since Fri 2022-11-18 01:02:18 EST; 1h 50min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 1111 (dockerd)
Tasks: 18
Memory: 58.5M
CPU: 1.294s
CGroup: /system.slice/docker.service
└─1111 /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix://
References:
Daemon configuration file
Control Docker with systemd
I had a similar issue on NixOS installed on a btrfs filesystem.
For me the solution was to add virtualisation.docker.storageDriver = "btrfs"; to my /etc/nixos/configuration.nix.
Which according to the docker docs should equate to adding the following to /etc/docker/daemon.json in most other distros:
{
"storage-driver": "btrfs"
}
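Either way, you can verify which storage driver the daemon actually picked up:
docker info --format '{{.Driver}}'    # expect: btrfs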
I was able to solve the problem by disabling firewalld:
systemctl disable firewalld
systemctl stop firewalld

docker flannel subnet issues

Docker is not picking up the flannel subnet. Any help would be appreciated. I am using CoreOS as my container Linux, and my docker version is 1.12.6.
My docker startup file looks like below.
Flannel itself is working as expected.
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=containerd.service docker.socket network.target
Requires=containerd.service docker.socket
[Service]
Type=notify
EnvironmentFile=-/run/flannel/flannel_docker_opts.env
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/lib/coreos/dockerd --host=fd:// --containerd=/var/run/docker/libcontainerd/docker-containerd.sock $DOCKER_OPTS $DOCKER_CGROUPS $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPT_IPMASQ
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this option.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/docker.service.d/40-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/kubernetes/cni/docker_opts_cni.env
# /etc/systemd/system/docker.service.d/40-storage.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --host=fd:// --graph="/abc/docker" $DOCKER_OPTS $DOCKER_CGROUPS $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPT_IPMASQ
# /etc/systemd/system/docker.service.d/50-insecure-registry.conf
[Service]
Environment=DOCKER_OPTS='--insecure-registry="10.x.x.x:5000"'
# /etc/systemd/system/docker.service.d/50-require-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
Check whether /run/flannel/flannel_docker_opts.env exists and what it contains.
My /run/flannel/subnet.env looks like:
FLANNEL_NETWORK=10.252.0.0/16
FLANNEL_SUBNET=10.252.127.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
Add --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} to the dockerd arguments.
If that does not work, rm -rf /var/lib/docker/overlay2/* and restart docker.service.
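A sketch of how those flags could be wired in via a drop-in, assuming flannel writes the usual variables to /run/flannel/subnet.env as shown above (the drop-in name 60-flannel-bip.conf is my own choice):
# /etc/systemd/system/docker.service.d/60-flannel-bip.conf
[Service]
EnvironmentFile=/run/flannel/subnet.env
ExecStart=
ExecStart=/usr/lib/coreos/dockerd --host=fd:// --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}
Then reload and restart: sudo systemctl daemon-reload && sudo systemctl restart docker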

Returning host IP address into docker container inside systemd configuration file

I am working on a systemd configuration file, which needs the current docker host's IP address as a parameter. I tried the following and it worked fine when I ran it directly:
docker run -p 5555:5555 -e REMOTE_HOST="http://`/bin/hostname --ip-address`:5555" -e HUB_PORT_4444_TCP_ADDR=selhub.stage.internal -e HUB_PORT_4444_TCP_PORT=4444 --name=stage_selff stage_selff
But when I tried to add this to the systemd unit file,
[Unit]
Description=Selenium node container
Requires=docker.service
After=docker.service
[Service]
Restart=on-failure
RestartSec=10
ExecStartPre=-/usr/bin/docker kill stage_selff
ExecStartPre=-/usr/bin/docker rm stage_selff
ExecStart=/usr/bin/docker run -p 5555:5555 -e REMOTE_HOST="http://`/bin/hostname --ip-address`:5555" -e HUB_PORT_4444_TCP_ADDR=selhub.stage.internal -e HUB_PORT_4444_TCP_PORT=4444 --name=stage_selff stage_selff
ExecStop=-/usr/bin/docker stop stage_selff
[Install]
WantedBy=multi-user.target
It's not working, and I cannot find clear logs explaining why, but I am guessing the backtick operator is not accepted by systemd.
Here is what the logs say:
Sep 15 01:27:06 docker[831]: 21:27:06.337 INFO - Launching a Selenium Grid node
Sep 15 01:29:06 docker[831]: 21:29:06.668 WARN - error getting the parameters from the hub. The node may end up with wrong timeouts.Connect
Sep 15 01:29:06 docker[831]: 21:29:06.675 INFO - Java: Oracle Corporation 25.45-b02
Sep 15 01:29:06 docker[831]: 21:29:06.678 INFO - OS: Linux 3.16.0-4-amd64 amd64
Sep 15 01:29:06 docker[831]: 21:29:06.690 INFO - v2.47.1, with Core v2.47.1. Built from revision 411b314
Sep 15 01:29:06 docker[831]: 21:29:06.735 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:
That construction won't work, because ExecStart and its siblings (like ExecStartPre, ExecStop, etc.) don't use a shell; the command line goes to the execve(3) system call. That means shell constructs like backticks, $(command), >, >>, < and others have no effect.
See Command lines for details.
You may try to employ the Environment and EnvironmentFile options to pass values from outside the unit file.
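A sketch of the EnvironmentFile approach (the path /run/stage_selff.env and the variable name HOST_IP are my own choices; ExecStartPre runs an explicit shell to do the command substitution, and systemd then expands ${HOST_IP} itself):
[Service]
# Run a real shell so the $(...) substitution actually happens
ExecStartPre=/bin/sh -c 'echo "HOST_IP=$(/bin/hostname --ip-address)" > /run/stage_selff.env'
EnvironmentFile=/run/stage_selff.env
ExecStart=/usr/bin/docker run -p 5555:5555 -e REMOTE_HOST=http://${HOST_IP}:5555 -e HUB_PORT_4444_TCP_ADDR=selhub.stage.internal -e HUB_PORT_4444_TCP_PORT=4444 --name=stage_selff stage_selff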

Docker containers shut down after systemd start

For some reason, when using systemd unit files, my docker containers start but get shut down instantly. I have tried finding logs but cannot see any indication of why this is happening. Does someone know how to solve this, or how to find the logs that show what is happening?
Note: when starting the container manually after boot with docker start containername it works (also when using systemctl start nginx).
After some more digging I found this error: could not find udev device: No such device. Could it have something to do with this?
Unit Service file:
[Unit]
Description=nginx-container
Requires=docker.service
After=docker.service
[Service]
Restart=always
RestartSec=2
StartLimitInterval=3600
StartLimitBurst=5
TimeoutStartSec=5
ExecStartPre=-/usr/bin/docker kill nginx
ExecStartPre=-/usr/bin/docker rm nginx
ExecStart=/usr/bin/docker run -i -d -t --restart=no --name nginx -p 80:80 -v /projects/frontend/data/nginx/:/var/www -v /projects/frontend:/nginx nginx
ExecStop=/usr/bin/docker stop -t 2 nginx
[Install]
WantedBy=multi-user.target
Journalctl output:
May 28 11:18:15 frontend dockerd[462]: time="2015-05-28T11:18:15Z" level=info msg="-job start(d757f83d4a13f876140ae008da943e8c5c3a0765c1fe5bc4a4e2599b70c30626) = OK (0)"
May 28 11:18:15 frontend dockerd[462]: time="2015-05-28T11:18:15Z" level=info msg="POST /v1.18/containers/nginx/stop?t=2"
May 28 11:18:15 frontend dockerd[462]: time="2015-05-28T11:18:15Z" level=info msg="+job stop(nginx)"
Docker logs: empty (docker logs nginx)
Systemctl output: (systemctl status nginx, nginx.service)
● nginx.service - nginx-container
Loaded: loaded (/etc/systemd/system/multi-user.target.wants/nginx.service)
Active: failed (Result: start-limit) since Thu 2015-05-28 11:18:20 UTC; 12min ago
Process: 3378 ExecStop=/usr/bin/docker stop -t 2 nginx (code=exited, status=0/SUCCESS)
Process: 3281 ExecStart=/usr/bin/docker run -i -d -t --restart=no --name nginx -p 80:80 -v /projects/frontend/data/nginx/:/var/www -v /projects/frontend:/nginx (code=exited, status=0/SUCCESS)
Process: 3258 ExecStartPre=/usr/bin/docker rm nginx (code=exited, status=0/SUCCESS)
Process: 3246 ExecStartPre=/usr/bin/docker kill nginx (code=exited, status=0/SUCCESS)
Main PID: 3281 (code=exited, status=0/SUCCESS)
May 28 11:18:20 frontend systemd[1]: nginx.service holdoff time over, scheduling restart.
May 28 11:18:20 frontend systemd[1]: start request repeated too quickly for nginx.service
May 28 11:18:20 frontend systemd[1]: Failed to start nginx-container.
May 28 11:18:20 frontend systemd[1]: Unit nginx.service entered failed state.
May 28 11:18:20 frontend systemd[1]: nginx.service failed.
Because you have not specified a Type in your systemd unit file, systemd is using the default, simple. From systemd.service:
If set to simple (the default if neither Type= nor BusName=, but
ExecStart= are specified), it is expected that the process
configured with ExecStart= is the main process of the service.
This means that if the process started by ExecStart exits, systemd
will assume your service has exited and will clean everything up.
Because you are running the docker client with -d, it exits
immediately...thus, systemd cleans up the service.
Typically, when starting containers with systemd, you would not use
the -d flag. This means that the client will continue running, and
will allow systemd to collect any output produced by your application.
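For example, a sketch of the corrected ExecStart from the unit above, with -d removed (and -i/-t dropped, since a service has no terminal) so the client stays in the foreground and systemd can supervise it:
ExecStart=/usr/bin/docker run --restart=no --name nginx -p 80:80 -v /projects/frontend/data/nginx/:/var/www -v /projects/frontend:/nginx nginx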
That said, there are fundamental problems in starting Docker containers with systemd. Because of the way Docker operates, there really is no way for systemd to monitor the status of your container. All it can really do is track the status of the docker client, which is not the same thing (the client can exit/crash/etc without impacting your container). This isn't just relevant to systemd; any sort of process supervisor (upstart, runit, supervisor, etc) will have the same problem.
