When I start Docker (version 18.09.6) using:
service docker start
the log shows:
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058391699+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420166a80, READY" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.065016392+08:00" level=error msg="'overlay2' is not supported over nfs" storage-driver=overlay2
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: Error starting daemon: error initializing graphdriver: backing file system is unsupported for this graph driver
This is my daemon.json config:
{
"data-root": "/data/docker/lib/docker",
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true",
"overlay2.size=1G"
]
}
What is the problem with my Docker setup? This is the full log output:
Sep 14 23:47:23 iZuf63refzweg1d9dh94t9Z systemd[1]: Starting Docker Application Container Engine...
Sep 14 23:47:23 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:23+08:00" level=warning msg="The \"-g / --graph\" flag is deprecated. Please use \"--data-root\" instead"
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.054488483+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.054528530+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.054680027+08:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.054720022+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.054796786+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420166680, CONNECTING" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.055020881+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420166680, READY" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058007723+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058029632+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058172958+08:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058204756+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058254343+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420166a80, CONNECTING" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.058391699+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420166a80, READY" module=grpc
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: time="2019-09-14T23:47:24.065016392+08:00" level=error msg="'overlay2' is not supported over nfs" storage-driver=overlay2
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z dockerd[21199]: Error starting daemon: error initializing graphdriver: backing file system is unsupported for this graph driver
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z systemd[1]: Failed to start Docker Application Container Engine.
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z systemd[1]: Unit docker.service entered failed state.
Sep 14 23:47:24 iZuf63refzweg1d9dh94t9Z systemd[1]: docker.service failed.
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: docker.service holdoff time over, scheduling restart.
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: Stopped Docker Application Container Engine.
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: start request repeated too quickly for docker.service
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: Failed to start Docker Application Container Engine.
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: Unit docker.service entered failed state.
Sep 14 23:47:26 iZuf63refzweg1d9dh94t9Z systemd[1]: docker.service failed.
The cause is that my hosts (5 in total) mount a shared NFS disk at /data, so data-root was on NFS, which overlay2 does not support. Changing the path to a local disk solves the problem.
[root@iZuf63refzweg1dh94t9Z containers]# df -T /data
Filesystem                                  Type  1K-blocks       Used       Available       Use%  Mounted on
3761b-lrf18.cn-shanghai.nas.aliyuncs.com:/  nfs4  10995116277760  233244672  10994883033088  1%    /data
Change the data-root in daemon.json to a local path:
{
"data-root": "/var/lib/docker",
"storage-driver": "overlay2"
}
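Before picking a data-root, it can help to check which file system backs the candidate path. The following is a minimal Python sketch, not part of Docker: it looks up the longest matching mount point in /proc/mounts-style text. The helper names (parse_mounts, fs_type_for, is_on_network_fs), the NETWORK_FS_TYPES set, and the /dev/vda1 ext4 root entry in the sample are assumptions made for illustration; only the nfs4 entry comes from the df output above. It also does not decode escaped characters (such as \040 for spaces) in mount points.

```python
import posixpath

# File system types treated as network file systems here. The daemon log above
# names nfs; this set is an illustrative assumption, not an exhaustive list.
NETWORK_FS_TYPES = {"nfs", "nfs4"}

# Sample in /proc/mounts format. The nfs4 entry mirrors the df output above;
# the root entry (/dev/vda1, ext4) is a made-up placeholder.
SAMPLE_MOUNTS = """\
/dev/vda1 / ext4 rw,relatime 0 0
3761b-lrf18.cn-shanghai.nas.aliyuncs.com:/ /data nfs4 rw,relatime 0 0
"""


def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point containing path, or None."""
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in mounts:
        mp = posixpath.normpath(mount_point)
        inside = path == mp or path.startswith(mp.rstrip("/") + "/")
        # ">=" lets a later entry for the same mount point win, as with stacked mounts.
        if inside and len(mp) >= len(best_mount):
            best_mount, best_type = mp, fs_type
    return best_type


def is_on_network_fs(path, mounts):
    """True if path is backed by one of the types in NETWORK_FS_TYPES."""
    return fs_type_for(path, mounts) in NETWORK_FS_TYPES


mounts = parse_mounts(SAMPLE_MOUNTS)
for candidate in ("/data/docker/lib/docker", "/var/lib/docker"):
    print(candidate, fs_type_for(candidate, mounts), is_on_network_fs(candidate, mounts))
```

On a real Linux host, the text of /proc/mounts could be passed to parse_mounts instead of the sample string.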
Related
My Docker containers are getting removed intermittently after a few days.
-- Logs begin at Mon 2020-08-31 10:12:44 IST, end at Thu 2020-09-17 23:10:25 IST. --
Aug 31 11:31:02 SPK-X-0036 systemd[1]: Starting Docker Application Container Engine...
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.538275526+05:30" level=info msg="Starting up"
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.539105284+05:30" level=warning msg="[!] DON'T BIND ON ANY IP ADDRESS WITHOUT setting --tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING [!]"
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.544986324+05:30" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.545033917+05:30" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.545086917+05:30" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.545114012+05:30" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.548578610+05:30" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.548640183+05:30" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.548673867+05:30" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.548698658+05:30" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.842884585+05:30" level=info msg="[graphdriver] using prior storage driver: overlay2"
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.944238578+05:30" level=info msg="Loading containers: start."
Aug 31 11:31:02 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:02.985979893+05:30" level=warning msg="7e9847e8c2ccb1cb3690316d19b66136d8f9fd5c9c436969bc7b6303db345d30 cleanup: failed to unmount IPC: umount
Aug 31 11:31:11 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:11.081963154+05:30" level=info msg="Removing stale sandbox 69a759dcb5b230c1020e53a89ef8887e7447ce3064f930a8a914324d800cedc4 (404f29278912116b3cd04a407fe57cf16538cc623b2a2faa4328ab0cdb59fcba)"
Aug 31 11:31:11 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:11.542843230+05:30" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 6711cd7b1ce0ede11e852ca3bd0114934d14e83292364c27ee5808cffa1062c4 87ea9779402a38ffacf62ce84fdbf7cc2cd8419c4d0cae22ddc8468072b7ea6c], retrying...."
Aug 31 11:31:11 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:11.804699526+05:30" level=error msg="getEndpointFromStore for eid 2cc1662e8f61a8b7c549754ddc92c0c6d59d03905719ec32be5b63bd7f0ea881 failed while trying to build sandbox for cleanup: could not find endpoint 2cc1662e8f61a8b7c549754ddc92c0c6d59d03905719ec32be5b63bd7f0ea881: []"
Aug 31 11:31:11 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:11.804739901+05:30" level=info msg="Removing stale sandbox 6c96e15d6989d7c27fd0a304aefc7bdc9bbae90335523ffbefb0fb6e0624afd0 (527c74012cd1d0447850ea3fe13670f11dd8d294758b470e8e6b21cd7ea46edd)"
Aug 31 11:31:11 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:11.804780951+05:30" level=warning msg="Failed deleting endpoint 2cc1662e8f61a8b7c549754ddc92c0c6d59d03905719ec32be5b63bd7f0ea881: failed to get endpoint from store during Delete: could not find endpoint 2cc1662e8f61a8b7c549754ddc92c0c6d59d03905719ec32be5b63bd7f0ea881: []\n"
Aug 31 11:31:12 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:12.049373612+05:30" level=info msg="Removing stale sandbox a3ac3dd4ea15a633a27320da2cae2d39de808a96d0841a34271b657f10f51483 (38bf61d74d42fad07beef7f68e1eec34a2854a149d78c1008f454b02df1bf9c4)"
Aug 31 11:31:12 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:12.412946082+05:30" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 6711cd7b1ce0ede11e852ca3bd0114934d14e83292364c27ee5808cffa1062c4 b2f51b62ffe030ca537438ff0c547597c61428e3079a91f5c6b62e46fdbfe955], retrying...."
Aug 31 11:31:12 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:12.660895622+05:30" level=info msg="Removing stale sandbox edd9206a3ac0e5cd0adb28d04bdc15d4ffb36446b3557cbc7e3621e40459d9f8 (3167d0b093703c12e6978d052022241fb4c9088eec12247e743ad5ef9845240d)"
Aug 31 11:31:13 SPK-X-0036 dockerd[6678]: time="2020-08-31T11:31:13.009080357+05:30" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 6711cd7b1ce0ede11e852ca3bd0114934d14e83292364c27ee5808cffa1062c4 946f45722e853160f4d194eb01916c566e9dba2b57e8b48a4265147e1707385a], retrying...."
I am facing a strange issue with the docker image pull command, where the command fails with this error:
[desai@brilp0017 ~]$ docker image pull nginx:latest
latest: Pulling from library/nginx
d121f8d1c412: Extracting [==================================================>] 27.09MB/27.09MB
ebd81fc8c071: Download complete
655316c160af: Download complete
d15953c0e0f8: Download complete
2ee525c5c3cc: Download complete
failed to register layer: Error processing tar file(exit status 1): Error cleaning up after pivot: remove /.pivot_root534731447: device or resource busy
After this error, the Docker daemon is no longer accessible and all docker commands return the following error:
[desai@brilp0017 ~]$ docker info
Client:
Debug Mode: false
Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info
However, systemctl status docker shows the service as running:
[desai@brilp0017 ~]$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: active (running) since Fri 2020-09-11 14:25:53 BST; 14min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 2380 (dockerd)
Tasks: 14
Memory: 249.5M
CGroup: /system.slice/docker.service
└─2380 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
The journalctl log contains the same error line as the one produced by the pull command:
Sep 11 14:25:52 brilp0017 systemd[1]: Starting Docker Application Container Engine...
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.345006155+01:00" level=info msg="Starting up"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348597478+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348667479+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348733420+01:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348765306+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353865701+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353908904+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353944835+01:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353988191+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.497701794+01:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816295801+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816318357+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816442165+01:00" level=info msg="Loading containers: start."
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.101411528+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.125378601+01:00" level=info msg="Loading containers: done."
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.291896277+01:00" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled" storage-driver=overlay2
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.292711063+01:00" level=info msg="Docker daemon" commit=48a66213fe graphdriver(s)=overlay2 version=19.03.12-ce
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.293190069+01:00" level=info msg="Daemon has completed initialization"
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.340381428+01:00" level=info msg="API listen on /run/docker.sock"
Sep 11 14:25:53 brilp0017 systemd[1]: Started Docker Application Container Engine.
Sep 11 14:32:38 brilp0017 dockerd[2380]: time="2020-09-11T14:32:38.011501405+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Sep 11 14:33:11 brilp0017 dockerd[2380]: time="2020-09-11T14:33:11.592234770+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Sep 11 14:34:52 brilp0017 dockerd[2380]: time="2020-09-11T14:34:52.864254519+01:00" level=info msg="Attempting next endpoint for pull after error: failed to register layer: Error processing tar file(exit status 1): Error cleaning up after pivot: remove /.pivot_root534731447: device or resource busy"
After this, the error persists even after stopping and starting the docker service multiple times with systemctl. After fully restarting the laptop and starting the docker service, it works as expected until the next time the docker pull command is used.
I have searched for a solution on the internet, but most of the results point to the user not being in the docker group, which is not the case for me:
[desai@brilp0017 ~]$ groups
sys network power vboxusers wireshark sambashare docker lp wheel desai
Here is the output of docker version, captured before the crash, for version details:
[desai@brilp0017 ~]$ docker version
Client:
Version: 19.03.12-ce
API version: 1.40
Go version: go1.14.5
Git commit: 48a66213fe
Built: Sat Jul 18 01:33:21 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.12-ce
API version: 1.40 (minimum version 1.12)
Go version: go1.14.5
Git commit: 48a66213fe
Built: Sat Jul 18 01:32:59 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.4.0.m
GitCommit: 09814d48d50816305a8e6c1a4ae3e2bcc4ba725a.m
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.18.0
GitCommit: fec3683
I am using Manjaro Linux:
Operating System: Manjaro Linux
KDE Plasma Version: 5.19.4
KDE Frameworks Version: 5.73.0
Qt Version: 5.15.0
Kernel Version: 4.19.141-2-MANJARO
OS Type: 64-bit
Processors: 8 × Intel® Core™ i7-8550U CPU @ 1.80GHz
Memory: 31.2 GiB of RAM
Graphics Processor: Mesa Intel® UHD Graphics 620
Any help on this would be appreciated.
This issue was resolved by updating the kernel to version 5.4.
I have 8 containers running on a host. When I restarted the host server, 4 of the 8 containers went into a restart loop.
The following operations were performed before the server reboot:
Updated the restart policy of each container to "always".
Assigned CPU sets to each container.
Below are the logs from one sample container (the same logs can be seen for the other 3 containers):
Jul 29 19:07:28 docker-worker-3-7 dockerd: time="2020-07-29T19:07:28.950906638+05:30" level=info msg="Removing stale sandbox 950a6dfee89162c0d21911fd4875466af986ee75d49955eb87f989451c2a1cae (28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4)"
Jul 29 19:07:30 docker-worker-3-7 dockerd: time="2020-07-29T19:07:30+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=3003
Jul 29 19:07:30 docker-worker-3-7 dockerd: time="2020-07-29T19:07:30+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 29 19:07:30 docker-worker-3-7 dockerd: time="2020-07-29T19:07:30+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=3664
Jul 29 19:07:31 docker-worker-3-7 dockerd: time="2020-07-29T19:07:31+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 29 19:07:31 docker-worker-3-7 dockerd: time="2020-07-29T19:07:31+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=4075
Jul 29 19:07:31 docker-worker-3-7 dockerd: time="2020-07-29T19:07:31+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 29 19:07:32 docker-worker-3-7 dockerd: time="2020-07-29T19:07:32+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=4540
Jul 29 19:07:32 docker-worker-3-7 dockerd: time="2020-07-29T19:07:32+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 29 19:07:33 docker-worker-3-7 dockerd: time="2020-07-29T19:07:33+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=4888
Jul 29 19:07:33 docker-worker-3-7 dockerd: time="2020-07-29T19:07:33+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 29 19:07:34 docker-worker-3-7 dockerd: time="2020-07-29T19:07:34+05:30" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4/shim.sock" debug=false pid=5236
Jul 29 19:07:35 docker-worker-3-7 dockerd: time="2020-07-29T19:07:35+05:30" level=info msg="shim reaped" id=28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4
Jul 30 11:32:22 docker-worker-3-7 dockerd: time="2020-07-30T11:32:22.168356749+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:23 docker-worker-3-7 dockerd: time="2020-07-30T11:32:23.170414772+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:24 docker-worker-3-7 dockerd: time="2020-07-30T11:32:24.175846696+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:25 docker-worker-3-7 dockerd: time="2020-07-30T11:32:25.179462074+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:26 docker-worker-3-7 dockerd: time="2020-07-30T11:32:26.180273919+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:27 docker-worker-3-7 dockerd: time="2020-07-30T11:32:27.184552211+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:28 docker-worker-3-7 dockerd: time="2020-07-30T11:32:28.191554766+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:29 docker-worker-3-7 dockerd: time="2020-07-30T11:32:29.196273850+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:30 docker-worker-3-7 dockerd: time="2020-07-30T11:32:30.202965670+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:31 docker-worker-3-7 dockerd: time="2020-07-30T11:32:31.205259453+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:32 docker-worker-3-7 dockerd: time="2020-07-30T11:32:32.210054931+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:33 docker-worker-3-7 dockerd: time="2020-07-30T11:32:33.215448730+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:32:34 docker-worker-3-7 dockerd: time="2020-07-30T11:32:34.220660678+05:30" level=error msg="collecting stats for 28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4: no such container"
Jul 30 11:55:51 docker-worker-3-7 dockerd: time="2020-07-30T11:55:51.246346075+05:30" level=error msg="28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4 cleanup: failed to delete container from containerd: no such container"
Jul 30 12:06:17 docker-worker-3-7 dockerd: time="2020-07-30T12:06:17.551094064+05:30" level=error msg="28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4 cleanup: failed to delete container from containerd: no such container"
Jul 30 12:21:32 docker-worker-3-7 dockerd: time="2020-07-30T12:21:32.309091576+05:30" level=error msg="28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4 cleanup: failed to delete container from containerd: no such container"
Server restarted at Wed Jul 29 19:06
To avoid an outage, we recreated the containers, but we would like to know the reason behind this container restart loop.
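One way to see from the daemon log which containers are cycling is to count "shim started" and "shim reaped" events per container ID. The Python sketch below assumes each log entry sits on a single line and follows the message format shown in the log above; the function name count_shim_cycles and the two regular expressions are illustrative assumptions, and the sample lines are copied from that log. It only summarises the symptom and does not identify the cause of the loop.

```python
import re
from collections import Counter

# Patterns for the two message shapes seen in the log above (assumed format).
STARTED = re.compile(
    r'msg="shim docker-containerd-shim started" '
    r'address="/containerd-shim/moby/([0-9a-f]{64})/'
)
REAPED = re.compile(r'msg="shim reaped" id=([0-9a-f]{64})')


def count_shim_cycles(lines):
    """Map container ID -> (started_count, reaped_count) for the given log lines."""
    started, reaped = Counter(), Counter()
    for line in lines:
        match = STARTED.search(line)
        if match:
            started[match.group(1)] += 1
            continue
        match = REAPED.search(line)
        if match:
            reaped[match.group(1)] += 1
    return {cid: (started[cid], reaped[cid]) for cid in set(started) | set(reaped)}


CID = "28cdbceb291cff58e2a7993b05c1401464b92743bc54c2a84e124e56638951c4"
PREFIX = 'Jul 29 19:07:30 docker-worker-3-7 dockerd: time="2020-07-29T19:07:30+05:30" level=info '
SAMPLE_LINES = [
    PREFIX + 'msg="shim docker-containerd-shim started" address="/containerd-shim/moby/'
    + CID + '/shim.sock" debug=false pid=3003',
    PREFIX + 'msg="shim reaped" id=' + CID,
    PREFIX + 'msg="shim docker-containerd-shim started" address="/containerd-shim/moby/'
    + CID + '/shim.sock" debug=false pid=3664',
]

for cid, (n_started, n_reaped) in count_shim_cycles(SAMPLE_LINES).items():
    print(cid[:12], "started:", n_started, "reaped:", n_reaped)
```

A container whose shim is started and reaped many times within a few seconds, as in the log above, is a candidate for closer inspection with docker logs or docker inspect.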
I'm running several Docker containers with restart=always on Ubuntu 18.04.1 LTS. The physical server reboots every morning at 2am via a cron job that executes reboot now.
So far, I haven't had any problems with that in the past 5 or 6 months running that particular setup.
But today, containers didn't start after the daily reboot. The output of docker ps was empty, all containers were in state "Exited".
Why does this happen all of a sudden? Was my setup mis-configured from the beginning, or does the recent docker-ce package upgrade play a role?
Here are logs before and after the reboot as well as the docker.service unit and version info:
root@skprov2:~# journalctl -b -1 -x -u docker
Nov 15 02:00:02 skprov2 systemd[1]: Stopping Docker Application Container Engine...
-- Subject: Unit docker.service has begun shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has begun shutting down.
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.189764841+01:00" level=info msg="Processing signal 'terminated'"
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.595098434+01:00" level=info msg="shim reaped" id=c929d444a6eb59a69a0da738ca782a9feb92ac1f80e5c4576bf85376c3d4c17a
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.601217756+01:00" level=info msg="shim reaped" id=98a8c1b99cf986e6a889474f0fc28fe3635e466b21f8a37ef3c10a1050495c78
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.604880385+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.670918937+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.732991633+01:00" level=info msg="shim reaped" id=9b3badc752786df08d00138c0222042a6bd80bb2c971f5a96b71e57105cea95c
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.748732351+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.843982385+01:00" level=info msg="shim reaped" id=ae7531405113db8b4754491a12c2ababf09fa0c8f501bfe6f1b33e3ff18b6462
Nov 15 02:00:02 skprov2 dockerd[1504]: time="2018-11-15T02:00:02.869023019+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:03 skprov2 dockerd[1504]: time="2018-11-15T02:00:03.863568729+01:00" level=info msg="shim reaped" id=b335536f5f07b1db3f32ba4452fc4aadacc02c6184cef7fc9df619ab81bbf002
Nov 15 02:00:04 skprov2 dockerd[1504]: time="2018-11-15T02:00:04.279347144+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.233635995+01:00" level=info msg="Container 77e00eabebea97357f05f564597d167acad8f2596b25d295b4366baf08ef3127 failed to exit within 10 seconds of signal 15 - using the force"
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.253563540+01:00" level=info msg="Container 7f7d2a92bcdbb240a9400942c9301f5cd77bf9d3fbde1d38f41a2bd1226f9b09 failed to exit within 10 seconds of signal 15 - using the force"
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.253563179+01:00" level=info msg="Container f6b49cc85eb7f9226ac192498b1e319d68e0de2faff6b4e3e67adabba43a093a failed to exit within 10 seconds of signal 15 - using the force"
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.654403249+01:00" level=info msg="shim reaped" id=7f7d2a92bcdbb240a9400942c9301f5cd77bf9d3fbde1d38f41a2bd1226f9b09
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.679675304+01:00" level=info msg="shim reaped" id=f6b49cc85eb7f9226ac192498b1e319d68e0de2faff6b4e3e67adabba43a093a
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.680699340+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:12 skprov2 dockerd[1504]: time="2018-11-15T02:00:12.689801078+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:13 skprov2 dockerd[1504]: time="2018-11-15T02:00:13.088891655+01:00" level=info msg="shim reaped" id=77e00eabebea97357f05f564597d167acad8f2596b25d295b4366baf08ef3127
Nov 15 02:00:13 skprov2 dockerd[1504]: time="2018-11-15T02:00:13.111193244+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Nov 15 02:00:15 skprov2 dockerd[1504]: time="2018-11-15T02:00:15.233286510+01:00" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
Nov 15 02:00:15 skprov2 dockerd[1504]: time="2018-11-15T02:00:15.233684167+01:00" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
Nov 15 02:00:15 skprov2 dockerd[1504]: time="2018-11-15T02:00:15.233695697+01:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Nov 15 02:00:15 skprov2 dockerd[1504]: time="2018-11-15T02:00:15.234287398+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4209ec610, TRANSIENT_FAILURE" module=grpc
Nov 15 02:00:15 skprov2 dockerd[1504]: time="2018-11-15T02:00:15.234328545+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4209ec610, CONNECTING" module=grpc
Nov 15 02:00:16 skprov2 systemd[1]: Stopped Docker Application Container Engine.
-- Subject: Unit docker.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has finished shutting down.
==================================================================================
==================================================================================
==================================================================================
root@skprov2:~# journalctl -b 0 -x -u docker
-- Logs begin at Thu 2018-07-05 13:16:23 CEST, end at Thu 2018-11-15 08:16:31 CET. --
Nov 15 02:04:00 skprov2 systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has begun starting up.
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.152961544+01:00" level=info msg="systemd-resolved is running, so using resolvconf: /run/systemd/resolve/resolv.conf"
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.432271212+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.432315437+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.439772198+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.439800208+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.471564855+01:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.471618580+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.471653422+01:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.475146270+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.475678777+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420abc010, CONNECTING" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.475795536+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42080b0b0, CONNECTING" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.475769194+01:00" level=info msg="blockingPicker: the picked transport is not ready, loop back to repick" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.476273893+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42080b0b0, READY" module=grpc
Nov 15 02:04:12 skprov2 dockerd[1690]: time="2018-11-15T02:04:12.476346309+01:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420abc010, READY" module=grpc
Nov 15 02:04:14 skprov2 dockerd[1690]: time="2018-11-15T02:04:14.769703354+01:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Nov 15 02:04:23 skprov2 dockerd[1690]: time="2018-11-15T02:04:23.247573731+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Nov 15 02:04:23 skprov2 dockerd[1690]: time="2018-11-15T02:04:23.247926863+01:00" level=warning msg="Your kernel does not support swap memory limit"
Nov 15 02:04:23 skprov2 dockerd[1690]: time="2018-11-15T02:04:23.247998928+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Nov 15 02:04:23 skprov2 dockerd[1690]: time="2018-11-15T02:04:23.248016977+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Nov 15 02:04:23 skprov2 dockerd[1690]: time="2018-11-15T02:04:23.254944197+01:00" level=info msg="Loading containers: start."
Nov 15 02:04:25 skprov2 dockerd[1690]: time="2018-11-15T02:04:25.856323528+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Nov 15 02:04:35 skprov2 dockerd[1690]: time="2018-11-15T02:04:35.182112549+01:00" level=error msg="Failed to start container c929d444a6eb59a69a0da738ca782a9feb92ac1f80e5c4576bf85376c3d4c17a: id already in use"
Nov 15 02:04:35 skprov2 dockerd[1690]: time="2018-11-15T02:04:35.206030890+01:00" level=error msg="Failed to start container b335536f5f07b1db3f32ba4452fc4aadacc02c6184cef7fc9df619ab81bbf002: id already in use"
Nov 15 02:04:35 skprov2 dockerd[1690]: time="2018-11-15T02:04:35.235647072+01:00" level=error msg="Failed to start container ae7531405113db8b4754491a12c2ababf09fa0c8f501bfe6f1b33e3ff18b6462: id already in use"
Nov 15 02:04:35 skprov2 dockerd[1690]: time="2018-11-15T02:04:35.374241415+01:00" level=error msg="Failed to start container 9b3badc752786df08d00138c0222042a6bd80bb2c971f5a96b71e57105cea95c: id already in use"
Nov 15 02:04:35 skprov2 dockerd[1690]: time="2018-11-15T02:04:35.410173049+01:00" level=error msg="Failed to start container 7f7d2a92bcdbb240a9400942c9301f5cd77bf9d3fbde1d38f41a2bd1226f9b09: id already in use"
Nov 15 02:04:36 skprov2 dockerd[1690]: time="2018-11-15T02:04:36.171600568+01:00" level=error msg="Failed to start container 98a8c1b99cf986e6a889474f0fc28fe3635e466b21f8a37ef3c10a1050495c78: id already in use"
Nov 15 02:04:36 skprov2 dockerd[1690]: time="2018-11-15T02:04:36.970077586+01:00" level=error msg="Failed to start container f6b49cc85eb7f9226ac192498b1e319d68e0de2faff6b4e3e67adabba43a093a: id already in use"
Nov 15 02:04:36 skprov2 dockerd[1690]: time="2018-11-15T02:04:36.993993749+01:00" level=error msg="Failed to start container 77e00eabebea97357f05f564597d167acad8f2596b25d295b4366baf08ef3127: id already in use"
Nov 15 02:04:36 skprov2 dockerd[1690]: time="2018-11-15T02:04:36.994202774+01:00" level=info msg="Loading containers: done."
Nov 15 02:04:37 skprov2 dockerd[1690]: time="2018-11-15T02:04:37.492457742+01:00" level=info msg="Docker daemon" commit=4d60db4 graphdriver(s)=overlay2 version=18.09.0
Nov 15 02:04:37 skprov2 dockerd[1690]: time="2018-11-15T02:04:37.494916840+01:00" level=info msg="Daemon has completed initialization"
Nov 15 02:04:37 skprov2 dockerd[1690]: time="2018-11-15T02:04:37.669139526+01:00" level=info msg="API listen on /var/run/docker.sock"
Nov 15 02:04:37 skprov2 systemd[1]: Started Docker Application Container Engine.
-- Subject: Unit docker.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has finished starting up.
--
-- The start-up result is RESULT.
==================================================================================
==================================================================================
==================================================================================
root@skprov2:~# cat /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H unix://
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
==================================================================================
==================================================================================
==================================================================================
root@skprov2:~# docker info && docker version
Containers: 14
Running: 5
Paused: 0
Stopped: 9
Images: 61
Server Version: 18.09.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-39-generic
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 31.39GiB
Name: skprov2
ID: EDC2:AGFH:BHKP:P4HS:M5DA:ZPXM:AU6B:TV6E:6KIU:YC4S:F3NN:35A4
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
localhost:5000
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
WARNING: No swap limit support
Client:
Version: 18.09.0
API version: 1.39
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:49:01 2018
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.0
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:16:44 2018
OS/Arch: linux/amd64
Experimental: false
I stumbled on the same issue.
In my case the issue was due to Docker not cleaning up after itself properly.
As seen in Docker's log:
time="2018-12-31T17:38:54.330555181+02:00" level=error msg="2089c8095e62011b0dc05e66c51ae59d648d909ca7a8e806af0fdf39b2e3006c cleanup: failed to delete container from containerd: transport is closing: unknown"
These IDs are reused on startup because of restart=always.
So Docker says:
time="2018-12-31T17:40:04.648261275+02:00" level=error msg="Failed to start container 2089c8095e62011b0dc05e66c51ae59d648d909ca7a8e806af0fdf39b2e3006c: id already in use"
It seems the Docker daemon shuts down faster than the containers are cleaned up (probably because the containers ignore signals or something similar).
The solution for me was to change the daemon's shutdown-timeout in the Docker daemon configuration file. The default is 15 seconds; I changed it to 60 seconds and now I don't experience these issues anymore.
I still think this is a legitimate scenario that should work out-of-the-box though.
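As a rough sketch of the shutdown-timeout change described above: the key takes a number of seconds in the daemon configuration file (commonly /etc/docker/daemon.json). The snippet below shows one way to merge it into an existing configuration without dropping other keys. The helper name with_shutdown_timeout and the sample input are made up for illustration.

```python
import json


def with_shutdown_timeout(config_text, seconds=60):
    """Return daemon.json text with "shutdown-timeout" set to `seconds`.

    Hypothetical helper: existing keys are kept, and an empty input is
    treated as an empty configuration.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    config["shutdown-timeout"] = seconds
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Sample input only; in practice, read the real daemon configuration file.
    print(with_shutdown_timeout('{"storage-driver": "overlay2"}'))
```

The daemon has to be restarted after the file is changed for the new value to take effect.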
The containerd "id already in use" errors appear to have been fixed in the latest release of Docker CE. Try upgrading to 18.09.3 to see if that corrects your issue.
https://github.com/docker/docker-ce/releases/tag/v18.09.3
The following is the error I get when trying to start the Docker daemon service:
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─50-docker-service.conf
Active: failed (Result: exit-code) since Tue 2017-08-22 02:09:40 UTC; 15min ago
Docs: http://docs.docker.com
Main PID: 3571 (code=exited, status=1/FAILURE)
CPU: 292ms
Aug 22 02:09:40 systemd[1]: docker.service: Unit entered failed state.
Aug 22 02:09:40 systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 22 02:09:40 systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Aug 22 02:09:40 systemd[1]: Stopped Docker Application Container Engine.
Aug 22 02:09:40 systemd[1]: docker.service: Start request repeated too quickly.
Aug 22 02:09:40 systemd[1]: Failed to start Docker Application Container Engine.
Aug 22 02:09:40 systemd[1]: docker.service: Unit entered failed state.
Aug 22 02:09:40 systemd[1]: docker.service: Failed with result 'exit-code'.
Below is the config file I have:
50-docker-service.conf
[Service]
Environment="DOCKER_OPTS=--bip=A.B.C.D"
What could be the cause?
$ ls -ltr /etc/systemd/system/docker.service.d
total 16
-rw-r--r--. 1 root root 125 Aug 22 02:09 50-docker-service.conf
journalctl logs:
Jul 14 13:55:52 systemd[1]: Starting Docker Application Container Engine...
Jul 14 13:55:52 dockerd[1274]: time="2017-07-14T13:55:52.925276313Z" level=info msg="[graphdriver] using prior storage driver \"overlay\""
Jul 14 13:55:53 dockerd[1274]: time="2017-07-14T13:55:53.378204522Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Jul 14 13:55:53 dockerd[1274]: time="2017-07-14T13:55:53.379367854Z" level=info msg="Loading containers: start."
Jul 14 13:55:53 dockerd[1274]: ..time="2017-07-14T13:55:53.507972850Z" level=info msg="Firewalld running: false"
Jul 14 13:55:54 dockerd[1274]: time="2017-07-14T13:55:54.013379242Z" level=info msg="Loading containers: done."
Jul 14 13:55:54 dockerd[1274]: time="2017-07-14T13:55:54.021206395Z" level=info msg="Daemon has completed initialization"
Jul 14 13:55:54 dockerd[1274]: time="2017-07-14T13:55:54.021283711Z" level=info msg="Docker daemon" commit=a82d35e graphdriver=overlay version=1.12.6
Jul 14 13:55:54 systemd[1]: Started Docker Application Container Engine.
Jul 14 13:55:54 dockerd[1274]: time="2017-07-14T13:55:54.039479153Z" level=info msg="API listen on /var/run/docker.sock"
Jul 14 13:56:03 dockerd[1274]: time="2017-07-14T13:56:03.565234227Z" level=error msg="Handler for POST /v1.24/containers/7019b26d0cb3/start returned error: Container already started"
Jul 14 13:56:09 dockerd[1274]: time="2017-07-14T13:56:09.660967581Z" level=error msg="Handler for POST /v1.24/containers/7019b26d0cb3/start returned error: Container already started"
Jul 14 13:56:14 dockerd[1274]: time="2017-07-14T13:56:14.741806551Z" level=error msg="Handler for POST /v1.24/containers/d9a3cb2b66e0/start returned error: Container already started"
Jul 14 21:05:16 dockerd[1274]: time="2017-07-14T21:05:16.992138499Z" level=info msg="Container 7019b26d0cb31412f40f8ab7f971f26896debcce09a58c39679dbaf62f6caa0b failed to exit within 0 s
Jul 14 21:07:20 dockerd[1274]: time="2017-07-14T21:07:20.897682536Z" level=info msg="Container d9a3cb2b66e0395086decd444f7ef52775f76f64b8d4dc291ee66cca48e53535 failed to exit within 2 s
Jul 14 21:08:08 systemd[1]: Stopping Docker Application Container Engine...
Jul 14 21:08:08 dockerd[1274]: time="2017-07-14T21:08:08.243279424Z" level=info msg="Processing signal 'terminated'"
Jul 14 21:08:18 dockerd[1274]: time="2017-07-14T21:08:18.244896088Z" level=info msg="Container 9966c97ca301ef593a68ab8e50730552dbe945c42e6b530d4c6339d3ffa8f544 failed to exit within 10
Jul 14 21:08:18 dockerd[1274]: time="2017-07-14T21:08:18.244909995Z" level=info msg="Container d94a3d0c45471753da85fce062e3c56c79c14ad40b6870281515333b98d0807e failed to exit within 10
Jul 14 21:08:18 dockerd[1274]: time="2017-07-14T21:08:18.244936931Z" level=info msg="Container b49d40479583c121ae7abe14489c7121a25a56da00f3d25961f981c918d3257e failed to exit within 10
Jul 14 21:08:18 systemd[1]: Stopped Docker Application Container Engine.
The Docker bridge IP was not configured properly:
Environment="DOCKER_OPTS=--bip=A.B.C.D"
Instead, I configured bip in full CIDR notation (address plus prefix length):
Environment="DOCKER_OPTS=--bip=A.B.C.D/size"
That solved my problem.
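To catch this kind of mistake before restarting the daemon, the value can be checked for CIDR notation first. This is only a sketch: the function name is_cidr_bip is hypothetical, it relies on Python's ipaddress module, and dockerd's own validation may differ in details.

```python
import ipaddress


def is_cidr_bip(value):
    """Return True if `value` has the form ADDRESS/PREFIX, as used for --bip."""
    if "/" not in value:
        # A bare address would parse as a /32 interface below, so reject it here.
        return False
    try:
        ipaddress.ip_interface(value)
    except ValueError:
        # Raised for malformed addresses and out-of-range prefix lengths.
        return False
    return True


if __name__ == "__main__":
    # Example values only; substitute the address you intend to use.
    for candidate in ("192.168.5.1", "192.168.5.1/24"):
        print(candidate, is_cidr_bip(candidate))
```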
Check that the line
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
has not been modified in /lib/systemd/system/docker.service
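One way to do that check is to pull the ExecStart= lines out of the unit file and compare them with the expected command. The sketch below assumes the expected line quoted above and the unit path mentioned above; the function names are hypothetical.

```python
from pathlib import Path

# Expected command, taken from the line quoted above.
EXPECTED = "/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock"


def exec_start_lines(unit_text):
    """Return the values of all non-comment ExecStart= lines in a unit file."""
    values = []
    for line in unit_text.splitlines():
        stripped = line.strip()
        # Commented-out lines start with '#' or ';' and are skipped by this test.
        if stripped.startswith("ExecStart="):
            values.append(stripped[len("ExecStart="):].strip())
    return values


def exec_start_unmodified(unit_text, expected=EXPECTED):
    """Return True if the unit has exactly one ExecStart= and it matches."""
    return exec_start_lines(unit_text) == [expected]


if __name__ == "__main__":
    unit = Path("/lib/systemd/system/docker.service")
    if unit.exists():
        print(exec_start_unmodified(unit.read_text()))
```

Note that this only reads the main unit file. Drop-ins under /etc/systemd/system/docker.service.d can also override ExecStart; `systemctl cat docker.service` prints the unit together with its drop-ins.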