I have researched this issue extensively to no avail, and also asked on unix.stackexchange.com also to no avail, so I'm asking here in hopes someone else has some insight into why this is occurring, as asking on both the unix board as well as github has shed no insight whatsoever.
I cannot get Docker to play nice on Antergos, or be reachable without sudo. Running container builds with sudo causes a number of issues, such as ssh keys not being detected and nginx not being recognized. This problem arose about 3 days ago, and rolling back has not made any difference. Uninstalling docker completely and reinstalling also did not make any difference. Neither has updating my configuration, permissions, or any other available setting.
System version: 4.17.8-1-ARCH #1 SMP PREEMPT Wed Jul 18 09:56:24 UTC 2018 x86_64 GNU/Linux
Current docker version: 18.04.0-ce (also tried on all versions up to current 18.05 to no avail, have rolled back one version at a time with no effect).
Existing research led to the typical issue being that the user needs to be in the docker group to circumvent sudo, however I am, and it is still not working. I have also checked here, here, and here, and all of them offer the same (not working) answer.
Please do not suggest checking my user group or adding my user to the docker group, as this is not the issue, as outlined below.
Everything worked fine until a couple of days ago. I am inclined to believe an automatic update broke it.
Below is some context:
Output of groups
root http docker users wheel
When calling any docker command without sudo (eg docker info, docker ps, docker run ... docker-compose up, etc), I get the following:
Cannot connect to the Docker daemon at tcp://localhost:2375. Is the docker daemon running?
It is definitely running. systemctl status docker yeilds the following:
● docker.service - Docker Application Container Engine
Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-07-20 14:52:54 EDT; 21min ago
Docs: https://docs.docker.com
Main PID: 472 (dockerd)
Tasks: 50 (limit: 4915)
Memory: 139.0M
CGroup: /system.slice/docker.service
├─ 472 /usr/bin/dockerd -H fd://
├─ 620 docker-containerd --config /var/run/docker/containerd/containerd.toml
├─ 802 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/e0942c95c35608cecbbe761d27a2c5386d9faec072cf8031>
├─ 818 bash -c echo "RESTARTING GUlP COMMAND" && npm rebuild node-sass && npm upgrade && npm update && npm install && gulp && tail -f /dev/null
└─1572 tail -f /dev/null
It is likewise displayed when running htop and ps aux | grep docker.
perms for ls -la $(which docker):
-rwxr-xr-x 1 root docker 36823912 Apr 17 18:48 /usr/bin/docker
According to this, it should absolutely be accessible without sudo, but still chokes on all commands without sudo. I cannot just run it with sudo due to a number of production build scripts that require user space locality failing, which break when sudo is applied.
output of sudo docker info
Containers: 15
Running: 1
Paused: 0
Stopped: 14
Images: 30
Server Version: 18.04.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk
syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.17.8-1-ARCH
Operating System: Antergos Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.02GiB
Name: Indibog
ID: OCC4:P3QN:B5EU:J2Y4:LZN4:WAIC:2F5V:ZQZD:NLXY:DWVE:X2LB:TLEQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 27
Goroutines: 39
System Time: 2018-07-20T15:04:01.745176194-04:00
EventsListeners: 0
Username: mopsyd
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
192.168.40.60:5000
sandbox.cdp.local:5000
127.0.0.0/8
Live Restore Enabled: false
Related
I've set up Docker in rootless mode under Ubuntu 20.04 and Debian 11 (in my case, using Ansible and this role). I want to deploy a simple Docker stack to the node via Docker Swarm. No other hosts are involved, just one Swarm node from the same machine, acting as a manager.
I can run this project with Docker and Docker Compose just fine, also in rootless mode. All that changes for the rootless setup is that DOCKER_HOST is overwritten in .bashrc:
export XDG_RUNTIME_DIR="/run/user/1000"
export DOCKER_HOST="unix:///run/user/1000/docker.sock"
When I deploy the stack though, none of the services can start (here is an excerpt of the status):
$ docker stack deploy -c docker-stack.yml demo-stack
$ docker stack ps demo-stack --no-trunc
jig6zyewkem2g225509x91nt5 demo-stack_db.1 registry.example.com/db:v1.20.2 bullseye Shutdown Rejected 15 seconds ago "mkdir /var/lib/docker: permission denied"
ox6x5w7du9o5ew2v70g5mfg9e demo-stack_redis.1 registry.example.com/redis:v1.20.2 bullseye Shutdown Rejected 15 seconds ago "mkdir /var/lib/docker: permission denied"
ipme447wrrsjc8jw6cpfak4hq demo-stack_web.1 registry.example.com/web:v1.20.2 bullseye Shutdown Rejected 14 seconds ago "mkdir /var/lib/docker: permission denied"
The services all error with mkdir /var/lib/docker: permission denied. I suppose that it tries to start them as if the system was using rootful Docker, but it's a rootless installation.
I guess the question is: how do I get the Swarm node (which is the very same machine) to use the correct Docker rootless configuration for launching the services? That would include using the correct DOCKER_HOST configuration.
I am unsure if this is even supposed to work. I hear that overlay networks are not supported, but I am only on one machine, so I don't really need this. I do need Swarm for its usable implementation of secrets (compared to the mock implementation from Docker Compose).
Note that I have the same setup with Docker running in (normal) rootful mode, and there, all services can be started. It's therefore not an issue with the Docker stack file itself.
More details with docker info:
Client:
Context: default
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 12
Server Version: 20.10.13
Storage Driver: fuse-overlayfs
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: hpzsmez48acse9yo1frnx37fo
Is Manager: true
ClusterID: zkv7wsoun193kyvbxe1k3hdph
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 127.0.0.1
Manager Addresses:
127.0.0.1:2377
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
runc version: v1.0.3-0-gf46b6ba2
init version: de40ad0
Security Options:
seccomp
Profile: default
rootless
cgroupns
Kernel Version: 5.10.0-13-amd64
Operating System: Debian GNU/Linux 11 (bullseye)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.936GiB
Name: bullseye
ID: 3R5P:2UV6:FIP4:UIJV:TDNQ:35DT:DEDI:SMGN:FDUY:JSWO:FRU6:O2HF
Docker Root Dir: /home/vagrant/.local/share/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support
WARNING: No cpuset support
WARNING: No io.weight support
WARNING: No io.weight (per device) support
WARNING: No io.max (rbps) support
WARNING: No io.max (wbps) support
WARNING: No io.max (riops) support
WARNING: No io.max (wiops) support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
The solution is simple: Docker Rootless does not work with Docker Swarm. You can have either, but not both.
In the last couple of days I'm having some issues at building or running docker containers.
It seems that root doesn't have permission of having access to the filesystem.
Eg. I've created this very simple Dockerfile
FROM centos
RUN id && ls -l /usr/bin/yum /usr/bin/dnf-3 && yum install mlocate
and when I try to build the image I get the error
Step 1/2 : FROM centos
---> 470671670cac
Step 2/2 : RUN id && ls -l /usr/bin/yum /usr/bin/dnf-3 && yum install mlocate
---> Running in f7b32a009a74
uid=0(root) gid=0(root) groups=0(root)
-rwxr-xr-x 1 root root 1954 Dec 19 15:43 /usr/bin/dnf-3
lrwxrwxrwx 1 root root 5 Dec 19 15:43 /usr/bin/yum -> dnf-3
/usr/libexec/platform-python: can't open file '/usr/bin/yum': [Errno 13] Permission denied
The command '/bin/sh -c id && ls -l /usr/bin/yum /usr/bin/dnf-3 && yum install mlocate' returned a non-zero code: 2
The issue seems to be more generic as even with ubuntu or alpine I get similar errors, so I suspect is related to Ubuntu.
Consider that before I could perform any task without problems.
I've tried adding capabilities and stopping apparmor but it doesn't have any effect.
Docker info
Client:
Debug Mode: false
Server:
Containers: 18
Running: 0
Paused: 0
Stopped: 18
Images: 20
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version:
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-31-generic
Operating System: Ubuntu Core 16
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.475GiB
Name: gurdulu-xps
ID: E5JA:3WKI:JWFQ:M5J2:CAZ7:VVKI:2ADB:3W7W:F3F4:VYXZ:7JLP:R7C4
Docker Root Dir: /var/snap/docker/common/var-lib-docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
It was apparmor in combination with snap. The profile coming with the snap installation had in some way become invalid in the last couple of days.
To be honest I didn't investigate and tried removing the snap and installing with apt.
Now it works fine.
I was facing issues installing docker on cloud server according to the official guide(Install Docker Engine on Ubuntu). I finished old version's uninstallation, the repository setting up and docker engine installation (sudo apt-get install docker-ce docker-ce-cli containerd.io). However, I got an error when running hello-world.
wyf#VM1103-Timi:~$ sudo docker run hello-world
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"proc\\\" to rootfs \\\"/var/lib/docker/overlay2/e9fedf64e8983aa01e513cee591cdfd7fc60962466a476b51fc1ead682ec8022/merged\\\" at \\\"/proc\\\" caused \\\"permission denied\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled
I tried restart docker and server, but the problem still exists.
So, it would be great if someone can guide me in fixing this error.
Please let me know if you have any idea about this issue.
Thank you very much!
Ps:
My system is Ubuntu 18.04. Thus, I did not have selinux. Instead of selinux, I checked AppArmor log.
May 19 21:14:55 VM1103-Timi networkd-dispatcher[155]: WARNING:Unknown index 37 seen, reloading interface list
May 19 21:14:55 VM1103-Timi systemd-networkd[126]: veth71cf495: Link UP
May 19 21:14:55 VM1103-Timi containerd[170]: time="2020-05-19T21:14:55.679793295+08:00" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/4c207ce1273d2c863ee419c5ebb271163a031394bd4c17ee75d44267d631954d/shim.sock" debug=false pid=106265
May 19 21:14:55 VM1103-Timi containerd[170]: time="2020-05-19T21:14:55.767796543+08:00" level=info msg="shim reaped" id=4c207ce1273d2c863ee419c5ebb271163a031394bd4c17ee75d44267d631954d
May 19 21:14:55 VM1103-Timi dockerd[15100]: time="2020-05-19T21:14:55.776863367+08:00" level=error msg="stream copy error: reading from a closed fifo"
May 19 21:14:55 VM1103-Timi dockerd[15100]: time="2020-05-19T21:14:55.776953910+08:00" level=error msg="stream copy error: reading from a closed fifo"
May 19 21:14:55 VM1103-Timi systemd-networkd[126]: veth71cf495: Link DOWN
May 19 21:14:55 VM1103-Timi dockerd[15100]: time="2020-05-19T21:14:55.927805156+08:00" level=error msg="4c207ce1273d2c863ee419c5ebb271163a031394bd4c17ee75d44267d631954d cleanup: failed to delete container from containerd: no such container"
The strange thing is that there is no record of permission-denied error.
Here are my ubuntu version, kernal version and docker info:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
5.3.18-3-pve
Client:
Debug Mode: false
Server:
Containers: 8
Running: 0
Paused: 0
Stopped: 8
Images: 1
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.3.18-3-pve
Operating System: Ubuntu 18.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 4GiB
Name: VM1103-Timi
ID: 3G3F:LTVZ:NO25:C7LA:XKQV:ETMB:B6QU:3ZFJ:KBA5:R3KK:QZEA:ZONC
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
It seemed that the AppArmor Profile "docker-default" was lost. "docker-default" was not correctly generated. Check as follows:
root#VM1103-Timi:/etc/apparmor.d# aa-status
apparmor module is loaded.
12 profiles are loaded.
12 profiles are in enforce mode.
/sbin/dhclient
/usr/bin/man
/usr/lib/NetworkManager/nm-dhcp-client.action
/usr/lib/NetworkManager/nm-dhcp-helper
/usr/lib/connman/scripts/dhclient-script
/usr/lib/lightdm/lightdm-guest-session
/usr/lib/lightdm/lightdm-guest-session//chromium
/usr/sbin/mysqld
/usr/sbin/tcpdump
docker-default
man_filter
man_groff
0 profiles are in complain mode.
1 processes have profiles defined.
1 processes are in enforce mode.
/usr/sbin/mysqld (258)
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
Solution is probably to open ports needed. Your system might be running selinux and (ufw or firewalld or iptables) ?and/or others?. Read up a bit on linux firewall tools, in particular the ones running on your system.
For the selinux case, you need to check selinux logs, is it blocking access? Add exceptions using selinux commands.
https://wiki.centos.org/HowTos/SELinux These tools are well worth learning but can be complicated. A quick test disabling selinux and firewalld can confirm that this is the source of problem and you can enable selinux and firewalld later and allow/open ports in a secure way.
Simple test: disable selinux and firewalld, e.g. on CentOS
systemctl stop firewalld;
setenforcing 0;
If you can create containers with selinux disabled then you have confirmed selinux is your problem. You can enable firewall and selinux and then add exceptions and open ports as needed later.
This looks good (specific to ubuntu but general enough IMHO), It details ufw commands, firewalld commands and iptables commands needed for opening ports to allow docker swarm to work) https://www.digitalocean.com/community/tutorials/how-to-configure-the-linux-firewall-for-docker-swarm-on-ubuntu-16-04
I originally got useful info on ufw commands to open ports needed from here:
Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
ufw allow 2376/tcp
ufw allow 2377/tcp
ufw allow 7946/tcp
ufw allow 7946/udp
ufw allow 4789/udp
ufw enable #maybe
ufw reload
systemctl restart docker
This is a common enough problem where something usually selinux is not allowing access to ports needed.
e.g.
https://github.com/google/cadvisor/issues/333
After running docker container,docker run -d --name nginx nginx, I cannot use "docker exec", docker exec nginx echo 123, on this container.
I'm receiving an error:
ERRO[2018-08-19T11:09:10.909894729+03:00] stream copy error: reading from a closed fifo
ERRO[2018-08-19T11:09:10.909988081+03:00] stream copy error: reading from a closed fifo
ERRO[2018-08-19T11:09:10.931102317+03:00] Error running exec 19c6ae3c5d796180e02577f037f6a1bd1453b70393098643719dea3537933ae2 in container: OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused "process_linux.go:86: executing setns process caused \"exit status 22\"": unknown`
OS: ubuntu 14.04
Kernel: 3.13.0-153-generic
Docker: Docker version 18.06.0-ce, build 0ffa825
Docker Info:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 18.06.0-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/165536.165536/aufs
Backing Filesystem: extfs
Dirs: 5
Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
apparmor
userns
Kernel Version: 3.13.0-153-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.86GiB
Name: **************
ID: OL25:ISXX:RWR7:EY76:OQ6O:XLWG:ETWJ:FV2A:MC6A:ROP7:6DWD:DJX4
Docker Root Dir: /var/lib/docker/165536.165536
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Thanks!
That can happen when them use ENTRYPOINT instead of CMD. Check your image/container with "docker inspect". Your commandline argument becomes a CMD of ENTRYPOINT.
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact
I could reproduce this issue whenever I executed docker run -it opensuse/leap followed by exit command. The container is actually stopped after exit command, but still showed running in docker ps.
Solution: Restart your docker daemon. And then try running your containers once again. If they stop, they won't show running status.
command: service docker restart
This worked in my case.
Please update your Kernel. Although Docker should work with most Kernel 3.10+ versions, there are often low level issues with older Kernels. See also https://github.com/moby/moby/issues/36084#issuecomment-364886573 for a seemingly same issue with a working solution:
updated to HWE ( 4.13.0-32-generic) and exec works again, however keep in mind that stock 16.04 uses 4.4.0 kernels - there should some kind of warning (at least) that specific versions combination will not work
I'm getting this error when pulling some docker images (but not all):
failed to register layer: Error processing tar file(exit status 1): operation not permitted
For example: docker pull nginx works, but not docker pull redis.
I get the same result wether i run the command with a user that is part of the docker group, using sudo or as root.
If i run dockerd in debug mode i see this in the logs:
DEBU[0025] Downloaded 5233d9aed181 to tempfile /var/lib/docker/tmp/GetImageBlob023191751
DEBU[0025] Applying tar in /var/lib/docker/overlay2/e5290b8c50d601918458c912d937a4f6d4801ecaa90afb3b729a5dc0fc405afc/diff
DEBU[0027] Applied tar sha256:16ada34affd41b053ca08a51a3ca92a1a63379c1b04e5bbe59ef27c9af98e5c6 to e5290b8c50d601918458c912d937a4f6d4801ecaa90afb3b729a5dc0fc405afc, size: 79185732
(...)
DEBU[0029] Applying tar in /var/lib/docker/overlay2/c5c0cfb9907a591dc57b1b7ba0e99ae48d0d7309d96d80861d499504af94b21d/diff
DEBU[0029] Cleaning up layer c5c0cfb9907a591dc57b1b7ba0e99ae48d0d7309d96d80861d499504af94b21d: Error processing tar file(exit status 1): operation not permitted
INFO[0029] Attempting next endpoint for pull after error: failed to register layer: Error processing tar file(exit status 1): operation not permitted
INFO[0029] Layer sha256:938f1cd4eae26ed4fc51c37fa2f7b358418b6bd59c906119e0816ff74a934052 cleaned up
(...)
If i run watch -n 0 "sudo ls -lt /var/lib/docker/overlay2/" while the image is pulling, i can see new folders appearing (and disappearing after it fails) and the permissions on /var/lib/docker/overlay2/ are root:root:700 so i don't think it's exactly a permission issue.
Here are some detail about the environment:
I have a proxmox running the LXC container where i'm having the issue.
The container itself is running Debian 8.
And here are the various versions:
$> uname -a
Linux [redacted-hostname] 4.10.15-1-pve #1 SMP PVE 4.10.15-15 (Fri, 23 Jun 2017 08:57:55 +0200) x86_64 GNU/Linux
$> docker version
Client:
Version: 17.06.0-ce
API version: 1.30
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:20:04 2017
OS/Arch: linux/amd64
Server:
Version: 17.06.0-ce
API version: 1.30 (minimum version 1.12)
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:18:59 2017
OS/Arch: linux/amd64
Experimental: false
$>docker info
Containers: 20
Running: 0
Paused: 0
Stopped: 20
Images: 28
Server Version: 17.06.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Kernel Version: 4.10.15-1-pve
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.906GiB
Name: resumed-dev
ID: EBJ6:AFVS:L3RC:ZEE7:A6ZJ:WDQE:GTIZ:RXHA:P4AQ:QJD7:H6GG:YIQB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 16
Goroutines: 24
System Time: 2017-08-17T14:17:07.800849127+02:00
EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
EDIT: This will be fixed by any release after December 18, 2017 of Moby via this merge. Will update again when fully incorporated into Docker.
If your container is unprivileged, this appears to be an issue with the overlay2 storage driver for Docker. This does not appear to be an issue with overlay (GitHub issue). So either utilize the overlay storage driver instead of overlay2, or make your container privileged.
I have almost the same environment as you, and met the same problem.
Some image works perfectly (alpine), while some images fails at cleaning up (ubuntu).
strace -f dockerd -D then docker pull or docker load gives the reason:
mknodat(AT_FDCWD, "/dev/agpgart", S_IFCHR|0660, makedev(10, 175)) = -1 EPERM (Operation not permitted)
Unprivileged container prohibit mknod by design. If you insist nesting Docker inside lxc, you will have to choose privileged container. (And notice that existing unprivileged container cannot be converted to privileged container directly due to uid/gid mapping)