Docker is unable to delete a file when building images - docker

My Dockerfile contains the following instruction:
rm -f plugins.7z
This command worked as expected in earlier versions of docker but fails with version 1.13. I see the error:
cannot access plugins.7z: No such file or directory
If I bring up a container with the base image and execute the command manually, I see the same error.
Trying to list the folder contents displays:
# ls -lrt
ls: cannot access plugins.7z: No such file or directory
total 12
??????????? ? ? ? ? ? plugins.7z
This is not listed as a known issue in Docker Issues. How do I debug the issue further?
Edit:
For reasons of IP, I cannot post the full Dockerfile here. Also, it may not be necessary. As I mentioned, I am able to simulate the issue even by manually running the container and trying to execute the command
The file exists before I attempt to delete it
I was wrong about there not being a similar bug in the issues list. Here is one
The issue may not be to do with that file. Deleting other files/folders in the folder also makes them appear with ??? permissions
The user performing the operation is root

The reason removing files and directories fails is that the backing (xfs) filesystem was not formatted with d_type support ("ftype=1"); you can find a discussion on GitHub: https://github.com/docker/docker/issues/27358.
To verify whether d_type support is available on your system, check the output of docker info:
Server Version: 1.13.1
Storage Driver: overlay
Backing Filesystem: xfs
Supports d_type: false
Logging Driver: json-file
This requirement is also described in the release notes for RHEL/CentOS
Note that XFS file systems must be created with the -n ftype=1 option enabled for use as an overlay. With the rootfs and any file systems created during system installation, set the --mkfsoptions=-n ftype=1 parameters in the Anaconda kickstart. When creating a new file system after the installation, run the # mkfs -t xfs -n ftype=1 /PATH/TO/DEVICE command. To determine whether an existing file system is eligible for use as an overlay, run the # xfs_info /PATH/TO/DEVICE | grep ftype command to see if the ftype=1 option is enabled.
To resolve the issue, either:
re-format the device with ftype=1
use a different storage driver. Note that the default device mapper configuration (which uses loopback devices) is not recommended for production use and requires manual configuration.
For backward compatibility (older versions of docker allowed running overlay on systems without d_type), docker 1.13 only logs a warning in the daemon logs (https://github.com/docker/docker/pull/27433), but this configuration will no longer be supported in a future version.
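To check quickly, commands along these lines should confirm the diagnosis; the /var/lib/docker path is an assumption (use whatever mount point backs your Docker data root):
# report the ftype option on the xfs filesystem backing the Docker data root
xfs_info /var/lib/docker | grep ftype
# report what the daemon itself detected
docker info | grep -i d_type
If ftype=0 is shown, the filesystem must be re-created (e.g. mkfs.xfs -n ftype=1 /dev/<device>, which is destructive) before overlay/overlay2 will behave correctly.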

Was able to get past the issue.
The change log for 1.13 says:
"IMPORTANT: On Linux distributions where devicemapper was the default storage driver, the overlay2, or overlay is now used by default (if the kernel supports it)."
So I tried putting back devicemapper and it is now working as expected.
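If you want to pin the driver explicitly rather than rely on the default, one way (a sketch, not from the original answer) is to set it in /etc/docker/daemon.json and restart the daemon:
{
  "storage-driver": "devicemapper"
}
Keep in mind the caveat above: the default loopback-based devicemapper setup is not recommended for production use.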

Related

Pulling / saving a few docker images in parallel - failure

I am trying to pull and save a few docker images in parallel (they are quite big and parallelism can save a lot of time - it is done within a Python script invoking docker pull and then docker save in each thread). However, it fails all the time with a message like this:
Client Error: Not Found ("open /var/lib/docker/overlay2/41693d132695cd5ada8cf37f210d5b70bc1bac1b2cedfa5a4f352efa5ff00fc6/merged/some_file_name: no such file or directory")
the specific file on which it complains ('no such file or directory') varies.
In /var/log/messages (even after adding the debug flag to docker daemon options) I can't see anything valuable.
e.g.
level=error msg="Handler for GET /v1.35/images/xxx/xxx:xxx/get returned error: open /var/lib/docker/overlay2/41693d132695cd5ada8cf37f210d5b70bc1bac1b2cedfa5a4f352efa5ff00fc6/merged/opt/external/postgresql-42.2.5/postgresql-42.2.5.jar: no such file or directory"
Important (probably) note: the images share many layers in common, as they are built from the same parent images (is this the reason for the collision in the overlay filesystem?).
Running the same sequentially (number of parallel threads set to 1) works perfectly.
OS: centos 7.9
Docker:
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: xfs
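For reference, a rough shell approximation of the workflow described above (image names are placeholders); each background job pulls and then saves one image, which reproduces the same parallel pattern outside Python:
( docker pull registry.example.com/app/image-a:latest && docker save -o image-a.tar registry.example.com/app/image-a:latest ) &
( docker pull registry.example.com/app/image-b:latest && docker save -o image-b.tar registry.example.com/app/image-b:latest ) &
wait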

How to run singularity container on HPC cluster? - ERROR : Failed to create user namespace: user namespace disabled

I'm trying to launch a singularity container on an HPC cluster. I have been running the projectNetv2.sif and sandbox on my local machine with no issues. After copying them to the HPC I get the following error.
(singularity) [me@hpc Project]$ ls
examples projectnet_image_v2.tar.gz projectnet_sandboxv2 projectNetv2.sif
(singularity) [me@hpc Project]$ singularity run projectNetv2.sif
INFO: Converting SIF file to temporary sandbox...
FATAL: while extracting SimNetv21.sif: root filesystem extraction failed: extract command failed: ERROR : Failed to create user namespace: user namespace disabled
: exit status 1
##Attempting to run sandbox
(singularity) [me@hpc Project]$ singularity run projectnet_sandboxv2/
ERROR : Failed to create user namespace: user namespace disabled
Can anyone advise on how I can either enable user namespaces, or alternatively run the .sif without user namespaces, since I don't have sudo privileges?
Short answer:
bug your HPC admins to install Singularity
Longer answer:
There are two ways to install Singularity, as a privileged installation or an unprivileged / user namespace installation. The first way is the default, but requires sudo/root for certain actions (mostly singularity build). The latter removes the need for root, but has other system requirements. It's possible additional OS configuration is necessary for Singularity to function as expected.
In addition to privileged/unprivileged installations, disk storage in clusters is usually on NFS or another networked/distributed filesystem so all nodes have access to the same data. Unfortunately, as is usually the case any time it is involved, NFS is a likely cause of your problem. Singularity relies on SUID for its core functionality, but for (quite good) security reasons SUID is disabled on NFS by default. It is unlikely the cluster admins will enable that option, so your best bet is to ask them to install it locally on whichever compute/interactive nodes you need it on.
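If you want to confirm the diagnosis before contacting the admins, a quick check (assuming a RHEL/CentOS-style node, where this sysctl gates user namespaces) is:
# a value of 0 means user namespaces are disabled on this node
sysctl user.max_user_namespaces
cat /proc/sys/user/max_user_namespaces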

Iotedge windows container volume access

I have a Windows container module that is supposed to write to a simple text file inside the volumes folder on the host machine.
The module is hardcoded to write the same thing to the same file on start up (this is for testing purposes).
Expected behavior
The module is initialized and a volume is created on the host machine and a text file is created in that volume.
Actual Behavior
The module is not allowed to write to its volume and I get the below access permission issue.
(screenshot: Volume Access Permission Issue)
If I add "Users" to the volume folder and give that group permission to modify the volume then everything works.
Question
Is there a way to do this without changing volume access options manually every time? If not, what is the best practice for allowing a Windows container access to its volume?
Device Info
Windows 10 Enterprise LTSC
iotedge 1.1.3
Do you have the same behavior in the default path for the Moby engine volumes?
Path: C:\ProgramData\iotedge-moby\volumes
Command to create/set:
docker -H npipe:////./pipe/iotedge_moby_engine volume create testmodule
With this volume I never had a problem (currently we use Edge Runtime 1.1.4 + Windows Server 2019).
If we use a directory outside this "default" volume location, we need to manually authorize the "Authenticated Users" group (Modify, Read, Write, List and Execute) to allow the container/Moby engine to read and write there.
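For the non-default case, the manual authorization can be scripted once per volume folder with something like the following from an elevated prompt; the path is a placeholder for wherever your volume folder lives:
icacls "C:\data\module-volume" /grant "Authenticated Users:(OI)(CI)M"
The (OI)(CI) flags make the Modify grant inherit to files and subfolders created inside the volume.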

How can I get `cos-extensions install gpu` to work on a Google Cloud VM?

I'm trying to set up a container-optimized OS (COS) on GCE with a GPU, following the instructions at https://cloud.google.com/container-optimized-os/docs/how-to/run-gpus. After creating the VM, it says to ssh in and run cos-extensions install gpu. That works; you can see during the install it runs nvidia-smi which prints out the driver version (440.33.01) and connects to the card.
But it installs the nvidia bins and libs in /var/lib/nvidia, which is mounted as noexec in this OS (it's very locked down). That means none of the libs or utilities work. And when you mount them to a docker container, they don't work there either; they're still noexec.
The only workaround I've found is to copy the whole /var/lib/nvidia dir to a tmpfs scratch disk and use it from there.
Am I using it wrong, or is it just broken?
This doesn't look to be a containerd issue but rather expected Container-Optimized OS behaviour, because COS provides another level of hardening by applying security-minded default values for several features.
If you look at the documentation for the Container-Optimized OS filesystem, everything under /var is mounted as noexec except for:
/var/lib/google
/var/lib/docker
/var/lib/toolbox
Those are mounted with writable, executable and stateful properties.
On the other hand, the Ubuntu (containerd) image does not apply the same strict exec/noexec mount options as COS, so it could be a good idea to use Ubuntu-based images instead of COS as a workaround.
Another option is to copy the contents of /var/lib/nvidia under another mount point that was not mounted with the noexec option, as you already did.
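As a sketch of that copy-based workaround (the target path is only an example; /var/lib/google is one of the exec-allowed mounts listed above):
sudo mkdir -p /var/lib/google/nvidia
sudo cp -a /var/lib/nvidia/. /var/lib/google/nvidia/
Then bind-mount /var/lib/google/nvidia into your containers in place of /var/lib/nvidia.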
Turns out I wasn't doing anything wrong. This is confirmed now as a bug in cos-extensions: https://issuetracker.google.com/issues/164134488
Odd, because it seems like this would have shown up in testing.
There aren't any good production workarounds at the moment, because as a user it's hard to modify COS's behavior without some advanced scripting.

How do I change the default docker container location? [duplicate]

This question already has answers here:
How to change the docker image installation directory?
(20 answers)
Closed 2 years ago.
When I run docker, downloaded docker images (seem to be) stored in /var/lib/docker somewhere.
Since disk space is limited on this directory, and I'm provisioning docker to multiple machines at once; is there a way to change this default location to i.e. /mnt/hugedrive/docker/?
Working solution as of Docker v18.03
I found @Alfabravo's comment to work in my situation, so credit to them and upvoted.
However I think it adds value to provide an answer here to elaborate on it:
Ensure docker stopped (or not started in the first place, e.g. if you've just installed it)
(e.g. as root user):
systemctl stop docker
(or you can sudo systemctl stop docker if not root but your user is a sudo-er, i.e. belongs to the sudo group)
By default, the daemon.json file does not exist, because it is optional - it is added to override the defaults. (Reference: see the answer to Where's docker's daemon.json? (missing).)
So new installs of docker and those setups that haven't ever modified it, won't have it, so create it:
vi /etc/docker/daemon.json
And add the following to tell docker to put all its files in this folder, e.g.:
{
  "graph": "/mnt/cryptfs/docker"
}
and save.
(Note: According to stackoverflow user Alireza Mohamadi's comment beneath this answer on May 11 5:01: "graph option is deprecated in v17.05.0. Use data-root instead." - I haven't tried this myself yet but will update the answer when I have)
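For what it's worth, on releases where graph is deprecated, the equivalent file would presumably look like this (untested here, based on that comment):
{
  "data-root": "/mnt/cryptfs/docker"
}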
Now start docker:
systemctl start docker
(as root, or prefix with sudo if you are another user.)
And you will find that docker has now put all its files in the new location, in my case, under: /mnt/cryptfs/docker.
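A quick way to confirm the daemon picked up the new location is to check the Docker Root Dir field:
docker info | grep -i "docker root dir"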
This answer from @Alfabravo is also supported by this answer to the problem Docker daemon flags ignored.
Notes and thoughts on Docker versioning
My host platform that is running docker is Ubuntu Linux 16.04.4 LTS 64bit.
I would therefore assume that this solution should work from v18.03 onwards. As with the other answers, there is the possibility that it might not work for some future version of Docker, if the Docker developers decide to change things in this area. But for now it works with v18.03, at least in my case; I hope you also find it works for you.
Optional Housekeeping tip:
If you had files in the original location /var/lib/docker and you know yourself that you definitely don't need them anymore (i.e. you have all the data (databases inside containers, files etc) within them backed up or in another form), you can delete them, so as to keep your machine tidy.
What did NOT work - other answers here (unfortunately):
The other solutions here did not work for my situation with the version of Docker I am using (at the time of writing, v18.03).
Also note (as @AlfaBravo correctly points out in their comment to my answer) that the other answers may well have worked for different or earlier versions of docker.
In all cases, when attempting the other answers, I stopped docker before applying the solution and started it up again afterwards, as required:
https://stackoverflow.com/a/47604857/227926 - @Gerald Sabu M's solution to alter /lib/systemd/system/docker.service, changing the line to: ExecStart=/usr/bin/docker daemon -g /mnt/hugedrive/docker/ - Outcome for me: docker still put its files in the default, original location: /var/lib/docker
I tried @Fai's comment, but that file does not exist on my system, so it would be something particular to their setup: /etc/systemd/system/docker.service.d/exec_start.conf
I also tried @Hatem Jaber's answer https://stackoverflow.com/a/32072042/227926 - but again, as with @Gerald Sabu M's answer, docker still puts the files in the original default location of /var/lib/docker.
(I would of course like to thank them for their efforts, though).
Why I am changing the default docker location: encrypted file system for GDPR purposes:
As an aside, and perhaps useful to you, I'm running docker inside an encrypted file system (as part of a GDPR initiative) in order to provide encryption for the Data-at-Rest state (also known as Encryption-at-Rest) and also for Data-In-Use (definitions).
The process of defining a GDPR data map includes, among many other things, looking at the systems where the sensitive data is stored (Reference 1: GDPR Data Map Template: An easy to use self-assessment tool for understanding how data moves through your organisation) (Reference 2: Data mapping: Where to start for GDPR compliance). By encrypting the filesystem where the database and application code are stored, along with the swap file, the risk of residual data being left behind when deleting or moving a VM can be eliminated.
I've made use of some of the steps defined in the following links, credit to them:
Encrypting Docker containers on a Virtual Server
How To: Linux Hard Disk Encryption With LUKS [cryptsetup Command]
I would note that a further step of encryption is recommended: to encrypt the database fields themselves - the sensitive fields at least, i.e. user data. You can probably find out about the various levels of support for this in the implementations of popular database systems. Field encryption provides defence against malicious intrusion and leakage of data while the web application is running.
Also, as another aside: to cover the 'Data-In-Motion' state of data, I am using the free Let's Encrypt service.
The best solution would be to start the docker daemon (dockerd) with a correct data root path. According to the official documentation, as of Feb 2019 there are no --graph or -g options; these were renamed to the single argument --data-root.
https://docs.docker.com/engine/reference/commandline/dockerd/
So you should modify your /lib/systemd/system/docker.service so that the ExecStart line takes that argument into consideration.
An example could be:
ExecStart=/usr/bin/dockerd --data-root /mnt/data/docker -H fd://
Then you should restart your docker daemon. (Keep in mind that you will no longer see your containers and images; copy the data from your old folder to the new one if you want to keep everything.)
service docker restart
Keep in mind that if you restart the docker daemon your containers will be stopped, and only those with a correct restart policy will be restarted.
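For example, an existing container can be given such a policy before the restart (the container name is a placeholder):
docker update --restart unless-stopped my-container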
Tested on Ubuntu 16.04.5 Docker version 18.09.1, build 4c52b90
You can start the Docker daemon using the -g option and the directory of your choice. This sets the root of the Docker runtime.
With version 1.8, it should be something like:
docker daemon -g /path/to/directory
With earlier versions, it would be:
docker -d -g /path/to/directory
From the man page:
-g, --graph=""
Path to use as the root of the Docker runtime. Default is /var/lib/docker.
You can perform the following steps to modify the default docker image location, i.e. /var/lib/docker:
Stop Docker
# systemctl stop docker
# systemctl daemon-reload
Change the ExecStart line in /lib/systemd/system/docker.service.
FROM:
ExecStart=/usr/bin/dockerd
TO:
ExecStart=/usr/bin/docker daemon -g /mnt/hugedrive/docker/
Create a new directory and rsync the current docker data to the new directory.
# mkdir /mnt/hugedrive/docker/
# rsync -aqxP /var/lib/docker/ /mnt/hugedrive/docker/
Now the Docker daemon can be started safely:
# systemctl start docker
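One caveat worth noting: systemd caches unit files, so if docker.service was edited after the daemon-reload above, re-running the reload before the start is a safe extra step:
# systemctl daemon-reload
# systemctl start docker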
In /etc/default/docker, or wherever it exists on your system, change the following to something like this:
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.8.4 -g /drive/location"
If you have issues and the option is ignored, apply this solution: Docker Opts in Etc Default Docker Ignored
