How to use nvidia-docker container as non-root user? - docker

I want to use a CUDA container in Docker as a non root user, but am running into permission problems. Here's an example Dockerfile:
FROM nvidia/cudagl:11.2.2-runtime-ubuntu18.04
RUN useradd -ms /bin/bash testuser -G video,sudo
USER testuser
ENTRYPOINT "/bin/bash"
Running nvidia-smi gives the following error: Failed to initialize NVML: Insufficient Permissions
My application uses VirtualGL and Xvfb to render Chrome with a GPU if that's relevant. Works perfectly fine with the root user.

TL;DR - Check the gid of vglusers group on the host. Add this group with the gid in the container, and add the user to this group.
So investigating this a bit, I looked at the nvidia devices in the container:
root@56cef279b83f:/# cd /dev
root@56cef279b83f:/dev# ls -l | grep nvidia
crw-rw---- 1 root 1005 195, 0 Nov 23 23:13 nvidia0
crw-rw---- 1 root 1005 195, 255 Nov 23 23:13 nvidiactl
crw-rw---- 1 root 1005 195, 254 Nov 23 23:13 nvidia-modeset
crw-rw-rw- 1 root root 506, 0 Nov 23 23:13 nvidia-uvm
crw-rw-rw- 1 root root 506, 1 Nov 23 23:13 nvidia-uvm-tools
The nvidia devices belonged to a group with gid 1005. This was odd as there was no group in the container with that ID.
I went to look into the devices on the host, and as per my VGL setup, they belong to root, or the vglusers group.
(venv) jsim@goliath:/var/log$ cd /dev/
(venv) jsim@goliath:/dev$ ls -l | grep nvidia
crw-rw---- 1 root vglusers 195, 0 Nov 24 10:13 nvidia0
drwxr-xr-x 2 root root 80 Nov 24 10:31 nvidia-caps
crw-rw---- 1 root vglusers 195, 255 Nov 24 10:13 nvidiactl
crw-rw---- 1 root vglusers 195, 254 Nov 24 10:13 nvidia-modeset
crw-rw-rw- 1 root root 506, 0 Nov 24 10:13 nvidia-uvm
crw-rw-rw- 1 root root 506, 1 Nov 24 10:13 nvidia-uvm-tools
As it turns out, vglusers has a gid of 1005!
jsim@goliath:/dev$ cat /etc/group | grep vglusers
vglusers:x:1005:jsim
So in my Dockerfile, all I had to do is add the group vglusers with gid 1005, and add my user to this group. Problem solved.
RUN groupadd -g 1005 vglusers && \
useradd -ms /bin/bash testuser -u 1000 -g 1005 && \
usermod -a -G video,sudo testuser
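Rather than hard-coding 1005, the GID could be read from the device node on the host and handed to the build. The following is only a sketch under assumptions: it relies on GNU stat, and VGL_GID is a hypothetical build argument name that the Dockerfile would have to declare with ARG and use in the groupadd line.

```shell
#!/bin/sh
# Print the numeric GID that owns a path, e.g. /dev/nvidiactl on the host.
# Uses GNU stat (coreutils); BSD stat takes different options.
device_gid() {
    stat -c '%g' "$1"
}

# Format that GID as a docker build argument.
# VGL_GID is a hypothetical ARG name, not something the original
# Dockerfile defines.
build_arg_for() {
    printf '%s%s\n' '--build-arg VGL_GID=' "$(device_gid "$1")"
}
```

On the host this might be used as docker build $(build_arg_for /dev/nvidiactl) -t myimage . where myimage is a placeholder tag.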

lxc Container (Proxmox) Nextcloud problem

I created an LXC container with Proxmox using
https://www.turnkeylinux.org/download?file=turnkey-nextcloud-17.1-bullseye-amd64.iso
I mounted a hard disk on the Proxmox host and passed it into the container with
root@pve:/mnt/nas/data# pct set 101 -mp0 /mnt/nas,mp=/mnt/nextcloud
The problem is that the folder permissions show up as nobody:nogroup, and I can't change them as the root user inside the LXC container.
The www-data user and group exist in the LXC container, but they are not shown as the owner in the command output below.
This is from the Proxmox host:
root@pve:/mnt/nas# ls -la
total 29
drwxr-xr-x 5 root root 4096 Jan 9 13:53 .
drwxr-xr-x 3 root root 3 Jan 14 12:10 ..
drwxr-xr-x 2 root root 4096 Jan 3 08:01 code
drwxr-x--- 10 www-data www-data 4096 Jan 9 23:05 data
drwx------ 2 root root 16384 Nov 24 10:39 lost+found
root@pve:/mnt/nas# cat /etc/fstab
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
UUID=7a2cccf9-745c-462a-acf8-80bca216da85 /mnt/nas ext4 defaults 0 1
root@pve:/mnt/nas#
And this is from the LXC container:
root@Nextcloud /mnt# ls
nextcloud
root@Nextcloud /mnt# ls -la
total 13
drwxr-xr-x 3 root root 3 Jan 14 11:14 .
drwxr-xr-x 17 root root 23 Jan 14 11:09 ..
drwxr-xr-x 5 nobody nogroup 4096 Jan 9 12:53 nextcloud
root@Nextcloud /mnt# chown -R www-data:www-data /mnt/nextcloud/data/
chown: cannot read directory '/mnt/nextcloud/data/': Permission denied
root@Nextcloud /mnt# chown -R root:root /mnt/nextcloud/data/
chown: cannot read directory '/mnt/nextcloud/data/': Permission denied
root@Nextcloud /mnt# groups
root
root@Nextcloud /mnt# addgroup www-data
addgroup: The group `www-data' already exists.
root@Nextcloud /mnt#
How can I solve this problem?
########################
LXC uses Linux user namespaces to separate user IDs from the host. By default, UID 0 (root) inside the container is seen as UID 100000 by the Proxmox host, UID 1 as 100001, and so on. Host UID 33 falls outside that mapped range, which is why the directory you're bind-mounting, owned by www-data (UID 33) from the host's perspective, shows up as nobody:nogroup inside the container.
There are a couple ways to deal with this, but my preferred method, if you can get away with it, is to change the owner of the directory from the host to the desired UID + 100000. So in this case, do chown -R 100033:100033 /mnt/nas and that should give you the desired permissions in the container.
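The arithmetic behind that chown can be sketched as a small helper. It assumes the default offset of 100000 mentioned above; a container with a custom ID map would need a different value.

```shell
#!/bin/sh
# Offset that an unprivileged Proxmox container applies by default.
OFFSET=100000

# Translate an ID as seen inside the container to the ID the host sees.
host_id() {
    echo $(( $1 + OFFSET ))
}

# Example: www-data is UID/GID 33 inside the container.
host_id 33   # prints 100033
```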
If it's important to keep the permissions as they are from the host perspective, try using an ID map (there's a good description in the Proxmox wiki: https://pve.proxmox.com/wiki/Unprivileged_LXC_containers; and also a website to help calculate the proper UID numbers: https://proxmox-idmap-helper.nieradko.com/)
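As a rough sketch of what such an ID map looks like, the helper below prints lxc.idmap lines that pass a single ID straight through and shift everything else by 100000. The line format follows the wiki page linked above as I understand it, and the function name is made up for illustration, so check the output against the wiki before putting it in a container config.

```shell
#!/bin/sh
# Print lxc.idmap lines that map one container ID 1:1 to the host
# and keep the default 100000 offset for all other IDs (0..65535).
# Assumption: the host also needs "root:<id>:1" in /etc/subuid and
# /etc/subgid for the pass-through to be permitted.
idmap_lines() {
    cid=$1
    for kind in u g; do
        # IDs below the pass-through ID keep the default offset.
        printf 'lxc.idmap: %s 0 100000 %s\n' "$kind" "$cid"
        # The pass-through ID itself maps to the same ID on the host.
        printf 'lxc.idmap: %s %s %s 1\n' "$kind" "$cid" "$cid"
        # The remaining IDs up to 65535 keep the default offset.
        printf 'lxc.idmap: %s %s %s %s\n' "$kind" \
            "$((cid + 1))" "$((100000 + cid + 1))" "$((65536 - cid - 1))"
    done
}
```

Calling idmap_lines 33 would produce the six lines for www-data (three for UIDs, three for GIDs).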
root@Nextcloud ~# cd /mnt
root@Nextcloud /mnt# ls
nextcloud
root@Nextcloud /mnt# cd nextcloud/
root@Nextcloud /mnt/nextcloud# ls
code data lost+found
root@Nextcloud /mnt/nextcloud# cd data/
root@Nextcloud .../nextcloud/data# ls
Biene appdata_oczb14gwpmn2 flow.log nextcloud.log.1
Meltymon audit.log flow.log.1 owncloud.db
__groupfolders biene index.html updater-oczb14gwpmn2
appdata_ochaal06qhnm files_external nextcloud.log updater.log
root@Nextcloud .../nextcloud/data# cd ..
root@Nextcloud /mnt/nextcloud# ls -la
total 29
drwxr-xr-x 5 www-data www-data 4096 Jan 9 12:53 .
drwxr-xr-x 3 root root 3 Jan 14 11:14 ..
drwxr-xr-x 2 www-data www-data 4096 Jan 3 07:01 code
drwxr-x--- 10 www-data www-data 4096 Jan 9 22:05 data
drwx------ 2 www-data www-data 16384 Nov 24 09:39 lost+found
root@Nextcloud /mnt/nextcloud#
Nice, it works!
I had to install sudo to rescan the files and update the database:
sudo -u www-data php occ files:scan --all
sudo -u www-data php occ db:add-missing-indices
and give the occ file execute permission:
chmod +x /var/www/nextcloud/occ
Thank you so much. I searched the whole day for a solution with Google but didn't find anything like this.

How to ensure that users inside and outside docker are consistent?

I am making a Docker image. I would like to have a ready-made environment in there as well as some ready-made directories. In this way, I only need to mount some of my directories and use them directly. I made the image using the Dockerfile below. In order to have the same permissions inside and outside the container (not root), I created a user user.
FROM matthewfeickert/docker-python3-ubuntu:latest
USER root
# Create an arbitrary non-root user; we don't care about its uid
# or other properties
RUN useradd --system user
RUN sudo pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
RUN set -x; \
sudo apt-get update \
&& DEBIAN_FRONTEND=noninteractive sudo apt-get install -y build-essential git-core m4 zlib1g zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev swig xz-utils gdb git \
&& sudo -H python3 -m pip install scons==3.0.1 \
&& sudo -H python3 -m pip install six
# RUN apt-get -y install gdb
RUN apt-get clean
RUN git config --global url."https://hub.fastgit.xyz/".insteadOf "https://github.com/"
WORKDIR /usr/local/src
RUN git clone https://github.com/gem5/gem5.git
RUN sudo chown user /usr/local/src/gem5 -R
USER user
# RUN mkdir -p /usr/local/src/gem5/build
# RUN sudo chown user /usr/local/src/gem5/build
WORKDIR /usr/local/src/gem5/
After making the image, I mount my directory into it.
docker run -it --rm \
-v my_dir/runScripts:/usr/local/src/gem5/runScripts \
-v my_dir/gem5/src:/usr/local/src/gem5/src \
-v my_dir/gem5/configs:/usr/local/src/gem5/configs \
-v my_dir/gem5/programs:/usr/local/src/gem5/programs \
-v my_dir/gem5/build:/usr/local/src/gem5/build \
-v my_dir/gem5/results:/usr/local/src/gem5/results \
-v my_dir/gem5/update.sh:/usr/local/src/gem5/update.sh \
--security-opt seccomp=unconfined --user 1000:1000 gerrie/gem5:v1 "/bin/bash"
When I enter the docker container, I output the UID at this time.
$ echo $UID
1000
This is the same as outside the container.
What I think is that the inside and outside of the gem5 directory should be exactly the same user. But it's not.
$ ll
total 232
drwxr-xr-x 1 user root 4096 Jun 16 09:12 ./
drwxr-xr-x 1 root root 4096 Jun 16 03:16 ../
drwxr-xr-x 1 user root 4096 Jun 16 03:17 .git/
-rw-r--r-- 1 user root 984 Jun 16 03:17 .git-blame-ignore-revs
-rw-r--r-- 1 user root 645 Jun 16 03:17 .gitignore
-rw-r--r-- 1 user root 19339 Jun 16 03:17 .mailmap
-rw-r--r-- 1 user root 5595 Jun 16 03:17 CODE-OF-CONDUCT.md
-rw-r--r-- 1 user root 26112 Jun 16 03:17 CONTRIBUTING.md
-rw-r--r-- 1 user root 2332 Jun 16 03:17 COPYING
-rw-r--r-- 1 user root 1478 Jun 16 03:17 LICENSE
-rw-r--r-- 1 user root 7790 Jun 16 03:17 MAINTAINERS.yaml
-rw-r--r-- 1 user root 2133 Jun 16 03:17 README
-rw-r--r-- 1 user root 34435 Jun 16 03:17 RELEASE-NOTES.md
-rwxr-xr-x 1 user root 28876 Jun 16 03:17 SConstruct*
-rw-r--r-- 1 user root 8616 Jun 16 03:17 TESTING.md
drwxrwxr-x 2 docker docker 4096 Jun 16 08:52 build/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 build_opts/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 build_tools/
drwxrwxr-x 13 docker docker 4096 Jun 16 08:54 configs/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 ext/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 include/
drwxrwxr-x 2 docker docker 4096 Jun 16 09:03 programs/
-rw-rw-r-- 1 docker docker 0 Jun 16 08:58 results
drwxrwxr-x 2 docker docker 4096 Jun 16 02:33 runScripts/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 site_scons/
drwxrwxr-x 17 docker docker 4096 Jun 16 02:33 src/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 system/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 tests/
-rw-rw-r-- 1 docker docker 0 Jun 16 08:58 update.sh
drwxr-xr-x 1 user root 4096 Jun 16 03:17 util/
All the directories I mount belong to the docker user, and all other directories are user.
I am able to create files inside my mounted directory. But for gem5's directory, I don't even have permission to create files.
But according to the Dockerfile, I clearly chown this directory to user, and when entering the container I set the UID.
docker@7df3004beb2a:/usr/local/src/gem5$ touch test
touch: cannot touch 'test': Permission denied
docker@7df3004beb2a:/usr/local/src/gem5$ cd runScripts/
docker@7df3004beb2a:/usr/local/src/gem5/runScripts$ touch test
docker@7df3004beb2a:/usr/local/src/gem5/runScripts$ ll
total 8
drwxrwxr-x 2 docker docker 4096 Jun 16 09:13 ./
drwxr-xr-x 1 user root 4096 Jun 16 09:12 ../
-rw-r--r-- 1 docker docker 0 Jun 16 09:13 test
When I compile, this problem occurs. I think this is caused by a permissions issue. Where did I go wrong? How should I modify it? Thanks a lot!
FileNotFoundError: [Errno 2] No such file or directory: "/usr/local/src/gem5/fatal: unsafe repository ('/usr/local/src/gem5' is owned by someone else)\nTo add an exception for this directory, call:\n\n\tgit config --global --add safe.directory /usr/local/src/gem5/hooks":
As the question was tagged with podman (in addition to docker), here is a Podman solution to the problem of mapping users between the host and the container:
If you want to map your regular user on the host to a user with the same UID inside the container, you could add the Podman option --userns=keep-id. A more general solution (that also works when the UIDs are not the same) can be found in two tips in Podman's troubleshooting.md. The tips make use of the options --uidmap and --gidmap. (I wrote those tips.)
The two options --uidmap and --gidmap may look a bit complicated to use, but as soon as you understand how rootless Podman maps UIDs and GIDs it becomes pretty straightforward.
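Both symptoms in the question (the failed touch and git's "unsafe repository" message) come down to the running UID not being the owner of /usr/local/src/gem5. A minimal sketch of a check for that condition, assuming GNU stat is available in the container and using made-up function names:

```shell
#!/bin/sh
# Print the numeric UID that owns a path (GNU stat).
owner_uid() {
    stat -c '%u' "$1"
}

# Succeed only when the current user owns the path.
owned_by_me() {
    [ "$(owner_uid "$1")" = "$(id -u)" ]
}
```

Inside the container, something like owned_by_me /usr/local/src/gem5 || echo "owner is UID $(owner_uid /usr/local/src/gem5), running as $(id -u)" would show the mismatch that the user-mapping options are meant to remove.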

How can I avoid `Permission denied` Errors when mounting a container into my deployment?

Background
I am currently deploying Apache Airflow using Helm (using this chart). I am using a git-sync sidecar to mount the SQL & Python files which Airflow needs access to in order to execute scripts/files.
What seems not to work
Once I am done with deploying my container, it seems that my Airflow user is unable to use the files (that have been mounted by the git sidecar), and exits with error (this error happens for all files that have been mounted not only target):
[Errno 13] Permission denied: 'target'
What I have tried
My docker container for the deployment looks like:
FROM apache/airflow:1.10.14-python3.8
USER root
# apt deps
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
nano \
gcc \
python3-dev \
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Use airflow user for pip installs and other things.
USER airflow
# Additional requirements for Airflow
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
# Creating folder for logs
RUN mkdir -p /opt/airflow/dbt_logs
RUN pip install dbt==0.18.1
EXPOSE 8081
Running ls -ld */ (in /opt/airflow ) in the scheduler container I get:
airflow@airflow-scheduler-58c7cb87b8-8nk6f:/opt/airflow$ ls -ld */
drwxrwsrwx 5 1000 1000 41 Dec 26 16:02 dags/
drwxr-xr-x 2 airflow airflow 6 Dec 26 14:10 dbt_logs/
drwxrwxr-x 1 airflow root 52 Dec 26 16:02 logs/
And running ls -ld */ (in /opt/airflow ) in the web server container I get:
airflow@airflow-web-7f8df9457-7n2dp:/opt/airflow$ ls -ld */
drwxrwsrwx 5 root nogroup 41 Dec 26 14:20 dags/
drwxr-xr-x 1 airflow airflow 21 Dec 26 14:41 dbt_logs/
drwxrwxr-x 1 airflow root 23 Dec 26 14:20 logs/
This is what the structure of the dbt folder looks like within my dags dir:
airflow@airflow-scheduler-6cdc985b9b-ssmhx:/opt/airflow/dags/dbt/dw$ ls -l
total 16
-rw-r--r-- 1 root 1000 3061 Dec 27 09:00 README.md
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 analysis
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 data
-rw-r--r-- 1 root 1000 1852 Dec 27 09:00 dbt_project.yml
drwxr-sr-x 2 root 1000 214 Dec 27 09:00 macros
drwxr-sr-x 3 root 1000 21 Dec 27 09:00 models
-rw-r--r-- 1 root 1000 141 Dec 27 09:00 packages.yml
-rw-r--r-- 1 root 1000 842 Dec 27 09:00 profiles.yml
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 snapshots
drwxr-sr-x 2 root 1000 22 Dec 27 09:00 tests
Worth mentioning that I seem not to be able to create files within the dbt dir with the airflow user (permission denied)
It seems to me that once the volume is mounted its owner becomes the root user. How can I provide the Airflow user with the ability to access the mounted git repository?
Happy to provide additional details if needed

Docker: file permissions with --volume bind mount

I'm following the guidelines from https://denibertovic.com/posts/handling-permissions-with-docker-volumes/ to set up a --volume bind mount in my container and create a user in the guest container with the same UID as my host user, the theory being that my container user should then be able to access the mount. It's not working for me and I'm looking for some pointers on what to try next.
More background details:
My Dockerfile starts from an alpine base and adds python dev packages. It copies across an entrypoint.sh script per the guidelines from denibertovic and sets it as the entrypoint.
FROM alpine
RUN apk update
RUN apk add bash
RUN apk add python3
RUN apk add python3-dev
RUN apk add su-exec
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
The entrypoint.sh script adds a user to the container with the UID passed in as an environment variable.
#!/bin/bash
# Add local user
# Either use the LOCAL_USER_ID if passed in at runtime or
# fallback
USER_ID=${LOCAL_USER_ID:-9001}
echo "Starting with UID : $USER_ID"
adduser -s /bin/bash -u $USER_ID -H -D user
export HOME=/home/user
su-exec user "$@"
The container builds no problem.
I then run it with the following command line:
sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws django-runtime /bin/bash
You'll see that I'm passing in my host UID to be mapped to the container user's UID and I'm asking for a volume bind mount from my local working directory to the /ws mountpoint in the container.
From the bash shell inside the container I can see that /ws is owned by the 'user' UID matching my own 'id'. However, when I go to list the contents of /ws I get a Permission Denied error as follows:
[dleclair#localhost runtime]$ sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws django-runtime /bin/bash
[sudo] password for dleclair:
Starting with UID : 1000
bash-5.0$ id
uid=1000(user) gid=1000(user) groups=1000(user)
bash-5.0$ ls -la .
total 0
drwxr-xr-x 1 root root 27 Feb 8 09:15 .
drwxr-xr-x 1 root root 27 Feb 8 09:15 ..
-rwxr-xr-x 1 root root 0 Feb 8 09:15 .dockerenv
drwxr-xr-x 1 root root 18 Feb 8 07:44 bin
drwxr-xr-x 5 root root 360 Feb 8 09:15 dev
drwxr-xr-x 1 root root 91 Feb 8 09:15 etc
drwxr-xr-x 2 root root 6 Jan 16 21:52 home
drwxr-xr-x 1 root root 17 Jan 16 21:52 lib
drwxr-xr-x 5 root root 44 Jan 16 21:52 media
drwxr-xr-x 2 root root 6 Jan 16 21:52 mnt
drwxr-xr-x 2 root root 6 Jan 16 21:52 opt
dr-xr-xr-x 119 root root 0 Feb 8 09:15 proc
drwx------ 2 root root 6 Jan 16 21:52 root
drwxr-xr-x 1 root root 21 Feb 8 07:44 run
drwxr-xr-x 1 root root 21 Feb 8 08:22 sbin
drwxr-xr-x 2 root root 6 Jan 16 21:52 srv
dr-xr-xr-x 13 root root 0 Feb 8 01:58 sys
drwxrwxrwt 2 root root 6 Jan 16 21:52 tmp
drwxr-xr-x 1 root root 19 Feb 8 07:44 usr
drwxr-xr-x 1 root root 19 Jan 16 21:52 var
drwxrwxr-x 5 user user 111 Feb 8 02:15 ws
bash-5.0$
bash-5.0$
bash-5.0$ cd /ws
bash-5.0$ ls -la
ls: can't open '.': Permission denied
total 0
bash-5.0$
Appreciate any pointers anyone can offer. Thanks!
After more searching I found the answer to my problem here: Permission denied on accessing host directory in Docker and here: http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/.
In short, the problem was with the SELinux default labels for the volume mount blocking access to the mounted files. The solution was to add a ':Z' trailer to the -v command line argument to force docker to set the appropriate flags against the mounted files to allow access.
The command line therefore became:
sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws:Z django-runtime /bin/bash
Worked like a charm.
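If the same command has to run on hosts with and without SELinux, the suffix could be added conditionally. This is only a sketch: volume_spec is a made-up helper, and it assumes getenforce is the way to query the SELinux mode on the host.

```shell
#!/bin/sh
# Build the value for docker's -v option, appending :Z only when
# SELinux is present and in enforcing mode.
volume_spec() {
    spec="$1:$2"
    if command -v getenforce >/dev/null 2>&1 &&
       [ "$(getenforce)" = "Enforcing" ]; then
        spec="$spec:Z"
    fi
    printf '%s\n' "$spec"
}
```

It might be used as -v "$(volume_spec "$(realpath ../..)" /ws)" in the docker run line above. Note that :Z relabels the host directory for exclusive use by one container, so it is best avoided on directories shared with other containers or system paths.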

Docker run, no response

I'm building a Docker image with docker build -t from the following Dockerfile.
FROM centos:7
RUN yum -y update
RUN yum -y install wget
RUN wget http://stedolan.github.io/jq/download/linux64/jq && chmod 755 jq && mv jq /bin
RUN yum -y install openssh-clients
RUN yum -y install cronie
RUN yum -y install java-1.8.0-openjdk
RUN yum -y install nmap-ncat
RUN yum -y install ntpdate
ENTRYPOINT tail -f /dev/null
After the build, when I execute docker run -it there is no response and I cannot log in to the container.
However, docker ps shows that the container is running.
Why is there no response? I suspect it has to do with how ENTRYPOINT is written.
Try starting container in detached mode.
-d, --detach Run container in background and print container ID
#>docker build -t myimg .
#>docker run -d --name mycontainer myimg
#>docker exec -it mycontainer bash
[root@mycontainer/]# ls -l
total 12
-rw-r--r-- 1 root root 11976 Apr 2 18:39 anaconda-post.log
lrwxrwxrwx 1 root root 7 May 25 06:51 bin -> usr/bin
dr-xr-xr-x 2 root root 6 Apr 11 04:59 boot
drwxr-xr-x 5 root root 340 May 25 06:53 dev
drwxr-xr-x 1 root root 66 May 25 06:53 etc
drwxr-xr-x 1 root root 6 Apr 11 04:59 home
lrwxrwxrwx 1 root root 7 May 25 06:51 lib -> usr/lib
lrwxrwxrwx 1 root root 9 May 25 06:51 lib64 -> usr/lib64
drwxr-xr-x 1 root root 6 Apr 11 04:59 media
drwxr-xr-x 1 root root 6 Apr 11 04:59 mnt
drwxr-xr-x 1 root root 6 Apr 11 04:59 opt
dr-xr-xr-x 985 root root 0 May 25 06:53 proc
dr-xr-x--- 1 root root 6 Apr 11 04:59 root
drwxr-xr-x 1 root root 6 May 25 06:52 run
lrwxrwxrwx 1 root root 8 May 25 06:51 sbin -> usr/sbin
drwxr-xr-x 1 root root 6 Apr 11 04:59 srv
dr-xr-xr-x 13 root root 0 May 2 14:37 sys
drwxrwxrwt 1 root root 6 May 25 06:52 tmp
drwxr-xr-x 1 root root 44 May 25 06:51 usr
drwxr-xr-x 1 root root 52 May 25 06:51 var
[root@mycontainer/]#
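ENTRYPOINT sets the default process that runs as PID 1 in the container. Arguments given after the image name do not replace it; to replace it, use the --entrypoint option of docker run.
docker run container_image runs the ENTRYPOINT as the init process.
docker run container_image prog passes prog as an argument to the ENTRYPOINT (replacing CMD). With a shell-form ENTRYPOINT such as the one above, that argument is ignored, so tail -f /dev/null still runs and the terminal shows no output.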
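As a rough local illustration of how a shell-form ENTRYPOINT treats extra arguments: Docker wraps the shell form in /bin/sh -c, so words given after the image name end up as positional parameters of that shell instead of replacing the command. The function below only imitates that wrapping with a harmless echo; it does not call Docker.

```shell
#!/bin/sh
# Imitate a shell-form ENTRYPOINT: the command string is fixed, and
# "$@" stands in for whatever was typed after the image name.
shell_form_entrypoint() {
    sh -c 'echo "running: tail -f /dev/null"' "$@"
}

# The extra word becomes $0 of the inner shell and is otherwise unused.
shell_form_entrypoint bash
```

The output is the same with or without the extra argument, which matches the behaviour seen in the question: docker run -it myimg bash still runs tail and never starts a shell.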
