Almost all files are created by root user in my Docker image

This is my Dockerfile:
FROM python:3.10.5-alpine
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
RUN adduser -D appuser
USER appuser
WORKDIR /home/appuser/
COPY requirements.txt .
RUN python -m pip install --user --no-cache-dir --disable-pip-version-check --requirement requirements.txt
COPY . .
ENTRYPOINT [ "./entrypoint.sh" ]
So I create a user called appuser and switch to it as soon as I can, before copying anything (I've checked that both the user and its home folder are created).
But when I browse the filesystem of my image:
~ $ ls -l
total 156
-rwxr-xr-x 1 root root 335 Jul 28 10:57 Dockerfile
-rw-r--r-- 1 appuser appuser 131072 Jul 28 12:28 db.sqlite3
-rwxr-xr-x 1 root root 150 Jul 28 11:37 entrypoint.sh
-rwxr-xr-x 1 root root 685 Jul 28 10:04 manage.py
drwxr-xr-x 2 root root 4096 Jul 28 10:56 project
-rwxr-xr-x 1 root root 41 Jul 28 11:56 requirements.txt
drwxr-xr-x 2 root root 4096 Jul 28 11:50 static
drwxr-xr-x 5 root root 4096 Jul 28 10:05 venv
... almost everything belongs to the root user, and this gives me several permission denied errors.
What is my mistake? I assumed Docker wouldn't operate as root once I've switched the user.
I know I can add RUN mkdir ~/static to the Dockerfile and get past it, but then what the documentation says about the USER instruction doesn't make sense to me:
The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use as the default user and group for the remainder of the current stage.

COPY and ADD do not follow the USER instruction: they create files with a UID and GID of 0 unless you pass the optional flag --chown=<user>:<group>.
For example:
COPY --chown=appuser:appuser . .
docker docs
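To confirm that a rebuild with --chown had the intended effect, a small check can be run inside the container. This is a minimal sketch, not part of the original answer: the function name list_foreign_files is hypothetical, and it assumes a find that supports -user with a numeric UID (GNU findutils and the default BusyBox build do).

```shell
#!/bin/sh
# List every entry under a directory that is NOT owned by the current user.
# An empty result means the whole tree belongs to whoever runs the check.
list_foreign_files() {
    dir="${1:-.}"
    find "$dir" ! -user "$(id -u)"
}
```

Calling list_foreign_files /home/appuser as appuser would be expected to print the root-owned entries from the listing above before the change, and nothing for the copied files once COPY --chown=appuser:appuser is in place.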

Related

How to ensure that users inside and outside docker are consistent?

I am making a Docker image. I would like to have a ready-made environment in there as well as some ready-made directories. In this way, I only need to mount some of my directories and use them directly. I made the image using the Dockerfile below. In order to have the same permissions inside and outside the container (not root), I created a user user.
FROM matthewfeickert/docker-python3-ubuntu:latest
USER root
# Create an arbitrary non-root user; we don't care about its uid
# or other properties
RUN useradd --system user
RUN sudo pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
RUN set -x; \
sudo apt-get update \
&& DEBIAN_FRONTEND=noninteractive sudo apt-get install -y build-essential git-core m4 zlib1g zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev swig xz-utils gdb git \
&& sudo -H python3 -m pip install scons==3.0.1 \
&& sudo -H python3 -m pip install six
# RUN apt-get -y install gdb
RUN apt-get clean
RUN git config --global url."https://hub.fastgit.xyz/".insteadOf "https://github.com/"
WORKDIR /usr/local/src
RUN git clone https://github.com/gem5/gem5.git
RUN sudo chown user /usr/local/src/gem5 -R
USER user
# RUN mkdir -p /usr/local/src/gem5/build
# RUN sudo chown user /usr/local/src/gem5/build
WORKDIR /usr/local/src/gem5/
After making the image, I mount my directory into it.
docker run -it --rm \
-v my_dir/runScripts:/usr/local/src/gem5/runScripts \
-v my_dir/gem5/src:/usr/local/src/gem5/src \
-v my_dir/gem5/configs:/usr/local/src/gem5/configs \
-v my_dir/gem5/programs:/usr/local/src/gem5/programs \
-v my_dir/gem5/build:/usr/local/src/gem5/build \
-v my_dir/gem5/results:/usr/local/src/gem5/results \
-v my_dir/gem5/update.sh:/usr/local/src/gem5/update.sh \
--security-opt seccomp=unconfined --user 1000:1000 gerrie/gem5:v1 "/bin/bash"
When I enter the docker container, I output the UID at this time.
$ echo $UID
1000
This is the same as outside the container.
I expected the gem5 directory to belong to the same user inside and outside the container, but it does not.
$ ll
total 232
drwxr-xr-x 1 user root 4096 Jun 16 09:12 ./
drwxr-xr-x 1 root root 4096 Jun 16 03:16 ../
drwxr-xr-x 1 user root 4096 Jun 16 03:17 .git/
-rw-r--r-- 1 user root 984 Jun 16 03:17 .git-blame-ignore-revs
-rw-r--r-- 1 user root 645 Jun 16 03:17 .gitignore
-rw-r--r-- 1 user root 19339 Jun 16 03:17 .mailmap
-rw-r--r-- 1 user root 5595 Jun 16 03:17 CODE-OF-CONDUCT.md
-rw-r--r-- 1 user root 26112 Jun 16 03:17 CONTRIBUTING.md
-rw-r--r-- 1 user root 2332 Jun 16 03:17 COPYING
-rw-r--r-- 1 user root 1478 Jun 16 03:17 LICENSE
-rw-r--r-- 1 user root 7790 Jun 16 03:17 MAINTAINERS.yaml
-rw-r--r-- 1 user root 2133 Jun 16 03:17 README
-rw-r--r-- 1 user root 34435 Jun 16 03:17 RELEASE-NOTES.md
-rwxr-xr-x 1 user root 28876 Jun 16 03:17 SConstruct*
-rw-r--r-- 1 user root 8616 Jun 16 03:17 TESTING.md
drwxrwxr-x 2 docker docker 4096 Jun 16 08:52 build/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 build_opts/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 build_tools/
drwxrwxr-x 13 docker docker 4096 Jun 16 08:54 configs/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 ext/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 include/
drwxrwxr-x 2 docker docker 4096 Jun 16 09:03 programs/
-rw-rw-r-- 1 docker docker 0 Jun 16 08:58 results
drwxrwxr-x 2 docker docker 4096 Jun 16 02:33 runScripts/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 site_scons/
drwxrwxr-x 17 docker docker 4096 Jun 16 02:33 src/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 system/
drwxr-xr-x 1 user root 4096 Jun 16 03:17 tests/
-rw-rw-r-- 1 docker docker 0 Jun 16 08:58 update.sh
drwxr-xr-x 1 user root 4096 Jun 16 03:17 util/
All the directories I mount belong to the docker user, and all other directories belong to user.
I am able to create files inside my mounted directories. But in gem5's own directory, I don't even have permission to create files.
But according to the Dockerfile, I clearly chown this directory to user, and I set the UID when entering the container.
docker@7df3004beb2a:/usr/local/src/gem5$ touch test
touch: cannot touch 'test': Permission denied
docker@7df3004beb2a:/usr/local/src/gem5$ cd runScripts/
docker@7df3004beb2a:/usr/local/src/gem5/runScripts$ touch test
docker@7df3004beb2a:/usr/local/src/gem5/runScripts$ ll
total 8
drwxrwxr-x 2 docker docker 4096 Jun 16 09:13 ./
drwxr-xr-x 1 user root 4096 Jun 16 09:12 ../
-rw-r--r-- 1 docker docker 0 Jun 16 09:13 test
When I compile, this problem occurs. I think this is caused by a permissions issue. Where did I go wrong? How should I modify it? Thanks a lot!
FileNotFoundError: [Errno 2] No such file or directory: "/usr/local/src/gem5/fatal: unsafe repository ('/usr/local/src/gem5' is owned by someone else)\nTo add an exception for this directory, call:\n\n\tgit config --global --add safe.directory /usr/local/src/gem5/hooks":
As the question was tagged with podman (in addition to docker), here is a Podman solution to the problem of mapping users between the host and the container:
If you want to map your regular user on the host to a user with the same UID inside the container, you can add the Podman option --userns=keep-id. A more general solution (that also works when the UIDs are not the same) is described in two tips in Podman's troubleshooting.md, which make use of the options --uidmap and --gidmap. (I wrote those tips.)
The two options --uidmap and --gidmap may look a bit complicated to use, but once you understand how rootless Podman maps UIDs and GIDs, it is pretty straightforward.
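Whichever runtime is used, the failure in the question appears to come down to a UID mismatch: the image chowns /usr/local/src/gem5 to user, which useradd --system gave its own UID, while the container runs with --user 1000:1000, which the listing shows under the name docker. The sketch below makes that comparison explicit. The function name check_owner is hypothetical, and it assumes a stat that accepts -c (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# Print "match" when the directory's owner UID equals the UID of the
# current process, otherwise print both UIDs so the mismatch is visible.
check_owner() {
    dir="${1:-.}"
    owner_uid=$(stat -c '%u' "$dir")   # numeric owner of the directory
    my_uid=$(id -u)                    # UID the shell is running as
    if [ "$owner_uid" = "$my_uid" ]; then
        echo "match"
    else
        echo "mismatch: owner=$owner_uid current=$my_uid"
    fi
}
```

Inside the container, check_owner /usr/local/src/gem5 would show which UID owns the cloned tree. If it differs from 1000, either chown the tree to that numeric UID in the Dockerfile or start the container as the UID that owns it. The git "unsafe repository" message in the build error is the same ownership mismatch, reported by git.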

files in docker image always owned by root

My docker file
FROM mcr.microsoft.com/playwright:focal as influencer-scraper
USER root
# Install tools as telnet
RUN apt-get update && apt-get install telnet -y
# RUN apk add chromium
RUN groupadd --gid 888 node \
&& useradd --uid 888 --gid node --shell /bin/bash --create-home node
USER node
WORKDIR /home/node
# Copy package.json and Yarn install (separate for cache)
COPY ./package.json ./
COPY ./yarn.lock ./
RUN yarn
# Copy everything and build
COPY . .
# Copy other config files
COPY ./.env ./.env
# Entry point
ENTRYPOINT ["yarn", "start"]
CMD ["--mongodb", "host.docker.internal:27017"]
However, after I log in to the Docker image, I find that all files are owned by root, which causes trouble at runtime.
➜ influencer-scraper-js git:(master) ✗ docker run -it --entrypoint /bin/bash influencer-scraper:v0.1-6-gfe17ad4962-dirty
node@bce54c1024db:~$ ls -l
total 52
-rw-r--r--. 1 root root 542 Apr 16 04:15 Docker.md
-rw-r--r--. 1 root root 589 Apr 16 05:03 Dockerfile
-rw-r--r--. 1 root root 570 Apr 16 03:58 Makefile
-rw-r--r--. 1 root root 358 Apr 13 01:27 README.md
drwxr-xr-x. 1 root root 20 Apr 16 03:58 config
drwxr-xr-x. 1 root root 16 Apr 16 03:58 data
drwxr-xr-x. 1 root root 14 Apr 12 06:00 docker
-rw-r--r--. 1 root root 558 Apr 16 03:58 docker-compose.yml
drwxr-xr-x. 1 root root 140 Apr 13 01:27 generated
drwxr-xr-x. 1 root root 1676 Apr 16 04:47 node_modules
-rw-r--r--. 1 root root 583 Apr 16 03:58 package.json
drwxr-xr-x. 1 root root 34 Apr 13 01:27 proxy
drwxr-xr-x. 1 root root 40 Apr 13 01:27 src
-rw-r--r--. 1 root root 26230 Apr 16 03:58 yarn.lock
How can I resolve this? I would like the workdir to be still owned by user node.
Quoting Docker Documentation : https://docs.docker.com/engine/reference/builder/#copy
COPY has two forms:
COPY [--chown=<user>:<group>] <src>... <dest>
COPY [--chown=<user>:<group>] ["<src>",... "<dest>"]
If you do not specify a user with --chown, the default owner is root:
All new files and directories are created with a UID and GID of 0, unless the optional --chown flag specifies a given username, groupname, or UID/GID combination to request specific ownership of the copied content.
You can also try doing chown after copying.
chown root:node filename
The file listing you show looks almost correct to me. You want most of the files to be owned by root and not be world-writeable: in the event that there's some security issue or other bug in your code, you don't want that to accidentally overwrite your source files, static assets, or other content.
This means you need the actual writeable data to be stored in a different directory, and your listing includes a data directory which presumably serves this role. You can chown it in your Dockerfile.
For clarity, it helps to stay as the root user until the very end of the file, and then you can declare the alternate user to actually run the container.
# USER root (if required)
RUN chown node data
...
USER node
CMD ["yarn", "start"]
When you launch the container, you can mount a volume on that specific directory. This setup should work as-is with a named volume
docker run \
-v app_data:/home/node/data \
...
If you want/need to use a host directory to store the data, you also need to specify the host user ID that owns the directory (typically the current user). Again, the application code will be owned by root and world-readable, so this won't change; it's only the data directory whose contents and ownership matter.
docker run \
-u $(id -u) \
-v "$(pwd)/app_data:/home/node/data" \
...
(Do not use volumes to replace the application code or libraries in the container. In this particular case, doing that would obscure this configuration problem in the Dockerfile, and your container setup would fail when you tried to deploy to production without the local-build volumes.)
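The host-directory variant can be wrapped in a small helper. This is a sketch under stated assumptions: the function name run_with_host_data and the DOCKER override variable are hypothetical (the override exists only so the command can be previewed with DOCKER=echo), and the image name is whatever you built. It also passes the group ID, which the command above leaves out.

```shell
#!/bin/sh
# DOCKER is a hypothetical override for previewing the command
# (for example DOCKER=echo); by default the real docker CLI is used.
run_with_host_data() {
    image="$1"
    data_dir="$(pwd)/app_data"
    # Create the directory first so it is owned by the invoking user
    # rather than being created by the Docker daemon as root.
    mkdir -p "$data_dir"
    "${DOCKER:-docker}" run --rm \
        -u "$(id -u):$(id -g)" \
        -v "$data_dir:/home/node/data" \
        "$image"
}
```

Creating app_data before the run matters because a missing bind-mount source given with -v is created by the daemon, and then the unprivileged container user cannot write to it.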

Unable to run kafka connect datagen inside kafka connect docker image

I am trying to run the Kafka datagen connector inside a kafka-connect container against a Kafka cluster that resides in AWS MSK, using this Dockerfile: https://github.com/confluentinc/kafka-connect-datagen/blob/master/Dockerfile-confluenthub
I am using Kafdrop as a web UI for the Kafka brokers (MSK). I don't see Kafka datagen generating any test messages.
Is there any other configuration I need to do besides installing the kafka-datagen connector?
Also, how can I check inside confluentinc/kafka-connect image what topics are created and whether messages are consumed or not?
Dockerfile looks like :
ARG BASE_PREFIX=confluentinc
ARG CONNECT_IMAGE=cp-kafka-connect
FROM $BASE_PREFIX/$CONNECT_IMAGE:6.1.0
ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-datagen:0.4.0
docker exec 51e32e20b292 bash -c 'echo $CONNECT_PLUGIN_PATH'
shows : /usr/share/java,/usr/share/confluent-hub-components
[appuser@88db8385b575 ~]$ ls -la /usr/share/confluent-hub-components/
total 20
drwxr-xr-x 1 appuser appuser 4096 Mar 26 21:19 .
drwxr-xr-x 1 root root 4096 Feb 4 21:10 ..
drwxr-xr-x 6 appuser appuser 4096 Mar 26 18:00 confluentinc-kafka-connect-datagen
[appuser@88db8385b575 ~]$ ls -la /usr/share/confluent-hub-components/confluentinc-kafka-connect-datagen/
total 28
drwxr-xr-x 6 appuser appuser 4096 Mar 26 18:00 .
drwxr-xr-x 1 appuser appuser 4096 Mar 26 21:19 ..
drwxr-xr-x 2 appuser appuser 4096 Mar 26 18:00 assets
drwxr-xr-x 4 appuser appuser 4096 Mar 26 18:00 doc
drwxr-xr-x 2 appuser appuser 4096 Mar 26 18:00 etc
drwxr-xr-x 2 appuser appuser 4096 Mar 26 18:00 lib
-rw-r--r-- 1 appuser appuser 1380 Mar 26 18:00 manifest.json
Docker logs :
docker logs 51e32e20b292 | grep "DatagenConnector"
"connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
"connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
I just added RUN confluent-hub install --no-prompt confluentinc/kafka-connect-datagen:0.4.0 to the Dockerfile. Nothing else. No error logs.
That alone doesn't run the connector; it only makes it available to the Connect API. Note the curl example in the docs: https://github.com/confluentinc/kafka-connect-datagen#run-connector-in-docker-compose
So, expose port 8083 and make the request to add the connector, and make sure to add all the relevant environment variables when you're running the container
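The request in question goes to the Kafka Connect REST API. The following is a rough sketch rather than the exact command from the linked docs: the connector name datagen-users, the topic users, the interval, and the CONNECT_URL default are all placeholder assumptions, and the converter settings may need adjusting for your cluster.

```shell
#!/bin/sh
# CONNECT_URL, the connector name and the topic are placeholders.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# Emit the JSON body that registers a datagen connector instance.
datagen_payload() {
    cat <<'EOF'
{
  "name": "datagen-users",
  "config": {
    "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
    "kafka.topic": "users",
    "quickstart": "users",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "max.interval": "1000",
    "tasks.max": "1"
  }
}
EOF
}

# POST the payload to the Connect REST API to create the connector.
create_connector() {
    datagen_payload | curl -sS -X POST \
        -H "Content-Type: application/json" \
        --data @- "$CONNECT_URL/connectors"
}

# Ask Connect whether the connector and its task are RUNNING.
connector_status() {
    curl -sS "$CONNECT_URL/connectors/datagen-users/status"
}
```

After create_connector succeeds, connector_status should report the connector state, and the users topic should begin to receive messages that are visible in Kafdrop.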

How can I use "docker run --user" but with root privileges

I have a Docker image which contains an analysis pipeline. To run this pipeline, I need to provide input data and I want to keep the outputs. This pipeline must be able to be run by other users than myself, on their own laptops.
Briefly, my root (/) folder structure is as follows:
total 72
drwxr-xr-x 1 root root 4096 May 29 15:38 bin
drwxr-xr-x 2 root root 4096 Feb 1 17:09 boot
drwxr-xr-x 5 root root 360 Jun 1 15:31 dev
drwxr-xr-x 1 root root 4096 Jun 1 15:31 etc
drwxr-xr-x 2 root root 4096 Feb 1 17:09 home
drwxr-xr-x 1 root root 4096 May 29 15:49 lib
drwxr-xr-x 2 root root 4096 Feb 24 00:00 lib64
drwxr-xr-x 2 root root 4096 Feb 24 00:00 media
drwxr-xr-x 2 root root 4096 Feb 24 00:00 mnt
drwxr-xr-x 1 root root 4096 Mar 12 19:38 opt
drwxr-xr-x 1 root root 4096 Jun 1 15:24 pipeline
dr-xr-xr-x 615 root root 0 Jun 1 15:31 proc
drwx------ 1 root root 4096 Mar 12 19:38 root
drwxr-xr-x 3 root root 4096 Feb 24 00:00 run
drwxr-xr-x 1 root root 4096 May 29 15:38 sbin
drwxr-xr-x 2 root root 4096 Feb 24 00:00 srv
dr-xr-xr-x 13 root root 0 Apr 29 10:14 sys
drwxrwxrwt 1 root root 4096 Jun 1 15:25 tmp
drwxr-xr-x 1 root root 4096 Feb 24 00:00 usr
drwxr-xr-x 1 root root 4096 Feb 24 00:00 var
The pipeline scripts are in /pipeline and are packaged into the image with a "COPY . /pipeline" instruction in my Dockerfile.
For various reasons, this pipeline (which is a legacy pipeline) is set up so that the input data must be in a folder such as /pipeline/project. To run my pipeline, I use:
docker run --rm --mount type=bind,source=$(pwd),target=/pipeline/project --user "$(id -u):$(id -g)" pipelineimage:v1
In other words, I mount a folder with the data to /pipeline/project. I found I needed to use --user to ensure the output files would have the correct permissions, i.e. that I would have read/write/exec access on my host computer after the container exits.
The pipeline runs, but I have one issue: one particular piece of software used by the pipeline automatically tries to create (and I can't change that) one folder in $HOME (so /, which I showed above) and one folder in my WORKDIR (which I have set in my Dockerfile to /pipeline). These attempts fail, and I'm guessing it's because I am not running the pipeline as root. But I need to use --user to make sure my outputs have the correct permissions, i.e. that I don't require sudo rights to read these outputs.
My question is: how am I meant to handle this? It seems that by using --user, I have the correct permissions set for the mounted folder (/pipeline/project), where many output files are successfully created. But how can I ensure the other two folders are correctly created outside of that mount?
I have tried the following, without success:
Doing "COPY --chown=myhostuid:mygroupid . pipeline/". This works, but I have to hardcode my UID and GID, so it won't work if a colleague tries to run the image.
Adding a new user with sudo rights and making it run the image: "RUN useradd -r newuser -g sudo" (I also tried the "root" group, without success). This just gives me outputs which require sudo rights to read/write/exec, which is not what I want.
Am I missing something? I don't understand why it's "easy" to handle permissions for a mounted folder but so much harder for the other folders in a container. Thanks.
If your software doesn't rely on relative paths (~/, ./), you can just set $HOME and WORKDIR to a directory that any user can write:
ENV HOME=/tmp
WORKDIR /tmp
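The HOME part of this can also be tried at run time without rebuilding the image. A minimal sketch, assuming the image and mount from the question: the function name run_with_tmp_home and the DOCKER override variable are hypothetical (the override only allows a dry run with DOCKER=echo).

```shell
#!/bin/sh
# DOCKER is a hypothetical override for previewing the command
# (for example DOCKER=echo); by default the real docker CLI is used.
run_with_tmp_home() {
    image="$1"
    "${DOCKER:-docker}" run --rm \
        --mount "type=bind,source=$(pwd),target=/pipeline/project" \
        --user "$(id -u):$(id -g)" \
        -e HOME=/tmp \
        "$image"
}
```

This only redirects the folder created under $HOME. The working directory can be overridden in the same way with -w, but only if the pipeline does not depend on being started from /pipeline.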
If you can't do that, you can pass the uid/gid via the environment to an entrypoint script running as root, chown/chmod as necessary, then drop privileges to run the pipeline (runuser, su, sudo, setuidgid).
For example (untested):
entrypoint.sh
#!/bin/bash
[[ -v "RUN_UID" ]] || { echo "unset RUN_UID" >&2; exit 1; }
[[ -v "RUN_GID" ]] || { echo "unset RUN_GID" >&2; exit 1; }
# chown, chmod, set env, etc.
chown "$RUN_UID:$RUN_GID" "/path/that/requires/write/permissions"
export HOME=/tmp
# Run the pipeline as a non-root user.
sudo -E -u "#$RUN_UID" -g "#$RUN_GID" /path/to/pipeline
Dockerfile
...
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
Finally, pass the user and group IDs via the environment when running:
docker run --rm --mount type=bind,source=$(pwd),target=/pipeline/project -e RUN_UID=$(id -u) -e RUN_GID=$(id -g) pipelineimage:v1

Docker: file permissions with --volume bind mount

I'm following the guidelines from: https://denibertovic.com/posts/handling-permissions-with-docker-volumes/ to setup a --volume bind mount in my container and creating a user in the guest container with the same UID as my host user - the theory being that my container user should be able to access the mount. It's not working for me and I'm looking for some pointers to try next.
More background details:
My Dockerfile starts from an alpine base and adds python dev packages. It copies across an entrypoint.sh script per the guidelines from denibertovic. It then jumps to the entrypoint.sh script.
FROM alpine
RUN apk update
RUN apk add bash
RUN apk add python3
RUN apk add python3-dev
RUN apk add su-exec
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
The entrypoint.sh script adds a user to the container with the UID passed in as an environment variable.
#!/bin/bash
# Add local user
# Either use the LOCAL_USER_ID if passed in at runtime or
# fallback
USER_ID=${LOCAL_USER_ID:-9001}
echo "Starting with UID : $USER_ID"
adduser -s /bin/bash -u $USER_ID -H -D user
export HOME=/home/user
su-exec user "$@"
The container builds no problem.
I then run it with the following command line:
sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws django-runtime /bin/bash
You'll see that I'm passing in my host UID to be mapped to the container user's UID and I'm asking for a volume bind mount from my local working directory to the /ws mountpoint in the container.
From the bash shell inside the container I can see that /ws is owned by the 'user' UID matching my own 'id'. However, when I go to list the contents of /ws I get a Permission Denied error as follows:
[dleclair@localhost runtime]$ sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws:Z django-runtime /bin/bash
[sudo] password for dleclair:
Starting with UID : 1000
bash-5.0$ id
uid=1000(user) gid=1000(user) groups=1000(user)
bash-5.0$ ls -la .
total 0
drwxr-xr-x 1 root root 27 Feb 8 09:15 .
drwxr-xr-x 1 root root 27 Feb 8 09:15 ..
-rwxr-xr-x 1 root root 0 Feb 8 09:15 .dockerenv
drwxr-xr-x 1 root root 18 Feb 8 07:44 bin
drwxr-xr-x 5 root root 360 Feb 8 09:15 dev
drwxr-xr-x 1 root root 91 Feb 8 09:15 etc
drwxr-xr-x 2 root root 6 Jan 16 21:52 home
drwxr-xr-x 1 root root 17 Jan 16 21:52 lib
drwxr-xr-x 5 root root 44 Jan 16 21:52 media
drwxr-xr-x 2 root root 6 Jan 16 21:52 mnt
drwxr-xr-x 2 root root 6 Jan 16 21:52 opt
dr-xr-xr-x 119 root root 0 Feb 8 09:15 proc
drwx------ 2 root root 6 Jan 16 21:52 root
drwxr-xr-x 1 root root 21 Feb 8 07:44 run
drwxr-xr-x 1 root root 21 Feb 8 08:22 sbin
drwxr-xr-x 2 root root 6 Jan 16 21:52 srv
dr-xr-xr-x 13 root root 0 Feb 8 01:58 sys
drwxrwxrwt 2 root root 6 Jan 16 21:52 tmp
drwxr-xr-x 1 root root 19 Feb 8 07:44 usr
drwxr-xr-x 1 root root 19 Jan 16 21:52 var
drwxrwxr-x 5 user user 111 Feb 8 02:15 ws
bash-5.0$
bash-5.0$
bash-5.0$ cd /ws
bash-5.0$ ls -la
ls: can't open '.': Permission denied
total 0
bash-5.0$
Appreciate any pointers anyone can offer. Thanks!
After more searching I found the answer to my problem here: Permission denied on accessing host directory in Docker and here: http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/.
In short, the problem was that the default SELinux labels on the volume mount blocked access to the mounted files. The solution was to add a ':Z' suffix to the -v command line argument, which makes Docker relabel the mounted files so that the container is allowed to access them.
The command line therefore became:
sudo docker run -it -e LOCAL_USER_ID=`id -u` -v `realpath ../..`:/ws:Z django-runtime /bin/bash
Worked like a charm.
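For scripts shared between SELinux and non-SELinux hosts, the suffix can be added conditionally. This is a sketch only: the function names selinux_mount_suffix and ws_volume_arg are hypothetical, and it assumes getenforce is how the host reports its SELinux mode (it is treated as "not enforcing" when the tool is absent).

```shell
#!/bin/sh
# Print ":Z" when SELinux is enforcing on the host, otherwise print nothing.
# getenforce ships with the SELinux userland tools and may be absent elsewhere.
selinux_mount_suffix() {
    if command -v getenforce >/dev/null 2>&1 &&
       [ "$(getenforce)" = "Enforcing" ]; then
        echo ":Z"
    fi
}

# Build the -v argument for a host directory mounted at /ws.
ws_volume_arg() {
    host_dir="$1"
    echo "$host_dir:/ws$(selinux_mount_suffix)"
}
```

It could be used as docker run -it -v "$(ws_volume_arg "$(realpath ../..)")" django-runtime /bin/bash. Note that :Z relabels the host files for that container's private use, so it should not be pointed at system directories such as /home or /usr.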

Resources