ADD command in Dockerfile downloads jars as root user - docker

During the docker build of my image, the downloaded jar packages can't be used by the defined USER because they are owned by root.
The weird thing is that I put USER before the instruction that downloads the jar files, so I expected the download to run as USER 65534 rather than root.
FROM myimage:1.0
USER 65534
ADD [ \
"https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar", \
"https://repo1.maven.org/maven2/com/typesafe/akka/akka-actor_2.13/2.6.5/akka-actor_2.13-2.6.5.jar", \
"https://repo1.maven.org/maven2/com/typesafe/akka/akka-osgi_2.13/2.6.5/akka-osgi_2.13-2.6.5.jar", \
"https://repo1.maven.org/maven2/com/typesafe/akka/akka-slf4j_2.13/2.6.5/akka-slf4j_2.13-2.6.5.jar", \
"https://repo1.maven.org/maven2/com/typesafe/akka/akka-stream_2.13/2.6.5/akka-stream_2.13-2.6.5.jar", \
"/tmp/myfolder/lib/" ]
Then, looking inside the container, I can see that these packages are owned by root and are not usable by the defined USER.
ls -alt
-rw------- 1 root root 2433561 Apr 30 09:09 akka-remote_2.13-2.6.5.jar
-rw------- 1 root root 4665057 Apr 30 09:06 akka-stream_2.13-2.6.5.jar
-rw------- 1 root root 17078 Apr 30 09:05 akka-slf4j_2.13-2.6.5.jar
-rw------- 1 root root 25253 Apr 30 09:04 akka-osgi_2.13-2.6.5.jar
-rw------- 1 root root 3598880 Apr 30 09:02 akka-actor_2.13-2.6.5.jar
What could be the issue?

USER does not affect ADD or COPY, so Docker added chown flags to these commands. You have more info here https://docs.docker.com/engine/reference/builder/#add.
So you can do something like this to change the ownership during ADD or COPY.
ADD [--chown=<user>:<group>] <src>... <dest>
ADD [--chown=<user>:<group>] ["<src>",... "<dest>"]
Also from the docs of USER https://docs.docker.com/engine/reference/builder/#user:
The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use when running the image and for any RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile.
So USER is limited to RUN, CMD and ENTRYPOINT instructions.
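As a sketch of how that could look for the Dockerfile in the question, the shell snippet below writes a variant that passes --chown to ADD. The base image, URL, and destination are taken from the question; using 65534 as the group ID as well, the single-URL shell form, and the myimage-with-jars tag are assumptions for illustration, and it presumes a Docker version that supports --chown on ADD.

```shell
#!/bin/sh
# Write an example Dockerfile into a scratch directory (sketch only).
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
FROM myimage:1.0
USER 65534
# USER does not change who owns ADDed files; --chown does.
# 65534:65534 is an assumed UID:GID pair matching the USER above.
ADD --chown=65534:65534 \
    https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar \
    /tmp/myfolder/lib/
EOF
echo "Dockerfile written to $workdir/Dockerfile"
# To build it (needs Docker and the base image from the question):
#   docker build -t myimage-with-jars "$workdir"
```

The same flag can be put in front of the JSON-array form used in the question, so the other jars can stay in a single ADD instruction.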

Related

Almost all files are created by root user in my Docker image

This is my Dockerfile:
FROM python:3.10.5-alpine
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
RUN adduser -D appuser
USER appuser
WORKDIR /home/appuser/
COPY requirements.txt .
RUN python -m pip install --user --no-cache-dir --disable-pip-version-check --requirement requirements.txt
COPY . .
ENTRYPOINT [ "./entrypoint.sh" ]
So I create a user called appuser and switch to it as soon as I can, before copying anything (I've checked that both the user and its home folder are created).
But when I browse the filesystem of my image:
~ $ ls -l
total 156
-rwxr-xr-x 1 root root 335 Jul 28 10:57 Dockerfile
-rw-r--r-- 1 appuser appuser 131072 Jul 28 12:28 db.sqlite3
-rwxr-xr-x 1 root root 150 Jul 28 11:37 entrypoint.sh
-rwxr-xr-x 1 root root 685 Jul 28 10:04 manage.py
drwxr-xr-x 2 root root 4096 Jul 28 10:56 project
-rwxr-xr-x 1 root root 41 Jul 28 11:56 requirements.txt
drwxr-xr-x 2 root root 4096 Jul 28 11:50 static
drwxr-xr-x 5 root root 4096 Jul 28 10:05 venv
... almost everything belongs to root user and this gives me several permission denied errors.
What is my mistake? I assumed Docker wouldn't operate as root once I've switched the user.
I know I can add RUN mkdir ~/static to the Dockerfile and get over it, but then what the documentation says about USER command doesn't make sense to me:
The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use as the default user and group for the remainder of the current stage.
Use the optional flag --chown=<user>:<group> with either the ADD or COPY commands.
For example:
COPY --chown=appuser:appuser . .
docker docs
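One way to check whether --chown took effect is to list anything under the working directory that is not owned by the expected user. A small sketch, where list_not_owned_by is a hypothetical helper name and /home/appuser is the WORKDIR from the question:

```shell
#!/bin/sh
# list_not_owned_by USER DIR
# Print every path under DIR whose owner is not USER (name or numeric UID).
# An empty result means everything under DIR belongs to USER.
list_not_owned_by() {
    find "$2" ! -user "$1"
}

# Example, run inside the container:
#   list_not_owned_by appuser /home/appuser
```

If this still prints files such as manage.py or entrypoint.sh after a rebuild, the COPY that added them most likely ran without --chown.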

files in docker image always owned by root

My docker file
FROM mcr.microsoft.com/playwright:focal as influencer-scraper
USER root
# Install tools as telnet
RUN apt-get update && apt-get install telnet -y
# RUN apk add chromium
RUN groupadd --gid 888 node \
&& useradd --uid 888 --gid node --shell /bin/bash --create-home node
USER node
WORKDIR /home/node
# Copy package.json and Yarn install (separate for cache)
COPY ./package.json ./
COPY ./yarn.lock ./
RUN yarn
# Copy everything and build
COPY . .
# Copy other config files
COPY ./.env ./.env
# Entry point
ENTRYPOINT ["yarn", "start"]
CMD ["--mongodb", "host.docker.internal:27017"]
However, after I log in to a container started from the image, I find that all files are owned by root, which causes trouble at runtime:
➜ influencer-scraper-js git:(master) ✗ docker run -it --entrypoint /bin/bash influencer-scraper:v0.1-6-gfe17ad4962-dirty
node@bce54c1024db:~$ ls -l
total 52
-rw-r--r--. 1 root root 542 Apr 16 04:15 Docker.md
-rw-r--r--. 1 root root 589 Apr 16 05:03 Dockerfile
-rw-r--r--. 1 root root 570 Apr 16 03:58 Makefile
-rw-r--r--. 1 root root 358 Apr 13 01:27 README.md
drwxr-xr-x. 1 root root 20 Apr 16 03:58 config
drwxr-xr-x. 1 root root 16 Apr 16 03:58 data
drwxr-xr-x. 1 root root 14 Apr 12 06:00 docker
-rw-r--r--. 1 root root 558 Apr 16 03:58 docker-compose.yml
drwxr-xr-x. 1 root root 140 Apr 13 01:27 generated
drwxr-xr-x. 1 root root 1676 Apr 16 04:47 node_modules
-rw-r--r--. 1 root root 583 Apr 16 03:58 package.json
drwxr-xr-x. 1 root root 34 Apr 13 01:27 proxy
drwxr-xr-x. 1 root root 40 Apr 13 01:27 src
-rw-r--r--. 1 root root 26230 Apr 16 03:58 yarn.lock
How can I resolve this? I would like the workdir to be still owned by user node.
Quoting Docker Documentation : https://docs.docker.com/engine/reference/builder/#copy
COPY has two forms:
COPY [--chown=<user>:<group>] <src>... <dest>
COPY [--chown=<user>:<group>] ["<src>",... "<dest>"]
If you do not specify any user in --chown, the default used is root
All new files and directories are created with a UID and GID of 0, unless the optional --chown flag specifies a given username, groupname, or UID/GID combination to request specific ownership of the copied content.
You can also run chown after copying, in a RUN step that executes as root:
RUN chown root:node filename
The file listing you show looks almost correct to me. You want most of the files to be owned by root and not be world-writeable: in the event that there's some security issue or other bug in your code, you don't want that to accidentally overwrite your source files, static assets, or other content.
This means you need the actual writeable data to be stored in a different directory, and your listing includes a data directory which presumably serves this role. You can chown it in your Dockerfile.
For clarity, it helps to stay as the root user until the very end of the file, and then you can declare the alternate user to actually run the container.
# USER root (if required)
RUN chown node data
...
USER node
CMD ["yarn", "start"]
When you launch the container, you can mount a volume on that specific directory. This setup should work as-is with a named volume:
docker run \
-v app_data:/home/node/data \
...
If you want/need to use a host directory to store the data, you also need to specify the host user ID that owns the directory (typically the current user). Again, the application code will be owned by root and world-readable, so this won't change; it's only the data directory whose contents and ownership matter.
docker run \
-u $(id -u) \
-v "$(pwd)/app_data:/home/node/data" \
...
(Do not use volumes to replace the application code or libraries in the container. In this particular case, doing that would obscure this configuration problem in the Dockerfile, and your container setup would fail when you tried to deploy to production without the local-build volumes.)

How can I use "docker run --user" but with root privileges

I have a Docker image which contains an analysis pipeline. To run this pipeline, I need to provide input data and I want to keep the outputs. This pipeline must be able to be run by other users than myself, on their own laptops.
Briefly, my root (/) folder structure is as follows:
total 72
drwxr-xr-x 1 root root 4096 May 29 15:38 bin
drwxr-xr-x 2 root root 4096 Feb 1 17:09 boot
drwxr-xr-x 5 root root 360 Jun 1 15:31 dev
drwxr-xr-x 1 root root 4096 Jun 1 15:31 etc
drwxr-xr-x 2 root root 4096 Feb 1 17:09 home
drwxr-xr-x 1 root root 4096 May 29 15:49 lib
drwxr-xr-x 2 root root 4096 Feb 24 00:00 lib64
drwxr-xr-x 2 root root 4096 Feb 24 00:00 media
drwxr-xr-x 2 root root 4096 Feb 24 00:00 mnt
drwxr-xr-x 1 root root 4096 Mar 12 19:38 opt
drwxr-xr-x 1 root root 4096 Jun 1 15:24 pipeline
dr-xr-xr-x 615 root root 0 Jun 1 15:31 proc
drwx------ 1 root root 4096 Mar 12 19:38 root
drwxr-xr-x 3 root root 4096 Feb 24 00:00 run
drwxr-xr-x 1 root root 4096 May 29 15:38 sbin
drwxr-xr-x 2 root root 4096 Feb 24 00:00 srv
dr-xr-xr-x 13 root root 0 Apr 29 10:14 sys
drwxrwxrwt 1 root root 4096 Jun 1 15:25 tmp
drwxr-xr-x 1 root root 4096 Feb 24 00:00 usr
drwxr-xr-x 1 root root 4096 Feb 24 00:00 var
The pipeline scripts are in /pipeline and are packaged into the image with a "COPY . /pipeline" instruction in my Dockerfile.
For various reasons, this pipeline (which is a legacy pipeline) is set up so that the input data must be in a folder such /pipeline/project. To run my pipeline, I use:
docker run --rm --mount type=bind,source=$(pwd),target=/pipeline/project --user "$(id -u):$(id -g)" pipelineimage:v1
In other words, I mount a folder with the data to /pipeline/project. I found I needed to use --user to ensure the output files would have the correct permissions - i.e. I would have read/write/exec access on my host computer after the container exits.
The pipeline runs, but I have one issue: one particular piece of software used by the pipeline automatically tries to create (and I can't change that) one folder in $HOME (so / - which I showed above) and one folder in my WORKDIR (which I have set in my Dockerfile to /pipeline). These attempts fail, and I'm guessing it's because I am not running the pipeline as root. But I need to use --user to make sure my outputs have the correct permissions - i.e. that I don't need sudo rights to read these outputs etc.
My question is: how am I meant to handle this? It seems that by using --user, I have the correct permissions set for the mounted folder (/pipeline/project), where many output files are successfully created, no problems there. But how can I ensure the other 2 folders are correctly created outside of that mount?
I have tried the following, without success:
Doing "COPY --chown=myhostuid:mygroupid . pipeline/". This works, but I have to hardcode my uid and gid, so it won't work if a colleague tries to run the image.
Adding a new user with sudo rights and making it run the image: "RUN useradd -r newuser -g sudo" (I also tried using the "root" group, without success). This just gives me outputs which require sudo rights to read/write/exec, which is not what I want.
Am I missing something? I don't understand why it's "easy" to handle permissions for a mounted folder but so much harder for the other folders in a container. Thanks.
If your software doesn't rely on relative paths (~/, ./), you can just set $HOME and WORKDIR to a directory that any user can write:
ENV HOME=/tmp
WORKDIR /tmp
If you can't do that, you can pass the uid/gid via the environment to an entrypoint script running as root, chown/chmod as necessary, then drop privileges to run the pipeline (runuser, su, sudo, setuidgid).
For example (untested):
entrypoint.sh
#!/bin/bash
# Fail early if the caller did not pass the target IDs.
[[ -v RUN_UID ]] || { echo "unset RUN_UID" >&2; exit 1; }
[[ -v RUN_GID ]] || { echo "unset RUN_GID" >&2; exit 1; }
# chown, chmod, set env, etc. (this part still runs as root)
chown "$RUN_UID:$RUN_GID" "/path/that/requires/write/permissions"
export HOME=/tmp
# Run the pipeline as a non-root user; exec replaces this shell with sudo.
exec sudo -E -u "#$RUN_UID" -g "#$RUN_GID" /path/to/pipeline
Dockerfile
...
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
Finally, pass the user and group IDs via the environment when running:
docker run --rm --mount type=bind,source="$(pwd)",target=/pipeline/project -e RUN_UID="$(id -u)" -e RUN_GID="$(id -g)" pipelineimage:v1

Files inside a docker image disappear when mounting a volume

My Docker image has several files in its /tmp directory.
Example
/tmp # ls -al
total 4684
drwxrwxrwt 1 root root 4096 May 19 07:09 .
drwxr-xr-x 1 root root 4096 May 19 08:13 ..
-rw-r--r-- 1 root root 156396 Apr 24 07:12 6359688847463040695.jpg
-rw-r--r-- 1 root root 150856 Apr 24 06:46 63596888545973599910.jpg
-rw-r--r-- 1 root root 142208 Apr 24 07:07 63596888658550828124.jpg
-rw-r--r-- 1 root root 168716 Apr 24 07:12 63596888674472576435.jpg
-rw-r--r-- 1 root root 182211 Apr 24 06:51 63596888734768961426.jpg
-rw-r--r-- 1 root root 322126 Apr 24 06:47 6359692693565384673.jpg
-rw-r--r-- 1 root root 4819 Apr 24 06:50 635974329998579791105.png
I start a container from this image with the following command:
sudo docker run -v /home/media/simple_dir2:/tmp -d simple_backup
I expect the files to show up when I run ls -al /home/media/simple_dir2.
But in fact nothing exists in /home/media/simple_dir2.
On the other hand, if I run the same image without the volume option such as:
sudo docker run -d simple_backup
And enter that container using:
sudo docker exec -it <simple_backup container id> /bin/sh
ls -al /tmp
Then the files exist.
TL;DR
I want to mount a volume (directory) on the host, and have it filled with the files which are inside of the docker image.
My env
Ubuntu 18.04
Docker 19.03.6
From: https://docs.docker.com/storage/bind-mounts/
Mount into a non-empty directory on the container
If you bind-mount into a non-empty directory on the container, the directory’s existing contents are obscured by the bind mount. This can be beneficial, such as when you want to test a new version of your application without building a new image. However, it can also be surprising and this behavior differs from that of docker volumes.
"So, if host os's directory is empty, then container's directory will override is that right?"
Nope, it doesn't compare them for which one has files; it just overrides the folder on the container with the one on the host no matter what.
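If the goal is simply to get the image's files onto the host, one option is to copy them out with docker cp instead of relying on the bind mount. A hedged sketch: copy_image_dir is a hypothetical helper name, while simple_backup, /tmp, and /home/media/simple_dir2 come from the question.

```shell
#!/bin/sh
# copy_image_dir IMAGE SRC_DIR HOST_DIR
# Copy the contents of SRC_DIR inside IMAGE into HOST_DIR on the host.
# The body runs in a subshell, so "exit" only leaves the function.
copy_image_dir() (
    image=$1
    src=$2
    dest=$3
    mkdir -p "$dest" || exit 1
    # Create (but do not start) a container so its filesystem can be read.
    cid=$(docker create "$image") || exit 1
    # "SRC/." copies the directory's contents rather than the directory itself.
    docker cp "$cid:$src/." "$dest"
    status=$?
    # Remove the temporary container whether or not the copy succeeded.
    docker rm "$cid" >/dev/null
    exit "$status"
)

# Example:
#   copy_image_dir simple_backup /tmp /home/media/simple_dir2
```

After the copy, bind-mounting /home/media/simple_dir2 over /tmp shows the same files inside the container, because the host directory now holds them.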

Pass along user credentials in Docker

Using a Docker application, I want to run an app as Daemon:
docker run -v $(pwd)/:/src -dit --name DOCKER_NAME my-app
And then execute a Python script from the mounted drive:
docker exec -w /src DOCKER_NAME python my_script.py
This Python script generates some files and figures, that I would later want to use. However, I have an issue that the files generated from within the Docker app have different rights than my outer environment.
[2D] drwxrwxr-x 5 jenkins_slave jenkins_slave 4096 Mar 21 10:47 .
[2D] drwxrwxr-x 24 jenkins_slave jenkins_slave 4096 Mar 21 10:46 ..
[2D] drwxrwxr-x 2 jenkins_slave jenkins_slave 4096 Mar 21 10:46 my_script.py
[2D] -rw-r--r-- 1 root root 268607 Mar 21 10:46 spaider_2d_0_000.png
[2D] -rw-r--r-- 1 root root 271945 Mar 21 10:46 spaider_2d_0_001.png
[2D] -rw-r--r-- 1 root root 283299 Mar 21 10:46 spaider_2d_0_010.png
In the above example, the latter 3 files are generated from within the Docker mount.
Can I in any way specify that the Docker app should be run with same credentials as the outer environment, and/or the generated files should have certain permissions?
Use Docker's -u/--user option to set the user and group that the container runs as.
For example, if I would like to run the container not by root but by myself, I can do the following:
user=$(id -u)
group=$(id -g)
docker run -it -u "$user:$group" <IMAGE_NAME> <COMMAND>
Inside the container, the user ID is now the same as the one on the host.
$ whoami
whoami: unknown uid 1000
Yes the username becomes unknown, but I guess you will not bother with it. You are doing this to set the correct permissions, not to get a nicely displayed name, right?
P.S., Docs here: https://docs.docker.com/engine/reference/run/#user
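Since the question runs the script through docker exec rather than docker run, the same option can be passed there. A minimal sketch, assuming the container from the question is already running with the source mounted at /src; exec_as_host_user is a hypothetical wrapper name.

```shell
#!/bin/sh
# exec_as_host_user CONTAINER COMMAND [ARGS...]
# Run COMMAND in /src of a running container using the caller's UID and GID,
# so files written to the mounted directory are owned by the host user.
exec_as_host_user() {
    name=$1
    shift
    docker exec -u "$(id -u):$(id -g)" -w /src "$name" "$@"
}

# Example:
#   exec_as_host_user DOCKER_NAME python my_script.py
```

The script must then only write to locations that this UID can write to inside the container, which is the case for the bind-mounted /src when the host user owns it.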
