I've been reading today about the theory behind UID 1001, specifically in Docker, where it is best practice not to have your container running as the root user.
What I've been able to tell so far for a unix system is...
root user has the UID of 0
Most unix distributions reserve the first 100
New users are assigned UIDs starting from 500 or 1000
When you create a new account, it will usually be given the next-highest unused number of 1001 (not sure what this means in relation to the previous dot point. If someone could clarify please)
I've seen a lot of Dockerfile examples that will just try to use 1001.
USER 1001
Two questions
Is the theory that 1001 is safe to use because it is above the UID allocation ranges for new users on the host, i.e. dot point 2?
Is it best practice to specify user as 1001 or would it be best adding a new user?
RUN useradd -ms /bin/bash newuser
USER newuser
WORKDIR /home/newuser
thank you
If your container doesn't need to write data in a named volume or a directory bind-mounted from the host, it usually doesn't matter at all what user ID the container runs as. There are a couple of restrictions still (if you're trying to listen on a port number less than 1024 your user ID must be 0 or you must manually add a capability at startup time).
I would use your second form, except I would not switch users until the end of the Dockerfile.
# arbitrary choice of base image
FROM python:3.8
# ... install OS packages ...
# Create the user. This doesn't need to be repeated if the
# application code changes. It is a "system" user without
# a home directory and a default shell. We don't care what
# its numeric user ID is.
RUN useradd -r newuser
# WORKDIR creates the directory. It does not need to be
# under /home. It should be owned by root.
WORKDIR /app
# Copy the application in and do its installation, still as root.
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY . .
# Switch USER only at the end of the file.
USER newuser
CMD ["./app.py"]
"Home directory" isn't usually a well-defined concept in Docker; I've created the user without a specific home directory here, and the application directory isn't under the normal Linux /home directory. Similarly, I haven't gone out of my way to specify a shell for this user or to specifically use GNU bash.
One important security point of this setup is that root owns the application files, but newuser is running the code. If there is some sort of compromise, this gives you an additional layer of protection: the compromised application code can't overwrite the application code or its fixed static data.
I started this with the caveat that this works fine if you don't need to persist data in the filesystem. If you do, the container will need to be somewhat adaptable, which probably means starting up as root but then dropping privileges. I'd support two modes of running:
The container is started up initially with a totally empty directory, possibly owned by root (a named Docker volume; a Kubernetes named volume). In this case, create the data directory you need and make it owned by the user in the Dockerfile.
docker run -v somevolume:/data myimage
The container is started up with a bind-mounted host directory and also a -u option naming the host user ID to use.
docker run -u $(id -u) -v "$PWD/data:/data" myimage
You would need to use an entrypoint wrapper script to detect which case you're in, create the initial storage structure, set its permissions, and switch to a non-root user if required. There are lighter-weight tools like gosu or su-exec that specifically support this case. The Docker Hub consul image's entrypoint script has an example of doing this at startup time.
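As a rough illustration of such a wrapper (this is not the consul image's actual script), the sketch below picks a startup mode from the current UID. The names APP_USER, DATA_DIR, startup_mode and entrypoint_main are made up for this example, and it assumes gosu is installed in the image.

```shell
#!/bin/sh
# Hypothetical entrypoint wrapper. APP_USER and DATA_DIR are assumed names;
# gosu is assumed to be installed in the image.
APP_USER="${APP_USER:-newuser}"
DATA_DIR="${DATA_DIR:-/data}"

# Print which startup mode applies for a given numeric UID.
startup_mode() {
    if [ "$1" = "0" ]; then
        # Named volume case: started as root, so fix ownership and drop privileges.
        echo "root-then-drop"
    else
        # "docker run -u" case: the caller already chose an unprivileged UID.
        echo "already-unprivileged"
    fi
}

entrypoint_main() {
    case "$(startup_mode "$(id -u)")" in
        root-then-drop)
            mkdir -p "$DATA_DIR" && chown -R "$APP_USER" "$DATA_DIR" || return 1
            exec gosu "$APP_USER" "$@"
            ;;
        *)
            exec "$@"
            ;;
    esac
}

# Only act when a command was passed, as Docker does with ENTRYPOINT + CMD.
if [ "$#" -gt 0 ]; then
    entrypoint_main "$@"
fi
```

With this saved as the image's ENTRYPOINT, the CMD is passed in as the arguments, so both docker run invocations above would end up running the same command as a non-root user.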
I know you can specify user and group IDs with the docker run command and you can force the same IDs when building an image in order to have an easy life dealing with file permissions (e.g. as described here).
But what if you want to reuse an image containing a user definition across different machines / users with different user/group IDs?
E.g. on a machine where the user/group IDs are 1000:1000 you build an image and use those values to create a user. Then you push the image to some registry. On another machine (or just another user) with IDs 1001:1000 you want to pull and use the image.
AFAIU you would either have to know the IDs used in the image and provide them to docker run, in which case you might have trouble dealing with files created by the container, or use your local IDs, in which case you experience those issues inside the container.
Right?
What's the usual approach to this? I'd like to have some way to 'translate' those ID's, i.e. having UID 1001 outside and 1000 inside the container.
Currently the only ways I know of are: 1. ignoring the issue, 2. not sharing images or 3. entering the container with IDs of the user inside the image and rewriting permissions afterwards.
Do not build specific user IDs into your image. As you note, if you do depend on the runtime user ID matching host-directory permissions, this will be wrong if a different host user runs the container. You must specify this at docker run time.
If at all possible, avoid writing to local files in your container. Store data somewhere like a relational database instead. If you don't need to write files, then it doesn't actually matter what user ID the container runs as. This also makes it easier to scale the application and to run it in clustered environments like Kubernetes.
If your application does write to local files, then limit it to a single directory. Say your application code is in /app; maybe the data goes in /data. Only that one directory needs to be writable. The files in /app should stay owned by root; you do not want the application to be able to overwrite its source code or static assets while it's running.
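One way to check this layout is to list what the runtime user can actually modify. The helper below is only a sketch; the name writable_paths is invented here, and it relies on nothing beyond POSIX find and sh.

```shell
# Print every path under the given directory that the current user may modify.
writable_paths() {
    find "$1" -exec sh -c '
        for p in "$@"; do
            if [ -w "$p" ]; then
                printf "%s\n" "$p"
            fi
        done
    ' sh {} +
}
```

Run inside the container as the non-root user, `writable_paths /app` should print nothing if the files are still owned by root, while `writable_paths /data` should list the data directory. Note that root can write almost anywhere, so the check is only meaningful when run as the application user.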
In the Dockerfile, a good practice would be to create a non-root user, with any user ID, but only switch to it at the end of the Dockerfile. Your container should be operable without any special options.
FROM ...
# Create a non-root user, with an arbitrary user ID
RUN adduser --system --no-create-home appuser
# We are still root; do the normal build-and-install steps
# (Do not run `chown` on anything here, leave it all owned by root)
WORKDIR /app
COPY ...
RUN ...
# Create the empty data directory and give the arbitrary user
# permissions on it
RUN mkdir /data && chown appuser /data
# APP_DATA_DIR is recognized by the application
ENV APP_DATA_DIR=/data
# Normal metadata to run the application, as the arbitrary user
USER appuser
CMD ...
If you decide you want the data to be backed by a bind-mounted host directory, then when you run the container you need to provide the corresponding user ID.
# Run as the host user, mounting the current directory on /data
docker run \
-u $(id -u):$(id -g) \
-v "$PWD:/data" \
...
You may need an entrypoint wrapper script to set up the /data directory on first use. Anything that is in the image will be hidden by the bind mount, so if the host directory is initially empty then the image will need to know how to put the required initial data there.
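A minimal sketch of that first-use step, assuming the image keeps a copy of its initial data somewhere outside the mount point (the function name seed_data_dir and the /app/default-data path in the usage note are hypothetical):

```shell
# Copy the image's default data into the data directory, but only when the
# data directory is missing or empty, so existing data is never overwritten.
seed_data_dir() (
    src="$1"
    dst="$2"
    if [ -z "$(ls -A "$dst" 2>/dev/null)" ]; then
        mkdir -p "$dst" && cp -R "$src"/. "$dst"/
    fi
)
```

An entrypoint script could call `seed_data_dir /app/default-data /data` before starting the main process. Because the copy runs as whatever user the container was started with, the new files end up owned by the UID passed to docker run -u.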
I understand that it's considered a bad security practice to run Docker images as root, but I have a specific situation that I wanted to pass by the community to see if anyone can help.
We are currently using a pipeline on an Amazon Linux 2 instance with a single user called ec2-user. Unfortunately, a lot of the scripts we're using for our pipeline have hard-coded paths baked in (notably /home/ec2-user/) ... which may or may not reference the $HOME variable.
I've been talking to one of the engineers that is building a Docker image for our pipeline and suggested that he creates a new user entirely so root user isn't running our pipeline.
For example:
# add clip user
RUN groupadd -r clip && useradd -r -g clip clip
# disable root
RUN chsh -s /usr/sbin/nologin root
# set environment variables
ENV HOME /home/clip
ENV DEBIAN_FRONTEND=noninteractive
However, the engineer mentioned that the clip user inside the container will have some uid that may or may not exist in the host machine. For example, if the clip user had uid 1001 in the container, but 1001 was john in the host, all the files created as the clip user inside the container would be owned by john on the outside.
Further, he is more concerned about the situation where the clip user has a uid in the container that doesn’t exist in the host’s passwd. In that case files created by the clip user in the container would be owned by a bare unassociated uid on the host.
If we decided to pass in ids from the host as the user/group to run the image, the kernel will be ok with it (same kernel as the host), and when all is said and done files created inside the container will then be owned by the user/group you pass in. However, the container wouldn’t know who that user/group are, so it’ll just use the raw ids, and stuff like $HOME or whoami won’t work.
With that said, we're curious if anyone else has experienced these problems and if anyone has found solutions?
Everything you say is totally normal. The container has its own /etc/passwd file, and so a given numeric user ID might map to different user names (or to none at all) in the host and in the container. Beyond some cosmetic issues around debug shells, it shouldn't usually matter if the current numeric uid is actually present in the container /etc/passwd, and there's no reason a container uid would need to be mapped in the host /etc/passwd.
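To make the mapping concrete, here is a small sketch that resolves a numeric uid against a given passwd-format file, roughly the lookup that tools like ls or whoami depend on. The function name uid_to_name is invented for this example, and the file paths in the usage note are placeholders.

```shell
# Print the user name that a numeric uid maps to in a passwd-format file,
# or the bare uid if that file has no entry for it.
uid_to_name() {
    awk -F: -v uid="$1" '
        $3 == uid { print $1; found = 1; exit }
        END { if (!found) print uid }
    ' "$2"
}
```

Calling `uid_to_name 1001 /etc/passwd` on the host and again inside the container can print two different names for the same files, or just the number, without anything being wrong at the kernel level.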
Note that there are a couple of ways to directly assume another user ID in Docker, either using the docker run -u option or the Dockerfile USER directive. The RUN chsh command you propose doesn't really do anything and doesn't prevent becoming root inside a container.
clip user inside the container will have some uid that may or may not exist in the host machine.
True, totally normal.
For example, if the clip user had uid 1001 in the container, but 1001 was john in the host, all the files created as the clip user inside the container would be owned by john on the outside.
This is partially true, but only in the case where you've explicitly mapped a host directory into the container with a docker run -v option. Otherwise, the host user with uid 1001 won't be able to navigate to the /var/lib/docker/... directory that actually contains the container files, so it doesn't matter that they could hypothetically write them.
The more usual case around this is to explicitly supply a host uid so that the container process can save its state in a mapped host directory. Pass a numeric uid to the docker run -u option; there's no particular need for that uid to exist in the container's /etc/passwd.
docker run \
-u $(id -u) \
-v "$PWD/data:/data" \
...
the container wouldn’t know who that user/group are, so it’ll just use the raw ids, and stuff like $HOME or whoami won’t work.
Unless your application explicitly calls these things, they won't usually matter. "Home directory" is a pretty poorly defined concept in a Docker container since it's usually a wrapper around a single process.
I already run my docker build and docker run without sudo. However, when I launch a process inside a docker container, it appears as a root process in top on the host (not inside the container).
While it cannot access the host filesystem because of namespacing and cgroups from docker, is it still more dangerous than running as a simple user?
If so, what is the right way of running things inside docker as non-root?
Should I just do USER nonroot at the end of the Dockerfile?
UPDATE:
root is also needed for building some things. Should I put USER at the very top of the Dockerfile and then install sudo together with other dependencies, and then use sudo only when needed in the build?
Can someone give a simple Dockerfile example with USER in the beginning and installing and using sudo?
Running the container as root brings a lot of risks. Although being root inside the container is not the same as root on the host machine (some more details here) and you're able to deny a lot of capabilities during container startup, it is still the recommended approach to avoid being root.
Usually it is a good idea to use the USER directive in your Dockerfile after you install some general packages/libraries. In other words - after the operations that require root privileges. Installing sudo in a production service image is a mistake, unless you have a really good reason for it. In most cases - you don't need it and it is more of a security issue. If you need permissions to access some particular files or directories in the image, then make sure that the user you specified in the Dockerfile can really access them (setting proper uid, gid and other options, depending on where you deploy your container). Usually you don't need to create the user beforehand, but if you need something custom, you can always do that.
Here's an example Dockerfile for a Java application that runs under user my-service:
FROM alpine:latest
RUN apk add openjdk8-jre
COPY ./some.jar /app/
ENV SERVICE_NAME="my-service"
RUN addgroup --gid 1001 -S $SERVICE_NAME && \
adduser -G $SERVICE_NAME --shell /bin/false --disabled-password -H --uid 1001 $SERVICE_NAME && \
mkdir -p /var/log/$SERVICE_NAME && \
chown $SERVICE_NAME:$SERVICE_NAME /var/log/$SERVICE_NAME
EXPOSE 8080
USER $SERVICE_NAME
CMD ["java", "-jar", "/app/some.jar"]
As you can see, I create the user beforehand and set its gid, disable its shell and password login, as it is going to be a 'service' user. The user also becomes owner of /var/log/$SERVICE_NAME, assuming it will write to some files there. Now we have a lot smaller attack surface.
Why you shouldn't run as root
While other people have pointed out that you shouldn't run images as root, there isn't much information here, or in the docs about why that is.
While it's true that there is a difference between having root access to a container and root access on the host, root access on a container is still very powerful.
Here is a really good article that goes in depth on the difference between the two, and this issue in general:
https://www.redhat.com/en/blog/understanding-root-inside-and-outside-container
The general point is that if there is a malicious process in your container, it can do whatever it wants in the container, from installing packages, uploading data, hijacking resources, you name it, it can do it.
This also makes it easier for a process to break out of the container and gain privileges on the host since there are no safeguards within the container itself.
How and when to run as non-root
What you want to do is run all your installation and file download/copy steps as root (a lot of things need to be installed as root, and in general it's just a better practice for the reasons I outline below). Then, explicitly create a user and grant that user the minimum level of access that they need to run the application. This is done through the use of chmod and chown commands.
Immediately before your ENTRYPOINT or CMD directive, you then add a USER directive to switch to the newly created user. This will ensure that your application runs as a non-root user, and that user will only have access to what you explicitly gave it access to in previous steps.
The general idea is that the user that runs the container should have an absolute minimum of permissions (most of the time the user doesn't need read, write, and execute access to a file). That way, if there is a malicious process in your container, its behavior will be as restricted as possible. This means that you should avoid creating or copying in any files, or installing any packages as that user too, since they would have complete control over any resources they create by default. I've seen comments suggesting otherwise. Ignore them. If you want to be in line with security best practices, you would then have to go back and revoke the user's excess permissions, and that would just be awful and error prone.
You can check out the CIS benchmark for Docker, which recommends running as non-root; this is one of the "Compliance" checks. Adding USER non-root at the bottom should suffice, or you can use '-u' with your docker run command to specify the user as well.
https://www.cisecurity.org/benchmark/docker/
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
Running your containers as non-root gives you an extra layer of security. By default, Docker containers are run as root, but this allows for unrestricted container activities.
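As a rough sketch of how such a check could be scripted (this is not the CIS tooling itself), the helper below inspects the user configured in an image. The function names are made up for this example; the only Docker feature it relies on is `docker inspect --format '{{.Config.User}}'`, which prints the image's configured USER, or an empty string if none was set.

```shell
# Succeed (exit 0) when a USER spec such as "", "0", "root" or "0:0" means root.
is_root_user_spec() {
    case "${1%%:*}" in
        ""|0|root) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed when the given local image would start as root by default.
# Requires the docker CLI and a locally available image.
image_runs_as_root() {
    is_root_user_spec "$(docker inspect --format '{{.Config.User}}' "$1")"
}
```

For example, `image_runs_as_root myimage && echo "myimage still runs as root"`. This only looks at the image metadata: a docker run -u option can still override it, and a differently named user that maps to uid 0 inside the image would not be detected.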
Problem
For a docker image (alpine based) that is supposed to run as non-root I have two requirements:
I have to mount a FUSE filesystem inside the docker container
The users of the docker image are able to set the UID/GID of the docker user with docker run --user {uid}:{gid}
FUSE's fusermount command requires a valid entry for the user in /etc/passwd, otherwise it won't mount the filesystem. Given that I don't know the UID/GID of the user at build time I can't call adduser at build time. And I can't do it at runtime either, as the user then doesn't have the appropriate privileges.
Solutions found
So far I have found two solutions that both feel not appropriate/secure
1. Make /etc/passwd writable
When adding chmod 666 /etc/passwd to the Dockerfile I can then do at runtime
echo "someuser:x:${my_uid}:$(id -g)::/tmp:/sbin/nologin" >> /etc/passwd
This does the job for fusermount. Unfortunately I did not find a way to change the passwd file back to read-only at runtime, and without that I have security concerns that someone might be able to misuse this to gain root rights back. I could not find a simple way to use the open passwd file for some exploit: while I was able to add/modify passwords and configurations directly in /etc/passwd for all users and then change users via login, alpine did not allow this for user root (neither via login nor via su). But I guess there are folk out there more clever than me, and somehow the whole solution feels like a quite dirty hack. Does anyone have specific ideas how a writable passwd file inside a container could be used for getting inappropriate rights inside the container?
2. Replace requirement #2 with two additional environment variables
By introducing DUID and DGID as environment variables and set USER to some newly added non-root user inside the Dockerfile I found a solution with the help of sudo & /etc/sudoers: In a launch script that I use as entrypoint I can call sudo adduser/addgroup for the given DUID/DGID and then launch the actual program with the user specified via sudo -u someuser someprog.
Except for the fact that the whole setup became quite ugly, I disliked the fact that the users of my docker image could no longer use the regular docker run --user option, as this would break the sudo configuration.
I would like to volume mount a directory from a Docker container to my work station, so when I edit the content in the volume mount from my work station it is updated in the container as well. It would be very useful for testing and developing web applications in general.
However I get a permission denied in the container, because the UIDs in the container and on the host aren't the same. Isn't the original purpose of Docker that it should make development faster and easier?
This answer works around the issue I am facing when volume mounting a Docker container to my work station. But by doing this, I make changes to the container that I won't want in production, and that defeats the purpose of using Docker during development.
The container is Alpine Linux, work station Fedora 29, and editor Atom.
Question
Is there another way, so both my work station and container can read/write the same files?
There are multiple ways to do this, but the central issue is that bind mounts do not include any UID mapping capability: the UID on the host is what appears inside the container and vice versa. If those two UIDs do not match, you will read/write files with different UIDs and likely experience permission issues.
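A quick way to see whether you are in that situation is to compare the numeric owner of the directory with the UID in question. The sketch below uses only POSIX ls and awk; the function names owner_uid and uid_matches_owner are invented for this example.

```shell
# Print the numeric UID that owns a file or directory.
owner_uid() {
    ls -nd -- "$1" | awk '{ print $3 }'
}

# Succeed when the given UID owns the given path.
uid_matches_owner() {
    [ "$(owner_uid "$2")" = "$1" ]
}
```

On the host, `uid_matches_owner "$(id -u)" ./code` tells you whether your own user owns the directory you are about to mount; running the same check inside the container with the container user's UID shows whether the process there will be treated as the owner.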
Option 1: get a Mac or deploy docker inside of VirtualBox. Both of these environments have a filesystem integration that dynamically updates the UID's. For Mac, that is implemented with OSXFS. Be aware that this convenience comes with a performance penalty.
Option 2: Change your host. If the UID on the host matches the UID inside the container, you won't experience any issues. You'd just run a usermod on your user on the host to change your UID there, and things will happen to work, at least until you run a different image with a different UID inside the container.
Option 3: Change your image. Some will modify the image to a static UID that matches their environment, often to match a UID in production. Others will pass a build arg with something like --build-arg UID=$(id -u) as part of the build command, and then the Dockerfile with something like:
FROM alpine
ARG UID=1000
RUN adduser -u ${UID} app
The downside of this is that each developer may need a different image, so they are either building locally on each workstation, or you centrally build multiple images, one for each UID that exists among your developers. Neither of these is ideal.
Option 4: Change the container UID. This can be done in the compose file, or on a one off container with something like docker run -u $(id -u) your_image. The container will now be running with the new UID, and files in the volume will be accessible. However, the username inside the container will not necessarily map to your UID which may look strange to any commands you run inside the container. More importantly, any files owned by the user inside the container that you have not hidden with your volume will have the original UID and may not be accessible.
Option 5: Give up, run everything as root, or change permissions to 777 allowing everyone to access the directory with no restrictions. This won't map to how you should run things in production, and the container may still write new files with limited permissions making them inaccessible to you outside the container. This also creates security risks of running code as root or leaving filesystems open to both read and write from any user on the host.
Option 6: Set up an entrypoint that dynamically updates your container. Despite not wanting to change your image, this is my preferred solution for completeness. Your container does need to start as root, but only in development, and the app will still be run as the user, matching the production environment. However, the first step of that entrypoint will be to change the user's UID/GID inside the container to match your volume's UID/GID. This is similar to option 4, but now files inside the image that were not replaced by the volume have the right UIDs, and the user inside the container will now show with the changed UID so commands like ls show the username inside the container, not a UID that may map to another user or no one at all. While this is a change to your image, the code only runs in development, and only as a brief entrypoint to set up the container for that developer, after which the process inside the container will look identical to that in a production environment.
To implement this I make the following changes. First the Dockerfile now includes a fix-perms script and gosu from a base image I've pushed to the hub (this is a Java example, but the changes are portable to other environments):
FROM openjdk:jdk as build
# add this copy to include fix-perms and gosu or install them directly
COPY --from=sudobmitch/base:scratch / /
RUN apt-get update \
&& apt-get install -y maven \
&& useradd -m app
COPY code /code
RUN mvn build
# add an entrypoint to call fix-perms
COPY entrypoint.sh /usr/bin/
ENTRYPOINT ["/usr/bin/entrypoint.sh"]
CMD ["java", "-jar", "/code/app.jar"]
USER app
The entrypoint.sh script calls fix-perms and then exec and gosu to drop from root to the app user:
#!/bin/sh
if [ "$(id -u)" = "0" ]; then
  # running on a developer laptop as root
  fix-perms -r -u app -g app /code
  exec gosu app "$@"
else
  # running in production as a user
  exec "$@"
fi
The developer compose file mounts the volume and starts as root:
version: '3.7'
volumes:
  m2:
services:
  app:
    build:
      context: .
      target: build
    image: registry:5000/app/app:dev
    command: "/bin/sh -c 'mvn build && java -jar /code/app.jar'"
    user: "0:0"
    volumes:
      - m2:/home/app/.m2
      - ./code:/code
This example is taken from my presentation available here: https://sudo-bmitch.github.io/presentations/dc2019/tips-and-tricks-of-the-captains.html#fix-perms
Code for fix-perms and other examples are available in my base image repo: https://github.com/sudo-bmitch/docker-base
Since the UIDs in your containers are baked into the container definition, you can safely assume that they are relatively static. In this case, you can create a user in your host system with the matching UID and GID. Change user to the new account, and then make your edits to the files. Your host OS will not complain since it thinks it's just the user accessing its own files, and your container OS will see the same.
Alternatively, you can consider editing these files as root.