How to use a bash profile in a Dockerfile

I am trying to build an image using a Dockerfile.
The commands in the Dockerfile look something like this:
FROM ubuntu:16.04
:
:
RUN pip3 install virtualenvwrapper
RUN echo '# Python virtual environment wrapper' >> ~/.bashrc
RUN echo 'export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3' >> ~/.bashrc
RUN echo 'export WORKON_HOME=$HOME/.virtualenvs' >> ~/.bashrc
RUN echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.bashrc
After these commands, I will use virtualenvwrapper commands to make some virtualenvs.
If I had only environment variables to deal with in ~/.bashrc, I would have used ARG or ENV to set them up.
But now I also have other shell script files like virtualenvwrapper.sh that will be setting some of their own variables.
Also, RUN source ~/.bashrc is not working (source not found).
What should I do?

You shouldn't try to edit shell dotfiles like .bash_profile in a Dockerfile. There are many common paths that don't go via a shell (e.g., CMD ["python", "myapp.py"] won't launch any sort of shell and won't read a .bash_profile). If you need to globally set an environment variable in an image, use the Dockerfile ENV directive.
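For example, the ~/.bashrc edits above could be replaced with ENV directives (a minimal sketch; /root/.virtualenvs is spelled out because ENV does not expand $HOME unless that variable was itself declared in the image, and it assumes the build runs as root):
ENV VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
ENV WORKON_HOME=/root/.virtualenvs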
For a Python application, you should just install your application into the image's "global" Python using pip install. You don't specifically need a virtual environment; Docker provides a lot of the same isolation capabilities (something you pip install in a Dockerfile won't affect your host system's globally installed packages).
A typical Python application Dockerfile (copied from https://hub.docker.com/_/python/) might look like
FROM python:3
WORKDIR /usr/src/app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "./your-daemon-or-script.py"]
On your last question, source is a vendor extension that only some shells provide; the POSIX standard doesn't require it and the default /bin/sh in Debian and Ubuntu doesn't provide it. In any case since environment variables get reset on every RUN command, RUN source ... (or more portably RUN . ...) is a no-op if nothing else happens in that RUN line.

avoid using ~: put your bashrc in a specific path
put the source of your bashrc and your command on the same RUN line, joined with ; (see the sketch below)
the RUN lines are totally independent from each other as far as the environment is concerned
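For instance, a minimal sketch of that pattern, assuming the exports and the virtualenvwrapper.sh source line were written to a dedicated file such as /env.sh instead of ~/.bashrc (mkvirtualenv myenv is just a placeholder command):
RUN /bin/bash -c 'source /env.sh && mkvirtualenv myenv'
bash is invoked explicitly here because virtualenvwrapper.sh expects a bash-compatible shell, and the default /bin/sh does not understand source.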

Related

Docker: Set export PATH in Dockerfile

I have to set export PATH=~/.local/bin:$PATH. As long I do it via docker exec -it <container> bash manually it works. However, I tried to automate it in my .dockerfile with:
FROM jupyter/scipy-notebook
RUN conda install --yes -c conda-forge fbprophet
ENV PATH="~/.local/bin:${PATH}"
RUN pip install awscli --upgrade --user
And it seems like ENV PATH="~/.local/bin:${PATH}" is not having the same effect as I receive that WARNING here. Do you see what I am doing wrong?
WARNING: The scripts pyrsa-decrypt, pyrsa-decrypt-bigfile, pyrsa-encrypt, pyrsa-encrypt-bigfile, pyrsa-keygen, pyrsa-priv2pub, pyrsa-sign and pyrsa-verify are installed in '/home/jovyan/.local/bin' which is not on PATH.
Make use of the ENV directive in the Dockerfile:
ENV PATH "$PATH:/home/jovyan/.local/bin"
Hope this helps.
$PATH is a list of actual directories, e.g., /bin:/usr/bin:/usr/local/bin:/home/dmaze/bin. No expansion ever happens while you're reading $PATH; if it contains ~/bin, it looks for a directory named exactly ~, like you might create with a shell mkdir \~ command.
When you set $PATH in a shell, PATH="~/bin:$PATH", first your local shell expands ~ to your home directory and then sets the environment variable. Docker does not expand ~ to anything, so you wind up with a $PATH variable containing a literal ~.
The best practice here is actually to avoid needing to set $PATH at all. A Docker image is an isolated filesystem space, so you can install things into the "system" directories and not worry about confusing things maintained by the package manager or things installed by other users; the Dockerfile is the only thing that will install anything.
RUN pip install awscli
# without --user
But if you must set it, you need to use a Dockerfile ENV directive, and you need to specify absolute paths. ($HOME does seem to be well-defined but since Docker containers aren't usually multi-user, "home directory" isn't usually a concept.)
ENV PATH="$HOME/.local/bin:$PATH"
(In a Dockerfile, Docker will replace $VARIABLE, ${VARIABLE}, ${VARIABLE:-default}, and ${VARIABLE:+isset}, but it doesn't do any other shell expansion; path expansion of ~ isn't supported but variable expansion of $HOME is.)
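Applied to the Dockerfile in the question, that would look something like this (the /home/jovyan path is taken from the warning message, and is only a guess at where the base image puts the user's home):
FROM jupyter/scipy-notebook
RUN conda install --yes -c conda-forge fbprophet
ENV PATH="/home/jovyan/.local/bin:${PATH}"
RUN pip install awscli --upgrade --user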

App relies on sourcing secrets.sh for env variables. How to accomplish this in my Dockerfile?

I'm working on creating a container to hold my running Django app. During development and manual deployment I've been setting environment variables by sourcing a secrets.sh file in my repo. This has worked fine until now that I'm trying to automate my server's configuration environment in a Dockerfile.
So far it looks like this:
FROM python:3.7-alpine
RUN pip install --upgrade pip
RUN pip install pipenv
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /home/appuser/site
COPY . /home/appuser/site
RUN /bin/sh -c "source secrets.sh"
RUN env
I'd expect this to set the environment variables properly but it doesn't. I've also tried adding the variables to my appuser's bashrc, but this doesn't work either.
Am I missing something here? Is there another best practice to set env variables to be accessible by django, without having to check them into the Dockerfile in my repo?
Each RUN step launches a totally new container with a totally new shell; only its filesystem is persisted afterwards. RUN commands that try to start processes or set environment variables are no-ops. (RUN export or RUN service start do absolutely nothing.)
In your setup you need the environment variables to be set at container startup time based on information that isn't available at build time. (You don't want to persist secrets in an image: they can be easily read out by anyone who gets the image later on.) The usual way to do this is with an entrypoint script; this could look like
#!/bin/sh
# If the secrets file exists, read it in.
if [ -f /secrets.sh ]; then
  # (Prefer POSIX "." to bash-specific "source".)
  . /secrets.sh
fi
# Now run the main container CMD, replacing this script.
exec "$@"
A typical Dockerfile built around this would look like:
FROM python:3.7-alpine
RUN pip install --upgrade pip
WORKDIR /app
# Install Python dependencies, as an early step to support
# Docker layer caching.
COPY requirements.txt ./
RUN pip install -r requirements.txt
# Install the main application.
COPY . ./
# Create a non-root user. It doesn't own the source files,
# and so can't modify the application.
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# Startup-time metadata.
ENTRYPOINT ["/app/entrypoint.sh"]
CMD ["/app/app.py"]
And then when you go to run the container, you'd inject the secrets file
docker run -p 8080:8080 -v $PWD/secrets-prod.sh:/secrets.sh myimage
(As a matter of style, I reserve ENTRYPOINT for this pattern and for single-binary FROM scratch containers, and always use CMD for whatever the container's main process is.)
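(A related runtime option, if the secrets can be written as plain KEY=VALUE lines with no export statements or quoting: docker run's --env-file flag, e.g. docker run --env-file ./secrets.env -p 8080:8080 myimage, where secrets.env is a hypothetical file in that simpler format. For an actual shell script like secrets.sh, the entrypoint approach above is still the way to go.)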

How do you use environment variables in the start up scripts in a passenger-docker container

When I try to use an environment variable ($HOME) that I set in the Dockerfile in the script that runs at start up, $HOME is not set. If I run printenv in the container, $HOME is set. So I am confused, and not sure what is going on.
I am using the phusion/passenger-customizable image, so that I can run a custom node server via pm2. I need a different version of Node than what is bundled in the Node-specific passenger image.
Dockerfile
# Simplified
FROM phusion/passenger-customizable:0.9.27
RUN apt-get update && apt-get upgrade -y -o Dpkg::Options::="--force-confold"
# Set environment variables needed for the docker image.
ARG HOME=/opt/var/app
ENV HOME $HOME
# Use baseimage-docker's init process.
CMD ["/sbin/my_init"]
RUN mkdir /etc/service/app
ADD start.sh /etc/service/app/run
RUN chmod a+x /etc/service/app/run
start.sh
echo $HOME
# run some scripts that reference the $HOME directory
What do I need to do to be able to reference a environment variable, set in the Dockerfile, in my start up scripts? Or do I just need to hardcode the paths in that start up script and call it a day?
$HOME is reserved, in some fashion. When running printenv, per @Sebastian's comment, all my other variables were there but not $HOME. I prepended it with the initials of my company and it is working as intended.
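For example, a minimal sketch of that workaround (APP_HOME is a hypothetical name; any name other than HOME avoids the clash):
ARG APP_HOME=/opt/var/app
ENV APP_HOME $APP_HOME
and then reference $APP_HOME instead of $HOME in start.sh.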

condas `source activate virtualenv` does not work within Dockerfile

Scenario
I'm trying to set up a simple Docker image (I'm quite new to Docker, so please correct my possible misconceptions) based on the public continuumio/anaconda3 container.
The Dockerfile:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -n testenv pip -y \
&& source activate testenv \
&& conda env list
Building an image from this with docker build -t test . ends with the error:
/bin/sh: 1: source: not found
when activating the new virtual environment.
Suggestion 1:
Following this answer I tried:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -y -n testenv pip \
&& /bin/bash -c "source activate testenv" \
&& conda env list
This seems to work at first, as it outputs prepending /opt/conda/envs/testenv/bin to PATH, but conda env list as well as echo $PATH clearly show that it doesn't:
[...]
# conda environments:
#
testenv /opt/conda/envs/testenv
root * /opt/conda
---> 80a77e55a11f
Removing intermediate container 33982c006f94
Step 3 : RUN echo $PATH
---> Running in a30bb3706731
/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
The Dockerfiles work out of the box as an MWE.
I appreciate any ideas. Thanks!
Using the Docker ENV instruction it is possible to add the virtual environment path persistently to PATH, although this does not change which environment conda env list shows as selected.
See the MWE:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda create -y -n testenv pip
ENV PATH /opt/conda/envs/testenv/bin:$PATH
RUN echo $PATH
RUN conda env list
Method 1: use SHELL with a custom entrypoint script
EDIT: I have developed a new, improved approach which works better than the "conda", "run" syntax.
A sample Dockerfile is available at this gist. It works by leveraging a custom entrypoint script to set up the environment before execing the arguments of the RUN stanza.
Why does this work?
A shell is (put very simply) a process which can act as an entrypoint for arbitrary programs. exec "$#" allows us to launch a new process, inheriting all of the environment of the parent process. In this case, this means we activate conda (which basically mangles a bunch of environment variables), then run /bin/bash -c CONTENTS_OF_DOCKER_RUN.
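A minimal sketch of that idea, since the gist itself isn't reproduced here (the file name entrypoint.sh, the environment name testenv, and the /opt/conda prefix are assumptions):
#!/bin/bash
# entrypoint.sh
set -e
# Make "conda activate" usable in this non-interactive shell.
source /opt/conda/etc/profile.d/conda.sh
# Activate the environment (assumed to have been created earlier,
# e.g. with RUN conda create -y -n testenv pip).
conda activate testenv
# Hand off to whatever command was requested, inheriting the environment.
exec "$@"
And in the Dockerfile:
COPY entrypoint.sh /opt/entrypoint.sh
RUN chmod +x /opt/entrypoint.sh
# Every RUN line after this point executes inside testenv.
SHELL ["/opt/entrypoint.sh", "/bin/bash", "-c"]
RUN python --version
# The container's main process is wrapped the same way.
ENTRYPOINT ["/opt/entrypoint.sh"]
CMD ["python", "run.py"]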
Method 2: SHELL with arguments
Here is my previous approach, courtesy of Itamar Turner-Trauring; many thanks to them!
# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml
# Set the default docker build shell to run as the conda wrapped process
SHELL ["conda", "run", "-n", "vigilant_detect", "/bin/bash", "-c"]
# Set your entrypoint to use the conda environment as well
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]
Modifying ENV may not be the best approach since conda likes to take control of environment variables itself. Additionally, your custom conda env may activate other scripts to further modulate the environment.
Why does this work?
This leverages conda run to "add entries to PATH for the environment and run any activation scripts that the environment may contain" before starting the new bash shell.
Using conda inside Docker can be a frustrating experience, since both tools effectively want to monopolize the environment, and in theory you shouldn't ever need conda inside a container. But deadlines and technical debt being a thing, sometimes you just gotta get it done, and sometimes conda is the easiest way to provision dependencies (looking at you, GDAL).
Piggybacking on ccauet's answer (which I couldn't get to work), and Charles Duffey's comment about there being more to it than just PATH, here's what will take care of the issue.
When activating an environment, conda sets the following variables, as well as a few that back up default values so they can be restored when deactivating the environment. Those backup variables have been omitted from the Dockerfile, as the root conda environment need never be used again; for reference, they are CONDA_PATH_BACKUP, CONDA_PS1_BACKUP, and _CONDA_SET_PROJ_LIB. It also sets PS1 in order to show (testenv) at the left of the terminal prompt line, which was also omitted. The following statements will do what you want.
ENV PATH /opt/conda/envs/testenv/bin:$PATH
ENV CONDA_DEFAULT_ENV testenv
ENV CONDA_PREFIX /opt/conda/envs/testenv
In order to shrink the number of layers created, you can combine these commands into a single ENV command setting all the variables at once as well.
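For example (same values as above, collapsed into one layer):
ENV PATH=/opt/conda/envs/testenv/bin:$PATH \
    CONDA_DEFAULT_ENV=testenv \
    CONDA_PREFIX=/opt/conda/envs/testenv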
There may be some other variables that need to be set, based on the package. For example,
ENV GDAL_DATA /opt/conda/envs/testenv/share/gdal
ENV CPL_ZIP_ENCODING UTF-8
ENV PROJ_LIB /opt/conda/envs/testenv/share/proj
The easy way to get this information is to call printenv > root_env.txt in the root environment, activate testenv, then call printenv > test_env.txt, and examine diff root_env.txt test_env.txt.
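Spelled out as commands (source activate is the older conda syntax used elsewhere in this question; newer conda uses conda activate):
printenv > root_env.txt
source activate testenv
printenv > test_env.txt
diff root_env.txt test_env.txt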

Change directory command in Docker?

In docker I want to do this:
git clone XYZ
cd XYZ
make XYZ
However, because there is no cd command, I have to pass in the full path every time (make XYZ /fullpath). Are there any good solutions for this?
To change into another directory use WORKDIR. All the RUN, CMD and ENTRYPOINT commands after WORKDIR will be executed from that directory.
RUN git clone XYZ
WORKDIR "/XYZ"
RUN make
You can run a script, or pass a more complex command as the parameter to RUN. Here is an example from a Dockerfile I've downloaded to look at previously:
RUN cd /opt && unzip treeio.zip && mv treeio-master treeio && \
rm -f treeio.zip && cd treeio && pip install -r requirements.pip
Because of the use of '&&', it will only get to the final 'pip install' command if all the previous commands have succeeded.
In fact, since every RUN creates a new commit and (currently) an AUFS layer, if you have too many commands in the Dockerfile you will hit the layer limit, so merging the RUNs (when the file is stable) can be a very useful thing to do.
I was wondering whether using WORKDIR twice would work, and it did :)
FROM ubuntu:18.04
RUN apt-get update && \
apt-get install -y python3.6
WORKDIR /usr/src
COPY ./ ./
WORKDIR /usr/src/src
CMD ["python3", "app.py"]
You can use a single RUN command for all of them:
RUN git clone XYZ && \
cd XYZ && \
make XYZ
In case you want to change the working directory for the container when you run a docker image, you can use the -w (short for --workdir) option:
docker run -it -w /some/valid/directory/inside/docker {image-name}
Ref:
docker run options: https://docs.docker.com/engine/reference/commandline/run/#options
Mind that if you must run in a bash shell, RUN make alone is not enough; you need to invoke the bash shell explicitly, since in Docker you are in the sh shell by default.
Taken from /bin/sh: 1: gvm: not found, which would say more or less:
Your shell is /bin/sh, but source expects /bin/bash, perhaps because it
puts its initialization in ~/.bashrc.
In other words, this problem can occur in any setting where the "sh" shell is used instead of "bash", causing "/bin/sh: 1: MY_COMMAND: not found".
In the Dockerfile case, use the recommended
RUN /bin/bash -c 'source /opt/ros/melodic/setup.bash'
or with the "[]" (which I would rather not use):
RUN ["/bin/bash", "-c", "source /opt/ros/melodic/setup.bash"]
Every new RUN of a bash is isolated, "starting at 0". For example, mind that setting WORKDIR /MY_PROJECT before the bash commands in the Dockerfile does not affect the bash commands since the starting folder would have to be set in the ".bashrc" again. It needs cd /MY_PROJECT even if you have set WORKDIR.
Side-note: do not forget the first "/" before "opt/../...". Else, it will throw the error:
/bin/bash: opt/ros/melodic/setup.bash: No such file or directory
Works:
=> [stage-2 18/21] RUN ["/bin/bash", "-c", "source /opt/ros/melodic/setup.bash"] 0.5s
=> [stage-2 19/21] [...]
See “/bin/sh: 1: MY_COMMAND: not found” at SuperUser for some more details on how this looks with many lines, or how you would fill the ".bashrc" instead. But that goes a bit beyond the actual question here.
PS: You might also put the commands you want to execute in a single bash script and run that bash script in the Dockerfile (though I would rather put the bash commands in the Dockerfile as well, just my opinion):
#!/bin/bash
set -e
source /opt/ros/melodic/setup.bash
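For example, if that script were saved as build.sh (a hypothetical name) next to the Dockerfile, it could be run with:
COPY build.sh /tmp/build.sh
RUN /bin/bash /tmp/build.sh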

Resources