What is wrong with this Dockerfile statement? Which one should I use?


If I want to run, for example, wget in a Dockerfile, I can type this:
RUN wget http://example.com
If I want to do an echo command, I could do this:
RUN echo 'Hello' >> /home/file.text
But I've also seen this:
RUN bash -c 'echo $USERNAME:ros | chpasswd'
If I want to run a shell script, I could do this
RUN 'bash ./install_foo.sh'
I also was recommended this:
RUN . /home/ros/.bashrc
I think some of the examples above are invalid and others have subtly different semantics. I would like to:
Understand the differences so I can learn
Know which one is right to use when I want to run a shell script

Here's a brain dump of related one-line answers:
Every RUN command launches a new shell (in a new container even) with a new clean environment and doesn't read any dotfiles. RUN export ... and RUN . ... are both no-ops that will have no effect on later steps.
Many standard Docker paths (like docker run ... some command) don't involve a shell at all, so if you create a .bashrc or .profile file it will be ignored in many common cases.
Unquoted RUN some command, CMD some command, and ENTRYPOINT some command are all automatically wrapped in sh -c '...' and you basically never need to say this explicitly. (In the case of ENTRYPOINT using the unquoted form is probably a bug.) Forms like CMD ["some", "command"] do not implicitly involve a shell (and don't expand environment variables).
GNU bash has several vendor extensions that unfortunately are in widespread use; Alpine base images don't include bash. In particular never say source when . is in the standard and does the same thing.
If you're installing software in an image, your best choice is to install it in a "system" location (pip install without an active virtual environment, npm install -g, ./configure --prefix=/usr/local); if you must install it somewhere else, use the Dockerfile ENV directive to set any environment variables that are needed; and if you can't do that, an ENTRYPOINT wrapper script can programmatically set the environment for the main process (but not any docker exec shells).
Just in general, ./foo.sh will run a shell script (provided it is executable and starts with a #!/bin/sh line); bash foo.sh will as well (but doesn't require it to be executable and explicitly specifies which shell to use); and . ./foo.sh runs it in the context of the current shell (only this form can change environment variables for example).
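
To make the last point concrete, here is a minimal sketch that runs the same script in all three ways and records what the calling shell sees afterwards. The script name setvar.sh and the variable DEMO_VAR are made up for the illustration, and it assumes the temporary directory allows executing files.

```shell
# Sketch: three ways to run a script, and which one can change the caller's environment.
workdir=$(mktemp -d)

# Hypothetical script that sets and exports one variable.
cat > "$workdir/setvar.sh" <<'EOF'
#!/bin/sh
DEMO_VAR=from_script
export DEMO_VAR
EOF
chmod +x "$workdir/setvar.sh"

DEMO_VAR=original

# 1. Execute directly: runs in a child process; needs the executable bit and a shebang line.
"$workdir/setvar.sh"
after_exec=$DEMO_VAR

# 2. Pass to an interpreter: also a child process; the executable bit is not required.
sh "$workdir/setvar.sh"
after_sh=$DEMO_VAR

# 3. Source with ".": runs in the current shell, so the assignment sticks.
. "$workdir/setvar.sh"
after_dot=$DEMO_VAR

echo "$after_exec $after_sh $after_dot"
rm -rf "$workdir"
```

Only the third form changes the variable in the calling shell. Even then, in a Dockerfile the change lasts only until the end of that RUN step.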


Docker is adding single quotes to ENTRYPOINT argument

I am creating a Dockerfile that needs to source a script before a shell is run.
ENTRYPOINT ["/bin/bash", "-rcfile","<(echo '. ./mydir/scripttosource.sh')"]
However, the script isn't sourced as expected.
Combining these parameters on a command line (normal Linux instance, outside of any Docker container), it works properly, for example:
$ /bin/bash -rcfile <(echo '. ./mydir/scripttosource.sh')
So I took a look at what was actually used by the container when it was run.
$ docker ps --format "table {{.ID}} \t {{.Names}} \t {{.Command}}" --no-trunc
CONTAINER ID NAMES COMMAND
70a5f846787075bd9bd55432dc17366268c33c1ab06fb36b23a50f5c3aef19bb happy_cray "/bin/bash -rcfile '<(echo '. ./mydir/scripttosource.sh')'"
Besides the fact that it properly identified the emotional state of Cray computers, Docker seems to be sneaking in undesired single quotes into the third parameter to ENTRYPOINT.
'<(echo '. ./mydir/scripttosource.sh')'
Thus the command actually being executed is:
$ /bin/bash -rcfile '<(echo '. ./mydir/scripttosource.sh')'
Which doesn't work...
Now, I realize there are more ways to skin this cat and that I could make this work a different way, but I am curious about the insertion of single quotes around the third argument to ENTRYPOINT. Is there a way to avoid this?
Thank you,

At a super low level, the Unix execve(2) function launches a process by taking a sequence of words, where the first word is the actual command to run and the remaining words are its arguments. When you run a command interactively, the shell breaks it into words, usually at spaces, and then calls an exec-type function to run it. The shell also does other processing like replacing $VARIABLE references or the bash-specific <(subprocess) construct; all of these are at layers above simply "run a process".
The Dockerfile ENTRYPOINT (and also CMD, and less frequently RUN) has two forms. You're using the JSON-array exec form. If you do this, you're telling Docker that you want to run the main container command with exactly these three literal strings as arguments. In particular the <(...) string is passed as a literal argument to bash --rcfile, and nothing actually executes it.
The obvious answer here is to use the string-syntax shell form instead:
ENTRYPOINT /bin/bash -rcfile <(echo '. ./mydir/scripttosource.sh')
Docker wraps this in an invocation of sh -c (or the Dockerfile SHELL). That causes a shell to preprocess the command string, break it into words, and execute it. Assuming the SHELL is bash and not a pure POSIX shell, this will handle the substitution.
However, there are some downsides to this, most notably that the sh -c invocation "eats" all of the arguments that might be passed in the CMD. If you want your main container process to be anything other than an interactive shell, this won't work.
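
The "eats the arguments" behavior can be reproduced outside Docker. This is only a sketch: it assumes Docker runs a shell-form ENTRYPOINT roughly as sh -c '<string>' followed by the CMD words, and the strings entrypoint, cmd-arg1 and cmd-arg2 are placeholders.

```shell
# Shell form: the extra words become $0, $1, ... of the sh -c script,
# and the script never refers to them, so they are silently dropped.
shell_form=$(sh -c 'echo entrypoint' cmd-arg1 cmd-arg2)

# A script that explicitly forwards "$@" does see them.
# (The first extra word fills $0, hence the placeholder "sh".)
forwarding=$(sh -c 'echo entrypoint "$@"' sh cmd-arg1 cmd-arg2)

echo "$shell_form"
echo "$forwarding"
```

The first command prints only entrypoint, while the second prints entrypoint cmd-arg1 cmd-arg2.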
This brings you to the point of trying to find simpler alternatives to doing this. One specific observation is that the substitution here isn't doing anything; <(echo something) will always produce the fixed string something and you can do it without the substitution. If you can avoid the substitution then you don't need the shell either:
ENTRYPOINT ["/bin/bash", "--rcfile", "./mydir/scripttosource.sh"]
Another sensible approach here is to use an entrypoint wrapper script. This uses the ENTRYPOINT to run a shell script that does whatever initialization is needed, then runs exec "$@" to replace itself with the main container command. In particular, if you use the shell . command to set environment variables (equivalent to the bash-specific source), those will "stick" for the main container process.
#!/bin/sh
# entrypoint.sh
# read the file that sets variables
. ./mydir/scripttosource.sh
# run the main container command
exec "$@"
# Dockerfile
# may be part of some other COPY
COPY entrypoint.sh ./
# must be JSON-array syntax
ENTRYPOINT ["./entrypoint.sh"]
CMD ???
This should have the same net effect. If you get a debugging shell with docker run --rm -it your-image bash, it will run under the entrypoint wrapper and see the environment variables. You can do other setup in the wrapper script if required. This particular setup also doesn't use any bash-specific options, and might run better under minimal Alpine-based images.
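
The wrapper pattern can be tried without building an image. In this sketch the sourced file is a stand-in for ./mydir/scripttosource.sh, the GREETING variable is invented, and the final sh -c command plays the role of the CMD that Docker would append.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for ./mydir/scripttosource.sh
cat > "$workdir/scripttosource.sh" <<'EOF'
export GREETING=hello
EOF

# The wrapper: source the file, then replace the shell with the main command.
# The heredoc is unquoted so $workdir expands now; \$@ stays literal as "$@".
cat > "$workdir/entrypoint.sh" <<EOF
#!/bin/sh
. "$workdir/scripttosource.sh"
exec "\$@"
EOF

# Roughly what `docker run image sh -c '...'` does with a JSON-array ENTRYPOINT.
wrapper_output=$(sh "$workdir/entrypoint.sh" sh -c 'echo "$GREETING from the main command"')
echo "$wrapper_output"
rm -rf "$workdir"
```

The main command sees GREETING even though it never sourced anything itself, because it inherits the wrapper's environment through exec.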

Insertion of single quotes can be avoided by using escape characters in the third argument to ENTRYPOINT.
ENTRYPOINT ["/bin/bash", "-rcfile","$(echo '. ./mydir/scripttosource.sh')"]

How do I run the eval $(envkey-source) command in docker using Dockerfile?

I want to run a command, eval $(envkey-source) for setting certain environment variables using envkey. I install it, set my ENVKEY variable and then try to import all the environment variables. I do this all via Docker. However, docker is giving an error in this command:
Step 31/35 : RUN eval $(envkey-source)
---> Running in 6a9ebf1ede96
/bin/sh: 1: export: : bad variable name
The command '/bin/sh -c eval $(envkey-source)' returned a non-zero code: 2
I tried reading the envkey documentation, but it says nothing about Docker.
I have installed envkey using following commands:
ENV ENVKEY=yada_yada
RUN curl -s https://raw.githubusercontent.com/envkey/envkey-source/master/install.sh | bash
Up to here, all goes well. I get verbose suggestions on the console about how to run envkey to get all the environment variables set.
The problem comes at this step:
RUN eval $(envkey-source)
The error:
Step 31/35 : RUN eval $(envkey-source)
---> Running in 6a9ebf1ede96
/bin/sh: 1: export: : bad variable name
The command '/bin/sh -c eval $(envkey-source)' returned a non-zero code: 2

You can't do this, for a couple of reasons. The envkey documentation eventually links to an example in their GitHub which you might find informative.
Each Dockerfile RUN command runs a new shell in a new container. In particular, environment variables set within a RUN command are lost after it exits. Any form of RUN export ... is a no-op. If variables are static you can set them using the ENV directive, but in this case where you're running a program that needs to generate them dynamically, you need another approach.
A typical pattern here is to use a shell script as your container's ENTRYPOINT. That does some initial setup and then replaces itself with the container's CMD. Since the CMD runs in the same shell environment as the rest of the script, you can do dynamic variable setup here. The script might look like:
#!/bin/sh
eval "$(envkey-source)"
exec "$@"
The other thing to keep in mind here is that anyone can docker inspect your image and get its environment variables back out, or docker run imagename /usr/bin/env. If you could run envkey-source in the Dockerfile then the environment variables would be available in the image in clear text, which defeats the purpose. Even embedding the key in the image effectively leaks it. You should pass this at runtime using a docker run -e option or a Docker Compose environment: key, relaying it from the host's environment.
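
As a rough illustration of the eval pattern in that wrapper, the sketch below uses a made-up function, fake_env_source, in place of envkey-source. It assumes only that the real tool prints shell export statements on stdout; the variable names and values are invented.

```shell
# Hypothetical stand-in for envkey-source: prints `export` statements on stdout.
fake_env_source() {
  printf "export API_URL='https://example.invalid' DB_NAME='demo'\n"
}

# Quoting the substitution keeps the output intact as a single string for eval.
eval "$(fake_env_source)"

# A child process (standing in for the container's CMD) inherits the variables.
child_view=$(sh -c 'echo "$API_URL $DB_NAME"')
echo "$child_view"
```

Because the variables are set in the same shell that later runs exec, the main process inherits them. Nothing is written into the image.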

Pass ENV in docker run command

Is there a way to pass a variable at run time? In this example I want to pass a list of animals into an entrypoint.sh file, using ENV animals="turtle, monkey, goose".
But I want to be able to pass different animals when running the container, for example docker run -t image animals="mouse,rat,kangaroo".
How do you go about passing arguments when running the docker run command?
The goal is to take that variable from the docker run command and insert it into the entrypoint.sh file.
Right now I hard-code it in my Dockerfile, but I want to be able to do this when running the docker run command so I don't always have to change the Dockerfile.
FROM anapsix/alpine-java:8u121b13_jdk
ENV FILE_NAME="file_to_run.zip"
ENV animals="turtle, monkey, goose"
ADD ${FILE_NAME} .
RUN echo "${FILE_NAME} ${animals}" > ./entrypoint.sh
CMD [ "/bin/ash", "./entrypoint.sh" ]

It looks like you might be confusing the image build with the container run. If the difference between the two isn't immediately clear, I'd recommend reviewing some other questions and docs like:
In Docker, what's the difference between a container and an image?
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
RUN echo "${FILE_NAME} ${animals}" > ./entrypoint.sh
With the above, the variables will be expanded during the image build. The entrypoint.sh will not contain ${FILE_NAME} ${animals}. Instead, it will contain
file_to_run.zip turtle, monkey, goose
After the build, the docker run command will create a container from that image and run the above script with the environment variables defined but never used, since the script already has the variables expanded. To prevent the variable expansion, you need to escape the $ or use single quotes, e.g.
RUN echo "\${FILE_NAME} \${animals}" > ./entrypoint.sh
or
RUN echo '${FILE_NAME} ${animals}' > ./entrypoint.sh
I would also recommend being explicit with a #!/bin/ash at the top of this script. Then when you run the script, do not override the command with parameters after the image name. Instead set the environment variables with the appropriate flag to run:
docker run -it -e animals="mouse,rat,kangaroo" image
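
The difference between the two quoting styles can be checked in an ordinary shell. This is a sketch rather than the asker's exact setup: the generated scripts contain an echo so they print something, and the file names baked.sh and deferred.sh are invented.

```shell
workdir=$(mktemp -d)
animals="turtle, monkey, goose"   # plays the role of the Dockerfile ENV value

# Double quotes: expanded now ("build time"); the file holds the literal list.
echo "echo ${animals}" > "$workdir/baked.sh"

# Single quotes: the reference itself is written; it expands when the script runs.
echo 'echo ${animals}' > "$workdir/deferred.sh"

# "Run time": override the variable, as `docker run -e animals=...` would.
baked=$(animals="mouse,rat,kangaroo" sh "$workdir/baked.sh")
deferred=$(animals="mouse,rat,kangaroo" sh "$workdir/deferred.sh")

echo "baked:    $baked"
echo "deferred: $deferred"
rm -rf "$workdir"
```

The baked script still prints the build-time list, while the deferred one picks up the value supplied at run time.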

Simplest way, forward individual variables:
docker run ... --env animals="turtle, monkey, goose" --env FILE_NAME="file_to_run.zip"
Forward several variables using file:
Or if you need to grab all your environment variables from outside, you can do something like this first:
printenv | grep -E 'animals|FILE_NAME' > my-env
The grep is because Docker doesn't like some variables, e.g. with spaces in them, which you might possibly have in your real environment.
Then use that file in your Docker command:
docker run ... --env-file ./my-env
The latter is also useful if you want to avoid sending environment variables to logs (like for sensitive variables). I use this approach in a CI/CD pipeline that runs some scripts.
Using variables inside Docker:
With either approach, the environment variables actually become available to scripts running inside the container to use.
@BMitch's answer has more complete details about how to achieve this in your case, where you have related logic in both build and execution.
Reference
See docs here.

Parse a variable with the result of a command in DockerFile

I need to fill a variable in a Dockerfile with the result of a command,
like var=$(date) in bash.
EDIT 1
date is just an example.
In my case I use FROM phusion/baseimage:0.9.17, and I want each build to use the latest version, so I use this:
curl -v --silent api.github.com/repos/phusion/baseimage-docker/tags 2>&1 | grep -oh 'rel-.*",' | head -1 | sed 's/",//' | sed 's/rel-//' ==> 0.9.17.
But I don't know how to put that result into a variable in the Dockerfile, like this:
ENV verbaseimage=curl...
FROM phusion/baseimage:$verbaseimage
RESULT
In my use case, this works:
FROM phusion/baseimage:latest
But the question remains unresolved for other cases.

I had the same issue and found a way to set an environment variable from the result of a command by using a RUN command in the Dockerfile.
For example, I need to set SECRET_KEY_BASE for a Rails app just once, without it changing on every run as it would with:
docker run -e SECRET_KEY_BASE="$(openssl rand -hex 64)"
Instead, I write a line like this in the Dockerfile:
RUN bash -l -c 'echo export SECRET_KEY_BASE="$(openssl rand -hex 64)" >> /etc/bash.bashrc'
and my env variable is available to root, even after a bash login.
Or maybe:
RUN /bin/bash -l -c 'echo export SECRET_KEY_BASE="$(openssl rand -hex 64)" > /etc/profile.d/docker_init.sh'
Then the variable is available in CMD and ENTRYPOINT commands, provided they run through a login shell that reads /etc/profile.d.
Docker caches it as a layer, and it changes only if you change an instruction before it.
You can also try other ways to set environment variables.

The old workaround is mentioned here (issue 2637: Feature request: expand Dockerfile ENV $VARIABLES in WORKDIR):
One work around that I've used, is to have a file in my context called "build-env". What I do is source it and run my desired command in the same RUN step. So for example:
build-env:
VERSION=stable
Dockerfile:
FROM radial/axle-base:latest
ADD build-env /build-env
RUN source build-env && mkdir /$VERSION
RUN ls /
But for date, that might not be as precise as you want.
Other workarounds are in issue 2022 "Dockerfile with variable interpolation".
In docker 1.9 (end of October 2015), you will have "support for build-time environment variables to the 'build' API (PR 9176)" and "Support for passing build-time variables in build context (PR 15182)".
docker build --build-arg=[]: Set build-time variables
You can use ENV instructions in a Dockerfile to define variable values. These values persist in the built image. However, often persistence is not what you want. Users want to specify variables differently depending on which host they build an image on.
A good example is http_proxy or source versions for pulling intermediate files. The ARG instruction lets Dockerfile authors define values that users can set at build-time using the --build-arg flag:
$ docker build --build-arg HTTP_PROXY=http://10.20.30.2:1234 .
This flag allows you to pass the build-time variables that are accessed like regular environment variables in the RUN instruction of the Dockerfile.
Also, these values don't persist in the intermediate or final images like ENV values do.
so I want at each building use the last version so I use this
curl -v --silent api.github.com/repos/phusion/baseimage-docker/tags 2>&1 | grep -oh 'rel-.*",' | head -1 | sed 's/",//' | sed 's/rel-//' ==> 0.9.17.
If you want to use the last version of that image, all you need to do is use the tag 'latest' with the FROM directive:
FROM phusion/baseimage:latest
See also "The misunderstood Docker tag: latest": it doesn't always reference the actual latest build, but in this instance, it should work.
If you really want to use the curl|parse option, use it to generate a Dockerfile with the right value (as in a template processed to generate the right file).
Don't try to use it directly in the Dockerfile.
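
A minimal sketch of that template approach might look like the following. The template name Dockerfile.in and the @BASE_VERSION@ placeholder are invented for the example, and the version is hard-coded so the sketch runs offline. In practice it would come from the curl pipeline in the question.

```shell
workdir=$(mktemp -d)

# Hypothetical template with a placeholder token.
cat > "$workdir/Dockerfile.in" <<'EOF'
FROM phusion/baseimage:@BASE_VERSION@
EOF

# Assumption: this value would normally be computed before the build,
# e.g. by the curl | grep | sed pipeline shown in the question.
base_version="0.9.17"

# Substitute the placeholder and write the real Dockerfile.
sed "s/@BASE_VERSION@/${base_version}/" "$workdir/Dockerfile.in" > "$workdir/Dockerfile"

generated=$(cat "$workdir/Dockerfile")
echo "$generated"
rm -rf "$workdir"
```

The generated file is an ordinary Dockerfile, so docker build needs no special support for computed values.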

I wanted to set an ENV or LABEL variable from a computation in the Dockerfile, e.g. to make some computed installation options visible in docker inspect.
There does not seem to be any way to do that, and this issue suggests that it's a security design choice.
A Dockerfile can set an ENV variable to $X, ${X:-default}, or ${X:+substitute} where that $X must be another ENV or ARG variable.
A single RUN command can set and use shell variables, but that goes away at the end of the RUN command when that container layer shuts down.
A RUN command can write computed data into files, but the Dockerfile still can't get that data into an ENV or LABEL even if the file is ~/.bashrc. (File contents can, of course, be used by code running in the Container.)
The build can at least RUN echo $X to record choices to the build log -- unless that step comes from the build cache, in which case the RUN step doesn't run.
Please do correct me if there's a way out.

Partially connected to the question: if one wants to use the result of some command later on, it is possible within a single RUN statement, as follows:
RUN CUR_DIR=`pwd` && \
echo $CUR_DIR

Docker multiple entrypoints

Say I have the following Dockerfile:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y apache2
RUN apt-get install -y mongod #pretend this exists
EXPOSE 80
ENTRYPOINT ["/usr/sbin/apache2"]
The ENTRYPOINT command makes it so that apache2 starts when the container starts. I also want to be able to start mongod when the container starts, with the command service mongod start. According to the documentation, however, there must be only one ENTRYPOINT in a Dockerfile. What would be the correct way to do this then?

As Jared Markell said, if you want to launch several processes in a Docker container, you have to use supervisor. You will have to configure supervisor to tell it to launch your different processes.
I wrote about this in this blog post, but you have a really nice article here detailing how and why to use supervisor in Docker.
Basically, you will want to do something like:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y apache2
RUN apt-get install -y mongod #pretend this exists
RUN apt-get install -y supervisor # Installing supervisord
ADD supervisord.conf /etc/supervisor/conf.d/supervisord.conf
EXPOSE 80
ENTRYPOINT ["/usr/bin/supervisord"]
And add a configuration file, supervisord.conf:
[supervisord]
nodaemon=true
[program:mongodb]
command=/etc/mongod/mongo #To adapt, I don't know how to launch your mongodb process
[program:apache2]
command=/usr/sbin/apache2 -DFOREGROUND
EDIT: As this answer has received quite a lot of upvotes, I want to add a warning: using Supervisor is not considered a best practice for running several jobs. Instead, you may be interested in creating several containers for your different processes and managing them through Docker Compose.
In a nutshell, Docker Compose allows you to define in one file all the containers needed for your app and launch them in one single command.

My solution is to throw individual scripts into /opt/run/ and execute them with:
#!/bin/bash
LOG=/var/log/all
touch $LOG
for a in /opt/run/*
do
$a >> $LOG &
done
tail -f $LOG
And my entry point is just the location of this script, say it's called /opt/bin/run_all:
ADD 00_sshd /opt/run/
ADD 01_nginx /opt/run/
ADD run_all /opt/bin/
ENTRYPOINT ["/opt/bin/run_all"]

The simple answer is that you should not, because it breaks the single responsibility principle: one container, one service. Imagine that you want to spawn additional cloud images of MongoDB because of a sudden workload. Why increase the number of Apache2 instances as well, and at a 1:1 ratio?
Instead, you should link the boxes and make them speak through TCP. See https://docs.docker.com/userguide/dockerlinks/ for more info.

Typically, you would not do this. It is an anti-pattern because:
You typically have different update cycles for the two processes
You may want to change base filesystems for each of these processes
You want logging and error handling for each of these processes that are independent of each other
Outside of a shared network or volume, the two processes likely have no other hard dependencies
Therefore the best option is to create two separate images, and start the two containers with a compose file that handles the shared private network.
If you cannot follow that best practice, then you end up in a scenario like the following. The parent image contains a line:
ENTRYPOINT ["/entrypoint-parent.sh"]
and you want to add the following to your child image:
ENTRYPOINT ["/entrypoint-child.sh"]
Then the value of ENTRYPOINT in the resulting image is replaced with /entrypoint-child.sh, in other words, there is only a single value for ENTRYPOINT. Docker will only call a single process to start your container, though that process can spawn child processes. There are a couple techniques to extend entrypoints.
Option A: Call your entrypoint, and then run the parent entrypoint at the end, e.g. /entrypoint-child.sh could look like:
#!/bin/sh
echo Running child entrypoint initialization steps here
/usr/bin/mongodb ... &
exec /entrypoint-parent.sh "$@"
The exec part is important: it replaces the current shell with the /entrypoint-parent.sh shell or process, which removes issues with signal handling. The result is that you run the first bit of initialization in the child entrypoint and then delegate to the original parent entrypoint. This does require that you keep track of the name of the parent entrypoint, which could change between versions of your base image. This also means you lose error handling and graceful termination on mongodb, since it is run in the background. This could result in a falsely healthy container and data loss, neither of which I would recommend for a production environment.
Option B: Run the parent entrypoint in the background. This is less than ideal since you will no longer have error handling on the parent process unless you take some extra steps. At the simplest, this looks like the following in your /entrypoint-child.sh:
#!/bin/sh
# other initialization steps
/entrypoint-parent.sh "$@" &
# potentially wait for parent to be running by polling
# run something new in the foreground, that may depend on parent processes
exec /usr/bin/mongodb ...
Note, the "$@" notation I keep using passes the value of CMD through as arguments to the parent entrypoint.
Option C: Switch to a tool like supervisord. I'm not a huge fan of this since it still implies running multiple daemons inside your container, and you are usually best to split that into multiple containers. You need to decide what the proper response is when a single child process keeps failing.
Option D: Similar to Options A and B, I often create a directory of entrypoint scripts that can be extended at different levels of the image build. The entrypoint itself is unchanged, I just add new files into a directory that gets called sequentially based on the filename. In my scenarios, these scripts are all run in the foreground, and I exec the CMD at the end. You can see an example of this in my base image repo, in particular the entrypoint.d directory and bin/entrypointd.sh script which includes the section:
# ...
for ep in /etc/entrypoint.d/*; do
ext="${ep##*.}"
if [ "${ext}" = "env" -a -f "${ep}" ]; then
# source files ending in ".env"
echo "Sourcing: ${ep}"
set -a && . "${ep}" && set +a
elif [ "${ext}" = "sh" -a -x "${ep}" ]; then
# run scripts ending in ".sh"
echo "Running: ${ep}"
"${ep}"
fi
done
# ...
# run command with exec to pass control
echo "Running CMD: $@"
exec "$@"
However, the above is more for extending the initialization steps, and not for running multiple daemons inside the container. Given the bad options and issues they each have, I hope it's clear why running two containers would be preferred in your scenario.

I was not able to get the usage of && to work. I was able to solve this as described here: https://stackoverflow.com/a/19872810/2971199
So in your case you could do:
RUN echo "/usr/sbin/apache2" >> /etc/bash.bashrc
RUN echo "/path/to/mongodb" >> /etc/bash.bashrc
ENTRYPOINT ["/bin/bash"]
You may need/want to edit your start commands.
Be careful if you run your Dockerfile more than once, you probably don't want multiple copies of commands appended to your bash.bashrc file. You could use grep and an if statement to make your RUN command idempotent.
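
One possible shape for that idempotent append is sketched below. The helper name append_once is made up, and a temporary file stands in for /etc/bash.bashrc so the sketch can be run safely.

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
  # $1 = line to add, $2 = target file
  # grep flags: -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

rcfile=$(mktemp)   # stands in for /etc/bash.bashrc

append_once "/usr/sbin/apache2" "$rcfile"
append_once "/usr/sbin/apache2" "$rcfile"   # second call adds nothing
append_once "/path/to/mongodb" "$rcfile"

line_count=$(wc -l < "$rcfile" | tr -d ' ')
echo "$line_count"
rm -f "$rcfile"
```

Running the same append twice leaves a single copy of the line, so the file ends up with two lines in total.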

You can't specify multiple entry points in a Dockerfile. To run multiple servers in the same docker container you must use a command that will be able to launch your servers. Supervisord has already been cited but I could also recommend multirun, a project of mine which is a lighter alternative.

There is an answer in docker docs:
https://docs.docker.com/config/containers/multi-service_container/
But in short
If you need to run more than one service within a container, you can accomplish this in a few different ways.
The first is to run a script which manages your processes.
The second is to use a process manager like supervisord.

I can think of several ways:
you can write a script to put on the container (ADD) that does all the startup commands, then put that in the ENTRYPOINT
I think you can put any shell commands on the ENTRYPOINT, so you can do service mongod start && /usr/sbin/apache2

If you are trying to run multiple concurrent npm scripts such as a watch script and a build script for example, check out:
How can I run multiple npm scripts in parallel?
