Commands in Docker ENTRYPOINT - docker

Is there a way to execute a command as an argument in a Dockerfile ENTRYPOINT? I am creating an image that should automatically run mpirun for the number of processors, i.e., mpirun -np $(nproc) or mpirun -np $(getconf _NPROCESSORS_ONLN).
The following line works:
ENTRYPOINT ["/tini", "--", "mpirun", "-np", "4"] # works
But I cannot get an adaptive form to work:
ENTRYPOINT ["/tini", "--", "mpirun", "-np", "$(nproc)"] # doesn't work
ENTRYPOINT ["/tini", "--", "mpirun", "-np", "$(getconf _NPROCESSORS_ONLN)"] # doesn't work
Using the backtick `nproc` notation does not work either. Nor can I pass an environment variable to the command.
ENV processors 4
ENTRYPOINT ["/tini", "--", "mpirun", "-np", "$processors"] # doesn't work
Has anyone managed to get this kind of workflow?

Those likely won't work: see issue 4783
ENTRYPOINT and CMD are special, as they get started without a shell (so you can choose your own) and iirc they are escaped too.
Unlike the shell form, the exec form does not invoke a command shell.
This means that normal shell processing does not happen.
For example, ENTRYPOINT [ "echo", "$HOME" ] will not do variable substitution on $HOME.
If you want shell processing then either use the shell form or execute a shell directly, for example: ENTRYPOINT [ "sh", "-c", "echo", "$HOME" ].
A workaround would be to use a script.
COPY docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
That script, when docker run triggers it, should at least benefit from the environment variable.
See for example the Dockerfile of vromero/activemq-artemis-docker, which runs the script docker-entrypoint.sh.
In order to allow CMD to run as well, the scripts end with:
exec "$#"
(It will execute whatever parameter comes after, either from the CMD directive, or from docker run parameters)
The OP Gilly adds in the comments:
I use in the Dockerfile:
COPY docker-entrypoint.sh
ENTRYPOINT ["/tini", "--", "/docker-entrypoint.sh"]
And in the entrypoint script:
#!/bin/bash
exec mpirun -np $(nproc) "$#"

It is because you are using the exec form for your entry point and variable substitution will not happen in the exec form.
This is the exec form:
ENTRYPOINT ["executable", "param1", "param2"]
this is the shell form:
ENTRYPOINT command param1 param2
From the official documentation:
Unlike the shell form, the exec form does not invoke a command shell. This means that normal shell processing does not happen. For example, ENTRYPOINT [ "echo", "$HOME" ] will not do variable substitution on $HOME

Related

Dockerfile - exec form of ENTRYPOINT and shell form of CMD

I am looking at Docker's documentation to understand what would be behavior of ENTRYPOINT defined in exec form and CMD defined in shell form.
The example in the docs only shows something like exec_entry p1_entry /bin/sh -c exec_cmd p1_cmd which does not tell me anything.
For example, what if we had:
ENV JAVA_OPTS '-XX:+UseG1GC -Xms512m -Xmx1536m'
ENTRYPOINT ["java"]
CMD $JAVA_OPTS -jar app.jar
Would the problem with signal propagation exist here (in other words, would any extra subshell be spawned here)?
If either ENTRYPOINT or CMD are not JSON arrays, they are interpreted as strings and converted to a length-3 array ["/bin/sh", "-c", "..."].
The resulting two lists are concatenated.
So in your example, the final command list would be
["java", "/bin/sh", "-c", "$JAVA_OPTS -jar app.jar"]
or in Bourne shell syntax
java /bin/sh -c '$JAVA_OPTS -jar app.jar'
This passes the shell interpreter /bin/sh as an argument to java; that almost certainly is not what you intend.
If the CMD is anything other than a complete command, it must use the JSON-array syntax, which in turn means it can't use any shell features and it can't expand environment variable references. This would include both the "container-as-command" pattern where ENTRYPOINT is the command to run and CMD its arguments, and the antipattern you show here where ENTRYPOINT is the interpreter only (and you have to repeat the -jar app.jar option in a docker run command override).
I prefer a setup where CMD is always a complete shell command. If you have an ENTRYPOINT at all, it's a script that does some startup-time setup and then runs exec "$#" to run the command passed as arguments. This can accept either form of CMD.
# ENTRYPOINT ["./docker-entrypoint.sh"] # optional
CMD java $JAVA_OPTS -jar app.jar # in a single shell-format CMD
It took me a while but I finally figured out what is the point of /bin/sh -c in this use-case. We can for example use tini as an entrypoint
ENTRYPOINT ["tini", "--"]
and then a shell for of CMD, but we need to use exec in order to replace a subshell, that is
CMD exec java $JAVA_OPTS -jar app.jar

CMD and ENTRYPOINT with script, same Dockerfile

Trying to run a pod based on an image with this Dockerfile:
...
ENTRYPOINT [ "./mybashscript", ";", "flask" ]
CMD [ "run" ]
I would be expecting the full command to be ./mybashscript; flask run.
However, in this example, the pod / container executes ./mybashscript but not flask.
I also tried a couple of variations like:
...
ENTRYPOINT [ "/bin/bash", "-c", "./mybashscript && flask" ]
CMD [ "run" ]
Now, flask gets executed but run is ignored.
PS: I am trying to understand why this doesn't work and I am aware that I can fit all into the entrypoint or shove everything inside the bash script, but that is not the point.
In both cases you show here, you use the JSON-array exec form for ENTRYPOINT and CMD. This means no shell is run, except in the second case where you run it explicitly. The two parts are just combined together into a single command.
The first construct runs the script ./mybashscript, which must be executable and have a valid "shebang" line (probably #!/bin/bash). The script is passed three arguments, which you can see in the shell variables $1, $2, and $3: a semicolon ;, flask, and run.
The second construct runs /bin/sh -c './mybashscript && flask' run. sh -c takes a single argument, which is mybashscript && flask; the remaining argument run is interpreted as a positional argument, and the sh -c command would see it as $0.
The arbitrary split of ENTRYPOINT and CMD you show doesn't really make sense. The only really important difference between the two is that it is easier to change CMD when you run the container, for example by putting it after the image name in a docker run command. It makes sense to put all of the command in the command part, or none of it, but not really to put half of the command in one part and half in another.
My first pass here would be to write:
# no ENTRYPOINT
CMD ./mybashscript && flask run
Docker will insert a sh -c wrapper for you in bare-string shell form, so the && has its usual Bourne-shell meaning.
This setup looks like you're trying to run an initialization script before the main container command. There's a reasonably standard pattern of using an ENTRYPOINT for this. Since it gets passed the CMD as parameters, the script can end with exec "$#" to run the CMD (potentially as overridden in the docker run command). The entrypoint script could look like
#!/bin/sh
# entrypoint.sh
./mybashscript
exec "$#"
(If you wrote mybashscript, you could also end it with the exec "$#" line, and use that script as the entrypoint.)
In the Dockerfile, set this wrapper script as the ENTRYPOINT, and then whatever the main command is as the CMD.
ENTRYPOINT ["./entrypoint.sh"] # must be a JSON array
CMD ["flask", "run"] # can be either form
If you provide an alternate command, it replaces CMD, and so the exec "$#" line will run that command instead of what's in the Dockerfile, but the ENTRYPOINT wrapper still runs.
# See the environment the wrapper sets up
docker run --rm your-image env
# Double-check the data directory setup
docker run --rm -v $PWD/data:/data your-image ls -l /data
If you really want to use the sh -c form and the split ENTRYPOINT, then the command inside sh -c has to read $# to find its positional arguments (the CMD), plus you need to know the first argument is $0 and not $1. The form you show would be functional if you wrote
# not really recommended but it would work
ENTRYPOINT ["/bin/sh", "-c", "./mybashscript && flask \"$#\"", "flask"]
CMD ["run"]

How to use environment variables in CMD for ENTRYPOINT arguments in Dockerfile?

I have a Dockerfile where I start a executable with default arguments like this:
ENTRYPOINT ["executable", "cmd"]
CMD ["--param1=1", "--param2=2"]
This works fine and I can run the container with default arguments:
docker run image_name
or with custom arguments:
docker run image_name --param1=a --param2=2
Now i would like to have a default parameter depend on a environment variable or default to the deafult value (1) like this:
--param1='${PARAM1:-1}'
I Understand that
ENTRYPOINT ["executable", "cmd"]
CMD ["--param1='${PARAM1:-1}'", "--param2=2"]
does not work since CMD is in exec form and does not invoke a command shell and cannot substitute environment variables.
But if I use CMD in shell form:
ENTRYPOINT ["executable", "cmd"]
CMD "--param1='${PARAM1:-1}' --param2=2"
I get no such option: -c
So my question is:
How get I archive environment variable substitution within the default arguments in CMD for my ENTRYPOINT?
One way would be to lose the CMD and wrap all the defaults up in a custom entrypoint. I try to avoid doing this, but sometimes it seems like the cleanest way, and you can be a lot more flexible:
Dockerfile:
COPY 'my-entrypoint.sh' '/somewhere/in/path/my-entrypoint'
ENTRYPOINT ['my-entrypoint']
my-entrypoint.sh
#!/bin/sh
ARGS="${#}"
if [ -z "${ARGS}" ]; then
ARGS="--param1=${PARAM1:-1} --param2=2"
fi
executable cmd $ARGS
You can't do this the way you describe, for the reasons you've laid out in the question. The ENTRYPOINT and CMD simply get concatenated together to form a single command line, and if either or both of those parts is a string rather than a JSON array it gets automatically converted to sh -c 'the string'.
ENTRYPOINT ["executable", "cmd"]
CMD "--param1='${PARAM1:-1}' --param2=2"
# Equivalently:
ENTRYPOINT ["executable", "cmd", "/bin/sh", "-c", "\"--param1=...\""]
CMD []
There are two techniques I'd suggest to work around this problem, though both require potentially substantial changes in the setup.
In Docker and Kubernetes, it turns out to generally be more convenient to pass options via environment variables than on the command line. This means your application needs to know to look for those variables, and supply some of the defaults you describe here. Some argument-parsing libraries support this out-of-the-box, but not all. Python's standard argparse library, for example, doesn't directly have environment-variable support, but you can still easily support them:
import argparse
import os
parser = argparse.ArgumentParser()
parser.add_argument('param1', default=os.environ.get('PARAM1', '1'))
args = parser.parse_args()
print(args.param1)
# Uses --param1 option, or else $PARAM1 variable, or else default "1"
The other approach I generally recommend is to make CMD a well-formed shell command; don't try to split the command between CMD and ENTRYPOINT. This avoids the problem of Docker inserting the sh -c wrapper in the middle of the line.
# no ENTRYPOINT
CMD executable cmd --param1="${PARAM1:-1}" --param2=2
The ENTRYPOINT pattern that I do find useful is to use a wrapper script to provide defaults and do other first-time setup. If that script is a Bourne shell script and ends with exec "$#", then it will run the CMD as the main container process.
#!/bin/sh
# docker-entrypoint.sh
# In Docker specifically, default $PARAM1 to "docker", not "1".
: ${PARAM1:=docker}
# Run the main container command.
exec "$#"
ENTRYPOINT ["/docker-entrypoint.sh"] # must be a JSON array
CMD executable cmd --param2=2
(There is no requirement to have an ENTRYPOINT. Making ENTRYPOINT be an interpreter and putting the script name in CMD doesn't bring any benefit, and makes it harder to run debugging commands like docker run --rm my-image ls -l /app.)

How are CMD and ENTRYPOINT exec forms in a docker file parsed?

I am running Jupyter in a Docker container. The following shell form will run fine:
CMD jupyter lab --ip='0.0.0.0' --port=8888 --no-browser --allow-root /home/notebooks
But the following one on docker file will not:
ENTRYPOINT ["/bin/sh", "-c"]
CMD ["jupyter", "lab", "--ip='0.0.0.0'", "--port=8888", "--no-browser", "--allow-root", "/home/notebooks"]
The error is:
usage: jupyter [-h] [--version] [--config-dir] [--data-dir] [--runtime-dir] [--paths] [--json] [subcommand]
jupyter: error: one of the arguments --version subcommand --config-dir --data-dir --runtime-dir --paths is required
So obviously /bin/sh -c sees the jupyter argument, but not the following ones.
Interestingly,
CMD ["jupyter", "lab", "--ip='0.0.0.0'", "--port=8888", "--no-browser", "--allow-root", "/home/notebooks"]
will run fine, so it cannot be the number of arguments, or can it?
According to https://docs.docker.com/engine/reference/builder/#cmd, the shell form of CMD executes with /bin/sh -c. So from my point of view I see little difference in the 2 versions. But the reason must be how the exec forms are being evaluated when ENTRYPOINT and CMD are present at the same time.
At a very low level, Linux commands are executed as a series of "words". Typically your shell will take a command line like ls -l "a directory" and breaks that into three words ls -l a directory. (Note the space in "a directory": in the shell form that needs to be quoted to be in the same word.)
The Dockerfile CMD and ENTRYPOINT (and RUN) commands have two forms. In the form you've specified that looks like a JSON array, you are explicitly specifying how the words get broken up. If it doesn't look like a JSON array then the whole thing is taken as a single string, and wrapped in an sh -c command.
# Explicitly spelling out the words
RUN ["ls", "-l", "a directory"]
# Asking Docker to run it via a shell
RUN ls -l 'a directory'
# The same as
RUN ["sh", "-c", "ls -l 'a directory'"]
If you specify both ENTRYPOINT and CMD the two lists of words just get combined together. The important thing for your example is that sh -c takes the single next word and runs it as a shell command; any remaining words can be used as $0, $1, ... positional arguments within that command string.
So in your example, the final thing that gets run is more or less
ENTRYPOINT+CMD ["sh", "-c", "jupyter", ...]
# If the string "jupyter" contained "$1" it would expand to the --ip option
The other important corollary to this is that, practically, ENTRYPOINT can't be the bare-string format: when the CMD is appended to it you get
ENTRYPOINT some command
CMD with args
ENTRYPOINT+CMD ["sh", "-c", "some command", "sh", "-c", "with args"]
and by the same rule all of the CMD words get ignored.
In practice you almost never need to explicitly put sh -c or a SHELL declaration in a Dockerfile; use a string-form command instead, or put complex logic into a shell script.

Differences Between Dockerfile Instructions in Shell and Exec Form

What is the difference between shell and exec form for
CMD:
CMD python my_script.py arg
vs.
CMD ["python", "my_script.py", "arg"]
ENTRYPOINT:
ENTRYPOINT ./bin/main
vs.
ENTRYPOINT ["./bin/main"]
and RUN:
RUN npm start
vs.
RUN ["npm", "start"]
Dockerfile instructions?
There are two differences between the shell form and the exec form. According to the documentation, the exec form is the preferred form. These are the two differences:
The exec form is parsed as a JSON array, which means that you must use double-quotes (“) around words not single-quotes (‘).
Unlike the shell form, the exec form does not invoke a command shell. This means that normal shell processing does not happen. For example, CMD [ "echo", "$HOME" ] will not do variable substitution on $HOME. If you want shell processing then either use the shell form or execute a shell directly, for example: CMD [ "sh", "-c", "echo $HOME" ]. When using the exec form and executing a shell directly, as in the case for the shell form, it is the shell that is doing the environment variable expansion, not docker.
Some additional subtleties here are:
The exec form makes it possible to avoid shell string munging, and to RUN commands using a base image that does not contain the specified shell executable.
In the shell form you can use a \ (backslash) to continue a single RUN instruction onto the next line.
There is also a third form for CMD:
CMD ["param1","param2"] (as default parameters to ENTRYPOINT)
Additionally, the exec form is required for CMD if you are using it as parameters/arguments to ENTRYPOINT that are intended to be overwritten.
In the docker documentation, there is a table that summarizes how ENTRYPOINT and CMD interact when they are in the exec form or in the shell form :
Source : https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact

Resources