Accessing `docker run` arguments from the Dockerfile - docker

I run an image like this:
docker run <image_name> <config_file>
where config_file is the path to a JSON file containing my application's configuration.
Inside the Dockerfile, I do
ENTRYPOINT ["uwsgi", \
"--log-encoder", "json {\"msg\":\"${msg}\"}\n", \
"--http", ":80", \
"--master", \
"--wsgi-file", "app.py", \
"--callable", "app", \
"--threads", "10", \
"--pyargv"]
At the same time, I would like to access some of the values from the configuration file inside the Dockerfile, for example to configure uWSGI's JSON log encoder.
How can I do that?

This is impossible: the Dockerfile is processed at build time, while the config file is only supplied at run time with docker run. See the comment by @David Maze.
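A common workaround (a sketch only, not part of the original answer) is to move that logic into an entrypoint wrapper that parses the JSON when the container starts; jq and the msg key are assumptions for illustration:
#!/bin/sh
# entrypoint.sh (hypothetical): $1 is the <config_file> passed to docker run
CONFIG_FILE="$1"
# read a value from the JSON config at run time (requires jq in the image)
MSG=$(jq -r '.msg' "$CONFIG_FILE")
exec uwsgi \
    --log-encoder "json {\"msg\":\"${MSG}\"}\n" \
    --http :80 \
    --master \
    --wsgi-file app.py \
    --callable app \
    --threads 10 \
    --pyargv "$CONFIG_FILE"
With ENTRYPOINT ["/entrypoint.sh"] in the Dockerfile, the original docker run <image_name> <config_file> invocation keeps working.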

Related

Mounting folder in Singularity - which directory to mount

My input files for the Singularity program are not recognized (not found), which I think is because the directory is not mounted inside Singularity.
I know that mounting can be set with this command, but I am not sure which folders to mount:
export SINGULARITY_BIND="/somefolder:/somefolder"
How do I know which folders should be before and after the ":" sign in SINGULARITY_BIND?
I have set:
SINGULARITY_CACHEDIR=/mnt/scratch/username/software
which is where my Singularity cache is stored.
My complete script:
export SINGULARITY_CACHEDIR=/mnt/scratch/username/software
export SINGULARITY_BIND="/home/username:/mnt/scratch/username"
OUTPUT_DIR="${PWD}/quickstart-output"
INPUT_DIR="${PWD}/quickstart-testdata"
BIN_VERSION="1.4.0"
# Run DeepVariant.
singularity run \
docker://google/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/run_deepvariant \
--model_type=WGS \ **Replace this string with exactly one of the following [WGS,WES,PACBIO,HYBRID_PACBIO_ILLUMINA]**
--ref="${INPUT_DIR}"/ucsc.hg19.chr20.unittest.fasta \
--reads="${INPUT_DIR}"/NA12878_S1.chr20.10_10p1mb.bam \
--regions "chr20:10,000,000-10,010,000" \
--output_vcf="${OUTPUT_DIR}"/output.vcf.gz \
--output_gvcf="${OUTPUT_DIR}"/output.g.vcf.gz \
--intermediate_results_dir "${OUTPUT_DIR}/intermediate_results_dir" \ **Optional.
--num_shards=1 \ **How many cores the `make_examples` step uses. Change it to the number of CPU cores you have.**
My error:
singularity.clean.sh: line 23: --ref=/home/username/scratch/username/software/quickstart-testdata/ucsc.hg19.chr20.unittest.fasta: No such file or directory
If you want to bind mount your current working directory, you can use --bind $(pwd). If you want to bind mount your home directory, you can use --bind $HOME (note that singularity mounts the home directory by default). See the Singularity documentation for more information.
Based on your INPUT_DIR and OUTPUT_DIR, it seems like you can bind mount your current working directory. To do that, use --bind $(pwd). Note this argument goes before the name of the singularity container.
Just to be safe, also use --pwd $(pwd) to set the working directory in the container as the current working directory on the host.
OUTPUT_DIR="${PWD}/quickstart-output"
INPUT_DIR="${PWD}/quickstart-testdata"
BIN_VERSION="1.4.0"
singularity run --bind $(pwd) --pwd $(pwd) \
docker://google/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/run_deepvariant \
--model_type=WGS \
--ref="${INPUT_DIR}/ucsc.hg19.chr20.unittest.fasta" \
--reads="${INPUT_DIR}/NA12878_S1.chr20.10_10p1mb.bam" \
--regions "chr20:10,000,000-10,010,000" \
--output_vcf="${OUTPUT_DIR}/output.vcf.gz" \
--output_gvcf="${OUTPUT_DIR}/output.g.vcf.gz" \
--intermediate_results_dir "${OUTPUT_DIR}/intermediate_results_dir" \
--num_shards=1
The syntax of the --bind argument is path-on-host:path-in-container. And using --bind path is shorthand for --bind path:path, which means the source path is mounted as the same path in the container. This can be very useful, because one does not have to rewrite paths and think in terms of the container's directories.
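For example (the paths and container name below are illustrative only):
# remap a host folder to a different location inside the container
singularity exec --bind /mnt/scratch/username/data:/data my_container.sif ls /data
# shorthand: mount the current directory at the same path inside the container
singularity exec --bind "$(pwd)" my_container.sif ls "$(pwd)"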

Error parsing reference, invalid reference format

I have run into an error in my project while the GitLab pipeline was building the Docker image:
$ docker tag project $NEXUS_URL/project:${TAG_COMMIT}
Error parsing reference: "url:port/repository/project:BugFix-29643813" is not a valid repository/tag: invalid reference format
This is my Dockerfile:
FROM codestrongbiz/jdk16-maven-docker:latest
ENV SPRING_OUTPUT_ANSI_ENABLED=ALWAYS \
APP_SLEEP=0 \
JAVA_OPTS=""
# add directly the jar
ADD *.jar /app.jar
EXPOSE 8087
CMD echo "The application will start in ${APP_SLEEP}s..." && \
sleep ${APP_SLEEP} && \
java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /app.jar
Can anyone help?
After a lot of testing I realized that my source branch name must not contain an uppercase letter.
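If the uppercase branch name is indeed what breaks the reference, one possible fix (a sketch; TAG_COMMIT and NEXUS_URL come from the question, the tr call is an assumption) is to lower-case the branch-derived tag in the pipeline before tagging:
# lower-case the branch-derived tag so the resulting reference is valid
TAG_COMMIT=$(echo "$CI_COMMIT_REF_NAME" | tr '[:upper:]' '[:lower:]')
docker tag project "$NEXUS_URL/project:${TAG_COMMIT}"
GitLab also provides CI_COMMIT_REF_SLUG, which is already lower-cased and safe to use in image references.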

Unable to set environment variable inside docker container when calling sh file from Dockerfile CMD

I am following this link to create a Spark cluster. I am able to run the Spark cluster. However, I have to give an absolute path to start spark-shell. I am trying to set environment variables, i.e. PATH and a few others, in start-spark.sh. However, they are not being set inside the container. I tried printing them using printenv inside the container, but these variables are never reflected.
Am I trying to set environment variables incorrectly? Spark cluster is running successfully though.
I am using docker-compose.yml to build and recreate an image and container.
docker-compose up --build
Dockerfile
# builder step used to download and configure spark environment
FROM openjdk:11.0.11-jre-slim-buster as builder
# Add Dependencies for PySpark
RUN apt-get update && apt-get install -y curl vim wget software-properties-common ssh net-tools ca-certificates python3 python3-pip python3-numpy python3-matplotlib python3-scipy python3-pandas python3-simpy
# JDBC driver download and install
ADD https://go.microsoft.com/fwlink/?linkid=2168494 /usr/share/java
RUN update-alternatives --install "/usr/bin/python" "python" "$(which python3)" 1
# Fix the value of PYTHONHASHSEED
# Note: this is needed when you use Python 3.3 or greater
ENV SPARK_VERSION=3.1.2 \
HADOOP_VERSION=3.2 \
SPARK_HOME=/opt/spark \
PYTHONHASHSEED=1
# Download and uncompress spark from the apache archive
RUN wget --no-verbose -O apache-spark.tgz "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" \
&& mkdir -p ${SPARK_HOME} \
&& tar -xf apache-spark.tgz -C ${SPARK_HOME} --strip-components=1 \
&& rm apache-spark.tgz
My Dockerfile-spark
When setting SPARK_BIN="${SPARK_HOME}/bin/" under ENV in the Dockerfile, the environment variable does get set. It is visible inside the Docker container using printenv.
FROM apache-spark
WORKDIR ${SPARK_HOME}
ENV SPARK_MASTER_PORT=7077 \
SPARK_MASTER_WEBUI_PORT=8080 \
SPARK_LOG_DIR=${SPARK_HOME}/logs \
SPARK_MASTER_LOG=${SPARK_HOME}/logs/spark-master.out \
SPARK_WORKER_LOG=${SPARK_HOME}/logs/spark-worker.out \
SPARK_WORKER_WEBUI_PORT=8080 \
SPARK_MASTER="spark://spark-master:7077" \
SPARK_WORKLOAD="master"
COPY start-spark.sh /
CMD ["/bin/bash", "/start-spark.sh"]
start-spark.sh
#!/bin/bash
. "$SPARK_HOME/bin/load-spark-env.sh"
export SPARK_BIN="${SPARK_HOME}/bin/" # This doesn't work here
export PATH="${SPARK_HOME}/bin/:${PATH}" # This doesn't work here
# When the spark work_load is master run class org.apache.spark.deploy.master.Master
if [ "$SPARK_WORKLOAD" == "master" ];
then
export SPARK_MASTER_HOST=`hostname` # This works here
cd $SPARK_BIN && ./spark-class org.apache.spark.deploy.master.Master --ip $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT >> $SPARK_MASTER_LOG
My File structure is
dockerfile
dockerfile-spark # this uses pre-built image created by dockerfile
start-spark.sh # invoked by dockerfile-spark
docker-compose.yml # uses build parameter to build an image from dockerfile-spark
From inside the master container
root@3abbd4508121:/opt/spark# export
declare -x HADOOP_VERSION="3.2"
declare -x HOME="/root"
declare -x HOSTNAME="3abbd4508121"
declare -x JAVA_HOME="/usr/local/openjdk-11"
declare -x JAVA_VERSION="11.0.11+9"
declare -x LANG="C.UTF-8"
declare -x OLDPWD
declare -x PATH="/usr/local/openjdk-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
declare -x PWD="/opt/spark"
declare -x PYTHONHASHSEED="1"
declare -x SHLVL="1"
declare -x SPARK_HOME="/opt/spark"
declare -x SPARK_LOCAL_IP="spark-master"
declare -x SPARK_LOG_DIR="/opt/spark/logs"
declare -x SPARK_MASTER="spark://spark-master:7077"
declare -x SPARK_MASTER_LOG="/opt/spark/logs/spark-master.out"
declare -x SPARK_MASTER_PORT="7077"
declare -x SPARK_MASTER_WEBUI_PORT="8080"
declare -x SPARK_VERSION="3.1.2"
declare -x SPARK_WORKER_LOG="/opt/spark/logs/spark-worker.out"
declare -x SPARK_WORKER_WEBUI_PORT="8080"
declare -x SPARK_WORKLOAD="master"
declare -x TERM="xterm"
root@3abbd4508121:/opt/spark#
There are a couple of different ways to set environment variables in Docker, and a couple of different ways to run processes. A container normally runs one process, which is controlled by the image's ENTRYPOINT and CMD settings. If you docker exec a second process in the container, that does not run as a child process of the main process, and will not see environment variables that are set by that main process.
In the setup you show here, the start-spark.sh script is the main container process (it is the image's CMD). If you docker exec your-container printenv, it will see things set in the Dockerfile but not things set in this script.
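For example, in a container built from these files you would see something like this (the container name is an assumption):
# SPARK_HOME is set by ENV in the Dockerfile, so every process sees it
docker exec spark-master printenv SPARK_HOME     # /opt/spark
# SPARK_BIN is exported only inside start-spark.sh, so an exec'd process never sees it
docker exec spark-master printenv SPARK_BIN      # prints nothing (exit status 1)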
Things like filesystem paths will generally be fixed every time you run the container, no matter what command you're running there, so you can specify these in the Dockerfile:
# two ENV instructions, so ${SPARK_BIN} is already defined when PATH references it
ENV SPARK_BIN=${SPARK_HOME}/bin
ENV PATH=${SPARK_BIN}:${PATH}
You can specify both an ENTRYPOINT and a CMD in your Dockerfile; if you do, the CMD is passed as arguments to the ENTRYPOINT. This leads to a useful pattern where the CMD is a standard shell command, and the ENTRYPOINT is a wrapper that does first-time setup and then runs it. You can split your script into two:
#!/bin/sh
# spark-env.sh
. "${SPARK_BIN}/load-spark-env.sh"
exec "$@"
#!/bin/sh
# start-spark.sh
spark-class org.apache.spark.deploy.master.Master \
--ip "$SPARK_MASTER_HOST" \
--port "$SPARK_MASTER_PORT" \
--webui-port "$SPARK_MASTER_WEBUI_PORT"
Then in your Dockerfile specify both parts
COPY spark-env.sh start-spark.sh /
# ENTRYPOINT must use JSON-array syntax so the CMD is passed to it as arguments
ENTRYPOINT ["/spark-env.sh"]
# the CMD can be any valid CMD form
CMD ["/start-spark.sh"]
This is useful for your debugging since it's straightforward to override the CMD in a docker run or docker-compose run instruction, leaving the ENTRYPOINT in place.
docker-compose run spark \
printenv
This launches a new container based on all of the same Dockerfile setup. When it runs, it runs printenv instead of the CMD in the image. This will do the first-time setup in the ENTRYPOINT script, and then the final exec "$#" line will run printenv instead of starting the Spark application. This will show you the environment the application will have when it starts.

ARG or ENV, which one to use in this case?

This may be a trivial question, but reading the docs for ARG and ENV doesn't make things clear to me.
I am building a PHP-FPM container and I want to give users the ability to enable/disable some extensions according to their needs.
It would be great if this could be done in the Dockerfile by adding conditionals and passing flags on the build command, but AFAIK that is not supported.
In my case, my personal approach is to run a small script when the container starts, something like the following:
#!/bin/sh
set -e
RESTART="false"
# This script will be placed in /config/init/ and run when container starts.
if [ "$INSTALL_XDEBUG" == "true" ]; then
printf "\nInstalling Xdebug ...\n"
yum install -y php71-php-pecl-xdebug
RESTART="true"
fi
...
if [ "$RESTART" == "true" ]; then
printf "\nRestarting php-fpm ...\n"
supervisorctl restart php-fpm
fi
exec "$@"
This is what my Dockerfile looks like:
FROM reynierpm/centos7-supervisor
ENV TERM=xterm \
PATH="/root/.composer/vendor/bin:${PATH}" \
INSTALL_COMPOSER="false" \
COMPOSER_ALLOW_SUPERUSER=1 \
COMPOSER_ALLOW_XDEBUG=1 \
COMPOSER_DISABLE_XDEBUG_WARN=1 \
COMPOSER_HOME="/root/.composer" \
COMPOSER_CACHE_DIR="/root/.composer/cache" \
SYMFONY_INSTALLER="false" \
SYMFONY_PROJECT="false" \
INSTALL_XDEBUG="false" \
INSTALL_MONGO="false" \
INSTALL_REDIS="false" \
INSTALL_HTTP_REQUEST="false" \
INSTALL_UPLOAD_PROGRESS="false" \
INSTALL_XATTR="false"
RUN yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm \
https://rpms.remirepo.net/enterprise/remi-release-7.rpm
RUN yum install -y \
yum-utils \
git \
zip \
unzip \
nano \
wget \
php71-php-fpm \
php71-php-cli \
php71-php-common \
php71-php-gd \
php71-php-intl \
php71-php-json \
php71-php-mbstring \
php71-php-mcrypt \
php71-php-mysqlnd \
php71-php-pdo \
php71-php-pear \
php71-php-xml \
php71-pecl-apcu \
php71-php-pecl-apfd \
php71-php-pecl-memcache \
php71-php-pecl-memcached \
php71-php-pecl-zip && \
yum clean all && rm -rf /tmp/yum*
RUN ln -sfF /opt/remi/php71/enable /etc/profile.d/php71-paths.sh && \
ln -sfF /opt/remi/php71/root/usr/bin/{pear,pecl,phar,php,php-cgi,phpize} /usr/local/bin/. && \
mv -f /etc/opt/remi/php71/php.ini /etc/php.ini && \
ln -s /etc/php.ini /etc/opt/remi/php71/php.ini && \
rm -rf /etc/php.d && \
mv /etc/opt/remi/php71/php.d /etc/. && \
ln -s /etc/php.d /etc/opt/remi/php71/php.d
COPY container-files /
RUN chmod +x /config/bootstrap.sh
WORKDIR /data/www
EXPOSE 9001
Currently this is working, but... if I want to add, let's say, 20 (a random number) extensions or any other features that can be enabled/disabled, I will end up with 20 unnecessary ENV definitions (because the Dockerfile doesn't support .env files) whose only purpose is to set a flag letting the script know what to do.
Is this the right way to do it?
Should I use ENV for this purpose?
I am open to ideas; if you have a different approach to achieve this, please let me know.
From Dockerfile reference:
The ARG instruction defines a variable that users can pass at build-time to the builder with the docker build command using the --build-arg <varname>=<value> flag.
The ENV instruction sets the environment variable <key> to the value <value>.
The environment variables set using ENV will persist when a container is run from the resulting image.
So if you need build-time customization, ARG is your best choice.
If you need run-time customization (to run the same image with different settings), ENV is well-suited.
If I want to add let's say 20 (a random number) of extensions or any other feature that can be enable|disable
Given the number of combinations involved, using ENV to set those features at runtime is best here.
But you can combine both by:
building an image with a specific ARG
using that ARG as an ENV
That is, with a Dockerfile including:
ARG var
ENV var=${var}
You can then either build an image with a specific var value at build-time (docker build --build-arg var=xxx) or run a container with a specific runtime value (docker run -e var=yyy).
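Applied to one of the flags from the question, a minimal sketch (the default value is an assumption):
# Dockerfile
ARG INSTALL_XDEBUG=false
ENV INSTALL_XDEBUG=${INSTALL_XDEBUG}
Then either bake a different default into the image at build time (docker build --build-arg INSTALL_XDEBUG=true -t myimage .) or override it for a single container at run time (docker run -e INSTALL_XDEBUG=true myimage).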
So if we want to set the value of an environment variable to something different for every build, we can pass these values at build time and we don't need to change our Dockerfile every time.
ENV values, on the other hand, once set in the Dockerfile cannot be overridden from the command line at build time. So, if we want an environment variable to have different values for different builds, we can use ARG and set default values in our Dockerfile. When we want to override these values, we can do so using --build-arg at every build without changing our Dockerfile.
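For example (the variable name and values are only illustrative):
# Dockerfile: a default that can be overridden per build
ARG APP_VERSION=1.0.0
# override it without changing the Dockerfile
docker build --build-arg APP_VERSION=2.0.0 -t myimage .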
For more details, you can refer to this.
Why use ARG or ENV?
Let's say we have a JAR file and we want to make a Docker image of it, so we can ship it to any Docker engine.
We can write a Dockerfile.
Dockerfile
FROM eclipse-temurin:17-jdk-alpine
VOLUME /tmp
ARG JAR_FILE
COPY ${JAR_FILE} app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
Now, if we build the Docker image for a Maven project, we can pass JAR_FILE using --build-arg as target/*.jar:
docker build --build-arg JAR_FILE=target/*.jar -t myorg/myapp .
However, if we are using Gradle, the above command doesn't work and we have to pass a different path: build/libs/
docker build --build-arg JAR_FILE=build/libs/*.jar -t myorg/myapp .
Once we have chosen a build system, we no longer need the ARG; we can hard-code the JAR location.
For Maven, that would be as follows:
Dockerfile
FROM eclipse-temurin:17-jdk-alpine
VOLUME /tmp
COPY target/*.jar app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
Here, we can build the image with the following command:
docker build -t image:tag .
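For Gradle, only the COPY source would change (a sketch):
COPY build/libs/*.jar app.jar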
When to use `ENV`?
If we want to set values that are used when the container runs and have them baked into the image, such as the port number your application listens on, we can set them using ENV.
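For instance (the variable name and port are illustrative only):
# Dockerfile
ENV SERVER_PORT=8080
EXPOSE 8080
The value is part of the image, but can still be overridden for a particular container: docker run -e SERVER_PORT=9090 -p 9090:9090 myorg/myapp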
Both ARG and ENV seem very similar. Both can be accessed from within our Dockerfile commands in the same manner.
Example:
ARG VAR_A=5
ENV VAR_B=6
RUN echo $VAR_A
RUN echo $VAR_B
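Both RUN lines print their value during the build, but only the ENV value survives into a running container (the image name here is assumed):
docker run --rm myimage sh -c 'echo "VAR_A=$VAR_A VAR_B=$VAR_B"'
# prints: VAR_A= VAR_B=6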
Personal opinion!
There is a tradeoff in choosing ARG over ENV. If you choose ARG, you can't change it later during the run. However, if you choose ENV, you can modify the value in the container.
I personally prefer ARG over ENV wherever I can.
In the above example:
I used ARG because the build system (Maven or Gradle) matters during the build rather than at runtime. It thus encapsulates a lot of details and provides a minimal set of arguments for the runtime.
For more details, you can refer to this.

How to export an environment variable to a docker image?

I can define "static" environment variables in a Dockerfile with ENV, but is it possible to pass some value at build time to this variable? I'm attempting something like this, which doesn't work:
FROM phusion/baseimage
RUN mkdir -p /foo/2016/bin && \
FOOPATH=`ls -d /foo/20*/bin` && \
export FOOPATH
ENV PATH $PATH:$FOOPATH
Of course, in the real use case I'd be running/unpacking something that creates a directory whose name will change with different versions, dates, etc., and I'd like to avoid modifying the Dockerfile every time the directory name changes.
Edit: Since it appears it's not possible, the best workaround so far is using a symlink:
FROM phusion/baseimage
RUN mkdir -p /foo/2016/bin && \
FOOPATH=`ls -d /foo/20*/bin` && \
ln -s $FOOPATH /mypath
ENV PATH $PATH:/mypath
To pass a value in at build time, use an ARG.
FROM phusion/baseimage
RUN mkdir -p /foo/2016/bin && \
FOOPATH=`ls -d /foo/20*/bin` && \
export FOOPATH
ARG FOOPATH
ENV PATH $PATH:${FOOPATH}
Then you can run docker build --build-arg FOOPATH=/dir -t myimage .
Edit: from your comment, my answer above won't solve your issue. There's nothing in the Dockerfile you can update from the output of the RUN command; the output isn't parsed, only the resulting filesystem is saved. For this, I think you're best off having your RUN command write the path into the image and reading it back from your /etc/profile or a custom entrypoint script. That depends on how you want to launch your container and the base image.
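A minimal sketch of that approach (file names here are assumptions, not part of the original answer): the RUN step records the discovered path in a file inside the image, and an entrypoint script reads it back when the container starts.
# Dockerfile: save the discovered path into a file baked into the image
RUN mkdir -p /foo/2016/bin && \
    ls -d /foo/20*/bin > /etc/foopath
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

#!/bin/sh
# entrypoint.sh: extend PATH at container start, then run the requested command
export PATH="$PATH:$(cat /etc/foopath)"
exec "$@"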
