Docker approach for a multi-conditional pipeline

I'm completely new to Docker. I have the following idea in mind: I need to provide a single image that, based on runtime arguments such as the profile/stage and whether Python is included or not, runs different scripts.
These scripts use lots of params that can be overridden from outside. I searched through similar questions but didn't find anything that matches.
I have the following approach in mind, but it seems quite difficult to support and ugly, so I hope someone can provide a better solution.
The rough image content is:
FROM openjdk:8
#ARG py_ver=2.7
#RUN if [ -z "$py_ver" ] ; then echo python version not provided ; else echo python version is $py_ver ; fi
#FROM python:${py_ver}
# set the working directory in the container
WORKDIR /models
# copy the dependencies file to the working directory
COPY training.sh execution.sh requirements/ ./
#RUN pip install --no-cache-dir -r requirements.txt
ENV profile="training"
ENV pythonEnabled=false
RUN if [ "$profile" = "training" ]; then \
command="java=/usr/bin/java training.sh"; \
else \
command="java=/usr/bin/java execution.sh"; \
fi
ENTRYPOINT ["${command}"]
I suppose I have several issues: 1) I need to have one image, but based on runtime parameters I need to choose the appropriate run script; 2) I have to pass a lot of args to the training and execution scripts (approx. 6-7 params), which is a bit difficult to do with "-e";
3) my image would have to download all Python versions and, at runtime, use the version specified in the args.
I reviewed docker-compose, but it helps when you need to manage several services, which is not my case: I have a single service with different setup params and preparation flows. Could someone suggest a better approach than spaghetti if-else conditions around the Python version and profile selected at runtime?

It might help to look at this question in two parts. First, how can you control what runtime you're using; and second, how can you control what happens when the container runs?
A Docker image typically contains a single application, but if there's a substantial code base and several ways to invoke it, you can package that all together. In Python land, a Flask application and an associated Celery worker might be bundled together, for example. Regardless, the image still contains a single language interpreter and the associated libraries: you wouldn't build an image with three versions of Python and four versions of supporting libraries.
For things that control the single language interpreter and library stack that get built into an image, ARG as you've shown it is the correct way to do it:
ARG py_ver=3.9
RUN apt-get update && apt-get install -y python${py_ver}
ARG requirements=requirements.txt
RUN pip install -r ${requirements}
If you need to build the same image for multiple language versions, you can build it using a shell loop, or similar automation:
for py_ver in 3.6 3.7 3.8 3.9; do
  docker build --build-arg py_ver="$py_ver" -t "myapp:python${py_ver}" .
done
docker run ... myapp:python3.9
As far as what gets run when you launch the container, you have a couple of choices. You can provide an alternate command when you start the container, and the easiest thing to do is to discard the entire "profile" section at the end of the Dockerfile and just provide that:
docker run ... myapp:python3.9 \
training.sh
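With that approach, the tail of the Dockerfile no longer needs the ENV/RUN if/ENTRYPOINT block at all. A rough sketch (the default command here is just an assumption; pick whichever script makes a sensible default):
WORKDIR /models
COPY training.sh execution.sh requirements/ ./
CMD ["./training.sh"]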
You mention that a couple of the invocations are more involved. You can wrap these in shell scripts
#!/bin/sh
java -Dfoo=bar -Dbaz.quux.meep=peem ... \
-jar myapp.jar \
arg1 arg2 arg3
and then COPY them into your image into one of the usual executable paths
COPY training-full.sh /usr/local/bin
and then you can just run that script as the main container command
docker run ... myapp:python3.9 training-full.sh
You can, with some care, use ENTRYPOINT here. The important detail is that the CMD gets passed to the ENTRYPOINT as command-line arguments, and in your Dockerfile the ENTRYPOINT generally must have JSON-array syntax. You could in principle use this to create artificial "commands":
#!/bin/sh
case "$1" in
  training)
    shift
    exec training.sh foo bar "$@"
    ;;
  execution)
    shift
    exec execution.sh "$@"
    ;;
  *)
    exec "$@"
    ;;
esac
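Assuming that script is saved as docker-entrypoint.sh in your build context (the name and path are just for illustration), wiring it up near the end of the Dockerfile would look roughly like:
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD ["training"]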
Then you can launch the container in a couple of ways
docker run --rm myapp:python3.9 training
docker run --rm myapp:python3.9 execution 'hello world'
docker run --rm myapp:python3.9 ls -l /
docker run --rm -it myapp:python3.9 bash

Related

RUN pwd does not seem to work in my dockerfile

I am studying Docker these days and am confused about why RUN pwd just does not seem to work when I run my Dockerfile.
I am working on macOS,
and the full content of my Dockerfile can be seen below:
FROM ubuntu:latest
MAINTAINER xxx
RUN mkdir -p /ln && echo hello world > /ln/wd6.txt
WORKDIR /ln
RUN pwd
CMD ["more" ,"wd6.txt"]
As far as I understand, after building the Docker image with the tag 'wd8' and running it, I expected the result to look like this:
~ % docker run wd8
::::::::::::::
wd6.txt
::::::::::::::
hello world
ln
However, the actual output does not include the ln line.
I have tried RUN $pwd, and also added ENV at the beginning of my Dockerfile; neither works.
Please help point out where the problem is.
PS: I should not expect to see the directory 'ln' on my disk, right? Since it is supposed to be created within the container...?
There are actually multiple reasons you don't see the output of the pwd command, some of them already mentioned in the comments:
- the RUN statements in your Dockerfile are only executed during the build stage, i.e. with docker build, not with docker run
- when using the BuildKit backend (which is the case here) the output of successfully run commands is collapsed; to see it anyway, use the --progress=plain flag
- running the same build multiple times will use the build cache of the previous build and not execute the command again; you can disable this with the --no-cache flag (see the example after this list)
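For example, to actually see the pwd output during a BuildKit build, a command along these lines should work (the tag is just an example):
docker build --no-cache --progress=plain -t wd8 .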

Understanding workflow of multi-stage Dockerfile

There are a few processes I'm struggling to wrap my brain around when it comes to multi-stage Dockerfiles.
Using this as an example, I have a couple of questions below it:
# Dockerfile
# Uses multi-stage builds requiring Docker 17.05 or higher
# See https://docs.docker.com/develop/develop-images/multistage-build/
# Creating a python base with shared environment variables
FROM python:3.8.1-slim as python-base
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_HOME="/opt/poetry" \
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    POETRY_NO_INTERACTION=1 \
    PYSETUP_PATH="/opt/pysetup" \
    VENV_PATH="/opt/pysetup/.venv"
ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"
# builder-base is used to build dependencies
FROM python-base as builder-base
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
        curl \
        build-essential
# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME
ENV POETRY_VERSION=1.0.5
RUN curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python
# We copy our Python requirements here to cache them
# and install only runtime deps using poetry
WORKDIR $PYSETUP_PATH
COPY ./poetry.lock ./pyproject.toml ./
RUN poetry install --no-dev # respects
# 'development' stage installs all dev deps and can be used to develop code.
# For example using docker-compose to mount local volume under /app
FROM python-base as development
ENV FASTAPI_ENV=development
# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
# Copying in our entrypoint
COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
# venv already has runtime deps installed we get a quicker install
WORKDIR $PYSETUP_PATH
RUN poetry install
WORKDIR /app
COPY . .
EXPOSE 8000
ENTRYPOINT /docker-entrypoint.sh $0 $@
CMD ["uvicorn", "--reload", "--host=0.0.0.0", "--port=8000", "main:app"]
# 'lint' stage runs black and isort
# running in check mode means build will fail if any linting errors occur
FROM development AS lint
RUN black --config ./pyproject.toml --check app tests
RUN isort --settings-path ./pyproject.toml --recursive --check-only
CMD ["tail", "-f", "/dev/null"]
# 'test' stage runs our unit tests with pytest and
# coverage. Build will fail if test coverage is under 95%
FROM development AS test
RUN coverage run --rcfile ./pyproject.toml -m pytest ./tests
RUN coverage report --fail-under 95
# 'production' stage uses the clean 'python-base' stage and copies
# in only our runtime deps that were installed in the 'builder-base'
FROM python-base as production
ENV FASTAPI_ENV=production
COPY --from=builder-base $VENV_PATH $VENV_PATH
COPY ./docker/gunicorn_conf.py /gunicorn_conf.py
COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
COPY ./app /app
WORKDIR /app
ENTRYPOINT /docker-entrypoint.sh $0 $@
CMD [ "gunicorn", "--worker-class uvicorn.workers.UvicornWorker", "--config /gunicorn_conf.py", "main:app"]
The questions I have:
Are you docker build ... this entire image and then just docker run ... --target=<stage> to run a specific stage (development, test, lint, production, etc.) or are you only building and running the specific stages you need (e.g. docker build ... -t test --target=test && docker run test ...)?
I want to say it isn't the former because you end up with a bloated image with build kits and what not... correct?
When it comes to local Kubernetes development (minikube, skaffold, devspace, etc.) and running unit tests, are you supposed to refer to these stages in the Dockerfile (devspace hooks or something) or use native test tools in the container (e.g. npm test, ./manage.py test, etc.)?
Thanks for clearing these questions up.
To answer from a less DevSpace-y perspective and a more general Docker-y one (with no disrespect to Lukas!):
Question 1
Breakdown
❌ Are you docker build ... this entire image and then just docker run ... --target= to run a specific stage
You're close in your understanding and managed to outline the approach in your second part of the query:
✅ or are you only building and running the specific stages you need (e.g. docker build ... -t test --target=test && docker run test ...)?
The --target option is not present in the docker run command, which can be seen when calling docker run --help.
I want to say it isn't the former because you end up with a bloated image with build kits and what not... correct?
Yes, it's impossible to do it the first way: when --target is not specified, only the final stage is incorporated into your image. This is a great benefit, as it cuts down the final size of your image while still letting you use multiple stages during the build.
Details and Examples
--target is a flag you can pass at build time so that you can choose which stages to build. It's a pretty helpful option that can be used in a few different ways. There's a decent blog post here talking about the new features that came out with multi-stage builds (--target is one of them).
For example, I've had a decent amount of success building projects in CI utilising different stages and targets. The following is pseudo-code, but hopefully the context comes across:
# Dockerfile
FROM python as base
FROM base as dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt
FROM dependencies as test
COPY src/ src/
COPY test/ test/
FROM dependencies as publish
COPY src/ src/
CMD ...
A Dockerfile like this would enable you to do something like the following in your CI workflow; once again, pseudo-code-esque:
docker build . -t my-app:unit-test --target test
docker run my-app:unit-test pyunit ...
docker build . -t my-app:latest
docker push ...
In some scenarios, it can be quite advantageous to have this fine-grained control over what gets built when, and it's quite the boon to be able to run images that comprise only a few stages without having built the entire app.
The key here, is that there's no expectation that you need to use --target, but it can be used to solve particular problems.
Question 2
When it comes to local Kubernetes development (minikube, skaffold, devspace, etc.) and running unit tests, are you supposed referring to these stages in the Dockerfile (devspace Hooks or something) or using native test tools in the container (e.g. npm test, ./manage.py test, etc.)?
Lukas covers a DevSpace-specific approach very well, but ultimately you can test however you like. Using devspace to make it easier to run (and remember to run) tests certainly sounds like a good idea. Whatever tool you use to enable an easier workflow will likely still run npm test etc. under the hood.
If you wish to call npm test outside of a container, that's fine; if you wish to call it in a container, that's also fine. The solution to your problem will always change depending on your landscape. CI/CD helps to standardise on external factors and provides a uniform means of ensuring that testing is performed and deployments are auditable.
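For instance, either of these is perfectly reasonable (the image tag and test command are placeholders for whatever your project uses):
npm test                                   # run the tests directly on the host
docker run --rm my-app:unit-test npm test  # run the same tests inside a container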
Hope that helps in any way shape or form 👍
Copying my response to this from Reddit to help others who may look for this on StackOverflow:
DevSpace maintainer here. For my workflow (and the default DevSpace behavior if you set it up with devspace init), image building is skipped during development because it tends to be the most annoying and time-consuming part of the workflow. Instead, most teams that use DevSpace have a dev image pushed to a registry and built by CI/CD, which is then used in devspace.yaml via replacePods.replaceImage as shown here: https://devspace.sh/cli/docs/configuration/development/replace-pods
This means that your manifests or helm charts are being deployed referencing the prod images (as they should be) and then devspace will (after deployment) replace the images of your pods with dev-optimized images that ship all your tooling. Inside these pods, you can then use the terminal to build your application, run tests along with other dependencies running in your cluster etc.
However, typically teams also start using DevSpace in CI/CD after a while and then they add profiles (e.g. prod profile or integration-testing profile etc. - more on https://devspace.sh/cli/docs/configuration/profiles/basics) to their devspace.yaml where they add image building again because they want to build the images in their pipelines using kaniko or docker. For this, you would specify the build target in devspace.yaml as well: https://devspace.sh/cli/docs/configuration/images/docker#target
FWIW regarding 1: I never use docker run --target but I also always use Kubernetes directly over manual docker commands to run any workloads.

How to pass arguments in dockerfile during docker run command

I am new to Docker. For learning purposes, I'm working on a code submission platform (online judge). At a high level, whenever a user submits code, it hits an API that receives the code, a languageID and inputs (if any); the code is then run in a Docker container and the output (or error, if any) is returned to the client.
Dockerfile :
FROM gcc:latest
COPY main.cpp /usr/src/cpp_test/prog.cpp
WORKDIR /usr/src/cpp_test/
CMD [ "sh", "-c", "g++ -o Test prog.cpp && ./Test" ]
So, whenever a user submits code, I am currently first building this Dockerfile (docker build), because the main.cpp file is different every time, and then running the docker run command.
My question is: is there any way to build this image only once (by making it more general), so that whenever a user submits code I only need to run the docker run command?
Remember, there are 3 arguments that I have to pass, i.e. the code, the languageID and the inputs (if any).
Any help will be appreciated.
Thank you.
First, looking at your existing Dockerfile: the output of docker build is an image containing a compiled, ready-to-run binary. The image you have now contains a compiler plus a source file, but you'd generally want just the compiled binary. (If you're used to a compiler like g++, think of docker build as the same sort of step: it creates a directly runnable image and doesn't need a copy of the source or to be rebuilt.) So I'd typically RUN g++, not defer it until container startup:
FROM gcc:latest
WORKDIR /usr/src/cpp_test
COPY main.cpp .
RUN g++ -o Test main.cpp
CMD [ "./Test" ]
Even for the higher-level workflow you describe, I'm not sure I'd stray too far from this Dockerfile. It doesn't do a whole lot beyond running the compiler and then creating a runnable image; so long as the source file is named main.cpp and it doesn't have any library dependencies this will work for any C++ source file.
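With that Dockerfile, the per-submission cycle is just a build and a run (the tag is illustrative):
docker build -t submission-1 .
docker run --rm submission-1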
If you really wanted to build and run an arbitrary source file without building an image per submission, you can use the unmodified gcc image; a custom image wouldn't really add anything here. I'd suggest writing a launcher shell script:
#!/bin/sh
set -e
g++ -o Test main.cpp
./Test
Create an execution directory:
mkdir run_1
cp -a launcher.sh run_1
cp submission_1.cpp run_1/main.cpp
Then run the container against that tree:
docker run --rm \
-v "$PWD/run_1:/code" \
-w /code \
gcc:latest \
./launcher.sh

Set environment variables in Docker container

I want to know if what I'm doing is considered best practice, or is there a better way:
What I want
I want to have a Docker image, which will have some environment variables predefined when run as a container. I want to do that by running some shell script that will export those variables.
What I'm doing
My dockerfile looks like this:
Dockerfile code..
..
..
RUN useradd -m develop
RUN echo ". /env.sh" >> /home/develop/.bashrc
USER develop
Is that a good way?
Using the Dockerfile ENV directive would be much more usable than putting environment variables into a file.
ENV SOME_VARIABLE=some-value
# Do not use .bashrc at all
# Do not `RUN .` or `RUN source`
Most ways to use Docker don't involve running shell dotfiles like .bashrc. Adding settings there won't usually have any effect. In a Dockerfile, any environment variable settings in a RUN instruction will get lost at the end of that line, including files you read in using the shell . command (or the equivalent but non-standard source).
For example, given the Dockerfile you show, a docker run invocation like this never invokes a shell at all and never reads the .bashrc file:
docker run the-image env \
| grep A_VARIABLE_FROM_THE_BASHRC
There are some workarounds to this (my answer to How to source a script with environment variables in a docker build process? describes a startup-time entrypoint wrapper) but the two best ways are to (a) restructure your application to need fewer environment variables and have sensible defaults if they're not set, and (b) use ENV instead of a file of environment-variable settings.
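If you do need the file-based setup, a minimal sketch of such a startup-time wrapper might look like this (the script name is an assumption; /env.sh is the file from the question):
#!/bin/sh
# entrypoint.sh: read the environment file, then hand off to the main command
. /env.sh
exec "$@"
and in the Dockerfile:
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["your-main-command"]   # placeholder for whatever the image should run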
Yes, that will probably work, if done in the right order.
# Add user
RUN useradd -m develop
#switch to user
USER develop
#run script as user.
#RUN echo "./env.sh" >> /home/develop/.bashrc
RUN /bin/bash -c "source ./env.sh"
Although duplicating the RUN useradd line is not necessary at all.

Docker ROS automatic start of launch file

I developed a few ROS packages and I want to put the packages in a Docker container because installing all the ROS packages all the time is tedious. Therefore I created a Dockerfile that uses a base ROS image, installed all the necessary dependencies, copied my workspace, built the workspace in the Docker container and sourced everything afterward. You can find the Dockerfile here:
FROM ros:kinetic-ros-base
RUN apt-get update && apt-get install locales
RUN locale-gen en_US.UTF-8
ENV LANG en_US.UTF-8
RUN apt-get update && apt-get install -y \
    && rm -rf /var/lib/apt/lists/*
COPY . /catkin_ws/src/
WORKDIR /catkin_ws
RUN /bin/bash -c '. /opt/ros/kinetic/setup.bash; catkin_make'
RUN /bin/bash -c '. /opt/ros/kinetic/setup.bash; source devel/setup.bash'
CMD ["roslaunch", "master_launch sim_perception.launch"]
The problem is: when I run the Docker container with the "run" command, Docker doesn't seem to know that I sourced my new ROS workspace and therefore it cannot automatically launch my launch file. If I run the container interactively with "run -it bash", I can source my workspace again and then roslaunch my .launch file.
So can someone tell me how to write my Dockerfile correctly so that my .launch file is launched automatically when I run the container? Thanks!
From Docker Docs
Each RUN instruction runs independently and won't affect the next instruction, so by the time the last line runs, none of the environment set up by ROS is preserved.
You need to source .bashrc, or whatever environment file you need, using source first.
You can wrap everything you want (the source commands and the roslaunch command) inside a shell script and then just run that file at the end, as sketched below.
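A minimal sketch of that wrapper, using the paths from the question (treat the exact file names as assumptions):
#!/bin/bash
# start.sh: source the ROS and workspace environments, then launch
set -e
source /opt/ros/kinetic/setup.bash
source /catkin_ws/devel/setup.bash
exec roslaunch master_launch sim_perception.launch
and in the Dockerfile:
COPY start.sh /start.sh
RUN chmod +x /start.sh
CMD ["/start.sh"]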
If you review the convention of ros_entrypoint.sh you can see how best to source the workspace you would like in the container. We're all so busy learning how to make Docker and ROS do the real things that it's easy to skip over some of the nuance of this interplay. This sucked forever for me; I hope this is helpful for you.
I looked forever and found what seemed like only bad advice, and in the absence of an explicit standard or clear guidance I've settled on what seems like a sane approach that also allows you to control what launches at runtime with environment variables. I now consider this the right solution for my needs.
In the Dockerfile for the image, you set the start/launch behavior towards the end: add a COPY line to bring in your own ros_entrypoint.sh (example included), set it as the ENTRYPOINT, and then add a CMD so that something runs by default when the container starts.
Note: you'll (obviously?) need to rerun the docker build process for these changes to take effect.
Dockerfile looks like this:
all your other dockerfile ^^
.....
# towards the end
COPY ./ros_entrypoint.sh /
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["bash"]
Example ros_entrypoint.sh:
#!/bin/bash
set -e
# setup ros environment
if [ -z "${SETUP}" ]; then
  # basic ros environment
  source "/opt/ros/$ROS_DISTRO/setup.bash"
else
  # from environment variable; should be an absolute path to the appropriate workspace's setup.bash
  source "$SETUP"
fi
exec "$@"
Used in this way, the container will automatically source either the basic ROS environment or, if you provide another workspace's setup.bash path in the $SETUP environment variable, that workspace instead.
So a few ways to work with this:
From the command line prior to running docker
export SETUP=/absolute/path/to/the/setup.bash
docker run -it your-docker-image
From the command line (inline)
docker run --env SETUP=/absolute/path/to/the/setup.bash your-docker-image
From docker-compose
service-name:
network_mode: host
environment:
- SETUP=/absolute/path/to/the_workspace/devel/setup.bash #or whatever
command: roslaunch package_name launchfile_that_needed_to_be_sourced.launch
#command: /bin/bash # wake up and do something else
