Docker build - use the same shell for all RUN commands - docker

Is it possible to use the same shell in all RUN commands when building a docker image? As opposed to each RUN command running on its own shell.
Use case: at some point, I need to source some file containing environment variables that are used later on. I cannot do this, because the commands run in different shells:
RUN source something.sh
RUN ./install.sh
RUN ... more commands
Instead I have to do:
RUN source something.sh && \
./install.sh && \
... more commands
Which I'm trying to avoid since it hurts readability, it's error prone and does not allow inserting comments in between commands.
Any ideas?
Thanks!

It's not possible to have separate RUN statement run in the same shell.
If you don't like the look of concatenated commands, you could write a shell script and RUN that.
You'll have to get it into the container by using a COPY statement.
Or you can use wget or curl to fetch it and pipe it into a shell. That requires that wget or curl is present in the container, so you might have to install them first.
If you use curl and Debian, it could look like this
RUN apt update && \
apt install -y curl && \
curl -sL https://github.com/link/to/my/install-script.sh | bash
If you COPY it in, it'd look like this
COPY install-script.sh .
RUN ./install-script.sh

Related

Singularity arguments conflict with my bioinformatics tool arguments

EDIT: documentation given by the informatic administration was shitty, old version of singularity, now the order of arguments is different and the problem is solved.
To make my tool more portable, and because I have to use it on a cluster, I have to put my bioinformatics tool at disposal for docker. Tool is located here. The docker hub is 007ptar007/metadbgwas, if you want to experience with it. The Dockerfile is in the repo, and to make it easier to everyone :
FROM ubuntu:latest
ENV DEBIAN_FRONTEND=noninteractive
USER root
COPY ./install_docker.sh ./
RUN chmod +x ./install_docker.sh && sh ./install_docker.sh
ENTRYPOINT ["/MetaDBGWAS/metadbgwas.sh"]
ENV PATH="/MetaDBGWAS/:${PATH}"
And the install_docker.sh script contains :
apt-get update
apt install -y libgatbcore-dev libhdf5-dev libboost-all-dev libpstreams-dev zlib1g-dev g++ cmake git r-base-core
Rscript -e "install.packages(c('ape', 'phangorn'))"
Rscript -e "install.packages('https://raw.githubusercontent.com/sgearle/bugwas/master/build/bugwas_1.0.tar.gz', repos=NULL, type='source')"
git clone --recursive https://github.com/Louis-MG/MetaDBGWAS.git
cd MetaDBGWAS
sed -i "51i#include <limits>" ./REINDEER/blight/robin_hood.h #temporary fix for REINDEER compilation
sh install.sh
The problem :
My tool parses the command line, and needs a verbose (-v, or --verbose) argument. It also needs to reject unknown arguments; anything that isn't used by the tool causes the help message to be printed in the standard output and exits. To use the tool, I need to mount volumes were the data is; using -v /path/to/files:/input option:
singularity run docker://007ptar007/metadbgwas --volumes '/path/to/data:/inputd/:/input' --files /input --strains /input/strains --threads 8 --output ~/output
But my tool sees this as a bad -v option value or the --volume as an unknown option. I can't change this on my tool. How do I solve this conflict ?
You need to put any arguments intended for singularity - such as the volume mounting - before the name of the image you want to run (e.g. the docker image you specify in your command):
singularity run -v '/path/to/data:/input' docker://007ptar007/metadbgwas --files /input --strains /input/strains --threads 8 --output ~/output

gRPC service definitions: containerize .proto compilation?

Let's say we have a services.proto with our gRPC service definitions, for example:
service Foo {
rpc Bar (BarRequest) returns (BarReply) {}
}
message BarRequest {
string test = 1;
}
message BarReply {
string test = 1;
}
We could compile this locally to Go by running something like
$ protoc --go_out=. --go_opt=paths=source_relative \
--go-grpc_out=. --go-grpc_opt=paths=source_relative \
services.proto
My concern though is that running this last step might produce inconsistent output depending on the installed version of the protobuf compiler and the Go plugins for gRPC. For example, two developers working on the same project might have slightly different versions installed locally.
It would seem reasonable to me to address this by containerizing the protoc step. For example, with a Dockerfile like this...
FROM golang:1.18
WORKDIR /src
RUN apt-get update && apt-get install -y protobuf-compiler
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go#v1.26
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc#v1.1
CMD protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative services.proto
... we can run the protoc step inside a container:
docker run --rm -v $(pwd):/src $(docker build -q .)
After wrapping the previous command in a shell script, developers can run it on their local machine, giving them deterministic, reproducible output. It can also run in a CI/CD pipeline.
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
NB, I was surprised to find that the official grpc/go image does not come with protoc preinstalled. Am I off the beaten path here?
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
It is definitely a good approach. I do the same. Not only to have a consistent across the team, but also to ensure we can produce the same output in different OSs.
There is an easier way to do that, though.
Look at this repo: https://github.com/jaegertracing/docker-protobuf
The image is in Docker hub, but you can create your image if you prefer.
I use this command to generate Go:
docker run --rm -u $(id -u) \
-v${PWD}/protos/:/source \
-v${PWD}/v1:/output \
-w/source jaegertracing/protobuf:0.3.1 \
--proto_path=/source \
--go_out=paths=source_relative,plugins=grpc:/output \
-I/usr/include/google/protobuf \
/source/*

Run vscode in docker

I want to run vscode in docker for internal test, i've created the following
FROM debian:stable
RUN apt-get update && apt-get install -y apt-transport-https curl gpg
RUN curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg \
&& install -o root -g root -m 644 microsoft.gpg /etc/apt/trusted.gpg.d/ \
&& echo "deb [arch=amd64] https://packages.microsoft.com/repos/vscode stable main" > /etc/apt/sources.list.d/vscode.list
RUN apt-get update && apt-get install -y code libx11-xcb-dev libasound2
RUN code --user-data-dir="~/.vscode-root"
I use to build
docker build -t vscode .
I use to run
docker run vscode code -v
when I run it like this I got error
You are trying to start vscode as a super user which is not recommended. If you really want to, you must specify an alternate user data directory using the --user-data-dir argument.
I just want to verify it by running something like RUN code -v how can I do it ?
should I change the user ? I just want to run vscode in docker and use some vscode apis
Have you tried using VSCode's built in functionality for developing in a container?
Checkout this page which describes how to do this:
Developing inside a Container
You can try out some of the sample container configurations provided by VSCode and use any of those devcontainer.json files as an example to configure a custom development container to your liking. According to the page above:
Workspace files are mounted from the local file system or copied or cloned into the container. Extensions are installed and run inside the container, where they have full access to the tools, platform, and file system. This means that you can seamlessly switch your entire development environment just by connecting to a different container.
This is a very handy way to have different environments for development that are isolated within the container.

Docker proxy config not working for ADD in Dockerfile

I try to write a Dockerfile that adds a file to the image like this:
ADD https://repository.internal/file.zip /tmp/
The repository.internal host is only reachable through a proxy. I provide the proxy configuraton with the --config option but the ADD command seems not to use the proxy and fails.
I know the proxy configuration is correct because I added the line
RUN curl https://repository.internal/file.zip
which is working fine.
Is there any possibility to run the ADD command also with the proxy config?
As per my comments above, I believe this to be something to do with the internal way the Docker build process handles the ADD and RUN commands... I cant find documentation to back this up - so someone with greater internal knowledge may confirm or deny, but makes sense as a RUN command is done in a layer TO the image being built, where as the ADD command is performed and the results of it are baked into the image.
Whichever way this is being handled, you can achieve what you need by moving to the RUN method as follows:
FROM <your base image>
RUN curl https://repository.internal/file.zip >> /tmp/file.zip \
&& cd /tmp \
&& unzip file.zip \
&& rm file.zip
And you will have the files unzipped.
You may need to check if the rm at the end is required - cant remember off the top of my head if the unzip command removes the original zip file.
As you mentioned, this would rely on the curl and unzip packages being available on the image... however you could potentially avoid having these within your final application image by using Docker Multi Stage Builds
Your Dockerfile would then look something like:
FROM <some useful base image> as collector
RUN apt-get install -y curl unzip
RUN mkdir /tmp/files && \
&& curl https://repository.internal/file.zip >> /tmp/files/file.zip \
&& cd /tmp/files \
&& unzip file.zip \
&& rm file.zip
FROM <your final desired base image>
COPY --from=collector /tmp/files /tmp
This would then utilise an image to have curl and unzip in to collect and deal with the extraction of your files without having to install them on your final application image.

Succesfully created a virtualenv (using "mkproject") in Dockerfile, but can't run "workon" properly

Edit: Solved- typo
I have a Dockerfile that successfully creates a virtualenv using virtualenvwrapper (along with setting up a heap of "standard" settings/packages in our normal environment). I am using the resulting image as a "base image" for further use. All good so far. However, the following Dockerfile (based of the first image, "base_image_14.04") falls down at the last line:
FROM base_image_14.04
USER root
RUN DEBIAN_FRONTEND=noninteractive \
apt-get update && apt-get install -y \
libproj0 libproj-dev \
libgeos-c1v5 libgeos-dev \
libjpeg62 libjpeg-dev \
zlib1g zlib1g-dev \
libfreetype6 libfreetype6-dev \
libgdal20 libgdal-dev \
&& rm -rf /var/lib/apt/lists
USER webdev
RUN ["/bin/bash", "-ic", "mkproject maproxy"]
EXPOSE 80
WORKDIR $PROJECT_HOME/mapproxy
ADD ./requirements.txt .
RUN ["/bin/bash", "-ic", "workon mapproxy && pip install -r requirements.txt"]
The "mkproject mapproxy" works fine. If I comment out the last line it builds successfully and I can spin up the container and run "workon mapproxy" manually, not a problem. But when I try and build with the last line, it gives a workon error:
ERROR: Environment 'mapproxy' does not exist. Create it with 'mkvirtualenv mapproxy'.
workon is being called, but for some reason it can't find the mapproxy virtualenv.
WORKON_HOME & PROJECT_HOME both exist (defined in the parent image) and point to the correct locations (and are used successfully by "mkproject mapproxy").
So why is workon returning an error when the mapproxy virtualenv exists? The same error happens when I isolate that last line into a third Dockerfile building on the second.
Solved: It was a simple typo. mkproject maproxy instead of mapproxy. :sigh:
I am trying to build a docker image and am running into similar problems.
First question was why use a virtual env in docker? The main reason in a nutshell is to minimize effort to migrate an existing and working approach into a docker container. I will eventually use docker-compose, but I wanted to start by getting my feet wet with it all in a single docker container.
In my first attempt I installed almost everything with apt-get, including uwsgi. I installed my app "globally" with pip3. The app has command line functionality and a separate flask web app, hence the need for uwsgi. The command line functionality works, but when I make a request of the flask app uwsgi / python has a problem with locale: Fatal Python error: Py_Initialize: Unable to get the locale encoding and ImportError: No module named 'encodings
I have stripped away all my app specific additions to narrow down the problem. This is the Dockerfile I'm using:
# Docker image definition for testing
FROM ubuntu:xenial
# Create a user
RUN useradd -G sudo -ms /bin/bash tester
RUN echo 'tester:password' | chpasswd
WORKDIR /home/tester
# Skipping apt-get update to save some build time. Some are kept
# to insure they are the same as on host setup.
RUN apt-get install -y python3 python3-dev python3-pip \
virtualenv virtualenvwrapper sudo nano && \
apt-get clean -qy
# After above, can we use those installed in rest of Dockerfile?
# Yes, but not always, such as with virtualenvwrapper. What about
# virtualenv? How do you "source" the script? Doesn't appear to be
# installed, as bash complains "source needs a single parameter"
ENV VIRTUALENVWRAPPER_PYTHON /usr/bin/python3
ENV VIRTUALENVWRAPPER_VIRTUALENV /usr/bin/virtualenv
RUN ["/bin/bash", "-c", "source", "/usr/share/virtualenvwrapper/virtualenvwrapper.sh"]
# Create a virtualenv so uwsgi can find locale
# RUN mkdir /home/tester/.virtualenv && virtualenv -p`which python3` /home/bts_tools/.virtualenv/bts_tools
RUN mkvirtualenv -p`which python3` bts_tools && \
workon bts_tools && \
pip3 --disable-pip-version-check install --upgrade bts_tools
USER tester
ENTRYPOINT ["/bin/bash"]
CMD ["--login"]
The build fails on the line I try to source the virtualenvwrapper script. Bash complains source needs an argument - the file to be sourced. So I comment out the RUN lines and it builds without error. When I run the resulting container I see all the additions to the ENV that virtualenvwrapper makes (you can see all of them by executing the "set" command without any args), and the script to be sourced is there too.
So my question is why doesn't docker find them? How does the docker build process work if the results of any previous RUNs or ENVs aren't applied for subsequent use in the Dockerfile? I know some things are applied and work, for example if you apt-get nginx you can refer to /etc/nginx or alter things under that folder. You can create a user and set it's password or cd into its home folder for example. If I move the WORKDIR before the RUN useradd -G I see a warning from useradd the home folder already exists. I tried to use the "time" program to time how long it takes to do various things in the Dockerfile and docker complains it can't find 'time'.
So what exactly is going on? I have spent the last 3 days trying to figure this out. It just shouldn't be this difficult. What am I missing?
Parts of the bts_tools flask app worked when I wasn't using virtual envs. Most of the app didn't work, and the issue was this locale problem. Since everything works on the host outside of docker, and after trying to alter the PATH, PYTHONHOME, PYTHONPATH in my uwsgi start script to overcome the dreaded "locale encoding" fatal error, I decided to try to replicate the host setup as closely as possible since that didn't have the locale issue. When I have had that problem before I could run dpkg-reconfigure python3 or fix with changes to PATH or ENV settings. If you google the problem you'll see many people have difficulties with python & locale. It's almost enough reason to avoid using python!
I posted this elsewhere about locale issue, if it helps.

Resources