How to use -showvariable with the GitVersion Docker image

I can successfully get the full json string with:
docker run --rm -v `pwd`:`pwd` gittools/gitversion-dotnetcore:linux-4.0.0 `pwd` -output json
which outputs something like:
{
"Major":0,
"Minor":1,
"Patch":0,
"SemVer":"0.1.0-dev-2.1",
.
.
.
"CommitsSinceVersionSource":20,
"CommitsSinceVersionSourcePadded":"0020",
"CommitDate":"2020-05-28"
}
Since I am only interested in the SemVer variable, I tried to use the -showvariable FullSemVer flag:
docker run --rm -v `pwd`:`pwd` gittools/gitversion-dotnetcore:linux-4.0.0 `pwd` -output json -showvariable FullSemVer
But it fails with a long and nasty error log:
INFO [05/28/20 18:23:12:10] End: Loading version variables from disk cache (Took: 76.31ms)
ERROR [05/28/20 18:23:12:13] An unexpected error occurred:
System.NotImplementedException: The method or operation is not implemented.
I wonder if there is a way to use the -showvariable flag with the gitversion Docker container?

I think the problem is the path argument passed to GitVersion. pwd will give you the working directory on your host, not within the container. GitVersion is unfortunately not aware of the fact that it's executing within a container, so it needs to be provided with the volume directory /repo as the path to calculate a version number for. This is something we should consider changing in version 6.
I also can't remember when -showvariable was implemented, so to be on the safe side you should try with a newer version of our Docker containers. I can also recommend using the alpine container, as it's the smallest one we offer (only 83.9 MB). This works:
docker run \
--rm \
--volume "$(pwd):/repo" \
gittools/gitversion:5.3.4-linux-alpine.3.10-x64-netcoreapp3.1 \
/repo \
-output json \
-showvariable FullSemVer
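If -showvariable turns out to be unavailable in the image version you are pinned to, one fallback is to filter the JSON output yourself. The following is a minimal sketch, assuming the JSON is printed one key per line as in the question. json_field is a hypothetical helper, not part of GitVersion, and it only handles flat string or number values.

```shell
# Hypothetical helper: print the value of one top-level key from
# flat, one-key-per-line JSON read on stdin.
json_field() {
  key="$1"
  # Match lines like  "Key":"value",  or  "Key":123,  and print only the value.
  sed -n "s/^[[:space:]]*\"$key\":[[:space:]]*\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\}[[:space:]]*\$/\1/p"
}

# Possible usage (not executed here, image tag taken from the answer above):
#   docker run --rm --volume "$(pwd):/repo" \
#     gittools/gitversion:5.3.4-linux-alpine.3.10-x64-netcoreapp3.1 \
#     /repo -output json | json_field SemVer
```

A real JSON parser such as jq would be more robust if it is available on the host; the sed version is shown only because it needs nothing beyond a POSIX shell.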

Related

Singularity arguments conflict with my bioinformatics tool arguments

EDIT: The documentation provided by the IT administration was poor and covered an old version of Singularity. The order of arguments is different now, and the problem is solved.
To make my tool more portable, and because I have to use it on a cluster, I need to make my bioinformatics tool available as a Docker image. The tool is located here. The image on Docker Hub is 007ptar007/metadbgwas, if you want to experiment with it. The Dockerfile is in the repo, and to make it easier for everyone:
FROM ubuntu:latest
ENV DEBIAN_FRONTEND=noninteractive
USER root
COPY ./install_docker.sh ./
RUN chmod +x ./install_docker.sh && sh ./install_docker.sh
ENTRYPOINT ["/MetaDBGWAS/metadbgwas.sh"]
ENV PATH="/MetaDBGWAS/:${PATH}"
And the install_docker.sh script contains :
apt-get update
apt install -y libgatbcore-dev libhdf5-dev libboost-all-dev libpstreams-dev zlib1g-dev g++ cmake git r-base-core
Rscript -e "install.packages(c('ape', 'phangorn'))"
Rscript -e "install.packages('https://raw.githubusercontent.com/sgearle/bugwas/master/build/bugwas_1.0.tar.gz', repos=NULL, type='source')"
git clone --recursive https://github.com/Louis-MG/MetaDBGWAS.git
cd MetaDBGWAS
sed -i "51i#include <limits>" ./REINDEER/blight/robin_hood.h #temporary fix for REINDEER compilation
sh install.sh
The problem :
My tool parses the command line and needs a verbose (-v or --verbose) argument. It also needs to reject unknown arguments: anything the tool does not recognize causes it to print the help message to standard output and exit. To use the tool, I need to mount the volumes where the data is, using the -v /path/to/files:/input option:
singularity run docker://007ptar007/metadbgwas --volumes '/path/to/data:/inputd/:/input' --files /input --strains /input/strains --threads 8 --output ~/output
But my tool sees this as a bad value for its -v option, or sees --volumes as an unknown option. I can't change this in my tool. How do I solve this conflict?
You need to put any arguments intended for Singularity, such as the volume mounting, before the name of the image you want to run (e.g. the Docker image you specify in your command). Everything after the image name is passed to the tool. Note that Singularity's bind-mount flag is --bind (short form -B), not Docker's -v:
singularity run --bind '/path/to/data:/input' docker://007ptar007/metadbgwas --files /input --strains /input/strains --threads 8 --output ~/output

gRPC service definitions: containerize .proto compilation?

Let's say we have a services.proto with our gRPC service definitions, for example:
service Foo {
rpc Bar (BarRequest) returns (BarReply) {}
}
message BarRequest {
string test = 1;
}
message BarReply {
string test = 1;
}
We could compile this locally to Go by running something like
$ protoc --go_out=. --go_opt=paths=source_relative \
--go-grpc_out=. --go-grpc_opt=paths=source_relative \
services.proto
My concern though is that running this last step might produce inconsistent output depending on the installed version of the protobuf compiler and the Go plugins for gRPC. For example, two developers working on the same project might have slightly different versions installed locally.
It would seem reasonable to me to address this by containerizing the protoc step. For example, with a Dockerfile like this...
FROM golang:1.18
WORKDIR /src
RUN apt-get update && apt-get install -y protobuf-compiler
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.26
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.1
CMD protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative services.proto
... we can run the protoc step inside a container:
docker run --rm -v $(pwd):/src $(docker build -q .)
After wrapping the previous command in a shell script, developers can run it on their local machine, giving them deterministic, reproducible output. It can also run in a CI/CD pipeline.
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
NB, I was surprised to find that the official grpc/go image does not come with protoc preinstalled. Am I off the beaten path here?
"My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?"
It is definitely a good approach. I do the same, not only to have consistent output across the team, but also to ensure we can produce the same output on different operating systems.
There is an easier way to do that, though.
Look at this repo: https://github.com/jaegertracing/docker-protobuf
The image is on Docker Hub, but you can build your own image if you prefer.
I use this command to generate Go:
docker run --rm -u $(id -u) \
-v${PWD}/protos/:/source \
-v${PWD}/v1:/output \
-w/source jaegertracing/protobuf:0.3.1 \
--proto_path=/source \
--go_out=paths=source_relative,plugins=grpc:/output \
-I/usr/include/google/protobuf \
/source/*

Avoiding duplicated arguments when running a Docker container

I have a TensorFlow training script which I want to run using a Docker container (based on the official TF GPU image). Although everything works just fine, running the container with the script is horribly verbose and ugly. The main problem is that my training script allows the user to specify various directories used during training, for input data, logging, generating output, etc. I don't want to have to change what my users are used to, so the container needs to be informed of the location of these user-defined directories, so it can mount them. So I end up with something like this:
docker run \
-it --rm --gpus all -d \
--mount type=bind,source=/home/guest/datasets/my-dataset,target=/datasets/my-dataset \
--mount type=bind,source=/home/guest/my-scripts/config.json,target=/config.json \
-v /home/guest/my-scripts/logdir:/logdir \
-v /home/guest/my-scripts/generated:/generated \
train-image \
python train.py \
--data_dir /datasets/my-dataset \
--gpu 0 \
--logdir ./logdir \
--output ./generated \
--config_file ./config.json \
--num_epochs 250 \
--batch_size 128 \
--checkpoint_every 5 \
--generate True \
--resume False
In the above I am mounting a dataset from the host into the container, and also mounting a single config file config.json (which configures the TF model). I specify a logging directory logdir and an output directory generated as volumes. Each of these resources is also passed as a parameter to the train.py script.
This is all very ugly, but I can't see another way of doing it. Of course I could put all this in a shell script, and provide command line arguments which set these duplicated values from the outside. But this doesn't seem a nice solution, because if I want to do anything else with the container, for example check the logs, I would have to use the raw docker command.
I suspect this question will likely be tagged as opinion-based, but I've not found a good solution for this that I can recommend to my users.
As user Ron van der Heijden points out, one solution is to use docker-compose in combination with environment variables defined in an .env file. Nice answer.
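As a rough illustration of that suggestion, the sketch below writes an .env file and a docker-compose.yml into a scratch directory. All variable names (DATASET_DIR, CONFIG_FILE, LOG_DIR, OUTPUT_DIR, NUM_EPOCHS) and the service name train are made up for this example. GPU access and the remaining train.py flags are left out to keep it short, so treat it as a starting point rather than a drop-in replacement for the command above.

```shell
# Sketch only: write a hypothetical .env and docker-compose.yml to a scratch directory.
demo_dir="$(mktemp -d)"

# Host-side locations are defined once here. The variable names are assumptions.
cat > "$demo_dir/.env" <<'EOF'
DATASET_DIR=/home/guest/datasets/my-dataset
CONFIG_FILE=/home/guest/my-scripts/config.json
LOG_DIR=/home/guest/my-scripts/logdir
OUTPUT_DIR=/home/guest/my-scripts/generated
NUM_EPOCHS=250
EOF

# Compose substitutes ${...} from .env; the container-side paths stay fixed,
# so the train.py arguments no longer need to change per user.
cat > "$demo_dir/docker-compose.yml" <<'EOF'
services:
  train:
    image: train-image
    volumes:
      - ${DATASET_DIR}:/datasets/my-dataset
      - ${CONFIG_FILE}:/config.json
      - ${LOG_DIR}:/logdir
      - ${OUTPUT_DIR}:/generated
    command: >
      python train.py
      --data_dir /datasets/my-dataset
      --logdir ./logdir
      --output ./generated
      --config_file ./config.json
      --num_epochs ${NUM_EPOCHS}
EOF

echo "Wrote $demo_dir/.env and $demo_dir/docker-compose.yml"
```

With files like these in the project directory, users would edit only .env, start training with docker compose up -d, and check the output with docker compose logs, without repeating any paths.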

Forked docker image not building

I am trying to fork this docker image so that if anything changes on the original it won't affect me.
I have forked the repo corresponding to that image to my own repo.
I have cloned the repo and am trying to build it:
docker build . -t davcal/gcc-cross-x86_64-elf
I am getting this error:
+ cd /usr/local/src
+ ./build-binutils.sh 2.31.1
/bin/sh: 1: ./build-binutils.sh: not found
The command '/bin/sh -c set -x && cd /usr/local/src && ./build-binutils.sh ${BINUTILS_VERSION} && ./build-gcc.sh ${GCC_VERSION}' returned a non-zero code: 127
What makes no sense to me is that if I use the original image, it builds successfully:
FROM randomdude/gcc-cross-x86_64-elf
...
Maybe Docker Hub stores a pre-built image?
How do I fix this?
Note: I am using Windows. This shouldn't make a difference since the error originates within the container.
Edit
I tried patching the Dockerfile to chmod executable permissions to the sh files in case that was causing problems on Windows. Unfortunately, the exact same error occurs.
RUN set -x \
&& chmod +x /usr/local/src/build-binutils.sh \
&& chmod +x /usr/local/src/build-gcc.sh \
&& cd /usr/local/src \
&& ./build-binutils.sh ${BINUTILS_VERSION} \
&& ./build-gcc.sh ${GCC_VERSION}
Edit 2
Following this method, I inspected the container to see if the sh files actually exist. Here is the output.
I ran docker run --rm -it c53693f11514 bash, including the hash of the intermediate container of the previous successful step of the Dockerfile.
This is the output showing that the files do exist:
root@9b8a64ac2090:/# cd usr/local/src
root@9b8a64ac2090:/usr/local/src# ls
binutils-2.31.1 build-binutils.sh build-gcc.sh gcc-8.2.0
From the described symptoms (the file exists, is a shell script, and works on other machines), the "file not found" error is most likely caused by Windows line endings (CRLF) being added to the file. When the Linux kernel executes a shell script, it reads the first line, the #!/bin/sh or similar, and then looks for that interpreter to run the script. If that interpreter isn't found, you'll get a "file not found" error.
In this case, the file it's looking for won't be /bin/sh, but instead /bin/sh\r or /bin/sh^M, depending on how you want to represent the carriage return character. You can fix that for single files with a tool like dos2unix, but in general you'll want to fix git itself, since there are likely other files that have had their line endings converted. For details on adjusting the behavior of git, see this post.
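To check whether a script is affected and to repair it without installing dos2unix, something along these lines should work in any POSIX shell. The helper names has_crlf and strip_cr are made up for this sketch.

```shell
# Succeeds if the file contains at least one carriage return (CRLF line endings).
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# Remove every carriage return from the file, in place (a stand-in for dos2unix).
strip_cr() {
  tmp="$(mktemp)" &&
  tr -d '\r' < "$1" > "$tmp" &&
  cat "$tmp" > "$1" &&   # overwrite the contents, keeping the file's permissions
  rm -f "$tmp"
}

# Possible usage inside the cloned repo:
#   has_crlf build-binutils.sh && strip_cr build-binutils.sh
```

To stop git from reintroducing the problem on Windows checkouts, a .gitattributes entry such as `*.sh text eol=lf` forces LF line endings for shell scripts regardless of the local core.autocrlf setting.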

Is it possible to add an installer, run it and delete it during one build step in Docker?

I'm trying to create a Docker image from a pretty large installer binary (300+ MB). I want to add the installer to the image, install it, and delete the installer. This doesn't seem to be possible:
COPY huge-installer.bin /tmp
RUN /tmp/huge-installer.bin
RUN rm /tmp/huge-installer.bin # <- has no effect on the image size
Using multiple build stages doesn't seem to solve this, since I need to run the installer in the final image. If I could execute the installer directly from a previous build stage, without copying it, that would solve my problem, but as far as I know that's not possible.
Is there any way to avoid including the full weight of the installer in the final image?
I ended up solving this by using the built-in HTTP server in Python to make the project directory available to the image over HTTP.
Inside the Dockerfile, I can run commands like this, piping scripts directly to bash using curl:
RUN curl "http://127.0.0.1:${SERVER_PORT}/installer-${INSTALLER_VERSION}.bin" | bash
Or save binaries, run them and delete them in one step:
RUN curl -O "http://127.0.0.1:${SERVER_PORT}/binary-${INSTALLER_VERSION}.bin" && \
./binary-${INSTALLER_VERSION}.bin && \
rm binary-${INSTALLER_VERSION}.bin
I use a Makefile to start the server and stop it after the build, but you can use a build script instead.
Here's a Makefile example:
SHELL := bash
IMAGE_NAME := app-test
VERSION := 1.0.0
SERVER_PORT := 8580
.ONESHELL:
.PHONY: build
build:
	# Kills the HTTP server when the build is done
	function cleanup {
	    pkill -f "python3 -m http.server.*${SERVER_PORT}"
	}
	trap cleanup EXIT
	# Starts a HTTP server that makes the contents of the project directory
	# available to the image
	python3 -m http.server -b 127.0.0.1 ${SERVER_PORT} &>/dev/null &
	sleep 1
	EXTRA_ARGS=""
	# Allows skipping the build cache by setting NO_CACHE=1
	if [[ -n $$NO_CACHE ]]; then
	    EXTRA_ARGS="--no-cache"
	fi
	docker build $$EXTRA_ARGS \
	    --network host \
	    --build-arg SERVER_PORT=${SERVER_PORT} \
	    -t ${IMAGE_NAME}:latest \
	    .
	docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${VERSION}
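I think the best way is to download the binary from a web server and then run it:
RUN wget -O /tmp/huge-installer.bin http://myweb/huge-installer.bin && chmod +x /tmp/huge-installer.bin && /tmp/huge-installer.bin && rm /tmp/huge-installer.bin
This way, the image layer will not contain the downloaded binary, because it is downloaded, run and removed in the same RUN step.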
I didn't test it thoroughly, but wouldn't such an approach be viable? (Besides LinPy's answer, which is way easier if you have the possibility to just do it that way.)
Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /tmp/entrypoint.sh
RUN \
echo "I am an image that can run your huge installer binary!" \
&& echo "I will only function when you give it to me as a volume mount."
ENTRYPOINT [ "/tmp/entrypoint.sh" ]
entrypoint.sh:
#!/bin/sh
/tmp/your-installer # install your stuff here
while true; do
echo "installer finished, commit me now!"
sleep 5
done
Then run:
$ docker build -t foo-1 .
$ docker run --rm --name foo-1 -d -v "$(pwd)/your-installer:/tmp/your-installer" foo-1
$ docker logs -f foo-1
# once it echoes "commit me now!", run the next command
$ docker commit foo-1 foo-2
$ docker stop foo-1
Since the installer was only mounted as a volume, the image foo-2 should not contain it anymore. You could also go and build another Dockerfile based on foo-2 to change the entrypoint, for example.
Cf. docker commit

Resources