Problems building an Ubuntu Docker container for a distribution

I'm trying to build a Docker container for this module. The main idea is to use, as much as possible, the packages provided by Ubuntu to avoid problems; I'm also using the default Perl that comes with the container, which is apparently 5.22.
This is the Dockerfile:
FROM ubuntu:16.04
LABEL version="1.0" maintainer="JJ Merelo <jjmerelo@GMail.com>" perl5version="5.22"
ADD data/* ./
ADD . .
RUN mkdir /test \
&& apt-get update \
&& apt-get install -y build-essential curl hunspell-en-us libtext-hunspell-perl myspell-es libencode-perl cpanminus libfile-slurp-tiny-perl libversion-perl\
&& curl https://raw.githubusercontent.com/SublimeText/Dictionaries/master/Spanish.dic -o Spanish.dic
RUN cpanm .
RUN perl --version
VOLUME /test
WORKDIR /test
# Will run this
ENTRYPOINT prove
It builds locally without a problem (using Docker version 17.05.0-ce, build 89658be). However, it does not work on Docker Hub, where it fails with this error:
/etc/ssl/certs/AddTrust_Low-Value_Services_Root.pem is encountered a second time at /usr/share/perl/5.22/File/Find.pm line 79.
This seems to happen in the step where I run perl Makefile.PL && make install.
I really have no idea what could be causing this, or why it works locally and fails there. Any ideas?

I think the problem is that you run WORKDIR /test too late. At the beginning of the Dockerfile you run ADD . ., which copies all files from the current directory of your local filesystem to the root directory of the image. This can conflict with directories already present in the root of the image, such as /lib. Try something like this instead:
FROM ubuntu:16.04
LABEL version="1.0" maintainer="JJ Merelo <jjmerelo@GMail.com>" perl5version="5.22"
WORKDIR /test
ADD data/* ./
ADD . .
RUN apt-get update \
&& apt-get install -y build-essential curl hunspell-en-us libtext-hunspell-perl myspell-es libencode-perl cpanminus libfile-slurp-tiny-perl libversion-perl\
&& curl https://raw.githubusercontent.com/SublimeText/Dictionaries/master/Spanish.dic -o Spanish.dic
RUN perl --version
RUN cpanm Test::More
RUN cpanm .
VOLUME /test
# Will run this
ENTRYPOINT prove

The problem is that we don't set a working directory before copying the files we want included in the image, and then we run ADD . . from the root. The module includes a lib directory, so its contents end up in the image's own /lib directory.
That's not bad per se. The problem is that the Perl installation procedure then copies the contents of its lib directory (which is now the system's lib directory) into blib, and copies that again to some directory under /usr/lib. Somehow installation also happens in other directories, so the certificate mentioned in the error ends up copied somewhere else, hence the duplicate.
That does not explain why it worked locally and, as a matter of fact, in subsequent builds on Docker Hub. Maybe it was a slightly different version of runc. But even when it works, you end up with duplicated files and a bigger image than necessary.
The bottom line: don't use the root as the working directory. Set WORKDIR from the very beginning, even more so if that directory holds files that will be installed somewhere else.
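The merge described above can be reproduced outside Docker. The following is a minimal sketch that uses throwaway directories to stand in for the image root and the project; the file names (libc.so, Text/Example.pm) are hypothetical placeholders, and cp -R only approximates what ADD . . does.

```shell
# Simulate "ADD . ." into the image root with temporary directories.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/image_root/lib" "$sandbox/project/lib/Text"

# Stand-in for a file that already lives in the image's /lib.
echo "system library" > "$sandbox/image_root/lib/libc.so"
# Stand-in for a module file shipped in the project's own lib/ directory.
echo "package Text::Example;" > "$sandbox/project/lib/Text/Example.pm"

# Copying the project into the root merges its lib/ into the system lib/.
cp -R "$sandbox/project/." "$sandbox/image_root/"

# List what now lives under the "system" lib directory.
merged=$(cd "$sandbox/image_root" && find lib -type f | sort)
echo "$merged"

rm -rf "$sandbox"
```

Both the system file and the module file are listed under the same lib directory, which is the situation an installer later has to walk through. With WORKDIR /test set first, the project's lib would land in /test/lib instead.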

Related

Using cache for cmake build process in Docker build

I need an application in a Docker image that requires specific versions of libraries that have to be built from source.
So I am building them during the Docker build process.
The problem is that this takes a long time (about 30 minutes).
I am wondering whether it's possible to keep the result in a cached layer and skip the build the next time.
Here is the critical part of the Dockerfile:
ADD https://sqlite.org/2022/sqlite-autoconf-3380200.tar.gz sqlite-autoconf-3380200.tar.gz
RUN tar -xvzf sqlite-autoconf-3380200.tar.gz
WORKDIR sqlite-autoconf-3380200
RUN ./configure
RUN make
RUN make install
WORKDIR /tmp
ADD https://download.osgeo.org/proj/proj-9.0.0.tar.gz proj-9.0.0.tar.gz
RUN tar -xvzf proj-9.0.0.tar.gz
WORKDIR proj-9.0.0
RUN mkdir build
WORKDIR build
RUN cmake ..
RUN cmake --build .
RUN cmake --build . --target install
RUN projsync --system-directory --list-files
The important detail about Docker layer caching is that, if any of the previous steps have changed, then all of the following steps will be rebuilt. So for your setup, if you change anything in one of the earlier dependencies, it will cause all of the later steps to be rebuilt again.
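To make the chaining concrete, here is a toy model in shell. It is only an illustration under a simplifying assumption (a step's cache key is derived from its parent's key plus its own instruction text); it is not Docker's actual algorithm, and the step names and the 9.0.1 version are hypothetical.

```shell
# Toy model: a step's cache key depends on its parent's key and its own text.
layer_key() {   # usage: layer_key PARENT_KEY INSTRUCTION
  printf '%s\n%s\n' "$1" "$2" | cksum | cut -d' ' -f1
}

chain_keys() {  # usage: chain_keys INSTRUCTION...; prints one key per step
  key=base
  for instruction in "$@"; do
    key=$(layer_key "$key" "$instruction")
    echo "$key"
  done
}

# Two builds that differ only in the second step.
before=$(chain_keys "build sqlite 3.38.2" "build proj 9.0.0" "projsync")
after=$(chain_keys "build sqlite 3.38.2" "build proj 9.0.1" "projsync")

# Report, step by step, whether the second build could reuse the first one's key.
step=1
for new_key in $after; do
  old_key=$(echo "$before" | sed -n "${step}p")
  if [ "$new_key" = "$old_key" ]; then
    echo "step $step: cached"
  else
    echo "step $step: rebuilt"
  fi
  step=$((step + 1))
done
```

In this model the unchanged first step keeps its key, while the changed step gets a new one, and every later step inherits that change through its parent key.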
This is a case where Docker multi-stage builds can help. The idea is that you'd build each library in its own image, and therefore each library build can be independently cached. You can then copy all of the build results into a final image.
The specific approach I'll describe here assumes (a) all components install into /usr/local, (b) /usr/local is initially empty, and (c) there aren't conflicts between the different library installations. You should be able to adapt it to other filesystem layouts.
Everything below is in the same Dockerfile.
I'd make a very first stage selecting a base Linux-distribution image. If you know you'll always need to install something – TLS CA certificates, mandatory package updates – you can put it here. Having this helps ensure that everything is being built against a consistent base.
FROM ubuntu:20.04 AS base
# empty in this example
Since you have multiple things you need to build, a next stage will install any build-time dependencies. The C toolchain and its dependencies are large, so having this separate saves time and space since the toolchain can be shared across the later stages.
FROM base AS build-deps
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive \
apt-get install --no-install-recommends --assume-yes \
build-essential \
cmake
# libfoo-dev
Now for each individual library, you have a separate build stage that downloads the source, builds it, and installs it into /usr/local.
FROM build-deps AS sqlite
WORKDIR /sqlite
ADD https://sqlite.org/2022/sqlite-autoconf-3380200.tar.gz sqlite-autoconf-3380200.tar.gz
...
RUN make install
FROM build-deps AS proj
WORKDIR /proj
ADD https://download.osgeo.org/proj/proj-9.0.0.tar.gz proj-9.0.0.tar.gz
...
RUN cmake --build . --target install
To actually build your application, you'll need the C toolchain, plus you'll also need these various libraries.
FROM build-deps AS app
COPY --from=sqlite /usr/local/ /usr/local/
COPY --from=proj /usr/local/ /usr/local/
WORKDIR /app
COPY ./ ./
RUN ./configure && make && make install
Once you've done all of this, in the app image, the /usr/local tree will have all of the installed libraries (COPYed from the previous image) plus your application. So for the final stage, start from the original OS image (without the C toolchain) and COPY the /usr/local tree in (without the original sources).
FROM base
COPY --from=app /usr/local/ /usr/local/
EXPOSE 12345
# myapp is installed in /usr/local/bin
CMD ["myapp"]
Let's say you update to a newer patch version of proj. In the sqlite path, the base and build-deps layers haven't changed and the ADD and RUN commands are the same, so this stage runs entirely from cache. proj is rebuilt. That will cause the COPY --from=proj step to invalidate the cache in the app stage, and you'll rebuild your application against the newer library.

How to use docker to generate grpc code based on go.mod versions?

Using the official golang Docker image, I can use the protoc command to generate the x.pb.go and x_grpc.pb.go files. The problem is that it uses the latest versions, while I want to generate them using whichever versions are listed in the go.mod file.
I tried to start from the golang image, then copy in my project's go.mod file, get the dependencies, and try to generate from there. Here is my Dockerfile:
FROM golang:1.15
WORKDIR /app
RUN apt-get update
RUN apt install -y protobuf-compiler
COPY go.* .
RUN go mod download
RUN go get all
RUN export PATH="$PATH:$(go env GOPATH)/bin"
RUN mkdir /api
Then I try to bind-mount the .proto file and the /pb output folder as volumes, and run the protoc command again (I'm running it directly inside the container right now). Something like this:
protoc --proto_path=/api --go_out=/pb --go-grpc_out=/pb /api/x.proto
I'm getting this error though:
protoc-gen-go: program not found or is not executable
--go_out: protoc-gen-go: Plugin failed with status code 1.
My go.sum file has google.golang.org/protobuf v1.25.0 in it, so how come it is not found?
go.mod & go.sum are used for versioning when building go programs. This is not what you need here. You want the protoc compiler to use the correct plugin versions when running it against your .proto file(s).
To install the desired protoc-gen-go (and protoc-gen-go-grpc if using gRPC) plugins, install them directly. Update your Dockerfile like so:
FROM golang:1.15
WORKDIR /app
RUN apt-get update
RUN apt install -y protobuf-compiler
RUN GO111MODULE=on \
go get google.golang.org/protobuf/cmd/protoc-gen-go@v1.25.0 \
google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.1.0
# export is redundant here: `/go/bin` is already on the `golang` image's PATH
# (and any environment change made here is lost once the command completes)
# RUN export PATH="$PATH:$(go env GOPATH)/bin"
RUN mkdir /api
If you want the latest version of either plugin, either use @latest or drop the @ suffix.
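The note in the Dockerfile comments about export being lost can be checked without Docker. This is a small sketch that models two consecutive RUN steps as two separate sh -c invocations; DEMO_TOOL_DIR is a hypothetical variable name used only for the demonstration.

```shell
# Each RUN step starts its own shell process; model that with separate sh -c calls.
unset DEMO_TOOL_DIR

# First "RUN": export a variable, then the shell exits and the value is gone.
sh -c 'export DEMO_TOOL_DIR=/opt/demo/bin'

# Second "RUN": a fresh shell that never saw the export.
second_step=$(sh -c 'echo "${DEMO_TOOL_DIR:-not set}"')
echo "second step sees: $second_step"
```

This is why a value that must persist across steps is normally set with the Dockerfile's ENV instruction rather than with export inside a RUN.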

Run protoc command into docker container

I'm trying to run the protoc command in a Docker container.
I've tried using the gRPC image, but the protoc command is not found:
/bin/sh: 1: protoc: not found
So I assume I have to install it manually using RUN instructions, but is there a better solution? Is there an official precompiled image with protoc installed?
I've also tried to install it via a Dockerfile, but I again get protoc: not found.
This is my Dockerfile:
#I'm not using "FROM grpc/node" because that image can't unzip
FROM node:12
...
# Download proto zip
ENV PROTOC_ZIP=protoc-3.14.0-linux-x86_32.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.14.0/${PROTOC_ZIP}
RUN unzip -o ${PROTOC_ZIP} -d ./proto
RUN chmod 755 -R ./proto/bin
ENV BASE=/usr/local
# Copy into path
RUN cp ./proto/bin/protoc ${BASE}/bin
RUN cp -R ./proto/include/* ${BASE}/include
RUN protoc -I=...
I've run RUN echo $PATH to check that the folder is in the path, and it is:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
I've also run RUN ls -la /usr/local/bin to check that the protoc file is in the folder, and it shows:
-rwxr-xr-x 1 root root 4849692 Jan 2 11:16 protoc
So the file is in the /usr/local/bin folder and the folder is in the path.
Have I missed something?
Also, is there a simple way to get an image with protoc installed? Or is the best option to build my own image and pull it from my repository?
Thanks in advance.
Edit: Solved by downloading the linux-x86_64 zip file instead of x86_32. I had downloaded the lower architecture thinking that an x86_64 machine can run an x86_32 file but not the other way around. I don't know whether I'm missing something about architecture requirements (probably) or whether it's a bug.
Anyway, in case it helps someone, I found the solution and I've added an answer with the necessary Dockerfile to run protoc and protoc-gen-grpc-web.
The easiest way to get non-default tools like this is to install them through the underlying Linux distribution's package manager.
First, look at the Docker Hub page for the node image. (For "library" images like node, construct the URL https://hub.docker.com/_/node.) You'll notice there that there are several variations named "alpine", "buster", or "stretch"; plain node:12 is the same as node:12-stretch and node:12.20.0-stretch. The "alpine" images are based on Alpine Linux; the "buster" and "stretch" ones are different versions of Debian GNU/Linux.
For Debian-based packages, you can then look up the package on https://packages.debian.org/ (type protoc into the "Search the contents of packages" form at the bottom of the page). That leads you to the protobuf-compiler package. Knowing that contains the protoc binary, you can install it in your Dockerfile with:
# node:12 is Debian-based
FROM node:12
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive \
apt-get install --no-install-recommends --assume-yes \
protobuf-compiler
# The rest of your Dockerfile as above
COPY ...
RUN protoc ...
You generally must run apt-get update and apt-get install in the same RUN command, lest a subsequent rebuild get an old version of the package index from the Docker build cache. I generally have only a single apt-get install command if I can manage it, with the packages listed alphabetically, one to a line, for maintainability.
If the image is Alpine-based, you can do a similar search on https://pkgs.alpinelinux.org/contents to find protoc, and similarly install it:
FROM node:12-alpine
RUN apk add --no-cache protoc
# The rest of your Dockerfile as above
I finally solved my own issue.
The problem was the architecture: I was using linux-x86_32.zip, but it works with linux-x86_64.zip.
Even though @David Maze's answer is excellent and very complete, it didn't solve my problem, because apt-get install provides version 3.0.0 and I wanted 3.14.0.
So the Dockerfile I used to run protoc in a Docker container looks like this:
FROM node:12
...
# Download proto zip
ENV PROTOC_ZIP=protoc-3.14.0-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.14.0/${PROTOC_ZIP}
RUN unzip -o ${PROTOC_ZIP} -d ./proto
RUN chmod 755 -R ./proto/bin
ENV BASE=/usr
# Copy into path
RUN cp ./proto/bin/protoc ${BASE}/bin/
RUN cp -R ./proto/include/* ${BASE}/include/
# Download protoc-gen-grpc-web
ENV GRPC_WEB=protoc-gen-grpc-web-1.2.1-linux-x86_64
ENV GRPC_WEB_PATH=/usr/bin/protoc-gen-grpc-web
RUN curl -OL https://github.com/grpc/grpc-web/releases/download/1.2.1/${GRPC_WEB}
# Copy into path
RUN mv ${GRPC_WEB} ${GRPC_WEB_PATH}
RUN chmod +x ${GRPC_WEB_PATH}
RUN protoc -I=...
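A likely explanation for the earlier "not found" message, though it is not confirmed in the thread: a 64-bit image usually ships no 32-bit loader or libraries, so the shell cannot start the 32-bit binary even though the file exists. One way to avoid hard-coding the wrong archive is to derive the file name from uname -m. The sketch below assumes the release files follow the naming seen in this thread (x86_64, x86_32) plus aarch_64 for 64-bit ARM; check the names against the release page before relying on them. The helper protoc_arch is hypothetical.

```shell
# Map `uname -m` output to the architecture suffix used in protoc release names.
protoc_arch() {   # usage: protoc_arch MACHINE
  case "$1" in
    x86_64|amd64)   echo x86_64 ;;
    i386|i686)      echo x86_32 ;;
    aarch64|arm64)  echo aarch_64 ;;
    *)              echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

PROTOC_VERSION=3.14.0
# Build the archive name for the machine this script runs on.
PROTOC_ZIP="protoc-${PROTOC_VERSION}-linux-$(protoc_arch "$(uname -m)").zip"
echo "$PROTOC_ZIP"
```

Inside a Dockerfile this logic would have to live in a single RUN step (or a small script copied into the image), since shell variables do not carry over between steps.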
Because this is currently the highest-ranked result on Google and the instructions above won't work if you want to use docker/dind (e.g. for GitLab), this is how you can get the glibc dependency working for protoc there:
#!/bin/bash
# install gcompat, because protoc needs a real glibc or compatible layer
apk add gcompat
# install a recent protoc (use a version that fits your needs)
export PB_REL="https://github.com/protocolbuffers/protobuf/releases"
curl -LO $PB_REL/download/v3.20.0/protoc-3.20.0-linux-x86_64.zip
unzip protoc-3.20.0-linux-x86_64.zip -d $HOME/.local
export PATH="$PATH:$HOME/.local/bin"

Docker isn't caching Alpine apk add command

Every time I build the container I have to wait for apk add docker to finish, which takes a long time.
Since it downloads the same thing every time, can I somehow force Docker to cache apk's downloads for development purposes?
Here's my Dockerfile:
FROM golang:1.13.5-alpine
WORKDIR /go/src/app
COPY src .
RUN go get -d -v ./...
RUN go install -v ./...
RUN apk add --update docker
CMD ["app"]
By the way, I am using volumes: - /var/run/docker.sock:/var/run/docker.sock in my docker-compose.yml to use sibling containers, if that matters.
EDIT: I've found that Google copies a docker.tgz in Chromium:
# add docker client -- do not install docker via apk -- it will try to install
# docker engine which takes a lot of space as well (we don't need it, we need
# only the small client to communicate with the host's docker server)
ADD build/docker/docker.tgz /
What is that docker.tgz? How can I get it?
Reorder your Dockerfile and it should work.
FROM golang:1.13.5-alpine
RUN apk add --update docker
WORKDIR /go/src/app
COPY src .
RUN go get -d -v ./...
RUN go install -v ./...
CMD ["app"]
Because you copy src before installing, the cache for the Docker installation is invalidated whenever you change something in src.
Whenever you have a COPY command, if any of the files involved change, every command after it is re-run. If you move your RUN apk add ... command to the start of the file, before anything is COPYed, it will be cached across builds.
A fairly generic recipe for most Dockerfiles to accommodate this pattern looks like:
FROM some-base-image
# Install OS-level dependencies
RUN apk add or apt-get install ...
WORKDIR /app
# Install language-level dependencies
COPY requirements.txt requirements.lock ./
RUN something install -r requirements.txt
# Install the rest of the application
COPY main.app ./
COPY src src/
# Set up standard run-time metadata
EXPOSE 12345
CMD ["/app/main.app"]
(Go and Java applications need the additional step of compiling the application, which often lends itself to a multi-stage build, but this same pattern can be repeated in both stages.)
You can download the static Docker x86_64 binaries for Mac, Linux, or Windows, unzip or untar them, and make them executable.
Whenever you install packages in a Docker image, put those steps at the beginning of the Dockerfile so they are not re-run on every build, and put the COPY steps at the end.
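As a sketch of the download route mentioned above: Docker publishes static archives under download.docker.com, and the client binary can be extracted from one of them. The helper name docker_static_url and the version 20.10.9 are illustrative assumptions; whether the docker.tgz from the question is the same archive is not stated in the thread.

```shell
# Build the download URL for a static Docker archive.
docker_static_url() {   # usage: docker_static_url VERSION [ARCH]
  version=$1
  arch=${2:-x86_64}
  echo "https://download.docker.com/linux/static/stable/${arch}/docker-${version}.tgz"
}

url=$(docker_static_url 20.10.9)
echo "$url"

# To extract only the client binary (needs network access and write permission):
# curl -fsSL "$url" | tar -xzf - --strip-components=1 -C /usr/local/bin docker/docker
```

Extracting just docker/docker keeps the image small, since the daemon and its helpers stay out of it and the client talks to the host's daemon through the mounted socket.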

How to do configure, make and make install in docker build

Problem Statement
I am building a Docker image of my computational bioinformatics pipeline, which contains many tools that are called at different steps of the pipeline. As part of this, I am trying to add one tool, the ViennaRNA Package, which is downloaded and compiled from source. I have tried several ways to compile it in docker build (shown below), but none of them works.
Failed attempts
Code-1 :
FROM jupyter/scipy-notebook
USER root
MAINTAINER Vivek Ruhela <vivekr@iiitd.ac.in>
# Copy the application folder inside the container
ADD . /test1
# Set the default directory where CMD will execute
WORKDIR /test1
# Set environment variable
ENV HOME /test1
# Install RNAFold
RUN wget https://www.tbi.univie.ac.at/RNA/download/sourcecode/2_4_x/ViennaRNA-2.4.14.tar.gz -P ~/Tools
RUN tar xvzf ~/Tools/ViennaRNA-2.4.14.tar.gz -C ~/Tools
WORKDIR "~/Tools/ViennaRNA-2.4.14/"
RUN ./configure
RUN make && make check && make install
Error : configure file not found
Code-2 :
FROM jupyter/scipy-notebook
USER root
MAINTAINER Vivek Ruhela <vivekr@iiitd.ac.in>
# Copy the application folder inside the container
ADD . /test1
# Set the default directory where CMD will execute
WORKDIR /test1
# Set environment variable
ENV HOME /test1
# Install RNAFold
RUN wget https://www.tbi.univie.ac.at/RNA/download/sourcecode/2_4_x/ViennaRNA-2.4.14.tar.gz -P ~/Tools
RUN tar xvzf ~/Tools/ViennaRNA-2.4.14.tar.gz -C ~/Tools
RUN bash ~/Tools/ViennaRNA-2.4.14/configure
WORKDIR "~/Tools/ViennaRNA-2.4.14/"
RUN make && make check && make install
Error : make: *** No targets specified and no makefile found. Stop.
I also tried specifying the file location explicitly, e.g.
RUN make -C ~/Tools/ViennaRNA-2.4.14/
Still, this approach does not work.
Expected Procedure
I have installed this tool on my system many times using the standard procedure from the tool documentation:
./configure
make
make check
make install
Similarly, for Docker, the following code should work:
WORKDIR ~/Tools/ViennaRNA-2.4.14/
RUN ./configure && make && make check && make install
But this code does not work, because WORKDIR seems to have no effect. I have checked that configure creates the Makefile properly on my system, so it should create the Makefile in Docker as well.
Any suggestions on why this code is not working?
You are extracting all the files into the Tools folder, which is in the home directory. Try this:
WORKDIR $HOME/Tools/ViennaRNA-2.4.14
RUN ./configure
RUN make && make check && make install
The problem is that WORKDIR ~/Tools/ViennaRNA-2.4.14/ is taken literally as ~/Tools/ViennaRNA-2.4.14/, which creates a folder named ~ under the current working directory. WORKDIR does not go through a shell, so ~ is not expanded; use $HOME instead.
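The difference can be seen in a plain shell, where quoting the path suppresses tilde expansion much as WORKDIR does by never invoking a shell. This is only an analogy, run in a temporary directory so nothing is left behind.

```shell
sandbox=$(mktemp -d)

# Quoted: no tilde expansion, so a directory literally named "~" is created.
(cd "$sandbox" && mkdir -p "~/Tools")
if [ -d "$sandbox/~/Tools" ]; then
  literal_created=yes
else
  literal_created=no
fi
echo "literal ~ directory created: $literal_created"

# Unquoted: the shell replaces ~ with the value of $HOME before running echo.
expanded=$(echo ~/Tools)
echo "shell expansion gives: $expanded"

rm -rf "$sandbox"
```

In the Dockerfiles above, the RUN tar ... -C ~/Tools steps go through a shell and therefore extract into $HOME/Tools, while WORKDIR "~/Tools/..." points at a different, literal ~ directory. That mismatch is why configure and the Makefile are not found.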

Resources