Why might the host behave more deterministically than a Docker container?

We use Docker to define the build environment precisely and to help with deterministic builds, but on my machine I get a tiny difference in the build results when using Docker that I don't get when building without Docker.
I did pretty extensive testing and am out of ideas :(
I tested on the following systems:
A: My new PC without Docker
AD1: My new PC with Docker, using our Dockerfile based on ubuntu:18.04 compiled "a year ago"
AD2: My new PC with Docker, using our Dockerfile based on ubuntu:19.10 compiled now
B: My laptop (that I had copied the disk from to my new PC) without Docker
BD: My laptop with Docker
CD1: Co-worker's laptop with Docker, using our Dockerfile based on ubuntu:18.04 compiled "a year ago"
CD2: Co-worker's laptop with Docker, using our Dockerfile based on ubuntu:19.10 compiled now
DD: A Digital Ocean VPS with our Dockerfile based on ubuntu:18.04 compiled now
In all scenarios we got either of two build results I will name variant X and Y.
We got variant X using A, B, CD1, CD2 and DD.
We got variant Y using AD1, AD2 and BD.
The issue has been 100% reproducible across several releases of our Android app. It did not go away when I updated Docker from 19.03.6 to 19.03.8 to match my co-worker's version. We were both on Ubuntu 19.10 back then, and I still get the issue on Ubuntu 20.04.
I always freshly cloned our project into a new folder, used disorderfs to eliminate file-system ordering issues, and mounted the folder into the Docker container.
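For context, a minimal sketch of how that disorderfs mount plus container invocation might look; the repository URL, image name and Gradle task here are illustrative assumptions, not the project's actual ones:
# Freshly clone, expose the tree through disorderfs with deterministic ordering,
# then mount the sorted view into the build container.
git clone https://example.org/our-app.git /tmp/app-src
mkdir -p /tmp/app-sorted
disorderfs --sort-dirents=yes --reverse-dirents=no /tmp/app-src /tmp/app-sorted
docker run --rm -v /tmp/app-sorted:/project -w /project our-android-builder ./gradlew assembleRelease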
I doubt it's relevant but we are using this Dockerfile:
FROM ubuntu:18.04
RUN dpkg --add-architecture i386 && \
    apt-get update -y && \
    apt-get install -y software-properties-common && \
    apt-get update -y && \
    apt-get install -y wget \
        openjdk-8-jre-headless=8u162-b12-1 \
        openjdk-8-jre=8u162-b12-1 \
        openjdk-8-jdk-headless=8u162-b12-1 \
        openjdk-8-jdk=8u162-b12-1 \
        git unzip && \
    rm -rf /var/lib/apt/lists/* && \
    apt-get autoremove -y && \
    apt-get clean
# download and install Android SDK
ARG ANDROID_SDK_VERSION=4333796
ENV ANDROID_HOME /opt/android-sdk
RUN mkdir -p /opt/android-sdk && cd /opt/android-sdk && \
    wget -q https://dl.google.com/android/repository/sdk-tools-linux-${ANDROID_SDK_VERSION}.zip && \
    unzip *tools*linux*.zip && \
    rm *tools*linux*.zip && \
    yes | $ANDROID_HOME/tools/bin/sdkmanager --licenses
Also, here are the build instructions I run that produce the different results. The diff itself can be found here.
Edit: I also filed it as a bug on the docker repo.

Docker is not fully architecture-independent. On different architectures you may see more or less minute differences. Usually they should not affect anything important, but they may change some optimisation decisions of a compiler and similar things. It is more visible if you compare very different CPUs, like AMD64 vs. ARM. For Java it should not matter, but it seems that at least sometimes it does.
Another thing is the network and DNS. When you run apt-get, wget and similar commands, they download code or binaries over the network. The result may differ depending on which DNS you use (which may lead to a different server or a different repository URL), and there can be minute differences between mirrors. Theoretically there should be no difference, but in practice there sometimes is, for example when a new version is being rolled out and is only visible on some nodes, when something went wrong on a mirror, or when you connect through a cache/proxy that serves stale content.
The latter can also create differences that appear over time: the app is compiled one month, someone tries to verify it a few weeks or months later, apt-get installs different versions of libraries, and the result contains minute differences.
I'm not sure which of these applies here, but I have some ideas:
You may try to make some small changes to the app so that it again builds identically on most popular CPUs, do extensive testing, and then list the architectures on which the build can be verified.
You may make the verification process a little more complex and non-free, so that users have to run a server instance (on AWS, Google, Azure, Rackspace or elsewhere) with a specified architecture and build and verify there; you may try to specify exactly on which types of machines the result will be the same and what the minimal requirements are (as it may or may not run on free-plan instances).
Check a diff of the created images' contents (not only the APK but the full system image); maybe something important differs between the Docker images on the machines that produce different results.
Try to find as small an initial image as possible, and don't let apt-get or other tools install dependencies at their newest versions automatically; instead specify all dependencies and their exact versions (a sketch of this follows below).
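As a hedged sketch of that last point, pinning every package in the Dockerfile might look like this; the JDK version is taken from the Dockerfile above, while the git and unzip version strings are placeholders (check apt-cache policy <package> for the versions you actually want to freeze):
FROM ubuntu:18.04
# Pin exact package versions so a rebuild weeks or months later resolves
# the same binaries; the git/unzip versions below are placeholders.
RUN apt-get update -y && \
    apt-get install -y \
        openjdk-8-jdk-headless=8u162-b12-1 \
        git=1:2.17.1-1ubuntu0.4 \
        unzip=6.0-21ubuntu1 && \
    rm -rf /var/lib/apt/lists/*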

Related

gRPC service definitions: containerize .proto compilation?

Let's say we have a services.proto with our gRPC service definitions, for example:
service Foo {
  rpc Bar (BarRequest) returns (BarReply) {}
}
message BarRequest {
  string test = 1;
}
message BarReply {
  string test = 1;
}
We could compile this locally to Go by running something like
$ protoc --go_out=. --go_opt=paths=source_relative \
    --go-grpc_out=. --go-grpc_opt=paths=source_relative \
    services.proto
My concern though is that running this last step might produce inconsistent output depending on the installed version of the protobuf compiler and the Go plugins for gRPC. For example, two developers working on the same project might have slightly different versions installed locally.
It would seem reasonable to me to address this by containerizing the protoc step. For example, with a Dockerfile like this...
FROM golang:1.18
WORKDIR /src
RUN apt-get update && apt-get install -y protobuf-compiler
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.26
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.1
CMD protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative services.proto
... we can run the protoc step inside a container:
docker run --rm -v $(pwd):/src $(docker build -q .)
After wrapping the previous command in a shell script, developers can run it on their local machine, giving them deterministic, reproducible output. It can also run in a CI/CD pipeline.
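A minimal sketch of such a wrapper script (the file name generate.sh is just an assumption):
#!/usr/bin/env bash
# generate.sh: regenerate the Go gRPC stubs inside a pinned container image.
set -euo pipefail
cd "$(dirname "$0")"
# Build the image from the Dockerfile above (-q prints only the image ID)
# and run its protoc CMD with the current directory mounted at /src.
docker run --rm -v "$(pwd)":/src "$(docker build -q .)"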
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
NB, I was surprised to find that the official grpc/go image does not come with protoc preinstalled. Am I off the beaten path here?
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
It is definitely a sound approach; I do the same, not only to keep the toolchain consistent across the team, but also to ensure we can produce the same output on different OSs.
There is an easier way to do that, though.
Look at this repo: https://github.com/jaegertracing/docker-protobuf
The image is on Docker Hub, but you can create your own image if you prefer.
I use this command to generate Go:
docker run --rm -u $(id -u) \
    -v${PWD}/protos/:/source \
    -v${PWD}/v1:/output \
    -w/source jaegertracing/protobuf:0.3.1 \
    --proto_path=/source \
    --go_out=paths=source_relative,plugins=grpc:/output \
    -I/usr/include/google/protobuf \
    /source/*

Lightweight GCC for Alpine

Is there a lightweight GCC distribution that I can install in Alpine?
I am trying to make a small Docker image. For that reason, I am using Alpine as the base image (5MB). The standard GCC install dwarfs this in comparison (>100MB).
So is there a lightweight GCC distribution that I can install on Alpine?
Note: Clang is much worse (475MB last I checked).
There isn't such an image available, AFAIK, but you can make GCC slimmer by deleting unneeded GCC binaries.
It very much depends on what capabilities are required from GCC.
As a starting point, I'm assuming you need C support only, which means the gcc and musl-dev packages (for standard headers) are installed; this results in a ~100MB image with Alpine 3.8.
If you don't need Objective-C support, you could remove cc1obj, which is the Objective-C backend. On Alpine 3.8, it would be located at /usr/libexec/gcc/x86_64-alpine-linux-musl/6.4.0/cc1obj, and takes up 17.6MB.
If you don't need link time optimization (LTO), you could remove the LTO wrapper and main executables, lto-wrapper and lto1, which take up 700kb and 16.8MB respectively.
While LTO may be powerful, on most applications it's likely to yield only minor speed and size improvements (a few percent). Plus, you have to opt in to LTO, which most applications don't do, so it may be a good candidate for removal.
You could remove the Java front end, gcj, which doesn't seem to be working anyway. It is located at /usr/bin/x86_64-alpine-linux-musl-gcj and weighs 812kB.
By removing these and squashing the resulting image, it shrinks to 64.4MB, which is still considerably large. You may be able to shrink it further by removing additional files, but then you may lose some desired functionality, with a less appealing tradeoff.
Here's an example Dockerfile:
FROM alpine:3.8
RUN set -ex && \
    apk add --no-cache gcc musl-dev
RUN set -ex && \
    rm -f /usr/libexec/gcc/x86_64-alpine-linux-musl/6.4.0/cc1obj && \
    rm -f /usr/libexec/gcc/x86_64-alpine-linux-musl/6.4.0/lto1 && \
    rm -f /usr/libexec/gcc/x86_64-alpine-linux-musl/6.4.0/lto-wrapper && \
    rm -f /usr/bin/x86_64-alpine-linux-musl-gcj
Tested using:
sudo docker image build --squash -t alpine-gcc-minimal .
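To sanity-check that plain C compilation still works after removing those binaries, a quick smoke test along these lines can be run against the squashed image (the file name is arbitrary):
# Compile and run a trivial program inside the trimmed image.
echo 'int main(void) { return 0; }' > hello.c
docker run --rm -v "$PWD":/work -w /work alpine-gcc-minimal \
    sh -c 'gcc hello.c -o hello && ./hello && echo build OK'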

docker build using previous build caches from registry

I'm configuring a Bamboo build plan to build Docker images, using AWS ECR as the registry. The build plan is something like this:
pull the latest tag
docker pull xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
build image with latest tag
docker build -t myimage:latest .
tag the image (necessary for ECS)
docker tag -f myimage:latest xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Push the image to the registry
docker push xx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Because the build tasks run on different, fresh build agents/servers every time, there is no local cache.
When I don't change anything in the Dockerfile and run the plan again (on another server), I would expect Docker to use the local cache (coming from the docker pull) and not execute each line again. But it rebuilds the whole image every time. I was also expecting that when I change something at the bottom of the file, it would use the cache and execute only the last lines, but I'm not sure about this.
Am I getting something wrong, or are there any opinions on this approach?
Have you considered using a squid proxy?
Edit: in case you don't want to go to the official website above, here is a quick setup of a squid proxy (Debian-based).
apt-get install squid-deb-proxy
Then change the squid configuration to create a larger cache space: open /etc/squid/squid.conf and replace
#cache_dir ufs /var/spool/squid
with
cache_dir ufs /var/spool/squid 10000 16 256
and there you go: 10,000 MB worth of cache space.
Then point apt/yum at the proxy address in the Dockerfile. Here are example Dockerfiles using the squid proxy for apt-get-based and yum-based distros.
apt-get-based distro:
FROM debian
RUN apt-get update -y && apt-get install -y net-tools
RUN echo "Acquire::http::Proxy \"http://$( \
    route -n | awk '/^0.0.0.0/ {print $2}' \
    ):8000\";" > /etc/apt/apt.conf.d/30proxy
RUN echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> \
    /etc/apt/apt.conf.d/30proxy
CMD ["/bin/bash"]
yum-based distro:
FROM centos:centos7
RUN yum update -y && yum install -y net-tools
RUN echo "proxy=http://$(route -n | \
    awk '/^0.0.0.0/ {print $2}'):3128" >> /etc/yum.conf
RUN sed -i 's/^mirrorlist/#mirrorlist/' \
    /etc/yum.repos.d/CentOS-Base.repo
RUN sed -i 's/^#baseurl/baseurl/' \
    /etc/yum.repos.d/CentOS-Base.repo
RUN rm -f /etc/yum/pluginconf.d/fastestmirror.conf
RUN yum update -y
CMD ["/bin/bash"]
Let's say you install the squid proxy on your AWS registry host: only the first build would fetch the data from the internet, and the rest of the builds (on other servers) should be served from the squid proxy cache.
This technique is based on the book Docker in Practice, technique 57, "Set up a package cache for faster builds".
I don't think there is a cache feature in Docker without some third-party software. Maybe there is and I just don't know it; I'm not sure, so correct me if I'm wrong.

building in docker using buildpack-deps, but dependencies don't seem to be installed?

I'm trying to write a Dockerfile to build Kaldi (an open source speech recognition system) based on the "buildpack-deps:jessie-scm" image. This is my Dockerfile:
FROM buildpack-deps:jessie-scm
RUN apt-get update
RUN apt-get install -y python2.7 libtool python libtool-bin make
RUN mkdir /opt/kaldi
RUN git clone https://github.com/kaldi-asr/kaldi.git /opt/kaldi --depth=1
RUN ln -s -f bash /bin/sh
WORKDIR /opt/kaldi
RUN cd tools/extras && ./check_dependencies.sh
RUN cd tools && ./install_portaudio.sh
RUN cd tools && make -j 4 && make clean
RUN cd src && ./configure --shared --use-cuda=no && make depend && make -j 4 && make -j 4 online onlinebin online2 && make clean
This fails at the "check_dependencies.sh" script, which is complaining that various base dependencies aren't installed (g++, zlib, automake, autoconf, patch, bzip2) ... but the description of the image that I'm basing this on (https://github.com/docker-library/buildpack-deps/blob/587934fb063d770d0611e94b57c9dd7a38edf928/jessie/Dockerfile) suggests that all of these dependencies should be available in the base image. Why is my build failing here?
I should note that I've attempted these build steps on a bare Debian Jessie system with the required dependencies installed and they were successful there, so I don't think it's a problem with the build scripts provided with Kaldi, but definitely a Docker-related issue.
It looks like I had misunderstood the different tags of the buildpack-deps image. The *-scm tags don't add source-control tools on top of the bundled build tools and libraries; they only add the source-control tools, and the build tools are then layered on top of those in the full image. So I should just be using buildpack-deps:jessie, not buildpack-deps:jessie-scm (the latter is basically a bare Debian system with git etc. installed but nothing else).
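For reference, the fix amounts to a one-line change at the top of the Dockerfile; only the relevant portion is shown, with the remaining steps as in the question:
FROM buildpack-deps:jessie
# The full buildpack-deps tag already ships g++, make, patch, bzip2, zlib
# headers, autoconf, automake and friends, so check_dependencies.sh should
# no longer complain about missing base tools.
RUN apt-get update
RUN apt-get install -y python2.7 libtool python libtool-bin
# ...rest of the original Dockerfile unchanged...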

Debian httpredir mirror system unreliable/unusable in Docker?

Short Version
Debian's httpredir.debian.org mirror service causes my Docker builds to fail very frequently because apt-get can't download a package or connect to a server or things like that. Am I the only one having this problem? Is the problem mine, Debian's, or Docker's? Is there anything I can do about it?
Long Version
I have several Dockerfiles built on debian:jessie, and Debian by default uses the httpredir.debian.org service to find the best mirror when using apt-get, etc. Several months ago, httpredir was giving me continual grief when trying to build images. When run inside a Dockerfile, apt-get using httpredir would almost always mess up on a package or two, and the whole build would fail. The error usually looked like a mirror was outdated or corrupt in some way. I eventually stopped using httpredir in all my Dockerfiles by adding the following lines:
# don't use httpredir.debian.org mirror as it's very unreliable
RUN echo deb http://ftp.us.debian.org/debian jessie main > /etc/apt/sources.list
Today I went back to trying httpredir.debian.org again because ftp.us.debian.org is out of date for a package I need, and sure enough it's failing on Docker Hub:
Failed to fetch http://httpredir.debian.org/debian/pool/main/n/node-retry/node-retry_0.6.0-1_all.deb Error reading from server. Remote end closed connection [IP: 128.31.0.66 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Here's the apt-get command I'm running in this case, though I've encountered it with many others:
RUN apt-get update && apt-get install -y \
    build-essential \
    chrpath \
    libssl-dev \
    libxft-dev \
    libfreetype6 \
    libfreetype6-dev \
    libfontconfig1 \
    libfontconfig1-dev \
    curl \
    bzip2 \
    nodejs \
    npm \
    git
Thanks for any help you can provide.
I just had the same problem today, when rebuilding a Dockerfile I had not built in a while.
Adding this line before the apt-get install seems to do the trick:
RUN apt-get clean
Got the idea here:
https://github.com/docker/hub-feedback/issues/556
https://github.com/docker-library/buildpack-deps/issues/40
https://github.com/Silverlink/buildpack-deps/commit/be1f24eb136ba87b09b1dd09cc9a48707484b417
From the discussion on this question, and from my experience dealing with this issue repeatedly over a number of months, apt-get clean does not seem to help in and of itself; rather, the fact that you're rebuilding (so httpredir usually picks a different mirror) is what gets it to work. Indeed, without exception, manually triggering a rebuild or two has resulted in a successful build.
That is obviously not a viable solution, though. So no, I don't have a solution, but I also don't have enough reputation to mark this as a duplicate.
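For what it's worth, when retrying such a build locally it helps to bypass the layer cache, so that apt-get actually contacts httpredir again (and usually lands on a different mirror) instead of reusing a layer built from a stale one; the image name here is a placeholder:
docker build --no-cache -t myimage .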
