How can I let the gitlab-ci-runner DinD image cache intermediate images?

How can I let the gitlab-ci-runner DinD image cache intermediate images? - docker

I have a Dockerfile that starts with installing the texlive-full package, which is huge and takes a long time. If I docker build it locally, the intermedate image created after installation is cached, and subsequent builds are fast.
However, if I push to my own GitLab install and the GitLab-CI build runner starts, this always seems to start from scratch, redownloading the FROM image, and doing the apt-get install again. This seems like a huge waste to me, so I'm trying to figure out how to get the GitLab DinD image to cache the intermediate images between builds, without luck so far.
I have tried using the --cache-dir and --docker-cache-dir for the gitlab-runner register command, to no avail.
Is this even something the gitlab-runner DinD image is supposed to be able to do?
My .gitlab-ci.yml:
build_job:
script:
- docker build --tag=example/foo .
My Dockerfile:
FROM php:5.6-fpm
MAINTAINER Roel Harbers <roel.harbers#example.com>
RUN apt-get update && apt-get install -qq -y --fix-missing --no-install-recommends texlive-full
RUN echo Do other stuff that has to be done every build.
I use GitLab CE 8.4.0 and gitlab/gitlab-runner:latest as runner, started as
docker run -d --name gitlab-runner --restart always \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /usr/local/gitlab-ci-runner/config:/etc/gitlab-runner \
gitlab/gitlab-runner:latest \
; \
The runner is registered using:
docker exec -it gitlab-runner gitlab-runner register \
--name foo.example.com \
--url https://gitlab.example.com/ci \
--cache-dir /cache/build/ \
--executor docker \
--docker-image gitlab/dind:latest \
--docker-privileged \
--docker-disable-cache false \
--docker-cache-dir /cache/docker/ \
; \
This creates the following config.toml:
concurrent = 1
[[runners]]
name = "foo.example.com"
url = "https://gitlab.example.com/ci"
token = "foobarsldkflkdsjfkldsj"
tls-ca-file = ""
executor = "docker"
cache_dir = "/cache/build/"
[runners.docker]
image = "gitlab/dind:latest"
privileged = true
disable_cache = false
volumes = ["/cache"]
cache_dir = "/cache/docker/"
(I have experimented with different values for cache_dir, docker_cache_dir and disable_cache, all with the same result: no caching whatsoever)

I suppose there's no simple answer to your question. Before adding some details, I strongly suggest to read this blog article from the maintainer of DinD, which was originally named "do not use Docker in Docker for CI".
What you might try is declaring /var/lib/docker as a volume for your GitLab runner. But be warned, depending on your file-system drivers you may use AUFS in the container on an AUFS filesystem on your host, which is very likely to cause problems.
What I'd suggest to you is creating a separate Docker-VM, only for the runner(s), and bind-mount docker.sock from the VM into your runner-container.
We are using this setup with GitLab with great success (>27.000 builds in about 12 months).
You can take a look at our runner with docker-compose support which is actually based on the shell-executor of GitLab's runner.

Currently you cannot cache intermediate layers in GitLab Docker-in-Docker. Altough there are plans to add that (that are mentioned in the link below). What you can do today to speed up your DinD build is to use the overlay filesystem. To do this you need to be running a liunx kernel >=3.18 and make sure you load the overlay kernel module. Then you set this variable in your gitlab-ci.yml:
variables:
DOCKER_DRIVER: overlay
For more information see this issue and in particular this comment on "The state of optimising Docker Builds!", see the "Using docker executor with dind" section.
https://gitlab.com/gitlab-org/gitlab-ce/issues/17861#note_12991518

For build dependencies that do not change so ofter you can do kinda manual caching with gitlab image registry.
In CI script you do not explicitely call docker build but rather wrap it in a shell script
# cat build_dependencies.sh
registry=registry.example.com
project=group/project
imagebase=$registry/$project/linux
docker pull $imagebase/devbase:1.0
if [ $? -ne 0 ]; then
docker build -f devbase.dockerfile -t $imagebase/devbase:1.0 .
docker push $imagebase/devbase:1.0
fi
...
and call that script in your CI
...
script:
- ./build_dependencies.sh
The downside to this is that when your devbase.dockerfile is updated this would get unnoticed by CI, so you need to force build and push of a new image. So for dynamicly changing images this does not work well, but for your use case this seems like a possible way to go.

Related

Is it possible to add an installer, run it and delete it during one build step in Docker?

I'm trying to create a Docker image from a pretty large installer binary (300+ MB). I want to add the installer to the image, install it, and delete the installer. This doesn't seem to be possible:
COPY huge-installer.bin /tmp
RUN /tmp/huge-installer.bin
RUN rm /tmp/huge-installer.bin # <- has no effect on the image size
Using multiple build stages doesn't seem to solve this, since I need to run the installer in the final image. If I could execute the installer directly from a previous build stage, without copying it, that would solve my problem, but as far as I know that's not possible.
Is there any way to avoid including the full weight of the installer in the final image?

I ended up solving this by using the built-in HTTP server in Python to make the project directory available to the image over HTTP.
Inside the Dockerfile, I can run commands like this, piping scripts directly to bash using curl:
RUN curl "http://127.0.0.1:${SERVER_PORT}/installer-${INSTALLER_VERSION}.bin" | bash
Or save binaries, run them and delete them in one step:
RUN curl -O "http://127.0.0.1:${SERVER_PORT}/binary-${INSTALLER_VERSION}.bin" && \
./binary-${INSTALLER_VERSION}.bin && \
rm binary-${INSTALLER_VERSION}.bin
I use a Makefile to start the server and stop it after the build, but you can use a build script instead.
Here's a Makefile example:
SHELL := bash
IMAGE_NAME := app-test
VERSION := 1.0.0
SERVER_PORT := 8580
.ONESHELL:
.PHONY: build
build:
# Kills the HTTP server when the build is done
function cleanup {
pkill -f "python3 -m http.server.*${SERVER_PORT}"
}
trap cleanup EXIT
# Starts a HTTP server that makes the contents of the project directory
# available to the image
python3 -m http.server -b 127.0.0.1 ${SERVER_PORT} &>/dev/null &
sleep 1
EXTRA_ARGS=""
# Allows skipping the build cache by setting NO_CACHE=1
if [[ -n $$NO_CACHE ]]; then
EXTRA_ARGS="--no-cache"
fi
docker build $$EXTRA_ARGS \
--network host \
--build-arg SERVER_PORT=${SERVER_PORT} \
-t ${IMAGE_NAME}:latest \
.
docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${VERSION}

I think the best way is to download the bin from a website then run it:
RUN wget http://myweb/huge-installer.bin && /tmp/huge-installer.bin && rm /tmp/huge-installer.bin
in this way your image layer will not contain the binary you download

I didn't test it thoroughly, but wouldn't such an approach be viable? (Besides LinPy's answer, which is way easier if you have the possibility to just do it that way.)
Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /tmp/entrypoint.sh
RUN \
echo "I am an image that can run your huge installer binary!" \
&& echo "I will only function when you give it to me as a volume mount."
ENTRYPOINT [ "/tmp/entrypoint.sh" ]
entrypoint.sh:
#!/bin/sh
/tmp/your-installer # install your stuff here
while true; do
echo "installer finished, commit me now!"
sleep 5
done
Then run:
$ docker build -t foo-1
$ docker run --rm --name foo-1 --rm -d -v $(pwd)/your-installer:/tmp/your-installer
$ docker logs -f foo-1
# once it echoes "commit me now!", run the next command
$ docker commit foo-1 foo-2
$ docker stop foo-1
Since the installer was only mounted as a volume, the image foo-2 should not contain it anymore. You could also go and build another Dockerfile based on foo-2 to change the entrypoint, for example.
Cf. docker commit

How to serve multiple versions of model via standard tensorflow serving docker image?

I'm new to Tensorflow serving,
I just tried Tensorflow serving via docker with this tutorial and succeeded.
However, when I tried it with multiple versions, it serves only the latest version.
Is it possible to do that? Or do I need to try something different?

This require a ModelServerConfig, which will be supported by the next docker image tensorflow/serving release 1.11.0 (available since 5. Okt 2018). Until then, you can create your own docker image, or use tensorflow/serving:nightly or tensorflow/serving:1.11.0-rc0 as stated here.
See that thread for how to implement multiple models.
If you on the other hand want to enable multiple versions of a single model, you can use the following config file called "models.config":
model_config_list: {
config: {
name: "my_model",
base_path: "/models/my_model",
model_platform: "tensorflow",
model_version_policy: {
all: {}
}
}
}
here "model_version_policy: {all:{ } }" make every versions of the model available.
Then run the docker:
docker run -p 8500:8500 8501:8501 \
--mount type=bind,source=/path/to/my_model/,target=/models/my_model \
--mount type=bind,source=/path/to/my/models.config,target=/models/models.config \
-t tensorflow/serving:nightly --model_config_file=/models/models.config
Edit:
Now that version 1.11.0 is available, you can start by pulling the new image:
docker pull tensorflow/serving
Then run the docker image as above, using tensorflow/serving instead of tensorflow/serving:nightly.

I found a way to achieve this by building my own docker image which uses --model_config_file option instead of --model_name and --model_base_path.
So I'm running tensorflow serving with below command.
docker run -p 8501:8501 -v {local_path_of_models.conf}:/models -t {docker_iamge_name}
Of course, I wrote 'models.conf' for multiple models also.
edit:
Below is what I modified from original docker file.
original version:
tensorflow_model_server --port=8500 --rest_api_port=8501 \
--model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
modified version:
tensorflow_model_server --port=8500 --rest_api_port=8501 \
--model_config_file=${MODEL_BASE_PATH}/models.conf \

docker build using previous build caches from registry

I'm configuring a bamboo build plan to build docker images. Using AWS ECS as registry. Build plan is something like this;
pull the latest tag
docker pull xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
build image with latest tag
docker build -t myimage:latest .
tag the image (necessary for ECS)
docker tag -f myimage:latest xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Push the image to the registry
docker push xx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Because build tasks run on different and fresh build engines/servers every time, It doesn't have local cache.
When I don't change anything at Dockerfile and execute it again(at another server), I would expect docker to use local cache(comes from docker pull) and doesn't execute each line again. But it tries to build image everytime. I was also expecting that when I change something at the bottom of the file, it's going to use cache and executes only the latest line, but I'm not sure about this.
Do I know something wrong or are there any opinions on approach?

are you considering using squid proxy
?
edit : in case you dont want to go to the official website above, here is quick setup on squid proxy (debian based)
apt-get install squid-deb-proxy
and then change the squid configuration to create a larger space by open up
/etc/squid/squid.conf
and replace #cache_dir ufs /var/spool/squid with cache_dir ufs /var/spool/
squid 10000 16 256
and there you go,, a 10.000 MB worth of cache space
and then point the proxy address in dockerfile,,
here is a example of dockerfile with squid proxy
yum and apt-get based distro:
apt-get based distro
`FROM debian
RUN apt-get update -y && apt-get install net-tools
RUN echo "Acquire::http::Proxy \"http://$( \
route -n | awk '/^0.0.0.0/ {print $2}' \
):8000\";" \ > /etc/apt/apt.conf.d/30proxy
RUN echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> \
/etc/apt/apt.conf.d/30proxy
CMD ["/bin/bash"]`
yum based distro
`FROM centos:centos7
RUN yum update -y && yum install -y net-tools
RUN echo "proxy=http://$(route -n | \
awk '/^0.0.0.0/ {print $2}'):3128" >> /etc/yum.conf
RUN sed -i 's/^mirrorlist/#mirrorlist/' \
/etc/yum.repos.d/CentOS-Base.repo
RUN sed -i 's/^#baseurl/baseurl/' \
/etc/yum.repos.d/CentOS-Base.repo
RUN rm -f /etc/yum/pluginconf.d/fastestmirror.conf
RUN yum update -y
CMD ["/bin/bash"]`
lets say you install squid proxy in your aws registry,,only the first build would fetch the data from the internet the rest (another server) build should be from the squid proxy cached. .
this technique based on book docker in practice technique 57 with tittle set up a package cache for faster build
i dont think there is a cache feature in docker without any third party software.. maybe there is and i just dont know it. .im not sure,,
just correct me if im wrong. .

Docker: share private key via arguments

I want to share my github private key into my docker container.
I'm thinking about sharing it via docker-compose.yml via ARGs.
Is it possible to share private key using ARG as described here?
Pass a variable to a Dockerfile from a docker-compose.yml file
# docker-compose.yml file
version: '2'
services:
my_service:
build:
context: .
dockerfile: ./docker/Dockerfile
args:
- PRIVATE_KEY=MULTI-LINE PLAIN TEXT RSA PRIVATE KEY
and then I expect to use it in my Dockerfile as:
ARG PRIVATE_KEY
RUN echo $PRIVATE_KEY >> ~/.ssh/id_rsa
RUN pip install git+ssh://git#github.com/...
Is it possible via ARGs?

If you can use the latest docker 1.13 (or 17.03 ce), you could then use the docker swarm secret: see "Managing Secrets In Docker Swarm Clusters"
That allows you to associate a secret to a container you are launching:
docker service create --name test \
--secret my_secret \
--restart-condition none \
alpine cat /run/secrets/my_secret
If docker swarm is not an option in your case, you can try and setup a docker credential helper.
See "Getting rid of Docker plain text credentials". But that might not apply to a private ssh key.
You can check other relevant options in "Secrets and LIE-abilities: The State of Modern Secret Management (2017)", using standalone secret manager like Hashicorp Vault.

Although the ARG itself will not persist in the built image, when you reference the ARG variable somewhere in the Dockerfile, that will be in the history:
FROM busybox
ARG SECRET
RUN set -uex; \
echo "$SECRET" > /root/.ssh/id_rsa; \
do_deploy_work; \
rm /root/.ssh/id_rsa
As VonC notes there's now a swarm feature to store and manage secrets but that doesn't (yet) solve the build time problem.
Builds
Coming in Docker ~ 1.14 (or whatever the equivalent new release name is) should be the --build-secret flag (also #28079) that lets you mount a secret file during a build.
In the mean time, one of the solutions is to run a network service somewhere that you can use a client to pull secrets from during the build. Then if the build puts the secret in a file, like ~/.ssh/id_rsa, the file must be deleted before the RUN step that created it completes.
The simplest solution I've seen is serving a file with nc:
docker network create build
docker run --name=secret \
--net=build \
--detach \
-v ~/.ssh/id_rsa:/id_rsa \
busybox \
sh -c 'nc -lp 8000 < /id_rsa'
docker build --network=build .
Then collect the secret, store it, use it and remove it in the Dockerfile RUN step.
FROM busybox
RUN set -uex; \
nc secret 8000 > /id_rsa; \
cat /id_rsa; \
rm /id_rsa
Projects
There's a number of utilities that have this same premise, but in various levels of complexity/features. Some are generic solutions like Hashicorps Vault.
Dockito Vault
Hashicorp Vault
docker-ssh-exec

How to test the container or image after docker build?

I have the following Dockerfile
############################################################
# Purpose : Dockerize Django App to be used in AWS EC2
# Django : 1.8.1
# OS : Ubuntu 14.04
# WebServer : nginx
# Database : Postgres inside RDS
# Python : 2.7
# VERSION : 0.1
############################################################
from ubuntu:14.04
maintainer Kim Stacks, kimcity#gmail.com
# make sure package repository is up to date
run echo "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main universe" > /etc/apt/sources.list
run apt-get update
# install python
# install nginx
Inside my VM, I did the following:
docker build -t ubuntu1404/djangoapp .
It is successful.
What do I do to run the docker image?
Where is the image or container?
I have already tried running
docker run ubuntu1404/djangoapp
Nothing happens.
What I see when I run docker images
root#vagrant-ubuntu-trusty-64:/var/virtual/Apps/DockerFiles/Django27InUbuntu# docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
ubuntu1404/djangoapp latest cfb161605c8e 10 minutes ago 198.3 MB
ubuntu 14.04 07f8e8c5e660 10 days ago 188.3 MB
hello-world latest 91c95931e552 3 weeks ago 910 B
When I run docker ps, nothing shows up

You have to give a command your container will have to process.
Example : sh
you could try :
docker run -ti yourimage sh
(-ti is used to keep a terminal open)
If you want to launch a daemon (like a server), you will have to enter something like :
docker run -d yourimage daemontolaunch
Use docker help run for more options.
You also can set a default behaviour with CMD instruction in your Dockerfile so you won't have to give this command to your container each time you want to run it.
EDIT - about container removing :
Containers and images are different.
A container is an instance of an image.
You can run several containers from the same image.
The container automatically stops when the process it runs terminates.
But the container isn't deleted (just stopped, so you can restart it).
But if you want to remove it (removing a container doesn't remove the image) you have two ways to do :
automatically removing it at the end of the process by adding --rm option to docker run.
Manually removing it by using the docker rm command and giving it the container ID or its name (a container has to be stopped before being removed, use docker stop for this).
A usefull command :
Use docker ps to list containers. -q to display only the container IDs, -a to display even stopped containers.
More here.
EDIT 2:
This could also help you to discover docker if you didn't try it.

How to test the container or image after docker build?
In order to test you can add write a bash script which will do the job https://blog.brazdeikis.io/posts/docker-image-tests
Btw, from the post, I see that it does not match the question from the title.
So, Added a link for the souls who arrived here based on the title...

Download the latest shaded dist from https://github.com/dgroup/docker-unittests/releases:
wget https://github.com/dgroup/docker-unittests/releases/download/s1.1.1/docker-unittests-app-1.1.1.jar
De fine an *.yml file with tests.
version: 1.1
setup:
- apt-get update
- apt-get install -y tree
tests:
- assume: java version is 1.9, Debian build
cmd: java -version
output:
contains:
- openjdk version "9.0.1"
- build 9.0.1+11-Debian
- assume: curl version is 7.xxx
cmd: curl --version
output:
startsWith: curl 7.
matches:
- "^curl\\s7.*\\n.*\\nProtocols.+ftps.+https.+telnet.*\\n.*\\n$"
contains:
- AsynchDNS IDN IPv6 Largefile GSS-API
- assume: Setup section installed `tree`
cmd: tree --version
output:
contains: ["Steve Baker", "Florian Sesser"]
Run tests for image
java -jar docker-unittests.jar -f image-tests.yml -i openjdk:9.0.1-11
https://i.stack.imgur.com/DSv72.png

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

How can I let the gitlab-ci-runner DinD image cache intermediate images? - docker

Related

Is it possible to add an installer, run it and delete it during one build step in Docker?

How to serve multiple versions of model via standard tensorflow serving docker image?

docker build using previous build caches from registry

Docker: share private key via arguments

How to test the container or image after docker build?

Categories

Resources