How to test ansible roles? - docker

Scenario
I want to develop ansible roles. Those roles should be validated through a CI/CD process with molecule and utilize docker as driver. This validation step should include multiple Linux flavours (e.g. centos/ubuntu/debian) times the supported ansible versions.
The tests should then be executed such that the role is verified with
centos:7 + ansible:2.5
centos:7 + ansible:2.6
centos:7 + ansible:2.7
...
ubuntu:1604 + ansible:2.5
ubuntu:1604 + ansible:2.6
ubuntu:1604 + ansible:2.7
...
The issues at hand
There is no official ansible image available.
How to best test a role for ansible version compatibility?
Issue 1: no official ansible images
The official images by the ansible team have been deprecated for about 3 years now:
https://hub.docker.com/r/ansible/ubuntu14.04-ansible
https://hub.docker.com/r/ansible/centos7-ansible
In addition, the link that the deprecated images point to for finding new images supporting ansible is next to useless due to the sheer number of results
https://hub.docker.com/search/?q=ansible&page=1&isAutomated=0&isOfficial=0&pullCount=1&starCount=0
Is there a well-maintained ansible docker image by the community (or ansible) that fills the void?
Preferably with multiple versions that can be pulled, and a CI process that builds and validates the created images regularly.
Why am I looking for ansible images? I do not want to reinvent the wheel (if possible). I would like to use the images to test ansible roles via molecule for version incompatibility.
I searched but could not find anything truly useful. What images are you using to run ansible in a container/orchestrator? Do you build and maintain the images yourself?
e.g. https://hub.docker.com/r/ansibleplaybookbundle/apb-base/tags
looked promising; however, the images there are also at least 7 months old.
Issue 2: how to best test a role for ansible version compatibility?
Is creating docker images for each combination of OS and ansible version the best way to test via molecule with docker as driver? Or is there a smarter way to test backward compatibility of ansible roles across multiple OSes and multiple ansible versions?
I already test my ansible roles with molecule and docker as driver. Those tests currently only cover the functionality of the role on various Linux distros, not backward compatibility of running the role with older ansible versions.
Here is an example role with travis tests for centos7/ubuntu1604/ubuntu1804, based on geerlingguy's ntp role: https://github.com/Gepardec/ansible-role-ntp

Solution
In order to test ansible roles with multiple versions of ansible, python and various Linux flavors we can use
molecule for our ansible role functionality
docker as our abstraction layer on which we run the target system for our ansible role
tox to set up generic virtualenvs and test our various combinations without side effects
travis to automate it all
This will be quite a long/detailed answer. You can check out an example ansible role with the whole setup here
https://github.com/ckaserer/ansible-role-example
https://travis-ci.com/ckaserer/ansible-role-example
Step 1: test ansible role with molecule
Molecule docs: https://molecule.readthedocs.io/en/stable/
Fixes Issue 1: no official ansible images
I could create ansible images for every distro I would like to test, as Jeff Geerling describes in his blog posts.
The clear downside of this approach: The images will need maintenance (eventually)
However, with molecule we can combine base images with a Dockerfile.j2 (Jinja2 template) to create images with the minimal requirements to run ansible. With this approach we can use the official Linux distro images from Docker Hub and do not need to maintain a docker repo for each distro and each version.
Here are the important bits in molecule.yml:
platforms:
  - name: instance-${TOX_ENVNAME}
    image: ${MOLECULE_DISTRO:-'centos:7'}
    command: /sbin/init
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
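For context, this platforms block sits in an otherwise ordinary molecule.yml scenario. A minimal sketch of the surrounding file (an assumed layout; the exact scenario in the linked repo may differ):
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  # ... the platforms block shown above ...
provisioner:
  name: ansible
verifier:
  name: ansible  # older molecule versions default to testinfra here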
The default Dockerfile.j2 from molecule is already good, but I have a few additions:
# Molecule managed
{% if item.registry is defined %}
FROM {{ item.registry.url }}/{{ item.image }}
{% else %}
FROM {{ item.image }}
{% endif %}
{% if item.env is defined %}
{% for var, value in item.env.items() %}
{% if value %}
ENV {{ var }} {{ value }}
{% endif %}
{% endfor %}
{% endif %}
RUN if [ $(command -v apt-get) ]; then apt-get update && apt-get install -y python sudo bash ca-certificates iproute2 && apt-get clean; \
elif [ $(command -v zypper) ]; then zypper refresh && zypper install -y python sudo bash python-xml iproute2 && zypper clean -a; \
elif [ $(command -v apk) ]; then apk update && apk add --no-cache python sudo bash ca-certificates; \
elif [ $(command -v xbps-install) ]; then xbps-install -Syu && xbps-install -y python sudo bash ca-certificates iproute2 && xbps-remove -O; \
elif [ $(command -v swupd) ]; then swupd bundle-add python3-basic sudo iproute2; \
elif [ $(command -v dnf) ] && cat /etc/os-release | grep -q '^NAME=Fedora'; then dnf makecache && dnf --assumeyes install python sudo python-devel python*-dnf bash iproute && dnf clean all; \
elif [ $(command -v dnf) ] && cat /etc/os-release | grep -q '^NAME="CentOS Linux"' ; then dnf makecache && dnf --assumeyes install python36 sudo platform-python-devel python*-dnf bash iproute && dnf clean all && ln -s /usr/bin/python3 /usr/bin/python; \
elif [ $(command -v yum) ]; then yum makecache fast && yum install -y python sudo yum-plugin-ovl bash iproute && sed -i 's/plugins=0/plugins=1/g' /etc/yum.conf && yum clean all; \
fi
# Centos:8 + ansible 2.7 failed with error: "The module failed to execute correctly, you probably need to set the interpreter"
# Solution: ln -s /usr/bin/python3 /usr/bin/python
By default, this will test the role with centos:7. However, we can set the environment variable MOLECULE_DISTRO to whichever image we would like to test and run it via
MOLECULE_DISTRO=ubuntu:18.04 molecule test
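To run a quick local sweep over several distros, a small shell loop is enough (a sketch; adjust the image list to what you actually support):
for distro in centos:7 ubuntu:18.04 debian:10; do
  MOLECULE_DISTRO=$distro molecule test
done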
Summary
We use official distro images from Docker Hub to test our ansible role via molecule.
The files used in this step
molecule.yml https://github.com/ckaserer/ansible-role-example/blob/master/molecule/default/molecule.yml
Dockerfile.j2 https://github.com/ckaserer/ansible-role-example/blob/master/molecule/default/Dockerfile.j2
Source
https://molecule.readthedocs.io/en/stable/examples.html#systemd-container
https://www.jeffgeerling.com/blog/2019/how-i-test-ansible-configuration-on-7-different-oses-docker
https://www.jeffgeerling.com/blog/2018/testing-your-ansible-roles-molecule
Step 2: Check compatibility for your role (python version x ansible version x linux distro)
Fixes Issue 2: how to best test a role for ansible version compatibility?
Let's use tox to create virtual environments to avoid side effects while testing various compatibility scenarios.
Here are the important bits in tox.ini:
[tox]
minversion = 3.7
envlist = py{3}-ansible{latest,29,28}-{ alpinelatest,alpine310,alpine39,alpine38, centoslatest,centos8,centos7, debianlatest,debian10,debian9,debian8, fedoralatest,fedora30,fedora29,fedora28, ubuntulatest,ubuntu2004,ubuntu1904,ubuntu1804,ubuntu1604 }
# only test currently supported ansible versions
# https://docs.ansible.com/ansible/latest/reference_appendices/release_and_maintenance.html#release-status
skipsdist = true

[base]
passenv = *
deps =
    -rrequirements.txt
    ansible25: ansible==2.5
    ...
    ansiblelatest: ansible
commands =
    molecule test
setenv =
    TOX_ENVNAME={envname}
    MOLECULE_EPHEMERAL_DIRECTORY=/tmp/{envname}

[testenv]
passenv =
    {[base]passenv}
deps =
    {[base]deps}
commands =
    {[base]commands}
setenv =
    ...
    centoslatest: MOLECULE_DISTRO="centos:latest"
    centos8: MOLECULE_DISTRO="centos:8"
    centos7: MOLECULE_DISTRO="centos:7"
    ...
    {[base]setenv}
The entirety of requirements.txt
docker
molecule
Simply execute
tox
and it will create a virtualenv for each compatibility combination defined in tox.ini via
envlist = py{3}-ansible{latest,29,28}-{ alpinelatest,alpine310,alpine39,alpine38, centoslatest,centos8,centos7, debianlatest,debian10,debian9,debian8, fedoralatest,fedora30,fedora29,fedora28, ubuntulatest,ubuntu2004,ubuntu1904,ubuntu1804,ubuntu1604 }
which translates in this particular case to: python3 x ansible version x linux distro
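To inspect or run individual combinations locally, tox's standard options work as usual, e.g.
tox -l                              # list all generated environments
tox -e py3-ansiblelatest-centos7    # run a single combination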
Great! We have created tests for compatibility checks with the added benefit of always testing with ansible latest to notice breaking changes early on.
The files used in this step
tox.ini https://github.com/ckaserer/ansible-role-example/blob/master/tox.ini
requirements.txt https://github.com/ckaserer/ansible-role-example/blob/master/requirements.txt
Source
https://tox.readthedocs.io/en/latest/
https://molecule.readthedocs.io/en/stable/testing.html#tox
Step 3: CI with travis
Running the checks locally is good; running them in a CI tool is great. So let's do that.
For this purpose the following bits in the .travis.yml are important:
---
version: ~> 1.0
os: linux
language: python
python:
  - "3.8"
  - "3.7"
  - "3.6"
  - "3.5"
services: docker
cache:
  pip: true
  directories:
    - .tox
install:
  - pip install tox-travis
env:
  jobs:
    # ansible:latest - check for breaking changes
    ...
    - TOX_DISTRO="centoslatest" TOX_ANSIBLE="latest"
    - TOX_DISTRO="centos8" TOX_ANSIBLE="latest"
    - TOX_DISTRO="centos7" TOX_ANSIBLE="latest"
    ...
    # ansible: check version compatibility
    # only test currently supported ansible versions
    # https://docs.ansible.com/ansible/latest/reference_appendices/release_and_maintenance.html#release-status
    - TOX_DISTRO="centoslatest" TOX_ANSIBLE="{29,28}"
    - TOX_DISTRO="centos8" TOX_ANSIBLE="{29,28}"
    - TOX_DISTRO="centos7" TOX_ANSIBLE="{29,28}"
    ...
script:
  - tox -e $(echo py${TRAVIS_PYTHON_VERSION} | tr -d .)-ansible${TOX_ANSIBLE}-${TOX_DISTRO}
  # remove logs/pycache before caching .tox folder
  - |
    rm -r .tox/py*/log/*
    find . -type f -name '*.py[co]' -delete -o -type d -name __pycache__ -delete
First we have specified language: python to run builds with multiple versions of python as defined in the python: list.
We need docker, so we add it via services: docker.
The tests will take quite some time, so let's cache pip and the virtualenvs created by tox with
cache:
  pip: true
  directories:
    - .tox
We need tox...
install:
  - pip install tox-travis
And finally, we define all our test cases
env:
  jobs:
    # ansible:latest - check for breaking changes
    ...
    - TOX_DISTRO="centoslatest" TOX_ANSIBLE="latest"
    ...
Note that I have separate jobs for latest and for the distinct versions. That is on purpose: I would like to easily see whether a failure is a version-compatibility problem or an upcoming change in ansible's latest release.
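To make the script line concrete: with TRAVIS_PYTHON_VERSION=3.8, TOX_ANSIBLE=latest and TOX_DISTRO=centoslatest it expands to
tox -e py38-ansiblelatest-centoslatest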
The files used in this step
https://github.com/ckaserer/ansible-role-example/blob/master/.travis.yml
Source
https://docs.travis-ci.com/user/caching/
Bonus: run tox in parallel
You can run the tests in parallel (e.g. 3 tests simultaneously) by executing
tox -p 3
However, this will not show the output from molecule. You can enable that with
tox -p 3 -o true
The obvious downside to this approach is the pain of figuring out which line belongs to which process in the parallel execution.
Source
https://tox.readthedocs.io/en/latest/example/basic.html#parallel-mode

No real answer here, but some ideas:
Ansible Silo might have fit, but it has had no commits for a year.
And it's not exactly what you're looking for, but Ansible Runner is meant to be a fit for the "run ansible" use case. And it's part of Ansible Tower / AWX, so it should last.
Runner is intended to be most useful as part of automation and tooling that needs to invoke Ansible and consume its results.
They do mention executing from a container here
The design of Ansible Runner makes it especially suitable for controlling the execution of Ansible from within a container for single-purpose automation workflows
But the issue for you is that the official ansible/ansible-runner container is tagged after the ansible-runner version, and ansible itself is installed through pip install ansible at container build time.
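If you end up maintaining your own images after all, pinning an ansible version on top of a plain base image is only a couple of lines; a minimal sketch (base image and version are placeholders, not an official image):
FROM python:3.8-slim
# pin the ansible release you want to test against
RUN pip install --no-cache-dir "ansible==2.9.*"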

Related

Using Docker with Snakemake (as motivated by the Snakemake documentation) does not provide a true docker container?

In order to improve the distribution of my snakemake workflow, I want to Docker-ize it. The way I usually deploy my software stack is through conda environments (saved as .yaml files within an envs/ directory of my project folder). They are then called by each rule through the conda: directive and the --use-conda flag (at execution), for example:
rule exampleRule:
    input: input.file
    output: output.file
    conda:
        os.path.join(workflow.basedir, "envs/<nameOfEnv>.yaml")
    shell: "some shell command"
Following the snakemake documentation, I am aware that I can Dockerize my full workflow using the command: snakemake --containerize.
The resulting Dockerfile is something like this (this is a contrived example, in reality I have seven environments that are all built inside this container):
FROM condaforge/mambaforge:latest
LABEL io.github.snakemake.containerized="true"
LABEL io.github.snakemake.conda_env_hash="dc12f3c8b1fb3caed02ab3305e24859bd63f4f1ea0c1ed29d71c857e7d0baaf5"
# Step 1: Retrieve conda environments
# Conda environment:
# source: envs/nameOfEnvironment.yaml
# prefix: /conda-envs/4e57ed29df8b6f849000ab15b5c719f2
# channels:
# - conda-forge
# - bioconda
# - anaconda
# - defaults
# dependencies:
# - wget
# - packageX == 5.3.0
# - packageY == 2.5.5
# - packageZ >= 1.1.1
RUN mkdir -p /conda-envs/4e57ed29df8b6f849000ab15b5c719f2
COPY envs/nameOfEnvironment.yaml /conda-envs/4e57ed29df8b6f849000ab15b5c719f2/environment.yaml
# Step 2: Generate conda environments
RUN mamba env create --prefix /conda-envs/4e57ed29df8b6f849000ab15b5c719f2 --file /conda-envs/4e57ed29df8b6f849000ab15b5c719f2/environment.yaml && \
mamba clean --all -y
I have already succeeded in following this method and built an image that I host on Docker Hub, which is then referenced in the Snakefile, e.g. container: "docker://myDockerHub/myWorkflow_dockerimage:latest"
Then, when executing with the --use-singularity flag, snakemake will pull this image and build the environments.
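For reference, the invocation described above looks roughly like this (a sketch; the core count and any additional flags depend on your setup):
snakemake --use-conda --use-singularity --cores 4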
The problem is that the rest of the execution seems to follow a similar procedure as if I had not used singularity/docker whatsoever.
Meaning that in order to run the workflow this way, I still need some base packages installed, most notably Snakemake itself. Doesn't this somewhat defeat the purpose of containerizing the workflow in the first place, or do I misunderstand something fundamental?
The only real solution I can think of is to create an image/container with Snakemake installed. Then I would run this image using singularity, but from then on I would be running the workflow from within the snakemake image as if I had not containerized the environments themselves. When executing snakemake from within this image, I cannot use the --use-singularity flag, as singularity itself is not available inside the image.
It seems completely counterintuitive to run snakemake without the --use-singularity flag, when my intention is to utilize singularity.

gRPC service definitions: containerize .proto compilation?

Let's say we have a services.proto with our gRPC service definitions, for example:
service Foo {
  rpc Bar (BarRequest) returns (BarReply) {}
}

message BarRequest {
  string test = 1;
}

message BarReply {
  string test = 1;
}
We could compile this locally to Go by running something like
$ protoc --go_out=. --go_opt=paths=source_relative \
--go-grpc_out=. --go-grpc_opt=paths=source_relative \
services.proto
My concern though is that running this last step might produce inconsistent output depending on the installed version of the protobuf compiler and the Go plugins for gRPC. For example, two developers working on the same project might have slightly different versions installed locally.
It would seem reasonable to me to address this by containerizing the protoc step. For example, with a Dockerfile like this...
FROM golang:1.18
WORKDIR /src
RUN apt-get update && apt-get install -y protobuf-compiler
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.26
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.1
CMD protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative services.proto
... we can run the protoc step inside a container:
docker run --rm -v $(pwd):/src $(docker build -q .)
After wrapping the previous command in a shell script, developers can run it on their local machine, giving them deterministic, reproducible output. It can also run in a CI/CD pipeline.
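Such a wrapper can be as small as this (a sketch; the script name and layout are arbitrary):
#!/usr/bin/env sh
# build the helper image and run protoc against the current directory
set -e
docker run --rm -v "$(pwd)":/src "$(docker build -q .)"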
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
NB, I was surprised to find that the official grpc/go image does not come with protoc preinstalled. Am I off the beaten path here?
My question is, is this a sound approach and/or is there an easier way to achieve the same outcome?
It is definitely a good approach. I do the same, not only to have consistent output across the team, but also to ensure we can produce the same output on different OSs.
There is an easier way to do that, though.
Look at this repo: https://github.com/jaegertracing/docker-protobuf
The image is on Docker Hub, but you can create your own image if you prefer.
I use this command to generate Go:
docker run --rm -u $(id -u) \
-v${PWD}/protos/:/source \
-v${PWD}/v1:/output \
-w/source jaegertracing/protobuf:0.3.1 \
--proto_path=/source \
--go_out=paths=source_relative,plugins=grpc:/output \
-I/usr/include/google/protobuf \
/source/*

Check architecture in dockerfile to get amd/arm

We're working with Windows and Mac M1 machines to develop locally using Docker and need to fetch and install a .deb package within our docker environment.
The package needs amd64/arm64 depending on the architecture being used.
Is there a way to determine this in the docker file i.e.
if xyz === 'arm64'
RUN wget http://...../arm64.deb
else
RUN wget http://...../amd64.deb
First you should check whether there is another (easier) way to do it with the package manager.
You can use the arch command to get the architecture (equivalent to uname -m). The problem is that it does not return the values you expect (aarch64 instead of arm64 and x86_64 instead of amd64), so you have to convert it; I have done that with sed.
RUN arch=$(arch | sed s/aarch64/arm64/ | sed s/x86_64/amd64/) && \
wget "http://...../${arch}.deb"
Note: You should add the following code before running this command to prevent unexpected behavior in case of failure in the piping chain. See Hadolint DL4006 for more information.
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
Later versions of Docker expose info about the current architecture using predefined build arguments (automatic platform ARGs):
BUILDPLATFORM — matches the current machine. (e.g. linux/amd64)
BUILDOS — os component of BUILDPLATFORM, e.g. linux
BUILDARCH — e.g. amd64, arm64, riscv64
BUILDVARIANT — used to set ARM variant, e.g. v7
TARGETPLATFORM — The value set with --platform flag on build
TARGETOS - OS component from --platform, e.g. linux
TARGETARCH - Architecture from --platform, e.g. arm64
TARGETVARIANT
Note that you may need to export DOCKER_BUILDKIT=1 on headless machines for these to work, and they are enabled by default on Docker Desktop.
Use in conditional scripts like so:
RUN wget "http://...../${TARGETARCH}.deb"
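Note that inside a build stage these are ordinary build arguments, so they have to be redeclared with ARG before they can be used in a RUN instruction; a minimal sketch (the download URL is a placeholder):
FROM debian:bullseye-slim
# redeclare the automatic platform ARG so it is visible in this stage
ARG TARGETARCH
RUN apt-get update && apt-get install -y wget \
 && wget "http://example.com/package_${TARGETARCH}.deb"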
You can even branch Dockerfiles by using ONBUILD like this:
FROM debian as build_amd
# Intel stuff goes here
ONBUILD RUN apt-get install -y ./intel_software.deb
FROM debian as build_arm
# ARM stuff goes here
ONBUILD RUN apt-get install -y ./arm_software.deb
# choose the build to use for the rest of the image
FROM build_${TARGETARCH}
# rest of the dockerfile follows
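The stage is then selected by the platform you build for; for example (the image tag is a placeholder):
docker buildx build --platform linux/arm64 -t myimage:arm64 .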

Docker run vs build - different behaviour when building gstreamer

I'm trying to build a docker image that uses nvidia hardware decoding in gstreamer and have encountered a strange problem with making the image.
The build process does not find the nvidia cuda related stuff while running docker build (or nvidia-docker build), but when I spin up the failed image as a container and do those very same steps from within the container everything works. I even saved the container as image which gave me a persistent image that works as intended.
Has anyone experienced similar problem and can shed some light on it?
Dockerfile:
FROM nvcr.io/nvidia/deepstream:3.0-18.11 AS base
ENV DEBIAN_FRONTEND noninteractive
#install some dependencies. NOTE - not removing apt cache for the MWE
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libdc1394-22 \
tmux \
vim \
libjpeg-dev \
libpng-dev \
libpng12-dev \
cuda-toolkit-10-0 \
python3-setuptools \
python3-pip ninja-build pkg-config gobject-introspection gnome-devel bison flex libgirepository1.0-dev liborc-0.4-dev
RUN pip3 install meson && ldconfig
FROM base
#pull and make gstreamer:
RUN cd /tmp && mkdir gstreamer
RUN git clone https://github.com/GStreamer/gst-build.git /tmp/gstreamer \
&& cd /tmp/gstreamer \
&& git checkout tags/1.16.0 \
&& ./setup.py -Dgtk_doc=disabled -Dgst-plugins-bad:nvdec=enabled -Dgst-plugins-bad:nvenc=enabled -Dgst-plugins-bad:iqa=disabled -Dgst-plugins-bad:bluez=disabled --reconfigure \
&& ninja -C build \
&& ninja install -C build
Testing:
build and run the container. Inside the container:
$ gst-inspect-1.0 nvdec
No such element or plugin 'nvdec'
$ cd /tmp/gstreamer
$ ./setup.py -Dgtk_doc=disabled -Dgst-plugins-bad:nvdec=enabled -Dgst-plugins-bad:nvenc=enabled -Dgst-plugins-bad:iqa=disabled -Dgst-plugins-bad:bluez=disabled --reconfigure
$ ninja -C build
$ ninja install -C build
$ gst-inspect-1.0 nvdec
Factory Details:
Rank primary (256)
[... all plugin parameters show up]
GObject
+----GInitiallyUnowned
+----GstObject
+----GstElement
+----GstVideoDecoder
+----GstNvDec
EDIT1
The image builds with no errors; only when I try to call gstreamer do I see that it was built without acceleration. I noticed that in the build process the major difference is
meson.build:109:2: Exception: Problem encountered: The nvdec plugin was enabled explicitly, but required CUDA dependencies were not found.
which does not happen when building from within the container.
The lack of an error is most likely related to the ninja+meson build system, which looks for compatible packages, reports the exception, but doesn't fail the build and continues as if nothing went wrong.
EDIT2
Answering comment:
To build it and get the error, just build the attached docker image:
sudo docker build -t gst16:latest . > build.log
This will dump all the output into the build.log file.
I don't have a docker registry that I could use for this, and the docker image gets quite big by docker standards (~8 GB), but producing a working image is fairly simple:
sudo docker run --runtime="nvidia" -ti gst16:latest /bin/bash
or
sudo nvidia-docker run -ti gst16:latest /bin/bash
which seems to work the same for me. Notice no --rm flag! From within the container:
#check if nvidia decoder plugin is there:
gst-inspect-1.0 nvdec
#fail!
#now build it from within:
cd /opt/gstreamer
./setup.py -Dgtk_doc=disabled -Dgst-plugins-bad:nvdec=enabled -Dgst-plugins-bad:nvenc=enabled -Dgst-plugins-bad:iqa=disabled -Dgst-plugins-bad:bluez=disabled --reconfigure
ninja -C build
ninja install -C build
gst-inspect-1.0 nvdec
#success reported
Now to get the image, exit the container (ctrl+d) and in the host shell:
sudo docker container ls -a to view all containers including stopped ones
from gst16:latest get the CONTAINER_ID and copy it
sudo docker commit <CONTAINER_ID> gst16:manual and after a few seconds you should have the container saved as an image. Verify with sudo docker images
run the new image with sudo docker run --runtime="nvidia" --rm -ti gst16:manual /bin/bash
from within the container try again the gst-inspect-1.0 nvdec to verify it's working
EDIT3
$ nvidia-docker --version
Docker version 18.09.0, build 4d60db4
I think I found the solution/reason.
Writing it here in case someone finds themselves in a similar situation, plus I hate finding old threads with a similar problem and no answer, or "nevermind, I solved it" as the only follow-up.
Docker build does not have any ties to the nvidia runtime, and gstreamer requires access to the full nvidia toolchain in order to build the plugins that need it. This is to be resolved with gstreamer 1.18, but until then there is no way to build gstreamer with nvidia codecs in docker build.
The workaround:
Build image with all dependencies.
Run a container of said image using runtime="nvidia" but don't use --rm flag
In the container, build gstreamer and install it as normally.
Verify with gst-inspect-1.0
Commit the container as new image: docker commit <container_name> <temporary_image_name>
Tag the temporary image properly.
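Put together, the workaround boils down to something like the following (a sketch; image names, the tag and the container id are placeholders):
sudo docker build -t gst16:latest .
sudo docker run --runtime=nvidia -ti gst16:latest /bin/bash
# inside the container: run the setup.py/ninja steps shown above, verify with gst-inspect-1.0 nvdec, then exit
sudo docker commit <CONTAINER_ID> gst16:manual
sudo docker tag gst16:manual gst16:1.16.0-nvdec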

docker build using previous build caches from registry

I'm configuring a bamboo build plan to build docker images, using AWS ECR as the registry. The build plan is something like this:
pull the latest tag
docker pull xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
build image with latest tag
docker build -t myimage:latest .
tag the image (necessary for ECR)
docker tag -f myimage:latest xxx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Push the image to the registry
docker push xx.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest
Because build tasks run on different, fresh build agents/servers every time, there is no local cache.
When I don't change anything in the Dockerfile and execute the plan again (on another server), I would expect docker to use the local cache (coming from docker pull) and not execute each line again. But it tries to build the image every time. I was also expecting that when I change something at the bottom of the file, it would use the cache and execute only the last lines, but I'm not sure about this.
Am I getting something wrong, or are there any opinions on this approach?
Are you considering using a squid proxy?
Edit: in case you don't want to go to the official website, here is a quick setup of squid proxy (Debian-based):
apt-get install squid-deb-proxy
and then change the squid configuration to create a larger cache space by opening
/etc/squid/squid.conf
and replacing #cache_dir ufs /var/spool/squid with
cache_dir ufs /var/spool/squid 10000 16 256
and there you go: 10,000 MB worth of cache space.
Then point the proxy address in the Dockerfile. Here is an example Dockerfile with squid proxy for both yum- and apt-get-based distros.
apt-get based distro:
FROM debian
RUN apt-get update -y && apt-get install -y net-tools
RUN echo "Acquire::http::Proxy \"http://$( \
    route -n | awk '/^0.0.0.0/ {print $2}' \
    ):8000\";" > /etc/apt/apt.conf.d/30proxy
RUN echo "Acquire::http::Proxy::ppa.launchpad.net DIRECT;" >> \
    /etc/apt/apt.conf.d/30proxy
CMD ["/bin/bash"]
yum based distro:
FROM centos:centos7
RUN yum update -y && yum install -y net-tools
RUN echo "proxy=http://$(route -n | \
    awk '/^0.0.0.0/ {print $2}'):3128" >> /etc/yum.conf
RUN sed -i 's/^mirrorlist/#mirrorlist/' \
    /etc/yum.repos.d/CentOS-Base.repo
RUN sed -i 's/^#baseurl/baseurl/' \
    /etc/yum.repos.d/CentOS-Base.repo
RUN rm -f /etc/yum/pluginconf.d/fastestmirror.conf
RUN yum update -y
CMD ["/bin/bash"]
Let's say you install the squid proxy alongside your AWS registry: only the first build would fetch the data from the internet; the rest of the builds (on other servers) should be served from the squid proxy cache.
This technique is based on the book Docker in Practice, technique 57, titled "Set up a package cache for faster builds".
I don't think there is a cache feature in docker without any third-party software. Maybe there is and I just don't know it; I'm not sure, so just correct me if I'm wrong.
