Creating Docker image hierarchy with private registry - docker

TL;DR Must you put your private Docker URI in your Dockerfile FROM command if the parent image is in a private registry?
I thought this should be an easily answerable question, but cannot find a good set of Google keywords...
Detail:
I have three repos, all built with separate CI calls; the order of these CI executions is correct for this DAG (i.e. parent image will be available when a child needs it):
Repo 1 holds a Dockerfile that constructs a base image with dependencies. It is slow-moving.
Repo 2 & 3 hold applications that are built in a Dockerfile that pulls FROM the image built in Repo 1. These change frequently.
As I understand it, if you don't specify a repo URI in your FROM command, docker assumes you are pulling from DockerHub. Is this correct?
If these images are stored in a private registry, is it true that the private registry must be explicitly included in the child Dockerfiles? Is there another way?

Docker first looks for the image name in its local image store; if it does not find the image there, it pulls it from Docker Hub.
At first thought, it feels intuitive that we should be able to configure a private repository as the default, and then use it just as we would use docker hub.
However, this has been the subject of lengthy discussion; you can follow the discussion here.
Unfortunately, at the time of writing this, to build from a private repo, you will need to specify the complete URI in your Dockerfile:
FROM <aws_account_id>.dkr.ecr.<aws region>.amazonaws.com/<private image name>:latest
You will need to authenticate Docker with your private ECR registry so that it can pull the image:
aws ecr get-login-password --region region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com
and then run
docker build .
Alternatively, you can use build arguments to construct your ECR URI. This keeps things clean and parameterised in your Dockerfile while being explicit that you are using a private repo.
E.g., in your Dockerfile:
ARG PRIVATE_REPO
FROM ${PRIVATE_REPO}your_image_name
And build the docker image with:
docker build . --build-arg PRIVATE_REPO=aws_account_id.dkr.ecr.region.amazonaws.com/
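The login and build steps above can be combined in a small wrapper. This is a minimal sketch, assuming the AWS CLI v2 and Docker are installed; the function names ecr_registry and build_from_ecr are hypothetical, and the account ID and region are whatever you pass in.

```shell
#!/bin/sh
# ecr_registry ACCOUNT_ID REGION: print the ECR registry hostname.
ecr_registry() {
    printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# build_from_ecr ACCOUNT_ID REGION [extra docker build args...]
# Logs in to ECR, then builds with PRIVATE_REPO pointing at that registry.
# (Hypothetical helper; assumes the Dockerfile uses FROM ${PRIVATE_REPO}your_image_name.)
build_from_ecr() {
    account_id=$1
    region=$2
    shift 2
    registry=$(ecr_registry "$account_id" "$region")
    aws ecr get-login-password --region "$region" |
        docker login --username AWS --password-stdin "$registry" || return 1
    # The trailing slash matters: the prefix is concatenated directly with the image name.
    docker build --build-arg PRIVATE_REPO="$registry/" "$@" .
}
```

Usage would look like: build_from_ecr 123456789012 us-east-1 -t my-app (the account ID and tag here are made-up values).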

Your basic assumptions are correct: if your base image is in a private registry, you generally must include the registry name in downstream Dockerfiles' FROM lines, docker run commands, Compose and Kubernetes image: lines, and so on. If the registry part of the image name is absent, it defaults to docker.io.
# Expands to docker.io/library/alpine:latest
FROM alpine
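To make the expansion rule concrete, here is a rough shell sketch of how a short image name is filled out. It only approximates Docker's behaviour (the first path component is treated as a registry host when it contains a dot or colon, or is localhost); normalize_image is a hypothetical helper, not a Docker command.

```shell
#!/bin/sh
# normalize_image REF: print an approximation of the fully qualified image reference.
normalize_image() {
    ref=$1
    first=${ref%%/*}
    registry=docker.io
    path=$ref
    case "$ref" in
        */*)
            # A first component with "." or ":" (or "localhost") is a registry host.
            case "$first" in
                *.*|*:*|localhost) registry=$first; path=${ref#*/} ;;
            esac ;;
    esac
    # Official Docker Hub images live under the "library" namespace.
    if [ "$registry" = docker.io ]; then
        case "$path" in
            */*) ;;
            *) path=library/$path ;;
        esac
    fi
    # Default to :latest when no tag or digest is given.
    last=${path##*/}
    case "$last" in
        *:*|*@*) ;;
        *) path=$path:latest ;;
    esac
    printf '%s/%s\n' "$registry" "$path"
}
```

For example, normalize_image alpine prints docker.io/library/alpine:latest, while normalize_image registry.example.com/img/step1 keeps the private host and only gains the :latest tag.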
This only actually matters if the image isn't already present on the local machine, in which case Docker will automatically pull it. If it's already present, Docker doesn't worry about where it came from and just uses it as-is. If you have a manual multi-stage build, but don't actually push any of the intermediate stages anywhere, it doesn't actually matter if you tag it for any particular repository.
# "step 2" Dockerfile
# Build with: `docker build -t step2 -f step2.Dockerfile .`
# Starts from image in internal repository
FROM registry.example.com/img/step1
# Application Dockerfile
# Depends on "step 2" image being built first
# Since we expect step2:latest image to be present locally, won't pull
FROM step2
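Because the application build relies on step2 already being present locally, the two builds have to run in order. A minimal sketch of that sequencing follows; the app tag and the Dockerfile filename for the application are assumptions, and build_chain is a hypothetical name.

```shell
#!/bin/sh
# build_chain: build the intermediate image, then the application image.
# Stops at the first failure so the app is never built against a stale step2.
build_chain() {
    docker build -t step2 -f step2.Dockerfile . || return 1
    # "app" and "Dockerfile" are placeholder names for the application build.
    docker build -t app -f Dockerfile .
}
```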

Related

Use cache docker image for gitlab-ci

I was wondering whether it is possible to use cached Docker images in the GitLab registry for GitLab CI.
For example, I want to use the node:16.3.0-alpine Docker image. Can I cache it in my GitLab registry and pull it from there to speed up my GitLab CI, instead of pulling it from Docker Hub?
Yes, GitLab's dependency proxy feature allows you to configure GitLab as a "pull-through cache". This also helps you work around the rate limits of upstream sources like Docker Hub.
It should be faster in most cases to use the dependency proxy, but not necessarily so. It's possible that Docker Hub is more performant than a small self-hosted server, for example. GitLab runners are also remote with respect to the registry and not necessarily any "closer" to the GitLab registry than to any other registry over the internet. So, keep that in mind.
As a side note, the absolute fastest way to retrieve cached images is to self-host your GitLab runners and hold images directly on the host. That way, when jobs start, if the image already exists on the host, the job will start immediately because it does not need to pull the image (depending on your pull configuration). (that is, assuming you're using images in the image: declaration for your job)
I'm using a corporate GitLab instance where for some reason the Dependency Proxy feature has been disabled. The other option you have is to create a new Docker image on your local machine, then push it into the Container Registry of your personal GitLab project.
# First create a one-line Dockerfile containing "FROM node:16.3.0-alpine"
docker pull node:16.3.0-alpine
docker build . -t registry.example.com/group/project/image
docker login registry.example.com -u <username> -p <token>
docker push registry.example.com/group/project/image
where the image tag should be constructed based on the example given on your project's private Container Registry page.
Now in your CI job, you just change image: node:16.3.0-alpine to image: registry.example.com/group/project/image. You may have to run the docker login command (using a deploy token for credentials, see Settings -> Repository) in the before_script section. I think newer versions of GitLab may have the runner authenticate to the private Container Registry using system credentials, but that could vary depending on how it's configured.
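As a variation on the steps above, the image can also be copied by re-tagging it rather than building from a one-line Dockerfile, which should give an equivalent image. The sketch below assumes Docker is installed; mirror_image and ci_registry_login are hypothetical helper names, and the login helper assumes GitLab's predefined CI_REGISTRY, CI_REGISTRY_USER and CI_REGISTRY_PASSWORD variables are available in the job.

```shell
#!/bin/sh
# mirror_image SOURCE TARGET: copy an image into another registry by re-tagging it.
mirror_image() {
    source_image=$1
    target_image=$2
    docker pull "$source_image" &&
        docker tag "$source_image" "$target_image" &&
        docker push "$target_image"
}

# ci_registry_login: log in from a CI job without putting the token on the command line.
# Assumes the GitLab-provided CI_REGISTRY* variables are set.
ci_registry_login() {
    printf '%s' "$CI_REGISTRY_PASSWORD" |
        docker login "$CI_REGISTRY" -u "$CI_REGISTRY_USER" --password-stdin
}
```

For the example in this answer, that would be: mirror_image node:16.3.0-alpine registry.example.com/group/project/image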

Modifying Docker Registry based on environment

My local development environment is behind a corporate proxy, with our own Docker registry, etc. However, we deploy on public infrastructure, meaning we can't access the corporate registries, and so have to pull from a public one (eg DockerHub).
Is there any way (eg via environment variables) for me to configure Docker to pull from a private registry when developing locally, and from a public registry when it goes through our CI/CD pipeline?
For example, let's say we're deploying a Node.JS application - locally, I would want the FROM node:16 line to get interpreted as FROM corporate.proxy/node:16.
There are a couple of methods that would probably work: having two separate Dockerfiles, e.g. Dockerfile.dev and Dockerfile.prod, or wrapping the build in some sort of script that takes care of making the change. I'm looking for a way to do it via Docker's configuration, if it's possible at all.
You can use the ARG instruction to parameterise the FROM line in the Dockerfile.
ARG IMG
FROM ${IMG}
Then you can build the image like this:
docker build --build-arg IMG=node:16 .
or
docker build --build-arg IMG=corporate.proxy/node:16 .
From the Dockerfile reference:
ARG is the only instruction that may precede FROM in the Dockerfile
https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
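To pick the value of IMG automatically, the build can be wrapped in a small script. This is only a sketch: it assumes your pipeline sets the CI environment variable (many CI systems, including GitLab CI and GitHub Actions, do), and base_image is a hypothetical helper name.

```shell
#!/bin/sh
# base_image: print the image reference to pass as --build-arg IMG=...
# Uses the public image when CI is set, and the corporate registry otherwise.
base_image() {
    if [ -n "${CI:-}" ]; then
        echo "node:16"
    else
        echo "corporate.proxy/node:16"
    fi
}
```

The build command then becomes: docker build --build-arg IMG="$(base_image)" .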

How to indicate a private registry not using the Dockerfile?

I have a Git repo with a simple Dockerfile. First row goes like this:
FROM python:3.7
My company has an internal registry with the base images. Because of this, the DevOps guys want me to change the Dockerfile to:
FROM registry.company.com:5000/python:3.7
I don't want this infrastructure detail baked in my code. URLs may change, I may want to build this image in another environment, etc. If possible, I would rather indicate the server in the pipeline, but the documentation regarding docker build has no parameter for this.
Is there a way to avoid editing the Dockerfile in this situation?
You would use a build arg for this:
ARG registry=docker.io/library
FROM ${registry}/python:3.7
Then for the build process:
docker build --build-arg registry=registry.company.com:5000 ...
Use docker.io for the registry name for the default Docker Hub, and library is the repository for official docker images, both of which you normally don't see when using the short format. Note that I usually include the library part in the local mirror so that official docker images and other repos that are mirrored can all use the same registry variable:
ARG registry=docker.io
FROM ${registry}/library/python:3.7
That means your local registry would need to have registry.company.com:5000/library/python:3.7.
To force users to specify the registry as part of the build, don't provide a default value for the arg (or you could default the value of registry to something internal if that's preferred):
ARG registry
FROM ${registry}/python:3.7
You can work around the situation by manually pulling and re-tagging the image. docker build (and docker run) won't try to pull an image that already appears to be present locally, but that also means there's no verification that it actually matches what Docker Hub has. That means you can pull the image from your mirror, then docker tag it to look like a Docker Hub image:
docker pull registry.company.com:5000/python:3.7
docker tag registry.company.com:5000/python:3.7 python:3.7

Docker pull from private registry first, then Docker Hub if the image is not found

I am trying to find a way to make docker pull fetch my image from my private registry first, and then from Docker Hub if it is not found in my private registry.
Currently I can pull like this if I want to go to my private registry: docker pull #hostname_private_registery/#image_name
I don't want to use #hostname_private_registery in the command, because that would cause a lot of trouble for the developers.
As of now, the FROM instruction does not include a fallback-on-failure option. You could, however, check the availability of your private registry beforehand in some kind of script, then use string replacement on your Dockerfile ARG values to choose the registry that is currently active.
You can use the following shell script to achieve this.
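# Placeholders: set these to your private registry host, Docker Hub repository, and image name.
# (A literal "#" would start a shell comment, so the placeholders are variables here.)
private_registry="hostname_private_registery"
image_repo="image_repo"
image_name="image_name"
if docker pull "${private_registry}/${image_name}"; then
    echo "Image pulled from private registry"
else
    docker pull "${image_repo}/${image_name}" && echo "Image pulled from Docker Hub"
fi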
You can replace the echo with whatever you need to do after the pull.
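The same fallback can be packaged as a function that prints whichever reference was actually pulled, so that the result can be passed on to a build argument. This is a hedged sketch, with pull_with_fallback as a hypothetical name; it treats any failed pull from the private registry (missing image, registry unreachable, not logged in) as a reason to fall back.

```shell
#!/bin/sh
# pull_with_fallback REGISTRY IMAGE
# Tries REGISTRY/IMAGE first, then plain IMAGE (Docker Hub).
# Prints the reference that was pulled; returns non-zero if both pulls fail.
pull_with_fallback() {
    registry=$1
    image=$2
    # docker's own output goes to stderr so stdout carries only the reference.
    if docker pull "$registry/$image" >&2; then
        printf '%s\n' "$registry/$image"
    elif docker pull "$image" >&2; then
        printf '%s\n' "$image"
    else
        return 1
    fi
}
```

With a Dockerfile that starts with ARG IMG and FROM ${IMG}, this could be used as: docker build --build-arg IMG="$(pull_with_fallback registry.example.com node:16)" . (registry.example.com is a placeholder host).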

Pull docker images from a private repository during docker build?

Is there any way of pulling images from a private registry during a docker build instead of docker hub?
I deployed a private registry and I would like to be able to avoid naming its specific ip:port in the Dockerfile's FROM instruction. I was expecting a docker build option or a docker environment variable to change the default registry.
The image name should include the FQDN of the registry host.
So if you want to FROM <some private image>, you must specify it as FROM registry_host:5000/foo/bar
In the future this won't be a requirement, but unfortunately for now it is.
I was facing the same issue in 2019. I solved this using arguments (ARG).
https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
Arguments allow you to set optional parameters (with defaults) that can be used in your FROM line.
Dockerfile-project-dev
ARG REPO_LOCATION=privaterepo.company.net/
ARG BASE_VERSION=latest
FROM ${REPO_LOCATION}project/base:${BASE_VERSION}
...
For my use-case I normally want to pull from the private repo, but if I'm working on the Dockerfiles I may want to be able to build from an image on my own machine, without having to modify the FROM line in my Dockerfile. To tell Docker to search my local machine for the image at build time I would do this:
docker build -t project/dev:latest -f ./Dockerfile-project-dev --build-arg REPO_LOCATION='' .
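The reason the default for REPO_LOCATION ends in a slash is that the prefix is concatenated directly with the image path, so an empty value leaves a valid, registry-less name. The sketch below just mimics that substitution in shell to show the two outcomes; from_ref is a hypothetical helper and does not call Docker.

```shell
#!/bin/sh
# from_ref REPO_LOCATION [BASE_VERSION]: print what
#   FROM ${REPO_LOCATION}project/base:${BASE_VERSION}
# would expand to for the given build-arg values (BASE_VERSION defaults to latest).
from_ref() {
    printf '%sproject/base:%s\n' "$1" "${2:-latest}"
}

from_ref 'privaterepo.company.net/'   # prints privaterepo.company.net/project/base:latest
from_ref ''                           # prints project/base:latest
```

If the slash were instead written in the FROM line itself, an empty prefix would expand to /project/base:latest, which is not a valid image reference.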
The docker folks generally want to ensure that if you run docker pull foo/bar you'll get the same thing (i.e., the foo/bar image from Docker Hub) regardless of your local environment.
This means that there are no options available to have Docker use anything else without an explicit hostname/port.

Resources