What's the purpose of "docker build --pull"?

What's the purpose of "docker build --pull"? - docker

When building a docker image you normally use docker build ..
But I've found that you can specify --pull, so the whole command would look like docker build --pull .
I'm not sure about the purpose of --pull. Docker's official documentation says "Always attempt to pull a newer version of the image", and I'm not sure what this means in this context.
You use docker build to build a new image, and eventually publish it somewhere to a container registry. Why would you want to pull something that doesn't exist yet?

it will pull the latest version of any base image(s) instead of reusing whatever you already have tagged locally
take for instance an image based on a moving tag (such as ubuntu:bionic). upstream makes changes and rebuilds this periodically but you might have a months old image locally. docker will happily build against the old base. --pull will pull as a side effect so you build against the latest base image
it's ~usually a best practice to use it to get upstream security fixes as soon as possible (instead of using stale, potentially vulnerable images). though you have to trade off breaking changes (and if you use immutable tags then it doesn't make a difference)

Docker allows passing the --pull flag to docker build, e.g. docker build . --pull -t myimage. This is the recommended way to ensure that the build always uses the latest container image despite the version available locally. However one additional point worth mentioning:
To ensure that your build is completely rebuilt, including checking the base image for updates, use the following options when building:
--no-cache - This will force rebuilding of layers already available.
The full command will therefore look like this:
docker build . --pull --no-cache --tag myimage:version
The same options are available for docker-compose:
docker-compose build --no-cache --pull

Simple answer. docker build is used to build from a local dockerfile. docker pull is used to pull from docker hub. If you use docker build without a docker file it throws an error.
When you specify --pull or :latest docker will try to download the newest version (if any)
Basically, if you add --pull, it will try to pull the newest version each time it is run.

Related

docker dont generate new image from docker build

I'm in low cost project that we send to container registry (DigitalOcean) only latest image.
But all time, after running:
docker build .
Is generating the same digest, every time.
This is my script for build:
docker build .
docker tag {image}:latest registry.digitalocean.com/{company}/{image}:latest;
docker push registry.digitalocean.com/{company}/{image}
I tried:
BUILD_VERSION=`date '+%s'`;
docker build -t {image}:"$NOW" -t {image}:latest .
docker tag {image}:latest registry.digitalocean.com/{company}/{image}:latest;
docker push registry.digitalocean.com/{company}/{image}
but not worked.

Editing my answer, what David said is correct - the push with out the tag should pick up latest tag.
If you provide what you have in your local repository and the output of the above commands, it would shed more light to your problem.
Edit 2:
I think I have figured out on why:
Is generating the same digest, every time.
This means, although you are running your docker build - there has been no change to the underlying artifacts which are being packaged into the image and hence it results into the same digest.

Sometimes layers are cached but there are changes that aren't detected so you can delete the image or use 'docker system prune' to force clearing cache here

docker pull image policy/settings

My situation is I have two images with the same tag(hash different), one at local and another at the registry. When I build dockerfile, docker always compares the hash of the two images find not equal then will pull the registry one.
I know there has an imagePullPolicy in k8s. My question is docker has any settings like imagePullPolicy?
Thanks a lot.

The Docker tooling by and large either assumes you're going to manually pull images, or provides a --pull option to integrate it with other commands. For example:
docker build has a --pull option to try to retrieve a newer version of FROM images
docker run does not; it will always reuse the image you already have, or pull one if you don't have it
Neither core docker-compose nor docker-compose up has a --pull option, but there is a docker-compose pull command that pulls every image listed in a docker-compose.yml file
docker-compose build does have a --pull option
Core Docker always tries to pull an image if it is not present; there is no equivalent to imagePullPolicy: Never. Conversely, it never tries to communicate with an image registry outside of an explicit "pull" operation; you also cannot make docker run act like imagePullPolicy: Always.
It's good practice in Kubernetes to use a unique tag per build, so you can specify an explicit build and don't have to worry about imagePullPolicy. If you do this, in plain Docker, the implicit "pull if missing" will get you the correct behavior as well.

As a potential addendum to #David Maze's answer, it looks like docker run now also has a --pull option.

Does the order of --cache-from arguments matter when building an image with Docker Buildkit?

Suppose I am building an image using Docker Buildkit. My image is from a multistage Dockerfile, like so:
FROM node:12 AS some-expensive-base-image
...
FROM some-expensive-base-image AS my-app
...
I am now trying to build both images. Suppose that I push these to Docker Hub. If I were to use Docker Buildkit's external caching feature, then I would want to try to save build time on my CI pipeline by pulling in the remote some-expensive-base-image:latest image as the cache when building the some-expensive-base-image target. And, I would want to pull in both the just-built some-expensive-base-image image and the remote my-app:latest image as the caches for the latter image. I believe that I need both in order to prevent requiring the steps of some-expensive-base-image from needing to be rebuilt, since...well...they are expensive.
This is what my build script looks like:
export DOCKER_BUILDKIT=1
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:latest --target some-expensive-base-image -t some-expensive-base-image:edge .
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:edge --cache-from my-app:latest --target my-app -t my-app:edge .
My question: Does the order of the --cache-from arguments matter for the second docker build?
I have been getting inconsistent results on my CI pipeline for this build. There are cache misses that are happening when building that latter image, even though there hasn't been any code changes that would have caused cache busting. The Cache Minefest can be pulled without issue. There are times when the cache image is pulled, but other times when all steps of that latter target need to be rerun. I don't know why.
By chance, should I instead try to docker pull both images before running the docker build commands in my script?
Also, I know that I referred to Docker Hub in my example, but in real life, my application uses AWS ECR for its remote Docker repository. Would that matter for proper Buildkit functionality?

Yes, the order of --cache-from matters!
See the explanation on Github from the person who implemented the feature, quoting here:
When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.
I've had similar problems in the past, you might find useful to check ths answer, where I've shared about using Docker cache in the CI.

Is it possible to run a Dockerfile without building an image (or to immediately/transparently discard the image)?

I have a use case where I call docker build . on one of our build machines.
During the build, a volume mount from the host machine is used to persist intermediate artifacts.
I don't care about this image at all. I have been tagging it with -t during the build and calling docker rmi after it's been created, but I was wondering if there was a one-liner/flag that could do this.
The docker build steps don't seem to have an appropriate flag for this behavior, but it may be simply because build is the wrong term.

Is it possible to cache multi-stage docker builds?

I recently switched to multi-stage docker builds, and it doesn't appear that there's any caching on intermediate builds. I'm not sure if this is a docker limitation, something which just isn't available or whether I'm doing something wrong.
I am pulling down the final build and doing a --cache-from at the start of the new build, but it always runs the full build.

This appears to be a limitation of docker itself and is described under this issue - https://github.com/moby/moby/issues/34715
The workaround is to:
Build the intermediate stages with a --target
Push the intermediate images to the registry
Build the final image with a --target and use multiple --cache-from paths, listing all the intermediate images and the final image
Push the final image to the registry
For subsequent builds, pull the intermediate + final images down from the registry first

Since the previous answer was posted, there is now a solution using the BuildKit backend: https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources
This involves passing the argument --build-arg BUILDKIT_INLINE_CACHE=1 to your docker build command. You will also need to ensure BuildKit is being used by setting the environment variable DOCKER_BUILDKIT=1 (on Linux; I think BuildKit might be the default backend on Windows when using recent versions of Docker Desktop). A complete command line solution for CI might look something like:
export DOCKER_BUILDKIT=1
# Use cache from remote repository, tag as latest, keep cache metadata
docker build -t yourname/yourapp:latest \
--cache-from yourname/yourapp:latest \
--build-arg BUILDKIT_INLINE_CACHE=1 .
# Push new build up to remote repository replacing latest
docker push yourname/yourapp:latest
Some of the other commenters are asking about docker-compose. It works for this too, although you need to additionally specify the environment variable COMPOSE_DOCKER_CLI_BUILD=1 to ensure docker-compose uses the docker CLI (with BuildKit thanks to DOCKER_BUILDKIT=1) and then you can set BUILDKIT_INLINE_CACHE: 1 in the args: section of the build: section of your YAML file to ensure the required --build-arg is set.

I'd like to add another important point to the answer
--build-arg BUILDKIT_INLINE_CACHE=1 caches only the last layer, and works in case nothing changed.
So, to enable the caching of layers for the whole build, this argument should be replaced by --cache-to type=inline,mode=max. See the documentation

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart