Rebuild docker image by reusing the same tag? - docker

I've gone through multiple questions posted on the forum but didn't get clarity regarding my requirement.
I'm building a Docker image after every successful CI build; there are hardly one or two lines of changes in the Dockerfile for each build.
Docker Build Command:
$(docker_registry)/$(Build.Repository.Name):azul
Docker Push Command:
$(docker_registry)/$(Build.Repository.Name):azul
I want to overwrite the current Docker image with the latest one (from the latest CI build changes) but retain the same tag, azul. Does Docker support this?

Yes, Docker supports it. Every instruction in the Dockerfile results in a new layer in the image, containing the changes relative to the previous layer. After you modify the Dockerfile, new layers will be created and the unchanged preceding layers will be reused.
If you want a clean build of the whole image with no cached layers, you can use the --no-cache parameter.
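For illustration, a clean rebuild that overwrites the azul tag might look like this (the registry host and repository name are placeholders standing in for the pipeline variables above):
docker build --no-cache -t registry.example.com/myrepo:azul .
docker push registry.example.com/myrepo:azul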

Mechanically this works. The new image will replace the old one under that name. The old image will still be physically present on the build system, but the docker images output will show <none> for its name; commands like docker system prune can clean these up.
The problems with this approach are on the consumer end. If I docker run registry.example.com/image:azul, Docker will automatically pull the image only if it's not already present. This can result in you running an older version of the image that happens to be on a consumer's system. This is especially a problem in cluster environments like Kubernetes, where you need a change in the text of the image name in a Kubernetes deployment specification to trigger an update.
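For instance, in Kubernetes the usual way to roll out a new build is to point the deployment at a different image reference (the deployment and container names here are placeholders):
kubectl set image deployment/myapp app=registry.example.com/image:build-124
If the image reference stays registry.example.com/image:azul, Kubernetes sees no change in the spec and performs no rollout, even though the tag now points at new content.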
In a CI system especially, I'd recommend assigning some sort of unique tag to every build. This could be based on the source-control commit ID, the branch name and build number, the current date, or something else. You can create a fixed tag like azul as a convenience to developers (an image is allowed to have multiple tags), but I'd plan not to use it for actual deployments.
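A minimal sketch of that scheme, tagging each build with the Git commit hash alongside the convenience tag (registry and repository names are placeholders):
GIT_SHA=$(git rev-parse --short HEAD)
docker build -t registry.example.com/myrepo:$GIT_SHA -t registry.example.com/myrepo:azul .
docker push registry.example.com/myrepo:$GIT_SHA
docker push registry.example.com/myrepo:azul
Deployments then reference the immutable :$GIT_SHA tag, while azul merely tracks the most recent build.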

Related

Is it possible to make a FROM instruction in a Dockerfile pull the most recent image?

I want to know whether it's possible to make a FROM instruction in a Dockerfile pull the most recent image (e.g. image:latest) before proceeding with the build.
Currently, the image is only pulled if it’s not already stored locally.
docker build --pull OTHER_OPTIONS PATH
From https://docs.docker.com/engine/reference/commandline/build/
--pull Always attempt to pull a newer version of the image
Although there might be a genuine use case for this for development purposes, I strongly suggest avoiding this option in production builds. Docker images must be immutable. Using this option can lead to situations where different images are generated from the same source code, and any behaviour changes resulting from such builds, without corresponding changes in code, are hard to debug.
Say there is a project called "derived project" which uses the base image myBaseImage:latest
FROM myBaseImage:latest
<snipped>
CMD xyz
docker build --pull -t myDerivedImage:<version of source code> .
Assuming the tag of the derived image is based on its source code version (a git commit hash, for example), which is the most common way to tag images: if a new base image is published under the latest tag while there are no changes in the derived project, the build of the derived project will produce different images under the same name before and after the base image change. Once an image is published under a name, it should not be mutated.
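One common way to keep such builds reproducible is to pin the base image by digest rather than by a mutable tag; a sketch (the digest below is a placeholder, substitute the real digest of the known-good base):
FROM myBaseImage@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
With a pinned digest, rebuilding the derived project always starts from exactly the same base, regardless of where latest currently points.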
In order to build a Docker image while also updating the base image, you must use the option:
--pull
I'll leave you the official documentation, where this option and many more are discussed: official docker documentation

How can I revert my last push on hub.docker.com?

I have damaged my working Docker image by pushing a faulty Docker image over it on hub.docker.com. How can I revert the last push?
There is no revert option that I'm aware of. If you have a good copy of your image somewhere, you can repush that to the registry. To avoid this issue in the future, follow one or more of these steps:
Avoid using the latest tag and give each build a unique tag.
Use a reproducible build process, with a Dockerfile saved in version control that uses specific versions for all dependencies. This allows you to check out a previous state of the Dockerfile to rerun a previous build.
Maintain a private registry of your own for your images and any dependencies you have on other images. Make sure to maintain those dependencies (updating periodically) and backup your registry.
You can use the Advanced Image Management page in Docker Hub to copy the digest of the good image, pull it to your system, overwrite the tag, then push it back. Use these commands:
docker image pull myname/example@sha256:1234
docker tag myname/example@sha256:1234 myname/example:mytag
docker push myname/example:mytag
Here myname/example@sha256:1234 is the digest of the good image you copied from Docker Hub, and myname/example:mytag is whatever you want to tag the image as.

What is the purpose of pushing an image in a CI/CD pipeline?

Context: Reading through this blog post.
Pushing images to a registry seems to be the "right thing to do" ... but I don't understand why.
What purpose does this serve? Is it because the server I ssh into needs to have a local copy of the image? And to do that, one approach is to pull an image from a registry?
From the CI/CD perspective, a Docker registry is the equivalent of an artifact repository for images. You want a central source of these images to download from as you go from one Docker host to another, since your build server is most likely different from your dev and prod servers.
Couldn't I just upload an image from one machine (say a CI/CD server) via ssh? Using Docker Hub seems needlessly ceremonious to me. Like in this example (I know this API is deprecated, but it illustrates my point).
It is possible to save/load images directly to a Docker host, but there are a few major downsides. First, you lose any benefit of Docker's layered filesystem. When building an app in CI/CD, most of the time only the last few layers need to be rebuilt with your application changes; the base image and the various common layers used to build your app remain identical. With a registry, these common layers are detected, and only the difference is pushed and pulled, making your deploys faster and saving you disk space. With a save/load command, all layers are sent every time, since you do not know the state of the remote server when you run the save.
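For comparison, the ssh-based path the question describes would look roughly like this (hostnames and image names are placeholders):
docker save myname/myapp:azul | ssh deploy@prod-host 'docker load'
Every layer of the image crosses the wire on every deploy, whereas docker push and docker pull against a registry transfer only the layers the other side is missing.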
Second, this doesn't scale as you add hosts to run images. Every host would need the image copied on the chance you want to run it on that host, e.g. to handle failover or load balancing. It also won't work if you move to swarm mode or kubernetes since you could easily add new nodes to the cluster that won't have your image. Swarm mode defaults to looking up the sha256 of the image on the registry to guarantee the same image is always used even if the tag is modified on the registry after the initial deploy.
Keep in mind you can run your own registry server (there's a Docker image for it, and the API is open). Many artifact repositories (e.g. Artifactory and Nexus) include support for a Docker registry. And many cloud providers include a registry with their container offerings. So you do not need to push to the remote Docker Hub to deploy locally.
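Standing up the open-source registry locally is a one-liner; a minimal sketch (the port and names are arbitrary):
docker run -d -p 5000:5000 --name registry registry:2
docker tag myname/myapp:azul localhost:5000/myapp:azul
docker push localhost:5000/myapp:azul
This uses the official registry image; localhost:5000 works without TLS because Docker treats localhost as an insecure registry by default.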
One last point: a registry server is useful to developers, who can pull the same image used in dev and prod to test against other microservices they are writing locally, without needing to build everything locally or ssh to a CI/CD server (or even prod) to save and scp images back to their laptops.
Usually, you use a CI/CD pipeline when you want to streamline your build/test/deploy process, and usually that happens when you have production infrastructure to maintain that is actually critical to your business.
There is no need for a CI/CD pipeline if you're just playing around or prototyping, IMO, in which case you can build your Docker images on the machine directly, or ssh an image over. That's perfectly reasonable.
Look at the 'registry' as a repository for your binary images (i.e. fixed versions of your code that ideally are versioned and that you know work).
Then deploying is as simple as telling your servers to pull the image and run it, from anywhere.
On a flexible architecture, you might have nodes coming up or going down at any time, and they need to be able to pull the latest code from somewhere to get back up and running automatically, at any time, without intervention.
The registry is the single source of truth in this case. You can have multiple nodes (servers) or clusters and still have a single place from which to get your images. If one of your nodes goes down, you can quickly start the image on a new one. You can also automate image updates using the registry's webhooks: when you push a new version of an image, the registry sends a webhook to any service that can upgrade your containers to the newest version.
Consider a Docker image as a new way of distributing your software to your servers, and a Docker registry as centralized storage for shared images (like npm.org for JS or maven.org for Java).
For example, if you develop a Java application, in the years before Docker you might have used .jar files for this. A Docker image is better in that it also includes all OS-level dependencies, such as the JDK/JRE and system configuration. This helps you avoid the "it works on my machine" effect.
To distribute a Docker image, you could also just use the Dockerfile and build it every time on every machine. A Docker registry instead gives you centralized storage of pre-built images.
Pushing to a Docker registry in your CI/CD lets you build your artifact once and then work with the same artifact in both integration and production environments.
Using just a Dockerfile will not guarantee the same result on every build at every moment in time, because your Dockerfile may install external dependencies that can be updated or even removed between two sequential builds.
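A sketch of how that drift happens (the base image and package are arbitrary examples):
FROM ubuntu:22.04
# apt-get fetches whatever versions are current at build time, so two
# builds of this same Dockerfile weeks apart can produce different images
RUN apt-get update && apt-get install -y curl
Pushing the built image to a registry freezes one known-good result, which rebuilding from the Dockerfile alone cannot guarantee.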

Advantages of a Dockerfile

We can create Docker images and push them to Docker Hub without a Dockerfile. Why is it useful to have a Dockerfile? What are its advantages? Writing a Dockerfile is time-consuming and can only be done by a human.
I would like to know the main difference between an image committed from a container and a Dockerfile-based image.
A Dockerfile is used to automate the work by specifying all the steps that we want in the Docker image.
A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build, users can create an automated build that executes several command-line instructions in succession.
Yes, we can create Docker images without one, but every time you want to make any change, you have to make it manually, test it, and push it.
If you instead use a Dockerfile with Docker Hub, the image will rebuild automatically on every modification, and if something is wrong the rebuild will fail.
Advantages of Dockerfile
A Dockerfile is an automated script for building a Docker image (see the sketch after this list).
Manual image creation becomes complicated when you want to test the same setup on different OS flavors: you would have to create a separate image for each flavor, whereas a small change in the Dockerfile produces images for different flavors.
It has a simple syntax for describing the image and performs automatically many changes that would take more time done manually.
A Dockerfile is a systematic set of steps that others can understand easily, making it easy to know exactly what configuration was changed from the base image.
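For illustration, a minimal Dockerfile and its build command (the image name, package, and script are arbitrary examples):
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl
COPY app.sh /usr/local/bin/app.sh
CMD ["/usr/local/bin/app.sh"]
docker build -t myname/myapp:v1 .
Swapping ubuntu:22.04 for another base is the one-line change that yields the same setup on a different OS flavor.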
Advantages of a Dockerfile with Docker Hub
Docker Hub provides private repositories for Dockerfiles.
A Dockerfile can be shared within a team and organization.
Automatic image builds.
Webhooks attached to your repositories allow you to trigger an event when an image or updated image is pushed to the repository.
We can keep the Dockerfile on GitHub or Bitbucket.
Difference between a committed image and a Dockerfile-based image
Committed image: commits a container's file changes or settings into a new image.
Usage: docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
Create a new image from a container's changes
-a, --author= Author (e.g., "John Hannibal Smith <hannibal@a-team.com>")
-c, --change=[] Apply Dockerfile instruction to the created image
--help=false Print usage
-m, --message= Commit message
-p, --pause=true Pause container during commit
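A sketch of that workflow (the container and image names are placeholders):
docker run -it --name tweaked ubuntu bash
# ...install or edit things inside the container, then exit...
docker commit -m "installed curl" -a "Jane Doe" tweaked myname/ubuntu-tweaked:v1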
It is a good option for debugging a container and exporting changed settings into another image, but Docker suggests using a Dockerfile instead (see here); you could say commit is Docker's form of versioning, or a backup of an image.
The commit operation will not include any data contained in volumes mounted inside the container. By default, the container being committed and its processes will be paused while the image is committed. This reduces the likelihood of encountering data corruption during the process of creating the commit. If this behavior is undesired, set the 'p' option to false.
Dockerfile-based image:
It always uses a base image for creating the new image. Suppose you make any change in the Dockerfile: the build will apply all the Dockerfile steps to a fresh image and create a new image, whereas commit reuses the same image.
In my view, we should use a Dockerfile that contains all the steps we want in the image. If we create an image from a commit, we have to separately document every change we made, which will be needed if we ever want to recreate the image; you could say a Dockerfile is the documentation of an image.
The advantage is that, even if you do not have a shared image registry to which you could push your images, you can still exchange said images via a "recipe" (the Dockerfile used by docker build), which is only a couple of KB of text and can be passed around very easily (light and small).
That declarative format ensures that you will be able to rebuild an identical image, and allows reproducible results.
Docker commit
Using the docker commit approach to create new images is error-prone: one needs to remember and apply every small change to the image manually and commit every time.
Dockerfile
A Dockerfile provides the ability to automate all the steps with a set of directives that get executed during the build (see the docker build command) to create the final image, including the commit of the image.
A Dockerfile is a use-anywhere, everything-configured-and-ready-to-run approach.
A Dockerfile can be shared with others and updated easily by them. It allows you to change the image easily depending on requirements, such as security hardening or adding and updating user details.

Images are being cached even if there are changes

I have an automated build on Docker Hub for an image based on Ubuntu with some custom configuration, to reuse as a base image in other Dockerfiles for particular projects. This works okay.
I made a change to it and committed to GitHub, which then triggered the automated build on Docker Hub.
In one of these other projects, the Dockerfile starts with FROM myuser/myimage, but it's not getting the latest image with the changes; it keeps using the cached old one.
Shouldn't this happen automatically?
You need to docker pull the latest version. Docker looks for the FROM image locally; it doesn't notice if that tag has been updated in the registry it came from. I have a script that runs docker pull before building images.
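A minimal sketch of that script (the image names are placeholders):
docker pull myuser/myimage
docker build -t myuser/myproject:latest .
Alternatively, docker build --pull refreshes the base image in one step, with the reproducibility caveats discussed above.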
