Docker base image: how to upgrade

I'm just starting with Docker, and this question probably shows that I don't really understand the base concepts yet, but I can't figure it out.
So, I want to use this image as my "base" image: https://registry.hub.docker.com/u/phusion/baseimage/
Now, this base image has a number of tags (versions). The most recent one is 0.9.11.
So, let's say I'll spin up a number of images based on this "base" image and push those to production.
Then the Phusion guys will push some updates to that image, and I'll want to upgrade not just the actual base image but also all of the images I already use on prod (based on the "base" image).
So how would I do that?
=================
Extra question:
The other case I assume should be perfectly possible:
The base image has some common lib, openssl for example. Now a new bug is discovered and I need to upgrade to a newer openssl version.
Is it possible to upgrade the openssl on the base image, commit it to my local registry, and pull that change into all images that are based on that "base" image?

When you build a Dockerfile, instructions are read from top to bottom, using the cache as much as possible.
The first time Docker encounters a new or changed command, the cache is busted.
The FROM directive is usually at the top of the Dockerfile, so if you change the tag of your base image, the whole Dockerfile will be rebuilt from scratch.
And that's how you "update" a base image: you rebuild all your containers from their Dockerfiles; you don't "push" the changes.
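As a concrete sketch (the 0.9.12 tag and image names below are hypothetical), the upgrade cycle for each downstream image looks like this:
# 1. bump the FROM line in the Dockerfile, e.g. FROM phusion/baseimage:0.9.12
docker pull phusion/baseimage:0.9.12               # fetch the updated base
docker build -t myregistry.example.com/myapp:2 .   # rebuild on top of it
docker push myregistry.example.com/myapp:2         # publish the rebuilt image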

Related

How to instruct docker or docker-compose to automatically build image specified in FROM

When processing a Dockerfile, how do I instruct docker build to build the image specified in FROM locally using another Dockerfile if it is not already available?
Here's the context. I have a large Dockerfile that starts from a base Ubuntu image, installs Apache, then PHP, then some custom configuration on top of that. Whether this is a good idea is another question; let's assume the build steps cannot be changed. The problem is, every time I change anything in the config, everything has to be rebuilt from scratch, and this takes a while.
I would like to have a hierarchy of Dockerfiles instead:
my-apache : based on stock Ubuntu
my-apache-php: based on my-apache
final: based on my-apache-php
The first two images would be relatively static and can be uploaded to Docker Hub, but I would like to retain the option to build them locally as part of the same build process. Only one container will exist, based on the final image. Thus, putting all three as "services" in docker-compose.yml is not a good idea.
The only solution I can think of is to have a manual build script that, for each image, checks whether it is available on Docker Hub or locally, and if not, invokes docker build.
Are there better solutions?
I have found this article on automatically detecting dependencies between Dockerfiles and building them in the proper order:
https://philpep.org/blog/a-makefile-for-your-dockerfiles
The actual Makefile from Philippe's Git repo provides even more functionality:
https://github.com/philpep/dockerfiles/blob/master/Makefile
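For the specific hierarchy described in the question, a minimal hand-written Makefile along those lines might look like this (a sketch, assuming each image lives in a directory of the same name; Philippe's version derives the dependencies automatically). Make's dependency ordering rebuilds parents before children:
# recipe lines are tab-indented, as Make requires
all: final

my-apache:
	docker build -t my-apache my-apache/

my-apache-php: my-apache
	docker build -t my-apache-php my-apache-php/

final: my-apache-php
	docker build -t final final/

.PHONY: all my-apache my-apache-php final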

Docker - Upgrading base Image

I have a base image which is used by 100 applications; all 100 applications have this common base image in their Dockerfiles. Now I am upgrading the base image (for an OS upgrade or some other change), bumping up the version, and also tagging it as latest.
Here, the problem is: whenever I change the base image, all 100 applications need to change the base image in their Dockerfiles and rebuild the app to use the latest base image.
Is there any better way to handle this?
Note: I am running my containers in Kubernetes, and each application's Dockerfile is in Git.
You can use a Dockerfile ARG directive to modify the FROM line (see Understand how ARG and FROM interact in the Dockerfile documentation). One possible approach here would be to have your CI system inject the base image tag.
ARG base=latest
FROM me/base-image:${base}
...
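Your CI system could then pin a specific base tag at build time, for example (the tag value and app image name are illustrative):
docker build --build-arg base=20191031 -t me/app .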
This has the risk that individual developers would build test images based on an older base image; if the differences between images are just OS patches then you might consider this a small and acceptable risk, so long as only official images get pushed to production.
Beyond that, there aren't many alternatives to modifying the individual Dockerfiles. You could script it:
# Individually check out everything first
BASE=$(pwd)
TAG=20191031
for d in *; do
cd "$BASE/$d"
sed -i.bak "s#FROM me/base-image.*#FROM me/base-image:$TAG#" Dockerfile
git checkout -b "base-image-$TAG"
git commit -am "Update Dockerfile to base-image:$TAG"
git push -u origin "base-image-$TAG"
hub pull-request --no-edit
done
There are automated dependency-update tools out there too, and these may be able to manage the scripting aspects of this for you.
You don't need to change the Dockerfile for each app if it uses base-image:latest. You will still have to rebuild the app images after a base image update, though, and then redeploy the apps so they use the new image.
For example, using the advice from this answer.
whenever I change the base image, all 100 applications need to change the base image in their Dockerfiles and rebuild the app to use the latest base image.
That's a feature, not a bug; all 100 applications need to run their tests (and potentially fix any regressions) before going ahead with the new image...
There are tools out there to scan all the repos and automatically submit pull requests to the 100 applications (or you can write a custom one, if you don't have just plain "FROM" lines in Dockerfiles).
If you need to deploy the latest version of the base image, then yes, you need to build, tag, push, pull, and deploy each container again. If your base image is not properly tagged, you'll also need to change the Dockerfile in all 100 repositories.
But you have some options, like using sed to replace all occurrences in your Dockerfiles and executing all the build commands from a shell script that points at every app directory.
With a compose file you can update your 100 running apps with one command (in swarm mode):
docker stack deploy --compose-file docker-compose.yml mystack
but you still need to rebuild the images first.
Edit:
With Docker Compose you can also build your 100 images with one command; you need to define all of them in a compose file. Check the docs for the compose file format.
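A minimal sketch of such a compose file (service names and paths are assumptions); running docker-compose build would then rebuild every listed image:
version: "3.7"
services:
  app1:
    build: ./app1          # rebuilds me/app1 from its own Dockerfile
    image: me/app1:latest
  app2:
    build: ./app2
    image: me/app2:latest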

Docker base image update in Kubernetes deployment

We have a base image with the tag latest. This base image is used by a bunch of applications. There might be some update to the base image (an OS upgrade, for example).
Do we need to rebuild and redeploy all applications when there is a change in the base image? Or, since the tag is latest and the new base image will also carry the tag latest, will the update be picked up at the Docker layer and taken care of without a restart?
Kubernetes has an imagePullPolicy: setting to control this. The default is that a node will only pull an image if it doesn’t already have it, except that if the image is using the :latest tag, it will always pull the image.
If you have a base image and then some derived image FROM my/base:latest, the derived image will include a specific version of the base image as its lowermost layers. If you update the base image and don’t rebuild the derived images, they will still use the same version of the base image. So, if you update the base image, you need to rebuild all of the deployed images.
If you have a running pod of some form and it’s running a :latest tag and the actual image that tag points at changes, Kubernetes has no way of noticing that, so you need to manually delete pods to force it to recreate them. That’s bad. Best practice is to use some explicit non-latest version tag (a date stamp works fine) so that you can update the image in the deployment and Kubernetes will redeploy for you.
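A minimal sketch of that practice (the names and registry below are assumptions), pinning a date-stamped tag in the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        # an explicit tag instead of :latest, so changing it triggers a rollout
        image: registry.example.com/myapp:20191031
Bumping the tag (for example with kubectl set image deployment/myapp myapp=registry.example.com/myapp:20191101) then triggers an ordinary rolling update.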
There are two levels to this question.
Docker
If you use something like FROM baseimage:latest, this exact image is pulled down on your first build. Docker caches layers on consecutive builds, so not only will it build from the same baseimage:latest, it will also skip execution of the Dockerfile steps until the first changed/not-cached one. To make the build notice changes to your base image, you need to run docker pull baseimage:latest prior to the build, so that the next run uses the new content under the latest tag.
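For example (baseimage:latest is from the answer; myapp is illustrative):
docker pull baseimage:latest   # refresh the local copy of the base
docker build -t myapp .
# or have the build refresh it for you:
docker build --pull -t myapp .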
The same goes for versioned tags that aggregate minor/patch versions: for example, you use baseimage:v1.2, the software is updated from baseimage:v1.2.3 to v1.2.4, and by the same process the content of v1.2.4 is published as v1.2. So be aware of how versioning is handled for the particular image.
Kubernetes
When you use :latest to deploy to Kubernetes you usually have imagePullPolicy: Always set, which, as with the Docker build above, means the image is always pulled before a container runs. This is far from ideal, and far from immutable: depending on the moment each container restarts, you might end up with two pods running at the same time, both from the same :latest tag, yet with :latest resolving to a different actual image underneath for each of them.
Also, you can't really change the image in a Deployment from :latest to :latest, since that's obviously no change, meaning you're out of luck for triggering a rolling update unless you pass the version in a label or something similar.
The good practice is to version your images somehow and push updates to the cluster with that version; that is how it's designed and intended to be used in general. Some versioning schemes I have used:
semantic (e.g. v1.2.7): nice if your CI/CD tool supports it well; I used it in Concourse CI
git_sha: works in many cases but is problematic for rebuilds that are not triggered by code changes
branch-buildnum or branch-sha-buildnum: we use these quite a lot
That is not to say I never use latest. In fact most of my builds are tagged as branch-num, but when they are released to production they are also tagged and pushed to the registry as branch-latest (i.e. master-latest for prod), which is very helpful when you want to deploy a fresh cluster with the current production versions (the default tag values in our Helm charts point to latest and are set to a particular tag when released via CI).
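A minimal CI sketch of the branch-buildnum scheme (the registry, image, and variable names are assumptions):
IMAGE=registry.example.com/myapp
TAG="${BRANCH}-${BUILD_NUM}"               # e.g. master-142
docker build -t "$IMAGE:$TAG" .
docker push "$IMAGE:$TAG"
# on release to production, also move the branch-latest alias
docker tag "$IMAGE:$TAG" "$IMAGE:${BRANCH}-latest"
docker push "$IMAGE:${BRANCH}-latest"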

How stable are version-tagged docker baseimages? Should I make my own copy?

I am creating Docker images based on base images from docker.io (for example ubuntu:14.04).
I want my Docker builds to be 100% reproducible. One requirement for this is that the base image does not change (or, if it changes, that it is my decision to use the changed base image).
Can I be sure that a version tagged base image (like ubuntu:14.04) will always be exactly the same?
Or should I make my own copy in my own private repository?
Version tags like ubuntu:14.04 can be expected to change with bug fixes. If you want to be sure you get the exact same image (still containing the fixed bugs) you can use the hash of the image:
FROM ubuntu@sha256:4a725d3b3b1c
But you cannot be sure this exact version will be hosted forever by Docker Hub.
The safest way is to run your own Docker registry. Push the images you are using to that registry, and use the digest notation to pull the images from your local registry:
FROM dockerrepos.yourcompany.com/ubuntu@sha256:4a725d3b3b1c
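A sketch of mirroring a base image into such a registry (the host name is the one from the answer; in practice the printed digest is the full 64-character sha256 value):
docker pull ubuntu:14.04
# print the repo digest you can pin in FROM lines
docker inspect --format '{{index .RepoDigests 0}}' ubuntu:14.04
docker tag ubuntu:14.04 dockerrepos.yourcompany.com/ubuntu:14.04
docker push dockerrepos.yourcompany.com/ubuntu:14.04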

Docker image size not matching the container size after a commit

Recently I created a Docker container from a CentOS base image, which ran JBoss. Originally I had installed the JDK (and committed), which made the container bulky (about 850M). Later, I uninstalled the JDK and installed the JRE. From inside the container a
du -xsh /
shows only 440M. But after committing the changes to the image, it still shows 711M. Should the image size not match (or at least be close to) the container's du? Or, while committing, does Docker keep adding to the old layers (like an SCM)?
Thanks
Answering my own question. It seems that Docker adds a layer to the base image whenever you commit. However small you make your container, committing it only adds a new layer on top of the existing image, which always inflates the image. The problem with this approach is that when you make modifications and commit to the base image, you end up shipping unwanted layers to production. I can see no official way to merge the layers, but there seem to be some workarounds in this link. In the end I just recreated everything from the base and committed only once, which made the image size very close to the container size.
I think you will find this issue easily resolvable if you are using Dockerfiles to build your images, rather than the commit method. If you are using Dockerfiles, you can simply remove the RUN command that added the JDK, and rebuild. This will get rid of the previous layers.
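A minimal sketch of that approach (the package names are assumptions for a CentOS base): installing only the headless JRE in a single RUN layer leaves no removed-JDK layer behind:
FROM centos:7
# install only the JRE and clean the yum cache in the same layer,
# so the committed layer stays small
RUN yum install -y java-1.8.0-openjdk-headless && yum clean all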
