I want to base my container on centos:centos6. But centos:centos6 is somehow being updated on the registry, which can result in different (and possibly unwanted) images when building on different machines at different times. Such a change recently caused a segmentation fault in our application.
Is there a way I can specify the exact version in the FROM declaration so the build is the same even when the container is built on different machines at different times?
No, you can't tell a Dockerfile to be pinned to a particular image / layer ID; you have to use a tag (and if you don't use a tag, the tag latest is assumed and used as the default).
If you're worried that a remote registry image will change, you should take a copy of the Dockerfile and build your own version of the image yourself. You can either host it on the Docker Hub under your account or run your own private registry.
That way, you're in complete control over what's in there and when it gets updated (i.e., if you need to pin a particular package to an old version).
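For example, a minimal sketch, assuming you have copied the upstream centos:centos6 Dockerfile into a ./base directory and that myaccount is a placeholder for your Docker Hub account or private registry:
docker build -t myaccount/centos6-base:1.0 ./base
docker push myaccount/centos6-base:1.0
# application Dockerfiles then start FROM myaccount/centos6-base:1.0, an image only you can change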
Related
Let's say I have a private Docker registry and a Dockerfile for an application.
When I build my image with a specific tag and push it to the registry, then re-build it with different code (for example, I copy a different JAR or I re-run npm install && ng build) and push it with that same tag, does the registry keep the old image, even though I don't need it anymore since I replaced it with a new one? I am asking because I am concerned about the storage consumption of the Docker registry: pushing many layers that I no longer need and intended to replace might waste storage for nothing.
Yes, it will keep it.
A tag is just an easy way to have access to an image, but what ultimately matters is the hash of the image, and that will be different.
The registry you use probably has some maintenance task to remove older images, or you can write a simple script yourself.
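For example, a hedged sketch for the open-source registry:2 image, assuming deletion is enabled (storage.delete.enabled: true), the registry container is named registry, and registry.example.com/my-app are placeholders; other registry products ship their own cleanup tooling:
# delete the unwanted manifest via the API (the digest comes from the
# Docker-Content-Digest header of a manifest GET), then garbage-collect
curl -X DELETE https://registry.example.com/v2/my-app/manifests/sha256:<digest>
docker exec registry /bin/registry garbage-collect /etc/docker/registry/config.yml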
I want all running containers on my server to always use the latest version of an official base image, e.g. node:16.13, in order to get security updates. To achieve that I have implemented an image update mechanism for all container images in my registry using a CI workflow, which has some limitations described below.
I have read the answers to this question but they either involve building or inspecting images on the target server, which I would like to avoid.
I am wondering whether there might be an easier way to achieve the container image updates or to alleviate some of the caveats I have encountered.
Current Image Update Mechanism
I build my container images using the FROM directive with the minor version I want to use:
FROM node:16.13
COPY . .
This image is pushed to a registry as my-app:1.0.
To check for changes in the node:16.13 image compared to when I built the my-app:1.0 image, I periodically compare the SHA256 digests of the layers of node:16.13 with those of the first n = (number of layers of node:16.13) layers of my-app:1.0, as suggested in this answer. I retrieve the SHA256 digests with docker manifest inspect <image>:<tag> -v.
If they differ, I rebuild my-app:1.0 and push it to my registry, thus ensuring that my-app:1.0 always uses the latest node:16.13 base image.
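For reference, a hedged sketch of that comparison in shell, using a small helper function, assuming jq is installed, picking the first platform entry when the manifest is multi-arch, and using registry.example.com/my-app:1.0 as a placeholder name:
layers() {
  docker manifest inspect -v "$1" \
    | jq -r 'if type == "array" then .[0] else . end | .. | .layers? // empty | .[].digest'
}
base=$(layers node:16.13)
n=$(printf '%s\n' "$base" | wc -l)
app=$(layers registry.example.com/my-app:1.0 | head -n "$n")
# if any of the first n layers differ, the base image has changed
[ "$base" != "$app" ] && echo "base image changed, rebuilding my-app:1.0"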
I keep the running containers on my server up to date by periodically running docker pull my-app:1.0 on the server using a cron job.
Limitations
When I check for updates I need to download the manifests for all my container images and their base images. For images hosted on Docker Hub this unfortunately counts against the download rate limit.
Since I always update the same image my-app:1.0, it is hard to track which version is currently running on the server. This information is especially important when the update process breaks a service. I keep track of the updates by logging the output of the docker pull command from the cron job.
To be able to revert the container image on the server, I have to keep previous versions of the my-app:1.0 image as well. I do that by pushing incremental patch version tags along with the my-app:1.0 tag to my registry, e.g. my-app:1.0.1, my-app:1.0.2, ...
Because of the way the layers of the base image and the app image are compared, it is not possible to detect a change in the base image where only the uppermost layers have been removed. However, I do not expect this to happen very frequently.
Thank you for your help!
There are a couple of things I'd do to simplify this.
docker pull already does essentially the sequence you describe, of downloading the image's manifest and then downloading layers you don't already have. If you docker build a new image with an identical base image, an identical Dockerfile, and identical COPY source files, then it won't actually produce a new image, just put a new name on the existing image ID. So it's possible to unconditionally docker build --pull images on a schedule, and it won't really use additional space. (It could cause more redeploys if neither the base image nor the application changes.)
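For instance, a hedged sketch of the scheduled rebuild, using registry.example.com/my-app as a placeholder image name:
# --pull refreshes node:16.13 before building; unchanged inputs reuse the same image ID
docker build --pull -t registry.example.com/my-app:1.0 .
docker push registry.example.com/my-app:1.0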
[...] this unfortunately counts against the download rate limit.
There's not a lot you can do about that beyond running your own mirror of Docker Hub or ensuring your CI system has a Docker Hub login.
Since I always update the same image my-app:1.0 it is hard to track which version is currently running on the server. [...] To be able to revert the container image on the server [...]
I'd recommend always using a unique image tag per build. A sequential build ID as you have now works; date stamps or source-control commit IDs are usually easy to come up with as well. When you go to deploy, always use the full image tag, not the abbreviated one.
docker pull registry.example.com/my-app:1.0.5
docker stop my-app
docker rm my-app
docker run -d ... registry.example.com/my-app:1.0.5
docker rmi registry.example.com/my-app:1.0.4
Now you're absolutely sure which build your server is running, and it's easy to revert should you need to.
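On the build side, a unique tag is easy to derive mechanically; a hedged sketch, assuming a git checkout and the same placeholder registry:
TAG=$(git rev-parse --short HEAD)   # or: TAG=$(date +%Y%m%d-%H%M%S)
docker build --pull -t registry.example.com/my-app:"$TAG" .
docker push registry.example.com/my-app:"$TAG"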
(If you're using Kubernetes as your deployment environment, this is especially important. Changing the text value of a Deployment object's image: field triggers Kubernetes's rolling-update mechanism. That approach is much easier than trying to ensure that every node has the same version of a shared tag.)
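A hedged example of that rolling update, assuming a Deployment and container both named my-app and a hypothetical new tag 1.0.6:
kubectl set image deployment/my-app my-app=registry.example.com/my-app:1.0.6
Kubernetes then replaces the pods with the new image, and kubectl rollout undo can revert the change if needed.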
If a Docker image that uses a particular base image is running as a container, and there is a new security upgrade for the base image, what is the best practice for applying that security patch to the Docker image?
Also, how can I know whether a security patch is available for the base image?
Let's say that you have a Dockerfile that is based on an image called "Base:latest" and you've built an image called "MyImage:latest".
If "Base:latest" has been updated with security fixes, you need to rebuild your "MyImage:latest".
Containers are instances of images, so if you need the security updates to be reflected in a container, the container should be re-created from the rebuilt "MyImage:latest" image.
Notice that you wouldn't want to use the "latest" tag for base images in production, because you won't be able to reproduce the same deployed environment, so the best practice is to use a specific version tag like "1.0". If an update is available, you'll need to update your Dockerfile from "Base:1.0" to "Base:1.1".
So if your image is based on another image and you want to apply security updates without waiting for a new, updated version of the base image, you can run a security update command in your Dockerfile, rebuild your image occasionally, and recreate the container.
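A hedged sketch of that routine, assuming the image and container are both named my-app and a Debian-based image whose Dockerfile contains something like RUN apt-get update && apt-get -y upgrade:
# --no-cache forces the security-update RUN step to actually re-run
docker build --no-cache --pull -t my-app:latest .
docker stop my-app && docker rm my-app
docker run -d --name my-app my-app:latest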
You could probably automate this process with tools like Watchtower, by automatically rebuilding your image on a regular basis and then recreating your container.
Another option is to run automated updates at the container level, probably by using a script that runs every day, but you should take into consideration the impact on the running process load-wise (networking, CPU, etc.).
I need different images for dev, stage, and prod environments. How should I store the images on Docker Hub?
Should I use tags
my_app:prod
my_app:dev
my_app:stage
Or maybe include the environment name in the image name, like this:
my_app_prod
my_app_dev
my_app_stage
Tags are primarily meant for versioning, as the default tag latest implies. If you use them for another meaning without versioning info, like tagging the environment as my_app:dev and my_app:prod, there's no strict rule to prohibit that, but it could cause problems for the deployment of the containers.
Imagine you have a container defined in docker-compose.yml that specifies my_app:prod as its image. It's fine when you're developing locally, but when you deploy to production with Docker Compose or an orchestration service like Kubernetes, depending on policy, the controller can choose to reuse images from its local cache instead of pulling from the registry every time. Now you've just completed a new version of the image and pushed it to Docker Hub, feeling assured. Too bad it's still under the same name and tag, so the controller considers it the same and uses the cached image, causing your old version to be deployed.
It could be worse than that. Not all nodes or clusters are configured the same: some will pull the latest version from the registry while others don't. Your swarm or deployment now contains a mixed set of old and new container versions, producing erratic behavior at best.
Now you know better and push your new version as my_app/prod:v2.0 and update the config. All controllers see the new version and pull it down to use for replacing and scaling containers. Everything is consistent.
A simple version number as the tag may sound too simple; in practice you could have many properties that are useful to attach to an image, perhaps to help with documentation or querying, or you may need a specific name and tag so you can push to a certain cloud provider. Luckily you don't have to sacrifice versioning to do that, as Docker allows you to apply as many tags as you like:
docker build -t my_app:latest -t my_app:v2.0 -t my_app:prod -t cloud_user/app_image_id:v2.0 .
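Note that in current Docker versions docker push sends one tag at a time (or all tags of one repository with --all-tags), so each tag you want in the registry is pushed explicitly; a hedged example using the names from the build command above (against Docker Hub the bare my_app names would also need your account namespace):
docker push my_app:v2.0
docker push my_app:prod
docker push cloud_user/app_image_id:v2.0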
We have a base image with the tag latest. This base image is being used by a bunch of applications. There might be updates to the base image (an OS upgrade, for example).
Do we need to rebuild and redeploy all applications when there is a change in the base image? Or, since the tag is latest and the new base image will also carry the latest tag, will the change be picked up in the Docker layers and taken care of without a restart?
Kubernetes has an imagePullPolicy: setting to control this. The default is that a node will only pull an image if it doesn’t already have it, except that if the image is using the :latest tag, it will always pull the image.
If you have a base image and then some derived image FROM my/base:latest, the derived image will include a specific version of the base image as its lowermost layers. If you update the base image and don’t rebuild the derived images, they will still use the same version of the base image. So, if you update the base image, you need to rebuild all of the deployed images.
If you have a running pod of some form and it’s running a :latest tag and the actual image that tag points at changes, Kubernetes has no way of noticing that, so you need to manually delete pods to force it to recreate them. That’s bad. Best practice is to use some explicit non-latest version tag (a date stamp works fine) so that you can update the image in the deployment and Kubernetes will redeploy for you.
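A hedged example of forcing that recreation, assuming the pods carry an app=my-app label:
kubectl delete pod -l app=my-app
The Deployment controller then starts replacement pods, which pull the :latest image again because :latest defaults to imagePullPolicy: Always.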
There are two levels to this question.
Docker
If you use something like FROM baseimage:latest, this exact image is pulled down on your first build. Docker caches layers on consecutive builds, so not only will it build from the same baseimage:latest, it will also skip execution of the Dockerfile steps until the first changed/not-cached one. To make the build notice changes to your base image you need to run docker pull baseimage:latest prior to the build, so that the next run uses the new content under the latest tag.
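A minimal sketch, with myimage as a placeholder name:
docker pull baseimage:latest
docker build -t myimage:latest .
# or let the build refresh the base image itself:
docker build --pull -t myimage:latest .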
The same goes for versioned tags when they aggregate minor/patch versions: when you use baseimage:v1.2 but the software is updated from baseimage:v1.2.3 to v1.2.4, the content of v1.2.4 is, by the same process, published as v1.2. So be aware of how versioning is handled for the particular image you use.
Kubernetes
When you use :latest to deploy to Kubernetes you usually have imagePullPolicy: Always set, which, as with the Docker build above, means that the image is always pulled before the container runs. This is far from ideal, and far from immutable. Depending on the moment of a container restart you might end up with two pods running at the same time, both from the same :latest image, yet with :latest meaning a different actual image underneath for each of them.
Also, you can't really change the image in a Deployment from :latest to :latest, because that's obviously no change, meaning you're out of luck for triggering a rolling update unless you pass the version in a label or something similar.
The good practice is to version your images somehow and push updates to the cluster with that version. That is how it's designed and intended to be used in general. Some versioning schemes I have used are:
semantic (i.e. v1.2.7): nice if your CI/CD tool supports it well; I used it in Concourse CI
git_sha: works in many cases but is problematic for rebuilds that are not triggered by code changes
branch-buildnum or branch-sha-buildnum: we use these quite a lot
That is not to say I completely avoid latest. In fact most of my builds are built as branch-num, but when they are released to production they are also tagged and pushed to the registry as branch-latest (i.e. master-latest for prod), which is very helpful when you want to deploy a fresh cluster with the current production versions (the default tag values in our Helm charts point to latest and are set to a particular tag when released via CI).
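A hedged sketch of the branch-sha-buildnum scheme plus the branch-latest release tag, assuming BUILD_NUM is provided by the CI system and registry.example.com/my-app is a placeholder:
BRANCH=$(git rev-parse --abbrev-ref HEAD)
SHA=$(git rev-parse --short HEAD)
TAG="${BRANCH}-${SHA}-${BUILD_NUM}"
docker build -t registry.example.com/my-app:"$TAG" .
docker push registry.example.com/my-app:"$TAG"
# on release to production, also tag and push branch-latest (e.g. master-latest)
docker tag registry.example.com/my-app:"$TAG" registry.example.com/my-app:"${BRANCH}-latest"
docker push registry.example.com/my-app:"${BRANCH}-latest"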