Best practices for maintaining a Dockerfile? - docker

I have a Dockerfile something like follows:
FROM openjdk:8u151
# others here
I have 2 questions about the base image:
1. How do I get the tags?
Usually I get them from Docker Hub; for example, I found openjdk:8u151 in Docker Hub's openjdk repository.
If I could list all tags with a local docker command, I wouldn't need to visit the website every time, which is inefficient.
2. Will the base image stay available?
I mean, will my base image always be there?
Looking at the openjdk repo above, it is an official repo.
I found that only 8u151 is left for me to choose. But there should have been a lot of JDK 8 releases over time, so there should also be a lot of JDK 8 images, such as 8u101, 8u163, etc.
So can I assume the maintainers delete old openjdk images?
If that happens, how will my Dockerfile work? Do I have to change my base image whenever upstream deletes their image? That would be really painful to maintain.
Even if openjdk really only produced one JDK 8 release, my concern remains, since Docker Hub does offer a delete button to its users.
What's the best practice? Please advise, thanks.

How to get the tags?
See "How to list all tags for a Docker image on a remote registry?".
The API is enough
For instance, visit:
https://registry.hub.docker.com/v2/repositories/library/java/tags/?page_size=100&page=2
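As a sketch of using that API from a terminal (assuming curl and jq are installed; the repository name below uses the current library/openjdk path rather than the older library/java one in the URL above):

```shell
#!/bin/sh
# List all tags for an official image via the Docker Hub v2 API.
# Official images live under the "library/" namespace.
REPO="library/openjdk"
URL="https://registry.hub.docker.com/v2/repositories/${REPO}/tags/?page_size=100"

# The API is paginated: keep following the "next" link until it is null.
while [ -n "$URL" ] && [ "$URL" != "null" ]; do
  PAGE=$(curl -fsSL "$URL")
  printf '%s\n' "$PAGE" | jq -r '.results[].name'
  URL=$(printf '%s\n' "$PAGE" | jq -r '.next')
done
```

This prints every tag, one per line, without opening a browser.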
Will the base image stay available?
As long as you save your own built image in a registry (either a public one or a self-hosted one), yes: you will at least be able to build new images based on the one you have already built.
And even if the base image disappears, you still have its layers in your own image, and can re-tag it (provided the build cache is available).
See for instance "Is there a way to tag a previous layer in a docker image or revert a commit?".
See caveats in "can I run an intermediate layer of docker image?".
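A minimal sketch of that insurance policy (registry.example.com is a placeholder for a registry you control): pull the base image while it still exists upstream, re-tag it under your own name, and push the copy:

```shell
# Pull the base image while it is still published upstream.
docker pull openjdk:8u151

# Re-tag it under a repository you own (placeholder registry name).
docker tag openjdk:8u151 registry.example.com/mirror/openjdk:8u151

# Push the copy. Your Dockerfile can now say
#   FROM registry.example.com/mirror/openjdk:8u151
# and keeps building even if the upstream tag is deleted.
docker push registry.example.com/mirror/openjdk:8u151
```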

Related

Docker - workflow for updating container

I'm just getting to grips with Docker. I need to update a base image for my image.
Questions
Do I need to completely recreate all the changes I made on top of the
new base image and save it as a new image?
What do people do to remember the changes they've made to their
image?
Do I need to completely recreate all the changes I made on top of the new base image and save it as a new image?
You don't. It is up to you whether you want to completely rebuild the image or to use your old one as a new base. But unless we are talking about a generic base image, such as one where you just preinstall things that you want available to all derived images, it is probably better to rebuild the image from scratch. Otherwise you might end up cluttering images with stuff they don't need, which is never a good thing (both from the perspective of size and of security).
What do people do to remember the changes they've made to their image?
Right out of the box you can use the history command to see what went into the image:
docker image history <image>
which lists the image's filesystem layers.
Personally, when I build images, I copy the Dockerfile into the image so that I can quickly cat it:
docker exec <container> cat Dockerfile
(Note that docker exec takes a running container, not an image.) This is more convenient for me than reading through the history output. (I don't include anything sensitive in a Dockerfile, and all the information it contains is already available inside the container if someone breaks in.)
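A minimal sketch of that habit (image and container names are illustrative): have the Dockerfile copy itself into the image at build time, then read it back from a running container:

```shell
# Write a Dockerfile that ships a copy of itself inside the image.
cat > Dockerfile <<'EOF'
FROM alpine:3.19
# Keep the build recipe inside the image for later inspection.
COPY Dockerfile /Dockerfile
CMD ["sleep", "infinity"]
EOF

docker build -t self-documented .
docker run -d --name demo self-documented

# Read the recipe back out of the running container.
docker exec demo cat /Dockerfile
```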

Is it good practice to commit docker container frequently?

I'm running WebSphere Liberty inside the container. Because WebSphere Liberty requires frequent XML editing, which is impractical with Dockerfile commands, I have to docker-commit the container from time to time so that others can make use of my images.
The command is like:
docker commit -m "updated sa1" -a "Song" $id company/wlp:v0.1
Colleagues are doing similar things to the image; they docker-commit the container several times every day.
One day we're going to deploy the image on production.
Q1: Is the practice of frequent docker-committing advised?
Q2: Does it leave any potential problem behind?
Q3: Does it create an extra layer? I read the docker-commit documentation, which didn't mention whether it creates another layer, so I assume it does not.
I wouldn't use docker commit.
It seems like a really good idea, but you can't reproduce the image at will as you can with a Dockerfile, and you can't change the base image once this is done either. That makes it very hard to apply, for example, a security patch to the underlying OS base image.
If you go the full Dockerfile approach you can re-run docker build and you'll get the same image again. And you are able to change the base image.
So my rule of thumb is if you are creating a temporary tool and you don't care about reuse or reproducing the image at will then commit is convenient to use.
As I understand Docker, every container has two parts: a group of read-only layers making up the bulk of the image, plus a thin writable layer where any changes the container makes are recorded.
When you run commit, Docker creates a new, distinct image: the base image's layers plus the changes you made. It does this by freezing the container's thin writable layer into a new read-only layer on top of the existing ones, so each commit does add one extra layer to the image.
Don't just take my word for it; take Red Hat's advice.
For clarity that article in step 5 says:
5) Don’t create images from running containers – In other terms, don’t
use “docker commit” to create an image. This method to create an image
is not reproducible and should be completely avoided. Always use a
Dockerfile or any other S2I (source-to-image) approach that is totally
reproducible, and you can track changes to the Dockerfile if you store
it in a source control repository (git).
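For the WebSphere Liberty case specifically, the edited XML can live in source control next to the Dockerfile, so every change is a rebuild rather than a commit. A sketch (the websphere-liberty image is the official one on Docker Hub, and /config/server.xml is the path that image conventionally uses, but treat both as assumptions to verify for your version):

```shell
# server.xml is versioned in git alongside this Dockerfile.
cat > Dockerfile <<'EOF'
FROM websphere-liberty:latest
# Bake the server configuration in at build time instead of
# editing it inside a running container and committing.
COPY server.xml /config/server.xml
EOF

# Every config change becomes a reproducible rebuild:
docker build -t company/wlp:v0.1 .
```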

After deleting image from Google Container Registry and uploading another with the same tag, deleted one is still pulled

I don't know if this is intended behavior or a bug in GCR.
Basically I tried do it like that:
Create an image from local files using Docker on Windows (a Linux-based image).
Before creating the image, I delete all local images with the same name/tag.
The image is tagged like repository/project/name:v1.
When testing locally, the image has the correct versions of the executables (docker run imageID).
Before pushing the image to GCR, I delete all images from GCR with the same tag/name.
When trying to pull the new image from GCR into, for example, Kubernetes, it pulls the first image ever uploaded under that tag.
I want to reuse the same tag so I don't have to change the config file with every test, and I don't really need to store previous versions of images.
It sounds like you're hitting the problem described in kubernetes/kubernetes#42171.
tl;dr, the default pull policy of kubernetes is broken by design such that you cannot reuse tags (other than latest). I believe the guidance from the k8s community is to use "immutable tags", which is a bit of an oxymoron.
You have a few options:
Switch to using the latest tag, since kubernetes has hardcoded this in their default pull policy logic (I believe in an attempt to mitigate the problem you're having).
Never reuse a tag.
Switch to explicitly using the PullAlways ImagePullPolicy. If you do this, you will incur a small overhead, since your node will have to check with the registry that the tag has not changed.
Switch to deploying by image digest with the PullIfNotPresent ImagePullPolicy. A more detailed explanation is in the PR I linked, but this gets you the best of both worlds.
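A sketch of the digest-based option (image, project, and deployment names are placeholders): after pushing, ask the local Docker client for the repository digest, then point the deployment at the digest instead of the tag:

```shell
# Push the image, then look up its registry digest.
docker push gcr.io/my-project/my-app:v1
DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' \
  gcr.io/my-project/my-app:v1)

# The digest form looks like gcr.io/my-project/my-app@sha256:...
# and uniquely identifies this exact image, so PullIfNotPresent
# can never resolve to a stale image.
kubectl set image deployment/my-app my-app="$DIGEST"
```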

How to delete docker image data or layer in nexus3

I'm trying out Nexus OSS 3.0.1-01. I have a Docker repository set up and I'm able to push and pull images successfully. But I need a way to delete images. For Docker, deleting a component won't actually delete the underlying image layers from the file system, because they may be referenced by other components. So what is the proper way to handle this?
I even deleted every single component and then ran a scheduled task to compact the blob store. But that didn't seem to do much in terms of freeing up storage space.
My understanding is that this feature doesn't exist in Nexus 3 at the moment. If it does, could you please point me to some documentation on it? Otherwise, how is everyone else managing their storage space for Docker repositories?
We had a user contribute this recently:
https://gist.github.com/lukewpatterson/bf9d19410094ea8bced1d4bb0523b67f
You can read about usage here: https://issues.sonatype.org/browse/NEXUS-9293
As well, a supported feature for this will be coming soon from Sonatype.
This is something that needs to be provided at the Docker registry level. Currently it appears to be broken in v3.1.
Did you try going to the assets and deleting the layers? If that, along with compacting the blob store, did not remove the files from the blob store, then it is a Nexus problem.
Make sure to track these issues and confirm that this is the desired behavior for 3.2.
See issues
https://issues.sonatype.org/browse/NEXUS-9497
https://issues.sonatype.org/browse/NEXUS-9293
In Nexus 3.14 you go to the web UI -> Tasks -> Create -> "Docker - Delete unused manifests and images".
Then run another task, "Admin - Compact blob store", to actually remove the files from the Nexus directory.
Before that you need to delete the Nexus components (using a cleanup policy plus task), as the original poster did.

Where do untagged Docker images come from?

I'm creating some very simple Docker containers. I understand that after each step a new container is created. However, when using other Dockerfiles from the Hub I don't wind up with untagged images. So where do they come from? After browsing around online I have found out how to remove them but I want to gain a better understanding where they come from. Ideally I would like to prevent them from ever being created.
From their documentation
This will display untagged images, that are the leaves of the images
tree (not intermediary layers). These images occur when a new build of
an image takes the repo:tag away from the IMAGE ID, leaving it
untagged. A warning will be issued if trying to remove an image when a
container is presently using it. By having this flag it allows for
batch cleanup.
I don't quite understand this. Why are builds taking the repo:tag away from the IMAGE ID?
Whenever you assign a tag that is already in use to a new image (say, by building image foo, making a change in its Dockerfile, and then building foo again), the old image will lose that tag but will still stay around, even if all of its tags are deleted. These older versions of your images are the untagged entries in the output of docker images, and you can safely delete them with docker rmi <IMAGE HASH> (though the deletion will be refused if there's an extant container still using that image).
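A sketch that reproduces this (throwaway image name, Dockerfiles fed in via stdin): build twice under the same tag, and the first build's image becomes an untagged `<none>` entry:

```shell
# First build takes the tag foo:latest.
printf 'FROM alpine:3.19\nRUN echo one > /v\n' | docker build -t foo -

# A second build with changed content steals the tag;
# the first image is now untagged ("dangling").
printf 'FROM alpine:3.19\nRUN echo two > /v\n' | docker build -t foo -

# List the dangling images, then clean them up.
docker images --filter dangling=true
docker image prune -f
```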
Docker has historically used a storage driver called AUFS, an advanced multi-layered union file system (modern Docker defaults to overlay2, but the layering model is the same). Pretty much each line of a Dockerfile creates a new layer, and when you stack them all on top of each other you get your final Docker image. This is essentially a way of caching: if you change only the 9th line of your Dockerfile, it won't rebuild the entire image set. (Well, it depends on what commands you have in your Dockerfile; for example, if the files referenced by a COPY or ADD change, nothing after that point is served from the cache.)
The final image gets tagged with whatever label it has, but all these intermediary images are necessary in order to create the final image, so it doesn't make sense to delete them or prevent them from being created. Hope that makes sense.
