How to see tree view of docker images? - docker

I know Docker has deprecated the --tree flag of the docker images command, but I could not find any handy command that gives the same output as docker images --tree. I found dockviz, but it seems to be another container to run. Is there any built-in CLI command to see a tree view of images without using dockviz?

Update Nov. 2021: for online public images, there is the online service contains.dev.
Update Nov. 2018, docker 18.09.
You now have wagoodman/dive, a tool for exploring each layer in a docker image.
To analyze a Docker image simply run dive with an image tag/id/digest:
dive <your-image-tag>
or if you want to build your image then jump straight into analyzing it:
dive build -t <some-tag> .
The current (Sept 2015, docker 1.8) workaround mentioned by issue 5001 remains dockviz indeed:
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nate/dockviz images -t
The -t option keeps the output as text in the CLI (no graphics needed).
Update Sept. 2016 (post docker 1.10: docker 1.11 soon 1.12), one year later, as mentioned in the same issue 5001, by Michael Härtl:
Since 1.10 the way layer IDs worked has changed fundamentally. For a lengthy explanation of this topic see #20399. There's also #20451 but I'm not sure, if this could be used by the nate/dockviz image.
Personally I find the way the new layers work very very confusing and much less transparent than before. And it's not really well documented either.
AFAIK @tonistiigi's comments in the issue above are the only public explanation available.
Tõnis Tiigi:
Pre v1.10 there was no concept of layers or the other way to think about it is that every image only had one layer. You built a chain of images and you pushed and pulled a chain. All these images in the chain had their own config.
Now there is a concept of a layer that is a content addressable filesystem diff. Every image configuration has an array of layer references that make up the root filesystem of the image and no image requires anything from its parent to run. Push and pull only move a single image, the parent images are only generated for a local build to use for the cache.
If you build an image with the Dockerfile, every command adds a history item into the image configuration. This stores the command so you can see it in docker history. As this is part of image configuration it also moves with push/pull and is included in the checksum verification.
Here are some examples of content addressable configs:
https://gist.github.com/tonistiigi/6447977af6a5c38bbed8
Terms in v1.10: (the terms really have not changed in implementation but previously our docs probably simplified things).
Layer is a filesystem diff. Bunch of files that when stacked on top of each other make up a root filesystem. Layers are managed by graphdrivers, they don't know anything about images.
Image is something you can run and that shows up in docker images -a. Needs to have a configuration object. When container starts it needs some kind of way to generate a root filesystem from image info. On build every Dockerfile command creates a new image.
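To make the content-addressable idea above more concrete, here is a minimal Python sketch. It assumes that an image ID is the SHA-256 digest of the bytes of the image configuration; the config dictionaries and the image_id helper are hypothetical, heavily trimmed illustrations, not real images or Docker APIs.

```python
import hashlib
import json


def image_id(config: dict) -> str:
    """Digest of the serialized config; stands in for the real image ID."""
    # Assumption: Docker hashes the exact stored config bytes. Here we
    # serialize deterministically just to have stable bytes to hash.
    raw = json.dumps(config, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(raw).hexdigest()


# Hypothetical, trimmed config: an array of layer references plus history.
base = {
    "rootfs": {"type": "layers", "diff_ids": ["sha256:" + "a" * 64]},
    "history": [{"created_by": "ADD rootfs.tar.xz /"}],
}

# One more Dockerfile command appends a layer reference and a history item.
child = {
    "rootfs": {
        "type": "layers",
        "diff_ids": base["rootfs"]["diff_ids"] + ["sha256:" + "b" * 64],
    },
    "history": base["history"] + [{"created_by": "COPY . ."}],
}

print(image_id(base))
print(image_id(child))
```

The child config carries its own complete list of layer references, so under this model it does not need the parent image to describe its root filesystem; any change to the config yields a different ID.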
You can refer to the more recent project TomasTomecek/sen, which:
had to understand the new 1.10 layer format (commit 82b224e)
includes an image tree representation.

Related

How to improve automation of running container's base image updates?

I want all running containers on my server to always use the latest version of an official base image e.g. node:16.3 in order to get security updates. To achieve that I have implemented an image update mechanism for all container images in my registry using a CI workflow which has some limitations described below.
I have read the answers to this question but they either involve building or inspecting images on the target server which I would like to avoid.
I am wondering whether there might be an easier way to achieve the container image updates or to alleviate some of the caveats I have encountered.
Current Image Update Mechanism
I build my container images using the FROM directive with the minor version I want to use:
FROM node:16.3
COPY . .
This image is pushed to a registry as my-app:1.0.
To check for changes in the node:16.3 image compared to when I built the my-app:1.0 image, I periodically compare the SHA256 digests of the layers of node:16.3 with those of the first n=(number of layers of node:16.3) layers of my-app:1.0, as suggested in this answer. I retrieve the SHA256 digests with docker manifest inspect <image>:<tag> -v.
If they differ I rebuild my-app:1.0 and push it to my registry thus ensuring that my-app:1.0 always uses the latest node:16.3 base image.
I keep the running containers on my server up to date by periodically running docker pull my-app:1.0 on the server using a cron job.
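The prefix comparison described above could be sketched roughly like this in Python. The function name base_is_current and the short digest placeholders are hypothetical; the sketch assumes the layer digests have already been extracted, in order, from the docker manifest inspect -v output.

```python
def base_is_current(base_layers, app_layers):
    """True if app_layers starts with exactly the layers of the base image."""
    n = len(base_layers)
    return app_layers[:n] == base_layers


# Hypothetical digests, shortened for readability.
app = ["sha256:aaa", "sha256:bbb", "sha256:ccc", "sha256:app"]

unchanged_base = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]
patched_base = ["sha256:aaa", "sha256:bbb", "sha256:ddd"]
truncated_base = ["sha256:aaa", "sha256:bbb"]

print(base_is_current(unchanged_base, app))  # no rebuild needed
print(base_is_current(patched_base, app))    # rebuild needed
print(base_is_current(truncated_base, app))  # still matches the prefix
```

Note that a base image which only lost its uppermost layers still matches as a prefix, so this check cannot detect that case.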
Limitations
When I check for updates I need to download the manifests for all my container images and their base images. For images hosted on Docker Hub this unfortunately counts against the download rate limit.
Since I always update the same image my-app:1.0 it is hard to track which version is currently running on the server. This information is especially important when the update process breaks a service. I keep track of the updates by logging the output of the docker pull command from the cron job.
To be able to revert the container image on the server I have to keep previous versions of the my-app:1.0 images as well. I do that by pushing incremental patch version tags along with the my-app:1.0 tag to my registry e.g. my-app:1.0.1, my-app:1.0.2, ...
Because of the way the layers of the base image and the app image are compared it is not possible to detect a change in the base image where only the uppermost layers have been removed. However I do not expect this to happen very frequently.
Thank you for your help!
There are a couple of things I'd do to simplify this.
docker pull already does essentially the sequence you describe, of downloading the image's manifest and then downloading layers you don't already have. If you docker build a new image with an identical base image, an identical Dockerfile, and identical COPY source files, then it won't actually produce a new image, just put a new name on the existing image ID. So it's possible to unconditionally docker build --pull images on a schedule, and it won't really use additional space. (It could cause more redeploys if neither the base image nor the application changes.)
[...] this unfortunately counts against the download rate limit.
There's not a lot you can do about that beyond running your own mirror of Docker Hub or ensuring your CI system has a Docker Hub login.
Since I always update the same image my-app:1.0 it is hard to track which version is currently running on the server. [...] To be able to revert the container image on the server [...]
I'd recommend always using a unique image tag per build. A sequential build ID as you have now works; date stamps or source-control commit IDs are usually easy to come up with as well. When you go to deploy, always use the full image tag, not the abbreviated one.
docker pull registry.example.com/my-app:1.0.5
docker stop my-app
docker rm my-app
docker run -d ... registry.example.com/my-app:1.0.5
docker rmi registry.example.com/my-app:1.0.4
Now you're absolutely sure which build your server is running, and it's easy to revert should you need to.
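As one possible way to come up with such a unique tag, here is a small Python sketch that combines a UTC date stamp with a shortened commit ID. The build_tag helper and the tag layout are assumptions for illustration only; any scheme that is unique per build would do.

```python
from datetime import datetime, timezone


def build_tag(repository, commit, when):
    """Return a tag such as 'registry.example.com/my-app:20240102-030405-abc1234'.

    'when' should be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"{repository}:{stamp}-{commit[:7]}"


# Hypothetical commit ID and build time.
tag = build_tag(
    "registry.example.com/my-app",
    "abc1234def5678",
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(tag)
```

The resulting string can be passed to docker build -t and docker push in CI, and the same full tag is then used on the server when pulling and running.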
(If you're using Kubernetes as your deployment environment, this is especially important. Changing the text value of a Deployment object's image: field triggers Kubernetes's rolling-update mechanism. That approach is much easier than trying to ensure that every node has the same version of a shared tag.)

What is docker's scratch image?

I'm new to docker and I was trying out the first hello world example in the docs. As I understand the hello-world image is based on top of the scratch image. Could someone please explain how the scratch image works? As I understand it is essentially blank. How is the binary executed in the hello-world image then?
The scratch image is the most minimal image in Docker. This is the base ancestor for all other images. The scratch image is actually empty. It doesn't contain any folders/files ...
The scratch image is mostly used for building other base images. For instance, the debian image is built from scratch as such:
FROM scratch
ADD rootfs.tar.xz /
CMD ["bash"]
The rootfs.tar.xz contains all the filesystem files. The Debian image adds the filesystem folders to the scratch image, which is empty.
As I understand it is essentially blank. How is the binary executed in the hello-world image then?
The scratch image is blank. The hello-world executable added to the scratch image is actually statically compiled, meaning that it is self-contained and doesn't need any additional libraries to execute.
As stated in the official Docker docs:
Assuming you built the “hello” executable example from the Docker GitHub example C-source code, and you compiled it with the -static flag, you can then build this Docker image using: docker build --tag hello .
This confirms that the hello-world executable is statically compiled. For more info about static compiling, read here.
A bit late to the party, but adding to the answer of @yamenk.
Scratch isn't technically an image; it is merely a reference. Container images are constructed to make use of the underlying kernel, which provides only the tools and system calls present in the kernel itself. Because in Linux everything is a file, you can add any self-contained binary, or an entire operating system, as files in this filesystem.
This means that an image created from scratch technically refers to the kernel of the host system, and all the files are loaded on top of it. That's why building from scratch is also a no-op, and when you add just a single binary the size of the image is only the size of that binary plus a bit of overhead.
The resources you can assign when running an image in a container are controlled through the cgroups functionality, and networking makes use of Linux network namespaces.
In short, the official scratch image contains nothing: zero bytes.
But a container instance is not the same as what the container image looks like, even though the scratch image is empty. When a runtime like runC starts an instance from an image built from scratch, it needs more things (like a rootfs etc.) than what you can see in the Dockerfile.

How can I see Dockerfile for each docker image?

I have the following docker images.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest 48b5124b2768 2 months ago 1.84 kB
docker/whalesay latest 6b362a9f73eb 22 months ago 247 MB
Is there a way I can see the Dockerfile of each docker image on my local system?
The answer at Where to see the Dockerfile for a docker image? does not help me because it does not exactly show the Dockerfile but the commands run to create the image. I want the Dockerfile itself.
As far as I know, no, you can't. Because a Dockerfile is used for building the image, it is not packed with the image itself. That means you have to reverse engineer it. You can use docker inspect on an image or container to get some insight and a feel for how it is configured. The layers of an image are also visible, since you pull them when you pull a specific image, so that is no secret either.
However, you can usually see the Dockerfile in the repository of the image itself on Docker Hub. I can't say that most repositories have Dockerfiles attached, but most of the repositories I have seen do.
Different repository maintainers may opt for different ways to document the Dockerfiles. You can see the Dockerfile tab on the repository page if automatic builds are set up. But when multiple parallel versions are available (like for Ubuntu), maintainers usually opt to put links to the Dockerfiles for the different versions in the description. If you take a look here: https://hub.docker.com/_/ubuntu/, under "Supported tags" (again, for Ubuntu), you can see links to multiple Dockerfiles, one for each respective Ubuntu version.
As the images are downloaded directly from Docker Hub, only the image is pulled onto your machine. If you want to see the Dockerfile, go to Docker Hub and search for the image name and version in tag format (e.g. ubuntu:14.04); this opens the image page along with the Dockerfile details. Also keep in mind that you can see the Dockerfile only if the owner of the image shared it. Most official images will not provide you with a Dockerfile.
Hope it helps!
You can also regenerate the dockerfile from an image or use the docker history <image name> command to see what is inside.
check this: Link to answer
TL;DR
So if you have a docker image that was built by a dockerfile, you can recover this information (All except from the original FROM command, which is important, I’ll grant that. But you can often guess it, especially by entering the container and asking “What os are you?”). However, the maker of the image could have manual steps that you’d never know about anyways, plus they COULD just export an image, and re-import it and there would be no intermediate images at that point.
One approach could be to save the image in an image.tar file. Next, extract the file and explore whether you can find a Dockerfile in any of the layer directories.
docker image save -o hello.tar hello-world
This will output a hello.tar file.
hello.tar is the output archive and hello-world is the name of the image you are saving.
After that, extract the archive and explore the image layer directories. You may find the Dockerfile in one of them.
However, note that if the image was built with the Dockerfile listed in .dockerignore, you will not find the Dockerfile with this approach.
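Instead of extracting everything by hand, the search could be sketched in Python with the standard tarfile module. This is only a sketch under assumptions: it treats every regular file in the saved archive that itself opens as a tar as a layer, it reads each layer fully into memory, and find_dockerfiles is a hypothetical helper name.

```python
import io
import tarfile


def find_dockerfiles(image_tar_path):
    """Return (layer_member, path) pairs for files named 'Dockerfile' in any layer."""
    hits = []
    with tarfile.open(image_tar_path) as outer:
        for member in outer.getmembers():
            if not member.isfile():
                continue
            data = outer.extractfile(member).read()
            try:
                layer = tarfile.open(fileobj=io.BytesIO(data))
            except tarfile.ReadError:
                continue  # not a tar archive (e.g. manifest or config JSON)
            with layer:
                for entry in layer.getmembers():
                    # Compare only the last path component of each entry.
                    if entry.isfile() and entry.name.rsplit("/", 1)[-1] == "Dockerfile":
                        hits.append((member.name, entry.name))
    return hits


# Usage, assuming hello.tar was produced by the docker image save command above:
#     for layer_name, path in find_dockerfiles("hello.tar"):
#         print(layer_name, path)
```

An empty result means no layer contains a file named Dockerfile, which is the expected outcome unless the Dockerfile was copied into the image during the build.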

Different images in containers

I want to create separated containers with a single service in each (more or less). I am using the php7-apache image which seems to use a base image of debian:jessie, php7 and apache. Since apache and php in this case are pretty intertwined I don't mind using this container.
I want to start adding other services to their own containers (git for example) and was considering using a tiny base image like busybox or alpinebox for these containers to keep image size down.
That said, I have read that using the same base image as other containers only gives you the 'penalty' of the one time image download of the base OS (debian jessie) which is then cached - while using tiny OSes in other containers will download those OSes on top of the base OS.
What is the best practice in this case? Should I use the same base image (debian jessie) for all the containers in this case?
You may want to create a base image from scratch.
From docker documentation
You can use Docker’s reserved, minimal image, scratch, as a starting point for building containers. Using the scratch “image” signals to the build process that you want the next command in the Dockerfile to be the first filesystem layer in your image.
While scratch appears in Docker’s repository on the hub, you can’t pull it, run it, or tag any image with the name scratch. Instead, you can refer to it in your Dockerfile. For example, to create a minimal container using scratch:
This example creates the hello-world image used in the tutorials. If you want to test it out, you can clone the image repo

Docker hub image cache doesn't seem to be working

We have a continuous integration pipeline on circleci that does the following:
1. Loads repo/image:mytag1 from the cache directory to be able to use cached layers
2. Builds a new version: docker build -t repoimage:mytag2
3. Saves the new version to the cache directory with docker save
4. Runs tests
5. Pushes to docker hub: docker push repo/image:mytag2
The problem is with step 5. The push step takes 5 minutes every time. If I understand it correctly, docker hub is meant to cache layers so we don't have to re-push things like the base image and dependencies if they are not updated.
I ran the build twice in a row, and I see a lot of crossover in the hash of the layers being pushed. Yet rather than "Image already exists" I see "Image successfully pushed".
Here's the output of build 1's docker push, and here's build 2
If you diff those two files you'll see that only 2 layers differ in each build:
< ca44fed88be6: Buffering to Disk
< ca44fed88be6: Image successfully pushed
< 5dbd19bfac8a: Buffering to Disk
< 5dbd19bfac8a: Image successfully pushed
---
> 9136b10cfb72: Buffering to Disk
> 9136b10cfb72: Image successfully pushed
> 0388311b6857: Buffering to Disk
> 0388311b6857: Image successfully pushed
So why is it that all the images have to re-push every time?
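For reference, the comparison of the two push logs can be sketched in Python as below. It assumes the log line format shown in the diff above ("<12-hex-id>: Image successfully pushed"); the helper names and the shared ID 0123456789ab in the sample logs are hypothetical.

```python
import re

PUSHED = re.compile(r"^([0-9a-f]{12}): Image successfully pushed$")


def pushed_layers(log_text):
    """Return the set of layer IDs reported as pushed in a docker push log."""
    ids = set()
    for line in log_text.splitlines():
        match = PUSHED.match(line.strip())
        if match:
            ids.add(match.group(1))
    return ids


def compare_pushes(log_a, log_b):
    """Return (pushed in both, pushed only in a, pushed only in b)."""
    a, b = pushed_layers(log_a), pushed_layers(log_b)
    return a & b, a - b, b - a


# Sample logs: the shared ID is hypothetical, the others come from the diff.
build1 = """0123456789ab: Buffering to Disk
0123456789ab: Image successfully pushed
ca44fed88be6: Buffering to Disk
ca44fed88be6: Image successfully pushed
"""
build2 = """0123456789ab: Buffering to Disk
0123456789ab: Image successfully pushed
9136b10cfb72: Buffering to Disk
9136b10cfb72: Image successfully pushed
"""

both, only_1, only_2 = compare_pushes(build1, build2)
print(sorted(both), sorted(only_1), sorted(only_2))
```

Layers that land in the first set were uploaded in both builds even though their IDs did not change, which is the behavior being asked about.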
Using a different tag creates a different image which, when pushed, cannot rely on the cache.
For example the two commands:
$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v1
$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v2
yield utterly distinct images even though they were created from identical images (db65bf421f96). When pushed, dockerhub must treat them as completely separate images as can be seen with:
$ docker images
REPOSITORY TAG IMAGE ID
me/thing v2 f14aa8ac6bae
me/thing v1 c7d72ccc1d71
The image IDs are unique and thus the images are unique, even if they vary only in their tags.
You could say "docker should recognize them as being bit for bit identical" and thus treat them as cachable. But it doesn't (yet).
The only surprise for me in your example is that you got any duplicate image IDs at all.
Authoritative (if less explanatory) documentation can be found at docker in "Build your own images".
The process should work as you described. In fact we're building all of our images in this way without problems. Usually there are just a few changes to the topmost layers and only those are pushed to the registry - otherwise the whole concept of image layers would be useless.
See here for an example. Only the two topmost layers have changed, are pushed for :latest and for :4.0.2 there's no push at all. We're tagging images with git tags and for some projects we even tag images with git describe - to get the rollback functionality, just in case.
You can get the project source-code also from GitHub to try it out.
A few things to note about the setup: We're using a self-hosted GitLab CI with a customized runner which runs docker and docker-compose on an isolated host with Docker 1.9.1, but that should not make any difference.
There may also be differences in the registry version. I had the feeling (but I am not 100% sure) that some older repos on DockerHub are still running on registry v1 and newer ones always on v2, so you may try creating a new repo and see if the issue still occurs.
Please note that the behavior for tags described above applies only when pushing to the same image name. If you push the same image layers under another name, you always need to push all layers, even though they should already exist on the registry. So I assume repo/image:mytag1 and repoimage:mytag2 both actually go to repo/image and the missing slash is just a typo.
Another cause could be that your images are built on different hosts on Circle CI, but then you should also get different layer IDs, so I think this is not very likely.
I suggest building an image manually and trying to reproduce the problem, or contacting Circle CI about this issue.

Resources