What is docker's scratch image? - docker

I'm new to docker and I was trying out the first hello world example in the docs. As I understand the hello-world image is based on top of the scratch image. Could someone please explain how the scratch image works? As I understand it is essentially blank. How is the binary executed in the hello-world image then?

The scratch image is the most minimal image in Docker. This is the base ancestor for all other images. The scratch image is actually empty. It doesn't contain any folders/files ...
The scratch image is mostly used for building other base images. For instance, the debian image is built from scratch as such:
FROM scratch
ADD rootfs.tar.xz /
CMD ["bash"]
The rootfs.tar.xz archive contains the entire root filesystem. The Debian image adds this filesystem tree to the scratch image, which is empty.
As I understand it is essentially blank. How is the binary executed in
the hello-world image then?
The scratch image is blank. The hello-world executable added to the scratch image is statically compiled, meaning that it is self-contained and doesn't need any additional libraries to execute.
As stated in the official Docker docs:
Assuming you built the “hello” executable example from the Docker
GitHub example C-source code, and you compiled it with the -static
flag, you can then build this Docker image using: docker build --tag
hello .
This confirms that the hello-world executable is statically compiled. For more info about static compiling, read here.
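To make the static-linking point concrete, here is a minimal sketch rather than the official build. It assumes gcc and a static libc are installed; the directory scratch-demo, the tag hello-static, and the hello.c stand-in are made-up names for illustration, not the real hello-world source.

```shell
#!/bin/sh
# Sketch: a statically linked binary on top of the empty scratch image.
# Assumptions: scratch-demo/ and the tag hello-static are illustrative names.
mkdir -p scratch-demo

# A stand-in C program (not the official hello-world source).
cat > scratch-demo/hello.c <<'EOF'
#include <stdio.h>

int main(void) {
    puts("Hello from scratch");
    return 0;
}
EOF

# scratch contributes no files, so the image holds only /hello.
cat > scratch-demo/Dockerfile <<'EOF'
FROM scratch
COPY hello /hello
CMD ["/hello"]
EOF

# -static links libc into the binary, so it needs no shared libraries at run time.
if command -v gcc >/dev/null 2>&1; then
    gcc -static -o scratch-demo/hello scratch-demo/hello.c \
        || echo "static link failed (is a static libc installed?)"
fi

# Build and run only if Docker is available and the binary exists.
if command -v docker >/dev/null 2>&1 && [ -x scratch-demo/hello ]; then
    docker build --tag hello-static scratch-demo \
        && docker run --rm hello-static \
        || echo "docker build/run failed (is the daemon running?)"
fi
```

If the binary were dynamically linked instead, the container would fail to start, because scratch provides no dynamic loader or shared libraries.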

A bit late to the party, but adding to the answer of @yamenk.
Scratch isn't technically an image; it's merely a reference. Containers use the kernel of the host, so an image only has to supply files, and the processes in it rely on the system calls that kernel provides. Because everything in Linux is a file, you can add anything from a single self-contained binary to an entire operating system's userland as files in this filesystem.
This means that an image created from scratch starts with no files of its own: at run time it uses the kernel of the host system, and only the files you add on top are loaded. That's why FROM scratch is itself a no-op, and why adding just a single binary yields an image whose size is only the size of that binary plus a bit of overhead.
The resources you assign to a container when running an image are enforced through cgroups, and networking uses Linux network namespaces.

In short, the official scratch image contains nothing: zero bytes.
But a running container is more than its image. Even though the scratch image is empty, when a container runtime such as runC starts an instance from an image built from scratch, it needs more things (a rootfs, etc.) than what you can see in the Dockerfile.

Related

Purpose of FROM command - Docker file

Main purpose of Docker container is to avoid carrying guest OS in every container, as shown below.
As mentioned here, The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction.
My understanding is that FROM <image> allows a container to run on its own OS.
Why must a valid Dockerfile have a FROM instruction?
Containers don't run a full OS, they share the kernel of the host OS (typically, the Linux kernel). That's the "Host Operating System" box in your right image.
They do provide what's called "user space isolation" though. Roughly speaking, this means that every container manages its own copy of the part of the OS that runs in user mode; typically, that's a Linux distribution such as Ubuntu. In your right image, that would be contained in the "Bins/Libs" box.
You cannot leave out the FROM line entirely (only ARG instructions and parser directives may precede it), but you can use FROM scratch to start from an empty base image, then add all the user-mode pieces yourself on top of the host's kernel.
Another common use of FROM is to chain builds together to form a multi-stage build of smaller images.
This would be useful for instance to limit redundant rebuilding during failed auto-builds.
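As a rough illustration of chaining FROM instructions, the sketch below writes a two-stage Dockerfile: the first stage compiles a static binary on alpine, the second starts from scratch and copies only that binary. The directory multistage-demo, the stage name build, and the tag hello-multistage are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Sketch of a multi-stage build; all names here are illustrative assumptions.
mkdir -p multistage-demo

# Tiny stand-in program to compile in the first stage.
printf '%s\n' \
    '#include <stdio.h>' \
    'int main(void) { puts("built in stage one"); return 0; }' \
    > multistage-demo/hello.c

cat > multistage-demo/Dockerfile <<'EOF'
# Stage 1: full toolchain, used only to compile.
FROM alpine AS build
RUN apk add --no-cache gcc musl-dev
COPY hello.c /src/hello.c
RUN gcc -static -o /hello /src/hello.c

# Stage 2: empty base; only the compiled binary is carried over.
FROM scratch
COPY --from=build /hello /hello
CMD ["/hello"]
EOF

# Build only when Docker is available.
if command -v docker >/dev/null 2>&1; then
    docker build --tag hello-multistage multistage-demo \
        || echo "docker build failed (is the daemon running?)"
fi
```

The final image contains none of the compiler packages from the first stage, which is where the size reduction comes from.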
The FROM instruction specifies the base image, and with it the underlying OS userland, that you are going to build your image on. You have to use some form of base image to get started. It can be ubuntu, centos, or a minimal Linux image like Alpine, which is only about 5 MB. The idea is to install only the packages you need rather than having everything bundled and packaged as a distribution. This makes Docker images very small compared to a full-blown OS distribution. I hope this answers your question. Let me know if you have any questions.

Override a volume when Building docker image from another docker image

sorry if the question is basic but would it be possible to build a docker image from another one with a different volume in the new image? My use case is the following:
Start From image library/odoo (cfr. https://hub.docker.com/_/odoo/)
upload folders into the volume "/mnt/extra-addons"
build a new image, tag it then put it in our internal image repo
how can we achieve that? I would like to avoid putting extra folders into the host filesystem
thanks a lot
This approach seems to work best until the Docker development team adds the capability you are looking for.
Dockerfile
FROM percona:5.7.24 as dbdata
MAINTAINER monkey@blackmirror.org
FROM centos:7
USER root
COPY --from=dbdata / /
Do whatever you want. COPY --from copies only the files, not the source image's metadata, so this eliminates the VOLUME issue. Heck, maybe I'll write a tool to automatically do this :)
You have a few options that don't involve the host OS that runs the container.
1. Make your own Dockerfile, inherit from the library/odoo Docker image using a FROM instruction, and COPY files into the /mnt/extra-addons directory. This still involves your host OS somewhat, but may be acceptable since you wouldn't necessarily be building the Docker image on the same host you were running it.
2. Make your own Dockerfile, as in (1), but use an entrypoint script to download the contents of /mnt/extra-addons at runtime. This would increase your container startup time since the download would need to take place before running your service, but no host directories would need to be involved.
Personally I would opt for (1) if your build pipeline supports it. That would bake the addons right into the image, so the image itself would be a complete, ready-to-go build artifact.
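Option (1) might look roughly like the sketch below. The directory odoo-demo, the local extra-addons folder, and the registry name registry.example.com/odoo-custom:1.0 are placeholders, not values from the question. It is worth verifying on your Docker version that files copied into a path the parent image declares as a VOLUME end up in the image as expected.

```shell
#!/bin/sh
# Sketch of option (1): bake the addons into a derived image.
# All names below are hypothetical placeholders.
mkdir -p odoo-demo/extra-addons
touch odoo-demo/extra-addons/.keep   # stand-in for your real addon folders

cat > odoo-demo/Dockerfile <<'EOF'
FROM odoo
COPY extra-addons/ /mnt/extra-addons/
EOF

# Build and tag only when Docker is available.
if command -v docker >/dev/null 2>&1; then
    docker build --tag registry.example.com/odoo-custom:1.0 odoo-demo \
        || echo "docker build failed (is the daemon running?)"
fi
# Afterwards, docker push registry.example.com/odoo-custom:1.0 would publish
# the image to the internal registry (assumed name).
```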

How can I see Dockerfile for each docker image?

I have the following docker images.
$ docker images
REPOSITORY        TAG      IMAGE ID       CREATED         SIZE
hello-world       latest   48b5124b2768   2 months ago    1.84 kB
docker/whalesay   latest   6b362a9f73eb   22 months ago   247 MB
Is there a way I can see the Dockerfile of each docker image on my local system?
The answer at Where to see the Dockerfile for a docker image? does not help me because it does not exactly show the Dockerfile but the commands run to create the image. I want the Dockerfile itself.
As far as I know, no, you can't. Because a Dockerfile is used for building the image, it is not packed with the image itself. That means you would have to reverse engineer it. You can use docker inspect on an image or container to get some insight into how it is configured. The layers of an image are also visible, since you pull them when you pull a specific image, so that is no secret either.
However, you can usually see the Dockerfile in the repository of the image itself on Docker Hub. I can't say that most repositories have Dockerfiles attached, but most of the repositories I have seen do.
Different repository maintainers may opt for different ways to document the Dockerfiles. You can see the Dockerfile tab on the repository page if automatic builds are set up. But when multiple parallel versions are available (like for Ubuntu), maintainers usually opt to put links to the Dockerfiles for different versions in the description. If you take a look here: https://hub.docker.com/_/ubuntu/, under the "Supported tags" (again, for Ubuntu), you can see there are links to multiple Dockerfiles, one for each Ubuntu version.
Because images are pulled directly from Docker Hub, only the image itself is downloaded to your machine. If you want to see the Dockerfile, go to Docker Hub and search for the image name and tag (e.g. ubuntu:14.04); the image page shows the Dockerfile details when they are available. Keep in mind that you can only see the Dockerfile if the owner of the image shared it. Many images do not publish one, although official images usually link to theirs.
Hope it helps!
You can also regenerate the Dockerfile from an image, or use the docker history <image name> command to see the commands that created each layer.
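A small wrapper around docker history could look like the sketch below. The function name image_steps is made up for this example; --no-trunc and --format are existing docker history options. The output lists the newest step first and covers only what the image history recorded, so it approximates the Dockerfile rather than reproducing it.

```shell
#!/bin/sh
# image_steps is a hypothetical helper name.
# Prints the command recorded for each history entry, newest first.
image_steps() {
    docker history --no-trunc --format '{{.CreatedBy}}' "$1"
}

# Example call, guarded so the script also runs where Docker is absent.
# The image must already be present locally.
if command -v docker >/dev/null 2>&1; then
    image_steps hello-world || echo "could not read history (is the image pulled?)"
fi
```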
check this: Link to answer
TL;DR
So if you have a docker image that was built by a dockerfile, you can recover this information (All except from the original FROM command, which is important, I’ll grant that. But you can often guess it, especially by entering the container and asking “What os are you?”). However, the maker of the image could have manual steps that you’d never know about anyways, plus they COULD just export an image, and re-import it and there would be no intermediate images at that point.
One approach could be to save the image to a tar file, then extract it and check whether a Dockerfile can be found in any of the layers.
docker image save -o hello.tar hello-world
This will output a hello.tar file.
hello.tar is the output archive (a plain, uncompressed tar) and hello-world is the name of the image you are saving.
After that, extract the archive and explore the image layers. You may find a Dockerfile in one of them, but only if a build step copied it into the image (for example, by copying the whole build context).
However, there is one thing to note: if the image was built with the Dockerfile listed in .dockerignore, the Dockerfile never entered the image and you will not find it with this approach.
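The save-and-extract approach can be scripted roughly as follows. This is a sketch under assumptions: find_dockerfiles is a hypothetical helper, hello-extracted is an arbitrary directory name, and the scan simply tries to list every extracted file as a tar archive because the layer layout differs between Docker versions.

```shell
#!/bin/sh
# find_dockerfiles is a hypothetical helper: it treats every file under the
# given directory as a possible layer archive and lists entries named Dockerfile.
find_dockerfiles() {
    find "$1" -type f | while IFS= read -r f; do
        # Files that are not tar archives (e.g. JSON configs) are skipped silently.
        tar -tf "$f" 2>/dev/null | grep -i 'dockerfile$' | while IFS= read -r hit; do
            printf '%s: %s\n' "$f" "$hit"
        done
    done
}

# Save, unpack, and scan the image; guarded so the script runs without Docker.
if command -v docker >/dev/null 2>&1; then
    docker image save -o hello.tar hello-world \
        && mkdir -p hello-extracted \
        && tar -xf hello.tar -C hello-extracted \
        && find_dockerfiles hello-extracted \
        || echo "could not save or unpack the image"
fi
```

An empty result means no layer contains a file named Dockerfile, which is the common case.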

Different images in containers

I want to create separated containers with a single service in each (more or less). I am using the php7-apache image which seems to use a base image of debian:jessie, php7 and apache. Since apache and php in this case are pretty intertwined I don't mind using this container.
I want to start adding other services to their own containers (git for example) and was considering using a tiny base image like busybox or alpinebox for these containers to keep image size down.
That said, I have read that using the same base image as other containers only gives you the 'penalty' of the one time image download of the base OS (debian jessie) which is then cached - while using tiny OSes in other containers will download those OSes on top of the base OS.
What is the best practice in this case? Should I use the same base image (debian jessie) for all the containers in this case?
You may want to create a base image from scratch.
From the Docker documentation:
You can use Docker’s reserved, minimal image, scratch, as a starting point for building containers. Using the scratch “image” signals to the build process that you want the next command in the Dockerfile to be the first filesystem layer in your image.
While scratch appears in Docker’s repository on the hub, you can’t pull it, run it, or tag any image with the name scratch. Instead, you can refer to it in your Dockerfile. For example, to create a minimal container using scratch:
FROM scratch
ADD hello /
CMD ["/hello"]
This example creates the hello-world image used in the tutorials. If you want to test it out, you can clone the image repo

How to see tree view of docker images?

I know Docker has deprecated the --tree flag of the docker images command, but I could not find any handy command that gives the same output as docker images --tree. I found dockviz, but it seems to be another container to run. Is there any built-in CLI command to see a tree view of images without using dockviz?
Update Nov. 2021: for public online images, there is the online service contains.dev.
Update Nov. 2018, docker 18.09.
You now have wagoodman/dive, A tool for exploring each layer in a docker image
To analyze a Docker image simply run dive with an image tag/id/digest:
dive <your-image-tag>
or if you want to build your image then jump straight into analyzing it:
dive build -t <some-tag> .
The current (Sept 2015, docker 1.8) workaround mentioned by issue 5001 remains dockviz indeed:
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nate/dockviz images -t
The -t flag prints the tree as text, so you can stay in the CLI (no graphics needed).
Update Sept. 2016 (post docker 1.10: docker 1.11 soon 1.12), one year later, as mentioned in the same issue 5001, by Michael Härtl:
Since 1.10 the way layer IDs worked has changed fundamentally. For a lengthy explanation of this topic see #20399. There's also #20451 but I'm not sure, if this could be used by the nate/dockviz image.
Personally I find the way the new layers work very very confusing and much less transparent than before. And it's not really well documented either.
AFAIK @tonistiigi's comments in the issue above are the only public explanation available.
Tõnis Tiigi:
Pre v1.10 there was no concept of layers or the other way to think about it is that every image only had one layer. You built a chain of images and you pushed and pulled a chain. All these images in the chain had their own config.
Now there is a concept of a layer that is a content addressable filesystem diff. Every image configuration has an array of layer references that make up the root filesystem of the image and no image requires anything from its parent to run. Push and pull only move a single image, the parent images are only generated for a local build to use for the cache.
If you build an image with the Dockerfile, every command adds a history item into the image configuration. This stores the command so you can see it in docker history. As this is part of image configuration it also moves with push/pull and is included in the checksum verification.
Here are some examples of content addressable configs:
https://gist.github.com/tonistiigi/6447977af6a5c38bbed8
Terms in v1.10: (the terms really have not changed in implementation but previously our docs probably simplified things).
Layer is a filesystem diff. Bunch of files that when stacked on top of each other make up a root filesystem. Layers are managed by graphdrivers, they don't know anything about images.
Image is something you can run and that shows up in docker images -a. Needs to have a configuration object. When container starts it needs some kind of way to generate a root filesystem from image info. On build every Dockerfile command creates a new image.
You can refer to the more recent project TomasTomecek/sen, which:
had to understand 1.10 new layer format (commit 82b224e)
include an image tree representation:

Resources