Reducing image size and layers in Dockerfile - docker

I have a Docker image layer stack that looks like this:
My question is: why did the last command alone add 400+ MB, and why are there two CMD commands?
The first CMD command comes from the base image and the second one is mine; I assumed that only my CMD command would be present. I started from this image and added the installation of a specific Python version and a few additional tools.
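One way to see why both CMD entries show up: the layer listing is a history of every instruction, including those of the base image, while only the last CMD takes effect when a container starts. The toy Python model below sketches that behaviour. The instructions and sizes are invented for illustration and are not taken from the image in the question.

```python
# Toy model: an image's history is the base image's instructions followed by
# the ones added on top. Instructions and sizes (in MB) are hypothetical.
base_history = [
    ("ADD rootfs.tar.xz /", 120),
    ('CMD ["bash"]', 0),                      # metadata only, adds no files
]
my_history = [
    ("RUN install-python-and-tools", 400),    # placeholder for the real RUN line
    ('CMD ["python3"]', 0),
]


def effective_cmd(history):
    """Return the last CMD in the history; earlier ones are overridden."""
    cmds = [instr for instr, _ in history if instr.startswith("CMD ")]
    return cmds[-1] if cmds else None


def total_size(history):
    """Image size is the sum of all layer sizes, base image included."""
    return sum(size for _, size in history)


history = base_history + my_history
for instr, size in history:
    print(f"{size:>4} MB  {instr}")
print("effective:", effective_cmd(history))
print("total MB:", total_size(history))
```

In this model a CMD line is a zero-size metadata entry, and the size shown beside a RUN line is whatever files that command wrote to the filesystem.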

Related

Dockerfiles sharing the same instructions but built from a different image

What would be the best practice when I need to create multiple docker images that share the same instructions EXCEPT the FROM image?
For example, I want to build 3 different images: a Java stack, a Python stack, and a Rust stack. So I have 3 Dockerfiles, each referencing a different FROM image. Then, in each of these Dockerfiles, I have a long list of instructions that are exactly the same. I would rather not duplicate the instructions.
If only the image name changes, you can pass it in as a build argument.
Dockerfile:
ARG img
FROM $img
# Re-declare the argument: an ARG defined before FROM is not visible after it
ARG img
RUN echo "Building $img"
Then run build command on terminal:
sudo docker build . --build-arg img=busybox
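To build the same Dockerfile against several base images, the build command above can be generated in a loop. This is a minimal Python sketch: the tag names are hypothetical, and the base images are just ones mentioned elsewhere on this page. It only prints the commands and does not run them.

```python
import shlex


def build_command(base_image, tag, context="."):
    """Return the argv for one `docker build` that passes the base image as `img`."""
    return ["docker", "build", context, "--build-arg", f"img={base_image}", "-t", tag]


# Hypothetical tag -> base image mapping; replace with your own stacks.
stacks = {
    "stack-a": "busybox",
    "stack-b": "debian:jessie",
    "stack-c": "python:2.7",
}

commands = [build_command(base, tag) for tag, base in stacks.items()]
for argv in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Each argv list could also be passed to subprocess.run to perform the builds from the script.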

Dockerfile: Is it used to create an image, or to tell the Docker host how to create the container?

I am confused with some terms.
Is a Dockerfile designed to create an image, or is it a set of instructions for how to create a container from an image?
There are commands such as FROM (to get the base image) and RUN (to run an executable in the container). These commands look like instructions for how to create the container.
Docker images are static, and are built from the instructions specified in the Dockerfile. They use Union File-System (UnionFS), so that the changes made when building an image are stacked on top of each other, generating a DAG (Directed Acyclic Graph) of build history. The FROM directive at the top of the Dockerfile simply points to an existing image, and starts building on top of that.
A container is simply an instantiated version of an image, basically just this UnionFS with a read/write layer dropped on top of it.
Interestingly, if you watch the output when you run docker build (in a directory with a Dockerfile) you'll see that what is happening is each instruction starts up a container based on the current state of the image, runs the command (apt-get install ... or whatever) and then commits that change to the image. That's why it's good to batch up commands in a Dockerfile - because each one will start a new container.
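A related reason to batch commands is image size: because every instruction is committed as its own layer, files deleted by a later instruction still occupy space in the earlier layer. The toy Python model below illustrates this. It assumes a layer's size is simply the total size of the files that are new when its instruction finishes, and the paths and sizes are invented.

```python
def build(instructions):
    """Return per-layer sizes in MB.

    Each instruction is a list of operations that run in one container and
    are then committed as a single layer.
    """
    fs = {}        # filesystem visible so far: path -> size in MB
    sizes = []
    for ops in instructions:
        before = set(fs)
        for op, path, *size in ops:
            if op == "add":
                fs[path] = size[0]
            elif op == "rm":
                fs.pop(path, None)
        # The layer stores only files that did not exist before this step.
        # Removing a file from a lower layer does not shrink that layer.
        sizes.append(sum(s for p, s in fs.items() if p not in before))
    return sizes


install = [("add", "/var/cache/pkgs", 300), ("add", "/usr/bin/tool", 50)]
cleanup = [("rm", "/var/cache/pkgs")]

separate = build([install, cleanup])     # two RUN instructions
batched = build([install + cleanup])     # one RUN joined with &&

print(separate, sum(separate))
print(batched, sum(batched))
```

With separate instructions the cache stays in the first layer, so the total remains 350 MB. When the cleanup runs in the same instruction, only the 50 MB tool is committed.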
A Dockerfile is used to create an image with docker build; you can later use that image to create a container with docker run.
From the docs
Docker can build images automatically by reading the instructions from a Dockerfile. A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build users can create an automated build that executes several command-line instructions in succession.
Also, the RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile, so RUN does not mean "run an executable in the container". For details see this.
Image:
The Docker engine builds an image from the instructions in a Dockerfile (FROM, RUN, etc.).
Container:
The Docker engine starts a container from an image; a container is a runtime instance of an image.

Docker create independent image

Hi, I built a small Docker image on top of the debian:jessie image from Docker Hub.
First I got debian:jessie from Docker Hub:
docker pull debian:jessie
Then I started this image with bash:
docker run -it debian:jessie
Then I installed my stuff, e.g. an SSH server, and configured it.
Next, from a second shell, I committed the changes:
docker commit <running container id> debian-sshd
Now I have two images:
debian:jessie and debian-sshd
If I now try to delete debian:jessie, Docker tells me I can't delete it because it has child images (debian-sshd).
Is there a way I can make debian-sshd an independent image?
Most Dockerfiles start from a parent image. If you need to completely control the contents of your image, you might need to create a base image instead. Here’s the difference:
A parent image is the image that your image is based on. It refers to the contents of the FROM directive in the Dockerfile. Each subsequent declaration in the Dockerfile modifies this parent image. Most Dockerfiles start from a parent image, rather than a base image. However, the terms are sometimes used interchangeably.
A base image either has no FROM line in its Dockerfile, or has FROM scratch.
Having quoted from the docs, I would say that images are made up of layers, and since you have based your image on debian:jessie, the layers of debian:jessie are also layers of debian-sshd. If you want an independent image, build from scratch.
Other than that, many images on Docker Hub publish their Dockerfiles, so you can browse the Dockerfile and modify it to suit your needs. You could also build from scratch if you want your own base image.

Is it possible to use a "blank" docker container without any install on it?

I'm new to Docker, and I think I have understood that Docker is a software virtualization tool (as opposed to OS virtualization). I understand, from this image, that Docker provides a blank environment with a given file structure and executes on the host kernel. What we need to do is add our application and its dependencies (with no OS) to get a very light, portable container for our app.
But it seems there is a dark side to Docker: each Dockerfile begins with a "FROM ".
I saw this and this, but I'm not sure I understand. It sounds as if Docker is close to a kind of simplified OS virtualizer.
I was interested in the advantage of small image sizes. But if we have to install an OS in each image, my "portable" application will quickly become quite heavy.
Is there really no way to use a "blank image"?
You can start with FROM scratch which is an empty filesystem.
Please see the section on Creating a Base Image if you'd like to spin up your own minimal root file system.
You might be surprised how many dependencies your application actually has on the root file system, and in the end, it is usually more efficient to use one of the standard root file systems in your FROM statement, as Charles Duffy commented above.
empty/Dockerfile
FROM scratch
WORKDIR /
build and check size
docker build empty/ -t empty
docker images | grep empty
This may be a bit too late, but I just had a use case where I needed a bare-bones container that I could launch as part of a multi-container docker-compose setup and then get into via /bin/bash. Keep in mind that a Docker container must run a process, and the container exists only for as long as that process is running. So I created this container with just Python in it and copied in a two-line Python script that simply sleeps. Here's what I did.
1. Create the python script wait_service.py with the following code:
import time
time.sleep(1000)
2. Create the Dockerfile with just the following lines:
FROM python:2.7
RUN mkdir -p /test
WORKDIR /test
COPY wait_service.py /test/
CMD python wait_service.py
3. Build and run the container. Using the container id, I could then get inside it. Please adjust the sleep time based on how long you want to keep this container.
Your application needs some underlying userland (libraries, a shell, and so on) unless it is a statically linked binary; without it, the application cannot start.
I think the most basic one in the Docker index is busybox, so FROM busybox will give you a very minimal setup.
Docker also shares and caches layers. Every image that uses FROM centos:centos7 at the top reuses a single stored copy of the minimal centos7 layers.
The base images are very minimal, so it is nothing to worry about.
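The effect of sharing a base image can be sketched with a toy Python model in which each image is a list of layer identifiers and a layer is stored once no matter how many images reference it. The image names, layer names, and sizes below are hypothetical.

```python
# Hypothetical images, each listed as its layers from the base upwards.
images = {
    "app-a": ["centos7-base", "app-a-files"],
    "app-b": ["centos7-base", "app-b-files"],
    "app-c": ["centos7-base", "app-c-files"],
}

# Hypothetical layer sizes in MB.
layer_size_mb = {
    "centos7-base": 200,
    "app-a-files": 10,
    "app-b-files": 20,
    "app-c-files": 30,
}


def disk_usage(images, layer_size_mb):
    """Sum each distinct layer once, as a layer store with sharing would."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(layer_size_mb[layer] for layer in unique_layers)


def naive_usage(images, layer_size_mb):
    """Sum every image in full, as if nothing were shared."""
    return sum(layer_size_mb[layer] for layers in images.values() for layer in layers)


print("with sharing:", disk_usage(images, layer_size_mb), "MB")
print("without sharing:", naive_usage(images, layer_size_mb), "MB")
```

In this model the three images cost 260 MB on disk instead of 660 MB, because the 200 MB base layer is counted only once.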

Why do you need a base image with Docker?

I have gone through every single page of the Docker documentation.
Yet I still do not understand why a "base image" (for example, the Ubuntu Base Image) is necessary to furnish the containers before installing or creating an application environment.
My questions:
What is a base image and why is it required?
Why is it not possible to just create a container and put the application in it, similar to Python's virtualenv?
In fact, Docker works through application of layers that are added to the base image. As you have to maintain coherence between all these layers, you cannot base your first image on a moving target (i.e. your writable file-system). So, you need a read-only image that will stay forever the same.
Here is an excerpt of the documentation of Docker about the images:
Since Docker uses a Union File System, the processes think the whole file system is mounted read-write. But all the changes go to the top-most writable layer, and underneath, the original file in the read-only image is unchanged. Since images don’t change, images do not have state.
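The read-only layers plus a writable top layer can be mimicked in a few lines of Python with collections.ChainMap, which looks keys up through a stack of dictionaries but sends all writes to the first one. This is only an analogy for the behaviour described above, not how Docker is implemented, and the file names and contents are invented.

```python
from collections import ChainMap

# Image layers, lowest first. These stand in for the read-only image.
image_layers = [
    {"/etc/os-release": "base"},
    {"/usr/bin/redis-server": "binary"},
]


def start_container(layers):
    """Stack a fresh, empty writable layer on top of the image layers."""
    # ChainMap searches its maps left to right, so the top layer comes first.
    return ChainMap({}, *reversed(layers))


first = start_container(image_layers)
second = start_container(image_layers)

# Reads fall through to the image layers.
print(first["/usr/bin/redis-server"])

# A write lands only in this container's own top layer.
first["/etc/os-release"] = "edited in container"

print(first["/etc/os-release"])          # the container sees its own change
print(second["/etc/os-release"])         # another container still sees the image
print(image_layers[0]["/etc/os-release"])  # the image itself is unchanged
```

Because the image dictionaries are never modified, any number of containers can be started from the same layers, which matches the statement that images do not have state.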
An image is just a snapshot of a file system and dependencies, or a specific set of directories for a particular application. By snapshot I mean a copy of just those files that are required to run that piece of software (for example mysql or redis) with a basic configuration in a container environment. When you create a container from an image, a small section of your system's resources is isolated with the help of namespaces and cgroups, and the files inside the image are then made available in this isolated environment.
Let us understand what is a base image:
A base image is a starting point or an initial step for the image that we finally want to create.
Suppose you want an image that runs redis. (This is a silly example and you can achieve it another way, but for the sake of explanation assume you will not find that image on Docker Hub.) You would need a starting point to create that image, so let us take the Alpine image as a base image.
Alpine is a very small image that contains just enough files to run basic commands (for example ls, cd, and apk add inside the container).
Create a Dockerfile with the following commands:
FROM alpine
RUN apk add --update redis
CMD ["redis-server"]
Now when you run the docker build . command, it gives the following output:
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM alpine
---> a24bb4013296
Step 2/3 : RUN apk add --update redis
---> Running in 535bfd2d1ff1
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/1) Installing redis (5.0.9-r0)
Executing redis-5.0.9-r0.pre-install
Executing redis-5.0.9-r0.post-install
Executing busybox-1.31.1-r16.trigger
OK: 7 MiB in 15 packages
Removing intermediate container 535bfd2d1ff1
---> 4c288890433b
Step 3/3 : CMD ["redis-server"]
---> Running in 7f01a4da3209
Removing intermediate container 7f01a4da3209
---> fc26d7967402
Successfully built fc26d7967402
This output shows that Step 1/3 takes the base alpine image, Step 2/3 adds a layer containing redis on top of it, and Step 3/3 records redis-server as the command to execute whenever a container is started from the image. The RUN command is executed only while the image is being built.
Further explanation of output is out of the scope of this question.
So when you pull an image from Docker Hub, it contains just the configuration needed to run the basic requirements. When you need to add your own requirements and configuration to an image, you create a Dockerfile and add dependencies layer by layer on top of a base image until it runs according to your needs.
In simple words: just as we use libraries and node packages in our applications, we can use ready-made base images, which are easy to find with a simple search. You can also define your own base image and use it.
From Docker docs,
"A container is nothing but a running process, with some added encapsulation features applied to it in order to keep it isolated from the host and from other containers.
One of the most important aspects of container isolation is that *each container interacts with its own private filesystem; this filesystem is provided by a Docker image (like image of any Linux OS - which is also the Base image)." The final image may include multiple layers which are just some other filesystem changes. Like for running a Java application, you put on a JDK layer on top of the Base Linux image.
*Credits: Image taken from Educative.io

Resources