Why provide a Linux distro as a Dockerfile base when my host has all the software I need installed? - docker

I want to start writing a Docker image. I have a .net Core 2.0 Web Api service that I have deployed to an Amazon Linux machine. It runs fine, but I would like to automate the build and deployment process a bit.
As far as I am concerned, there is no need for a Parent image for the image I need to build. I might grab some files from a location, run some dotnet CLI commands, and run the service using Apache as a reverse proxy. I dont really see the need for a parent image in any of that.
I am asking this question because most of the examples I have seen include a base image. Most of the time its something very generic, like "From Ubuntu". I have read that most images will include a parent image. According to Docker's documentation:
A parent image is the image that your image is based on. It refers to the contents of the FROM directive in the Dockerfile. Each subsequent declaration in the Dockerfile modifies this parent image. Most Dockerfiles start from a parent image, rather than a base image. However, the terms are sometimes used interchangeably.
What exactly is the point of inheriting from Ubuntu? Even the Docker docs suggest using Debian "since it’s very tightly controlled and kept minimal". Does that just ensure that your Linux machine has an Ubuntu distribution? Does it even matter if I am using Amazon Linux but use the Debian image as my base?

A Docker image runs in a set of filesystem namespaces which are unconnected from the host's except where you've chosen to bind-mount a volume. This means that tools installed on the host are unavailable to the container: Just because the host runs Amazon Linux doesn't mean that the userspace commands Amazon Linux provides (and the libraries those commands use to run) are available to the guests.
Without a Linux distro available inside the container, you wouldn't have a package management tool (yum, apt-get, etc) with which to install the tools you need to download a file, run software (that presumably needs to be linked to a libc, a copy of OpenSSL, or other shared components). There are also runtime parts of a working Linux system such as the resolver that are provided in userland by your distro and not shared from the host in a Docker install.
Using a base image ensures that you have tools available inside your container -- and it ensures that that container will work consistently on any Linux system with a compatible kernel and hardware architecture.
It's possible in theory to bind-mount many of the tools from the host (as by exposing all of /usr as a volume), but doing so would defeat many of the advantages Docker offers in portability.

Related

Does Docker image include OS?

I have below Dockerfile"
FROM openjdk:12.0.2
EXPOSE 8080
ADD ./build/libs/*.jar app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
The resulting Docker image encapsulates Java program. When I deploy this Docker image to Windows Server or Linux, does the image always include OS like Linux which runs on top of host OS (Windows Server or Linux) ?
I am asking this question in the sense of Docker image being physical box which contains other boxes (one being openjdk), does this box also contain Linux OS box that I can pull out of it ( assuming if this was possible) and install it as Linux OS on empty machine?
That depends on what you call the "OS". It will always contain stuff from the distribution image, it is built on.
For example, a debian based images will include apt and other debian-specific tools. But most of the stuff you need on a "complete" machine (as in non-container), will have been removed to keep the image as small as possible.
It will not contain the kernel, as it is running on the host machine and is controlled by the host's kernel.
The "official" OpenJDK images from the Docker Hub are available in variants based on a number of different Linux distributions. There is a cut-down Debian, an Alpine, and others. There are advantages and disadvantages to each.
The image will need to contain enough operating system dependencies to allow the JVM to run. It may also include basic diagnostic and management tools -- enough to carry out rudimentary troubleshooting in the container, anyway. You can expect all the images to contain at least basic console shell tools like "cp" and "cat", although they differ in implementation. For example, the Alpine variant gets these utilities from BusyBox, not from a conventional GNU/Linux installation.
It's possible to create a Docker image that contains no platform dependencies at all, but there's little incentive to be that minimal -- you'd just have to build more stuff into the application program itself.
It doesn't include the entire operating system, but the image will be dependent on either linux or windows, you can't build an image that runs on both in one Dockerfile.
The reason for the dependency is that a docker container shares resources with it's host machine in a carefully fenced off way, this mechanism is different on windows and linux (though to you, as a docker user, the difference is invisible).

does docker always need an operating system as base image

I have heard that docker doesn't need a separate os in linux, because it shares with the host os, but in hyper-v Windows it can run Windows OS because it can hyper a linux virtual machine so run linux software on it.
But, I get confused about the FROM stage in the dockerfile, all guides said like this:
FROM ubuntu:18.04
cp . /usr/local/bin
RUN make
CMD /usr/local/bin/youapp
I can understand this step, first you need an OS, then you deploy your application; finally you run your app or whatever.
But what does the FROM stage really mean?
Does it always need an OS? Does nginx docker image have an os in it?
If i want to build my own app, I write it, I compile it, I run it; but does my own app need an OS? If not, what should I write in the FROM stage?
i got this picture, it said docker container does not need os,but use the host os,now docker build always need an os
The containers on a host share the (host's) kernel but each container must provide (the subset of) the OS that it needs.
In Windows, there's a 1:1 mapping of kernel:OS but, with Linux, the kernel is bundled into various OSs: Debian, Ubuntu, Alpine, SuSE, CoreOS etc.
The FROM statement often references an operating system but it need not and it is often not necessary (nor a good idea) to bundle an operating system in a container. The container should only include what it needs.
The NGINX image uses Debian (Dockerfile).
In some cases, the container process has no dependencies beyond the kernel. In these cases, a special FROM: scratch may be used that adds nothing else. It's an empty image (link).
No its not like that. To create any docker image using DockerFile, You need to start with a base docker image. That base docker image can be anything, Like an empty image as well, In the docker file in your example the FROM section says ubuntu, it means its assuming ubuntu as the base image. Its not always needed to have an OS as base image.
Follow this link - https://linuxhint.com/create_docker_image_from_scratch/
This will clear your doubts related to base image.
now i got answer
the From stage import the software but not the OS with kernel
it just provide a platform for your application,the ubuntu,debian,centos you write in FROM stage is just a software,the true kernel does not have relationship with them.
so if your application can run dependent ,it must like hello-world ,just a binary-package,dont rely on any other library. but mostly you need an OS,because they have the library you need.
No, the FROM stage is not providing the operating system to the image. The kernel is always provided by the host system where you are running the container. The FROM stage provides the initial file system i.e., files, directories, pre-installed softwares etc for the new image. You can also start FROM scratch which is like a blank slate.
The FROM line need NOT necessarily point to any other OS:
It can be any other container or it could be FROM SCRATCH.
Containers in host share kernel so you can think as it is master process utilizing host kernel.
Generally people see HTTPD, NGINX etc. are utilizing Debian as container OS, since this Debian OS is very thin and serves the purpose of isolation and runs as independent server.
Even you can create a HTTPD, NGINX without using any OS and name with your own version :-)

Docker base images, what do they compose of?

I am trying to wrap my head around the Docker architecture, in particular figuring out what exactly a base image consists of, and in doing I have been exploring some of the images found on the docker hub. Specifically when looking at the following repo it references the centos-7.2.1511-docker.tar.xz file.
I've downloaded and examined the contents of the tar and it has your typical Linux filesystem.
As I understand it, this is not a complete Linux OS and is just a replica of a linux filesystem with all the non essentials removed? Where all other requirements are drawn from the Host OS when a container is run(?)
My question essentially boils down to how one would go about creating that tar file? What exactly do you need. My intention is not to create one but rather understand what portion of files/data/dependencies come from a target OS to create an image and what gets used on the Host OS
A Docker container is a set of processes, running a sandbox enabled by Linux namespaces, on top of the host kernel.
A Docker image is a set of layers, which are often simply tarballs, of files that are unpacked, and made to look as if they are the root of the filesystem when used to start a container.
A Docker image could be just a single statically-linked executable! You can create your own Docker image from scratch by simply creating a tarball of a single executable, and giving it to docker load which wI'll store it as the appropriate internal format and register it as an image.
As you can see then, a Docker image need not be much. It certainly doesn't need a kernel, or any of the components normally used for configuring the system, networking daemons, or even things like cron. Those are all left to the host.
Things that are usually available in an image are a dynamic library runtime, and files like /etc/hosts, /etc/resolv.conf, and other files which are referenced directly by libc. This allows you to add typical dynamically-linked executables which interact with the system as if they're running on a traditionalal OS.
I have successfully "Dockerized" a legacy CentOS 6-based VM by uninstalling as many packages as possible, then tar-ing up the filesystem (excluding directories like /proc, /sys, /dev, etc.) and loading this via docker load. Afterwards, I started a container and (sometimes forcefully) removed additional "system" packages that serve no purpose in a Docker image, like kernel, udev, etc.
This blog post goes into some of the specifics of docker load:
http://tuhrig.de/difference-between-save-and-export-in-docker/

what should be packed as a docker image?

I just get started with docker.
I'm really confused what should be packed as a docker image?
In docker hub, I can find a complete OS as a docker image: ubuntu, centos...
as well as popular platform and database like: nodejs, mongodb...
It seems to me docker hub is just like a software repository.
should everything be packed as an image? how about just a command line tool like: ls, cd, git??? they are also software, what is qualified to be a docker image??
please help clarify
Docker will provide some isolation between the programs in the image, and your host environment. In a docker image, one may package anything from a single binary, to a full environment (everything but the linux kernel).
See it as a convenience. It provides a convenient form to deploy programs that require a complex environment that may conflict with the programs installed on the host. For instance, if you're trying to package a webapp (e.g. some blog software), the docker container will allow you to ship your application code, along with a tested version of its interpreter (php, python, etc), a compatible webserver, and maybe a database environment -- together. From the perspective of the user installing your container/app, nothing else than the container is needed to run the app. It's all self-contained, and it's simpler than setting up a virtual machine.
If your binaries in the image depend on an ls command, then you'd include that as well. Generally, in the image, you include a binary (the entry point), as well as all its dependencies.
If you're familiar with chroots, you may see a docker image as a fancy chroot where the network, and process address space are also isolated, in addition to the file system.
You can think of dockerhub as an app-store.

Docker, I have one folder that contains the application server. What can be used as a container?

I want to ask, if I have one folder that contains the application server (Axis2, Tomcat, WSO2, mongodb, and jms-consumer) What can be used as a container?
Is Docker as an application installer? Which classifies the entire application so 1 is then used as installer file, for example: server.exe for windows, server.deb for ubuntu
Could help to explain it?
Docker as an application installer?
No, docker is a a platform which manages containers (isolated user/process/disk machines running with the host kernel), around building, shipping and running (Containers as a Service).
The best practice is to isolate each part of your global service in its own container, both because of the PID1 zombie reaping issue (detailed in "Use of Supervisor in docker"), but also in term of ease of management and update.
If each component only represents a Tomcat, a MongoDB, a..., each one is easier to manage/debug, instead of having one giant container.
Also you can stop/update one without necessarily ipacting all the other ones.
The installation-like part is rather the description of your environment (both in term of OS and of applications you want to add to a container) with the Dockerfile: a description of what your environment will need to run.
That helps building an image (sort of archive of all the files you need), from which you docker run a container.
Right now, those containers only runs as Linux machines on Linux kernel hosts (or on Windows, through a Linux VM).
You don't have yet pure Windows images/containers that runs on Windows (it is in progress, with Windows Server 2016).
So can you just take what you have in one giant folder and put it in a docker container?
Not directly. The goal of Dockerfile is to describe how you would install what you need.
Then you docker build, and from the image you get, you docker run.
But in order for docker to manage correctly the lifecycle of that container, it is best if the container is limited to one process (instead of trynig to run everything like a webapp server, a mongodb, and so on in the same container space)
That means:
describing in separate Dockerfile (building separate images) for each of the components of your system
running those containers in a way they see each others and communicate with each others.
You have an example of a complex multi-component system in my project: b2d.

Resources