Why is the Ubuntu image size drastically smaller than its ISO file? - docker

This may seem like a silly question, but I am a beginner with containerization and I was wondering why the Ubuntu image from Docker Hub (~80 MB) is so much smaller than its ISO file (~1.8 GB).

A container is an isolated space running on your host's kernel.
The ubuntu:18.04 Docker image doesn't contain the kernel binaries.
It only has the libraries, executables and configuration needed to run the ubuntu:18.04 userland; it still uses your host's kernel.
You can take a look at how the ubuntu:18.04 image is created from its Dockerfile here.
I recommend you look up how Docker uses cgroups and namespaces to create a container.
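To make the shared-kernel point concrete, here is a minimal Python sketch (not part of the original answer) that reports the kernel release together with the namespaces and cgroups of the current process. The helper name describe_isolation is hypothetical, and the script assumes a Linux /proc filesystem; elsewhere it just returns empty entries. On a native Linux Docker host, running it on the host and again inside a container should show the same kernel release but different namespace identifiers.

```python
import os
import platform


def describe_isolation(pid="self"):
    """Return the kernel release plus the namespaces and cgroups of a process.

    Reads the Linux /proc filesystem; on other systems the namespace and
    cgroup entries are simply left empty.
    """
    info = {"kernel": platform.release(), "namespaces": {}, "cgroups": []}

    ns_dir = f"/proc/{pid}/ns"
    if os.path.isdir(ns_dir):
        for name in sorted(os.listdir(ns_dir)):
            try:
                # Each entry is a symlink such as "pid:[4026531836]".
                info["namespaces"][name] = os.readlink(os.path.join(ns_dir, name))
            except OSError:
                # Some entries may be unreadable without extra privileges.
                info["namespaces"][name] = None

    cgroup_file = f"/proc/{pid}/cgroup"
    if os.path.isfile(cgroup_file):
        try:
            with open(cgroup_file) as fh:
                info["cgroups"] = [line.strip() for line in fh if line.strip()]
        except OSError:
            pass
    return info


if __name__ == "__main__":
    report = describe_isolation()
    print("kernel release:", report["kernel"])
    for name, ident in report["namespaces"].items():
        print(f"namespace {name}: {ident}")
    for line in report["cgroups"]:
        print("cgroup:", line)
```

With Docker Desktop on Windows or macOS the containers run inside a Linux virtual machine, so the kernel reported inside a container is that VM's kernel rather than the desktop OS's.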

Docker images contain only the minimum libraries and tools needed for the operating system to run. The Ubuntu Docker image does not have any GUI (which is rarely, if ever, used in a container), and most tools are not included either. It is just a base operating system. Alpine images are much smaller still than the Ubuntu images.

Related

Why does docker hub include images of (what seems like) whole operating systems?

I was browsing docker hub and noticed images of operating systems such as Ubuntu or Alpine.
I am new to Docker but as I understand the whole point of containers is that they don't store the OS. In this case, why do such images exist?
why do such images exist?
Because you need a foundation from which to build your own images.
The base OS images (like alpine or ubuntu) are typically minimal images that contain only core Unix utilities (like a shell and tools such as ls, cp, mv, grep, sed, awk, etc.) and a package manager. These are the building blocks you use to build your own images.
A typical Dockerfile -- the instructions used to build an image -- will often look broadly like:
FROM <some base image>
RUN <install a bunch of packages>
COPY <my local files into the image>
Without these base images, the process of creating new container images would be substantially more difficult.

Are there different images for different OSes on Docker Hub?

When I run the following command
docker run mongo
it downloads the mongo image and runs it in a container.
I am running Linux on VM.
My OS details are as follows:
NAME="CentOS Linux"
VERSION="7 (Core)"
If I am using a different OS (a Mac or a Windows machine), how does Docker determine which image to pull? As I understand it, there is a single image on Docker Hub for mongo. Or can we specify a particular image to run based on our OS?
When installing on our local machine (without containers), we at least need to take care to download the right version of mongo for our OS.
How does Docker take care of this?
Thanks.
The OS that you are running is for the most part irrelevant when it comes to pulling a Docker image. As long as you are running Docker on your host (and the versions of Docker differ a little between Windows, Mac and Linux), you can pull any image you want. You can pull the same mongo image and run it on any operating system.
The image hides the host operating system, making it easy to build an image and deploy it on pretty much any machine.
Having said that, you may be getting confused because image makers often use different OSes to build their applications. A quick example is people building an application using an Ubuntu image but switching to an Alpine-based image for deployment because it is so much smaller. However, both images will run pretty much anywhere.
Perhaps you are confusing the terms OS and architecture?
The OS does not really matter because, as @camba1 mentioned, the Docker daemon handles all that stuff.
What matters is the architecture, because Linux can run on ARM, AMD64, etc.
So the Docker daemon must know which image is right for the current architecture.
Here is a good article regarding this question.
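As a rough illustration of that selection step, the Python sketch below picks an entry from a simplified manifest list by OS and architecture. It is not Docker's actual code: MANIFEST_LIST and its digests are made up, and normalize_arch and select_manifest are hypothetical helpers. Real multi-architecture images do publish a list of per-platform manifests, each carrying an os and architecture field, which is what this models.

```python
import platform

# Hypothetical, simplified manifest list for one image tag; digests are made up.
MANIFEST_LIST = [
    {"digest": "sha256:aaa", "platform": {"os": "linux", "architecture": "amd64"}},
    {"digest": "sha256:bbb", "platform": {"os": "linux", "architecture": "arm64"}},
    {"digest": "sha256:ccc", "platform": {"os": "windows", "architecture": "amd64"}},
]

# Map names reported by platform.machine() to the names Docker uses.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine):
    """Translate a machine name such as 'x86_64' into Docker's 'amd64'."""
    key = machine.lower()
    return ARCH_ALIASES.get(key, key)


def select_manifest(manifests, os_name, architecture):
    """Return the digest of the first entry matching OS and architecture."""
    for entry in manifests:
        target = entry["platform"]
        if target["os"] == os_name and target["architecture"] == architecture:
            return entry["digest"]
    return None  # No image was published for this platform.


if __name__ == "__main__":
    arch = normalize_arch(platform.machine())
    print("host architecture:", arch)
    print("selected manifest:", select_manifest(MANIFEST_LIST, "linux", arch))
```

The same tag (for example mongo) can therefore resolve to different image contents on an ARM machine and on an AMD64 machine without the user specifying anything.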

what is "docker container"?

I understand that the Docker engine sits on top of the Docker host (which is the OS) and pulls container images from Docker Hub (or any other repository). The Docker engine interacts with the OS to configure and set up a container from the pulled image as part of the "docker run" command.
However, I also quite often come across the term "Docker container". Is this another tool, and what is its role in the overall architecture? I know there are Windows containers and Linux containers for the respective Docker hosts, but what is a Docker container itself? Is it something people use loosely to refer to containers in general?
In simple words, when you execute a docker image, it will spawn a docker container.
You can relate it to a Java class (the Docker image): when we instantiate the class, it creates an object (the Docker container).
So docker container is an executable form of a docker image. You can have multiple Docker containers from a single docker image.
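The class/object analogy can be sketched in a few lines of Python. This is only a toy model under stated assumptions: Image, Container and run are hypothetical names, not Docker's API, and the "filesystem" is just a dictionary. It shows that several containers can be started from one immutable image and that each keeps its own state.

```python
import itertools
from dataclasses import dataclass, field

# Simple counter used to hand out unique container ids in this toy model.
_ids = itertools.count(1)


@dataclass(frozen=True)
class Image:
    """Immutable template, comparable to a class definition."""
    name: str
    command: str


@dataclass
class Container:
    """Running instance of an image, with its own private state."""
    image: Image
    container_id: int = field(default_factory=lambda: next(_ids))
    files: dict = field(default_factory=dict)

    def write(self, path, content):
        # Changes land in this container only, never in the image.
        self.files[path] = content


def run(image):
    """Create a new container from an image, loosely like 'docker run'."""
    return Container(image=image)


if __name__ == "__main__":
    mongo = Image(name="mongo", command="mongod")
    first, second = run(mongo), run(mongo)
    first.write("/tmp/note", "hello")
    print(first.container_id, first.files)
    print(second.container_id, second.files)
```

Writing a file in the first container leaves the second container and the image untouched, which mirrors how containers started from the same image do not see each other's changes.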
A Docker container is run from an image, which is an executable package (think of it as a tarball or archive) that can stand on its own. The image has everything it needs to run, such as software, runtimes, tools, libraries, etc. Check out Docker for more information.
Docker containers are nothing but processes that are spawned using an image as their source.
The processes are sandboxed(isolated) from other processes in terms of namespaces and controlled in terms of memory, cpu, etc. using control groups. Control groups and namespaces are Linux kernel features which help in creating a sandboxed environment to run processes in isolation.
Container is a name docker uses to indicate these sandboxed processes.
Some trivia: the concept of sandboxing processes is also present in FreeBSD, where it is called jails.
The concept isn't new in terms of core technology, but Docker was innovative in imagining an entire ecosystem in terms of containers and providing excellent tools on top of the kernel features.
First of all, you (generally) start with a Dockerfile, which is a script where you set up the Docker environment you are going to work in (the OS, the extra packages, etc.). If you like, it is the equivalent of source code in a typical programming language.
Dockerfiles are built (with the command sudo docker build pathToDockerfile/) and the result is an image. It is basically a built (or compiled, if you prefer) and executable version of the environment described in your Dockerfile.
You can also download Docker images directly from Docker Hub.
Continuing the simile, the image is like the compiled executable.
Now you can run the image, assigning it a name or setting different attributes. This is a container. Think, for example, of a server environment where you might need the same service to be instantiated more than once at the same time.
Continuing the simile again, this is like launching the same executable program many times at once.

How are Packer and Docker different? Which one should I prefer when provisioning images?

How are Packer and Docker different? Which one is easier/quickest to provision/maintain and why? What are the pros and cons of having a Dockerfile?
Docker is a system for building, distributing and running OCI images as containers. Containers can be run on Linux and Windows.
Packer is an automated build system to manage the creation of images for containers and virtual machines. It outputs an image that you can then take and run on the platform you require.
For v1.8 this includes - Alicloud ECS, Amazon EC2, Azure, CloudStack, DigitalOcean, Docker, Google Cloud, Hetzner, Hyper-V, Libvirt, LXC, LXD, 1&1, OpenStack, Oracle OCI, Parallels, ProfitBricks, Proxmox, QEMU, Scaleway, Triton, Vagrant, VirtualBox, VMware, Vultr
Docker's Dockerfile
Docker uses a Dockerfile to manage builds which has a specific set of instructions and rules about how you build a container.
Images are built in layers. Each FROM, RUN, ADD or COPY instruction modifies the layers included in an OCI image. These layers can be cached, which helps speed up builds. Each layer can also be addressed individually, which reduces disk usage and download size when multiple images share layers.
Dockerfiles have a bit of a learning curve. It's best to look at some of the official Docker images for practices to follow.
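To see why shared layers save disk space and download time, here is a small Python sketch of the arithmetic. The image names, layer digests and sizes (in MB) are all made up for illustration, and naive_size and deduplicated_size are hypothetical helpers rather than anything Docker exposes.

```python
# Hypothetical images, each a list of (layer digest, size in MB) pairs.
# Both images share the same base and runtime layers.
IMAGES = {
    "app-a": [("sha256:base", 80), ("sha256:jre", 200), ("sha256:app-a", 15)],
    "app-b": [("sha256:base", 80), ("sha256:jre", 200), ("sha256:app-b", 25)],
}


def naive_size(images):
    """Total size if every image stored its own copy of every layer."""
    return sum(size for layers in images.values() for _, size in layers)


def deduplicated_size(images):
    """Total size when each layer digest is stored only once."""
    unique = {}
    for layers in images.values():
        for digest, size in layers:
            unique[digest] = size
    return sum(unique.values())


if __name__ == "__main__":
    print("without sharing:", naive_size(IMAGES), "MB")
    print("with sharing:", deduplicated_size(IMAGES), "MB")
```

With these made-up numbers the two images would take 600 MB as independent copies but 320 MB when the common layers are stored once. An image exported as a single layer cannot benefit from this.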
Packer's Docker builder
Packer does not require a Dockerfile to build a container image. The docker plugin has an HCL or JSON config file which starts the image build from a specified base image (like FROM).
Packer then allows you to run standard system config tools called "Provisioners" on top of that image. Tools like Ansible, Chef, Salt, shell scripts etc.
This image will then be exported as a single layer, so you lose the layer caching/addressing benefits compared to a Dockerfile build.
Packer allows some modifications to the build container environment, like running as --privileged or mounting a volume at build time, that Docker builds will not allow.
Times you might want to use Packer are if you want to build images for multiple platforms and use the same setup. It also makes it easy to use existing build scripts if there is a provisioner for it.
Expanding on "Which one is easier/quickest to provision/maintain and why? What are the pros and cons of having a Dockerfile?"
From personal experience learning and using both, I found: (YMMV)
docker configuration was easier to learn than packer
docker configuration was harder to coerce into doing what I wanted than packer
speed difference in creating the image was negligible, after development
docker was faster during development, because of the caching
the docker daemon consumed some system resources even when not using docker
there are a handful of processes running as the daemon
I did my development on Windows, though I was targeting Linux servers for running the images.
That isn't an issue during development, except for a foible of running Docker on Windows.
The docker daemon reserves various TCP port ranges for itself
The ranges might change every time you reboot your system or restart the daemon
The only error message is to the effect: can't use that port! but not why it can't
BTW, The workaround is to:
turn off Hypervisor
reboot
reserve the public ports you want your host system to see
turn on hypervisor
reboot
Running packer on Windows, however, the issue I found is that the provisioner I wanted to use, ansible, doesn't run on Windows.
Sigh.
So I ended up having to run Packer on a Linux system after all.
Just because I was feeling perverse, I wrote a Dockerfile so I could run both packer and ansible from my Windows station in a docker container using that image.
Docker builds images using a Dockerfile.
These can be run (Docker containers).
Packer also builds images. But you don't need a Dockerfile. And you get the option of using Provisioners such as Ansible which lets you create vastly more customisable images. It isn't used for running these images.

Docker Base Image - which flavour?

We're running a Java based Webshop on SLES12.
We're currently deciding whether to run the webshop as a Docker container in the future.
At least in our test environment, we will host the webshop as a Docker container.
My question now is: how important is it to choose a base image for the Docker container that is as close to production as possible? In other words: is it necessary (or advisable) to build the Docker container on a SLES (or openSUSE) base image, or is it OK to keep Debian as the base image?
What's the major difference between Debian and SUSE base images (apart from the packaging tool, directory structure and base image size)?
How important is it to choose a base image for the Docker container that is as close to production as possible?
It is not important. It is mandatory. Your Docker image has to be as close as possible to your prod environment if you intend to use Docker for development and not in prod.
In other words: is it necessary (or advisable) to build the Docker container on a SLES (or openSUSE) base image, or is it OK to keep Debian as the base image?
If you are not running your project via Docker in prod, your image needs to be as close to prod as possible, so if prod runs on Debian, use Debian as the base image. If you intend to run in prod with Docker, it is better to keep the image as light as possible (even more so if you intend to make it public). But keeping a Debian base is OK.
