I am trying to understand Docker and its core concepts. I learned that images form the basis of containers, in which applications run in isolation.
I also learned that official images can be downloaded from Docker Hub, https://hub.docker.com (part of a screenshot is shown below):
My question is:
Do the respective companies create special, custom-made minimal operating systems (for example, the ubuntu image) for Docker? If so, what benefit do these companies get from creating these custom-made images for Docker?
One could call them custom images; however, they are just bare base images that are meant to be used as a starting point for your application.
They are mostly built by people who work at Docker, who try to ensure some guarantee of quality.
They are stripped of unnecessary packages in order to keep the image size to a minimum.
To find out more you could read this Docker documentation page or this blog post.
Related
I'm learning about Docker architecture.
I know that images are made to run applications in containers (virtualization). One thing I stumbled upon is that there is an entire community hub for posting images. But what is actually the point of doing that?
Isn't the idea that an image contains a very specific environment, with very specific configuration, that runs very specific applications?
The idea of images is to have a well-defined environment. The community images serve mostly as building blocks or base images for your own, more specific, images. For some applications you can use an image as-is, with perhaps a few configuration parameters, but I would guess the more common use case is to build your specific image on top of an already existing, more general image.
Example:
You want to create an image with a certain Java application. So you look for an image that already has the Java version you want, and create an image based on that more general image.
You want to test your application on different OS versions (maybe different Linux versions). So you create a couple of images, each based on a different base image that already has the OS installed that you are interested in.
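The two examples above can be sketched in a few lines of Python that print Dockerfile text. This is only an illustration under stated assumptions: render_dockerfile, the jar name, and the base image tags are hypothetical placeholders, not something taken from the answer. FROM, COPY and CMD are regular Dockerfile instructions.

```python
# Minimal sketch: the same (hypothetical) Java application built on top of
# several more general base images. Swap in whatever base images you need.

def render_dockerfile(base_image: str, jar_name: str = "app.jar") -> str:
    """Return Dockerfile text that layers one application on a base image."""
    lines = [
        f"FROM {base_image}",                     # start from the general image
        f"COPY {jar_name} /opt/app/{jar_name}",   # add the specific application
        f'CMD ["java", "-jar", "/opt/app/{jar_name}"]',
    ]
    return "\n".join(lines) + "\n"


# Assumed example tags; check Docker Hub for the tags that actually exist.
BASE_IMAGES = ["eclipse-temurin:17-jre", "eclipse-temurin:21-jre"]

if __name__ == "__main__":
    for base in BASE_IMAGES:
        print(f"# --- Dockerfile based on {base} ---")
        print(render_dockerfile(base))
```

Only the FROM line changes between the variants, which is the point: the general image supplies the environment and your image adds the application on top.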
I am researching shared libraries between containers from a security point of view.
Many security resources discuss the shared library scenario, in which containers share a reference to the same dependency (i.e. the same file). I can come up with two scenarios:
De facto discussed scenario - some lib directory is mounted from the host machine into the container
Invented scenario - a shared volume is created for different containers (different services, a replica set of the same container, or both) and populated with libraries that are shared between all of the containers.
Despite these discussions, I was not able to find this kind of behavior in the real world, so the question is: how common is this approach?
A reference to an official and known image which uses this technique would be great!
This is generally not considered a best practice at all. A single Docker image is intended to be self-contained and include all of the application code and libraries it needs to run. SO questions aside, I’ve never encountered a Docker image that suggests using volumes to inject any sort of code into containers.
(The one exception is the node image; there’s a frequent SO question about using an anonymous volume for node_modules directories [TL;DR: changes in package.json never update the volume] but even then this is trying to avoid sharing the library tree with other contexts.)
One helpful technique is to build an intermediate base image that contains some set of libraries, and then build an application on top of that. At a mechanical level, for a particular version of the ubuntu:18.04 image, I think all other images based on it use the physically identical libc.so.6 file, but from a Docker point of view this is an implementation detail.
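As a rough illustration of the intermediate base image idea, here is a toy Python model in which an image is just a tuple of layer identifiers. It is a sketch under assumptions, not Docker's actual storage format: layer_id, build_image, and the layer descriptions are all made up for this example.

```python
import hashlib

# Toy model only: an "image" is a tuple of layer ids, lowest layer first.
# Real Docker layer digests are computed differently.

def layer_id(content: str) -> str:
    """Derive a short identifier from a layer's (pretend) content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]


def build_image(base_layers, *new_layers):
    """Return a new image: the base's layers plus one id per new layer."""
    return tuple(base_layers) + tuple(layer_id(c) for c in new_layers)


base = build_image((), "ubuntu:18.04 root filesystem, including libc.so.6")
libs = build_image(base, "install a shared set of libraries")  # intermediate base image
app_a = build_image(libs, "copy service A")
app_b = build_image(libs, "copy service B")

# Layers that both application images have in common.
shared = [layer for layer in app_a if layer in app_b]

if __name__ == "__main__":
    print("layers in app_a:", len(app_a))
    print("layers shared with app_b:", len(shared))
```

In this model both applications reuse every layer of the intermediate image and differ only in their own top layer, yet each image stays self-contained: nothing is injected through a volume at run time.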
I'm trying to find a recent version of Hadoop available on Docker.
Has an official Hadoop repository been created since 2016 (see "Is there any official Docker images for Hadoop?")?
I found some repositories, such as:
https://hub.docker.com/r/harisekhon/hadoop/
https://hub.docker.com/r/sequenceiq/hadoop-docker/
https://hub.docker.com/r/uhopper/hadoop/
https://hub.docker.com/r/cloudera/quickstart/
https://hub.docker.com/r/mcapitanio/hadoop/
But I don't know whether they are good and up to date.
Can you help me to find the best image please?
Thanks
Cloudera images include much more than just Hadoop, so I wouldn't suggest them, since a Docker image should do one thing.
I've had success with the SequenceIQ and uhopper images. The last one in your list is deprecated, as its description says. In truth, though, they will all probably work for your purposes unless you specifically need a Hadoop 3 feature.
The ones I've used recently are by bde2020
I understand that it is software shipped in some sort of binary format. In simple terms, what exactly is a docker-image? And what problem is it trying to solve? And how is it better than other distribution formats?
As a student, I don't completely get the picture of how it works and why so many companies are opting for it. Even many open source libraries are shipped as Docker images.
To understand Docker images, you should first understand a main element of the Docker mechanism: UnionFS.
Unionfs is a filesystem service for Linux, FreeBSD and NetBSD which
implements a union mount for other file systems. It allows files and
directories of separate file systems, known as branches, to be
transparently overlaid, forming a single coherent file system.
Docker images consist of several layers (levels). Each layer is a write-protected filesystem; for every instruction in the Dockerfile, a new layer is created and placed on top of the ones already created. When docker run or docker create is invoked, it adds a layer on top with write permission (it also does a lot of other things). This approach to distributing containers is very good in my opinion.
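The layering described above can be imitated with a small Python sketch. It is a simplified model, not how Docker or any union filesystem is implemented: files are dictionary entries, read-only layers are MappingProxyType views, and make_layer, start_container, and all paths are hypothetical names chosen for the example.

```python
from collections import ChainMap
from types import MappingProxyType


def make_layer(files):
    """Return a read-only layer mapping path -> content."""
    return MappingProxyType(dict(files))


# Image layers, lowest first; each stands in for a layer produced while
# building the image.
base_layer = make_layer({"/etc/os-release": "ubuntu", "/bin/sh": "shell"})
app_layer = make_layer({"/app/main.py": "print('hi')",
                        "/etc/os-release": "ubuntu-patched"})
image = [base_layer, app_layer]


def start_container(image_layers):
    """Stack a fresh writable dict on top of the read-only image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the topmost layer goes first.
    return ChainMap(writable, *reversed(image_layers))


c1 = start_container(image)
c2 = start_container(image)

c1["/tmp/scratch"] = "only in c1"   # the write lands in c1's own top layer

if __name__ == "__main__":
    print(c1["/etc/os-release"])    # upper layer hides the lower one: ubuntu-patched
    print("/tmp/scratch" in c2)     # c2 never sees c1's write: False
```

Both containers read through the same read-only layers, while each write stays in the container's private top layer, which is why many containers can start from one image without copying it.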
Disclaimer:
This is my understanding, based on what I have read elsewhere; feel free to correct me if I'm wrong.
How do they function differently?
Which features of the kernel are they using?
You can read all about it in this link.
Basically, my impression is that rkt takes pride in being image-agnostic (meaning you can run images that were built using Docker or other container engines) and in having less overhead than Docker does. Here is a nice picture that describes the differences between the two (taken from the link I've attached):