what is "docker container"? - docker

I understand that the Docker engine sits on top of the Docker host (the OS) and pulls container images from Docker Hub (or any other registry). The Docker engine interacts with the OS to configure and set up a container from the image pulled as part of the docker run command.
However, I quite often come across the term "Docker container". Is this another tool, and what is its role in the overall architecture? I know there are Windows containers and Linux containers for the respective Docker hosts, but what is a Docker container itself? Is it a term people use loosely to refer to containers in general?

In simple words, when you run a Docker image, it spawns a Docker container.
You can relate it to a Java class (the Docker image): when we instantiate the class, it creates an object (the Docker container).
So a Docker container is the running form of a Docker image. You can have multiple Docker containers from a single Docker image.

A Docker container is a running instance of an image. The image is an executable package (think of it as a tarball, or archive) that can stand on its own. It has everything it needs to run, such as software, runtimes, tools, and libraries. Check out Docker for more information.

Docker containers are nothing but processes which are spawned using an image as a source.
The processes are sandboxed (isolated) from other processes by means of namespaces, and their memory, CPU, and other resources are limited using control groups. Control groups and namespaces are Linux kernel features which help in creating a sandboxed environment to run processes in isolation.
Container is the name Docker uses for these sandboxed processes.
Some trivia: the concept of sandboxing processes is also present in FreeBSD, where it is called Jails.
While the concept isn't new in terms of core technology, Docker was innovative in imagining an entire ecosystem in terms of containers and in providing excellent tools on top of these kernel features.
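To make the namespace and control group point a little more concrete, here is a minimal Python sketch that only inspects what the kernel already exposes for the current process. It assumes a Linux host with /proc mounted and returns empty results anywhere else. The function names are my own, not part of Docker.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, /proc not mounted, or no such process
    for name in sorted(names):
        try:
            # Each entry is a symlink such as "pid:[4026531836]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable for this process
    return result


def cgroup_membership(pid="self"):
    """Return the lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"{name:18} {ident}")
    for line in cgroup_membership():
        print(line)
```

If you run this once on the host and once inside a container, the namespace identifiers should generally differ, which is the isolation described above.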

First of all, you (generally) start with a Dockerfile, which is a script where you set up the Docker environment in which you are going to work (the OS, the extra packages, etc.). If you like, it is the equivalent of source code in typical programming languages.
Dockerfiles are built (with the command sudo docker build pathToDockerfile/) and the result is an image. It is basically a built (or compiled, if you prefer) and executable version of the environment described in your Dockerfile.
Alternatively, you can download Docker images directly from Docker Hub.
Continuing the simile, the image is like the compiled executable.
Now you can run the image, assigning it a name or setting different attributes. This is a container. Think, for example, of a server environment where you might need the same service to be instantiated more than once at the same time.
Continuing the simile again, this is like having the same executable program launched many times at the same time.

Related

Docker container image vs container [duplicate]

When using Docker, we start with a base image. We boot it up, create changes and those changes are saved in layers forming another image.
So eventually I have an image for my PostgreSQL instance and an image for my web application, changes to which keep on being persisted.
What is a container?
An instance of an image is called a container. You have an image, which is a set of layers as you describe. If you start this image, you have a running container of this image. You can have many running containers of the same image.
You can see all your images with docker images whereas you can see your running containers with docker ps (and you can see all containers with docker ps -a).
So a running instance of an image is a container.
From my article on Automating Docker Deployments (archived):
Docker Images vs. Containers
In Dockerland, there are images and there are containers. The two are closely related, but distinct. For me, grasping this dichotomy has clarified Docker immensely.
What's an Image?
An image is an inert, immutable, file that's essentially a snapshot of a container. Images are created with the build command, and they'll produce a container when started with run. Images are stored in a Docker registry such as registry.hub.docker.com. Because they can become quite large, images are designed to be composed of layers of other images, allowing a minimal amount of data to be sent when transferring images over the network.
Local images can be listed by running docker images:
REPOSITORY   TAG      IMAGE ID       CREATED        VIRTUAL SIZE
ubuntu       13.10    5e019ab7bf6d   2 months ago   180 MB
ubuntu       14.04    99ec81b80c55   2 months ago   266 MB
ubuntu       latest   99ec81b80c55   2 months ago   266 MB
ubuntu       trusty   99ec81b80c55   2 months ago   266 MB
<none>       <none>   4ab0d9120985   3 months ago   486.5 MB
Some things to note:
IMAGE ID is the first 12 characters of the true identifier for an image. You can create many tags of a given image, but their IDs will all be the same (as above).
VIRTUAL SIZE is virtual because it's adding up the sizes of all the distinct underlying layers. This means that the sum of all the values in that column is probably much larger than the disk space used by all of those images.
The value in the REPOSITORY column comes from the -t flag of the docker build command, or from docker tag-ing an existing image. You're free to tag images using a nomenclature that makes sense to you, but know that docker will use the tag as the registry location in a docker push or docker pull.
The full form of a tag is [REGISTRYHOST/][USERNAME/]NAME[:TAG]. For ubuntu above, REGISTRYHOST is inferred to be registry.hub.docker.com. So if you plan on storing your image called my-application in a registry at docker.example.com, you should tag that image docker.example.com/my-application.
The TAG column is just the [:TAG] part of the full tag. This is unfortunate terminology.
The latest tag is not magical, it's simply the default tag when you don't specify a tag.
You can have untagged images only identifiable by their IMAGE IDs. These will get the <none> TAG and REPOSITORY. It's easy to forget about them.
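The [REGISTRYHOST/][USERNAME/]NAME[:TAG] form described above can be sketched as a small Python parser. This is an illustration only: the rule that the first component counts as a registry host when it contains a dot or a colon (or is localhost) is a simplification I am assuming, digests (name@sha256:...) are not handled, and ImageRef and parse_image_ref are hypothetical names.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: Optional[str]  # None means "use the default registry"
    user: Optional[str]
    name: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split [REGISTRYHOST/][USERNAME/]NAME[:TAG] into its parts."""
    parts = ref.split("/")
    registry = None
    # Assumed heuristic: a leading component that looks like a host name
    # (has a dot or a port, or is "localhost") is the registry.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry = parts.pop(0)
    last = parts.pop()
    name, has_tag, tag = last.partition(":")
    user = "/".join(parts) or None
    # "latest" is simply the default when no tag is given.
    return ImageRef(registry, user, name, tag if has_tag else "latest")


if __name__ == "__main__":
    for ref in ("ubuntu", "ubuntu:14.04", "docker.example.com/my-application"):
        print(ref, "->", parse_image_ref(ref))
```

Note how ubuntu and ubuntu:latest parse to the same reference, which matches the point that latest is not magical.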
More information on images is available from the Docker documentation and glossary.
What's a container?
To use a programming metaphor, if an image is a class, then a container is an instance of a class—a runtime object. Containers are hopefully why you're using Docker; they're lightweight and portable encapsulations of an environment in which to run applications.
View local running containers with docker ps:
CONTAINER ID   IMAGE                            COMMAND                CREATED        STATUS        PORTS                    NAMES
f2ff1af05450   samalba/docker-registry:latest   /bin/sh -c 'exec doc   4 months ago   Up 12 weeks   0.0.0.0:5000->5000/tcp   docker-registry
Here I'm running a dockerized version of the docker registry, so that I have a private place to store my images. Again, some things to note:
Like IMAGE ID, CONTAINER ID is the true identifier for the container. It has the same form, but it identifies a different kind of object.
docker ps only outputs running containers. You can view all containers (running or stopped) with docker ps -a.
NAMES can be used to identify a started container via the --name flag.
How to avoid image and container buildup
One of my early frustrations with Docker was the seemingly constant buildup of untagged images and stopped containers. On a handful of occasions this buildup resulted in maxed out hard drives slowing down my laptop or halting my automated build pipeline. Talk about "containers everywhere"!
We can remove all untagged images by combining docker rmi with the recent dangling=true query:
docker images -q --filter "dangling=true" | xargs docker rmi
Docker won't be able to remove images that are behind existing containers, so you may have to remove stopped containers with docker rm first:
docker rm `docker ps --no-trunc -aq`
These are known pain points with Docker and may be addressed in future releases. However, with a clear understanding of images and containers, these situations can be avoided with a couple of practices:
Always remove a useless, stopped container with docker rm [CONTAINER_ID].
Always remove the image behind a useless, stopped container with docker rmi [IMAGE_ID].
While it's simplest to think of a container as a running image, this isn't quite accurate.
An image is really a template that can be turned into a container. To turn an image into a container, the Docker engine takes the image, adds a read-write filesystem on top and initialises various settings including network ports, container name, ID and resource limits. A running container has a currently executing process, but a container can also be stopped (or exited in Docker's terminology). An exited container is not the same as an image, as it can be restarted and will retain its settings and any filesystem changes.
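Here is a toy Python model of that distinction, in case it helps. It is not how the Docker engine is implemented. Image, Container, and their fields are names I made up to mirror the description: a read-only template, plus a read-write filesystem, a name, an ID, and a status.

```python
import itertools
from dataclasses import dataclass, field

_next_id = itertools.count(1)


@dataclass(frozen=True)
class Image:
    """Read-only template: a name plus (path, content) pairs."""
    name: str
    files: tuple


@dataclass
class Container:
    image: Image
    name: str
    status: str = "created"
    writable: dict = field(default_factory=dict)  # read-write filesystem on top
    ident: int = field(default_factory=lambda: next(_next_id))

    def start(self):
        self.status = "running"

    def stop(self):
        self.status = "exited"  # settings and filesystem changes are kept

    def write(self, path, content):
        if self.status != "running":
            raise RuntimeError("container is not running")
        self.writable[path] = content

    def read(self, path):
        if path in self.writable:
            return self.writable[path]
        return dict(self.image.files)[path]


image = Image("web", (("/etc/motd", "hello"),))
first = Container(image, "web-1")
second = Container(image, "web-2")

first.start()
first.write("/etc/motd", "changed")
first.stop()   # exited, but not the same thing as the image
first.start()  # restarted: the earlier change is still there

print(first.read("/etc/motd"))   # changed
print(second.read("/etc/motd"))  # hello
```

The exited container keeps its own changes across a restart, while the image and the second container never see them.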
Maybe explaining the whole workflow can help.
Everything starts with the Dockerfile. The Dockerfile is the source code of the image.
Once the Dockerfile is created, you build it to create the image of the container. The image is just the "compiled version" of the "source code" which is the Dockerfile.
Once you have the image of the container, you should redistribute it using the registry. The registry is like a Git repository -- you can push and pull images.
Next, you can use the image to run containers. A running container is very similar, in many aspects, to a virtual machine (but without the hypervisor).
Dockerfile → (Build) → Image → (Run) → Container.
Dockerfile: contains a set of Docker instructions that provision your operating system the way you like and install/configure all your software.
Image: compiled Dockerfile. Saves you time from rebuilding the Dockerfile every time you need to run a container. And it's a way to hide your provisioning code.
Container: the virtual operating system itself. You can open a shell in it (for example with docker exec) and run any commands you wish, as if it were a real environment. You can run 1000+ containers from the same image.
Workflow
Here is the end-to-end workflow showing the various commands and their associated inputs and outputs. That should clarify the relationship between an image and a container.
+------------+  docker build   +--------------+  docker run -dt   +-----------+  docker exec -it   +------+
| Dockerfile | --------------> |    Image     | --------------->  | Container | -----------------> | Bash |
+------------+                 +--------------+                   +-----------+                    +------+
                                      ^
                                      | docker pull
                                      |
                               +--------------+
                               |   Registry   |
                               +--------------+
To list the images you could run, execute:
docker image ls
To list the containers you could execute commands on:
docker ps
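If you want to script the workflow in the diagram, one hedged way is to build the argument lists in Python and hand them to subprocess yourself. The sketch below only constructs and prints the commands, so it runs without Docker installed. The image name my_image and container name my_container are placeholders, and the helper function names are mine.

```python
import shlex


def build_cmd(tag, context_dir):
    """Dockerfile -> image."""
    return ["docker", "build", "-t", tag, context_dir]


def pull_cmd(image):
    """Registry -> image."""
    return ["docker", "pull", image]


def run_cmd(image, name):
    """Image -> container (detached, with a TTY, as in the diagram)."""
    return ["docker", "run", "-dt", "--name", name, image]


def exec_cmd(name, program="bash"):
    """Container -> interactive shell."""
    return ["docker", "exec", "-it", name, program]


if __name__ == "__main__":
    steps = [
        build_cmd("my_image", "."),
        run_cmd("my_image", "my_container"),
        exec_cmd("my_container"),
    ]
    for argv in steps:
        print(shlex.join(argv))
```

To actually execute a step you could pass one of these lists to subprocess.run(argv, check=True) on a machine where the docker CLI is available.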
I couldn't understand the concept of image and layer in spite of reading all the questions here and then eventually stumbled upon this excellent documentation from Docker (duh!).
The example there is really the key to understand the whole concept. It is a lengthy post, so I am summarising the key points that need to be really grasped to get clarity.
Image: A Docker image is built up from a series of read-only layers
Layer: Each layer represents an instruction in the image’s Dockerfile.
Example: The below Dockerfile contains four commands, each of which creates a layer.
FROM ubuntu:15.04
COPY . /app
RUN make /app
CMD python /app/app.py
Importantly, each layer is only a set of differences from the layer before it.
Container.
When you create a new container, you add a new writable layer on top of the underlying layers. This layer is often called the “container layer”. All changes made to the running container, such as writing new files, modifying existing files, and deleting files, are written to this thin writable container layer.
Hence, the major difference between a container and an image is
the top writable layer. All writes to the container that add new or
modify existing data are stored in this writable layer. When the
container is deleted, the writable layer is also deleted. The
underlying image remains unchanged.
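A rough way to picture the writable container layer is Python's collections.ChainMap: lookups fall through a stack of mappings, while writes only ever touch the first one. The sketch below uses plain dicts as stand-ins for layers. The paths and contents are made up, and real storage drivers are far more involved (deleting a file from a lower layer, for example, is not modelled here).

```python
from collections import ChainMap

# Read-only image layers, topmost first (stand-ins for COPY and FROM layers).
image_layers = [
    {"/app/app.py": "print('v1')"},
    {"/etc/os-release": "ubuntu 15.04"},
]


def new_container(layers):
    """Stack a fresh, empty writable layer on top of the shared image layers."""
    return ChainMap({}, *layers)


a = new_container(image_layers)
b = new_container(image_layers)

a["/app/app.py"] = "print('patched')"  # lands in a's writable layer only
a["/tmp/log"] = "hello"

print(a["/app/app.py"])   # print('patched')
print(b["/app/app.py"])   # print('v1')
print(a.maps[0])          # everything container a has written
```

Dropping the ChainMap for a throws away its first mapping, just as deleting the container deletes the writable layer, and the dicts in image_layers are never modified.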
Understanding images and containers from a size-on-disk perspective
To view the approximate size of a running container, you can use the docker ps -s command. You get size and virtual size as two of the outputs:
Size: the amount of data (on disk) that is used for the writable layer of each container
Virtual Size: the amount of data used for the read-only image data used by the container. Multiple containers may share some or all read-only image data. Hence these are not additive. I.e. you can't add all the virtual sizes to calculate how much size on disk is used by the image
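A small worked example of why the virtual sizes are not additive, as a Python sketch. The layer names, sizes (in MB), and container names are invented for illustration, and virtual size is taken here to mean only the read-only image data, as defined in the list above.

```python
def disk_usage(layer_sizes, containers):
    """containers maps name -> (writable_size, [layer names it uses])."""
    # Naive total: add up every container's virtual size.
    sum_of_virtual = sum(
        sum(layer_sizes[layer] for layer in layers)
        for _, layers in containers.values()
    )
    # What is really stored: each distinct layer once.
    distinct = {layer for _, layers in containers.values() for layer in layers}
    image_data = sum(layer_sizes[layer] for layer in distinct)
    writable = sum(size for size, _ in containers.values())
    return {
        "sum_of_virtual_sizes": sum_of_virtual,
        "image_data_on_disk": image_data,
        "writable_layers": writable,
    }


layer_sizes = {"base": 180, "app": 20}
containers = {
    "web-1": (2, ["base", "app"]),
    "web-2": (5, ["base", "app"]),
    "db": (1, ["base"]),
}

print(disk_usage(layer_sizes, containers))
# {'sum_of_virtual_sizes': 580, 'image_data_on_disk': 200, 'writable_layers': 8}
```

Three containers report 580 MB of virtual size between them, but the shared read-only layers occupy only 200 MB, plus 8 MB of writable layers.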
Another important concept is the copy-on-write strategy
If a file or directory exists in a lower layer within the image, and another layer (including the writable layer) needs read access to it, it just uses the existing file. The first time another layer needs to modify the file (when building the image or running the container), the file is copied into that layer and modified.
I hope that helps someone else like me.
Simply said, if an image is a class, then a container is an instance of that class: a runtime object.
A container is just an executable binary that is to be run by the host OS under a set of restrictions that are preset using an application (e.g., Docker) that knows how to tell the OS which restrictions to apply.
The typical restrictions are process-isolation related, security related (like using SELinux protection) and system-resource related (memory, disk, CPU, and networking).
Until recently, only kernels in Unix-based systems supported the ability to run executables under strict restrictions. That's why most container talk today involves mostly Linux or other Unix distributions.
Docker is one of those applications that knows how to tell the OS (Linux mostly) what restrictions to run an executable under. The executable is contained in the Docker image, which is just a tarfile. That executable is usually a stripped-down version of a Linux distribution's User space (Ubuntu, CentOS, Debian, etc.) preconfigured to run one or more applications within.
Though most people use a Linux base as the executable, it can be any other binary application as long as the host OS's kernel can run it (see creating a simple base image using scratch). Whether the binary in the Docker image is an OS User space or simply an application, to the OS host it is just another process, a contained process ruled by preset OS boundaries.
Other applications that, like Docker, can tell the host OS which boundaries to apply to a process while it is running, include LXC, libvirt, and systemd. Docker used to use these applications to indirectly interact with the Linux OS, but now Docker interacts directly with Linux using its own library called "libcontainer".
So containers are just processes running in a restricted mode, similar to what chroot used to do.
IMO, what sets Docker apart from any other container technology is its repository (Docker Hub) and its management tools, which make working with containers extremely easy.
See Docker (software).
The core concept of Docker is to make it easy to create "machines" which in this case can be considered containers. The container aids in reusability, allowing you to create and drop containers with ease.
Images depict the state of a container at every point in time. So the basic workflow is:
create an image
start a container
make changes to the container
save the container back as an image
As many answers have pointed out: you build a Dockerfile to get an image, and you run an image to get a container.
However, the following steps helped me get a better feel for what a Docker image and a container are:
1) Build Dockerfile:
docker build -t my_image dir_with_dockerfile
2) Save the image to .tar file
docker save -o my_file.tar my_image_id
my_file.tar will store the image. Open it with tar -xvf my_file.tar, and you will get to see all the layers. If you dive deeper into each layer you can see what changes were added in each layer. (They should be pretty close to commands in the Dockerfile).
3) To take a look inside of a container, you can do:
sudo docker run -it my_image bash
and you can see that it looks very much like an OS.
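As an alternative to unpacking the archive from step 2 by hand, the layer list can be read with Python's tarfile module. This sketch assumes the archive has a manifest.json at its root whose entries carry RepoTags and Layers keys, which is what docker save has produced in my experience, though the layout can vary between Docker versions. To stay runnable without Docker it builds a tiny fake archive in memory. The layer paths in it are made up.

```python
import io
import json
import tarfile


def list_layers(fileobj):
    """Return {tag: [layer paths]} from a docker-save style archive."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
    layers_by_tag = {}
    for entry in manifest:
        # Untagged images are assumed to have a null RepoTags value.
        for tag in entry.get("RepoTags") or ["<none>"]:
            layers_by_tag[tag] = entry["Layers"]
    return layers_by_tag


def make_fake_archive():
    """Build an in-memory archive that only contains a manifest.json."""
    manifest = [{
        "Config": "config.json",
        "RepoTags": ["my_image:latest"],
        "Layers": ["layer1/layer.tar", "layer2/layer.tar"],
    }]
    data = json.dumps(manifest).encode()
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo("manifest.json")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(list_layers(make_fake_archive()))
```

For a real archive you would open it in binary mode, for example with open("my_file.tar", "rb") as handle, and pass the handle to list_layers.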
It may help to think of an image as a "snapshot" of a container.
You can make images from a container (new "snapshots"), and you can also start new containers from an image (instantiate the "snapshot"). For example, you can instantiate a new container from a base image, run some commands in the container, and then "snapshot" that as a new image. Then you can instantiate 100 containers from that new image.
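The snapshot idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not Docker's implementation: an image is an immutable tuple of layer diffs, a container collects changes on top, and commit() freezes those changes into one more layer. All class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    layers: tuple  # layer diffs, oldest first; each is a tuple of (path, content)

    def run(self):
        """Instantiate the snapshot as a new container."""
        return Container(self)


@dataclass
class Container:
    image: Image
    changes: dict = field(default_factory=dict)

    def files(self):
        merged = {}
        for layer in self.image.layers:
            merged.update(layer)  # later layers override earlier ones
        merged.update(self.changes)
        return merged

    def commit(self):
        """Snapshot the container: its changes become a new top layer."""
        diff = tuple(sorted(self.changes.items()))
        return Image(self.image.layers + (diff,))


base = Image(((("/bin/sh", "shell"),),))
builder = base.run()
builder.changes["/app/server"] = "v1"  # "run some commands in the container"
app_image = builder.commit()

clones = [app_image.run() for _ in range(100)]
print(len(base.layers), len(app_image.layers))  # 1 2
print(clones[0].files())
```

The base image still has one layer after the commit, and each of the 100 containers started from the new image sees the added file.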
Other things to consider:
An image is made of layers, and layers are snapshot "diffs"; when you push an image, only the "diff" is sent to the registry.
A Dockerfile defines some commands on top of a base image, that creates new layers ("diffs") that result in a new image ("snapshot").
Containers are always instantiated from images.
Image tags are not just tags. They are the image's "full name" ("repository:tag"). If the same image has multiple names, it shows multiple times when doing docker images.
An image is equivalent to a class definition in OOP, and layers are the different methods and properties of that class.
A container is the actual instantiation of the image, just as an object is an instance of a class.
I think it is better to explain from the beginning.
Suppose you run the command docker run hello-world. What happens?
It calls the Docker CLI, which is responsible for taking Docker commands and translating them into calls to the Docker server. As soon as the Docker server gets a command to run an image, it checks whether the image cache holds an image with that name.
Suppose hello-world does not exist there. The Docker server goes to Docker Hub (Docker Hub is just a free repository of images) and asks: hey Hub, do you have an image called hello-world?
The Hub responds: yes, I do. Then give it to me, please. And the download starts. As soon as the Docker image is downloaded, the Docker server puts it in the image cache.
So before we explain what Docker images and Docker containers are, let's start with an introduction to the operating system on your computer and how it runs software.
When you run, for example, Chrome on your computer, it calls the operating system, which in turn calls the kernel and asks: hey, I want to run this program. The kernel manages running files from your hard disk.
Now imagine that you have two programs, Chrome and Node.js. Chrome requires Python version 2 to run and Node.js requires Python version 3 to run. If you have only Python v2 installed on your computer, only Chrome will run.
To make both cases work, you need an operating system feature known as namespacing. A namespace is a feature which gives you the opportunity to isolate processes, hard drive, network, users, hostnames and so on.
So, when we talk about an image we actually talk about a file system snapshot. An image is a physical file which contains directions and metadata to build a specific container. The container itself is an instance of an image; it isolates the hard drive using namespacing which is available only for this container. So a container is a process or set of processes which groups different resources assigned to it.
A Docker image packs up the application and environment required by the application to run, and a container is a running instance of the image.
Images are the packing part of Docker, analogous to "source code" or a "program". Containers are the execution part of Docker, analogous to a "process".
In the question, only the "program" part is referred to and that's the image. The "running" part of Docker is the container. When a container is run and changes are made, it's as if the process makes a change in its own source code and saves it as the new image.
From a programming perspective:
An image is source code.
When source code is compiled and built, it is called an application.
Similarly, when an instance of the image is created, it is called a "container".
I would like to fill the missing part here between docker images and containers. Docker uses a union file system (UFS) for containers, which allows multiple filesystems to be mounted in a hierarchy and to appear as a single filesystem. The filesystem from the image has been mounted as a read-only layer, and any changes to the running container are made to a read-write layer mounted on top of this. Because of this, Docker only has to look at the topmost read-write layer to find the changes made to the running system.
I would state it with the following analogy:
+-----------------------------+-------+-----------+
| Domain                      | Meta  | Concrete  |
+-----------------------------+-------+-----------+
| Docker                      | Image | Container |
| Object oriented programming | Class | Object    |
+-----------------------------+-------+-----------+
Docker Client, Server, Machine, Images, Hub, and Compose are all projects, tools, and pieces of software that come together to form a platform, an ecosystem around creating and running something called containers. If you run the command docker run redis, the Docker CLI reaches out to Docker Hub and downloads a single file called an image.
Docker Image:
An image is a single file containing all the dependencies and all the configuration required to run a very specific program; for example, redis is the program that the image you just downloaded (by running docker run redis) is supposed to run.
This single file gets stored on your hard drive, and at some point in time you can use this image to create something called a container.
A container is an instance of an image. You can think of it as a running program with its own isolated set of hardware resources: its own little space of memory, its own little space of networking technology, and its own little space of hard drive as well.
Now let's examine what happens when you give the command below:
sudo docker run hello-world
This command starts up the Docker client, or Docker CLI. The Docker CLI is in charge of taking commands from you, doing a little bit of processing on them, and then communicating them to something called the Docker server, which is in charge of the heavy lifting.
When we ran docker run hello-world, that meant we wanted to start up a new container using the image named hello-world. The hello-world image has a tiny little program inside it whose sole purpose is to print out the message that you see in the terminal.
When we ran that command and it was passed to the Docker server, a series of actions occurred very quickly in the background. The Docker server saw that we were trying to start up a new container using an image called hello-world.
The first thing the Docker server did was check whether it already had a local copy (a copy on your personal machine) of the hello-world image. So the Docker server looked into something called the image cache.
Because you and I just installed Docker on our personal computers, that image cache is currently empty. We have no images that have already been downloaded.
Because the image cache was empty, the Docker server decided to reach out to a free service called Docker Hub. Docker Hub is a repository of free public images that you can download and run on your personal computer. So the Docker server reached out to Docker Hub, downloaded the hello-world file, and stored it on your computer in the image cache, where it can be re-run very quickly at some point in the future without having to download it again from Docker Hub.
After that, the Docker server uses it to create an instance of a container. We know that a container is an instance of an image, and its sole purpose is to run one very specific program. So the Docker server essentially took that image file from the image cache, loaded it into memory to create a container out of it, and then ran a single program inside it. That single program's purpose was to print out the message that you see.
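The sequence just described (check the image cache, pull on a miss, then create a container and run its program) can be sketched as a few lines of Python. Everything here is a stand-in: the registry and cache are plain dicts, an "image" is just a function, and docker_run is a hypothetical name, not a Docker API.

```python
def docker_run(image_name, image_cache, registry, log):
    """Mimic the flow: cache lookup, optional pull, then run the program."""
    if image_name in image_cache:
        log.append(f"cache hit: {image_name}")
    else:
        log.append(f"cache miss: pulling {image_name}")
        if image_name not in registry:
            raise LookupError(f"no such image: {image_name}")
        image_cache[image_name] = registry[image_name]  # store for next time
    program = image_cache[image_name]
    return program()  # "create a container" and run its single program


# Stand-in for Docker Hub: image name -> the program the image would run.
registry = {"hello-world": lambda: "Hello from Docker!"}
image_cache = {}  # freshly installed Docker: nothing downloaded yet
log = []

print(docker_run("hello-world", image_cache, registry, log))
print(docker_run("hello-world", image_cache, registry, log))
print(log)
```

The first call records a cache miss and a pull, and the second call is served from the cache without touching the registry.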
What a container is:
First of all an image is a blueprint for how to create a container.
A container is a process or a set of processes that have a grouping of resources specifically assigned to them. Any time we think about a container, we have some running process that sends a system call to the kernel; the kernel looks at that incoming system call and directs it to a very specific portion of the hard drive, RAM, CPU, or whatever else it might need, and a portion of each of these resources is made available to that single process.
An image is to a class as a container to an object.
A container is an instance of an image as an object is an instance of a class.
*In docker, an image is an immutable file that holds the source code and information needed for a docker app to run. It can exist independent of a container.
*Docker containers are virtualized environments created during runtime and require images to run. The docker website has an image that kind of shows this relationship:
Just as an object is an instance of a class in an object-oriented programming language, so a Docker container is an instance of a Docker image.
For a dummy programming analogy, you can think of Docker as having an abstract ImageFactory, which holds the ImageFactories that come from the store.
Then, once you want to create an app out of one of those ImageFactories, you get a new container, which you can modify as you want. DotNetImageFactory is immutable, because it acts as an abstract factory class that only delivers the instances you ask for.
IContainer newDotNetApp = ImageFactory.DotNetImageFactory.CreateNew(appOptions);
newDotNetApp.ChangeDescription("I am making changes on this instance");
newDotNetApp.Run();
In short:
A container is a (virtual) division in a kernel, which shares a common OS and runs an image (a Docker image).
A container is a self-sustaining application that bundles its packages and all the necessary dependencies needed to run the code.
A Docker container is a running instance of an image. You can relate an image with a program and a container with a process :)
A Dockerfile is like a Bash script that produces a tarball (the Docker image).
Docker containers are like extracted versions of the tarball. You can have as many copies as you like in different folders (the containers).
An image is the blueprint from which containers (running instances) are built.
Long story short.
Docker Images:
The read-only file system and configuration of an application, used to create containers.
Docker Containers:
The major difference between a container and an image is the top writable layer. Containers are running instances of Docker images with a top writable layer. Containers run the actual applications. A container includes an application and all of its dependencies. When the container is deleted, the writable layer is also deleted. The underlying image remains unchanged.
Other important terms to notice:
Docker daemon:
The background service running on the host that manages the building, running, and distributing of Docker containers.
Docker client:
The command line tool that allows the user to interact with the Docker daemon.
Docker Store:
Store is, among other things, a registry of Docker images. You can think of the registry as a directory of all available Docker images.
A picture from this blog post is worth a thousand words.
Summary:
Pull image from Docker hub or build from a Dockerfile => Gives a Docker image (not editable).
Run the image (docker run image_name:tag_name) => Gives a running Image i.e. container (editable)
An image is like a class and a container is like an object of that class, so you can have any number of containers behaving like the image. A class is a blueprint which isn't doing anything on its own. You have to create instances of it in your program to do anything meaningful. The same is true of an image and a container: you define your image and then create containers running that image. The analogy isn't exact, because an object is an instance of a class, whereas a container is something like an empty, hollow place that you use the image to build up into a running host with exactly what the image says.
An image, or container image, is a file which contains your application code, application runtime, configurations, and dependent libraries. The image basically wraps all of these into a single, secure, immutable unit. The appropriate docker command is used to build the image. The image has an image ID and an image tag. The tag is usually in the format <docker-user-name>/image-name:tag.
When you start running your application using the image you actually start a container. So your container is a sandbox in which you run your image. Docker software is used to manage both the image and container.
Image is a secured package which contains your application artifact, libraries, configurations and application runtime. Container is the runtime representation of your image.

Ansible commands on docker containers?

Up to now I have had my ansible-playbook commands running on AWS EC2 instances.
Can I run regular Ansible modules (lineinfile, apt, pip, etc.) on a container?
Can I add my container IP to the hosts file under a containers group and have the same code work? That is, if I change my main.yml file from
hosts: ec2-group
to
hosts: containers-group
will all the commands work?
I am a bit of a beginner at this, so please confirm. I am actually thinking of writing docker-compose files from scratch and running docker-compose commands using Ansible.
You can, but it's not really how Docker is designed to be used.
A Docker container is usually a wrapper around a single process. In the standard setup you create an image that has that application built and packaged, and you can just run it without any further setup. It's not usually interesting to run a bare Linux distribution container (which won't have an application installed) or to run an interactive shell as the main container process. Tutorials like Docker's Build and run your image walk through this sequence.
A corollary to this is that containers don't usually have any local state. In the best case any state a container needs is in an external database; if you can't do that then you store local state in a volume that outlives the container.
Finally, it's extremely routine to delete and recreate containers. You need to do this to change some common options; in a cluster environment like Kubernetes this can happen outside your control. When this happens the new container will restart running its default setup, and it won't know about any manual changes the previous container might have had.
So you don't usually want to try to install software directly in a running container, since that will get lost as soon as the container exits. You can, in principle, get a shell in a container (via docker exec) but this is more of a debugging tool than an administration tool. You could make the only process a container runs be an ssh daemon, but anything you start this way will get lost as soon as the container exits (and I've never seen a recipe that correctly and securely sets up credentials to access it).
I'd recommend learning the standard Dockerfile system and running self-contained Docker images over trying to adapt Ansible to this rather different environment.

How are Packer and Docker different? Which one should I prefer when provisioning images?

How are Packer and Docker different? Which one is easier/quicker to provision/maintain, and why? What are the pros and cons of having a Dockerfile?
Docker is a system for building, distributing and running OCI images as containers. Containers can be run on Linux and Windows.
Packer is an automated build system to manage the creation of images for containers and virtual machines. It outputs an image that you can then take and run on the platform you require.
For v1.8 this includes - Alicloud ECS, Amazon EC2, Azure, CloudStack, DigitalOcean, Docker, Google Cloud, Hetzner, Hyper-V, Libvirt, LXC, LXD, 1&1, OpenStack, Oracle OCI, Parallels, ProfitBricks, Proxmox, QEMU, Scaleway, Triton, Vagrant, VirtualBox, VMware, Vultr
Docker's Dockerfile
Docker uses a Dockerfile to manage builds which has a specific set of instructions and rules about how you build a container.
Images are built in layers. Each FROM, RUN, ADD, or COPY instruction modifies the layers included in an OCI image. These layers can be cached, which helps speed up builds. Each layer can also be addressed individually, which helps with disk usage and download usage when multiple images share layers.
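The caching behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Docker's actual algorithm: here a layer's identity is just a hash of its parent layer plus the instruction text, whereas real builds also take the contents of copied files into account. The build function and the cache set are hypothetical names.

```python
import hashlib


def build(instructions, cache):
    """Return (layer_ids, rebuilt) for a list of Dockerfile-like instructions."""
    parent = ""
    layer_ids, rebuilt = [], []
    for instruction in instructions:
        # Assumption: a layer is identified by its parent and its instruction.
        key = (parent + "\n" + instruction).encode()
        layer_id = hashlib.sha256(key).hexdigest()[:12]
        if layer_id not in cache:
            cache.add(layer_id)          # cache miss: this layer must be built
            rebuilt.append(instruction)
        layer_ids.append(layer_id)
        parent = layer_id
    return layer_ids, rebuilt


cache = set()
v1 = ["FROM ubuntu:15.04", "COPY . /app", "RUN make /app", "CMD python /app/app.py"]
v2 = v1[:3] + ["CMD python /app/main.py"]   # only the last instruction changed

_, first_build = build(v1, cache)
_, second_build = build(v2, cache)
print(len(first_build))   # 4: nothing was cached yet
print(second_build)       # ['CMD python /app/main.py']
```

Changing an earlier instruction changes the parent of everything after it, so in this model all later layers are rebuilt as well.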
Dockerfiles have a bit of a learning curve. It's best to look at some of the official Docker images for practices to follow.
Packer's Docker builder
Packer does not require a Dockerfile to build a container image. The docker plugin takes an HCL or JSON config file which starts the image build from a specified base image (like FROM).
Packer then allows you to run standard system config tools called "Provisioners" on top of that image. Tools like Ansible, Chef, Salt, shell scripts etc.
This image will then be exported as a single layer, so you lose the layer caching/addressing benefits compared to a Dockerfile build.
Packer allows some modifications to the build container environment, like running as --privileged or mounting a volume at build time, that Docker builds will not allow.
Times you might want to use Packer are if you want to build images for multiple platforms and use the same setup. It also makes it easy to use existing build scripts if there is a provisioner for it.
Expanding on "Which one is easier/quickest to provision/maintain and why? What are the pros and cons of having a Dockerfile?"
From personal experience learning and using both, I found: (YMMV)
docker configuration was easier to learn than packer
docker configuration was harder to coerce into doing what I wanted than packer
speed difference in creating the image was negligible, after development
docker was faster during development, because of the caching
the docker daemon consumed some system resources even when not using docker
there are a handful of processes running as the daemon
I did my development on Windows, though I was targeting Linux servers for running the images.
That isn't an issue during development, except for a foible of running Docker on Windows.
The docker daemon reserves various TCP port ranges for itself
The ranges might change every time you reboot your system or restart the daemon
The only error message is to the effect: can't use that port! but not why it can't
BTW, The workaround is to:
turn off Hypervisor
reboot
reserve the public ports you want your host system to see
turn on hypervisor
reboot
Running packer on Windows, however, the issue I found is that the provisioner I wanted to use, ansible, doesn't run on Windows.
Sigh.
So I ended up having to run packer on a Linux system after all.
Just because I was feeling perverse, I wrote a Dockerfile so I could run both packer and ansible from my Windows station in a docker container using that image.
Docker builds images using a Dockerfile.
These can be run (Docker containers).
Packer also builds images. But you don't need a Dockerfile. And you get the option of using Provisioners such as Ansible which lets you create vastly more customisable images. It isn't used for running these images.

Docker Storage - Getting a Layman's answer

I am just discovering Docker - I am finding so much information, but I can't seem to get a straight answer on this option. If someone could give me a clear explanation based on my understanding I have of it so far it would be appreciated.
I am downloading a docker image locally - say the default one from Microsoft, using microsoft/dotnet-samples:dotnetapp-nanoserver. I am lost as to where this is downloaded to. Is this downloaded and installed as a program on the host machine, with an isolated script that controls the container? The download is about 1.3 GB because it includes .NET Core.
In another example, if I download apache2 to run as a web server, does it install it in the default paths on the host system, with every container I want to use tapping into that, or does every container contain its own isolated version of apache2?
I ask this because I can't find files that mimic the file size of these programs.
I know they are not complete VM's but where can I find the files associated with a container?
I am using Windows Server 2016 and a Mac since I want to do some trials with containers.
An image is a filesystem
Docker images are encapsulated filesystems. The software and files inside are not being directly installed onto your system.
You can think of a Docker image sort of the way you think of a .zip file. You can download a .zip file from somewhere, and it is a single file. Contained inside it might be one file, or dozens of files, or a nested tree of directories and files. But on your disk, it exists as one file.
A Docker image is similar (conceptually, at least... the details are more complicated).
Image storage
Where images are stored varies by platform. On a Linux system, they are usually under /var/lib/docker. I don't know where they are stored on Windows, but this is a more or less opaque store. Poking around inside will not reveal very much to you anyway.
To see what you have, you should use the docker images command. It will show you the images you have stored locally.
Like I said earlier, each image may consist of multiple layers. By default, that command will only show you the top layer, which is the one you'll care about, to run containers from. Technically, there are other layers, and you can see all of them using docker images -a.
Where is the software installed?
When you download an Apache image, nothing is installed on your system at all. The image file(s) are downloaded and stored. Hiding inside is Apache and everything Apache needs in order to run, but Apache is not installed onto your Windows OS anywhere.
When you want to use Apache, you would run a container. Docker takes the Apache image and, using it as a starting template, creates a running process container, inside of which Apache is running. This is isolated from your operating system. Apache is only running inside of the container.
If you run a second container from the Apache image, you now have two completely separate Apache instances running, each in their own isolated filesystem environment.
Where can I find the files?
If you just want to poke around in the container filesystem, you can start the container in interactive mode, and run a shell instead of whatever it normally runs (like Apache). For instance, if you have an image apache:latest, you can do this:
docker run --rm -it apache:latest bash
This will run an instance of apache:latest, but instead of launching Apache, it will run a bash shell and drop you into it.
The --rm flag is convenient for cases like this. It tells Docker to remove the running container when its process exits. That way for a "just looking at something" container like this one, it cleans up after itself.
The -it is actually two flags. -i is interactive mode, and -t allocates a terminal. This is a common flag to pass when you want to directly interact with the container.
Once inside, you can use the usual commands to look at files and directory listings. Note that many containers are stripped-down, though. You don't always have all of the tools you are used to having. Things like ls in Linux are typically there, but a lot of things will not be.
Simply type exit when you're done looking around.
Looking around while the process is running
You can also look at the container while Apache is running. First start it normally.
docker run -d apache:latest
This will return a container ID. You can also get the ID from docker ps. Then you can attach to the container with that ID by executing a shell.
docker exec -it <container_id> bash
Now you're in the container in a shell, but Apache is in there running.

What is the difference between a Docker image and a container?

When using Docker, we start with a base image. We boot it up, create changes and those changes are saved in layers forming another image.
So eventually I have an image for my PostgreSQL instance and an image for my web application, changes to which keep on being persisted.
What is a container?
An instance of an image is called a container. You have an image, which is a set of layers as you describe. If you start this image, you have a running container of this image. You can have many running containers of the same image.
You can see all your images with docker images whereas you can see your running containers with docker ps (and you can see all containers with docker ps -a).
So a running instance of an image is a container.
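As a rough illustration of "many containers from one image", here is a small Python sketch. The Image and Container classes and the ps function are hypothetical stand-ins I made up; ps only mimics the difference between docker ps and docker ps -a described above.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)


@dataclass(frozen=True)
class Image:
    """An image: a named, immutable set of layers."""
    name: str
    layers: tuple


@dataclass
class Container:
    """A container: one instance started from an image."""
    image: Image
    running: bool = True
    id: int = field(default_factory=lambda: next(_ids))


def ps(containers, show_all=False):
    """Mimic `docker ps` (running only) and `docker ps -a` (everything)."""
    return [c for c in containers if show_all or c.running]


postgres = Image("postgres", ("base", "postgres-binaries"))
containers = [Container(postgres), Container(postgres), Container(postgres)]
containers[2].running = False   # one container has exited

print(len(ps(containers)))                 # 2
print(len(ps(containers, show_all=True)))  # 3
```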
From my article on Automating Docker Deployments (archived):
Docker Images vs. Containers
In Dockerland, there are images and there are containers. The two are closely related, but distinct. For me, grasping this dichotomy has clarified Docker immensely.
What's an Image?
An image is an inert, immutable, file that's essentially a snapshot of a container. Images are created with the build command, and they'll produce a container when started with run. Images are stored in a Docker registry such as registry.hub.docker.com. Because they can become quite large, images are designed to be composed of layers of other images, allowing a minimal amount of data to be sent when transferring images over the network.
Local images can be listed by running docker images:
REPOSITORY   TAG      IMAGE ID       CREATED        VIRTUAL SIZE
ubuntu       13.10    5e019ab7bf6d   2 months ago   180 MB
ubuntu       14.04    99ec81b80c55   2 months ago   266 MB
ubuntu       latest   99ec81b80c55   2 months ago   266 MB
ubuntu       trusty   99ec81b80c55   2 months ago   266 MB
<none>       <none>   4ab0d9120985   3 months ago   486.5 MB
Some things to note:
IMAGE ID is the first 12 characters of the true identifier for an image. You can create many tags of a given image, but their IDs will all be the same (as above).
VIRTUAL SIZE is virtual because it's adding up the sizes of all the distinct underlying layers. This means that the sum of all the values in that column is probably much larger than the disk space used by all of those images.
The value in the REPOSITORY column comes from the -t flag of the docker build command, or from docker tag-ing an existing image. You're free to tag images using a nomenclature that makes sense to you, but know that docker will use the tag as the registry location in a docker push or docker pull.
The full form of a tag is [REGISTRYHOST/][USERNAME/]NAME[:TAG]. For ubuntu above, REGISTRYHOST is inferred to be registry.hub.docker.com. So if you plan on storing your image called my-application in a registry at docker.example.com, you should tag that image docker.example.com/my-application.
The TAG column is just the [:TAG] part of the full tag. This is unfortunate terminology.
The latest tag is not magical, it's simply the default tag when you don't specify a tag.
You can have untagged images only identifiable by their IMAGE IDs. These will get the <none> TAG and REPOSITORY. It's easy to forget about them.
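The [REGISTRYHOST/][USERNAME/]NAME[:TAG] form from the notes above can be taken apart with a short Python sketch. This is my own simplified parser, not Docker's: parse_reference is a hypothetical helper, and the rule it uses to recognise a registry host (first component contains a dot or a colon, or is localhost) is an assumption about how the ambiguity is usually resolved.

```python
DEFAULT_REGISTRY = "registry.hub.docker.com"  # default named in the notes above
DEFAULT_TAG = "latest"


def parse_reference(ref):
    """Split [REGISTRYHOST/][USERNAME/]NAME[:TAG] into its parts."""
    # Only look for the tag in the last path component, so a port in the
    # registry host (host:5000/name) is not mistaken for a tag.
    head, _, last = ref.rpartition("/")
    name, colon, tag = last.partition(":")
    if not colon:
        tag = DEFAULT_TAG
    parts = head.split("/") if head else []
    registry = DEFAULT_REGISTRY
    # Assumption: a first component that looks like a host name is the registry.
    if parts and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry = parts.pop(0)
    username = "/".join(parts) or None
    return {"registry": registry, "username": username, "name": name, "tag": tag}


print(parse_reference("ubuntu"))
print(parse_reference("samalba/docker-registry:latest"))
print(parse_reference("docker.example.com/my-application"))
```

Note how ubuntu with no tag comes back with the tag latest, which matches the point that latest is just the default.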
More information on images is available from the Docker documentation and glossary.
What's a container?
To use a programming metaphor, if an image is a class, then a container is an instance of a class—a runtime object. Containers are hopefully why you're using Docker; they're lightweight and portable encapsulations of an environment in which to run applications.
View local running containers with docker ps:
CONTAINER ID   IMAGE                            COMMAND                CREATED        STATUS        PORTS                    NAMES
f2ff1af05450   samalba/docker-registry:latest   /bin/sh -c 'exec doc   4 months ago   Up 12 weeks   0.0.0.0:5000->5000/tcp   docker-registry
Here I'm running a dockerized version of the docker registry, so that I have a private place to store my images. Again, some things to note:
Like IMAGE ID, CONTAINER ID is the true identifier for the container. It has the same form, but it identifies a different kind of object.
docker ps only outputs running containers. You can view all containers (running or stopped) with docker ps -a.
NAMES can be used to identify a started container via the --name flag.
How to avoid image and container buildup
One of my early frustrations with Docker was the seemingly constant buildup of untagged images and stopped containers. On a handful of occasions this buildup resulted in maxed out hard drives slowing down my laptop or halting my automated build pipeline. Talk about "containers everywhere"!
We can remove all untagged images by combining docker rmi with the recent dangling=true query:
docker images -q --filter "dangling=true" | xargs docker rmi
Docker won't be able to remove images that are behind existing containers, so you may have to remove stopped containers with docker rm first:
docker rm `docker ps --no-trunc -aq`
These are known pain points with Docker and may be addressed in future releases. However, with a clear understanding of images and containers, these situations can be avoided with a couple of practices:
Always remove a useless, stopped container with docker rm [CONTAINER_ID].
Always remove the image behind a useless, stopped container with docker rmi [IMAGE_ID].
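The reason for removing stopped containers before their images can be sketched in Python. This is a toy check, not Docker's own logic: removable_images is a hypothetical helper and the IDs are invented placeholders.

```python
def removable_images(image_ids, containers):
    """Return the image IDs that no container (running or stopped) still uses."""
    in_use = {c["image_id"] for c in containers}
    return [image_id for image_id in image_ids if image_id not in in_use]


# Hypothetical IDs, for illustration only.
images = ["img-a", "img-b", "img-c"]
containers = [{"id": "ctr-1", "image_id": "img-b", "status": "exited"}]

# A stopped container still pins its image.
print(removable_images(images, containers))   # ['img-a', 'img-c']

# Once the stopped container is removed, its image can go too.
print(removable_images(images, []))           # ['img-a', 'img-b', 'img-c']
```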
While it's simplest to think of a container as a running image, this isn't quite accurate.
An image is really a template that can be turned into a container. To turn an image into a container, the Docker engine takes the image, adds a read-write filesystem on top and initialises various settings including network ports, container name, ID and resource limits. A running container has a currently executing process, but a container can also be stopped (or exited in Docker's terminology). An exited container is not the same as an image, as it can be restarted and will retain its settings and any filesystem changes.
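Here is a minimal Python sketch of that lifecycle, assuming a very simplified model: the Container class, its status strings and the file paths are all made up for illustration. It shows that an exited container keeps its filesystem changes when restarted, while the image it came from is never modified.

```python
class Container:
    """Toy model: an image template plus a read-write layer and a status."""

    def __init__(self, image_files):
        self.image_files = image_files   # read-only template from the image
        self.writable = {}               # read-write filesystem added on top
        self.status = "created"

    def start(self):
        self.status = "running"

    def stop(self):
        self.status = "exited"

    def write(self, path, data):
        if self.status != "running":
            raise RuntimeError("container is not running")
        self.writable[path] = data

    def read(self, path):
        # Prefer the container's own changes, then fall back to the image.
        return self.writable.get(path, self.image_files.get(path))


image_files = {"/etc/motd": "hello"}
c = Container(image_files)
c.start()
c.write("/tmp/note", "kept across restarts")
c.stop()
c.start()                      # restarting an exited container

print(c.read("/tmp/note"))     # kept across restarts
print(image_files)             # {'/etc/motd': 'hello'}
```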
Maybe explaining the whole workflow can help.
Everything starts with the Dockerfile. The Dockerfile is the source code of the image.
Once the Dockerfile is created, you build it to create the image of the container. The image is just the "compiled version" of the "source code" which is the Dockerfile.
Once you have the image of the container, you should redistribute it using the registry. The registry is like a Git repository -- you can push and pull images.
Next, you can use the image to run containers. A running container is very similar, in many aspects, to a virtual machine (but without the hypervisor).
Dockerfile → (Build) → Image → (Run) → Container.
Dockerfile: contains a set of Docker instructions that provisions your operating system the way you like, and installs/configure all your software.
Image: compiled Dockerfile. Saves you time from rebuilding the Dockerfile every time you need to run a container. And it's a way to hide your provision code.
Container: the virtual operating system itself. You can open a shell inside it (for example with docker exec) and run any commands you wish, as if it were a real environment. You can run 1000+ containers from the same Image.
Workflow
Here is the end-to-end workflow showing the various commands and their associated inputs and outputs. That should clarify the relationship between an image and a container.
+------------+  docker build   +--------------+  docker run -dt  +-----------+  docker exec -it   +------+
| Dockerfile | --------------> |    Image     | ---------------> | Container | -----------------> | Bash |
+------------+                 +--------------+                  +-----------+                    +------+
                                      ^
                                      | docker pull
                                      |
                               +--------------+
                               |   Registry   |
                               +--------------+
To list the images you could run, execute:
docker image ls
To list the containers you could execute commands on:
docker ps
I couldn't understand the concept of image and layer in spite of reading all the questions here and then eventually stumbled upon this excellent documentation from Docker (duh!).
The example there is really the key to understand the whole concept. It is a lengthy post, so I am summarising the key points that need to be really grasped to get clarity.
Image: A Docker image is built up from a series of read-only layers
Layer: Each layer represents an instruction in the image’s Dockerfile.
Example: The below Dockerfile contains four commands, each of which creates a layer.
FROM ubuntu:15.04
COPY . /app
RUN make /app
CMD python /app/app.py
Importantly, each layer is only a set of differences from the layer before it.
Container.
When you create a new container, you add a new writable layer on top of the underlying layers. This layer is often called the “container layer”. All changes made to the running container, such as writing new files, modifying existing files, and deleting files, are written to this thin writable container layer.
Hence, the major difference between a container and an image is
the top writable layer. All writes to the container that add new or
modify existing data are stored in this writable layer. When the
container is deleted, the writable layer is also deleted. The
underlying image remains unchanged.
Understanding images and containers from a size-on-disk perspective
To view the approximate size of a running container, you can use the docker ps -s command. You get size and virtual size as two of the outputs:
Size: the amount of data (on disk) that is used for the writable layer of each container
Virtual Size: the amount of data used for the read-only image data used by the container. Multiple containers may share some or all read-only image data. Hence these are not additive. I.e. you can't add all the virtual sizes to calculate how much size on disk is used by the image
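A quick worked example of why the virtual sizes are not additive, written as a Python sketch. The numbers are invented, and it follows the definitions given above (virtual size counted as the shared read-only image data); both function names are hypothetical.

```python
def shared_disk_usage(image_size_mb, writable_sizes_mb):
    """Disk actually used: the read-only image data is stored once and shared."""
    return image_size_mb + sum(writable_sizes_mb)


def naive_disk_usage(image_size_mb, writable_sizes_mb):
    """Wrong estimate: counts the image data once per container."""
    return len(writable_sizes_mb) * image_size_mb + sum(writable_sizes_mb)


# Made-up numbers: three containers started from one 266 MB image.
image_size_mb = 266
writable_sizes_mb = [2, 0, 5]

print(shared_disk_usage(image_size_mb, writable_sizes_mb))  # 273
print(naive_disk_usage(image_size_mb, writable_sizes_mb))   # 805
```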
Another important concept is the copy-on-write strategy
If a file or directory exists in a lower layer within the image, and another layer (including the writable layer) needs read access to it, it just uses the existing file. The first time another layer needs to modify the file (when building the image or running the container), the file is copied into that layer and modified.
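Copy-on-write is easier to see in code. The following Python sketch is a toy model under my own assumptions, not a real storage driver: layers are plain dicts ordered bottom to top, and find_layer, read and append are hypothetical helpers.

```python
def find_layer(layers, path):
    """Return the topmost layer that contains path, searching from the top down."""
    for layer in reversed(layers):
        if path in layer:
            return layer
    raise FileNotFoundError(path)


def read(layers, path):
    # Reads just use the existing file, wherever it lives.
    return bytes(find_layer(layers, path)[path])


def append(layers, path, data):
    top = layers[-1]
    if path not in top:
        # First modification: copy the file up into the writable layer.
        top[path] = bytearray(find_layer(layers, path)[path])
    top[path] += data


base = {"/etc/app.conf": bytearray(b"debug=0\n")}   # read-only image layer
writable = {}                                        # container layer
layers = [base, writable]

print(read(layers, "/etc/app.conf"))   # b'debug=0\n'
print(writable)                        # {} : reading copies nothing

append(layers, "/etc/app.conf", b"verbose=1\n")
print(read(layers, "/etc/app.conf"))   # b'debug=0\nverbose=1\n'
print(bytes(base["/etc/app.conf"]))    # b'debug=0\n' : lower layer untouched
```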
I hope that helps someone else like me.
Simply said, if an image is a class, then a container is an instance of that class: a runtime object.
A container is just an executable binary that is to be run by the host OS under a set of restrictions that are preset using an application (e.g., Docker) that knows how to tell the OS which restrictions to apply.
The typical restrictions are process-isolation related, security related (like using SELinux protection) and system-resource related (memory, disk, CPU, and networking).
Until recently, only kernels in Unix-based systems supported the ability to run executables under strict restrictions. That's why most container talk today involves mostly Linux or other Unix distributions.
Docker is one of those applications that knows how to tell the OS (Linux mostly) what restrictions to run an executable under. The executable is contained in the Docker image, which is just a tarfile. That executable is usually a stripped-down version of a Linux distribution's User space (Ubuntu, CentOS, Debian, etc.) preconfigured to run one or more applications within.
Though most people use a Linux base as the executable, it can be any other binary application as long as the host OS's kernel can run it (see creating a simple base image using scratch). Whether the binary in the Docker image is an OS User space or simply an application, to the OS host it is just another process, a contained process ruled by preset OS boundaries.
Other applications that, like Docker, can tell the host OS which boundaries to apply to a process while it is running, include LXC, libvirt, and systemd. Docker used to use these applications to indirectly interact with the Linux OS, but now Docker interacts directly with Linux using its own library called "libcontainer".
So containers are just processes running in a restricted mode, similar to what chroot used to do.
IMO, what sets Docker apart from any other container technology is its repository (Docker Hub) and their management tools which makes working with containers extremely easy.
See Docker (software).
The core concept of Docker is to make it easy to create "machines" which in this case can be considered containers. The container aids in reusability, allowing you to create and drop containers with ease.
Images depict the state of a container at every point in time. So the basic workflow is:
create an image
start a container
make changes to the container
save the container back as an image
As many answers pointed out: you build a Dockerfile to get an image, and you run an image to get a container.
However, following steps helped me get a better feel for what Docker image and container are:
1) Build Dockerfile:
docker build -t my_image dir_with_dockerfile
2) Save the image to .tar file
docker save -o my_file.tar my_image_id
my_file.tar will store the image. Open it with tar -xvf my_file.tar, and you will get to see all the layers. If you dive deeper into each layer you can see what changes were added in each layer. (They should be pretty close to commands in the Dockerfile).
3) To take a look inside of a container, you can do:
sudo docker run -it my_image bash
and you can see that is very much like an OS.
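If you would rather inspect the saved archive from step 2 programmatically, Python's standard tarfile module can list what is inside. This is only a sketch: list_members is a hypothetical helper, and because the exact layout of a docker save archive depends on the Docker version, the example runs against a tiny stand-in archive with placeholder member names instead of a real image.

```python
import io
import tarfile


def list_members(fileobj):
    """Return the member names of a tar archive opened from a file-like object."""
    with tarfile.open(fileobj=fileobj, mode="r") as archive:
        return archive.getnames()


def make_stand_in_archive():
    """Build a tiny in-memory tar so the sketch runs without Docker installed."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Placeholder names, not a claim about the real archive layout.
        for name, payload in [("manifest.json", b"[]"), ("layer-1/layer.tar", b"")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


print(list_members(make_stand_in_archive()))
```

For a real archive you would pass an opened file instead, for example open("my_file.tar", "rb").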
It may help to think of an image as a "snapshot" of a container.
You can make images from a container (new "snapshots"), and you can also start new containers from an image (instantiate the "snapshot"). For example, you can instantiate a new container from a base image, run some commands in the container, and then "snapshot" that as a new image. Then you can instantiate 100 containers from that new image.
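The snapshot idea can be sketched in Python as well. This is a toy model with made-up names (run, commit, the file paths): an image is a tuple of read-only layers, and snapshotting a container appends its changes as one more layer, which is roughly the role docker commit plays on the command line.

```python
def run(image):
    """Start a container: the image's layers plus an empty writable layer."""
    return {"image": image, "writable": {}}


def commit(container):
    """Snapshot a container as a new image: old layers plus its changes."""
    return container["image"] + (dict(container["writable"]),)


base = ({"/bin/sh": "shell"},)        # an image is a tuple of read-only layers
c = run(base)
c["writable"]["/usr/bin/python3"] = "interpreter"   # run some commands
python_image = commit(c)              # take the new "snapshot"

clones = [run(python_image) for _ in range(100)]    # 100 containers from it

print(len(base))           # 1: the base image is unchanged
print(len(python_image))   # 2: base layer plus the snapshot "diff"
print(len(clones))         # 100
```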
Other things to consider:
An image is made of layers, and layers are snapshot "diffs"; when you push an image, only the "diff" is sent to the registry.
A Dockerfile defines some commands on top of a base image, that creates new layers ("diffs") that result in a new image ("snapshot").
Containers are always instantiated from images.
Image tags are not just tags. They are the image's "full name" ("repository:tag"). If the same image has multiple names, it shows multiple times when doing docker images.
Image is an equivalent to a class definition in OOP and layers are different methods and properties of that class.
Container is the actual instantiation of the image just like how an object is an instantiation or an instance of a class.
I think it is better to start from the beginning.
Suppose you run the command docker run hello-world. What happens?
It calls the Docker CLI, which is responsible for taking Docker commands and transforming them into calls to the Docker server. As soon as the Docker server gets a command to run an image, it checks whether the image cache holds an image with that name.
Suppose hello-world does not exist there. The Docker server goes to Docker Hub (Docker Hub is just a free repository of images) and asks: hey Hub, do you have an image called hello-world?
The Hub responds: yes, I do. Then give it to me, please. And the download process starts. As soon as the Docker image is downloaded, the Docker server puts it in the image cache.
So before we explain what Docker images and Docker containers are, let's start with an introduction to the operating system on your computer and how it runs software.
When you run, for example, Chrome on your computer, it calls the operating system, and the operating system itself calls the kernel and asks: hey, I want to run this program. The kernel manages running files from your hard disk.
Now imagine that you have two programs, Chrome and Node.js. Chrome requires Python version 2 to run and Node.js requires Python version 3 to run. If you only have Python v2 installed on your computer, only Chrome will run.
To make both cases work, you need to use an operating system feature known as namespacing. A namespace is a feature which gives you the opportunity to isolate processes, hard drive, network, users, hostnames and so on.
So, when we talk about an image we actually talk about a file system snapshot. An image is a physical file which contains directions and metadata to build a specific container. The container itself is an instance of an image; it isolates the hard drive using namespacing which is available only for this container. So a container is a process or set of processes which groups different resources assigned to it.
A Docker image packs up the application and environment required by the application to run, and a container is a running instance of the image.
Images are the packing part of Docker, analogous to "source code" or a "program". Containers are the execution part of Docker, analogous to a "process".
In the question, only the "program" part is referred to and that's the image. The "running" part of Docker is the container. When a container is run and changes are made, it's as if the process makes a change in its own source code and saves it as the new image.
In programming terms:
The image is the source code.
When source code is compiled and built, it is called an application.
Similarly, when an instance is created from the image, it is called a "container".
I would like to fill the missing part here between docker images and containers. Docker uses a union file system (UFS) for containers, which allows multiple filesystems to be mounted in a hierarchy and to appear as a single filesystem. The filesystem from the image has been mounted as a read-only layer, and any changes to the running container are made to a read-write layer mounted on top of this. Because of this, Docker only has to look at the topmost read-write layer to find the changes made to the running system.
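Python's collections.ChainMap happens to behave a lot like this kind of layered view, so it makes a handy sketch. This is an analogy only, not how a union filesystem is implemented, and the layer contents are invented: lookups fall through the layers from top to bottom, and every write lands in the first (topmost) mapping.

```python
from collections import ChainMap

# Read-only image layers (made-up contents).
image_layer_1 = {"/bin/sh": "shell", "/etc/os-release": "distro"}
image_layer_2 = {"/app/app.py": "print('hi')"}
# Read-write layer mounted on top for the container.
container_layer = {}

# One merged view: lookups go left to right, writes go to the first mapping.
view = ChainMap(container_layer, image_layer_2, image_layer_1)

view["/app/app.py"] = "print('patched')"
view["/tmp/cache"] = "scratch"

print(view["/bin/sh"])               # shell: served from the lowest layer
print(view["/app/app.py"])           # print('patched'): top layer wins
print(sorted(container_layer))       # ['/app/app.py', '/tmp/cache']
print(image_layer_2["/app/app.py"])  # print('hi'): image layer untouched
```

Looking at container_layer alone is enough to see everything the container changed, which is the point made above about the topmost read-write layer.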
I would state it with the following analogy:
+-----------------------------+-------+-----------+
| Domain                      | Meta  | Concrete  |
+-----------------------------+-------+-----------+
| Docker                      | Image | Container |
| Object oriented programming | Class | Object    |
+-----------------------------+-------+-----------+
Docker Client, Server, Machine, Images, Hub, and Compose are all tools and pieces of software that come together to form a platform, or ecosystem, around creating and running something called containers. If you run the command docker run redis, the Docker CLI reaches out to Docker Hub and downloads a single file called an image.
Docker Image:
An image is a single file containing all the dependencies and all the configuration required to run a very specific program. For example, redis is the program that the image you just downloaded (by running the command docker run redis) is supposed to run.
This single file gets stored on your hard drive, and at some point in time you can use this image to create something called a container.
A container is an instance of an image. You can think of it as a running program with its own isolated set of hardware resources: its own little space of memory, its own little space of networking technology, and its own little space of hard drive as well.
Now let's examine what happens when you give the command below:
sudo docker run hello-world
The command above starts up the Docker client, or Docker CLI. The Docker CLI is in charge of taking commands from you, doing a little bit of processing on them, and then communicating them to something called the Docker server. The Docker server is in charge of the heavy lifting when we run the command docker run hello-world.
That command means that we want to start up a new container using the image named hello-world. The hello-world image has a tiny little program inside of it whose sole purpose is to print out the message that you see in the terminal.
When we ran that command and it was issued to the Docker server, a series of actions very quickly occurred in the background. The Docker server saw that we were trying to start up a new container using an image called hello-world.
The first thing the Docker server did was check whether it already had a local copy, that is, a copy on your personal machine, of the hello-world image. So the Docker server looked into something called the image cache.
Because you and I just installed Docker on our personal computers, that image cache is currently empty. We have no images that have been downloaded before.
Because the image cache was empty, the Docker server decided to reach out to a free service called Docker Hub. Docker Hub is a repository of free public images that you can download and run on your personal computer. So the Docker server reached out to Docker Hub, downloaded the hello-world file, and stored it on your computer in the image cache, where it can be re-run very quickly at some point in the future without having to download it from Docker Hub again.
After that, the Docker server uses it to create an instance of a container. We know that a container is an instance of an image whose sole purpose is to run one very specific program. So the Docker server essentially took that image file from the image cache, loaded it into memory to create a container out of it, and then ran a single program inside of it. That single program's purpose was to print out the message that you see.
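The sequence just described (check the image cache, pull on a miss, then run) can be summarised in a small Python sketch. Everything here is a stand-in I made up for illustration: DOCKER_HUB is a plain dict playing the role of the remote registry, and DockerServer is not the real daemon or its API.

```python
# Stand-in for the remote registry: image name -> what its program prints.
DOCKER_HUB = {"hello-world": "Hello from Docker!"}


class DockerServer:
    """Toy model of the server side of `docker run`."""

    def __init__(self):
        self.image_cache = {}   # images already downloaded to this machine
        self.pulls = 0          # how many times we had to go to the hub

    def run(self, image_name):
        if image_name not in self.image_cache:
            # Cache miss: download the image from the hub and keep it.
            self.image_cache[image_name] = DOCKER_HUB[image_name]
            self.pulls += 1
        # Create a container from the cached image and run its one program.
        return self.image_cache[image_name]


server = DockerServer()
print(server.run("hello-world"))   # first run: pulled, then executed
print(server.run("hello-world"))   # second run: served from the image cache
print(server.pulls)                # 1
```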
What a container is:
First of all an image is a blueprint for how to create a container.
A container is a process or a set of processes that has a grouping of resources specifically assigned to it. Any time we think about a container, we've got some running process that sends a system call to the kernel. The kernel looks at that incoming system call and directs it to a very specific portion of the hard drive, the RAM, the CPU, or whatever else it might need, and a portion of each of these resources is made available to that single process.
An image is to a class as a container to an object.
A container is an instance of an image as an object is an instance of a class.
In Docker, an image is an immutable file that holds the source code and information needed for a Docker app to run. It can exist independently of a container.
Docker containers are virtualized environments created during runtime and require images to run. The Docker website has an image that shows this relationship.
Just as an object is an instance of a class in an object-oriented programming language, so a Docker container is an instance of a Docker image.
For a dummy programming analogy, you can think of Docker as having an abstract ImageFactory which holds the image factories that come from the store.
Then once you want to create an app out of one of those factories, you will have a new container, and you can modify it as you want. DotNetImageFactory will be immutable, because it acts as an abstract factory class that only delivers the instances you desire.
IContainer newDotNetApp = ImageFactory.DotNetImageFactory.CreateNew(appOptions);
newDotNetApp.ChangeDescription("I am making changes on this instance");
newDotNetApp.Run();
In short:
Container is a division (virtual) in a kernel which shares a common OS and runs an image (Docker image).
A container is a self-sustainable application that will have packages and all the necessary dependencies together to run the code.
A Docker container is a running instance of an image. You can relate an image to a program and a container to a process :)
A Dockerfile is like a Bash script that produces a tarball (the Docker image).
Docker containers are like extracted versions of the tarball. You can have as many copies as you like in different folders (the containers).
An image is the blueprint from which containers (running instances) are built.
Long story short.
Docker Images:
The read-only file system and configuration of an application, used to create containers.
Docker Containers:
The major difference between a container and an image is the top writable layer. Containers are running instances of Docker images with top writable layer. Containers run the actual applications. A container includes an application and all of its dependencies. When the container is deleted, the writable layer is also deleted. The underlying image remains unchanged.
Other important terms to notice:
Docker daemon:
The background service running on the host that manages building, running, and distributing Docker containers.
Docker client:
The command line tool that allows the user to interact with the Docker daemon.
Docker Store:
The Store is, among other things, a registry of Docker images. You can think of the registry as a directory of all available Docker images.
A picture from this blog post is worth a thousand words.
Summary:
Pull an image from Docker Hub or build one from a Dockerfile => gives a Docker image (not editable).
Run the image (docker run image_name:tag_name) => gives a running image, i.e. a container (editable).
An image is like a class and a container is like an object of that class, so you can have any number of containers behaving like the image. A class is a blueprint that doesn't do anything on its own; you have to create instances of it in your program to do anything meaningful. The same is true of an image and a container: you define your image and then create containers running that image. The analogy isn't exact, because an object is an instance of a class, whereas a container is more like an empty, hollow place that you fill from the image to build up a running host with exactly what the image says.
An image, or container image, is a file that contains your application code, application runtime, configuration, and dependent libraries. The image basically wraps all of these into a single, secure, immutable unit. The image is built with the appropriate docker command (docker build). The image has an image ID and an image tag. The tag is usually in the format <docker-user-name>/image-name:tag.
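As a small illustration of that tag format, the following Python sketch splits a reference such as <docker-user-name>/image-name:tag into its repository and tag parts. The function name parse_image_ref and the sample names are my own, and the parsing is deliberately simplified: it does not handle digest references (the @sha256:... form) and is not how Docker itself parses references.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    repository: str  # e.g. "<docker-user-name>/image-name"
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'repository:tag' into its parts (simplified, hypothetical helper)."""
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    # Only a colon after the last "/" separates the tag; an earlier colon
    # belongs to a registry host:port prefix such as "localhost:5000/".
    if colon > last_slash:
        return ImageRef(ref[:colon], ref[colon + 1:])
    # Docker falls back to the "latest" tag when none is given.
    return ImageRef(ref, "latest")


print(parse_image_ref("alice/web-app:1.0"))
print(parse_image_ref("alice/web-app"))
print(parse_image_ref("localhost:5000/web-app"))
```

Here "alice" and "web-app" are placeholder names standing in for a Docker user name and an image name.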
When you start running your application using the image you actually start a container. So your container is a sandbox in which you run your image. Docker software is used to manage both the image and container.
An image is a secure package that contains your application artifact, libraries, configuration, and application runtime. A container is the runtime representation of your image.

Resources