Any significant concept difference between container and binder? - docker

I am new to Docker and confused about the concept of containerization. I wonder about the key difference between Docker container and Binder project. Here is the definition from Google search:
A Docker container is a standardized, encapsulated environment that runs applications.
A Binder or "Binder-ready repository" is a code repository that contains both code and content to run, and configuration files for the environment needed to run it.
Can anyone elaborate on it a bit? Thanks!

Your confusion is understandable. Docker itself is a lot to follow, and adding in Binder makes it even more complex if you look behind the curtain.
One big point to be aware of is that much of MyBinder.org's use by typical users is aimed at eliminating the need for those users to learn about Docker, Dockerfile syntax, the concept of a container, and so on. The idea of the configuration files you include to make your repository 'Binder-ready' is to let a container be built for you without your having to write a Dockerfile. You can more or less simply list the packages you need in requirements.txt or environment.yml and never deal with a Dockerfile, while still getting those dependencies pre-installed in the container you end up working in. environment.yml is a step up in complexity from requirements.txt, as the .yml file has a syntax to it, while requirements.txt at its most basic is simply a list. The container the user gets at the end of the launch is not readily apparent to a typical user: they go from launching a session to having the environment they specified in an active JupyterHub session on a remote computer.
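For example, a minimal environment.yml at the root of a Binder-ready repository might look roughly like this (the package names here are just placeholders for whatever your notebooks need):

channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas

The equivalent requirements.txt would simply list numpy and pandas, one per line.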
Binder combines quite a bit of tech, including Docker, to make things like MyBinder.org work. MyBinder.org is a public BinderHub, and a BinderHub is essentially a specialized JupyterHub running on a cluster that uses images to serve up containers to users.
You can point MyBinder.org at a repo and it will spin up a JupyterHub session with that content and an environment based on any configuration files in the repository. If there aren't any configuration files, you'll still get the content, but with a default Python stack.
Binder uses repo2docker to take a repository and make it into something that can work with docker. You can run repo2docker yourself locally on your own machine and use the produced image to spawn a running container if you want.
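For instance, a rough sketch of trying repo2docker locally (assuming you have Docker and pip available; the repository URL below is just a placeholder):

pip install jupyter-repo2docker
repo2docker https://github.com/some-user/some-repo

This builds an image from the repository's configuration files and launches a Jupyter server in a container based on it.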
The built images specify the environment backing the JupyterHub session you get from MyBinder.org. In fact, the session served from MyBinder.org is a running Docker container on a Kubernetes cluster.

Related

Is it possible to manage Dockerfile for a project externally
So instead of ProjectA/Dockerfile and ProjectB/Dockerfile, can we do something like project-deploy/Dockerfile.ProjectA and project-deploy/Dockerfile.ProjectB, which somehow will know how to build the ProjectA and ProjectB Docker images?
We would like to allow separation of the developer and DevOps roles.
Yes this is possible, though not recommended (I'll explain why in a second). First, how you would accomplish what you asked:
Docker Build
The command to build an image in its simplest form is docker build . which performs a build with a build context pulled from the current directory. That means the entire current directory is sent to the docker service, and the service will use it to build an image. This build context should contain all of the local resources docker needs to build your image. In this case, docker will also assume the existence of a file called Dockerfile inside of this context, and use it for the actual build.
However, we can override the default behavior by specifying the -f flag in our build command, e.g. docker build -f /path/to/some.dockerfile . This command uses your current directory as the build context, but reads the Dockerfile from the path you specify, which can live elsewhere.
So in your case, let's assume the code for ProjectA lives in the directory project-a and the deployment files live in project-deploy. You can build and tag your Docker image as project-a:latest like so:
docker build -f project-deploy/Dockerfile.ProjectA -t project-a:latest project-a/
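For illustration, project-deploy/Dockerfile.ProjectA might look something like the sketch below, assuming ProjectA happens to be a Go module (the base image and paths are hypothetical). Note that COPY paths are resolved relative to the build context, project-a/, not relative to where the Dockerfile itself sits:

# project-deploy/Dockerfile.ProjectA (hypothetical sketch)
FROM golang:1.22
WORKDIR /src
# "." is the build context passed on the command line, i.e. project-a/
COPY . .
RUN go build -o /usr/local/bin/project-a .
CMD ["project-a"]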
Why this is a bad idea
There are many benefits to using containers over traditional application packaging strategies. These benefits stem from the extra layer of abstraction that a container provides. It enables operators to use a simple and consistent interface for deploying applications, and it empowers developers with greater control and ownership of the environment their application runs in.
This aligns well with the DevOps philosophy, increases your team's agility, and greatly alleviates operational complexity.
However, to enjoy the advantages containers bring, you must make the organizational changes to reflect them, or all you're doing is making things more complex and further separating operations from development:
If your operators are writing your dockerfiles instead of your developers, then you're just adding more complexity to their job with few tangible benefits;
If your developers are not in charge of their application environments, there will continue to be conflict between operations and development, accomplishing basically nothing for them either.
In short, Docker is a tool, not a solution. The real solution is to make organizational changes that empower and accelerate the individual with logically consistent abstractions, and docker is a great tool designed to complement that organizational change.
So yes, while you could separate the application's environment (the Dockerfile) from its code, it would be in direct opposition to the DevOps philosophy. The better solution is to treat the Dockerfile as an application resource and keep it in the application project, and let operational configuration (like environment variables and secrets) be handled through Docker's support for volumes and variables.
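For instance, the image can stay with the application while operators supply configuration at run time; the variable names, mount path, and image tag below are hypothetical:

docker run -d \
  -e DATABASE_URL=postgres://db.internal:5432/app \
  -v /etc/project-a/secrets:/run/secrets:ro \
  project-a:latest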

Intro to Docker for FreeBSD Jail User - How and should I start the container with systemd?

We're currently migrating our room server to the cloud for reliability, but our provider doesn't offer FreeBSD. Although I'm prepared to pay and upload a custom system image for deployment, I nonetheless want to learn how to start an application system instance using Docker.
In a FreeBSD jail, what I did was extract an entire base.txz directory hierarchy as system content into /usr/jail/app, and run pkg -r /usr/jail/app install apache24 php perl; then I configured /etc/jail.conf to start the /etc/rc script in the jail.
I followed the official FreeBSD Handbook, and this is generally what I've worked out so far.
But Docker is another world entirely.
To build a Docker image, there are two options: (a) import from a tarball, or (b) use a Dockerfile. The latter lets you specify a CMD, the default command to run, but:
Q1. Why isn't it available from (a)?
Q2. Where is information like CMD and ENV stored? In the image? In the container?
Q3. How do I start a GNU/Linux system in a container? Do I just run systemd and let it figure out the rest from its configuration? Do I need to pass it some special arguments or environment variables?
You should think of a Docker container as packaging around a single running daemon. The ideal Docker container runs one process and one process only. Systemd in particular is so heavyweight and invasive that it's actively difficult to run inside a Docker container; if you need multiple processes in a container then a lighter-weight init system like supervisord can work for you, but that's usually the exception rather than the standard packaging.
Docker has an official tutorial on building and running custom images which is worth a read through; this is a pretty typical use case for Docker. In particular, best practice is to write a Dockerfile that describes how to build an image and check it into source control. Containers should avoid having persistent data if they can (storing everything in an external database is ideal); if you change an image, you need to delete and recreate any containers based on it. If local data is unavoidable then either Docker volumes or bind mounts will let you keep data "outside" the container.
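For example, a minimal Dockerfile for the Apache/PHP part of your stack might look roughly like this, building on the official php image rather than extracting a full OS tree (the source path is a placeholder):

FROM php:8.2-apache
# copy your site's source into Apache's document root
COPY ./src/ /var/www/html/
# the base image already defines the CMD that starts Apache in the foreground

Check it into the repository next to the code it packages, and rebuild the image whenever the code or its dependencies change.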
While Docker has several other ways to create containers and images, none of them are as reproducible. You should avoid the import, export, and commit commands; use save and load only if you can't use or set up a Docker registry and are forced to move images between systems via a tar file.
On your specific questions:
Q1. I suspect the reason the non-docker-build paths for creating images don't easily let you specify things like CMD is just an implementation detail: if you look at the docker history of an image, you'll see the CMD winds up being its own layer. Don't worry about it and use a Dockerfile.
Q2. The default CMD, any set ENV variables, and other related metadata are stored in the image alongside the filesystem tree. (Once you launch a container, it has a normal Unix process tree, with the initial process being pid 1.)
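You can see this metadata with docker image inspect; for example (using the official nginx image just as a stand-in):

docker image inspect --format '{{.Config.Cmd}} {{.Config.Env}}' nginx

prints the image's default command and its baked-in environment variables.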
Q3. You don't "start a system" in a container. Generally you run one process or service per container and manage their lifecycles independently.

Should I create multiple Dockerfiles for parts of my webapp?

I cannot get the idea of connecting parts of a webapp via Dockerfiles.
Say I need a Postgres server, a Golang compiler, an nginx instance, and something else.
I want to have a Dockerfile that describes all these dependencies and which I can deploy somewhere, then create an image and run a container from it.
Is it correct that I can put everything in one Dockerfile or should I create a separate Dockerfile for each dependency?
If I need to create a Dockerfile for each dependency, what's the correct way to create a merged image from them all and make all the parts work inside one container?
The current best practice is to have a single container perform one function. This means that you would have one container for nginx and another for your app. Each could be defined by its own Dockerfile. Then to tie them all together, you would use docker-compose to define the dependencies between them.
A Dockerfile defines a Docker image: one Dockerfile for each image you build and push to a Docker registry. There are no rules as to how many images you manage, but each image does take effort to maintain.
You shouldn't need to build your own Docker images for things like Postgres, Nginx, Golang, and so on, as there are many official images already published. They are configurable, easy to consume, and can often be run with just a CLI command.
Go to the page for a Docker image and read the documentation. It usually explains what mounts it supports, what ports it exposes, and what you need to do to get it running.
Here's nginx:
https://hub.docker.com/_/nginx/
You use docker-compose to connect multiple Docker images together. It makes it easy to bring up an entire server stack with a single docker-compose up command.
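As a rough sketch (the service names, app image tag, and ports below are made up), a docker-compose.yml tying the pieces from your question together might look like:

services:
  db:
    image: postgres:16
    environment:
      - POSTGRES_PASSWORD=example
  app:
    image: my-golang-app:latest   # hypothetical image built from your own Dockerfile
    depends_on:
      - db
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    depends_on:
      - app

Then docker-compose up -d starts the whole stack.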
Explaining how to use docker-compose is like trying to explain how to use Docker. It's a big topic, but I'll address the key point of your question.
Say, I need Postgres server, Golang compiler, nginx instance and something else. I want to have a Dockerfile that describes all these dependencies and which I can deploy somewhere, then create an image and run a container from it.
No, you don't describe those things with a Dockerfile. Here's the problem in trying to answer your question: you might not need a Dockerfile at all!
Without knowing the specific details of what you're trying to build, we can't tell you whether you need your own Docker images, or how many.
You can, for example, deploy a running LAMP stack using nothing but published Docker images from Docker Hub. You would just mount the folder with your PHP source code and you're done.
So the key here is that you need to learn how to use docker-compose. Only after learning what it cannot do will you know what work is left for you to fill in the gaps.
It's better to come back to Stack Overflow with specific questions like "how do I run the Golang compiler on the command line via Docker".
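For that particular example, the pattern documented for the official golang image is roughly this (the version tag and output name are placeholders):

docker run --rm -v "$PWD":/usr/src/myapp -w /usr/src/myapp golang:1.22 go build -v -o myapp .

which compiles the code in your current directory without installing Go on the host.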

How should I create Dockerfile to run multiple services through docker-compose?

I'm new to Docker. I wanted to create a Dockerfile to start services like RabbitMQ, an FTP server, and Elasticsearch, but I'm not sure where to start.
I have asked a similar question here: How should I create a Dockerfile to run more than one service in one instance?
There I was told to create different containers: one for RabbitMQ, one for the FTP server, and another for Elasticsearch, and to run them using a docker-compose file. You'll find the Dockerfile code I created there.
It will be great if someone can help me out with this thing. Thanks!
They are correct. Each container, and by extension each image, should be responsible for one concern, and that typically maps to a single process. So if you need to run more than one thing (generally speaking, more than one process), then you will most probably need to build separate images. One of the easiest and recommended ways of creating an image is writing a Dockerfile. This is expected to be an extremely simple process, and most of it will be a copy-paste of the same commands you would have used to install that component manually.
Once you write the Dockerfiles for each service, you build them using the docker build command, which produces the images.
When you run an image, you get what is known as a container. Think of it roughly like this: the ISO file is the image, and the actual running VM is the container.
Now you can use docker-compose to orchestrate these various containers so they can communicate with (or be isolated from) each other. A docker-compose.yml file is a plain-text file in YAML format that describes the relationships between the different components of the app. Apps can be made up of several services - like a web server, app server, search engine, database server, cache engine, and so on. Each of these is a service and runs as a container, but it is not necessary to run everything as a container; some can keep running in the traditional way, on VMs or on bare-metal servers.
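In your case, a first sketch of a docker-compose.yml could pull the official RabbitMQ and Elasticsearch images directly; the tags and settings here are illustrative, not prescriptive:

services:
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"
  elasticsearch:
    image: elasticsearch:7.17.10
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"

As far as I know there is no official FTP image, so that is one place where you may actually need to write your own Dockerfile.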
I'll check your other post and add if there is anything needed. But I hope this helps you get started at least.

Rebuild container after each change?

The Docker documentation suggests using the ONBUILD instruction if you have the following scenario:
For example, if your image is a reusable python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. You can't just call ADD and RUN now, because you don't yet have access to the application source code, and it will be different for each application build. You could simply provide application developers with a boilerplate Dockerfile to copy-paste into their application, but that is inefficient, error-prone and difficult to update because it mixes with application-specific code.
Basically, this all sounds nice and good, but that does mean that I have to re-create the app container every single time I change something, even if it's only a typo.
This doesn't seem to be very efficient, e.g. when creating web applications, where you are used to changing something, saving, and hitting refresh in the browser.
How do you deal with this?
does mean that I have to re-create the app container every single time I change something, even if it's only a typo
Not necessarily: you could use the -v option of the docker run command to mount your project files into the container, so you would not have to rebuild the Docker image.
Note that the ONBUILD instruction is meant for cases where one Dockerfile builds FROM a parent image. The ONBUILD instructions found in the parent image's Dockerfile are run when Docker builds an image from the child Dockerfile.
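As a rough sketch of how that works (the image names are hypothetical): a parent Dockerfile for a reusable Python app builder might contain

FROM python:3.11
WORKDIR /usr/src/app
ONBUILD COPY requirements.txt .
ONBUILD RUN pip install -r requirements.txt
ONBUILD COPY . .

and an application's child Dockerfile can then be as short as

FROM my-python-builder
CMD ["python", "app.py"]

The three ONBUILD steps run at the point the child image is built, not when the parent is built.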
This doesn't seem to be very efficient, e.g. when creating web applications, where you are used to changing something, saving, and hitting refresh in the browser.
If you are using a Docker container to serve a web application while you are iterating on that application's code, then I suggest you make a special Docker image which contains everything needed to run your app except the app code itself.
Then share the directory on your host machine that contains your app code with the directory from which the application files are served inside the Docker container.
For instance, if I'm developing a static web site and my workspace is at /home/thomas/workspace/project1/, then I would start a container running nginx with:
docker run -d -p 80:80 -v /home/thomas/workspace/project1/:/usr/share/nginx/html:ro nginx
That way I can change files in /home/thomas/workspace/project1/ and the changes are reflected live without having to rebuild the docker image or even restart the docker container.

Resources