What's the difference between Docker and Chef's new Habitat tool? - docker

Does Chef's new Habitat tool somehow work with Docker? If so, what problem is Habitat trying to solve or is it just trying to replace tools in the Docker toolset (e.g., Docker Swarm, Docker Machine, Docker Compose, etc.)?

This is skirting the limits of StackOverflow's policy on open-ended questions, but I'll answer anyway:
Docker and Habitat don't really overlap much. The main point of competition is on building release artifacts. Docker has Dockerfiles and docker build, Habitat has plans and the Studio. The output of both can be a Docker image though, which is basically a tarball of a filesystem along with some metadata. Habitat is aimed more at building super minimal artifacts, i.e. not including a Linux distro of any kind, no package manager, just statically compiled executable code and whatever support files you need for that specific app.
As for runtime, they are 100% orthogonal. Docker is a way to run a process inside a bunch of Linux security features collectively called a "container" now. Habitat is a little stub that surrounds your process and handles things like runtime config distribution, secrets transfer, and service discovery. Those features overlap more with higher-level tools like Kube, but even there only barely. You need something to actually start hab-sup, which could be docker run (possibly via Swarm), Nomad, Kube, or even a non-container system like Upstart or Runit if you wanted to. The only interaction point is that those tools all start an entrypoint process, and hab-sup is a generic entrypoint process that gives whatever app it runs underneath some cool features if it wants to use 'em.
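As a hedged illustration (the package and image names below are invented, and the Habitat commands are to the best of my knowledge), the Supervisor is just an entrypoint that something else launches:

# Run a Habitat-exported Docker image; its entrypoint is the Supervisor (hab-sup)
docker run -it yourorg/yourapp

# Or skip containers entirely: start the Supervisor on a plain host
# and load a service package into it
hab sup run
hab svc load yourorg/yourapp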

Related

Is docker an infrastructure as code technology because it virtualizes an OS to handle multiple workloads on a single OS instance?

I have come across the term IaC many times while learning DevOps, and when I googled it I found that it is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. So is Docker also an infrastructure as code technology because it virtualizes an OS to handle multiple workloads on a single OS instance? Thanks in advance
I'm not sure exactly what you are asking, but Docker provides infrastructure as code because Docker's functionality is defined via Dockerfiles and shell scripts. You don't install a list of programs manually when defining an image. You don't configure anything with a GUI in order to create an environment when you pull an image from Docker Hub or when you deploy your own image.
And as said in another answer, Docker is not virtualization: everything is actually running in your Linux kernel, but with limited resources in its own namespace. You can see a container's processes via htop on the host machine, for instance. There's no hypervisor and essentially no overhead.
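A quick way to see this for yourself (the container name and image here are just an example):

# start any container on the host
docker run -d --name web nginx
# the nginx processes appear in the host's own process table,
# because they are ordinary processes in isolated namespaces
ps aux | grep nginx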
I think you have misunderstood the concept: Docker is not a hypervisor, and containers are not VMs.
From this page: https://www.docker.com/resources/what-container
A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.
Container images become containers at runtime and in the case of Docker containers - images become containers when they run on Docker Engine.
Containers are an abstraction at the app layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as isolated processes in user space.

Why do we build "inside" docker?

When I first learned Docker I expected a config file, image producer, CLI, and options for mounting and networks. That's all there.
I did not expect to put build commands inside a Dockerfile. I thought docker would wrap/tar/include a prebuilt task I made. Why give build commands in Docker?
Surely it can import a prebuilt task, thus keeping Jenkins/Bazel etc. distinct and apart from making an image/container?
I guess we are dealing with a misconception here. Docker is NOT a lightweight version of VMware/Xen/KVM/Parallels/FancyVirtualization.
Disclaimer: The following is heavily simplified for the sake of comprehensibility.
So what is Docker?
In one sentence: Docker is a system to isolate processes from the other processes within an operating system as much as possible while still providing all means to run them. Put differently:
Docker is a package manager for isolated processes.
Among its closest ancestors are chroot and BSD jails. What those basically do is isolate (more in the case of BSD jails, less in the case of chroot) a part of your OS resources and have a complete environment running independently from the rest of the OS - except for the kernel.
In order to be able to do that, a Docker image obviously needs to contain everything except for a kernel. So you need to provide a shell (if you choose to do so), standard libraries like glibc and even resources like CA certificates. For reference: in order to set up chroot jails, you once did all this by hand, preinstalling your chroot environment with each and every piece of software required. Docker basically takes that heavy lifting off your hands.
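For example, a minimal sketch of providing such resources yourself (Alpine package names assumed; other distros differ):

FROM alpine
# the base image brings a libc and a shell; CA certificates we add explicitly
RUN apk add --no-cache ca-certificates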
The mentioned isolation, even down to the installed (and usable) software, sounds cumbersome, but it gives you several advantages as a developer. Since you provide basically everything except for a (compatible) kernel, you can develop and test your code in the same environment it will run in later down the road. Not a close approximation, but literally the same environment, bit for bit. A rather famous proverb in relation to Docker is:
"Runs on my machine" is no excuse any more.
Another advantage is that you can add static resources to your Docker image and access them via quite ordinary file system semantics. While it is true that you can do that with virtualisation images as well, they usually do not come with a language for provisioning. Docker does - the Dockerfile:
FROM alpine
LABEL maintainer="you@example.com"
COPY file/in/host destination/on/image
Ok, got it, now why the build commands?
As described above, you need to provide all dependencies (and transitive dependencies) your application has. The easiest way to ensure that is to build your application inside your Docker image:
FROM somebase
RUN yourpackagemanager install long list of dependencies && \
    make yourapplication && \
    make install
If the build fails, you know you have missing dependencies. Now you can tweak and tune your Dockerfile until it compiles and is tested. Once your Docker image is finished, you can confidently distribute it, since you know that as long as the Docker daemon runs on the machine somebody tries to run your image on, your image will run.
In the Go ecosystem, you basically make sure your go.mod and go.sum are up to date and working, and your build stays reproducible.
Again, this works with virtualisation as well, so what is the deal?
A (good) docker image only runs what it needs to run. In the vast majority of docker images, this means exactly one process, for example your Go program.
Side note: It is very bad practice to run multiple processes in one Docker container, say your application and a database server and a cache and whatnot. That is what docker-compose is there for, or more generally container orchestration. But this is far too big of a topic to explain here.
A virtualised OS, however, needs to run a kernel, a shell, drivers, log systems and whatnot.
So the deal basically is that you get all the good stuff (isolation, reproducibility, ease of distribution) with less waste of resources (running 5 versions of the same OS with all its shenanigans).
Because we want to have an environment for reproducible builds. We don't want to depend on the version of the language, the existence of a compiler, the versions of libraries and so on.
Building inside a Dockerfile allows you to have all the tools and environment you need inside, independently of your platform and ready to use. From a development perspective, it is easier to have everything you need inside the container.
But you have to think about the objective of building inside a Dockerfile. If you have a very complex build process with a lot of dependencies, you have to worry about having all the tools inside, and that is reflected in the final size of your resulting image, because building to generate an artifact is not the same as building to produce the final container.
Thinking about these two aspects, you should learn to use Docker's multi-stage build process. The main idea is close to your question: you can have as many stages as you need depending on your build process and use different FROM images to ensure you have the correct requirements and dependencies at each stage, to finally generate an image with the minimum dependencies and a smaller size.
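A minimal multi-stage sketch for a Go program (image tags and paths are illustrative, not prescriptive):

# Build stage: full Go toolchain
FROM golang:1.21 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: only the compiled binary, no compiler or build tools
FROM alpine
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]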
I'll add to the answers above:
Doing builds in or out of docker is a choice that depends on your goal. In my case I am more interested in docker containers for kubernetes, and in addition we have mature builds already.
This link shows how you can take prebuilt tasks and add them to an image. This strategy, together with adding libs, env, etc., leverages Docker well and shows that Docker is indeed flexible. https://medium.com/@chemidy/create-the-smallest-and-secured-golang-docker-image-based-on-scratch-4752223b7324

Can I run a docker container doing an x86 build on an IBM Power system?

Our build setup is baked into a large docker container (basically a 2 GB image coming with a complete x86 Linux in itself).
We have two ways to actually build: the official approach is a Jenkins environment (running on x86 hardware). But we also have a little "side x86 server" running RH 7. Developers can log into that RH server and kick off specific builds (using said docker images) themselves.
Those RH servers will be shut down at some point, to be replaced with IBM Power8 machines (running RH 7 Little Endian for Power).
I am simply wondering: is there a chance that our existing build setup and docker images will simply work on Power8? Or are there fundamental technical issues that make it unlikely and not even worth trying?
You can probably use your existing build methodology and scripts close to unchanged, but you'll need to rebuild the actual images.
You can't directly run x86 binaries on Power (at a very low level, the bytes of machine code are just different). Docker doesn't contain any sort of virtualization layer; it does a bunch of setup to isolate the container from the host, but then runs the binaries in an image directly.
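You can check what architecture an image was built for before trying to run it (the image name here is just an example):

docker image inspect --format '{{.Os}}/{{.Architecture}}' your-build-image:latest
# prints e.g. linux/amd64 for an x86-64 image, linux/ppc64le for Power8 little endian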
If your Jenkins setup has enough parameters for image names and version tags, then you should be able to run the x86 and Power setups side-by-side; you need to encode the architecture somewhere in the built image name or tag; for instance, repo.example.com/app/build:20180904-power. (I don't know that one or the other is considered better if you control all of the machinery.) If you have a private repo, you could encode it earlier in the path, winding up with image names like repo.example.com/power/build:20180904.
You'd need to double-check that everywhere that has a Docker image reference has it correctly parameterized (which is a good practice anyways). That would include any direct docker run commands; any Docker Compose or Kubernetes YAML files or similar artifacts; and the FROM line of any Dockerfiles.
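A rough sketch of what that parameterization could look like in a build script (the repository name and tags are hypothetical):

# set per build agent: ppc64le on the Power machines, amd64 on the x86 ones
ARCH=ppc64le
docker build -t repo.example.com/app/build:20180904-$ARCH .
docker push repo.example.com/app/build:20180904-$ARCH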
Existing build setup? Not sure!
Docker images? NO, don’t even try.
Docker images are actually multiple layers which are stored on the filesystem through the corresponding storage driver and backing filesystem (shown in the output of docker info).
If the storage driver/backing filesystem has changed, which will likely be the case when the OS changes, older docker images may no longer be valid, meaning they must be rebuilt for sure.

What components should be "containerized" - Docker

I am exploring the use of containers in a new application and have looked at a fair amount of content and created a sandbox environment to explore Docker and containers. My struggle is more in understanding which components need to be containerized individually vs. bundling multiple components into my own container, and what points to consider when architecting this.
Example:
I am building a python back end service to be executed via webservice call.
The service would interact with both Mongo DB, and RabbitMQ.
My questions are:
Should I run an individual OS container (e.g. Ubuntu), Python container, MongoDB container, RabbitMQ container, etc.? Combined they all form part of my application, and by decoupling everything I have the ability to scale individually?
How would I be able to bundle/link these for deployment without losing the benefits of decoupling/decomposing into individual containers?
Is an OS and Python container actually required, as this will all be running on an OS with Python anyway?
I would love to see how people have approached this problem.
Docker's philosophy: using microservices in containers. The term "Microservice Architecture" has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services.
Some advantages of microservices architecture are:
Easier upgrade management
Eliminates long-term commitment to a single technology stack
Improved fault isolation
Makes it easier for a new developer to understand the functionality of a service
Improved Security
...
Should I run an individual OS container (e.g. Ubuntu), Python container, MongoDB container, RabbitMQ container, etc.? Combined they all form part of my application, and by decoupling everything I have the ability to scale individually?
You don't need an individual OS container. Each container will use the Docker host's kernel and will contain only the binaries required, Python binaries for example.
So you will have a Python container for your Python service, a MongoDB container and a RabbitMQ container.
How would I be able to bundle/link these for deployment without losing the benefits of decoupling/decomposing into individual containers?
For deployments, you will use Dockerfiles plus a docker-compose file. Dockerfiles include the instructions to create a Docker image. If you are just using official library images, you don't need Dockerfiles.
docker-compose will help you orchestrate the container builds (from Dockerfiles), start-ups, creating the required networks, mounting the required volumes, and so on.
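A minimal docker-compose sketch along those lines (service names, image tags and ports are illustrative):

version: "3"
services:
  api:
    build: ./api               # your Python service, built from its own Dockerfile
    ports:
      - "8000:8000"
    depends_on:
      - mongo
      - rabbitmq
  mongo:
    image: mongo:4             # official image, no Dockerfile needed
    volumes:
      - mongo-data:/data/db
  rabbitmq:
    image: rabbitmq:3-management
volumes:
  mongo-data: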

Does Docker reduce or mitigate the need for Puppet/Chef et al?

I'm not au fait with any of these technologies (embarrassing really), but at my present gig, the company badly needs to automate.
So as I begin to read up on Puppet, Chef and PowerShell DSC, I then remember that Docker and containerisation are coming to Windows.
Does Docker do away with the need for these tools, or do they work together?
I understand that Docker uses virtualisation technology in the OS, so I get the feeling that Docker solves a different problem, and a configuration tool is still needed but I've no certain, practical knowledge.
Does Docker do away with the need for these tools, or do they work together?
They work together: provisioning and containerization solve different issues, and you actually can provision docker containers themselves with a provisioning tool.
See for instance "Docker: Using Puppet"
Tools like Chef & Puppet are important for configuration, but they do have one weakness that Docker helps to shore up. They are not always fully idempotent (hype notwithstanding). In other words, running Chef twice on the same virtual machine may cause unexpected and hard-to-find changes on that machine, and you'd be restoring a backup to get to a known good state.
By contrast, a Docker deployment involves building an entirely new image and swapping it out with your old image. Rollback involves simply unswapping them and comparing them to diagnose the problems in the new image.
Note that you still might very well use Chef to build your Docker container. But you might very well not. Since containers are supposed to run just one process in a particular way, I've found that a series of simple shell commands is way preferable to the overhead entailed by Chef.
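For instance, the kind of "series of simple shell commands" meant here, as opposed to a full Chef run, might look like this (the package and file names are made up):

FROM ubuntu:22.04
# a handful of plain shell commands instead of a configuration management tool
RUN apt-get update && \
    apt-get install -y --no-install-recommends myapp-runtime && \
    rm -rf /var/lib/apt/lists/*
COPY myapp.conf /etc/myapp/myapp.conf
CMD ["myapp"]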
In short, no, you don't need anything like Chef or Puppet. Of course you can use them if you like, but they're not required.
If you build your system in such a way that everything is containerized, then all you need is a tiny OS like CoreOS or Atomic.
So you just configure your VM via cloud-config if needed and deploy your containers either with cloud-config or the Docker CLI itself. The idea is that your machines should have a static state: they can be created whenever you want a new one and destroyed when you don't need them.
There are other tools that can help with Docker orchestration, which is another story by itself.
Tools like Swarm, Kubernetes and Mesosphere.
docker-machine is also very helpful for development purposes (maybe deployment too).
Here is a CoreOS example:
https://coreos.com/os/docs/latest/cloud-config.html
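Such a cloud-config could look roughly like this (the unit and image names are invented; see the linked docs for the real options):

#cloud-config
coreos:
  units:
    - name: myapp.service
      command: start
      content: |
        [Unit]
        Description=My application container
        After=docker.service
        Requires=docker.service

        [Service]
        ExecStartPre=-/usr/bin/docker rm -f myapp
        ExecStart=/usr/bin/docker run --name myapp yourorg/myapp:latest
        ExecStop=/usr/bin/docker stop myapp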
Resource: I do it in production for different apps.
UPDATE:
BTW, Docker is not only a virtualization technology. It does some sort of containerization (you can call it virtualization too), and that's only a small part of what Docker can do. Docker can configure, build, ship and run applications while eliminating their dependencies on the host machine. And that's why you don't need those classic configuration tools.
Puppet and Chef are configuration management tools, whereas Docker is a containerization tool, similar to LXC.
Usually you'd be using Chef or Puppet to manage Docker containers. For example, take a look at the Chef docs.
EDIT as per @ptierno comment.
Docker is three things: a cool way to run a process, a decent image-based deploy system, and a mediocre system image builder.
The first is not related to config management as those tools aren't involved in running a process, at least not directly. The second takes the place of some amount of config management in production by doing it ahead of time when you build the image. There is still often some need for last-mile config for stuff like service discovery and secrets, but this can be handled by lighter tools like consul-template or confd. The last is where the rub lies. docker build is simple, easy to get started with, and mostly unhelpful for complex situations. You get, at most, a single inheritance tree between Dockerfiles, which makes stuff like multi-axis matrix builds ({app1 app2 app3} x {prod qa dev}) more difficult than it could be. Also, building composable abstractions for other groups to use is difficult, though again it isn't impossible. Using something like Packer to drive image builds can produce simpler code sometimes, and supports the full suite of CAPS (Chef, Ansible, Puppet, Salt) tools. This is mostly aimed at the use case where you are treating Docker images like tiny VMs, which I wish fewer people would do, but it's a thing, so here we are.
