Combining Chef And Docker - docker

I am having a hard time figuring out how I should combine Chef and Docker to get the best of both.
Right now I am using Chef to automatically pull a docker image and create a container.
But things get messy when I want to change the configuration inside the container.
I read about knife container, but I didn't understand how one can bootstrap a container and a new VM (on Amazon, for example) together.

If all you want to do is manage Docker images and containers, I would suggest that you don't really need Chef.
Docker provides tools like:
Fig (http://www.fig.sh/), which brings up multiple containers as one logical unit.
Swarm (https://github.com/docker/swarm/), which allows you to abstract away the machines you have for deployments. For example, "My app needs 2GB of RAM, 1 CPU, 10GB of HD, which machine has available resources?"
Machine (https://github.com/docker/machine), which allows you to create VMs in the cloud in pretty much any provider.
A REST API (https://docs.docker.com/reference/api/docker_remote_api/), which allows you to remotely start/stop containers etc.
In my opinion, that suite of tools replaces the need for Chef if all you're going to do is manage Docker images and containers.
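To make the REST API point concrete, here is a minimal sketch of starting and stopping a container over the daemon's local Unix socket, using only the Python standard library. It assumes the daemon listens on the default socket path /var/run/docker.sock; the helper names (UnixHTTPConnection, container_request, send) are made up for this example and are not part of any Docker library.

```python
import http.client
import socket
from urllib.parse import quote

# Assumption: the daemon listens on the default local Unix socket.
DOCKER_SOCKET = "/var/run/docker.sock"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def container_request(action, container):
    """Return the (method, path) pair for starting or stopping a container."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    return "POST", f"/containers/{quote(container, safe='')}/{action}"


def send(action, container, socket_path=DOCKER_SOCKET):
    """Send the request to the daemon and return the HTTP status code."""
    method, path = container_request(action, container)
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()
```

Calling send("start", "web") on a host with a running daemon and a container named web (a hypothetical name) should return 204 on success. In practice most people would use the docker CLI or an SDK rather than raw HTTP.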
As someone already noted, don't change configs after a container has started. It is better to build a new image or restart the container. You could also mount the configs from outside the container, modify them there, and then restart the container.
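As a rough illustration of the "mount the configs externally" approach, the sketch below only builds the docker command lines; it does not run them. The image name, container name, and paths in the example are placeholders, not anything from the question.

```python
import shlex


def run_with_config(image, name, host_config_dir, container_config_dir):
    """Build a `docker run` command that bind-mounts a host config directory read-only."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--volume", f"{host_config_dir}:{container_config_dir}:ro",
        image,
    ]


def restart(name):
    """Build the command that restarts a container so it rereads its config."""
    return ["docker", "restart", name]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(shlex.join(run_with_config("nginx", "web", "/srv/web/conf", "/etc/nginx/conf.d")))
    print(shlex.join(restart("web")))
```

With this layout Chef (or anything else) only has to manage files under the host directory and then issue the restart; the image itself never changes. Either list can be passed to subprocess.run(cmd, check=True) to actually execute it.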

Related

Dealing with dockers and containers in production

I am new to containers and hope this forum is the right place to ask this question.
I am learning Docker and containers, and I now have some skill with the docker commands and with managing containers. I understand that Docker has two main parts, the Docker client (docker.exe) and the Docker server (dockerd.exe). In development both are installed on my local machine (I installed them manually on Windows Server 2016, following Nigel Poulton's tutorial here https://app.pluralsight.com/course-player?clipId=f1f27565-e2bf-4e58-96f3-bc2c3b160ec9). In real production, how would I configure my Docker client to communicate with a remote Docker server? I did some research on the internet but honestly could not find a simple answer to this question. I installed Docker Desktop on my Windows 10 machine and noticed that it created a Hyper-V machine, which might be a Linux machine. My understanding is that this machine runs the Docker server that my Docker client interacts with, but I do not understand how this interaction happens.
I would appreciate some guidance or a clear answer to my questions.
In production environments you never have a remote Docker daemon. Generally you interact with Docker through a dedicated orchestrator (Kubernetes, Docker Swarm, Nomad, AWS ECS), through a general-purpose system automation tool (Chef, Ansible, Salt Stack), or, if you must, by ssh'ing directly to the system and running docker commands there.
Remote access to the Docker daemon is something of a security disaster. If you can access the Docker daemon at all, you can edit any file on the host system as root, and pretty trivially take over the whole thing. (Google "Docker cryptojacking" for some real-world examples.) In principle you can secure it with mutual TLS, but this is a tricky setup.
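For completeness, the docker CLI picks its target daemon from environment variables, so pointing a client at another machine mostly comes down to setting those. The sketch below builds such an environment either for an ssh connection (supported by newer Docker versions) or for the mutual TLS setup mentioned above. The function name, host names, and certificate directory are hypothetical; port 2376 is only the conventional TLS port, and the certificate directory is assumed to hold ca.pem, cert.pem, and key.pem.

```python
import os


def remote_docker_env(host, user=None, cert_dir=None, base_env=None):
    """Return an environment dict that points the docker CLI at a remote daemon.

    With cert_dir set, use TCP with TLS verification (mutual TLS);
    otherwise tunnel over ssh, which reuses the host's normal ssh access control.
    """
    env = dict(os.environ if base_env is None else base_env)
    if cert_dir is not None:
        env["DOCKER_HOST"] = f"tcp://{host}:2376"
        env["DOCKER_TLS_VERIFY"] = "1"
        env["DOCKER_CERT_PATH"] = cert_dir
    else:
        target = f"{user}@{host}" if user else host
        env["DOCKER_HOST"] = f"ssh://{target}"
    return env
```

You would then run something like subprocess.run(["docker", "ps"], env=remote_docker_env("build01.example.com", user="deploy"), check=True). The ssh variant is usually the lower-risk choice, since it does not open the daemon's port to the network at all.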
The other important best practice is that Docker images should be self-contained. Don't try to deploy a Docker image to production, and also separately copy your application code. The same Ansible setup that can deploy a Docker container can also install Node directly on the target system, avoiding a layer; it's tricky to copy application code into a Kubernetes volume, especially when Kubernetes pods can restart outside your direct control. Deploy (and test!) your images with all of the code COPYd in a Dockerfile, minimizing the use of bind mounts.

Docker-machine vs Vagrant? [duplicate]

Every Docker image, as I understand it, is based on a base image, for example Ubuntu.
If I want to isolate a process, should I deploy the Ubuntu base image (how is that different from Vagrant?) and then build the image I need on top of it by installing things onto the Ubuntu image?
So if Ubuntu is launched under Vagrant and under Docker, what is the practical difference?
And if I use the Docker provider in Vagrant, what is the difference between Vagrant and Docker?
Also, is it possible in Docker to isolate processes on a PC without a base image, and without sharing it with another PC?
Vagrant is a utility to help you automate setting up VMs. Docker is a utility that helps you use containerization in Linux.
A virtual machine runs a whole system, and emulates hardware. Containers section off processes in a single running kernel without emulating hardware.
Both a VM and a Docker image may be Ubuntu 14.04, but with the Docker image you don't need to run the whole OS.
For example, if I want to run an nginx container based on Ubuntu, I'd end up with only the nginx process running. No upstart/systemd/init is needed. A VM would run an init system, manage its own networking, and run other services as well. Basing container images on a Linux distro is mostly a matter of convenience.
It is entirely possible to run Docker containers with very minimal images. A statically compiled binary alone in an image is all you'd need to run a container.
Vagrant: Vagrant is a project that helps with spawning virtual machines. It started as a command-line front end for VirtualBox, something like a Gemfile for VMs. You can choose the base image to start with, the network, the IP, and shared folders, and put it all in a file that anyone can reuse to spawn the same configured machine. Vagrant has various extensions, provisioning options, and VM providers. You can run VirtualBox or VMware, and it is extensible enough to create instances on EC2.
Docker: Docker lets you package an application with all of its dependencies into a standardized unit for software development, which reduces friction between development, QA, and testing. You can change your application quickly, adding new capabilities every day and scaling out services to respond to problem areas. Docker is putting itself in an exciting place as the interface to PaaS, be it networking or service discovery, with applications not having to care about the underlying infrastructure. Yes, there are still issues with Docker in production, but hopefully we'll see solutions to those problems, as the Docker team and contributors are working hard on them. For example, Docker volume drivers allow third-party container data management solutions to provide data volumes for containers that operate on data, such as databases, key-value stores, and other stateful applications. The latest version comes with much more flexibility, built-in orchestration, advanced networking, secrets management, etc. One example of a volume plugin is rexray (emccode/rexray), which provides advanced storage functionality. We're finally starting to agree on more than just images and run time.

What is a cluster and a node oriented to containers?

Sorry for this question, but I just started with Docker and Docker Compose, and I really didn't need any of this until I read that I need to use Docker Swarm or Kubernetes to have more stability in production. I started reading about Docker Swarm and they mentioned nodes and clusters.
I was really happy not knowing about this, because what I understood about docker-compose was that I could manage my services/containers from a single file and only had to run a few commands to launch, build, delete, etc. all my services based on the docker-compose configuration.
But now nodes and clusters have come up and I've really gone a bit crazy, so I'd appreciate help understanding this next step in the life of containers. I've been googling and it's still not very clear to me.
I hope you can help me and explain it to me in a way that I can understand.
Thank you!
A node is just a physical or virtual machine.
In a Kubernetes/Docker Swarm context, each node must have the relevant binaries installed (Docker Engine, kubelet, etc.).
A cluster is a grouping of one or more nodes.
If you have just been testing on your local machine you have a single node.
If you were to add a second machine and link both machines together using Docker Swarm/Kubernetes, then you would have created a 2-node cluster.
You can then use Docker Swarm/Kubernetes to run your services/containers on any or all nodes in your cluster. This allows your services to be more resilient and fault tolerant.
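To show roughly what "linking two machines together" looks like with Docker Swarm, here is a small sketch that builds the two commands involved. The IP address and token below are placeholders; the real join token is printed by docker swarm init on the first machine. Port 2377 is Swarm's default management port.

```python
def swarm_init(advertise_addr):
    """Command to run on the first machine; it becomes the manager node."""
    return ["docker", "swarm", "init", "--advertise-addr", advertise_addr]


def swarm_join(token, manager_addr, port=2377):
    """Command to run on the second machine so it joins as another node."""
    return ["docker", "swarm", "join", "--token", token, f"{manager_addr}:{port}"]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(swarm_init("192.0.2.10")))
    print(" ".join(swarm_join("<worker-token>", "192.0.2.10")))
```

After both commands have run, docker node ls on the manager should list two nodes, which is the 2-node cluster described above.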
By default Docker Compose runs a set of containers on a single system. If you need to run more containers than fit on one system, or you're just afraid of that system crashing, you need more than one system to do it. The cluster is the group of all of the systems (physical computers, virtual machines, cloud instances) that are working together to run the containers. Each of those individual systems is a node.
The other important part of clustered container setups is that you can generally run multiple replicas of a given container, and you don't care where in the cluster they run. Say you have five nodes, and a Web server container, and you'd like to run three copies of it for redundancy. Instead of having to pick a node, ssh to it, and manually docker run there, you just tell the cluster manager "run me three of these", and it chooses a node and launches the container for you. You can also scale the containers up and down at runtime, or potentially set the cluster to do the scaling on its own based on load.
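As a toy model of the "run me three of these" idea, the sketch below spreads replicas across nodes by always picking the node that currently runs the fewest containers. This is only an illustration of the concept; it is not the actual scheduling algorithm of Swarm or Kubernetes, which also weigh resources, constraints, and health. The node names are made up.

```python
def place_replicas(nodes, replicas, running=None):
    """Assign each replica to the node currently running the fewest containers.

    nodes    -- list of node names in the cluster
    replicas -- number of copies of the container to start
    running  -- optional dict of node name -> containers already running there
    """
    if not nodes:
        raise ValueError("cluster has no nodes")
    load = {node: (running or {}).get(node, 0) for node in nodes}
    placement = []
    for _ in range(replicas):
        # Ties go to the node listed first.
        target = min(nodes, key=lambda node: load[node])
        load[target] += 1
        placement.append(target)
    return placement


if __name__ == "__main__":
    print(place_replicas(["node1", "node2", "node3", "node4", "node5"], 3))
```

The point is that the caller only states how many copies it wants; which node each copy lands on is the cluster manager's decision.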
If your workload is okay running a single copy of containers on a single server, you don't need a cluster setup. (You might have some downtime during updates or if the single server dies.) Swarm has the advantages of being bundled with Docker and being able to use Docker-native tools (docker-compose can deploy to a Swarm cluster). Kubernetes is much more complex, but at this point most public cloud providers will sell you a preconfigured Kubernetes cluster, and it has better stories around security, storage management, and autoscaling. There are also a couple other less-prominent alternatives like Nomad and Mesos out there.

docker-swarm vs. docker-compose on single host in production

Is there a reason to use docker-swarm instead of docker-compose for deploying a single host in production?
I'm currently rewriting an existing application. My predecessors set up the application using docker-swarm, but I do not understand why: the application will only consist of a single host running a couple of services. These services will only supply some local information on the customer network via a REST API to a Kubernetes cluster (so no real load or reason to add additional hosts).
I looked through the Docker website and could not find a reason to use docker-swarm to deploy a single host, apart from testing a deployment on a single host dev environment.
Are there benefits of using docker-swarm compared to docker-compose regarding deployment, networking, etc...?
Docker Swarm and Docker Compose are fundamentally different animals. Compose is a build tool that lets you define and configure a group of related containers, whereas Swarm is an orchestration tool that manages multiple Docker engines in a way that lets you treat them (somewhat) as a single unit. Swarm exposes an API that is mostly compatible with the Docker Remote API, which allows existing applications to use Swarm to scale horizontally without having to completely overhaul the existing interface to the container engine.
That said, much of the functionality in Docker Compose that overlaps with Docker Swarm has been added incrementally. Compose has grown over time, and the distinction between the two has narrowed a bit. Swarm was eventually integrated into the Docker engine, and Docker Stack was introduced, allowing compose.yml files to be read directly by Docker, without using Compose.
So the real question might be: what is the difference between Docker Compose and Docker Stack? Not a whole lot. Compose is actually a separate project, written in Python, that uses the Docker API under the hood. Stack does many of the same things as Compose, but is integrated into Docker. Stack also wants pre-built images, while Compose will handle those image builds for you, which makes Compose very handy for development.
What you are dealing with might be a product of a time when these 2 tools were a lot more distinct. Docker Swarm is part of Docker, and it allows for easy scaling if needed (even if you don't need it now, it might be good down the road). On the other hand, Compose (in my opinion anyway) is much more useful for development situations where you are making frequent tweaks to your images, and rebuilding.
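To make the Compose versus Stack comparison concrete, the sketch below builds the command each tool would use to bring up the same file on a single host. The file name and stack name are placeholders. Note the assumptions: the stack command only works after the host has been put into swarm mode with docker swarm init, and, as mentioned above, Stack expects images to be built already, so there is no build option on that side.

```python
def compose_up(compose_file="docker-compose.yml", build=True):
    """Compose: start the services in the background, optionally rebuilding images first."""
    cmd = ["docker-compose", "-f", compose_file, "up", "-d"]
    if build:
        cmd.append("--build")
    return cmd


def stack_deploy(stack_name, compose_file="docker-compose.yml"):
    """Stack: deploy the same file as a named stack on a swarm (pre-built images only)."""
    return ["docker", "stack", "deploy", "--compose-file", compose_file, stack_name]


if __name__ == "__main__":
    print(" ".join(compose_up()))
    print(" ".join(stack_deploy("myapp")))  # "myapp" is a placeholder stack name
```

On a single host the day-to-day difference is mostly this one command, plus the fact that a swarm service can later be scaled onto extra nodes without changing the file.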

How do I do docker clustering or hot copy a docker container?

Is it possible to hot-copy a Docker container, or to do some sort of clustering with Docker for HA purposes?
Can someone simplify this?
How to scale Docker containers in production
Docker containers are not designed to be VMs and are not really meant for hot-copies. Instead you should define your container such that it has a well-known start state. If the container goes down the alternate should start from the well-known start state. If you need to keep track of state that the container generates at run time this has to be done externally to docker.
One option is to use volumes to mount the state (files) onto the host filesystem. Then use NFS, replicated storage, or any other means to share that file system with other physical nodes. Then you can mount the same files into a second Docker container on a second host, with the same state.
Depending on what you are running in your containers, you can also handle state sharing inside your containers, for example using MongoDB replica sets. To reiterate, though, containers are not as of yet designed to be migrated with runtime state.
There is a variety of technologies around Docker that could help, depending on what you need HA-wise.
If you simply wish to start a stateless service container on a different host, you need a network overlay, such as Weave.
If you wish to replicate data across for something like database failover, you need a storage solution, such as Flocker.
If you want to run multiple services and have load-balancing and forget on which host each container runs, given that X instances are up, then Kubernetes is the kind of tool you need.
It is possible to make many Docker-related tools work together; we have a few stories on our blog already.
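The "given that X instances are up" idea can be pictured as a reconciliation step: compare how many instances you want with how many are running, and work out what to start or stop. The sketch below is a toy version of that comparison only, not how Kubernetes is actually implemented; the service names are made up.

```python
def reconcile(desired, running):
    """Return the actions needed to move from `running` counts to `desired` counts.

    desired -- dict of service name -> number of instances wanted
    running -- dict of service name -> number of instances currently up
    Each action is a tuple: ("start" or "stop", service name, count).
    """
    actions = []
    for service, want in sorted(desired.items()):
        have = running.get(service, 0)
        if have < want:
            actions.append(("start", service, want - have))
        elif have > want:
            actions.append(("stop", service, have - want))
    # Anything running that is no longer wanted at all gets stopped.
    for service in sorted(set(running) - set(desired)):
        if running[service] > 0:
            actions.append(("stop", service, running[service]))
    return actions


if __name__ == "__main__":
    print(reconcile({"web": 3, "db": 1}, {"web": 1, "db": 1}))
```

A cluster manager runs a check like this continuously, so when a host dies and takes instances with it, the missing ones are simply started again somewhere else from their well-known start state.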
