Best Docker logging architecture using ELK stack - docker

Recently I am trying to find out best Docker logging mechanism using ELK stack. I am having some questions regarding the best work flow that companies use in production. Our system has typical software stack including Tomcat, PostgreSQL, MongoDB, Nginx, RabbitMQ, Couchbase etc. As of now, our stack runs in CoreOS cluster. Please find my questions below
With ELK stack, what is the best methodology to do the log forwarding - Should I use Lumberjack ?. I am asking this because I have seen workflows where people use Syslog/Rsyslog to forward the logs to logstash.
Since all of our software pieces are containerized, should I include Log-forwarder in all my containers ? I am planning to do this as most of my containers switch nodes based on health so I am not keen on mounting the file system from the container to host.
Should I use redis as a broker in forwarding the logs ? If yes why ?
How difficult is it to write log-config files that defines the log format to be forwarded to log-stash ?
This is a subjective questions but I am sure that this is a problem that people have solved long ago and I am not keen on re-inventing the wheel.

Good questions and the answer like in many other cases are - "it depends".
Shipping Logs - we use rsyslog as docker containers internally and logstash-forwarder in some cases - the advantage of logstash-forwarder is that it encrypts the logs and compresses them so in some cases that's important. I find rsyslog to be very stable and low on resources so we use it as a default shipper. The full logstash might be heavy for small machines (some more data about logstash - http://logz.io/blog/5-logstash-pitfalls-and-how-to-avoid-them/)
We're also fully dockerized and use a separate Docker for each rsyslog/lumberjack. Easy to maintain, update versions and move around when needed.
Yes, definitely use Redis. I wrote a blog about how to build production ELK (http://logz.io/blog/deploy-elk-production/) - I spoke about what I find to be the right architecture to deploy ELK in production
Not sure what exactly are you trying to achieve with that.
HTH

Docker as of Aug 2015, has "Logging Driver", so that you can ship logs into other places. These are the supported way to ship the logs remotely.
syslog
fluentd
journald
gelf
etc..

I would recommend against putting the logging forwarder into each Docker image. That adds unneeded complexity and bloat to your Docker containers. A cleaner solution is to put the log forwarder (the latest log forwarder from Elastic being FileBeat, which replaces logstash forwarder) into its own container and mount the host machine's /var/lib/docker directory as a volume for that container.
docker run --detach --name=docker-filebeat -v /var/lib/docker:/var/lib/docker
/var/lib/docker contains all the logs for every container running on the host's Docker daemon. The data of the log files in this directory is the same data you would get from running docker logs <container_id> on each container.
Then in the filebeat.yml configuration file, put:
filebeat:
prospectors:
-
paths:
- /var/lib/docker/containers/*/*.log
Then config Filebeat to forward to the rest of your ELK stack and start the container. All the Docker container logs on that machine will be forwarded to your ELK stack automatically.
The cool thing about this approach is that it allows you to forward the rest of the host systems logs as well if you want to. Simply add another volume pointing to the host system log files you want to forward and add that path to your filebeat.yml config as well.
I find this method cleaner and more flexible than other methods such as using the Docker logging drivers because the rest of the your Docker setup stays the same. You don't have to add the logging driver flags to each Docker run command (or to the Docker daemon parameters).

Related

What are benefits of running ELK stack on Docker over running it on VM

I'm learning ELK stack. I wonder, why would people run it on Docker?
If I understand everything correctly, it would have to have some directory of a host OS mapped to be persistent over resets of the image.
Meanwhile, running just VL with installed docker would be persistent anyway.
Why should I use Docker to run ELK stack? In fact, should I use Docker or VM? Is the performance the only reason to choose Docker?
A good question that makes make think of my favorite xkcd comic:
https://xkcd.com/1988/
When running the Docker images you are getting installations prepared by the software creators. All you need to do is glue them together with your configuration. You don't need to learn all about how to install the apps, just run them.
With a docker-compose manifest (like this one) you can be up and running on an ELK stack on any Docker compatible system with a single, well documented command.
Elastic, the creators of ELK, also have a nice write up about running their software quickly and easily on Docker:
https://www.elastic.co/blog/a-full-stack-in-one-command

use docker's remote API in a secure manner

I am trying to find an effective way to use the docker remote API in a secure way.
I have a docker daemon running in a remote host, and a docker client on a different machine. I need my solution to not be client/server OS dependent, so that it would be relevant to any machine with a docker client/daemon etc.
So far, the only way I found to do such a thing is to create certs on a Linux machine with openssl and copy the certs to the client/server manually, as in this example:
https://docs.docker.com/engine/security/https/
and then configure docker on both sides to use the certificates for encryption and authentication.
This method is rather clunky in my opinion, because some times it's a problem to copy files and put them on each machine I want to use remote API from.
I am looking for something more elegant.
Another solution I've found is using a proxy for basic HTTP authentication, but in this method the traffic is not encrypted and it is not really secure that way.
Does anyone have a suggestion for a different solution or for a way to improve one of the above?
Your favorite system automation tool (Chef, SaltStack, Ansible) can probably directly manage the running Docker containers on a remote host, without opening another root-equivalent network path. There are Docker-oriented clustering tools (Docker Swarm, Nomad, Kubernetes, AWS ECS) that can run a container locally or remotely, but you have less control over where exactly (you frequently don't actually care) and they tend to take over the machines they're running on.
If I really had to manage systems this way I'd probably use some sort of centralized storage to keep the TLS client keys, most likely Vault, which has the property of storing the keys encrypted, requiring some level of authentication to retrieve them, and being able to access-control them. You could write a shell function like this (untested):
dockerHost() {
mkdir -p "$HOME/.docker/$1"
JSON=$(vault kv get -format=json "secret/docker/$1")
for f in ca.pem cert.pem key.pem; do
echo "$JSON" | jq ".data.data.[\"$f\"]" > "$HOME/.docker/$1/$f"
done
export DOCKER_HOST="https://$1:2376"
export DOCKER_CERT_PATH="$HOME/.docker/$1"
}
While your question makes clear you understand this, it bears repeating: do not enable unauthenticated remote access to the Docker daemon, since it is trivial to take over a host with unrestricted root access if you can access the socket at all.
Based on your comments, I would suggest you go with Ansible if you don't need the swarm functionality and require only single host support. Ansible only requires SSH access which you probably already have available.
It's very easy to use an existing service that's defined in Docker Compose or you can just invoke your shell scripts in Ansible. No need to expose the Docker daemon to the external world.
A very simple example file (playbook.yml)
- hosts: all
tasks:
- name: setup container
docker_container:
name: helloworld
image: hello-world
Running the playbook
ansible-playbook -i username#mysshhost.com, playbook.yml
Ansible provides pretty much all of the functionality you need to interact with Docker via its module system:
docker_service
Use your existing Docker compose files to orchestrate containers on a single Docker daemon or on Swarm. Supports compose versions 1 and 2.
docker_container
Manages the container lifecycle by providing the ability to create, update, stop, start and destroy a container.
docker_image
Provides full control over images, including: build, pull, push, tag and remove.
docker_image_facts
Inspects one or more images in the Docker host’s image cache, providing the information as facts for making decision or assertions in a playbook.
docker_login
Authenticates with Docker Hub or any Docker registry and updates the Docker Engine config file, which in turn provides password-free pushing and pulling of images to and from the registry.
docker (dynamic inventory)
Dynamically builds an inventory of all the available containers from a set of one or more Docker hosts.

Recommended way to run a Docker Compose stack in production?

I have a couple of compose files (docker-compose.yml) describing a simple Django application (five containers, three images).
I want to run this stack in production - to have the whole stack begin on boot, and for containers to restart or be recreated if they crash. There aren't any volumes I care about and the containers won't hold any important state and can be recycled at will.
I haven't found much information on using specifically docker-compose in production in such a way. The documentation is helpful but doesn't mention anything about starting on boot, and I am using Amazon Linux so don't (currently) have access to Docker Machine. I'm used to using supervisord to babysit processes and ensure they start on boot up, but I don't think this is the way to do it with Docker containers, as they end up being ultimately supervised by the Docker daemon?
As a simple start I am thinking to just put restart: always on all my services and make an init script to do docker-compose up -d on boot. Is there a recommended way to manage a docker-compose stack in production in a robust way?
EDIT: I'm looking for a 'simple' way to run the equivalent of docker-compose up for my container stack in a robust way. I know upfront that all the containers declared in the stack can reside on the same machine; in this case I don't have need to orchestrate containers from the same stack across multiple instances, but that would be helpful to know as well.
Compose is a client tool, but when you run docker-compose up -d all the container options are sent to the Engine and stored. If you specify restart as always (or preferably unless-stopped to give you more flexibility) then you don't need run docker-compose up every time your host boots.
When the host starts, provided you have configured the Docker daemon to start on boot, Docker will start all the containers that are flagged to be restarted. So you only need to run docker-compose up -d once and Docker takes care of the rest.
As to orchestrating containers across multiple nodes in a Swarm - the preferred approach will be to use Distributed Application Bundles, but that's currently (as of Docker 1.12) experimental. You'll basically create a bundle from a local Compose file which represents your distributed system, and then deploy that remotely to a Swarm. Docker moves fast, so I would expect that functionality to be available soon.
You can find in their documentation more information about using docker-compose in production. But, as they mention, compose is primarily aimed at development and testing environments.
If you want to use your containers in production, I would suggest you to use a suitable tool to orchestrate containers, as Kubernetes.
If you can organize your Django application as a swarmkit service (docker 1.11+), you can orchestrate the execution of your application with Task.
Swarmkit has a restart policy (see swarmctl flags)
Restart Policies: The orchestration layer monitors tasks and reacts to failures based on the specified policy.
The operator can define restart conditions, delays and limits (maximum number of attempts in a given time window). SwarmKit can decide to restart a task on a different machine. This means that faulty nodes will gradually be drained of their tasks.
Even if your "cluster" has only one node, the orchestration layer will make sure your containers are always up and running.
You say that you use AWS so why don't you use ECS which is built for what you ask. You create an application which is the pack of your 5 containers. You will configure which and how many instances EC2 you want in your cluster.
You just have to convert your docker-compose.yml to the specific Dockerrun.aws.json which is not hard.
AWS will start your containers when you deploy and also restart them in case of crash

Is it wrong to run a single process in docker without providing basic system services?

After reading the introduction of the phusion/baseimage I feel like creating containers from the Ubuntu image or any other official distro image and running a single application process inside the container is wrong.
The main reasons in short:
No proper init process (that handles zombie and orphaned processes)
No syslog service
Based on this facts, most of the official docker images available on docker hub seem to do things wrong. As an example, the MySQL image runs mysqld as the only process and does not provide any logging facilities other than messages written by mysqld to STDOUT and STDERR, accessible via docker logs.
Now the question arises which is the appropriate way to run an service inside docker container.
Is it wrong to run only a single application process inside a docker container and not provide basic Linux system services like syslog?
Does it depend on the type of service running inside the container?
Check this discussion for a good read on this issue. Basically the official party line from Solomon Hykes and docker is that docker containers should be as close to single processes micro servers as possible. There may be many such servers on a single 'real' server. If a processes fails you should just launch a new docker container rather than try to setup initialization etc inside the containers. So if you are looking for the canonical best practices the answer is yeah no basic linux services. It also makes sense when you think in terms of many docker containers running on a single node, you really want them all to run their own versions of these services?
That being said the state of logging in the docker service is famously broken. Even Solomon Hykes the creator of docker admits its a work in progress. In addition you normally need a little more flexibility for a real world deployment. I normally mount my logs onto the host system using volumes and have a log rotate daemon etc running in the host vm. Similarly I either install sshd or leave an interactive shell open in the the container so I can issue minor commands without relaunching, at least until I am really sure my containers are air-tight and no more debugging will be needed.
Edit:
With docker 1.3 and the exec command its no longer necessary to "leave an interactive shell open."
It depends on the type of service you are running.
Docker allows you to "build, ship, and run any app, anywhere" (from the website). That tells me that if an "app" consists of/requires multiple services/processes, then those should be ran in a single Docker container. It would be a pain for a user to have to download, then run multiple Docker images just to run one application.
As a side note, breaking up your application into multiple images is subject to configuration drift.
I can see why you would want to limit a docker container to one process. One reason being uptime. When creating a Docker provisioning system, it's essential to keep the uptime of a container to a minimum so that scaling sideways is fast. This means, that if I can get away with running a single process per Docker container, then I should go for it. But that's not always possible.
To answer your question directly. No, it's not wrong to run a single process in docker.
HTH

How to link Docker services across hosts?

Docker allows servers from multiple containers to connect to each other via links and service discovery. However, from what I can see this service discovery is host-local. I would like to implement a service that uses other services hosted on a different machine.
There have been several approaches to solving this problem in Docker, such as CoreOS's jumpers, host-local services that essentially proxy to the other machine, and a whole bunch of github projects for managing Docker deployments that appear to have attempted to support this use-case.
Given the pace of development it is hard to follow what current best practices are. Therefore my question is essentially:
What (if any) is the current predominant method for linking across hosts in Docker, and
Are there any plans for supporting this functionality directly in the Docker system?
Update
Docker has recently announced a new tool called Swarm for Docker orchestration.
Swarm allows you do "join" multiple docker daemons: You first create a swarm, start a swarm manager on one machine, and have docker daemons "join" the swarm manager using the swarm's identifier. The docker client connects to the swarm manager as if it were a regular docker server.
When a container started with Swarm, it is automatically assigned to a free node that meets any constraints that have been defined. The following example is taken from the blog post:
$ docker run -d -P -e constraint:storage=ssd mysql
One of the supported constraints is "node" that allows you pin a container to a specific hostname. The swarm also resolves links across nodes.
In my testing I got the impression that Swarm doesn't yet work with volumes at a fixed location very well (or at least the process of linking them is not very intuitive), so this is something to keep in mind.
Swarm is now in beta phase.
Until recently, the Ambassador Pattern was the only Docker-native approach to remote-host service discovery. This pattern can still be used and doesn't require any magic beyond plain Docker in that the pattern consists of one or more additional containers that act as proxies.
Additionally, there are several third-party extensions to make Docker cluster-capable. Third-party solutions include:
Connecting the Docker network bridges on two hosts, lightweight and various solutions exist, but generally with some caveats
DNS-based discovery e.g. with skydock and SkyDNS
Docker management tools such as Shipyard, and Docker orchestration tools. See this question for an extensive list: How to scale Docker containers in production
UPDATE 3
Libswarm has been renamed as swarm and is now a separate application.
Here is the github page demo to use as a starting point:
# create a cluster
$ swarm create
6856663cdefdec325839a4b7e1de38e8
# on each of your nodes, start the swarm agent
# <node_ip> doesn't have to be public (eg. 192.168.0.X),
# as long as the other nodes can reach it, it is fine.
$ swarm join --token=6856663cdefdec325839a4b7e1de38e8 --addr=<node_ip:2375>
# start the manager on any machine or your laptop
$ swarm manage --token=6856663cdefdec325839a4b7e1de38e8 --addr=<swarm_ip:swarm_port>
# use the regular docker cli
$ docker -H <swarm_ip:swarm_port> info
$ docker -H <swarm_ip:swarm_port> run ...
$ docker -H <swarm_ip:swarm_port> ps
$ docker -H <swarm_ip:swarm_port> logs ...
...
# list nodes in your cluster
$ swarm list --token=6856663cdefdec325839a4b7e1de38e8
http://<node_ip:2375>
UPDATE 2
The official approach is now to use libswarm see a demo here
UPDATE
There is a nice gist for openvswitch hosts communication in docker using the same approach.
To allow service discovery there is an interesting approach based on DNS called skydock.
There is also a screencast.
This is also a nice article using the same pieces of the puzzle but adding also vlans on top:
http://fbevmware.blogspot.it/2013/12/coupling-docker-and-open-vswitch.html
The patching has nothing to do with the robustness of the solution. Docker is actually only a sort of DSL upon Linux Containers and both solutions in these articles simply bypass some Docker automatic settings and fall back directly to Linux Containers.
So you can use the solutions safely and wait to be able to do it in a simpler way once Docker will implement it.
Weave is a new Docker virtual network technology that acts as a virtual ethernet switch over TCP/UDP - all you need is a Docker container running Weave on your host.
What's interesting here is
Instead of links, use static IPs/hostnames in your virtual network
Hosts don't need full connectivity, a mesh is formed based on what peers are available, and packets will be routed multi-hop to where they need to go
This leads to interesting scenarios like
Create a virtual network across the WAN, none of the Docker containers will know or care what actual network they sit in
Move your containers to different physical docker hosts, Weave will detect the peer accordingly
For example, there's an example guide on how to create a multi-node Cassandra cluster across your laptop and a few cloud (EC2) hosts with two commands per host. I launched a CoreOS cluster with AWS CloudFormation, installed weave on each in /home/core, plus my laptop vagrant docker VM, and got a cluster up in under an hour. My laptop is firewalled but Weave seemed to be okay with that, it just connects out to its EC2 peers.
Update
Docker 1.12 contains the so called swarm mode and also adds a service abstraction. They probably aren't mature enough for every use case, but I suggest you to keep them under observation. The swarm mode at least helps in a multi-host setup, which doesn't necessarily make linking easier. The Docker-internal DNS server (since 1.11) should help you to access container names, if they are well-known - meaning that the generated names in a Swarm context won't be so easy to address.
With the Docker 1.9 release you'll get built in multi host networking. They also provide an example script to easily provision a working cluster.
You'll need a K/V store (e.g. Consul) which allows to share state across the different Docker engines on every host. Every Docker engine need to be configured with that K/V store and you can then use Swarm to connect your hosts.
Then you create a new overlay network like this:
$ docker network create --driver overlay my-network
Containers can now be run with the network name as run parameter:
$ docker run -itd --net=my-network busybox
They can also be connected to a network when already running:
$ docker network connect my-network my-container
More details are available in the documentation.
The following article describes nicely how to connect docker containers on multiple hosts: http://goldmann.pl/blog/2014/01/21/connecting-docker-containers-on-multiple-hosts/
It is possible to bridge several Docker subnets together using Open vSwitch or Tinc. I have prepared Gists to show how to do it:
Open vSwitch: https://gist.github.com/noteed/8656989
Tinc: https://gist.github.com/noteed/11031504
The advantage I see using this solution instead of the --link option and the ambassador pattern is that I find it more transparent: there is no need to have additional containers and more importantly, no need to expose ports on the host. Actually I think of the --link option to be a temporary hack before Docker get a nicer story about multi-host (or multi-daemon) setups.
Note: I know there is another answer pointing to my first Gist but I don't have enough karma to edit or comment on that answer.
As mentioned above, Weave is definitely a viable solution to link Docker containers across the hosts. Based on my own experience with it, it is fairly straightfoward to set it up. It is now also has DNS service which you can address container's by its DNS names.
On the other hand, there is CoreOS's Flannel and Juniper's Opencontrail for wiring the containers across the hosts.
Seems like docker swarm 1.14 allows you to:
assing hostname to container, using --hostname tag, but i haven't been able to make it work, containers are not able to ping each other by assigned hostnames.
assigning services to machine using --constraint 'node.hostname == <host>'

Resources