We are using HashiCorp Nomad to run microservices on Windows. Allocations come and go, but we would like a centralized logging solution (ideally ELK) for all logs from all jobs and tasks across multiple environments. It is quite simple to do with dockerized environments, but how can I do it if I run raw_exec tasks?
There's nothing container-specific about log shipping other than the logging driver. If containers write their logs to volumes, which Nomad can be configured to do, then the answer is the same.
Assuming your raw_exec jobs write their logs to the local filesystem, you need a log shipper such as Filebeat or Fluentd to watch those paths and push that data to Elasticsearch / Logstash.
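For example, a minimal filebeat.yml sketch (Filebeat 1.x prospectors syntax) that tails Nomad's per-allocation log directory; the Nomad data_dir (here C:\nomad\data) and the Logstash host are assumptions to adjust to your own setup:

filebeat:
  prospectors:
    -
      paths:
        # Nomad writes each task's stdout/stderr under <data_dir>\alloc\<alloc-id>\alloc\logs
        - C:\nomad\data\alloc\*\alloc\logs\*
output:
  logstash:
    hosts: ["logstash.example.com:5044"]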
Related
I'm trying to persist the logs of a container running inside a Docker stack. I deploy the whole thing to the swarm using a .yml file, but every solution I come across either does not work or has to be set up manually, e.g. every time I deploy the stack I have to mount things by hand. What would be the best way to persist the logs automatically, without having to do it manually every time (and without Kibana etc.)?
Deploy an EFK stack on the container platform. Fluentd runs as a DaemonSet, collecting the logs from all the containers on a host and feeding them to Elasticsearch. Using Kibana you can visualize the logs stored in Elasticsearch.
With Curator you can apply data retention policies depending on how many days you want to keep the logs.
Kubernetes volumes can be used to write the logs to persistent storage.
There are different solution stacks for shipping, storing and viewing logs.
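For illustration, a minimal Fluentd DaemonSet sketch; the image tag, namespace and the Elasticsearch address are assumptions to adapt (RBAC and a service account are omitted for brevity):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
        - name: FLUENT_ELASTICSEARCH_HOST    # assumed Elasticsearch service name
          value: "elasticsearch.logging.svc"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers

Fluentd reads the container log files from the mounted host paths and forwards them to Elasticsearch, where Kibana can query them.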
I have a couple of compose files (docker-compose.yml) describing a simple Django application (five containers, three images).
I want to run this stack in production - to have the whole stack begin on boot, and for containers to restart or be recreated if they crash. There aren't any volumes I care about and the containers won't hold any important state and can be recycled at will.
I haven't found much information on using specifically docker-compose in production in such a way. The documentation is helpful but doesn't mention anything about starting on boot, and I am using Amazon Linux so don't (currently) have access to Docker Machine. I'm used to using supervisord to babysit processes and ensure they start on boot up, but I don't think this is the way to do it with Docker containers, as they end up being ultimately supervised by the Docker daemon?
As a simple start I am thinking of just putting restart: always on all my services and making an init script run docker-compose up -d on boot. Is there a recommended way to manage a docker-compose stack in production in a robust way?
EDIT: I'm looking for a 'simple' way to run the equivalent of docker-compose up for my container stack in a robust way. I know upfront that all the containers declared in the stack can reside on the same machine; in this case I don't need to orchestrate containers from the same stack across multiple instances, but that would be helpful to know as well.
Compose is a client tool, but when you run docker-compose up -d all the container options are sent to the Engine and stored. If you specify restart as always (or preferably unless-stopped to give you more flexibility), then you don't need to run docker-compose up every time your host boots.
When the host starts, provided you have configured the Docker daemon to start on boot, Docker will start all the containers that are flagged to be restarted. So you only need to run docker-compose up -d once and Docker takes care of the rest.
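For instance, a minimal docker-compose.yml sketch with the restart policy in place; service and image names are placeholders:

version: "2"
services:
  web:
    image: example/django-app:latest   # placeholder image
    restart: unless-stopped
  db:
    image: postgres:9.5
    restart: unless-stopped

After a one-time docker-compose up -d, the daemon will bring these containers back on every reboot, as long as the Docker daemon itself starts on boot.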
As to orchestrating containers across multiple nodes in a Swarm - the preferred approach will be to use Distributed Application Bundles, but that's currently (as of Docker 1.12) experimental. You'll basically create a bundle from a local Compose file which represents your distributed system, and then deploy that remotely to a Swarm. Docker moves fast, so I would expect that functionality to be available soon.
You can find more information about using docker-compose in production in the documentation. But, as it mentions, Compose is primarily aimed at development and testing environments.
If you want to run your containers in production, I would suggest using a dedicated container orchestration tool, such as Kubernetes.
If you can organize your Django application as a SwarmKit service (Docker 1.11+), you can orchestrate the execution of your application with tasks.
Swarmkit has a restart policy (see swarmctl flags)
Restart Policies: The orchestration layer monitors tasks and reacts to failures based on the specified policy.
The operator can define restart conditions, delays and limits (maximum number of attempts in a given time window). SwarmKit can decide to restart a task on a different machine. This means that faulty nodes will gradually be drained of their tasks.
Even if your "cluster" has only one node, the orchestration layer will make sure your containers are always up and running.
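As a rough illustration using the Docker 1.12 service CLI (which is built on SwarmKit) rather than swarmctl directly; the service name and image are placeholders:

docker service create --name django-web --replicas 2 \
  --restart-condition on-failure --restart-max-attempts 3 \
  example/django-app:latest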
You say that you use AWS, so why not use ECS, which is built for exactly what you are asking? You create an application which bundles your 5 containers, and you configure which EC2 instances and how many of them you want in your cluster.
You just have to convert your docker-compose.yml to the corresponding Dockerrun.aws.json, which is not hard.
AWS will start your containers when you deploy and also restart them in case of a crash.
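A minimal Dockerrun.aws.json sketch (the version 2 multi-container format); the container name, image, memory and ports are placeholders:

{
  "AWSEBDockerrunVersion": 2,
  "containerDefinitions": [
    {
      "name": "web",
      "image": "example/django-app:latest",
      "essential": true,
      "memory": 256,
      "portMappings": [
        { "hostPort": 80, "containerPort": 8000 }
      ]
    }
  ]
}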
We are currently moving from a monolithic application running on JBoss towards microservices with Docker. I want to know which platforms/tools/frameworks should be used to test these Docker containers in a developer environment, and also which tools should be used to deploy these containers to that developer test environment.
Is it a good option to use something like Kubernetes with Chef/Puppet/Vagrant?
I think so. Make sure to get service discovery, logging and virtual networking right. For service discovery you can check out SkyDNS. Docker now has a few logging plugins you can use for log management. For virtual networking you can look at Flannel and Weave.
You want service discovery because Kubernetes will schedule the containers the way it sees fit, and you need some way of telling what IP/port your microservice will be at. Virtual networking gives each container its own address space, which prevents port clashes when two containers expose the same ports on the same host (Kubernetes won't let ports clash; it schedules containers onto hosts that still have the ports available, and if you try to create more than fit, they simply won't run).
Also, you can try the built-in cluster tools in Docker itself, like docker service, docker network commands and Docker Swarm.
Docker-machine helps in case you already have a VM infrastructure in place.
We have created and open-sourced a platform to develop and deploy docker based microservices.
It supports service discovery, clustering, load balancing, health checks, configuration management, diagnosing and mini-DNS.
We are using it in our local development environment and production environment on AWS. We have a Vagrant box with everything prepared so you can give it a try:
http://armada.sh
https://github.com/armadaplatform/armada
I'm looking for a monitoring solution for a web application deployed as a swarm of Docker containers spread across 7-10 VMs. High-level requirements are:
Configurable Web and REST interface to performance dashboard
General performance metrics on VM levels (CPU/Memory/IO)
Alerts when containers and/or VMs go offline or restart
Possibility to drill down into containers process activity when needed
Host OSes are CoreOS and Ubuntu
Any recommendations/best practices here?
NOTE: an external Kibana installation is already being used to collect application logs from Logstash agents deployed on the VMs.
Based on your requirements, it sounds like Sematext Docker Agent would be a good fit. It runs as a tiny container on each Docker host and collects all host+containers metrics, events, and logs. It can parse logs, route them, blacklist/whitelist them, has container auto-discovery, and so on. In the end logs end up in Logsene and metrics and events end up in SPM, which gives you a single pane of glass sort of view into all your Docker ops bits, with alerting, anomaly detection, correlation, and so on.
I am currently evaluating Bosun with scollector + cAdvisor support. Looks OK so far.
Edit:
It should meet all the listed requirements and a little bit more. :)
Take a look at the Axibase Time Series Database / Google cAdvisor / collectd stack.
Disclosure: I work for the company that develops ATSD.
Deploy 1 cAdvisor container per VM to collect Docker container statistics. The cAdvisor front end allows you to view the top container processes.
Deploy 1 ATSD container to ingest data from the multiple cAdvisor instances.
Deploy a collectd daemon on each VM to collect host statistics, and configure the collectd daemons to stream data into ATSD using the write_atsd plugin.
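For reference, running cAdvisor on a VM usually looks roughly like this (the published port and image tag may differ in your setup):

docker run --detach --name=cadvisor --publish=8080:8080 \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  google/cadvisor:latest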
Dashboards: sample host-level and container-level dashboards are available (screenshot links omitted).
API / SQL:
https://github.com/axibase/atsd/tree/master/api#api-categories
Alerts:
ATSD comes with a built-in rule engine. You can configure a rule to watch for containers that stop collecting data and trigger an email or a system command.
Recently I have been trying to find the best Docker logging mechanism using the ELK stack. I have some questions regarding the workflows that companies use in production. Our system has a typical software stack including Tomcat, PostgreSQL, MongoDB, Nginx, RabbitMQ, Couchbase, etc. As of now, our stack runs on a CoreOS cluster. Please find my questions below:
With the ELK stack, what is the best methodology for log forwarding - should I use Lumberjack? I am asking because I have seen workflows where people use syslog/rsyslog to forward the logs to Logstash.
Since all of our software pieces are containerized, should I include a log forwarder in all my containers? I am planning to do this because most of my containers switch nodes based on health, so I am not keen on mounting the filesystem from the container to the host.
Should I use Redis as a broker for forwarding the logs? If yes, why?
How difficult is it to write log config files that define the log format to be forwarded to Logstash?
These are subjective questions, but I am sure this is a problem that people solved long ago, and I am not keen on re-inventing the wheel.
Good questions, and the answer, as in many other cases, is "it depends".
Shipping logs - we run rsyslog in Docker containers internally, and logstash-forwarder in some cases. The advantage of logstash-forwarder is that it encrypts and compresses the logs, which matters in some setups. I find rsyslog to be very stable and low on resources, so we use it as the default shipper. The full Logstash can be heavy for small machines (some more data about Logstash: http://logz.io/blog/5-logstash-pitfalls-and-how-to-avoid-them/).
We're also fully dockerized and use a separate container for each rsyslog/lumberjack instance. That makes it easy to maintain, update versions and move things around when needed.
Yes, definitely use Redis. I wrote a blog post about how to build a production ELK deployment (http://logz.io/blog/deploy-elk-production/), where I describe what I consider the right architecture for running ELK in production.
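As a sketch of that broker setup: shippers push log events into a Redis list, and a Logstash indexer pulls from it and writes to Elasticsearch; the hosts and the key name are placeholders:

input {
  redis {
    host      => "redis.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch.example.com:9200"]
  }
}

The Redis buffer absorbs spikes so Elasticsearch is not overwhelmed when log volume bursts.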
Not sure what exactly you are trying to achieve with that.
HTH
Docker, as of Aug 2015, has logging drivers, so you can ship logs to other places. These are the supported ways to ship the logs remotely (see the example after the list):
syslog
fluentd
journald
gelf
etc..
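For example, a container can ship its logs straight to Logstash's gelf input via the gelf driver (the address and image are placeholders):

docker run --log-driver=gelf \
  --log-opt gelf-address=udp://logstash.example.com:12201 \
  example/my-app:latest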
I would recommend against putting the log forwarder into each Docker image. That adds unneeded complexity and bloat to your Docker containers. A cleaner solution is to put the log forwarder (the latest from Elastic being Filebeat, which replaces logstash-forwarder) into its own container and mount the host machine's /var/lib/docker directory as a volume for that container.
docker run --detach --name=docker-filebeat \
  -v /var/lib/docker:/var/lib/docker \
  <filebeat-image>   # placeholder: substitute the Filebeat image you use
/var/lib/docker contains all the logs for every container running on the host's Docker daemon. The data of the log files in this directory is the same data you would get from running docker logs <container_id> on each container.
Then in the filebeat.yml configuration file, put:
filebeat:
  prospectors:
    -
      paths:
        - /var/lib/docker/containers/*/*.log
Then configure Filebeat to forward the logs to the rest of your ELK stack and start the container. All the Docker container logs on that machine will be forwarded to your ELK stack automatically.
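The forwarding part of filebeat.yml might look like this, shipping straight to Elasticsearch (the host is a placeholder; swap in the logstash output if you want parsing first):

output:
  elasticsearch:
    hosts: ["elasticsearch.example.com:9200"]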
The nice thing about this approach is that it also lets you forward the rest of the host system's logs if you want to. Simply add another volume pointing to the host system log files you want to forward, and add that path to your filebeat.yml config as well.
I find this method cleaner and more flexible than alternatives such as the Docker logging drivers, because the rest of your Docker setup stays the same. You don't have to add logging driver flags to each docker run command (or to the Docker daemon parameters).