Scaling filebeat over docker containers

I’m looking for the appropriate way to monitor application logs produced by nginx, tomcat, and springboot running in docker, using filebeat and ELK.
In the container strategy, a container should be used for only one purpose:
one nginx per container and one tomcat per container, meaning we can’t add an extra filebeat process inside an nginx or tomcat container.
From what I have read on the Internet, we could have the following setup:
a volume dedicated for storing logs
an nginx container which mounts the dedicated logs volume
a tomcat / springboot container which mounts the dedicated logs volume
a filebeat container also mounting the dedicated logs volume
This works fine, but when it comes to scaling out the nginx and springboot containers, it gets a little more complex for me.
Which pattern should I use to push my logs to logstash with filebeat if I have the following configuration:
several load-balanced nginx containers with the same configuration (the logging configuration is the same: same path)
several springboot rest api containers behind the nginx containers with the same configuration (the logging configuration is the same: same path)
Should I create one volume per set of nginx + springboot rest api containers and add a filebeat container?
Should I create a global log volume shared by all my containers with a different log filename per container
(putting the name of the container in the log filename?) and have only one filebeat container?
In the second proposal, how do I scale filebeat?
Is there another way to do this?
Many thanks for your help.

The easiest thing to do, if you can manage it, is to set each container process to log to its own stdout (you might be able to specify /dev/stdout or /proc/1/fd/1 as a log file). For example, the Docker Hub nginx Dockerfile specifies
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log
so the ordinary nginx logs become the container logs. Once you do that, you can plug in the filebeat container input to read those logs and process them. You can also see them from outside the container with docker logs; they are the same logs.
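A minimal filebeat.yml sketch for that pattern, assuming the filebeat container mounts the host's /var/lib/docker/containers directory read-only and ships to a hypothetical Logstash endpoint, could look like:
filebeat.inputs:
  - type: container
    # default Docker JSON log location, bind-mounted read-only into the filebeat container
    paths:
      - /var/lib/docker/containers/*/*.log

output.logstash:
  hosts: ["logstash:5044"]   # hypothetical Logstash host and port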
What if you have to log to the filesystem? Or there are multiple separate log streams you want to be able to collect?
If the number of containers is variable, but you have good control over their configuration, then I'd probably set up a single global log volume as you describe and use the filebeat log input to read every log file in that directory tree.
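A sketch of that filebeat.yml, assuming the shared volume is mounted at /logs inside the filebeat container and each application uses its own filename or subdirectory (both paths here are illustrative):
filebeat.inputs:
  - type: log
    paths:
      - /logs/*.log      # e.g. one file per container, named after the container
      - /logs/*/*.log    # or one subdirectory per container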
If the number of containers is fixed, then you can set up a volume per container and mount it in each container's "usual" log storage location. Then mount all of those directories into the filebeat container. The obvious problem here is that if you do start or stop a container, you'll need to restart the log manager for the added/removed volume.
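A docker-compose sketch of that layout, with hypothetical service, image, and volume names and a fixed set of containers:
version: "3"
services:
  nginx-1:
    image: nginx
    volumes:
      - nginx-1-logs:/var/log/nginx
  api-1:
    image: my-springboot-api          # hypothetical application image
    volumes:
      - api-1-logs:/app/logs          # wherever the application writes its files
  filebeat:
    image: docker.elastic.co/beats/filebeat:7.17.0   # pick the version matching your stack
    volumes:
      - nginx-1-logs:/mnt/logs/nginx-1:ro
      - api-1-logs:/mnt/logs/api-1:ro
volumes:
  nginx-1-logs:
  api-1-logs: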
If you're actually on Kubernetes, there are two more possibilities. If you're trying to collect container logs out of the filesystem, you need to run a copy of filebeat on every node; a DaemonSet can manage this for you. A Kubernetes pod can also run multiple containers, so your other option is to set up pods with both an application container and a filebeat "sidecar" container that ships the logs off. Set up the pod with an emptyDir volume to hold the logs, and mount it into both containers. A template system like Helm can help you write the pod specifications without repeating the logging sidecar setup over and over.
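A trimmed-down pod specification along those lines (image names and paths are placeholders, and the filebeat sidecar would also need its own filebeat.yml mounted in, omitted here):
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  volumes:
    - name: app-logs
      emptyDir: {}
  containers:
    - name: app
      image: my-springboot-api:latest      # hypothetical application image
      volumeMounts:
        - name: app-logs
          mountPath: /app/logs             # where the application writes its log files
    - name: filebeat                       # logging sidecar
      image: docker.elastic.co/beats/filebeat:7.17.0
      volumeMounts:
        - name: app-logs
          mountPath: /app/logs
          readOnly: true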

Related

How to start syslogd in nginx container

I'm using an Nginx docker container as the base image for an application. I'm redirecting Nginx logs to syslog, but I'm not sure of the best way to have the busybox syslogd started. It all works if I start it manually; I just need it to run as a daemon automatically when the container runs.
Seeing that nginx is in init.d, I tried this in my Dockerfile:
RUN ln -s /bin/busybox syslogd /etc/init.d/syslogd || :
But syslogd still didn't run on start-up. Since the documentation says that only one CMD is allowed, I have the following hack:
FROM nginx:mainline-alpine
CMD nginx & busybox syslogd -n
This works, locally at least, but I'm wondering what the proper solution is. By default the container already symlinks log files to stdout and stderr, but I don't want to use docker's syslog logging driver, because the application will be deployed to Kubernetes and I need a self-contained solution that will work in the pod. Thank you!
Have your container log to stdout, but collect the logs elsewhere.
One option is to configure Docker itself to send container logs to syslog:
docker run --log-driver=syslog --log-opt syslog-address=udp://... ... nginx
Since the Docker daemon itself is configuring this, the syslog-address needs to be something that can be reached from the host. If you're running syslogd in a separate container, this option needs to point at a published port.
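The docker-compose equivalent of the command above, assuming a syslog server at a placeholder address, would be roughly:
services:
  nginx:
    image: nginx:mainline-alpine
    logging:
      driver: syslog
      options:
        syslog-address: "udp://192.0.2.10:514"   # must be reachable from the host, e.g. a published port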
Another option is to use the standard Docker JSON-format logging, but use another tool to forward the logs to somewhere else. This has the downside of needing an additional tool, but the upside of docker logs working unmodified. Fluentd is a prominent open-source option. (Logstash is another, but doesn't seem to directly have a Docker integration.)

Filebeat to monitor logs of several containers which are inside the containers

I have one question: is there any way to ship the logs of each container when the log files are located inside the containers? Actually, the current flow only helps ship the log files located in the default path (/var/lib/docker/containers/*/*.log). I want to customize filebeat.yaml to ship the logs from each container to logstash instead of from the default path.
If you can set your containers to log to stdout rather than to files, it looks like filebeat has an autodiscover mode which will capture the docker logs of every container.
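For instance, a minimal autodiscover sketch, assuming the host's /var/lib/docker/containers directory is mounted into the filebeat container:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log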
Another common setup in an ELK world is to configure logstash on your host, and set up Docker's logging options to send all output on containers' stdout into logstash. This makes docker logs not work, but all of your log output is available via Kibana.
If your container processes always write to log files, you can use the docker run -v option or the Docker Compose volumes: option to mount a host directory onto an individual container's /var/log directory. Then the log files will be visible on the host, and you can use whatever file-based collector you like to capture them. This is in the realm of routine changes that will require you to stop and delete your existing containers before starting them with different options.

How to work with variables in DOCKERFILE

I am creating an NGINX container. I want to write all logs into a mounted volume rather than the default location. I can achieve this by updating the nginx.conf file to point access_log and error_log at a folder in the mounted volume. The twist is that I want each container to write to a container-specific folder within the mounted volume.
For eg:
Container image name: mycontainerapp
Mounted volume: /logdirectory
Then I want:
/var/log to point to /logdirectory/mycontainerapp/{containerID}/log
This way, I can have multiple containers log to the common mounted volume.
AFAIK, I can get the container ID from /proc/1/cpuset.
I am not sure of any other way to get the container ID.
The question is: how can I read that container ID and use it to create the mounted volume (with the folder name) from the DOCKERFILE?
Also, if there is a better approach to what I am trying to achieve, please do let me know as I am a newbie to docker.
Docker has a logging mechanism included which removes standard log files from the equation. All data sent to stdout and stderr will be captured by Docker's logging interface.
There are a number of logging drivers that can then ship logs from your Docker host to a central logging service (Graylog, Syslog, AWS CloudWatch, ETW, Fluentd, Google Cloud, Splunk). The json-file driver is the default, which stores logs locally on the Docker host. journald logs will also be stored and accessible locally.
In the nginx config, or any container for that matter, send the access log to stdout or /dev/fd/1 and the error log to stderr or /dev/fd/2:
daemon off;
error_log /dev/fd/2 info;
http {
  access_log /dev/fd/1;
  ...
}
Once you start applying this concept to all containers, any log management requirements are removed from the container/application level and pushed up to the host. Container metadata can be attached to the logs. It becomes easier to move or change the logging mechanism, and moving to clustered setups like Swarm becomes less of a hassle. This all ties into the one-process-per-container view of the world that Docker pushes.

Is there a better way to run a command or shell on Docker swarm

Let's say I want to edit a config file for an NGINX Docker service that is replicated across 3 nodes.
Currently I list the services using docker service ls.
Then I get the details to find a node running a container for that service using docker service ps servicename.
Then I ssh to a node where one of the containers is running.
Finally, docker exec -it containername bash. Then I edit the config file.
Two questions:
Is there a better way to do this rather than ssh to a node running a container? Maybe there is a swarm or service command to do so?
If I were to edit that config file on one container would that change be replicated to the other 2 containers in the swarm?
The purpose of this exercise would be to edit configuration without shutting down a service.
You should not be exec'ing into containers to change their configuration, and so docker has not created an easy way to do this within Swarm Mode. You could use classic swarm to avoid the need to ssh into the other host, but I still don't recommend this.
The correct way to do this is to migrate your configuration file into a docker config entry. Version your config name. Then when you want to update it, you create a new version with the desired changes, and do a rolling update of your service to use that new configuration.
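A compose-file sketch of that pattern for docker stack deploy, with a hypothetical versioned config name:
version: "3.8"
services:
  nginx:
    image: nginx:mainline-alpine
    configs:
      - source: nginx_conf_v2
        target: /etc/nginx/nginx.conf
configs:
  nginx_conf_v2:
    file: ./nginx.conf
To change it later, you would create nginx_conf_v3 from the edited file, point the service at it, and redeploy the stack; swarm then rolls the service over to the new config instead of you editing files inside running containers.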
Unless the config is mounted from an external source like NFS, changes to a config in one container will not apply to containers running on other nodes. If that config is stored locally inside your container as part of its internal copy-on-write filesystem, then no changes from one container will be visible in any other container.

Running filebeat on docker host OS and collecting logs from containers

I have a server that is the host OS for multiple docker containers. Each of the containers contains an application that is creating logs. I want these logs to be sent to a single place by using the syslog daemon, and then I want filebeat to transmit this data to another server. Is it possible to install filebeat on the host OS (without making another container for filebeat), and have the containerized applications' log data collected by the syslog daemon and consolidated in /var/log on the host OS? Thanks.
You need to share a volume with every container in order to get your logs in the host filesystem.
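A compose sketch of that volume sharing, with a hypothetical image and host path:
services:
  myapp:
    image: myapp:latest                            # hypothetical image writing files under /var/log/myapp
    volumes:
      - /var/log/containers/myapp:/var/log/myapp   # host directory that a host-installed filebeat can read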
Then, you can install filebeat on the host and forward the logs wherever you want, as if they were "standard" log files.
Please be aware that docker containers usually do not write their logs to real log files, but to stdout. That means you'll probably need custom images in order to change this logging behaviour.
