I have a set of dockerized applications scattered across multiple servers and trying to setup production-level centralized logging with ELK. I'm ok with the ELK part itself, but I'm a little confused about how to forward the logs to my logstashes.
I'm trying to use Filebeat, because of its loadbalance feature.
I'd also like to avoid packing Filebeat (or anything else) into all my dockers, and keep it separated, dockerized or not.
How can I proceed?
I've been trying the following. My Dockers log on stdout so with a non-dockerized Filebeat configured to read from stdin I do:
docker logs -f mycontainer | ./filebeat -e -c filebeat.yml
That appears to work at the beginning. The first logs are forwarded to my logstash. The cached one I guess. But at some point it gets stuck and keep sending the same event
Is that just a bug or am I headed in the wrong direction? What solution have you setup?
Here's one way to forward docker logs to the ELK stack (requires docker >= 1.8 for the gelf log driver):
Start a Logstash container with the gelf input plugin to reads from gelf and outputs to an Elasticsearch host (ES_HOST:port):
docker run --rm -p 12201:12201/udp logstash \
logstash -e 'input { gelf { } } output { elasticsearch { hosts => ["ES_HOST:PORT"] } }'
Now start a Docker container and use the gelf Docker logging driver. Here's a dumb example:
docker run --log-driver=gelf --log-opt gelf-address=udp://localhost:12201 busybox \
/bin/sh -c 'while true; do echo "Hello $(date)"; sleep 1; done'
Load up Kibana and things that would've landed in docker logs are now visible. The gelf source code shows that some handy fields are generated for you (hat-tip: Christophe Labouisse): _container_id, _container_name, _image_id, _image_name, _command, _tag, _created.
If you use docker-compose (make sure to use docker-compose >= 1.5) and add the appropriate settings in docker-compose.yml after starting the logstash container:
log_driver: "gelf"
log_opt:
gelf-address: "udp://localhost:12201"
Docker allows you to specify the logDriver in use. This answer does not care about Filebeat or load balancing.
In a presentation I used syslog to forward the logs to a Logstash (ELK) instance listening on port 5000.
The following command constantly sends messages through syslog to Logstash:
docker run -t -d --log-driver=syslog --log-opt syslog-address=tcp://127.0.0.1:5000 ubuntu /bin/bash -c 'while true; do echo "Hello $(date)"; sleep 1; done'
Using filebeat you can just pipe docker logs output as you've described. Behavior you are seeing definitely sounds like a bug, but can also be the partial line read configuration hitting you (resend partial lines until newline symbol is found).
A problem I see with piping is possible back pressure in case no logstash is available. If filebeat can not send any events, it will buffer up events internally and at some point stop reading from stdin. No idea how/if docker protects from stdout becoming unresponsive. Another problem with piping might be restart behavior of filebeat + docker if you are using docker-compose. docker-compose by default reuses images + image state. So when you restart, you will ship all old logs again (given the underlying log file has not been rotated yet).
Instead of piping you can try to read the log files written by docker to the host system. The default docker log driver is the json log driver . You can and should configure the json log driver to do log-rotation + keep some old files (for buffering up on disk). See max-size and max-file options. The json driver puts one line of 'json' data for every line to be logged. On the docker host system the log files are written to /var/lib/docker/containers/container_id/container_id-json.log . These files will be forwarded by filebeat to logstash. If logstash or network becomes unavailable or filebeat is restarted, it continues forwarding log lines where it left of (given files have been not deleted due to log rotation). No events will be lost. In logstash you can use the json_lines codec or filter to parse the json lines and a grok filter to gain some more information from your logs.
There has been some discussion about using libbeat (used by filebeat for shipping log files) to add a new log driver to docker. Maybe it is possible to collect logs via dockerbeat in the future by using the docker logs api (I'm not aware of any plans about utilising the logs api, though).
Using syslog is also an option. Maybe you can get some syslog relay on your docker host load balancing log events. Or have syslog write log files and use filebeat to forward them. I think rsyslog has at least some failover mode. You can use logstash syslog input plugin and rsyslog to forward logs to logstash with failover support in case the active logstash instance becomes unavailable.
I created my own docker image using the Docker API to collect the logs of the containers running on the machine and ship them to Logstash thanks to Filebeat. No need to install or configure anything on the host.
Check it out and tell me if it suits your needs: https://hub.docker.com/r/bargenson/filebeat/.
The code is available here: https://github.com/bargenson/docker-filebeat
Just for helping others that need to do this, you can simply use Filebeat to ship the logs. I would use the container by #brice-argenson, but I needed SSL support so I went with a locally installed Filebeat instance.
The prospector from filebeat is (repeat for more containers):
- input_type: log
paths:
- /var/lib/docker/containers/<guid>/*.log
document_type: docker_log
fields:
dockercontainer: container_name
It sucks a bit that you need to know the GUIDs as they could change on updates.
On the logstash server, setup the usual filebeat input source for logstash, and use a filter like this:
filter {
if [type] == "docker_log" {
json {
source => "message"
add_field => [ "received_at", "%{#timestamp}" ]
add_field => [ "received_from", "%{host}" ]
}
mutate {
rename => { "log" => "message" }
}
date {
match => [ "time", "ISO8601" ]
}
}
}
This will parse the JSON from the Docker logs, and set the timestamp to the one reported by Docker.
If you are reading logs from the nginx Docker image, you can add this filter as well:
filter {
if [fields][dockercontainer] == "nginx" {
grok {
match => { "message" => "(?m)%{IPORHOST:targethost} %{COMBINEDAPACHELOG}" }
}
mutate {
convert => { "[bytes]" => "integer" }
convert => { "[response]" => "integer" }
}
mutate {
rename => { "bytes" => "http_streamlen" }
rename => { "response" => "http_statuscode" }
}
}
}
The convert/renames are optional, but fixes an oversight in the COMBINEDAPACHELOG expression where it does not cast these values to integers, making them unavailable for aggregation in Kibana.
I verified what erewok wrote above in a comment:
According to the docs, you should be able to use a pattern like this
in your prospectors.paths: /var/lib/docker/containers/*/*.log – erewok
Apr 18 at 21:03
The docker container guids, represented as the first '*', are correctly resolved when filebeat starts up. I do not know what happens as containers are added.
Related
We have a Java application that can be run in Docker containers. It produces messages to stdout and stderr with a different level of detail for different audiences.
Configuring Splunk as log driver all log lines received by Splunk a marked with source stdout although there must be log lines being logged to stderr.
Splunk log driver configuration in docker-compose:
logging:
driver: splunk
options:
splunk-url: https://splunkhf:8088
splunk-token: [TOKEN]
splunk-index: splunk_index
splunk-insecureskipverify: "true"
splunk-sourcetype: log4j
splunk-format: "json"
tag: "{{.Name}}/{{.ID}}"
Example log message sent to splunk:
{
line: 2021-01-12 11:37:49,191;10718;INFO ;[Thread-1];Logger; ;Executed all shutdown events.
source: stdout
tag: service_95f2bac29286/582385192fde
}
How can I configure Docker or Splunk to differentiate correctly between those different streams?`
If you run the service from docker-compose without -d then the logs lose their original source. It seems that Docker and Docker-Compose put everything from the container's output streams to stdout and use stderr for their logs.
Using the -d flag the log messages do not lose their original output stream.
I'm trying to make a Graylog Docker Container persistent.
Meaning that after restarting (docker-compose down; docker-compose up) the logs will still be there alongside the configuration.
I've used the documentation at https://docs.graylog.org/en/3.1/pages/installation/docker.html I created a yml file with the content under the topic "Persisting data".
I only edited the line "GRAYLOG_HTTP_EXTERNAL_URI=http://127.0.0.1:9000/" to not use localhost but the external ip the machine is using.
Docker works, i can create an input and collect logfiles. What does not work is the data being persistent. Also every time i restart the node id changes, so i have to reconfigure the input. Running docker volume ls lists five volumes 3 of which are the ones created in the yml file.
I don't understand why data is not persistent. Can anybody help?
I had the same problem and I'd been struggling for a while before I found a solution. I'm on 3.2 and also had issues with node persistence. The documentation doesn't seem to directly state that there is one more configuration folder you need to persist, which is:
/usr/share/graylog/data/config
They actually mention it in the Custom configuration files section and when I took a look via CLI in that directory, it turns out that it's where the graylog.conf and node-id (the file Graylog uses to store information about its nodes) are stored as well!
Here's my docker-compose.override.yml section with the necessary changes (marked with '# ADDED' comments)
services:
graylog:
environment:
# CHANGE ME (must be at least 16 characters)!
- GRAYLOG_PASSWORD_SECRET=somepasswordpepper
# Password: admin
- GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
- GRAYLOG_HTTP_EXTERNAL_URI=http://127.0.0.1:9000/
- GRAYLOG_IS_MASTER=true
#- GRAYLOG_NODE_ID_FILE=/usr/share/graylog/data/config/node-id
ports:
# Graylog web interface and REST API
- 9000:9000
# Syslog TCP
- 1514:1514
# Syslog UDP
- 1514:1514/udp
# GELF TCP
- 12201:12201
# GELF UDP
- 12201:12201/udp
volumes:
- "graylogjournal:/usr/share/graylog/data/journal"
- "graylogconfig:/usr/share/graylog/data/config" # ADDED
volumes:
graylogjournal:
driver: local
graylogconfig: # ADDED
driver: local # ADDED
Hope this helps
You can add into daemon.json file these lines ;
{
"log-driver": "gelf",
"log-opts": {
"gelf-address": "udp://1.2.3.4:12201"
}
}
https://docs.docker.com/config/containers/logging/gelf/
I'm trying to set up an interaction between running Docker container's logs and Logstash.
I run my Docker container with the following command:
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 nfrankel/simplelog:1
and the Logstash config.json is:
input {
gelf {}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
}
stdout {}
}
Logstash logs are fine, I see
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]}
Starting gelf listener (udp) ... {:address=>"0.0.0.0:12201"}
Successfully started Logstash API endpoint {:port=>9600}
Nevertheless, it doesn't work. I don't see either logs in Logstash console or that Elasticsearch index is created.
Could you help me to resolve my issue?
Worth to mention that the running Docker container produces logs and I see them in a Cygwin from where I launched it.
Perhaps try configuring the gelf input to accept udp on port 12201 e.g.
input {
gelf {
use_udp => true
port_udp => 12201
}
}
I can tail the logs of a single docker container by doing:
docker logs -f container1
But, how can I tail the logs of multiple containers on the same screen?
docker logs container1 container2
doesn’t work. It gives an error:
“docker logs” requires exactly 1 argument(s).
Thank you.
If you are using docker-compose, this will show all logs from the diferent containers
docker-compose logs -f
If you have access and root to the docker server:
tail -f /var/lib/docker/containers/*/*.log
The docker logs command can't stream multiple logs files.
Logging Drivers
You could use one of the logging drivers other than the default json to ship the logs to a common point. The systemd journald or syslog drivers would readily work on most systems. Any of the other centralised log systems would work too.
Note that configuring syslog on the Docker daemon means that docker logs command can no longer query the logs, they will only be stored where your syslog puts them.
A simple daemon.json for syslog:
{
"log-driver": "syslog",
"log-opts": {
"syslog-address": "tcp://10.8.8.8:514",
"syslog-format": "rfc5424"
}
}
Compose
docker-compose is capable of streaming the logs for all containers it controls under a project.
API
You could write tool that attaches to each container via the API and streams the logs via a websocket. Two of the Java libararies are docker-client and docker-java.
Hack
Or run multiple docker logs and munge the output, in node.js:
const { spawn } = require('child_process')
function run(id){
let dkr = spawn('docker', [ 'logs', '--tail', '1', '-t', '--follow', id ])
dkr.stdout.on('data', data => console.log('%s: stdout', id, data.toString().replace(/\r?\n$/,'')))
dkr.stderr.on('data', data => console.error('%s: stderr', id, data.toString().replace(/\r?\n$/,'')))
dkr.on('close', exit_code => {
if ( exit_code !== 0 ) throw new Error(`Docker logs ${id} exited with ${exit_code}`)
})
}
let args = process.argv.splice(2)
args.forEach(arg => run(arg))
Which dumps data as docker logs writes it.
○→ node docker-logs.js 958cc8b41cd9 1dad69882b3d db4b844d9478
958cc8b41cd9: stdout 2018-03-01T06:37:45.152010823Z hello2
1dad69882b3d: stdout 2018-03-01T06:37:49.392475996Z hello
db4b844d9478: stderr 2018-03-01T06:37:47.336367247Z hello2
958cc8b41cd9: stdout 2018-03-01T06:37:55.155137606Z hello2
db4b844d9478: stderr 2018-03-01T06:37:57.339710598Z hello2
1dad69882b3d: stdout 2018-03-01T06:37:59.393960369Z hello
I am new to nagios.
I am trying to configure the "check_disk" service for one host but I am not getting the expected results.
I should get the emails when when disk usage goes beyond 80%.
So, There is already service defined for this task with multiple hosts, as below:
define service{
use local-service ; Name of service template to use
host_name localhost, host1, host2, host3, host4, host5, host6
service_description Root Partition
check_command check_local_disk!20%!10%!/
contact_groups unix-admins,db-admins
}
The Issue:
Further I tried to test single host i.e. "host2". The current usage of host2 is as follow:
# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg-rootvol 94G 45G 45G 50% /
So to get instant emails, I written another service as below, where warning set to <60% and critical set to <40%.
define service{
use local-service
host_name host2
service_description Root Partition again
check_command check_local_disk!60%!40%!/
contact_groups dev-admins
}
But still I am not receive any emails for the same.
Where it going wrong.
The "check_local_disk" command is defined as below:
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
Your command definition currently is setup to only check your Nagios server's disk, not the remote hosts (such as host2). You need to define a new command definition to execute check_disk on the remote host via NRPE (Nagios Remote Plugin Execution).
On Nagios server, define the following:
define command {
command_name check_remote_disk
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk -a $ARG1$ $ARG2$ $ARG3$
register 1
}
define service{
use genric-service
host_name host1, host2, host3, host4, host5, host6
service_description Root Partition
check_command check_remote_disk!20%!10%!/
contact_groups unix-admins,db-admins
}
Restart the Nagios service.
On the remote host:
Ensure you have NRPE plugin installed.
Instructions for Ubuntu: http://tecadmin.net/install-nrpe-on-ubuntu/
Instructions for CentOS / RHEL: http://sharadchhetri.com/2013/03/02/how-to-install-and-configure-nagios-nrpe-in-centos-and-red-hat/
Ensure there is a command defined for check_disk on the remote host. This is usually included in nrpe.cfg, but commented-out. You'd have to un-comment the line.
Ensure you have the check_disk plugin installed on the remote host. Mine is located at: /usr/lib64/nagios/plugins/check_disk
Ensure that allowed_hosts field of nrpe.cfg includes the IP address / hostname of your Nagios server.
Ensure that dont_blame_nrpe field of nrpe.cfg is set to 1 to allow command line arguments to NRPE commands: dont_blame_nrpe=1
If you made any changes, restart the nrpe service.