Docker container application logs to ELK stack without Filebeat

I'm using Elastic Cloud, as it appears to be the most suitable option for quickly setting up application logging. I have 24 Docker containers running on different nodes, and some containers also have a number of replicas. I want to export the logs from inside the Docker containers to the ELK stack. I don't want to install Filebeat in each of my containers because that seems to go directly against Docker's separation-of-duties mantra.
How do I get the logs from my application containers to the Logstash server?

You can send your syslog to Logstash by configuring rsyslogd like this:
# /etc/rsyslog.d/99-ship-syslog.conf
*.*;auth,authpriv.none action(
  type="omfwd"
  Target="myremote.elk-server.net"
  Port="5001"
  Protocol="udp"
)
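On the Logstash side, something has to be listening on that UDP port. A minimal pipeline sketch (the port matches the rsyslog example above; the Elasticsearch endpoint is a placeholder you would replace with your Elastic Cloud settings):
# logstash.conf (sketch)
input {
  udp {
    port => 5001
    type => "syslog"
  }
}
output {
  elasticsearch {
    # placeholder endpoint: replace with your Elastic Cloud host
    hosts => ["https://your-deployment.example.com:9243"]
  }
}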
If you don't have rsyslog running yet, you can add it like so (Alpine Linux example):
# Dockerfile
FROM alpine:3.7
RUN apk update \
&& apk add rsyslog
COPY rsyslog.conf /etc/rsyslog.conf
EXPOSE 514 514/udp
VOLUME [ "/var/log", "/etc/rsyslog.d" ]
ENTRYPOINT [ "rsyslogd", "-n" ]
--
# rsyslog.conf
#
# if you experience problems, check:
# http://www.rsyslog.com/troubleshoot
#### MODULES ####
module(load="imuxsock") # local system logging support (e.g. via logger command)
#module(load="imklog") # kernel logging support (previously done by rklogd)
module(load="immark") # --MARK-- message support
module(load="imudp") # UDP listener support
input(type="imudp" port="514")
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* action(type="omfile" file="/dev/console")
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none action(type="omfile" file="/var/log/messages")
# The authpriv file has restricted access.
authpriv.* action(type="omfile" file="/var/log/secure")
# Log all the mail messages in one place.
mail.* action(type="omfile" file="/var/log/maillog")
# Log cron stuff
cron.* action(type="omfile" file="/var/log/cron")
# Everybody gets emergency messages
*.emerg action(type="omusrmsg" users="*")
# Save news errors of level crit and higher in a special file.
uucp,news.crit action(type="omfile" file="/var/log/spooler")
# Save boot messages also to boot.log
local7.* action(type="omfile" file="/var/log/boot.log")
# log every remote host in its own directory
$template RemoteHostMessages,"/var/log/%FROMHOST-IP%/messages"
*.* ?RemoteHostMessages
# Include all .conf files in /etc/rsyslog.d
$IncludeConfig /etc/rsyslog.d/*.conf
$template GRAYLOGRFC5424,"<%PRI%>%PROTOCOL-VERSION% %TIMESTAMP:::date-rfc3339% %HOSTNAME% %APP-NAME% %PROCID% %MSGID% %STRUCTURED-DATA% %msg%\n"
*.* @@graylog:514;GRAYLOGRFC5424 # forward everything to the remote server (@@ = TCP, use a single @ for UDP)
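To build and run the image above as a dedicated logging container (a sketch; the image name is a placeholder):
docker build -t rsyslog-forwarder .
docker run -d --name rsyslog -p 514:514 -p 514:514/udp rsyslog-forwarder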
Since you're running a Java application, you can even send your logs directly to syslog. Here's a small configuration example with Log4j (its SyslogAppender ships messages over UDP, to port 514 by default):
log4j.rootLogger=INFO, SYSLOG
log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
log4j.appender.SYSLOG.syslogHost=myremote.elk-server.net
log4j.appender.SYSLOG.layout=org.apache.log4j.PatternLayout
log4j.appender.SYSLOG.layout.conversionPattern=%d{ISO8601} %-5p [%t] %c{2} %x - %m%n
log4j.appender.SYSLOG.Facility=LOCAL1

Related

How to log to the journal with MQTT Mosquitto

Mosquitto has an extensive logging mechanism, but it seems it can only log to syslog.
Is there a way to get the logs captured in the systemd journal without running it as a service?
Note: Mosquitto runs inside a Docker container.
The logging section of mosquitto.conf is:
# =================================================================
# Logging
# =================================================================
# Places to log to. Use multiple log_dest lines for multiple
# logging destinations.
# Possible destinations are: stdout stderr syslog topic file dlt
#
# stdout and stderr log to the console on the named output.
#
# syslog uses the userspace syslog facility which usually ends up
# in /var/log/messages or similar.
#
# topic logs to the broker topic '$SYS/broker/log/<severity>',
# where severity is one of D, E, W, N, I, M which are debug, error,
# warning, notice, information and message. Message type severity is used by
# the subscribe/unsubscribe log_types and publishes log messages to
# $SYS/broker/log/M/subscribe or $SYS/broker/log/M/unsubscribe.
#
# The file destination requires an additional parameter which is the file to be
# logged to, e.g. "log_dest file /var/log/mosquitto.log". The file will be
# closed and reopened when the broker receives a HUP signal. Only a single file
# destination may be configured.
#
# The dlt destination is for the automotive `Diagnostic Log and Trace` tool.
# This requires that Mosquitto has been compiled with DLT support.
#
# Note that if the broker is running as a Windows service it will default to
# "log_dest none" and neither stdout nor stderr logging is available.
# Use "log_dest none" if you wish to disable logging.
log_dest syslog
log_dest file /var/log/mosquitto/mosquitto.log
# Types of messages to log. Use multiple log_type lines for logging
# multiple types of messages.
# Possible types are: debug, error, warning, notice, information,
# none, subscribe, unsubscribe, websockets, all.
# Note that debug type messages are for decoding the incoming/outgoing
# network packets. They are not logged in "topics".
#log_type error
#log_type warning
#log_type notice
log_type all
# If set to true, client connection and disconnection messages will be included
# in the log.
#connection_messages true
# If using syslog logging (not on Windows), messages will be logged to the
# "daemon" facility by default. Use the log_facility option to choose which of
# local0 to local7 to log to instead. The option value should be an integer
# value, e.g. "log_facility 5" to use local5.
log_facility 5
# If set to true, add a timestamp value to each log message.
#log_timestamp true
# Set the format of the log timestamp. If left unset, this is the number of
# seconds since the Unix epoch.
# This is a free text string which will be passed to the strftime function. To
# get an ISO 8601 datetime, for example:
# log_timestamp_format %Y-%m-%dT%H:%M:%S
#log_timestamp_format
# Change the websockets logging level. This is a global option, it is not
# possible to set per listener. This is an integer that is interpreted by
# libwebsockets as a bit mask for its lws_log_levels enum. See the
# libwebsockets documentation for more details. "log_type websockets" must also
# be enabled.
#websockets_log_level 0
Because log_dest syslog is specified, logging to syslog should be enabled, shouldn't it?
There are Mosquitto logs in /var/log/syslog:
Nov 17 23:08:29 raspberrypi mosquitto[40]: 1668726509: Opening ipv6 listen socket on port 1883.
Nov 17 23:08:29 raspberrypi mosquitto[40]: 1668726509: Opening ipv4 listen socket on port 1883.
Nov 17 23:08:29 raspberrypi mosquitto[40]: 1668726509: Opening websockets listen socket on port 9001.
Nov 17 23:08:29 raspberrypi mosquitto[40]: 1668726509: mosquitto version 2.0.15 running
...
Nov 30 12:12:18 raspberrypi mosquitto[39]: 1669810338: Sending PUBLISH to shell (d0, q0, r0, m0, '$SYS/broker/load/bytes/sent/1min', ... (4 bytes))
Nov 30 12:12:29 raspberrypi mosquitto[39]: 1669810349: Sending PUBLISH to shell (d0, q0, r0, m0, '$SYS/broker/load/bytes/sent/1min', ... (4 bytes))
Nov 30 12:12:40 raspberrypi mosquitto[39]: 1669810360: Sending PUBLISH to shell (d0, q0, r0, m0, '$SYS/broker/load/bytes/sent/1min', ... (4 bytes))
But they do not end up in the journal. I suspect it is because no systemd init is running. Can this be worked around?
Just configure Mosquitto to log to syslog and it will end up in the journal.
Setting log_dest syslog should just work.
Edit for clarity:
This works on my Fedora 36 machine, where journald appears to be consuming the syslog messages, as can be seen by running journalctl -f -t mosquitto, which outputs the usual Mosquitto startup log info.
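If the broker's syslog messages can't reach the host journal from inside the container (for example, because /dev/log isn't shared with it), another workaround, separate from the answer above, is to let Docker forward the container's output to the journal. A sketch using the official eclipse-mosquitto image:
# mosquitto.conf: log to the container's stdout instead of syslog
log_dest stdout
# run the broker with Docker's journald logging driver
docker run -d --name mosquitto --log-driver=journald eclipse-mosquitto
# the entries then appear in the host journal
journalctl -f CONTAINER_NAME=mosquitto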

Apache Flume agent not starting but not showing error

I'm attempting to run an Apache Flume agent from an AWS EC2 cluster but when I start the agent, it neither starts nor throws an obvious error.
I'm just starting with the simple example from Apache's documentation.
When I run:
ubuntu@ip-172-31-41-5:~/Flume$ ./bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=DEBUG,console
The console output is the following:
Info: Sourcing environment configuration script /home/ubuntu/Flume/conf/flume-env.sh
Info: Including HBASE libraries found via (/home/ubuntu/hbase-2.4.4/bin/hbase) for HBASE access
Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx20m -Dflume.root.logger=DEBUG,console -Dflume.root.logger=DEBUG,console -cp '/home/ubuntu/Flume/conf:/home/ubuntu/Flume/lib/*:/home/ubuntu/hbase-2.4.4/conf:/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/ubuntu/hbase-2.4.4:/home/ubuntu/hbase-2.4.4/lib/shaded-clients/hbase-shaded-client-2.4.4.jar:/home/ubuntu/hbase-2.4.4/lib/client-facing-thirdparty/audience-annotations-0.5.0.jar:/home/ubuntu/hbase-2.4.4/lib/client-facing-thirdparty/commons-logging-1.2.jar:/home/ubuntu/hbase-2.4.4/lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar:/home/ubuntu/hbase-2.4.4/lib/client-facing-thirdparty/log4j-1.2.17.jar:/home/ubuntu/hbase-2.4.4/lib/client-facing-thirdparty/slf4j-api-1.7.30.jar:/home/ubuntu/hbase-2.4.4/conf:/lib/*' -Djava.library.path=:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib org.apache.flume.node.Application --conf-file example.conf --name a1 ./bin/flume-ng agent --conf-file example.conf --name a1
The agent doesn't throw an error but never gets further than this. I have also tried some variations including --conf-file conf/example.conf
Flume and Java appear to be installed correctly:
Flume
ubuntu@ip-172-31-41-5:~/Flume$ ./bin/flume-ng version
Source code repository: https://git.apache.org/repos/asf/flume.git
Revision: 1a15927e594fd0d05a59d804b90a9c31ec93f5e1
Compiled by rgoers on Sun Oct 16 14:44:15 MST 2022
From source with checksum bbbca682177262aac3a89defde369a37
Java
ubuntu@ip-172-31-41-5:~/Flume$ java -version
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu222.04)
OpenJDK 64-Bit Server VM (build 11.0.17+8-post-Ubuntu-1ubuntu222.04, mixed mode, sharing)
example.conf
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
flume-env.sh
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.
# Environment variables can be set here.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
# Let Flume write raw event data and configuration information to its log files for debugging
# purposes. Enabling these flags is not recommended in production,
# as it may result in logging sensitive user information or encryption secrets.
# export JAVA_OPTS="$JAVA_OPTS -Dorg.apache.flume.log.rawdata=true -Dorg.apache.flume.log.printconfig=true "
# Note that the Flume conf directory is always included in the classpath.
#FLUME_CLASSPATH=""
The only clue that I have is in flume.log, which shows the following error. I've even copied example.conf into the main Flume directory, but it doesn't seem to make a difference.
03 Dec 2022 21:10:03,538 ERROR [main] (org.apache.flume.node.Application.main:506) - A fatal error occurred while running. Exception follows.
org.apache.flume.conf.ConfigurationException: Unable to read file /home/ubuntu/Flume/example.conf
    at org.apache.flume.node.FileConfigurationSource.<init>(FileConfigurationSource.java:52) ~[flume-ng-node-1.11.0.jar:1.11.0]
    at org.apache.flume.node.FileConfigurationSourceFactory.createConfigurationSource(FileConfigurationSourceFactory.java:40) ~[flume-ng-node-1.11.0.jar:1.11.0]
    at org.apache.flume.node.ConfigurationSourceFactory.getConfigurationSource(ConfigurationSourceFactory.java:39) ~[flume-ng-node-1.11.0.jar:1.11.0]
    at org.apache.flume.node.Application.main(Application.java:476) ~[flume-ng-node-1.11.0.jar:1.11.0]
Caused by: java.nio.file.NoSuchFileException: /home/ubuntu/Flume/example.conf
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:1.8.0_352]
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:1.8.0_352]
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:1.8.0_352]
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:1.8.0_352]
    at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_352]
    at java.nio.file.Files.newByteChannel(Files.java:407) ~[?:1.8.0_352]
    at java.nio.file.Files.readAllBytes(Files.java:3152) ~[?:1.8.0_352]
    at org.apache.flume.node.FileConfigurationSource.<init>(FileConfigurationSource.java:49) ~[flume-ng-node-1.11.0.jar:1.11.0]
    ... 3 more
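The NoSuchFileException names the path the agent actually tried to read (/home/ubuntu/Flume/example.conf), so one sanity check, reusing the paths from the output above, is to confirm where the file really lives and pass that path to --conf-file explicitly:
ls -l /home/ubuntu/Flume/example.conf /home/ubuntu/Flume/conf/example.conf
./bin/flume-ng agent --conf conf --conf-file /home/ubuntu/Flume/conf/example.conf --name a1 -Dflume.root.logger=DEBUG,console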

Filebeat not running with Elasticsearch and Kibana on EC2, not using Logstash

I want to display nginx logs in Kibana.
Elasticsearch and Kibana are running fine.
The nginx logs are stored in /var/log/nginx/*.log.
I installed Filebeat and enabled its nginx module.
filebeat.yml
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
# filestream is an input for collecting log messages from files.
- type: filestream
# Change to true to enable this input configuration.
enabled: false
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /var/log/*.log
#- c:\programdata\elasticsearch\logs\*
# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list.
#exclude_lines: ['^DBG']
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list.
#include_lines: ['^ERR', '^WARN']
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
#prospector.scanner.exclude_files: ['.gz$']
# Optional additional fields. These fields can be freely picked
# to add additional information to the crawled log files for filtering
#fields:
# level: debug
# review: 1
# ============================== Filebeat modules ==============================
filebeat.config.modules:
# Glob pattern for configuration loading
path: ${path.config}/modules.d/*.yml
# Set to true to enable config reloading
reload.enabled: false
# Period on which files under path should be checked for changes
#reload.period: 10s
# ======================= Elasticsearch template setting =======================
setup.template.settings:
index.number_of_shards: 1
#index.codec: best_compression
#_source.enabled: false
# ================================== General ===================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
# =================================== Kibana ===================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
host: "localhost:3000"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
# =============================== Elastic Cloud ================================
# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]
# Protocol - either `http` (default) or `https`.
#protocol: "https"
# Authentication credentials - either API key or username/password.
#api_key: "id:api_key"
#username: "temp"
#password: "temp#1234"
# ------------------------------ Logstash Output -------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# ================================= Processors =================================
processors:
- add_host_metadata:
when.not.contains.tags: forwarded
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~
# ================================== Logging ===================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]
# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
# ============================== Instrumentation ===============================
# Instrumentation support for the filebeat.
#instrumentation:
# Set to true to enable instrumentation of filebeat.
#enabled: false
# Environment in which filebeat is running on (eg: staging, production, etc.)
#environment: ""
# APM Server hosts to report instrumentation results to.
#hosts:
# - http://localhost:8200
# API Key for the APM Server(s).
# If api_key is set then secret_token will be ignored.
#api_key:
# Secret token for the APM Server(s).
#secret_token:
# ================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
And the /etc/filebeat/modules/nginx.yml file:
# Module: nginx
access:
  enabled: true
  # Set custom paths for the log files. If left empty,
  # Filebeat will choose the paths depending on your OS.
  var.paths: ["var/logs/nginx/*access.log"]

# Error logs
error:
  enabled: true
  # Set custom paths for the log files. If left empty,
  # Filebeat will choose the paths depending on your OS.
  var.paths: ["var/logs/nginx/*error.log"]

# Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
ingress_controller:
  enabled: false
  # Set custom paths for the log files. If left empty,
  # Filebeat will choose the paths depending on your OS.
  #var.paths:
After I run the filebeat setup -e command, I get the following output:
2021-12-18T19:19:44.352+0530 INFO cfgfile/reload.go:262 Loading of config files completed.
2021-12-18T19:19:44.352+0530 INFO [load] cfgfile/list.go:129 Stopping 2 runners ...
Loaded Ingest pipelines
After this, when I run systemctl restart filebeat, I get the following error:
● filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
Loaded: loaded (/usr/lib/systemd/system/filebeat.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Sat 2021-12-18 19:03:53 IST; 17min ago
Docs: https://www.elastic.co/beats/filebeat
Main PID: 49606 (code=exited, status=1/FAILURE)
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: Unit filebeat.service entered failed state.
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: filebeat.service failed.
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: filebeat.service holdoff time over, scheduling restart.
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: Stopped Filebeat sends log files to Logstash or directly to Elasticsearch..
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: start request repeated too quickly for filebeat.service
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: Failed to start Filebeat sends log files to Logstash or directly to Elasticsearch..
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: Unit filebeat.service entered failed state.
Dec 18 19:03:53 ip-10-249-5-178.ap-south-1.compute.internal systemd[1]: filebeat.service failed.
Any help?
Solved by deleting the registry directory:
rm -r /var/lib/filebeat/registry
I don't know the reason behind it, but after this the service started successfully.
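If the service keeps failing like this, the underlying error is usually quicker to find with Filebeat's built-in checks and the systemd journal (standard commands, shown here as a sketch):
filebeat test config
filebeat test output
journalctl -u filebeat -n 50 --no-pager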

Winlogbeat setup error: x509 certificate is valid for <ip>, not <same ip>

I'm trying to send logs from Winlogbeat to my ELK stack.
I installed my ELK stack with docker and configured TLS on it.
I did everything according to the official guide and it worked for my host.
However, when copying the same Winlogbeat directory to my Event Collector server, it did not work (all files are the same, including the yml file).
When trying to run winlogbeat.exe setup -e, I got the following error: 'error connecting to elasticsearch at "https://elastic-host:9200" Get "https://elastic-host:9200" Winlogbeat setup error: x509: certificate is valid for <elastic-host ip>, not <elastic-host ip>' (same IPs). The CA is already added to the trusted root certificates. Everything is configured the same as on the host; on the host it works, on the server it doesn't. (The ELK server and the EVC are in the same segment, so there shouldn't be any firewall drops.)
My .yml (same file on host and EVC server):
On the host it works without the ssl settings as well, and the traffic is still encrypted due to the TLS I configured on the Docker cluster, so I'm not sure the ssl settings are needed (but I wanted to include them in case they are).
# This file is an example configuration file highlighting only the most common
# options. The winlogbeat.reference.yml file from the same directory contains
# all the supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/winlogbeat/index.html
# ======================== Winlogbeat specific options =========================
# event_logs specifies a list of event logs to monitor as well as any
# accompanying options. The YAML data type of event_logs is a list of
# dictionaries.
#
# The supported keys are name (required), tags, fields, fields_under_root,
# forwarded, ignore_older, level, event_id, provider, and include_xml. Please
# visit the documentation for the complete details of each option.
# https://go.es.io/WinlogbeatConfig
winlogbeat.event_logs:
- name: Application
ignore_older: 72h
- name: System
- name: Security
processors:
- script:
lang: javascript
id: security
file: ${path.home}/module/security/config/winlogbeat-security.js
- name: Microsoft-Windows-Sysmon/Operational
processors:
- script:
lang: javascript
id: sysmon
file: ${path.home}/module/sysmon/config/winlogbeat-sysmon.js
- name: Windows PowerShell
event_id: 400, 403, 600, 800
processors:
- script:
lang: javascript
id: powershell
file: ${path.home}/module/powershell/config/winlogbeat-powershell.js
- name: Microsoft-Windows-PowerShell/Operational
event_id: 4103, 4104, 4105, 4106
processors:
- script:
lang: javascript
id: powershell
file: ${path.home}/module/powershell/config/winlogbeat-powershell.js
- name: ForwardedEvents
tags: [forwarded]
processors:
- script:
when.equals.winlog.channel: Security
lang: javascript
id: security
file: ${path.home}/module/security/config/winlogbeat-security.js
- script:
when.equals.winlog.channel: Microsoft-Windows-Sysmon/Operational
lang: javascript
id: sysmon
file: ${path.home}/module/sysmon/config/winlogbeat-sysmon.js
- script:
when.equals.winlog.channel: Windows PowerShell
lang: javascript
id: powershell
file: ${path.home}/module/powershell/config/winlogbeat-powershell.js
- script:
when.equals.winlog.channel: Microsoft-Windows-PowerShell/Operational
lang: javascript
id: powershell
file: ${path.home}/module/powershell/config/winlogbeat-powershell.js
# ====================== Elasticsearch template settings =======================
setup.template.settings:
index.number_of_shards: 1
#index.codec: best_compression
#_source.enabled: false
# ================================== General ===================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
# =================================== Kibana ===================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
host: "192.168.101.129:5601"
protocol: https
username: "elastic"
password: "passwd"
setup.kibana.ssl.enabled: true
setup.kibana.ssl.certificate_authorities: ["C:\\Program Files\\Winlogbeat\\ca.crt"]
setup.kibana.ssl.certificate: "C:\\Program Files\\Winlogbeat\\winlogbeat.crt"
setup.kibana.ssl.key: "C:\\Program Files\\Winlogbeat\\winlogbeat.key"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
# =============================== Elastic Cloud ================================
# These settings simplify using Winlogbeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
# Array of hosts to connect to.
hosts: ["192.168.101.129:9200"]
username: "elastic"
password: "passwd"
# Protocol - either `http` (default) or `https`.
protocol: "https"
output.elasticsearch.ssl.certificate_authorities: ["C:\\Program Files\\Winlogbeat\\ca.crt"]
output.elasticsearch.ssl.certificate: "C:\\Program Files\\Winlogbeat\\winlogbeat.crt"
output.elasticsearch.ssl.key: "C:\\Program Files\\Winlogbeat\\winlogbeat.key"
# Authentication credentials - either API key or username/password.
#api_key: "id:api_key"
# ------------------------------ Logstash Output -------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# ================================= Processors =================================
processors:
- add_host_metadata:
when.not.contains.tags: forwarded
- add_cloud_metadata: ~
# ================================== Logging ===================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
# ============================= X-Pack Monitoring ==============================
# Winlogbeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Winlogbeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
# ============================== Instrumentation ===============================
# Instrumentation support for the winlogbeat.
#instrumentation:
# Set to true to enable instrumentation of winlogbeat.
#enabled: false
# Environment in which winlogbeat is running on (eg: staging, production, etc.)
#environment: ""
# APM Server hosts to report instrumentation results to.
#hosts:
# - http://localhost:8200
# API Key for the APM Server(s).
# If api_key is set then secret_token will be ignored.
#api_key:
# Secret token for the APM Server(s).
#secret_token:
# ================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
In your output, you need to specify ssl.verification_mode: certificate.
In your example, it looks like it is the Kibana setup section that has a certificate specified on it:
setup.kibana.ssl.enabled: true
setup.kibana.ssl.certificate_authorities: ["C:\\Program Files\\Winlogbeat\\ca.crt"]
setup.kibana.ssl.certificate: "C:\\Program Files\\Winlogbeat\\winlogbeat.crt"
setup.kibana.ssl.key: "C:\\Program Files\\Winlogbeat\\winlogbeat.key"
setup.kibana.ssl.verification_mode: certificate
Older versions of winlogbeat will need ssl.verification_mode: none instead.
See SSL/TLS configuration documentation at https://www.elastic.co/guide/en/beats/winlogbeat/current/configuration-ssl.html
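The same setting applies to the Elasticsearch output from the config above, since that is the connection the setup error complains about. A minimal sketch reusing the CA path from your file (an assumption about your layout):
output.elasticsearch.ssl.certificate_authorities: ["C:\\Program Files\\Winlogbeat\\ca.crt"]
output.elasticsearch.ssl.verification_mode: certificate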

How to capture logs from workers from a Dask-Yarn job?

I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml,
logging-file-config: "/path/to/config.ini"
or
logging:
  version: 1
  disable_existing_loggers: false
  root:
    level: INFO
    handlers: [consoleHandler]
  handlers:
    consoleHandler:
      class: logging.StreamHandler
      level: INFO
      formatter: sample_formatter
      stream: ext://sys.stderr
  formatters:
    sample_formatter:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
and then in my function that gets evaluated at the worker:
import logging
from distributed.worker import logger
import dask
from dask.distributed import Client
from dask_yarn import YarnCluster
log = logging.getLogger(__name__)
@dask.delayed
def worker_func(args):
    logger.info("This will show up in the worker logs")
    log.info("This does not show up in worker logs")
    return

if __name__ == "__main__":
    dag_1 = {'worker_func': (worker_func, arg_1)}
    tasks = dask.get(dag_1, 'load-1')
    log.info("This also shows up in logs, and custom formatted")
    cluster = YarnCluster()
    client = Client(cluster)
    dask.compute(tasks)
When I try to view the yarn logs using:
yarn logs -applicationId {application_id}
I do not see the log from log.info inside worker_func, but I do see the logs from distributed.worker.logger and from outside that function on the console. I also tried using client.get_worker_logs(), but that returned an empty dictionary. Is there a way to see customized logs from inside the function that gets evaluated at a worker?
There's a lot going on in this question, so I'm going to answer "How do I configure logging for dask-yarn workers" and hope everything else becomes clear by answering that.
Dask's configuration system is loaded locally on the machine you start a dask cluster from (usually the edge node). This configuration is not distributed to the workers automatically; you're responsible for doing that yourself. You have a few options here:
Have admin/IT put configuration in /etc/dask/ on every node, which will affect all users.
Bundle configuration with your packaged environment. Dask will load configuration from {prefix}/etc/dask/, where prefix is sys.prefix.
For example, if you have a conda environment at /path/to/environment, you'd do the following to bundle the configuration
# Create the configuration directory in the environment
mkdir -p /path/to/environment/etc/dask/
# Add your configuration to this directory
mv config.yaml /path/to/environment/etc/dask/config.yaml
# Package the environment
conda pack -p /path/to/environment -o environment.tar.gz
Any configuration values set in config.yaml will now be available on all the worker nodes. An example configuration file setting some logging configuration would be:
logging:
  version: 1
  root:
    level: INFO
    handlers: [consoleHandler]
  handlers:
    consoleHandler:
      class: logging.StreamHandler
      level: INFO
      formatter: sample_formatter
      stream: ext://sys.stderr
  formatters:
    sample_formatter:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
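You would then point dask-yarn at the packed archive so each worker starts from that environment (a minimal sketch; the archive name matches the conda-pack example above):
from dask.distributed import Client
from dask_yarn import YarnCluster

# dask-yarn ships the archive to the worker containers and activates it there
cluster = YarnCluster(environment="environment.tar.gz")
client = Client(cluster)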
Logs from completed dask-yarn applications can be retrieved using the YARN cli at
yarn logs -applicationId <application-id>
Logs for running dask-yarn applications can be retrieved using client.get_worker_logs(). Note that these logs will only contain records written to the distributed.worker logger; you cannot write to your own logger and have its output appear in the result of client.get_worker_logs(). To write to this logger, get it via
import logging
logger = logging.getLogger("distributed.worker")
logger.info("Writing with the worker logger")
Any logger appropriately configured to log to stdout or stderr will appear in the logs accessed via the yarn CLI, but only the distributed.worker logger output will also be available to get_worker_logs().
Side note
I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml
The names of the config files don't matter; Dask loads all YAML files in all of its config directories and merges their contents. For more information, please read the configuration docs.
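As a quick local sanity check that your logging settings were actually merged into Dask's configuration, you can inspect the loaded config on the machine where you launch the cluster (a sketch):
import dask

# prints whatever 'logging' section Dask merged from its config files
print(dask.config.get("logging", default={}))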
