Fluentd error: hostname "is tested built-in placeholder(s) but there is no valid placeholder(s)"

I am trying to set up an EFK stack; our environment is as follows:
(a) Elasticsearch and Kibana run on a Windows machine
(b) Fluentd runs on CentOS
I am able to set up EFK, send logs to Elasticsearch, and view them in Kibana successfully with the default fluent.conf.
However, I would like to create indices using the format ${record['kubernetes']['pod_name']}, so I created a ConfigMap as follows:
#include "#{ENV['FLUENTD_SYSTEMD_CONF'] || 'systemd'}.conf"
#include "#{ENV['FLUENTD_PROMETHEUS_CONF'] || 'prometheus'}.conf"
##include kubernetes.conf
##include kubernetes/*.conf
<match fluent.**>
# this tells fluentd to not output its log on stdout
#type null
</match>
# here we read the logs from Docker's containers and parse them
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>
# we use the kubernetes metadata plugin to add metadata to the logs
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
<match kubernetes.var.log.containers.**kube-logging**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**kube-system**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**monitoring**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**infra**.log>
  @type null
</match>
# we send the logs to Elasticsearch
<match kubernetes.**>
  @type elasticsearch_dynamic
  @id out_es
  @log_level debug
  include_tag_key true
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  reload_connections true
  logstash_format true
  logstash_prefix ${record['kubernetes']['pod_name']}
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever true
    retry_max_interval 30
    chunk_limit_size 2M
    queue_limit_length 32
    overflow_action block
  </buffer>
</match>
However, with my own fluent.conf it fails with the following error message.
Error message:
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'host 192.xxx.xx.xxx' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'host: 192.xxx.xx.xxx' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'index_name fluentd' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'index_name: fluentd' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'template_name ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'template_name: ' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'logstash_prefix index-%Y.%m.%d' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_prefix: index-%Y.%m.%d' has timestamp placeholders, but chunk key 'time' is not configured
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'logstash_prefix index-%Y.%m.%d' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_prefix: index-%Y.%m.%d' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'logstash_dateformat %Y.%m.%d' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_dateformat: %Y.%m.%d' has timestamp placeholders, but chunk key 'time' is not configured
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'logstash_dateformat %Y.%m.%d' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_dateformat: %Y.%m.%d' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'deflector_alias ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'deflector_alias: ' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'application_name default' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'application_name: default' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] 'ilm_policy_id logstash-policy' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'ilm_policy_id: logstash-policy' doesn't have tag placeholder
2022-03-03 11:23:59 +0000 [debug]: #0 [out_es] Need substitution: false
I have tried the suggestions I found by googling, but I am not sure what's missing in the config file. Any help is highly appreciated.
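For what it's worth, those [debug] lines are the elasticsearch plugin probing each parameter for built-in placeholders at startup. With the stock elasticsearch output (as opposed to elasticsearch_dynamic), ${...} placeholders only resolve when the referenced field is declared as a chunk key in the <buffer> section. A minimal sketch of per-pod indices done that way; the connection settings and buffer path are carried over from the config above, the rest is an assumption to illustrate the pattern:

<match kubernetes.**>
  @type elasticsearch
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  logstash_format true
  # ${$.kubernetes.pod_name} is a buffer-chunk placeholder, not Ruby code;
  # it only resolves because the same path is listed as a chunk key below
  logstash_prefix ${$.kubernetes.pod_name}
  <buffer tag, time, $.kubernetes.pod_name>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    timekey 1d
    flush_interval 5s
  </buffer>
</match>

With $.kubernetes.pod_name as a chunk key, events are buffered per pod, so each chunk resolves to its own index prefix.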

Related

FluentD Output in Plain Text (non-json) format

I'm new to Fluentd and I'm trying to determine whether we can replace our current syslog application with Fluentd. The issue I'm trying to solve is compatibility between Fluentd and a legacy application which works with rsyslog but cannot handle JSON.
Can Fluentd output data in the format it receives it: plain text (non-JSON) that is RFC 5424 compliant?
From my research on the topic, the output is always JSON. I've explored the single_value option, but that extracts only the message component, which is incomplete without the host.
Any inputs or suggestions are welcome.
Here is the Fluentd config
##########
# INPUTS #
##########
# udp syslog
<source>
  @type syslog
  <transport udp>
  </transport>
  bind 0.0.0.0
  port 514
  tag syslog
  <parse>
    message_format auto
    with_priority true
  </parse>
</source>
###########
# OUTPUTS #
###########
<match syslog**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/syslog
    compress gzip
  </store>
  <store>
    @type file
    path /var/log/td-agent/rfc_syslog
    compress gzip
    <format>
      @type single_value
      message_key message
    </format>
  </store>
</match>
Based on the configuration above, I receive the following outputs.
File output from the syslog location, which is all JSON:
2022-10-21T09:34:53-05:00 syslog.user.info {"host":"icw-pc01.lab","ident":"MSWinEventLog\t2\tSystem\t136\tFri","message":"34:52 2022\t7036\tService Control Manager\tN/A\tN/A\tInformation\ticw-pc01.lab\tNone\t\tThe AppX Deployment Service (AppXSVC) service entered the running state.\t6 "}
File output from the rfc_syslog location, which contains the single value from message_key message:
34:52 2022 7036 Service Control Manager N/A N/A Information icw-pc01.lab None The AppX Deployment Service (AppXSVC) service entered the running state. 6
Desired output that we'd like (to support our legacy apps and legacy integrations):
Oct 21 09:34:53 icw-pc01.lab MSWinEventLog 2 System 136 Fri Oct 21 09:34:52 2022 7036 Service Control Manager N/A N/A Information icw-pc01.lab None The AppX Deployment Service (AppXSVC) service entered the running state. 6
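One hedged way to get a line like that out of the file store is to assemble it into a new field with the bundled record_transformer filter and emit that field via single_value; the host/ident/message field names come from the JSON output above, while the raw field name and the exact layout are assumptions:

<filter syslog**>
  @type record_transformer
  enable_ruby true
  <record>
    # rebuild "Oct 21 09:34:53 host ident message" from the parsed fields
    raw ${Time.at(time).strftime('%b %d %H:%M:%S')} ${record['host']} ${record['ident']} ${record['message']}
  </record>
</filter>

Then point the single_value formatter at message_key raw instead of message.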
Update:
The suggestion below solved the parsing as desired. However, when I try to forward the data to a remote syslog server, it still goes out as JSON. Below is the revised Fluentd config.
##########
# INPUTS #
##########
# udp syslog
<source>
  @type syslog
  <transport udp>
  </transport>
  bind 0.0.0.0
  port 514
  tag syslog
  <parse>
    @type none
    message_format auto
    with_priority true
  </parse>
</source>
###########
# OUTPUTS #
###########
<match syslog**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/syslog
    compress gzip
  </store>
  <store>
    @type file
    path /var/log/td-agent/rfc_syslog
    compress gzip
    <format>
      @type single_value
      message_key message
    </format>
    tag rfc_syslog
  </store>
  <store>
    @type forward
    <server>
      host 192.168.0.2
      port 514
    </server>
  </store>
</match>
<match rfc_syslog**>
  @type forward
  <server>
    host 192.168.0.3
    port 514
  </server>
</match>
When configured as above, no forwarding happens to 192.168.0.3 (my guess is that the tag is not getting applied).
As for the forwarding to 192.168.0.2, I see the messages in the Kiwi Syslog Server, but they are in JSON (which is what I was trying to avoid for my legacy app).
Here is the output in the Kiwi Syslog app (screenshot: kiwi-syslog-output).
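A side note that may explain both symptoms: out_forward speaks Fluentd's own MessagePack-based protocol, so a plain syslog receiver behind a <store> @type forward block will always see serialized payloads, and tag is not a valid parameter inside a <store> block, which is why nothing ever matches rfc_syslog**. Since fluent-plugin-remote_syslog appears in the installed gem list below, a hedged sketch that sends plain syslog lines to the remote host could look like:

<match syslog**>
  @type remote_syslog
  host 192.168.0.3
  port 514
  protocol udp
  # program is the syslog tag field; "fluentd" is an assumption
  program fluentd
  <format>
    @type single_value
    message_key message
  </format>
</match>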
Update 2 [11/11/2022]: After applying the suggested config:
2022-11-11 09:36:59 -0600 [info]: Received graceful stop
2022-11-11 09:36:59 -0600 [info]: Received graceful stop
2022-11-11 09:36:59 -0600 [info]: #0 fluentd worker is now stopping worker=0
2022-11-11 09:36:59 -0600 [info]: #0 shutting down fluentd worker worker=0
2022-11-11 09:36:59 -0600 [info]: #0 shutting down input plugin type=:syslog plugin_id="object:7e4"
2022-11-11 09:36:59 -0600 [info]: #0 shutting down output plugin type=:copy plugin_id="object:780"
2022-11-11 09:36:59 -0600 [info]: #0 shutting down output plugin type=:stdout plugin_id="object:7bc"
2022-11-11 09:37:15 -0600 [info]: #0 shutting down output plugin type=:forward plugin_id="object:794"
2022-11-11 09:37:16 -0600 [info]: Worker 0 finished with status 0
2022-11-11 09:49:03 -0600 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '5.1.4'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-kafka' version '0.17.3'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-prometheus' version '2.0.2'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-remote_syslog' version '1.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-s3' version '1.6.1'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-splunk-hec' version '1.2.10'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-syslog_rfc5424' version '0.8.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-td' version '1.1.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2022-11-11 09:49:03 -0600 [info]: gem 'fluentd' version '1.14.4'
2022-11-11 09:49:03 -0600 [info]: gem 'fluentd' version '1.14.3'
2022-11-11 09:49:03 -0600 [info]: adding forwarding server '192.168.0.2:514' host="192.168.0.2" port=514 weight=60 plugin_id="object:794"
2022-11-11 09:49:03 -0600 [info]: using configuration file: <ROOT>
<system>
  process_name "aggregator1"
</system>
<source>
  @type syslog
  bind "0.0.0.0"
  port 514
  tag "syslog"
  <transport udp>
  </transport>
  <parse>
    @type "none"
    message_format auto
    with_priority true
  </parse>
</source>
<match syslog**>
  @type copy
  <store>
    @type "forward"
    <server>
      host "192.168.0.2"
      port 514
    </server>
  </store>
  <store>
    @type "stdout"
  </store>
</match>
</ROOT>
2022-11-11 09:49:03 -0600 [info]: starting fluentd-1.14.4 pid=25424 ruby="2.7.5"
2022-11-11 09:49:03 -0600 [info]: spawn command to main: cmdline=["/opt/td-agent/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/td-agent/bin/fluentd", "--log", "/var/log/td-agent/td-agent.log", "--daemon", "/var/run/td-agent/td-agent.pid", "--under-supervisor"]
2022-11-11 09:49:04 -0600 [info]: adding match pattern="syslog**" type="copy"
2022-11-11 09:49:04 -0600 [info]: #0 adding forwarding server '192.168.0.2:514' host="192.168.0.2" port=514 weight=60 plugin_id="object:794"
2022-11-11 09:49:04 -0600 [info]: adding source type="syslog"
2022-11-11 09:49:04 -0600 [warn]: parameter 'message_format' in <parse>
  @type "none"
  message_format auto
  with_priority true
</parse> is not used.
2022-11-11 09:49:04 -0600 [info]: #0 starting fluentd worker pid=25440 ppid=25437 worker=0
2022-11-11 09:49:04 -0600 [info]: #0 listening syslog socket on 0.0.0.0:514 with udp
2022-11-11 09:49:04 -0600 [info]: #0 fluentd worker is now running worker=0
2022-11-11 09:49:04.682972925 -0600 syslog.auth.notice: {"message":"date=2022-11-11 time=15:49:04 devname=\"fg101.lab.local\" devid=\"FG101\" logid=\"0000000013\" type=\"traffic\" subtype=\"forward\" level=\"notice\" vd=\"vdom1\" eventtime=1668181744 srcip=10.1.100.155 srcport=40772 srcintf=\"port12\" srcintfrole=\"undefined\" dstip=35.197.51.42 dstport=443 dstintf=\"port11\" dstintfrole=\"undefined\" poluuid=\"707a0d88-c972-51e7-bbc7-4d421660557b\" sessionid=8058 proto=6 action=\"close\" policyid=1 policytype=\"policy\" policymode=\"learn\" service=\"HTTPS\" dstcountry=\"United States\" srccountry=\"Reserved\" trandisp=\"snat\" transip=172.16.200.2 transport=40772 duration=180 sentbyte=82 rcvdbyte=151 sentpkt=1 rcvdpkt=1 appcat=\"unscanned\""}
2022-11-11 09:49:04.683460611 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.407Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1051289] [Originator#6876 sub=Proxy Req 87086] Resolved endpoint : [N7Vmacore4Http16LocalServiceSpecE:0x000000fa0ed298d0] _serverNamespace = /sdk action = Allow _port = 8307"}
2022-11-11 09:49:04.683737270 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.408Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1051277] [Originator#6876 sub=Proxy Req 87086] Connected to localhost:8307 (/sdk) over <io_obj p:0x000000f9cc153648, h:18, <TCP '127.0.0.1 : 59272'>, <TCP '127.0.0.1 : 8307'>>"}
2022-11-11 09:49:04.683950628 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.410Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1082351] [Originator#6876 sub=Proxy Req 87086] The client closed the stream, not unexpectedly."}
2022-11-11 09:49:04.684235085 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.422Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1051291] [Originator#6876 sub=Proxy Req 87087] New proxy client <SSL(<io_obj p:0x000000fa0ea0bff8, h:17, <TCP '10.1.233.128 : 443'>, <TCP '10.0.0.250 : 46140'>>)>"}
2022-11-11 09:49:04.684453505 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.423Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1287838] [Originator#6876 sub=Proxy Req 87087] Resolved endpoint : [N7Vmacore4Http16LocalServiceSpecE:0x000000fa0ed298d0] _serverNamespace = /sdk action = Allow _port = 8307"}
2022-11-11 09:49:04.684749571 -0600 syslog.local4.debug: {"message":"2022-11-11T15:49:04.423Z esx01.lab.local Rhttpproxy: verbose rhttpproxy[1051278] [Originator#6876 sub=Proxy Req 87087] Connected to localhost:8307 (/sdk) over <io_obj p:0x000000f9cc153648, h:18, <TCP '127.0.0.1 : 51121'>, <TCP '127.0.0.1 : 8307'>>"}
2022-11-11 09:49:10.521901882 -0600 syslog.auth.info: {"message":"Nov 11 09:49:10 icw-pc01.lab MSWinEventLog\t2\tSecurity\t744984\tFri Nov 11 09:49:10 2022\t6417\tMicrosoft-Windows-Security-Auditing\tN/A\tN/A\tSuccess Audit\ticw-pc01.lab\tSystem Integrity\t\tThe FIPS mode crypto selftests succeeded. Process ID: 0x17cc Process Name: C:\\Python27\\python.exe\t717211 "}
As stated in my response above (on Nov 29, 2022), I was missing some dependencies for the Remote Syslog plugin.
Once the dependencies were installed, I was able to get the Remote Syslog plugin to work as desired (with the extra text as outlined in my comment above).

Fluentd - Kubernetes - ParserError error="pattern not matched with data"

I'm trying to redirect Kubernetes logs from containers to OpenSearch, but there is always some error with the date. What am I doing wrong?
Docker logs example:
{"log":"time=\"2022-04-01T10:02:31Z\" level=warning msg=\"Cannot take snapshot backup\" controller=longhorn-backup error=\"could not find snapshot 'snapshot-0d1744c2-ff8d-4a68-8a2c-fbfd16408975' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker2\n","stream":"stderr","time":"2022-04-01T10:02:31.230191143Z"}
{"log":"E0401 10:02:31.230146 1 backup_controller.go:153] longhorn-backup: fail to sync backup longhorn-system/backup-989764daba094e0d: could not find snapshot 'snapshot-0d1744c2-ff8d-4a68-8a2c-fbfd16408975' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\n","stream":"stderr","time":"2022-04-01T10:02:31.230214608Z"}
{"log":"time=\"2022-04-01T10:02:31Z\" level=warning msg=\"Dropping Longhorn backup longhorn-system/backup-989764daba094e0d out of the queue\" controller=longhorn-backup error=\"longhorn-backup: fail to sync backup longhorn-system/backup-989764daba094e0d: could not find snapshot 'snapshot-0d1744c2-ff8d-4a68-8a2c-fbfd16408975' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker2\n","stream":"stderr","time":"2022-04-01T10:02:31.230218285Z"}
Fluentd Output:
fluentd/fluentd-x7pgc[fluentd]: 2022-04-01 09:54:33 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Cannot take snapshot backup\" controller=longhorn-backup error=\"could not find snapshot 'snapshot-15dfec02-b8c4-40db-a7ed-bf84429ac220' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n'" location=nil tag="kubernetes.var.log.containers.longhorn-manager-5lmdf_longhorn-system_longhorn-manager-5f6bc9870a9efe75670274d177d0bf17dee0dd995a433343432b3155af946823.log" time=2022-04-01 09:54:33.524389862 +0000 record={"log"=>"time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Cannot take snapshot backup\" controller=longhorn-backup error=\"could not find snapshot 'snapshot-15dfec02-b8c4-40db-a7ed-bf84429ac220' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n", "stream"=>"stderr"}
fluentd/fluentd-x7pgc[fluentd]: 2022-04-01 09:54:33 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Error syncing Longhorn backup longhorn-system/backup-c5104cf80da04be6\" controller=longhorn-backup error=\"longhorn-backup: fail to sync backup longhorn-system/backup-c5104cf80da04be6: could not find snapshot 'snapshot-15dfec02-b8c4-40db-a7ed-bf84429ac220' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n'" location=nil tag="kubernetes.var.log.containers.longhorn-manager-5lmdf_longhorn-system_longhorn-manager-5f6bc9870a9efe75670274d177d0bf17dee0dd995a433343432b3155af946823.log" time=2022-04-01 09:54:33.524404952 +0000 record={"log"=>"time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Error syncing Longhorn backup longhorn-system/backup-c5104cf80da04be6\" controller=longhorn-backup error=\"longhorn-backup: fail to sync backup longhorn-system/backup-c5104cf80da04be6: could not find snapshot 'snapshot-15dfec02-b8c4-40db-a7ed-bf84429ac220' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n", "stream"=>"stderr"}
fluentd/fluentd-x7pgc[fluentd]: 2022-04-01 09:54:33 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Cannot take snapshot backup\" controller=longhorn-backup error=\"could not find snapshot 'snapshot-9d5705bf-26fc-49e5-a771-cf9352049c04' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n'" location=nil tag="kubernetes.var.log.containers.longhorn-manager-5lmdf_longhorn-system_longhorn-manager-5f6bc9870a9efe75670274d177d0bf17dee0dd995a433343432b3155af946823.log" time=2022-04-01 09:54:33.538539106 +0000 record={"log"=>"time=\"2022-04-01T09:54:33Z\" level=warning msg=\"Cannot take snapshot backup\" controller=longhorn-backup error=\"could not find snapshot 'snapshot-9d5705bf-26fc-49e5-a771-cf9352049c04' to backup, volume 'pvc-03557105-c20d-4fbe-8d0d-8a0b4ac16f6d'\" node=k8s-worker1\n", "stream"=>"stderr"}
Config:
<source>
  @type tail
  @id tail_all_container_logs
  @label @FLUENTD.OPENSEARCH
  path /var/log/containers/longhorn*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  exclude_path "#{ENV['FLUENT_ALL_CONTAINERS_TAIL_EXCLUDE_PATHS']}"
  <parse>
    @type json
  </parse>
</source>
<filter kubernetes.**>
  @type parser
  key_name log
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%N%z
    timezone +00:00
  </parse>
</filter>
The date/time in your logs is 2022-04-01T09:54:33Z; note that there are no fractional seconds, while your config has time_format %Y-%m-%dT%H:%M:%S.%N%z.
%N is fractional-seconds digits; the default is 9 digits (nanoseconds).
Try removing the .%N part to match the time format in your logs, which would be:
time_format %Y-%m-%dT%H:%M:%S%z
For more information about the time format syntax, kindly refer to this page.
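Applied to the filter from the question, the suggestion would look like this (a sketch, everything else unchanged):

<filter kubernetes.**>
  @type parser
  key_name log
  <parse>
    @type json
    # no .%N: the embedded timestamps carry no fractional seconds
    time_format %Y-%m-%dT%H:%M:%S%z
  </parse>
</filter>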
@piotr-malec - Thanks for the answer.
I've tried many different date/time formats, but it's always the same.
<filter kubernetes.**>
  @type parser
  key_name log
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%N%z
    #time_format %Y-%m-%dT%H:%M:%S.%z
    #time_format %Y-%m-%dT%H:%M:%S%z
    #time_format %Y-%m-%dT%H:%M:%Sz
    #time_format %Y-%m-%dT%H:%M:%SZ
    #time_format %Y-%m-%dT%H:%M:%S%Z
    #time_format %Y-%m-%dT%H:%M:%S.%z
    timezone +00:00
  </parse>
</filter>
Now I have a similar case with a different log.
fluentd/fluentd-9gfjs[fluentd]: 2022-04-05 08:46:58 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data '[2022-04-05T08:46:58,305][INFO ][o.o.j.s.JobSweeper ] [opensearch-cluster-master-0] Running full sweep\n'" location=nil tag="opensearch-master" time=1970-01-01 00:33:42.306044198 +0000 record={"log"=>"[2022-04-05T08:46:58,305][INFO ][o.o.j.s.JobSweeper ] [opensearch-cluster-master-0] Running full sweep\n", "stream"=>"stdout"}
fluentd/fluentd-9gfjs[fluentd]: 2022-04-05 08:51:58 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data '[2022-04-05T08:51:58,306][INFO ][o.o.j.s.JobSweeper ] [opensearch-cluster-master-0] Running full sweep\n'" location=nil tag="opensearch-master" time=1970-01-01 00:33:42.306370871 +0000 record={"log"=>"[2022-04-05T08:51:58,306][INFO ][o.o.j.s.JobSweeper ] [opensearch-cluster-master-0] Running full sweep\n", "stream"=>"stdout"}
I have tried the following formats; none of them works.
<filter opensearch-master>
  @type parser
  key_name log
  <parse>
    @type json
    time_key @timestamp
    time_format %Y-%m-%dT%H:%M:%S,%3N
    #time_format %Y-%m-%dT%H:%M:%S,%L
    timezone +02:00
  </parse>
</filter>
Ruby Documentation:
https://docs.ruby-lang.org/en/2.4.0/Time.html#method-i-strftime
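It may also be worth noting that the payloads in the dumped error events ('time=... level=warning ...' and '[2022-04-05T08:46:58,305][INFO ]...') are not JSON at all, so a json parser reports "pattern not matched" regardless of the time_format. One hedged option, assuming fluent-plugin-multi-format-parser is installed, is to try JSON first and fall back to passing the line through unparsed:

<filter kubernetes.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type multi_format
    # try JSON first...
    <pattern>
      format json
    </pattern>
    # ...and keep non-JSON lines as-is instead of raising ParserError
    <pattern>
      format none
    </pattern>
  </parse>
</filter>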

Fail2ban - creating second sshd-jail for docker-container log does not work

I have a Linux box on Ubuntu 18.04.3 with a working fail2ban configuration (like on all my hosts).
In this case I set up a Docker container which acts as an SFTP server for several users. The container runs rsyslogd and writes login events to /var/log/auth.log; /var/log is mounted on the host system at /myapp/log/sftp.
So I created a second sshd jail with this config snippet in jail.local:
[myapp-sftp]
filter = sshd
enabled = true
findtime = 1200
maxretry = 2
mode = aggressive
backend = polling
logpath = /myapp/log/sftp/auth.log
The logfile /myapp/log/sftp/auth.log is definitely there and filled with a lot of failed login attempts, from myself and others.
But the jail never gets triggered with a found log entry in fail2ban.log.
I already reset the fail2ban database and have no clue what might be wrong.
I tried backend = polling and the default pyinotify.
Checking with fail2ban-regex says that it matches:
# fail2ban-regex /myapp/log/sftp/auth.log /etc/fail2ban/filter.d/sshd.conf
Running tests
=============
Use failregex filter file : sshd, basedir: /etc/fail2ban
Use maxlines : 1
Use datepattern : Default Detectors
Use log file : /myapp/log/sftp/auth.log
Use encoding : UTF-8
Results
=======
Failregex: 268 total
|- #) [# of hits] regular expression
| 3) [64] ^Failed \S+ for invalid user <F-USER>(?P<cond_user>\S+)|(?:(?! from ).)*?</F-USER> from <HOST>(?: port \d+)?(?: on \S+(?: port \d+)?)?(?: ssh\d*)?(?(cond_user): |(?:(?:(?! from ).)*)$)
| 4) [29] ^Failed \b(?!publickey)\S+ for (?P<cond_inv>invalid user )?<F-USER>(?P<cond_user>\S+)|(?(cond_inv)(?:(?! from ).)*?|[^:]+)</F-USER> from <HOST>(?: port \d+)?(?: on \S+(?: port \d+)?)?(?: ssh\d*)?(?(cond_user): |(?:(?:(?! from ).)*)$)
| 6) [64] ^[iI](?:llegal|nvalid) user <F-USER>.*?</F-USER> from <HOST>(?: port \d+)?(?: on \S+(?: port \d+)?)?\s*$
| 21) [111] ^<F-NOFAIL>Connection from</F-NOFAIL> <HOST>
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [642] {^LN-BEG}(?:DAY )?MON Day %k:Minute:Second(?:\.Microseconds)?(?: ExYear)?
`-
Lines: 642 lines, 0 ignored, 268 matched, 374 missed
[processed in 0.13 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 374 lines
and
# fail2ban-client status myapp-sftp
Status for the jail: myapp-sftp
|- Filter
| |- Currently failed: 0
| |- Total failed: 0
| `- File list: /myapp/log/sftp/auth.log
`- Actions
|- Currently banned: 0
|- Total banned: 0
`- Banned IP list:
# cat /var/log/fail2ban.log | grep myapp
2019-08-21 10:35:33,647 fail2ban.jail [649]: INFO Creating new jail 'wippex-sftp'
2019-08-21 10:35:33,647 fail2ban.jail [649]: INFO Jail 'myapp-sftp' uses pyinotify {}
2019-08-21 10:35:33,664 fail2ban.server [649]: INFO Jail myapp-sftp is not a JournalFilter instance
2019-08-21 10:35:33,665 fail2ban.filter [649]: INFO Added logfile: '/wippex/log/sftp.log' (pos = 0, hash = 287d8cc2e307c5f427aa87c4c649ced889d6bf6a)
2019-08-21 10:35:33,689 fail2ban.jail [649]: INFO Jail 'myapp-sftp' started
I never get the expected found entry, nor a ban.
Any ideas are welcome.
# fail2ban-server -V
Fail2Ban v0.10.2
Copyright (c) 2004-2008 Cyril Jaquier, 2008- Fail2Ban Contributors
Copyright of modifications held by their respective authors.
Log sample from /myapp/log/sftp/auth.log:
Aug 21 14:03:13 a9ede63166d9 sshd[202]: Failed password for invalid user mapp from 95.85.16.178 port 41766 ssh2
Aug 21 14:03:13 a9ede63166d9 sshd[202]: Received disconnect from 95.85.16.178 port 41766:11: Normal Shutdown, Thank you for playing [preauth]
Aug 21 14:03:13 a9ede63166d9 sshd[202]: Disconnected from 95.85.16.178 port 41766 [preauth]
Aug 21 14:03:49 a9ede63166d9 sshd[204]: Connection from 95.85.16.178 port 34722 on 172.17.0.3 port 22
Aug 21 14:03:49 a9ede63166d9 sshd[204]: Invalid user mapp from 95.85.16.178 port 34722
Aug 21 14:03:49 a9ede63166d9 sshd[204]: input_userauth_request: invalid user mapp [preauth]
Aug 21 14:03:49 a9ede63166d9 sshd[204]: error: Could not get shadow information for NOUSER
Aug 21 14:03:49 a9ede63166d9 sshd[204]: Failed password for invalid user mapp from 95.85.16.178 port 34722 ssh2
Aug 21 14:03:49 a9ede63166d9 sshd[204]: Received disconnect from 95.85.16.178 port 34722:11: Normal Shutdown, Thank you for playing [preauth]
Aug 21 14:03:49 a9ede63166d9 sshd[204]: Disconnected from 95.85.16.178 port 34722 [preauth]
Problem is "solved". The docker container simply used a different timezone than the host and the logfile timestamps didnt contain the timezone.
So fail2ban assumed the timestamps were written in the same timezone as it´s running environment (on host) and didn´t interprete "old" log entries (2 hr. diff).
See https://github.com/fail2ban/fail2ban/issues/2486
I simply set the host timezone to UTC now - but will try now to set rsyncd to use a timezoned dateformat
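If changing the host timezone is not an option, newer fail2ban releases (the 0.10.x line and later) also have a per-jail logtimezone setting that forces the zone assumed for timestamps that don't carry one; a sketch for jail.local, assuming the container logs in UTC:

[myapp-sftp]
filter = sshd
enabled = true
logpath = /myapp/log/sftp/auth.log
# assumption: the container writes auth.log timestamps in UTC
logtimezone = UTC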

how to fix fluentd(td-agent) buffer problem?

I have a fluentd, Elasticsearch, Graylog setup and I'm intermittently getting the error below in the td-agent log:
[warn]: temporarily failed to flush the buffer. next_retry=2019-01-27 19:00:14 -0500 error_class="ArgumentError" error="Data too big (189382 bytes), would create more than 128 chunks!" plugin_id="object:3fee25617fbc"
Because of this, buffer memory usage increases and td-agent fails to send messages to Graylog.
I have tried setting buffer_chunk_limit to 8m and flush_interval to 5sec.
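For context, this ArgumentError comes from Fluentd's buffer refusing to split a single write into more than 128 chunks, so the chunk limit must be large enough that one flush fits within 128 chunks, and it must be set on the very output whose plugin_id appears in the warning. A hedged v0.12-style sketch for a Graylog (GELF) output; the plugin name, host, and paths are assumptions:

<match **>
  @type gelf
  host graylog.example.com
  port 12201
  buffer_type file
  buffer_path /var/log/td-agent/buffer/graylog
  # one write must fit into at most 128 chunks of this size
  buffer_chunk_limit 8m
  buffer_queue_limit 64
  flush_interval 5s
</match>

If the 8m limit was applied to a different <match> than the one emitting the warning, it has no effect, which would explain why the error persists.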

Fluentd: How to solve fluentd forward plugin error?

I'm trying to transfer logs from one server (A) to another server (B) using fluentd.
I use the forward output plugin on A and the forward input plugin on B, but I get the following error:
2017-10-26 16:59:27 +0900 [warn]: #0 [forward_output] detached forwarding server 'boxil-log' host="xxx.xxx.xxx.xxx" port=24224 hard_timeout=true
2017-10-26 17:00:43 +0900 [warn]: #0 [forward_output] failed to flush the buffer. retry_time=0 next_retry_seconds=2017-10-26 17:00:44 +0900 chunk="55c6e86321afbab5bed145d53e679865" error_class=Errno::ETIMEDOUT error="Connection timed out - connect(2) for \"xxx.xxx.xxx.xxx\" port 24224"
2017-10-26 17:01:42 +0900 [warn]: #0 [forward_output] failed to flush the buffer. retry_time=6 next_retry_seconds=2017-10-26 17:01:42 +0900 chunk="55c6e86321afbab5bed145d53e679865" error_class=Fluent::Plugin::ForwardOutput::NoNodesAvailable error="no nodes are available"
This is the config for the output on A:
<match system.*.*>
  @type forward
  @id forward_output
  <server>
    name boxil-log
    host xxx.xxx.xxx.xxx
    port 24224
  </server>
</match>
This is the config for the input on B:
<source>
  @type forward
  @id forward_input
</source>
I want to know the meaning of this error, why I got it, and how to solve it.
Thank you.
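For context, Errno::ETIMEDOUT means the TCP connection from A to B's port 24224 never completed (wrong address, a firewall or security group dropping the traffic, or nothing listening on that interface), and once the node is marked down, out_forward reports "no nodes are available" until a heartbeat succeeds. A hedged first step is to make B's listener explicit and then verify reachability from A:

<source>
  @type forward
  @id forward_input
  # these are the defaults, but being explicit rules out binding surprises
  bind 0.0.0.0
  port 24224
</source>

From A, something like telnet xxx.xxx.xxx.xxx 24224 (or nc -zv) should connect; if it doesn't, open TCP and UDP 24224 between the hosts (UDP is used for heartbeats in older forward setups).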
