Configuration of fluent-plugin-concat makes the logs disappear - docker
My configuration of "fluent-plugin-concat" is causing my long logs to disappear instead of being concatenated and sent to the Kinesis stream.
I use fluentd to send logs from containers deployed on AWS ECS to a Kinesis stream (and then to an ES cluster somewhere).
On rare occasions, some of the logs are very big. Most of the time they are under the Docker limit of 16K. However, those rare long logs are very important and we don't want to miss them.
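(For context: when a line goes over that 16K limit, Docker's fluentd log driver hands it to fluentd as several records carrying partial-message metadata. The field names below are the ones that show up in the error dump at the end of this post; the values are shortened here for illustration.)
{"source":"stdout","log":"aaa... first ~16K of the line ...","partial_message":"true","partial_id":"5c752c1b...","partial_ordinal":"1","partial_last":"false","container_id":"803c0ebe...","container_name":"/ecs-longlog-..."}
{"source":"stdout","log":"... remainder of the line ...","partial_message":"true","partial_id":"5c752c1b...","partial_ordinal":"2","partial_last":"true","container_id":"803c0ebe...","container_name":"/ecs-longlog-..."}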
My configuration file is attached.
Just before the final match sequence, I added:
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
</filter>
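(For what it's worth, fluent-plugin-concat also has flush_interval and timeout_label options — they show up in the answers further down — so a buffered partial can be flushed after a timeout instead of being held forever. A sketch of the same filter with those added; the interval and label name here are placeholders I made up, not something from my real config:)
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
# placeholder values: flush a stuck buffer after 5s and route it to a label defined elsewhere
flush_interval 5s
timeout_label @LONG_LOG_TIMEOUT
</filter>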
Another configuration I tried:
With the below options only the second partial log is sent to ES; the first part can only be seen in the fluentd logs. I'm adding the logs of this config as a file.
<filter>
@type concat
key log
stream_identity_key partial_id
use_partial_metadata true
separator ""
</filter>
and
<filter>
@type concat
key log
use_partial_metadata true
separator ""
</filter>
The log I'm testing with is also attached as a JSON document.
If I remove this configuration, this log is sent in 2 chunks.
What am I doing wrong?
The full config file:
<system>
log_level info
</system>
# just listen on the unix socket in a dir mounted from host
# input is a json object, with the actual log line in the `log` field
<source>
@type unix
path /var/fluentd/fluentd.sock
</source>
# tag log line as json or text
<match service.*.*>
@type rewrite_tag_filter
<rule>
key log
pattern /.*"logType":\s*"application"/
tag application.${tag}.json
</rule>
<rule>
key log
pattern /.*"logType":\s*"exception"/
tag exception.${tag}.json
</rule>
<rule>
key log
pattern /.*"logType":\s*"audit"/
tag audit.${tag}.json
</rule>
<rule>
key log
pattern /^\{".*\}$/
tag default.${tag}.json
</rule>
<rule>
key log
pattern /.+/
tag default.${tag}.txt
</rule>
</match>
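# add service / childService fields derived from the tag parts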
<filter *.service.*.*.*>
@type record_transformer
<record>
service ${tag_parts[2]}
childService ${tag_parts[3]}
</record>
</filter>
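# parse the JSON payload in the `log` field for records tagged *.json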
<filter *.service.*.*.json>
@type parser
key_name log
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
<filter *.service.*.*.*>
@type record_transformer
enable_ruby
<record>
@timestamp ${ require 'time'; Time.now.utc.iso8601(3) }
</record>
</filter>
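# reassemble Docker partial messages (lines split at the 16K limit) into a single record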
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
</filter>
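# route each log type to its own Kinesis stream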
<match exception.service.*.*.*>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-ex
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 10
chunk_limit_size 16m
flush_thread_interval 1.0
flush_thread_burst_interval 1.0
flush_thread_count 1
</buffer>
</store>
<store>
@type stdout
</store>
</match>
<match audit.service.*.*.json>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-sa
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 1
chunk_limit_size 16m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
</buffer>
</store>
<store>
@type stdout
</store>
</match>
<match *.service.*.*.*>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-apl
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 10
chunk_limit_size 16m
flush_thread_interval 1.0
flush_thread_burst_interval 1.0
flush_thread_count 1
</buffer>
</store>
<store>
@type stdout
</store>
</match>
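# catch-all: anything left over just goes to stdout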
<match **>
@type stdout
</match>
Example log message - a long single line:
{"message": "some message", "longlogtest": "averylongjsonline", "service": "longlog-service", "logType": "application", "log": "aaa .... ( ~18000 chars )..longlogThisIsTheEndOfTheLongLog"}
fluentd-container-log ... contains only the first part of the message:
and the following error message:
dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not match with data
2021-03-05 13:45:41.886672929 +0000 fluent.warn: {"error":"#<Fluent::Plugin::Parser::ParserError: pattern not match with data '{\"message\": \"some message\", \"longlogtest\": \"averylongjsonline\", \"service\": \"longlog-service\", \"logType\": \"application\", \"log\": \"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewww
< ..... Many lines of the original log erased here ...... >
djjjjjjjkkkkkkklllllllwewwwiiiaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilongloglonglogaaaassss'>","location":null,"tag":"application.service.longlog.none.json","time":1614951941,"record":{"source":"stdout","log":"{\"message\": \"some message\", \"longlogtest\": \"averylongjsonline\", \"service\": \"longlog-service\", \"logType\": \"application\", \"log\": \"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewww
< ..... Many lines of the original log erased here ...... >
wwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilongloglonglogaaaassss","partial_message":"true","partial_id":"5c752c1bbfda586f1b867a8ce2274e0ed0418e8e10d5e8602d9fefdb8ad2b7a1","partial_ordinal":"1","partial_last":"false","container_id":"803c0ebe4e6875ea072ce21179e4ac2d12e947b5649ce343ee243b5c28ad595a","container_name":"/ecs-longlog-18-longlog-b6b5ae85ededf4db1f00","service":"longlog","childService":"none"},"message"
:"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not match with data '{\\\"message\\\": \\\"some message\\\", \\\"longlogtest\\\": \\\"averylongjsonline\\\", \\\"service\\\": \\\"longlog-service\\\", \\\"logType\\\": \\\"application\\\", \\\"log\\\": \\\"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasss
Related
How to inject `time` attribute based on certain json key value?
I am still new to fluentd; I've tried various configurations, but I am stuck. Suppose I have this record pushed to fluentd that has _epoch to tell the epoch time the record is created:
{"data":"dummy", "_epoch": <epochtime_in_second>}
Instead of using the time attribute processed by fluentd, I want to override the time with this _epoch field. How do I produce fluentd output with the time overridden? I've tried this:
# TCP input to receive logs from the forwarders
<source>
@type forward
bind 0.0.0.0
port 24224
</source>
# HTTP input for the liveness and readiness probes
<source>
@type http
bind 0.0.0.0
port 9880
</source>
# rds2fluentd_test
<filter rds2fluentd_test.*>
@type parser
key_name _epoch
reserve_data true
<parse>
@type regexp
expression /^(?<time>.*)$/
time_type unixtime
utc true
</parse>
</filter>
<filter rds2fluentd_test.*>
@type stdout
</filter>
<match rds2fluentd_test.*>
@type s3
@log_level debug
aws_key_id "#{ENV['AWS_ACCESS_KEY']}"
aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
s3_bucket foo-bucket
s3_region ap-southeast-1
path ingestion-test-01/${_db}/${_table}/%Y-%m-%d-%H-%M/
#s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
# if you want to use ${tag} or %Y/%m/%d/ like syntax in path / s3_object_key_format,
# need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
<buffer time,_db,_table>
@type file
path /var/log/fluent/s3
timekey 1m # 5 minutes partition
timekey_wait 10s
timekey_use_utc true # use utc
chunk_limit_size 256m
</buffer>
time_slice_format %Y%m%d%H
store_as json
</match>
But upon receiving data like the above, it shows a warning like this:
#0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="parse failed no implicit conversion of Integer into Hash" location="/usr/local/bundle/gems/fluentd-1.10.4/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="rds2fluentd_test." time=1590578507 record={....
I was getting the same warning message; setting hash_value_field parsed under the filter section solved the issue.
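For anyone wondering where that option goes, this is roughly the shape — the filter is the one from the question above, and `parsed` is just the field name the parsed hash is stored under:
<filter rds2fluentd_test.*>
@type parser
key_name _epoch
reserve_data true
# store the parsed result under the `parsed` key instead of merging it into the record root
hash_value_field parsed
<parse>
@type regexp
expression /^(?<time>.*)$/
time_type unixtime
utc true
</parse>
</filter>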
FluentD configuration to index on all the fields in the log in elastic
Hi, I have the below log from a Spring Boot microservice. I want to create an index on all the below fields like timestamp, level, logger etc. in elastic. How do I achieve this in the fluentd configuration? I tried the below and it didn't work.
Log:
timestamp:2020-04-27 09:37:56.996 level:INFO level_value:20000 thread:http-nio-8080-exec-2 logger:com.scb.nexus.service.phoenix.components.ApplicationEventListener context:default message:org.springframework.web.context.support.ServletRequestHandledEvent traceId:a122e51aa3d24d4a spanId:a122e51aa3d24d4a spanExportable:false X-Span-Export:false X-B3-SpanId:a122e51aa3d24d4a X-B3-TraceId:a122e51aa3d24d4a
fluentd conf:
<match **>
@type elasticsearch
time_as_integer true
include_timestamp true
host host
port 9200
user userName
password password
scheme https
ssl_verify false
ssl_version TLSv1_2
index_name testIndex
</match>
<filter **>
@type parser
key_name log
reserve_data true
<parse>
@type json
</parse>
</filter>
The logs are not in JSON format, therefore you can't use the JSON parser. You have the following options to solve this issue:
1- use the regexp parser as described here: https://docs.fluentd.org/parser/regexp
2- use the record_reformer plugin and extract the items manually, for example:
<match **>
@type record_reformer
tag parsed.${tag_suffix[2]}
renew_record false
enable_ruby true
<record>
timestamp ${record['log'].scan(/timestamp:(?<param>[^ ]+ [^ ]+)/).flatten.compact.sort.first}
log_level ${record['log'].scan(/level:(?<param>[^ ]+)/).flatten.compact.sort.first}
level_value ${record['log'].scan(/level_value:(?<param>[^ ]+)/).flatten.compact.sort.first}
</record>
</match>
<match parsed.**>
@type elasticsearch
time_as_integer true
include_timestamp true
host host
port 9200
user userName
password password
scheme https
ssl_verify false
ssl_version TLSv1_2
index_name testIndex
</match>
Fluentd (td-agent) secondary type should be same with primary one error
I'm using td-agent with the http plugin to send the log data to another server. But when I start td-agent with my config file, I get a warning message like the one below:
2019-09-06 11:02:15 +0900 [warn]: #0 secondary type should be same with primary one primary="Fluent::TreasureDataLogOutput" secondary="Fluent::Plugin::FileOutput"
Here is my config file:
<source>
@type tail
path /pub/var/log/mylog_%Y%m%d.log
pos_file /var/log/td-agent/www_log.log.pos
tag my.log
format /^(?<log_time>\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}) \[INFO\] (?<message>.*)$/
</source>
<filter my.log>
@type parser
key_name message
<parse>
@type json
</parse>
</filter>
<match my.log>
@type http
endpoint_url http://localhost:3000/
custom_headers {"user-agent": "td-agent"}
http_method post
raise_on_error true
</match>
It is sending the log data correctly, but I need to resolve the warning message too. How can I resolve the warning?
Configure fluentd to properly parse and ship java stacktrace, which is formatted using docker json-file logging driver, to elastic as single message
Our service runs as a docker instance. A given limitation is that the docker logging driver cannot be changed to anything other than the default json-file driver. The (Scala micro)service outputs a log that looks like this:
{"log":"10:30:12.375 [application-akka.actor.default-dispatcher-13] [WARN] [rulekeepr-615239361-v5mtn-7]- c.v.r.s.logic.RulekeeprLogicProvider(91) - decision making have failed unexpectedly\n","stream":"stdout","time":"2017-05-08T10:30:12.376485994Z"}
{"log":"java.lang.RuntimeException: Error extracting fields to make a lookup for a rule at P2: [failed calculating amount/amountEUR/directive: [failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500]]\n","stream":"stdout","time":"2017-05-08T10:30:12.376528449Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.BasicRuleService$$anonfun$lookupRule$2.apply(BasicRuleService.scala:53)\n","stream":"stdout","time":"2017-05-08T10:30:12.376537277Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.BasicRuleService$$anonfun$lookupRule$2.apply(BasicRuleService.scala:53)\n","stream":"stdout","time":"2017-05-08T10:30:12.376542826Z"}
{"log":"\u0009at scala.concurrent.Future$$anonfun$transform$1$$anonfun$apply$2.apply(Future.scala:224)\n","stream":"stdout","time":"2017-05-08T10:30:12.376548224Z"}
{"log":"Caused by: java.lang.RuntimeException: failed calculating amount/amountEUR/directive: [failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500]\n","stream":"stdout","time":"2017-05-08T10:30:12.376674554Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.logic.TlrComputedFields$$anonfun$calculatedFields$1.applyOrElse(AbstractComputedFields.scala:39)\n","stream":"stdout","time":"2017-05-08T10:30:12.376680922Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.logic.TlrComputedFields$$anonfun$calculatedFields$1.applyOrElse(AbstractComputedFields.scala:36)\n","stream":"stdout","time":"2017-05-08T10:30:12.376686377Z"}
{"log":"\u0009at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)\n","stream":"stdout","time":"2017-05-08T10:30:12.376691228Z"}
{"log":"\u0009... 19 common frames omitted\n","stream":"stdout","time":"2017-05-08T10:30:12.376720255Z"}
{"log":"Caused by: java.lang.RuntimeException: failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500\n","stream":"stdout","time":"2017-05-08T10:30:12.376724303Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.mixins.DCartHelper$$anonfun$accountInfo$1.apply(DCartHelper.scala:31)\n","stream":"stdout","time":"2017-05-08T10:30:12.376729945Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.mixins.DCartHelper$$anonfun$accountInfo$1.apply(DCartHelper.scala:24)\n","stream":"stdout","time":"2017-05-08T10:30:12.376734254Z"}
{"log":"\u0009... 19 common frames omitted\n","stream":"stdout","time":"2017-05-08T10:30:12.37676087Z"}
How can I harness fluentd directives to properly combine such a log event containing a stack trace, so it is all shipped to elastic as a single message? I have full control of the logback appender pattern used, so I can change the order of occurrence of log values to something else, and even change the appender class. We're working with k8s, and it turns out it's not straightforward to change the docker logging driver, so we're looking for a solution that will be able to handle the given example. I don't care so much about extracting the loglevel, thread, logger into specific keys so I could later easily filter by them in kibana.
It would be nice to have, but it is less important. What is important is to accurately parse the timestamp, down to the milliseconds, and use it as the actual log event timestamp as it is shipped to elastic.
You can use fluent-plugin-concat. For example, with Fluentd v0.14.x:
<source>
@type tail
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
read_from_head true
<parse>
@type json
</parse>
@label @INPUT
</source>
<label @INPUT>
<filter kubernetes.**>
@type concat
key log
multiline_start_regexp ^\d{2}:\d{2}:\d{2}\.\d+
continuous_line_regexp ^(\s+|java.lang|Caused by:)
separator ""
flush_interval 3s
timeout_label @PARSE
</filter>
<match kubernetes.**>
@type relabel
@label @PARSE
</match>
</label>
<label @PARSE>
<filter kubernetes.**>
@type parser
key_name log
inject_key_prefix log.
<parse>
@type multiline_grok
grok_failure_key grokfailure
<grok>
pattern YOUR_GROK_PATTERN
</grok>
</parse>
</filter>
<match kubernetes.**>
@type relabel
@label @OUTPUT
</match>
</label>
<label @OUTPUT>
<match kubernetes.**>
@type stdout
</match>
</label>
Similar issues:
https://github.com/fluent/fluent-plugin-grok-parser/issues/36
https://github.com/fluent/fluent-plugin-grok-parser/issues/37
You can try using the fluent-plugin-grok-parser, but I am having the same issue: it seems that the \u0009 tab character is not being recognized, so fluent-plugin-detect-exceptions will not detect the multiline exceptions, at least not yet in my attempts.
In fluentd 1.0 I was able to achieve this with fluent-plugin-concat. The concat plugin starts and continues concatenation until it sees the multiline_start_regexp pattern again. This captures Java exceptions and multiline slf4j log statements. Adjust your multiline_start_regexp pattern to match your slf4j log output line. Any line, including exceptions, starting with a timestamp matching the pattern 2020-10-05 18:01:52.871 will be concatenated, e.g.:
2020-10-05 18:01:52.871 ERROR 1 --- [nio-8088-exec-3] c.i.printforever.DemoApplication multiline statement
I am using container_id as the identity key:
<system>
log_level debug
</system>
# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
@type forward
@id input1
@label @mainstream
port 24224
</source>
# All plugin errors
<label @ERROR>
<match **>
@type file
@id error
path /fluentd/log/docker/error/error.%Y-%m-%d.%H%M
compress gzip
append true
<buffer>
@type file
path /fluentd/log/docker/error
timekey 60s
timekey_wait 10s
timekey_use_utc true
total_limit_size 200mb
</buffer>
</match>
</label>
<label @mainstream>
<filter docker.**>
@type concat
key log
stream_identity_key container_id
multiline_start_regexp /^\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3}/
</filter>
# Match events with docker tag
# Send them to S3
<match docker.**>
@type copy
<store>
@type s3
@id output_docker_s3
aws_key_id "#{ENV['AWS_KEY_ID']}"
aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
s3_bucket "#{ENV['S3_BUCKET']}"
path "#{ENV['S3_OBJECT_PATH']}"
store_as gzip
<buffer tag,time>
@type file
path /fluentd/log/docker/s3
timekey 300s
timekey_wait 1m
timekey_use_utc true
total_limit_size 200mb
</buffer>
time_slice_format %Y%m%d%H
</store>
<store>
@type stdout
</store>
<store>
@type file
@id output_docker_file
path /fluentd/log/docker/file/${tag}.%Y-%m-%d.%H%M
compress gzip
append true
<buffer tag,time>
@type file
timekey_wait 1m
timekey 1m
timekey_use_utc true
total_limit_size 200mb
path /fluentd/log/docker/file/
</buffer>
</store>
</match>
<match **>
@type file
@id output_file
path /fluentd/log/docker/catch-all/data.*.log
</match>
</label>
Fluentd: How to place the time key inside the json string
This is the record that fluentd writes to my log:
2016-02-22 14:38:59 {"login_id":123,"login_email":"abc@gmail.com"}
The date time is the time key of fluentd. How can I place that time inside the JSON string?
My friend helped me with this. He used this fluentd plugin: http://docs.fluentd.org/articles/filter-plugin-overview
This is the config:
<filter trackLog>
type record_modifier
<record>
fluentd_time ${Time.now.strftime("%Y-%m-%d %H:%M:%S")}
</record>
</filter>
<match trackLog>
type record_modifier
tag trackLog.finished
</match>
<match trackLog.finished>
type webhdfs
host localhost
port 50070
path /data/trackLog/%Y%m%d_%H
username hdfs
output_include_tag false
remove_prefix trackLog.finished
output_include_time false
buffer_type file
buffer_path /mnt/ramdisk/trackLog
buffer_chunk_limit 4m
buffer_queue_limit 50
flush_interval 5s
</match>