Configuration of fluent-plugin-concat makes the logs disappear - docker
My configuration of "fluent-plugin-concat" is causing my long logs to disappear instead of being concatenated and sent to the Kinesis stream.
I use fluentd to send logs from containers deployed on AWS ECS to a Kinesis stream (and then to an ES cluster somewhere).
On rare occasions, some of the logs are very big. Most of the time they are under the Docker limit of 16K. However, those rare long logs are very important and we don't want to miss them.
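(For context: when a line goes over that 16K limit, Docker's fluentd log driver hands it to fluentd as several records carrying partial-message metadata. The field names below are the ones that show up in the error dump at the end of this post; the values are shortened here for illustration.)
{"source":"stdout","log":"aaa... first ~16K of the line ...","partial_message":"true","partial_id":"5c752c1b...","partial_ordinal":"1","partial_last":"false","container_id":"803c0ebe...","container_name":"/ecs-longlog-..."}
{"source":"stdout","log":"... remainder of the line ...","partial_message":"true","partial_id":"5c752c1b...","partial_ordinal":"2","partial_last":"true","container_id":"803c0ebe...","container_name":"/ecs-longlog-..."}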
My configuration file is attached.
Just before the final match sequence, I added:
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
</filter>
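(For what it's worth, fluent-plugin-concat also has flush_interval and timeout_label options — they show up in the answers further down — so a buffered partial can be flushed after a timeout instead of being held forever. A sketch of the same filter with those added; the interval and label name here are placeholders I made up, not something from my real config:)
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
# placeholder values: flush a stuck buffer after 5s and route it to a label defined elsewhere
flush_interval 5s
timeout_label @LONG_LOG_TIMEOUT
</filter>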
Another configuration I tried:
With the below options only the second partial log is sent to ES; the first part can only be seen in the fluentd logs. I'm adding the logs of this config as a file.
<filter>
@type concat
key log
stream_identity_key partial_id
use_partial_metadata true
separator ""
</filter>
and
<filter>
@type concat
key log
use_partial_metadata true
separator ""
</filter>
The log I'm testing with is also attached as a JSON document.
If I remove this configuration, this log is sent in 2 chunks.
What am I doing wrong?
The full config file:
<system>
log_level info
</system>
# just listen on the unix socket in a dir mounted from host
# input is a json object, with the actual log line in the `log` field
<source>
@type unix
path /var/fluentd/fluentd.sock
</source>
# tag log line as json or text
<match service.*.*>
@type rewrite_tag_filter
<rule>
key log
pattern /.*"logType":\s*"application"/
tag application.${tag}.json
</rule>
<rule>
key log
pattern /.*"logType":\s*"exception"/
tag exception.${tag}.json
</rule>
<rule>
key log
pattern /.*"logType":\s*"audit"/
tag audit.${tag}.json
</rule>
<rule>
key log
pattern /^\{".*\}$/
tag default.${tag}.json
</rule>
<rule>
key log
pattern /.+/
tag default.${tag}.txt
</rule>
</match>
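# add service / childService fields derived from the tag parts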
<filter *.service.*.*.*>
@type record_transformer
<record>
service ${tag_parts[2]}
childService ${tag_parts[3]}
</record>
</filter>
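# parse the JSON payload in the `log` field for records tagged *.json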
<filter *.service.*.*.json>
@type parser
key_name log
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
<filter *.service.*.*.*>
@type record_transformer
enable_ruby
<record>
@timestamp ${ require 'time'; Time.now.utc.iso8601(3) }
</record>
</filter>
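# reassemble Docker partial messages (lines split at the 16K limit) into a single record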
<filter>
@type concat
key log
stream_identity_key container_id
partial_key partial_message
partial_value true
separator ""
</filter>
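# route each log type to its own Kinesis stream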
<match exception.service.*.*.*>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-ex
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 10
chunk_limit_size 16m
flush_thread_interval 1.0
flush_thread_burst_interval 1.0
flush_thread_count 1
</buffer>
</store>
<store>
@type stdout
</store>
</match>
<match audit.service.*.*.json>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-sa
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 1
chunk_limit_size 16m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
</buffer>
</store>
<store>
@type stdout
</store>
</match>
<match *.service.*.*.*>
@type copy
<store>
@type kinesis_streams
region "#{ENV['AWS_DEFAULT_REGION']}"
stream_name the-name-apl
debug false
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_at_shutdown true
flush_interval 10
chunk_limit_size 16m
flush_thread_interval 1.0
flush_thread_burst_interval 1.0
flush_thread_count 1
</buffer>
</store>
<store>
@type stdout
</store>
</match>
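# catch-all: anything left over just goes to stdout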
<match **>
@type stdout
</match>
Example log message - a long single line:
{"message": "some message", "longlogtest": "averylongjsonline", "service": "longlog-service", "logType": "application", "log": "aaa .... ( ~18000 chars )..longlogThisIsTheEndOfTheLongLog"}
fluentd-container-log ... contains only the first part of the message:
and the following error message:
dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not match with data
2021-03-05 13:45:41.886672929 +0000 fluent.warn: {"error":"#<Fluent::Plugin::Parser::ParserError: pattern not match with data '{\"message\": \"some message\", \"longlogtest\": \"averylongjsonline\", \"service\": \"longlog-service\", \"logType\": \"application\", \"log\": \"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewww
< ..... Many lines of the original log erased here ...... >
djjjjjjjkkkkkkklllllllwewwwiiiaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilongloglonglogaaaassss'>","location":null,"tag":"application.service.longlog.none.json","time":1614951941,"record":{"source":"stdout","log":"{\"message\": \"some message\", \"longlogtest\": \"averylongjsonline\", \"service\": \"longlog-service\", \"logType\": \"application\", \"log\": \"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewww
< ..... Many lines of the original log erased here ...... >
wwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilongloglonglogaaaassss","partial_message":"true","partial_id":"5c752c1bbfda586f1b867a8ce2274e0ed0418e8e10d5e8602d9fefdb8ad2b7a1","partial_ordinal":"1","partial_last":"false","container_id":"803c0ebe4e6875ea072ce21179e4ac2d12e947b5649ce343ee243b5c28ad595a","container_name":"/ecs-longlog-18-longlog-b6b5ae85ededf4db1f00","service":"longlog","childService":"none"},"message"
:"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not match with data '{\\\"message\\\": \\\"some message\\\", \\\"longlogtest\\\": \\\"averylongjsonline\\\", \\\"service\\\": \\\"longlog-service\\\", \\\"logType\\\": \\\"application\\\", \\\"log\\\": \\\"aaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasssssdddddjjjjjjjkkkkkkklllllllwewwwiiiiiilonglogaaaasss
Related
How to inject `time` attribute based on certain json key value?
I am still new to fluentd; I've tried various configurations, but I am stuck. Suppose I have this record pushed to fluentd that has _epoch to tell the epoch time the record is created:
{"data":"dummy", "_epoch": <epochtime_in_second>}
Instead of using the time attribute processed by fluentd, I want to override the time with this _epoch field. How do I produce fluentd output with the time overridden? I've tried this:
# TCP input to receive logs from the forwarders
<source>
@type forward
bind 0.0.0.0
port 24224
</source>
# HTTP input for the liveness and readiness probes
<source>
@type http
bind 0.0.0.0
port 9880
</source>
# rds2fluentd_test
<filter rds2fluentd_test.*>
@type parser
key_name _epoch
reserve_data true
<parse>
@type regexp
expression /^(?<time>.*)$/
time_type unixtime
utc true
</parse>
</filter>
<filter rds2fluentd_test.*>
@type stdout
</filter>
<match rds2fluentd_test.*>
@type s3
@log_level debug
aws_key_id "#{ENV['AWS_ACCESS_KEY']}"
aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
s3_bucket foo-bucket
s3_region ap-southeast-1
path ingestion-test-01/${_db}/${_table}/%Y-%m-%d-%H-%M/
#s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
# if you want to use ${tag} or %Y/%m/%d/ like syntax in path / s3_object_key_format,
# need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
<buffer time,_db,_table>
@type file
path /var/log/fluent/s3
timekey 1m # 5 minutes partition
timekey_wait 10s
timekey_use_utc true # use utc
chunk_limit_size 256m
</buffer>
time_slice_format %Y%m%d%H
store_as json
</match>
But upon receiving data like the above, it shows a warning like this:
#0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="parse failed no implicit conversion of Integer into Hash" location="/usr/local/bundle/gems/fluentd-1.10.4/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="rds2fluentd_test." time=1590578507 record={....
I was getting the same warning message; setting hash_value_field parsed under the filter section solved the issue.
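For anyone wondering where that option goes, this is roughly the shape — the filter is the one from the question above, and `parsed` is just the field name the parsed hash is stored under:
<filter rds2fluentd_test.*>
@type parser
key_name _epoch
reserve_data true
# store the parsed result under the `parsed` key instead of merging it into the record root
hash_value_field parsed
<parse>
@type regexp
expression /^(?<time>.*)$/
time_type unixtime
utc true
</parse>
</filter>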
FluentD configuration to index on all the fields in the log in elastic
Hi, I have the below log from a Spring Boot microservice. I want to create an index on all the below fields like timestamp, level, logger etc. in elastic. How do I achieve this in the fluentd configuration? I tried the below and it didn't work.
Log:
timestamp:2020-04-27 09:37:56.996 level:INFO level_value:20000 thread:http-nio-8080-exec-2 logger:com.scb.nexus.service.phoenix.components.ApplicationEventListener context:default message:org.springframework.web.context.support.ServletRequestHandledEvent traceId:a122e51aa3d24d4a spanId:a122e51aa3d24d4a spanExportable:false X-Span-Export:false X-B3-SpanId:a122e51aa3d24d4a X-B3-TraceId:a122e51aa3d24d4a
fluentd conf:
<match **>
@type elasticsearch
time_as_integer true
include_timestamp true
host host
port 9200
user userName
password password
scheme https
ssl_verify false
ssl_version TLSv1_2
index_name testIndex
</match>
<filter **>
@type parser
key_name log
reserve_data true
<parse>
@type json
</parse>
</filter>
The logs are not in JSON format, therefore you can't use the JSON parser. You have the following options to solve this issue:
1- use the regexp parser as described here: https://docs.fluentd.org/parser/regexp
2- use the record_reformer plugin and extract the items manually, for example:
<match **>
@type record_reformer
tag parsed.${tag_suffix[2]}
renew_record false
enable_ruby true
<record>
timestamp ${record['log'].scan(/timestamp:(?<param>[^ ]+ [^ ]+)/).flatten.compact.sort.first}
log_level ${record['log'].scan(/level:(?<param>[^ ]+)/).flatten.compact.sort.first}
level_value ${record['log'].scan(/level_value:(?<param>[^ ]+)/).flatten.compact.sort.first}
</record>
</match>
<match parsed.**>
@type elasticsearch
time_as_integer true
include_timestamp true
host host
port 9200
user userName
password password
scheme https
ssl_verify false
ssl_version TLSv1_2
index_name testIndex
</match>
Fluentd (td-agent) secondary type should be same with primary one error
I'm using td-agent with the http plugin to send the log data to another server. But when I start td-agent with my config file, I get a warning message like the one below:
2019-09-06 11:02:15 +0900 [warn]: #0 secondary type should be same with primary one primary="Fluent::TreasureDataLogOutput" secondary="Fluent::Plugin::FileOutput"
Here is my config file:
<source>
@type tail
path /pub/var/log/mylog_%Y%m%d.log
pos_file /var/log/td-agent/www_log.log.pos
tag my.log
format /^(?<log_time>\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}) \[INFO\] (?<message>.*)$/
</source>
<filter my.log>
@type parser
key_name message
<parse>
@type json
</parse>
</filter>
<match my.log>
@type http
endpoint_url http://localhost:3000/
custom_headers {"user-agent": "td-agent"}
http_method post
raise_on_error true
</match>
It is sending the log data correctly, but I need to resolve the warning message too. How can I resolve the warning?
Configure fluentd to properly parse and ship java stacktrace, which is formatted using docker json-file logging driver, to elastic as single message
Our service runs as a docker instance. A given limitation is that the docker logging driver cannot be changed to anything other than the default json-file driver. The (Scala micro)service outputs a log that looks like this:
{"log":"10:30:12.375 [application-akka.actor.default-dispatcher-13] [WARN] [rulekeepr-615239361-v5mtn-7]- c.v.r.s.logic.RulekeeprLogicProvider(91) - decision making have failed unexpectedly\n","stream":"stdout","time":"2017-05-08T10:30:12.376485994Z"}
{"log":"java.lang.RuntimeException: Error extracting fields to make a lookup for a rule at P2: [failed calculating amount/amountEUR/directive: [failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500]]\n","stream":"stdout","time":"2017-05-08T10:30:12.376528449Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.BasicRuleService$$anonfun$lookupRule$2.apply(BasicRuleService.scala:53)\n","stream":"stdout","time":"2017-05-08T10:30:12.376537277Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.BasicRuleService$$anonfun$lookupRule$2.apply(BasicRuleService.scala:53)\n","stream":"stdout","time":"2017-05-08T10:30:12.376542826Z"}
{"log":"\u0009at scala.concurrent.Future$$anonfun$transform$1$$anonfun$apply$2.apply(Future.scala:224)\n","stream":"stdout","time":"2017-05-08T10:30:12.376548224Z"}
{"log":"Caused by: java.lang.RuntimeException: failed calculating amount/amountEUR/directive: [failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500]\n","stream":"stdout","time":"2017-05-08T10:30:12.376674554Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.logic.TlrComputedFields$$anonfun$calculatedFields$1.applyOrElse(AbstractComputedFields.scala:39)\n","stream":"stdout","time":"2017-05-08T10:30:12.376680922Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.logic.TlrComputedFields$$anonfun$calculatedFields$1.applyOrElse(AbstractComputedFields.scala:36)\n","stream":"stdout","time":"2017-05-08T10:30:12.376686377Z"}
{"log":"\u0009at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)\n","stream":"stdout","time":"2017-05-08T10:30:12.376691228Z"}
{"log":"\u0009... 19 common frames omitted\n","stream":"stdout","time":"2017-05-08T10:30:12.376720255Z"}
{"log":"Caused by: java.lang.RuntimeException: failed getting accountInfo of companyId:3303 from deadcart: unexpected status returned: 500\n","stream":"stdout","time":"2017-05-08T10:30:12.376724303Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.mixins.DCartHelper$$anonfun$accountInfo$1.apply(DCartHelper.scala:31)\n","stream":"stdout","time":"2017-05-08T10:30:12.376729945Z"}
{"log":"\u0009at org.assbox.rulekeepr.services.mixins.DCartHelper$$anonfun$accountInfo$1.apply(DCartHelper.scala:24)\n","stream":"stdout","time":"2017-05-08T10:30:12.376734254Z"}
{"log":"\u0009... 19 common frames omitted\n","stream":"stdout","time":"2017-05-08T10:30:12.37676087Z"}
How can I harness fluentd directives to properly combine such a log event containing a stack trace, so it is all shipped to elastic as a single message? I have full control of the logback appender pattern used, so I can change the order of occurrence of log values to something else, and even change the appender class. We're working with k8s, and it turns out it's not straightforward to change the docker logging driver, so we're looking for a solution that will be able to handle the given example. I don't care so much about extracting the loglevel, thread, logger into specific keys so I could later easily filter by them in kibana.
It would be nice to have, but it is less important. What is important is to accurately parse the timestamp, down to the milliseconds, and use it as the actual log event timestamp as it is shipped to elastic.
You can use fluent-plugin-concat. For example, with Fluentd v0.14.x:
<source>
@type tail
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
read_from_head true
<parse>
@type json
</parse>
@label @INPUT
</source>
<label @INPUT>
<filter kubernetes.**>
@type concat
key log
multiline_start_regexp ^\d{2}:\d{2}:\d{2}\.\d+
continuous_line_regexp ^(\s+|java.lang|Caused by:)
separator ""
flush_interval 3s
timeout_label @PARSE
</filter>
<match kubernetes.**>
@type relabel
@label @PARSE
</match>
</label>
<label @PARSE>
<filter kubernetes.**>
@type parser
key_name log
inject_key_prefix log.
<parse>
@type multiline_grok
grok_failure_key grokfailure
<grok>
pattern YOUR_GROK_PATTERN
</grok>
</parse>
</filter>
<match kubernetes.**>
@type relabel
@label @OUTPUT
</match>
</label>
<label @OUTPUT>
<match kubernetes.**>
@type stdout
</match>
</label>
Similar issues:
https://github.com/fluent/fluent-plugin-grok-parser/issues/36
https://github.com/fluent/fluent-plugin-grok-parser/issues/37
You can try using the fluent-plugin-grok-parser, but I am having the same issue: it seems that the \u0009 tab character is not being recognized, so fluent-plugin-detect-exceptions will not detect the multiline exceptions, at least not yet in my attempts.
In fluentd 1.0 I was able to achieve this with fluent-plugin-concat. The concat plugin starts and continues concatenation until it sees the multiline_start_regexp pattern again. This captures Java exceptions and multiline slf4j log statements. Adjust your multiline_start_regexp pattern to match your slf4j log output line. Any line, including exceptions, starting with a timestamp matching the pattern 2020-10-05 18:01:52.871 will be concatenated, e.g.:
2020-10-05 18:01:52.871 ERROR 1 --- [nio-8088-exec-3] c.i.printforever.DemoApplication multiline statement
I am using container_id as the identity key:
<system>
log_level debug
</system>
# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
@type forward
@id input1
@label @mainstream
port 24224
</source>
# All plugin errors
<label @ERROR>
<match **>
@type file
@id error
path /fluentd/log/docker/error/error.%Y-%m-%d.%H%M
compress gzip
append true
<buffer>
@type file
path /fluentd/log/docker/error
timekey 60s
timekey_wait 10s
timekey_use_utc true
total_limit_size 200mb
</buffer>
</match>
</label>
<label @mainstream>
<filter docker.**>
@type concat
key log
stream_identity_key container_id
multiline_start_regexp /^\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3}/
</filter>
# Match events with docker tag
# Send them to S3
<match docker.**>
@type copy
<store>
@type s3
@id output_docker_s3
aws_key_id "#{ENV['AWS_KEY_ID']}"
aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
s3_bucket "#{ENV['S3_BUCKET']}"
path "#{ENV['S3_OBJECT_PATH']}"
store_as gzip
<buffer tag,time>
@type file
path /fluentd/log/docker/s3
timekey 300s
timekey_wait 1m
timekey_use_utc true
total_limit_size 200mb
</buffer>
time_slice_format %Y%m%d%H
</store>
<store>
@type stdout
</store>
<store>
@type file
@id output_docker_file
path /fluentd/log/docker/file/${tag}.%Y-%m-%d.%H%M
compress gzip
append true
<buffer tag,time>
@type file
timekey_wait 1m
timekey 1m
timekey_use_utc true
total_limit_size 200mb
path /fluentd/log/docker/file/
</buffer>
</store>
</match>
<match **>
@type file
@id output_file
path /fluentd/log/docker/catch-all/data.*.log
</match>
</label>
Fluentd: How to place the time key inside the json string
This is the record that fluentd writes to my log:
2016-02-22 14:38:59 {"login_id":123,"login_email":"abc@gmail.com"}
The date time is the time key of fluentd. How can I place that time inside the JSON string?
My friend helped me with this. He used this fluentd plugin: http://docs.fluentd.org/articles/filter-plugin-overview
This is the config:
<filter trackLog>
type record_modifier
<record>
fluentd_time ${Time.now.strftime("%Y-%m-%d %H:%M:%S")}
</record>
</filter>
<match trackLog>
type record_modifier
tag trackLog.finished
</match>
<match trackLog.finished>
type webhdfs
host localhost
port 50070
path /data/trackLog/%Y%m%d_%H
username hdfs
output_include_tag false
remove_prefix trackLog.finished
output_include_time false
buffer_type file
buffer_path /mnt/ramdisk/trackLog
buffer_chunk_limit 4m
buffer_queue_limit 50
flush_interval 5s
</match>