Fluentd <filter> element emits empty message record - fluentd

We have a td_agent.conf file with the following filter:
# This filter is used for the C API, which removes "[stdout]" from the log.
# If the CLOG Unified Logging C API won't be used, this filter can be removed.
<filter k.**.log>
  @type parser
  format /^(\[stdout\])*(?<log>.+)$/
  key_name log
  suppress_parse_error_log true
</filter>
and the following sample log line:
{"host":"omer","level":"TRACE","log":{"classname":"Manager:452","message":"^~\"DD\"-^ TRACE Added context","stacktrace":"","threadname":"Processing-ThreadPool-2"},"process":"Context","service":"","time":"2020-11-04T13:37:12.979Z","timezone":"Kolkata","type":"log"}
With the above logic in Fluentd, the record is output with an empty log: {} field, which means the info we want never reaches the Elasticsearch DB. When we remove this filter, everything works fine.
Can anyone explain why this happens?
The start of the td-agent config is:
<source>
  @type tail
  path /var/log/containers/*s*.log
  pos_file /var/log/td-agent/containers.json.access.pos
  tag k.*
  #read_from_head true
  <parse>
    @type regexp
    expression /(^(?<header>[^\{]+)?(?<message>\{.+\})$)|(^(?<log>[^\{].+))/
  </parse>
</source>
<filter k.var.log.containers.**.log>
  @type parser
  key_name message
  format json
  #time_parse false
  time_key time
  time_format %iso8601
  keep_time_key true
</filter>
# This filter is used for the C API, which removes "[stdout]" from the log.
# If the CLOG Unified Logging C API won't be used, this filter can be removed.
<filter k.**.log>
  @type parser
  format /^(\[stdout\])*(?<log>.+)$/
  key_name log
  suppress_parse_error_log true
</filter>
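A likely explanation (an assumption, not confirmed in the thread): by the time the second filter runs, the first filter has already parsed message as JSON, so log is a nested object (a Ruby Hash) rather than a string like "[stdout]Added context". The regexp parser has nothing to match against a Hash, and suppress_parse_error_log true hides the failure, leaving the empty log: {}. A minimal way to check this is to stop suppressing the errors:
<filter k.**.log>
  @type parser
  key_name log
  format /^(\[stdout\])*(?<log>.+)$/
  # temporarily surface the parse failures instead of hiding them
  suppress_parse_error_log false
</filter>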

Related

How to unnest/flatten json in fluentd using parser filter?

This is my current config.
<source>
  @type dummy
  rate 1
  tag eggsample
  dummy {"event":"signup","context":{"ip":"105.175.82.28"}}
</source>
<filter eggsample>
  @type parser
  key_name context
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
<match eggsample>
  @type stdout
</match>
As shown in the example above, I'm trying to unnest/flatten the "context" object
Hence:
{"event":"purchase","context":{"ip":"105.175.82.28"}} > {"event":"purchase","ip":"105.175.82.28"}
However, I'm getting an error where the parser plugin raises a "pattern not matched" error, even though my config looks just like the one in the documentation.
2022-11-28 07:43:03 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data '{\"ip\"=>\"105.175.82.28\"}'" location=nil tag="eggsample" time=2022-11-28 07:43:03.006905641 +0000 record={"event"=>"signup", "log"=>{"ip"=>"105.175.82.28"}}
2022-11-28 07:43:03.006905641 +0000 eggsample: {"event":"signup","log":{"ip":"105.175.82.28"}}
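The error text is a hint: pattern not matched with data '{"ip"=>"105.175.82.28"}' shows the parser receiving a Ruby Hash, not a JSON string, because the dummy source already emits context as a parsed object, leaving nothing for @type json to parse. One hedged workaround, assuming the goal is just to hoist ip to the top level, is record_transformer instead of parser:
<filter eggsample>
  @type record_transformer
  enable_ruby true
  # drop the nested object once its value has been copied out
  remove_keys context
  <record>
    ip ${record.dig("context", "ip")}
  </record>
</filter>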

fluentd output json value as json without message_key

I'm creating a log pipeline with filtering, transformations and multiple output routes.
I have a problem with outputting the raw log (without the "message_key").
Currently, the log looks like:
{"log": {"type": "debug", "log" :"This is the log message" , <More Entries>} }
I would like to drop the "log" message_key and output:
{"type": "debug", "log" :"This is the log message", <More Entries>}
I've tried:
1.
<filter *>
  @type parser
  key_name log
  <parse>
    @type json
  </parse>
</filter>
And got an error, probably since the value is already JSON.
2.
<filter *>
  @type parser
  key_name log
  <parse>
    @type none
  </parse>
</filter>
And got this output (the value now sits under a "message" key instead of the original "log"):
{"message": {"type": "debug", "log" :"This is the log message"} }
Tried to use @type record_transformer, but <record> wants a key-value pair and I would like to select the value only.
Tried to format with the single_value formatter, but the output was:
{"type" => "debug", "log" => "This is the log message"}
How can this be done? What's the best way to drop the message_key before outputting the log?
After skimming through the Fluentd plugins here I didn't find a way to do what I wanted, so I've ended up writing my own plugin.
I'm not going to accept my own answer, since I hope someone will provide a better one using a certified plugin.
Just in case you are desperate for a solution, here's the plugin:
require "fluent/plugin/filter"
module Fluent
module Plugin
class JsonRecordByKeyFilter < Fluent::Plugin::Filter
Fluent::Plugin.register_filter("json_record_by_key", self)
config_param :key
def filter(tag, time, record)
record[#key]
end
end
end
end
Usage:
<filter *>
  @type json_record_by_key
  key log
</filter>
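One caveat with this sketch: a filter plugin that returns nil drops the event entirely, so any record missing the configured key would be silently discarded by the plugin above.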
Ran into this same problem, but found a solution outlined on Fluentd's website using the remove_key_name_field option:
https://docs.fluentd.org/filter/parser#remove_key_name_field
<filter *>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
  </parse>
</filter>
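With reserve_data true plus remove_key_name_field true, the keys parsed out of log are merged into the record and the original log field is removed, which produces exactly the unwrapped {"type": "debug", ...} shape asked for above.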

How to inject `time` attribute based on certain json key value?

I am still new to Fluentd. I've tried various configurations, but I am stuck.
Suppose I have this record pushed to Fluentd with an _epoch field giving the epoch time at which the record was created:
{"data":"dummy", "_epoch": <epochtime_in_second>}
Instead of using the time attribute processed by Fluentd, I want to override the time with this _epoch field. How do I produce Fluentd output with the time overridden?
I've tried this
# TCP input to receive logs from the forwarders
<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

# HTTP input for the liveness and readiness probes
<source>
  @type http
  bind 0.0.0.0
  port 9880
</source>
# rds2fluentd_test
<filter rds2fluentd_test.*>
  @type parser
  key_name _epoch
  reserve_data true
  <parse>
    @type regexp
    expression /^(?<time>.*)$/
    time_type unixtime
    utc true
  </parse>
</filter>

<filter rds2fluentd_test.*>
  @type stdout
</filter>
<match rds2fluentd_test.*>
  @type s3
  @log_level debug
  aws_key_id "#{ENV['AWS_ACCESS_KEY']}"
  aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
  s3_bucket foo-bucket
  s3_region ap-southeast-1
  path ingestion-test-01/${_db}/${_table}/%Y-%m-%d-%H-%M/
  #s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
  # if you want to use ${tag} or %Y/%m/%d/ like syntax in path / s3_object_key_format,
  # need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
  <buffer time,_db,_table>
    @type file
    path /var/log/fluent/s3
    timekey 1m # 1-minute partition
    timekey_wait 10s
    timekey_use_utc true # use utc
    chunk_limit_size 256m
  </buffer>
  time_slice_format %Y%m%d%H
  store_as json
</match>
But upon receiving data like the above, it shows a warning like this:
#0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="parse failed no implicit conversion of Integer into Hash" location="/usr/local/bundle/gems/fluentd-1.10.4/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="rds2fluentd_test." time=1590578507 record={....
I was getting the same warning message; setting hash_value_field parsed under the filter section solved the issue.
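A sketch of that fix applied to the filter above; hash_value_field (a standard filter_parser option) stores the parsed result under the given key instead of merging it into the record, which avoids the Integer-into-Hash conversion error:
<filter rds2fluentd_test.*>
  @type parser
  key_name _epoch
  reserve_data true
  # nest the parsed value under "parsed" rather than merging it into the record
  hash_value_field parsed
  <parse>
    @type regexp
    expression /^(?<time>.*)$/
    time_type unixtime
    utc true
  </parse>
</filter>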

FluentD configuration to index on all the fields in the log in elastic

Hi, I have the below log from a Spring Boot microservice. I want to create an index on all the below fields, like timestamp, level, logger, etc., in Elasticsearch. How do I achieve this in the Fluentd configuration? I tried the below and it didn't work.
Log
timestamp:2020-04-27 09:37:56.996 level:INFO level_value:20000 thread:http-nio-8080-exec-2 logger:com.scb.nexus.service.phoenix.components.ApplicationEventListener context:default message:org.springframework.web.context.support.ServletRequestHandledEvent traceId:a122e51aa3d24d4a spanId:a122e51aa3d24d4a spanExportable:false X-Span-Export:false X-B3-SpanId:a122e51aa3d24d4a X-B3-TraceId:a122e51aa3d24d4a
fluentd conf
<match **>
  @type elasticsearch
  time_as_integer true
  include_timestamp true
  host host
  port 9200
  user userName
  password password
  scheme https
  ssl_verify false
  ssl_version TLSv1_2
  index_name testIndex
</match>
<filter **>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
The logs are not in JSON format, therefore you can't use the JSON parser. You have the following options to solve this issue:
1. Use the regexp parser as described here: https://docs.fluentd.org/parser/regexp (a sketch of this option appears after the example below).
2. Use the record_reformer plugin and extract items manually.
Example:
<match **>
  @type record_reformer
  tag parsed.${tag_suffix[2]}
  renew_record false
  enable_ruby true
  <record>
    timestamp ${record['log'].scan(/timestamp:(?<param>[^ ]+ [^ ]+)/).flatten.compact.sort.first}
    log_level ${record['log'].scan(/level:(?<param>[^ ]+)/).flatten.compact.sort.first}
    level_value ${record['log'].scan(/level_value:(?<param>[^ ]+)/).flatten.compact.sort.first}
  </record>
</match>
<match parsed.**>
  @type elasticsearch
  time_as_integer true
  include_timestamp true
  host host
  port 9200
  user userName
  password password
  scheme https
  ssl_verify false
  ssl_version TLSv1_2
  index_name testIndex
</match>
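And for completeness, a minimal sketch of option 1 with the regexp parser, assuming the key:value pairs always appear in the fixed order shown in the sample log:
<filter **>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type regexp
    # capture the leading fields; extend the pattern for the remaining keys as needed
    expression /timestamp:(?<timestamp>\S+ \S+) level:(?<level>\S+) level_value:(?<level_value>\S+) thread:(?<thread>\S+) logger:(?<logger>\S+)/
  </parse>
</filter>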

Fluentd (td-agent) secondary type should be same with primary one error

I'm using td-agent with the http plugin to send the log data to another server.
But when I start td-agent with my config file, I get a warning message like the one below:
2019-09-06 11:02:15 +0900 [warn]: #0 secondary type should be same
with primary one primary="Fluent::TreasureDataLogOutput"
secondary="Fluent::Plugin::FileOutput"
Here is my config file:
<source>
  @type tail
  path /pub/var/log/mylog_%Y%m%d.log
  pos_file /var/log/td-agent/www_log.log.pos
  tag my.log
  format /^(?<log_time>\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}) \[INFO\] (?<message>.*)$/
</source>
<filter my.log>
  @type parser
  key_name message
  <parse>
    @type json
  </parse>
</filter>
<match my.log>
  @type http
  endpoint_url http://localhost:3000/
  custom_headers {"user-agent": "td-agent"}
  http_method post
  raise_on_error true
</match>
It sends the log data correctly, but I need to resolve the warning message too. How can I resolve it?
