Fluentd: how to use a variable in <match>

Can the Elasticsearch index name not be set like this?
data: {"key1": "val1"}
Fluentd conf:
<match **>
@type elasticsearch
host localhost
port 9200
logstash_format true
logstash_prefix ${key1}
time_key @timestamp
include_timestamp true
</match>
Error:
[warn]: #0 chunk key placeholder '' not replaced. template:${key1}

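You probably need to set chunk keys: a ${key1} placeholder is only replaced when key1 is listed in the <buffer> section.
<match **>
@type elasticsearch
# snip
# logstash_prefix is an output parameter, so it sits outside <buffer>
logstash_prefix ${key1}
<buffer tag, key1>
</buffer>
</match>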

Maybe you can specify a custom index, like this:
<match **>
@type elasticsearch
index_name whatever.${key1}
logstash_prefix ${key1}
<buffer tag, key1>
</buffer>
</match>
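The two answers above can be combined into a fuller sketch. It assumes every record carries a top-level key1 field, as in the question; host and port are copied from the question, and the flush_interval value is an arbitrary placeholder.

```
<match **>
  @type elasticsearch
  host localhost
  port 9200
  logstash_format true
  # ${key1} is resolved per buffer chunk, so key1 must also be a chunk key below
  logstash_prefix ${key1}
  <buffer tag, key1>
    # placeholder value, tune for your workload
    flush_interval 5s
  </buffer>
</match>
```

Elasticsearch index names must be lowercase, so the values of key1 need to be lowercase as well.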

Related

how to set hostname in index name in fluentd conf file

I want to include the hostname in the index_name in my Fluentd conf file. I have set it like this, but it is not working:
<match output.**>
@type copy
<store>
@type elasticsearch
host elasticsearch
ssl_version TLSv1_2
ssl_verify false
type_name _doc
port 443
scheme https
flush_interval 10s
index_name abc-${hostname}
</store>
<store>
@type stdout
</store>
</match>
How can I achieve that?
Your question is not very clear, but let me try to answer anyway.
You can achieve it in your source section.
Example:
<source>
@type tail
#format json
path path_to_the_file
pos_file /var/log/td-agent/buffer/somename
tag hostname #in plain text(there are other methods too)
</source>
Now add
include_tag_key true
logstash_prefix ${tag}
logstash_format true
to your <match> section after host, and remove index_name.
This is not a full solution to your problem, but hopefully it gives you a direction.
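If the goal is simply the name of the machine Fluentd runs on, another option worth trying is Fluentd's embedded Ruby in double-quoted values. This is a minimal sketch, assuming Fluentd v1 config syntax. The abc- prefix and connection settings come from the question, and logstash_format is left off so that index_name takes effect.

```
<store>
  @type elasticsearch
  host elasticsearch
  port 443
  scheme https
  # The double quotes are required: "#{...}" is evaluated once, when the config is loaded
  index_name "abc-#{Socket.gethostname}"
</store>
```

This resolves to the hostname of the Fluentd host (or container), not to a per-record value. Elasticsearch index names must be lowercase, so a hostname with uppercase letters would need to be normalized first.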

Create Elasticsearch indices named after Kubernetes namespaces

I'm using Elasticsearch (Open Distro) with Fluentd and want to collect my Kubernetes cluster logs, with one index per namespace. I looked at this answer but still have a problem. I also added Fluentd-${record['kubernetes']['namespace_name']}, but it did not resolve to my namespaces.
I'm using this conf for the source:
## logs from podman
<source>
@type tail
@id in_tail_container_logs
@label @KUBERNETES
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
read_from_head true
<parse>
@type multi_format
<pattern>
format json
time_key time
time_type string
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
keep_time_key false
</pattern>
<pattern>
format regexp
expression /^(?<time>.+) (?<stream>stdout|stderr)( (.))? (?<log>.*)$/
time_format '%Y-%m-%dT%H:%M:%S.%NZ'
keep_time_key false
</pattern>
</parse>
emit_unmatched_lines true
</source>
and in filters.conf:
<label @KUBERNETES>
<match kubernetes.var.log.containers.fluentd**>
@type relabel
@label @FLUENT_LOG
</match>
<filter kubernetes.**>
@type kubernetes_metadata
@id filter_kube_metadata
</filter>
<filter kubernetes.**>
@id filter_parser
@type parser
key_name log
reserve_data true
remove_key_name_field true
<parse>
@type multi_format
<pattern>
format json
</pattern>
<pattern>
format none
</pattern>
</parse>
</filter>
<match **>
@type relabel
@label @OUTPUT
</match>
</label>
and finally in the output:
04_outputs.conf: |-
<label @OUTPUT>
<match **>
@type elasticsearch
host myhost
port 9200
user myuser
password mypass
scheme https
ssl_verify false
logstash_prefix Fluentd-${record['kubernetes']['namespace_name']}
logstash_format true
<buffer tag, $.kubernetes.namespace_name>
flush_thread_count 8
flush_interval 5s
chunk_limit_size 2M
queue_limit_length 32
retry_max_interval 30
retry_forever true
</buffer>
</match>
</label>
but I still don't get anything in the index.
I was recently working on a fluent-bit -> fluentd -> opensearch setup so just putting my solution here.
In my case, I was also getting the literal ${record['kubernetes']['namespace_name']} as my index name instead of the actual namespace (I tried different variations such as the accessor pattern, with and without quotes, double and single, but none of them worked). If you do not need the tag, you can use it to pass the index name by rewriting it:
<match kube.**>
@type rewrite_tag_filter
<rule>
key $['kubernetes']['namespace_name']
pattern ^(.+)$
tag $1
</rule>
</match>
And on your output,
logstash_prefix fluentd-${tag}
logstash_format true
Hope it helps even though this can be considered a hack.
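For comparison, ${record[...]} is record_transformer-style syntax, not one of the buffer chunk-key placeholders. The form that the Fluentd buffer documentation describes for a nested chunk key repeats the key inside ${...}. Here is a sketch under that assumption. The answer above reports that accessor variants did not work in that setup, so treat this as something to test, not a guaranteed fix. The flush_interval value is a placeholder.

```
<match **>
  @type elasticsearch
  # connection settings omitted
  logstash_format true
  # Lowercase prefix: Elasticsearch rejects index names that contain uppercase letters
  logstash_prefix fluentd-${$.kubernetes.namespace_name}
  <buffer tag, $.kubernetes.namespace_name>
    flush_interval 5s
  </buffer>
</match>
```

The placeholder has to match the chunk key exactly, including the leading $. of the record accessor.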

data loss while sending from fluentd to aws kinesis firehose

We are using Fluentd to send logs to AWS Kinesis Firehose. Every now and then, a few records are not delivered to Firehose.
Here are our Fluentd settings:
<system>
log_level info
</system>
<source>
@type tail
path "/var/log/app/tracy.log*"
pos_file "/var/tmp/tracy.log.pos"
pos_file_compaction_interval 72h
@log_level "error"
tag "tracylog"
<parse>
@type "json"
time_key False
</parse>
</source>
<source>
@type monitor_agent
bind 127.0.0.1
port 24220
</source>
<match tracylog>
@type "kinesis_firehose"
region "${awsRegion}"
delivery_stream_name "${delivery_stream_name}"
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
# Frequency of ingestion
flush_interval 30s
flush_thread_count 4
chunk_limit_size 1m
</buffer>
</match>
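A few changes in the config fixed my issue. I added read_from_head true and follow_inodes true to the tail source, and in the buffer I shortened flush_interval, added flush_thread_interval and flush_thread_burst_interval, raised flush_thread_count to 8, and dropped chunk_limit_size: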
<system>
log_level info
</system>
<source>
@type tail
path "/var/log/app/tracy.log*"
pos_file "/var/tmp/tracy.log.pos"
pos_file_compaction_interval 72h
read_from_head true
follow_inodes true
@log_level "error"
tag "tracylog"
<parse>
@type "json"
time_key False
</parse>
</source>
<source>
@type monitor_agent
bind 127.0.0.1
port 24220
</source>
<match tracylog>
@type "kinesis_firehose"
region "${awsRegion}"
delivery_stream_name "${delivery_stream_name}"
<instance_profile_credentials>
</instance_profile_credentials>
<buffer>
flush_interval 2
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 8
</buffer>
</match>

fluentd localtime is working for stdout, but not elasticsearch

I'm tailing a syslog file which doesn't have the timezone. By default fluentd (incorrectly) assumes the timezone is UTC, so it shifts the time off by several hours.
I can fix this for stdout, using 'localtime true', but I can't find a setting to do the same thing for elasticsearch:
<source>
@type tail
# read_from_head true
<parse>
@type syslog
</parse>
path /tmp/syslog
pos_file /tmp/var_log_syslog.pos
tag syslog.file
</source>
<match syslog.**>
@type copy
<store>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
logstash_prefix fluentd
logstash_dateformat %Y%m%d
include_tag_key true
type_name access_log
tag_key @log_name
flush_interval 1s
utc_index false
</store>
<store>
@type stdout
localtime true
</store>
</match>
It looks like the desired behavior is the default behavior. Fluentd seems to use the localtime zone, but I was running it in a docker container and I forgot to set the container's timezone.
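If changing the container's timezone is not an option, the parse section also accepts an explicit timezone for timestamps that carry no zone. This is a sketch only: the +09:00 offset is a made-up example and should be replaced with the zone the syslog file was written in.

```
<source>
  @type tail
  path /tmp/syslog
  pos_file /tmp/var_log_syslog.pos
  tag syslog.file
  <parse>
    @type syslog
    # Hypothetical offset: interpret zone-less timestamps as UTC+09:00
    timezone +09:00
  </parse>
</source>
```

With the zone pinned in the parser, the parsed event time no longer depends on the timezone of the host or container running Fluentd.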

Fluentd logging driver sends unstructured log message

My environment has a setup where docker container logs are forwarded to fluentd, then fluentd forwards to splunk.
I have an issue with Fluentd: some of the Docker container logs are not in a consistent structured format. From the documentation I see that:
fluentd log driver sends the following metadata in the structured log message:
container_id,
container_name,
source,
log
My issue is that a few of the logs have their metadata fields in a different order.
For example:
Log 1:
{"log":"2019/03/12 13:59:49 [info] 6#6: *2425596 client closed connection while waiting for request, client: 10.17.84.12, server: 0.0.0.0:80","container_id":"789459f8f8a52c8b4b","container_name":"testingcontainer-1ed-fwij4-EcsTaskDefinition-1TF1DH","source":"stderr"}
Log 2:
{"container_id":"26749a26500dd04e92fc","container_name":"/4C4DTHQR2V6C-EcsTaskDefinition-1908NOZPKPKY0-1","source":"stdout","log":"\u001B[0mGET \u001B[32m200 \u001B[0m0.634 ms - -\u001B[0m"}
These two logs have their metadata fields in a different order (log 1: [log, container_id, container_name, source]; log 2: [container_id, container_name, source, log]). Because of this I'm getting some issues in Splunk. How can I make the metadata fields always come in the same order?
My Fluentd config file is:
<source>
@type forward
@id input1
@label @mainstream
@log_level trace
port 24224
</source>
<label @mainstream>
<match *.**>
@type copy
<store>
@type file
@id output_docker1
path /fluentd/log/docker.*.log
symlink_path /fluentd/log/docker.log
append true
time_slice_format %Y%m%d
time_slice_wait 1m
time_format %Y%m%dT%H%M%S%z
utc
buffer_chunk_limit 512m
</store>
<store>
@type s3
@id output_docker2
@log_level trace
s3_bucket bucketwert-1
s3_region us-east-1
path logs/
buffer_path /fluentd/log/docker.log
s3_object_key_format %{path}%{time_slice}_sbx_docker_%{index}.%{file_extension}
flush_interval 3600s
time_slice_format %Y%m%d
time_format %Y%m%dT%H%M%S%z
utc
buffer_chunk_limit 512m
</store>
</match>
</label>
How about fluent-plugin-record-sort?
Or, if you know all the keys in a record, you can use the built-in record_transformer plugin like this:
<source>
@type dummy
tag dummy
dummy [
{"log": "log1", "container_id": "123", "container_name": "name1", "source": "stderr"},
{"container_id": "456", "container_name": "name2", "source": "stderr", "log": "log2"}
]
</source>
<filter dummy>
@type record_transformer
renew_record true
keep_keys log,container_id,container_name,source
</filter>
<match dummy>
@type stdout
</match>
UPDATE (not tested):
<source>
@type forward
@id input1
@label @mainstream
@log_level trace
port 24224
</source>
<label @mainstream>
<filter>
@type record_transformer
renew_record true
keep_keys log,container_id,container_name,source
</filter>
<match *.**>
@type copy
<store>
@type file
@id output_docker1
path /fluentd/log/docker.*.log
symlink_path /fluentd/log/docker.log
append true
time_slice_format %Y%m%d
time_slice_wait 1m
time_format %Y%m%dT%H%M%S%z
utc
buffer_chunk_limit 512m
</store>
<store>
@type s3
@id output_docker2
@log_level trace
s3_bucket bucketwert-1
s3_region us-east-1
path logs/
buffer_path /fluentd/log/docker.log
s3_object_key_format %{path}%{time_slice}_sbx_docker_%{index}.%{file_extension}
flush_interval 3600s
time_slice_format %Y%m%d
time_format %Y%m%dT%H%M%S%z
utc
buffer_chunk_limit 512m
</store>
</match>
</label>

Resources