Fluentd file output does not output to file - fluentd

On Ubuntu 18.04, I am running td-agent v4, which uses the Fluentd v1.x core. I first configured it with a TCP input and stdout output, and it received and printed the messages fine. I then configured it to output to a file with a 10s flush interval, yet no output files appear in the destination path.
This is my file output configuration:
<match>
@type file
path /var/log/td-agent/test/access.%Y-%m-%d.%H:%M:%S.log
<buffer time>
timekey 10s
timekey_use_utc true
timekey_wait 2s
flush_interval 10s
</buffer>
</match>
I perform this check every 10s to see if log files are generated, but all I see is a directory with a name that still has the placeholders that I set for the path param:
ls -la /var/log/td-agent/test
total 12
drwxr-xr-x 3 td-agent td-agent 4096 Feb 5 23:14 .
drwxr-xr-x 6 td-agent td-agent 4096 Feb 6 00:17 ..
drwxr-xr-x 2 td-agent td-agent 4096 Feb 5 23:14 access.%Y-%m-%d.%H:%M:%S.log
From following the Fluentd docs, I expected this to be fairly straightforward, since the file output and buffering plugins are bundled with Fluentd's core.
Am I missing something trivial here?
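As a rough illustration of what the path placeholders in the question are meant to turn into, here is a minimal Python sketch that floors an event time to its 10-second timekey window and expands the strftime placeholders in UTC. The function name expand_path and the event time are made up for the example, and the file name Fluentd really writes may carry extra suffixes, so treat this as an approximation of the idea rather than Fluentd's actual logic.

```python
from datetime import datetime, timezone


def expand_path(path_template: str, event_time: datetime, timekey_seconds: int) -> str:
    """Floor event_time to the start of its timekey window, then expand
    the strftime placeholders in the template using UTC."""
    epoch = int(event_time.timestamp())
    window_start = epoch - (epoch % timekey_seconds)
    start = datetime.fromtimestamp(window_start, tz=timezone.utc)
    return start.strftime(path_template)


# Path template taken from the question; the event time is hypothetical.
template = "/var/log/td-agent/test/access.%Y-%m-%d.%H:%M:%S.log"
event_time = datetime(2021, 2, 5, 23, 14, 27, tzinfo=timezone.utc)

expanded = expand_path(template, event_time, timekey_seconds=10)
print(expanded)  # /var/log/td-agent/test/access.2021-02-05.23:14:20.log
```

Every event in the same 10-second window maps to the same expanded name, which is why a time chunk key and timekey are needed before the placeholders can be resolved.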

I figured it out, and it works now. I had two outputs, one to file and another to stdout, each defined in its own <match> ... </match> block. Apparently that doesn't work: Fluentd sends an event only to the first <match> block whose pattern matches its tag. I believe the stdout block came first in the config, so events went to stdout and never reached the file output. Both outputs should instead be nested under the copy output like this:
<match>
@type copy
<store>
@type file
...
</store>
<store>
@type stdout
</store>
</match>
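The first-match behaviour described in this answer can be sketched in a few lines of Python. This is only a toy model: the route function and the block lists are hypothetical, and fnmatch globbing stands in for Fluentd's own tag pattern syntax, which it does not reproduce exactly.

```python
from fnmatch import fnmatchcase


def route(tag, match_blocks):
    """Return the outputs of the first block whose pattern matches the tag.

    match_blocks is an ordered list of (pattern, outputs) pairs, mimicking
    the order of <match> blocks in a config file.
    """
    for pattern, outputs in match_blocks:
        if fnmatchcase(tag, pattern):
            return outputs
    return []


# Two separate <match> blocks: the second one is never reached.
separate_blocks = [("**", ["stdout"]), ("**", ["file"])]

# One <match> block using copy with two <store> sections.
copy_block = [("**", ["file", "stdout"])]

print(route("app.access", separate_blocks))  # ['stdout']
print(route("app.access", copy_block))       # ['file', 'stdout']
```

With separate blocks the event stops at the first match, while the copy-style block hands the same event to every store.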

Related

Fluentd - file buffer setup for redis_output (or elasticsearch) with multiprocess workers

Can someone help me configure the file buffer for multiprocess workers in Fluentd?
I use the config below, but when I add @type file and @id to the buffer of the redis_store plugin, it throws this error:
failed to configure sub output copy: Plugin 'file' does not support multi workers configuration
Without @id, it fails with:
failed to configure sub output copy: Other 'redis_store' plugin already use same buffer path
The path does include the tag, and the same approach works for the different file outputs; it fails only with the Redis output.
I don't want to use the default memory buffer, because memory usage grows when there is too much data. Is it possible to configure this combination (multiprocess workers plus a file buffer for the redis_store or Elasticsearch plugin)?
Configuration:
<system>
workers 4
root_dir /fluentd/log/buffer/
</system>
<worker 0-3>
<source>
@type forward
bind 0.0.0.0
port 9880
</source>
<label @TEST>
<match test.**>
@type forest
subtype copy
<template>
<store>
@type file
@id "file_${tag_parts[2]}/${tag_parts[3]}/${tag_parts[3]}-#{worker_id}"
@log_level debug
path "fluentd/log/${tag_parts[2]}/${tag_parts[3]}/${tag_parts[3]}-#{worker_id}.*.log"
append true
<buffer>
flush_mode interval
flush_interval 3
flush_at_shutdown true
</buffer>
<format>
@type single_value
message_key log
</format>
</store>
<store>
@type redis_store
host server_ip
port 6379
key test
store_type list
<buffer>
#@type file CANT USE
@id test_${tag_parts[2]}/${tag_parts[3]}/${tag_parts[3]}-#{worker_id} WITH ID - DOESNT SUPPORT MULTIPROCESS..
#path fluentd/log/${tag_parts[2]}/${tag_parts[3]}/${tag_parts[3]}-#{worker_id}.*.log WITHOUT ID - OTHER PLUGIN USE SAME BUFFER PATH
flush_mode interval
flush_interval 3
flush_at_shutdown true
flush_thread_count 4
</buffer>
</store>
</template>
</match>
</label>
</worker>
Versions:
Fluentd v1.14.3
fluent-plugin-redis-store v0.2.0
fluent-plugin-forest v0.3.3
Thanks!
The redis_store config was wrong. In the correct version, @id sits directly under the first @type (the redis_store plugin itself), not inside <buffer>:
<store>
@type redis_store
@id test_${tag_parts[2]}/${tag_parts[3]}/${tag_parts[3]}-#{worker_id}
host server_ip
port 6379
key test
store_type list
<buffer>
@type file
flush_mode interval
flush_interval 3
flush_at_shutdown true
flush_thread_count 4
</buffer>
</store>
Thank you for your time Azeem :)

Unable to Run a Simple Python Script on Fluentd

I have a Python script called script.py. When I run it, it creates a logs folder on the Desktop, downloads all the necessary logs from a website, and writes them as .log files in that folder. I want Fluentd to run this script every 5 minutes and do nothing more. The next source in my config file does the real job of sending the log data elsewhere. If the logs folder already exists on the Desktop, these log files are uploaded correctly to the next destination, but the script itself never runs. If I delete my local logs folder, this is the output Fluentd gives:
2020-07-27 10:20:42 +0200 [trace]: #0 plugin/buffer.rb:350:enqueue_all: enqueueing all chunks in buffer instance=47448563172440
2020-07-27 10:21:09 +0200 [trace]: #0 plugin/buffer.rb:350:enqueue_all: enqueueing all chunks in buffer instance=47448563172440
2020-07-27 10:21:36 +0200 [debug]: #0 plugin_helper/child_process.rb:255:child_process_execute_once: Executing command title=:exec_input spawn=[{}, "python /home/zohair/Desktop/script.py"] mode=[:read] stderr=:discard
No logs folder ever appears on my Desktop, although the script creates one when I run it by hand with python script.py.
If I already have the logs folder, I can see the logs on the stdout normally. Here is my config file:
<source>
@type exec
command python /home/someuser/Desktop/script.py
run_interval 5m
<parse>
@type none
keys none
</parse>
<extract>
tag_key none
</extract>
</source>
<source>
@type tail
read_from_head true
path /home/someuser/Desktop/logs/*
tag sensor_1.log-raw-data
refresh_interval 5m
<parse>
@type none
</parse>
</source>
<match sensor_1.log-raw-data>
@type stdout
</match>
I just need fluentd to run the script and do nothing else, and let the other source take this data and send it to somewhere else. Any solutions?
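The problem was solved by adding another @type exec source that runs pip install -r requirements.txt. The script had been failing with a missing-module error that never showed up in the Fluentd log, since the command's stderr is discarded (note stderr=:discard in the debug line above). I was running Fluentd as superuser.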
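One way to surface this kind of hidden failure outside Fluentd is to run the same command from a small wrapper that keeps stderr instead of discarding it. The sketch below is a hypothetical debugging aid, not part of the original setup: run_and_report is an invented helper, and the deliberately missing module name only simulates the error described above.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command, capture its stderr, and return (exit_code, stderr_text)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stderr


# Simulate a script that fails because a module is not installed.
# "module_that_is_not_installed" is a placeholder name for the demonstration.
exit_code, stderr_text = run_and_report(
    [sys.executable, "-c", "import module_that_is_not_installed"]
)

print("exit code:", exit_code)
print("last stderr line:", stderr_text.strip().splitlines()[-1])
```

Running the real script this way, as the same user that runs Fluentd, should show whether that user's Python environment is missing any modules.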

td-agent buffer violates size & only worker 0 accumulates logs

Buffer accumulation goes beyond the size limit:
-rw-r--r--. 1 root root 867 Oct 13 08:42 worker0/buffer.q594ca13cc1d99db732af807368a5b95a.log
-rw-r--r--. 1 root root 105 Oct 13 08:42 worker0/buffer.q594ca13cc1d99db732af807368a5b95a.log.meta
-rw-r--r--. 1 root root 867 Oct 13 08:43 worker0/buffer.q594ca175fa6044eca4e6d229bf0b0855.log
-rw-r--r--. 1 root root 105 Oct 13 08:43 worker0/buffer.q594ca175fa6044eca4e6d229bf0b0855.log.meta
-rw-r--r--. 1 root root 867 Oct 13 08:45 worker0/buffer.q594ca1e86b5ad4ba4c27a47525449337.log
-rw-r--r--. 1 root root 105 Oct 13 08:45 worker0/buffer.q594ca1e86b5ad4ba4c27a47525449337.log.meta
-rw-r--r--. 1 root root 867 Oct 13 08:46 worker0/buffer.q594ca221a3bfb7c8f8c8615f67ccdabc.log
-rw-r--r--. 1 root root 105 Oct 13 08:46 worker0/buffer.q594ca221a3bfb7c8f8c8615f67ccdabc.log.meta
-rw-r--r--. 1 root root 867 Oct 13 08:47 worker0/buffer.q594ca25adc02e19e978cd80b3d606ecc.log
-rw-r--r--. 1 root root 105 Oct 13 08:47 worker0/buffer.q594ca25adc02e19e978cd80b3d606ecc.log.meta
The above is only a partial listing.
du -sh reports a size of 1.7M, whereas the buffer limit is set to only 1M.
Also, all logs are being collected in the worker0 folder.
The td-agent log shows only worker #0 processing.
td-agent on host 172.168.3.10 was taken down to check that buffering works properly.
<system>
workers 2
log_level warn
suppress_repeated_stacktrace true
</system>
<worker 0>
<source>
@type tcp
port 8514
bind 0.0.0.0
format /(^(?<header>[^\{]+)?(?<message>\{.+type.+\})$)|(^(?<log>[^\{].+))/
tag system
</source>
</worker>
<source>
@type syslog
port 5140
bind 0.0.0.0
message_length_limit 6144
format /(^(?<header>[^\{]+)?(?<message>\{.+type.+\})$)|(^(?<log>[^\{].+))/
tag syslog
</source>
<match fwd.company.logging.product*.172.168.3.10**>
@type copy
<store>
@type forward
<server>
host 172.168.3.10
port 24224
</server>
<buffer>
@type file
path /appdata/td-agent/log/buffer/forward-buffer/company.logging.product*.172.168.3.10
flush_mode interval
flush_interval 10s
timekey 60
retry_forever true
retry_max_interval 5s
overflow_action drop_oldest_chunk
total_limit_size 1m
flush_at_shutdown false
</buffer>
</store>
</match>
The buffer shouldn't grow beyond its size limit, and both workers should process log buffers.
Actually, I found the answer myself.
du -sh includes the size of the directory itself.
So, to avoid confusion, I added up the sizes of the buffer files only.
That was still more than 1M.
After some searching, I learned that td-agent counts only the .log files (the actual buffer chunks) toward the size limit, not the .log.meta files (metadata about each chunk). Adding up just the .log files gave 1M, which matches the limit.
So I conclude that td-agent does respect the buffer size.
As for logs accumulating in worker0: only worker 0 is set to listen on port 8514, so all logs arriving on that port are processed by worker 0.
Please correct me if I'm wrong.
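The manual check described in this answer can be reproduced with a short Python sketch that sums chunk files and metadata files separately. The helper names are invented, and the demo uses a temporary directory whose file sizes (867 and 105 bytes) mirror the listing in the question. Whether only the .log chunks count toward total_limit_size is the answer's own conclusion, which this sketch simply takes as given.

```python
import tempfile
from pathlib import Path


def chunk_bytes(buffer_dir):
    """Sum the sizes of buffer chunk files (*.log), skipping *.log.meta."""
    return sum(p.stat().st_size for p in Path(buffer_dir).rglob("*.log") if p.is_file())


def meta_bytes(buffer_dir):
    """Sum the sizes of chunk metadata files (*.log.meta)."""
    return sum(p.stat().st_size for p in Path(buffer_dir).rglob("*.log.meta") if p.is_file())


# Build a throwaway directory that mimics the worker0 listing in the question.
with tempfile.TemporaryDirectory() as tmp:
    worker_dir = Path(tmp) / "worker0"
    worker_dir.mkdir()
    for i in range(5):
        (worker_dir / f"buffer.q{i}.log").write_bytes(b"x" * 867)
        (worker_dir / f"buffer.q{i}.log.meta").write_bytes(b"m" * 105)
    chunk_total = chunk_bytes(tmp)
    meta_total = meta_bytes(tmp)

print(chunk_total, meta_total)  # 4335 525
```

Pointing chunk_bytes at the real buffer directory gives the figure to compare against the configured limit, without the directory and metadata overhead that du -sh adds.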

How to add configuration to Logging Agent from Docker Container?

I'm trying to run a Docker container on Compute Engine. Everything works fine and my PHP app correctly returns all its data, but I want to increase log verbosity.
For now I've added two config files for fluentd inside a container config dir:
This one for nginx:
<source>
@type tail
format nginx
path /var/log/feedbacks/nginx-access.log
pos_file /var/lib/google-fluentd/pos/nginx-access.pos
read_from_head true
tag nginx-access
</source>
<source>
@type tail
format none
path /var/log/feedbacks/nginx-error.log
pos_file /var/lib/google-fluentd/pos/nginx-error.pos
read_from_head true
tag nginx-error
</source>
And this one for PHP log output :
<source>
@type tail
format /^\[(?<time>[\d\-]+ [\d\:]+)\] (?<channel>.+)\.(?<level>(DEBUG|INFO|NOTICE|WARNING|ERROR|CRITICAL|ALERT|EMERGENCY))\: (?<message>[^\{\}]*) (?<context>(\{.+\})|(\[.*\])) (?<extra>(\{.+\})|(\[.*\]))\s*$/
path /var/log/feedbacks/structured.log
pos_file /var/lib/google-fluentd/pos/feedbacks.pos
read_from_head true
tag feedbacks
</source>
I've mounted these two config files, along with the corresponding log files, as follows:
container path: /usr/src/app/var/logs/, host path: /var/log/feedbacks/, mode: r/w
container path: /usr/src/app/docker/runnable/fluentd/, host path: /etc/google-fluentd/config.d/, mode: r/w
But when I open a shell (/bin/bash) in the stackdriver-logging-agent container and look in these directories, they are empty. Maybe I'm missing something...
Thanks for helping !
stackdriver-logging-agent reads a container's logs through the equivalent of docker logs [container]. This gives processes on the host OS a consistent API for gathering container logs.
By default, the container's stdout and stderr are sent to docker logs, and it is this stream that stackdriver-logging-agent collects and forwards to the Stackdriver service.
If I understand correctly, you'd need to ensure that your PHP app generates the richer logs and sends them to stdout or stderr.
Nginx's stock Docker image, for example, does this by symlinking its log files:
lrwxrwxrwx 1 root root 11 May 8 03:01 access.log -> /dev/stdout
lrwxrwxrwx 1 root root 11 May 8 03:01 error.log -> /dev/stderr
See Docker's documentation here:
https://docs.docker.com/config/containers/logging/
I was unable to find a good explanation of this for Container OS on Google's site.

How do I configure the timezone of Fluentd?

My system time is: Tue Jan 6 09:44:49 CST 2015
td-agent.conf:
<match apache.access>
type webhdfs
host Page on test.com
port 50070
path /apache/%Y%m%d_%H/access.log.${hostname}
time_slice_format %Y%m%d
time_slice_wait 10m
time_format %Y-%m-%dT%H:%M:%S.%L%:z
timezone +08:00
flush_interval 1s
</match>
The time in the directory name is right:
[hadoop@node1 ~]$ hadoop fs -ls /apache/20150106_09
Found 1 items
-rw-r--r-- 2 webuser supergroup 17496 2015-01-06 09:47 /apache/20150106_09/access.log.node1.Page on test.com
But the time in the log record is wrong (it is written in UTC rather than +08:00), and I don't know why:
2015-01-06T01:47:00.000+00:00 apache.access {"host":"::1","user":null,"method":"GET","path":"/06","code":404,"size":275,"referer":null,"agent":"ApacheBench/2.3"}
My config works now:
type webhdfs
host 192.168.80.41
port 50070
path /log/fluent_%Y%m%d_%H.log
time_format %Y-%m-%d %H:%M:%S
localtime
flush_interval 10s
The fix was to add localtime to the config.
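To confirm that the timestamp in the question was the same instant expressed in UTC rather than a wrong time, the conversion can be checked with a few lines of Python. The values come from the log line and the timezone +08:00 setting in the question; the variable names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Timestamp as written in the log record: 2015-01-06T01:47:00.000+00:00
logged_utc = datetime(2015, 1, 6, 1, 47, 0, tzinfo=timezone.utc)

# The offset the question expected (timezone +08:00).
china_standard_time = timezone(timedelta(hours=8))
logged_local = logged_utc.astimezone(china_standard_time)

print(logged_local.isoformat())  # 2015-01-06T09:47:00+08:00
```

The result matches the 09:47 modification time shown by hadoop fs -ls, so the record and the file agree on the instant and differ only in how it is formatted.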
