Exclude logs from fluentd using the grep exclude directive not working

I am trying to exclude logs using the grep filter's exclude directive.
<filter kubernetes.var.log.containers.**>
  @type grep
  <exclude>
    key kubernetes.pod_name
    pattern /^podname-*/
  </exclude>
</filter>
I tried different key names as well, e.g. container and namespace. I am trying to exclude logs from a certain pod using the pattern, but it is not working. I am using a forward source to send the logs.
I want to exclude logs from pods whose names start with the same prefix, read from /var/log/containers.
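For reference, a config along these lines should exclude such pods. This is a minimal sketch, assuming fluentd v1 with the kubernetes_metadata filter already applied, so pod_name sits nested under the kubernetes key and must be addressed with the grep filter's record_accessor ($.) syntax; podname- is a placeholder prefix:
<filter kubernetes.var.log.containers.**>
  @type grep
  <exclude>
    # record_accessor syntax: reaches the nested kubernetes.pod_name field
    key $.kubernetes.pod_name
    # drop every record whose pod name starts with the given prefix
    pattern /^podname-/
  </exclude>
</filter>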

Related

Why isn't telegraf reading environment variables?

My goal is to put my telegraf config into source control. To do so, I have a repo in my user's home directory with the appropriate config file which has already been tested and proven working.
I have added the path to the new config file in the "default" environment variables file:
/etc/default/telegraf
like this:
TELEGRAF_CONFIG_PATH="/home/ubuntu/some_repo/telegraf.conf"
... as well as other required variables such as passwords.
However, when I attempt to run
telegraf --test
It says No config file specified, and could not find one in $TELEGRAF_CONFIG_PATH etc.
Further, if I force it by
telegraf --test --config /home/ubuntu/some_repo/telegraf.conf
Then the process fails because it is missing the other required variables.
Questions:
What am I doing wrong?
Is there not also a way of specifying a config directory too (I would like to break my file down into separate input files)?
Perhaps as an alternative to all of this... is there not a way of specifying additional configuration files to be included from within the default /etc/telegraf/telegraf.conf file? (I've been unable to find any mention of this in documentation).
What am I doing wrong?
See what user:group owns /etc/default/telegraf. This file is better used when running telegraf as a service via systemd. Additionally, if you run env do you see the TELEGRAF_CONFIG_PATH variable? What about your other variables? If not, then you probably need to source the file first.
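If the variables are missing from env, a quick way to load them into the current shell before testing is the following sketch; set -a auto-exports every variable the file assigns:
$ set -a                      # export everything assigned from here on
$ source /etc/default/telegraf
$ set +a
$ env | grep TELEGRAF         # the variables should now be visible
$ telegraf --test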
Is there not also a way of specifying a config directory too (I would like to break my file down into separate input files)?
Yes! Take a look at all the options of telegraf with telegraf --help and you will find:
--config-directory <directory> directory containing additional *.conf files
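For example, a test run combining a main config with a directory of additional *.conf files could look like this (the telegraf.d path is just a conventional example):
$ telegraf --config /etc/telegraf/telegraf.conf --config-directory /etc/telegraf/telegraf.d --test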
Perhaps as an alternative to all of this... is there not a way of specifying additional configuration files to be included from within the default /etc/telegraf/telegraf.conf file? (I've been unable to find any mention of this in documentation).
That is not the method I would suggest going down. Check out the config directory option above I mentioned.
Ok, after a LOT of trial and error, I figured everything out. For those facing similar issues, here is your shortcut to the answer:
Firstly, remember that when adding variables to the /etc/default/telegraf file, the file must effectively be reloaded. On Ubuntu with systemctl, for example, that requires restarting the telegraf service.
You can verify that the variables have been loaded successfully using this:
$ sudo strings /proc/<pid>/environ
where <pid> is the "Main PID" from the telegraf status output
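Putting those two steps together (12345 is a placeholder; substitute the Main PID from your own status output):
$ sudo systemctl restart telegraf                   # force systemd to re-read /etc/default/telegraf
$ systemctl status telegraf | grep 'Main PID'       # note the PID, e.g. 12345
$ sudo strings /proc/12345/environ | grep TELEGRAF  # confirm the variables were loaded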
Secondly, when testing (e.g. telegraf --test), you will ALSO have to load the same environment variables into the current shell (e.g. export var=value; this is the part that is not necessarily intuitive and isn't documented), such that running
$ env
shows the same results as the previous command.
Hint: rather than exporting each variable manually, you can load the env file directly in one step, as in the set -a / source snippet shown above.

Using fluentD to catch logs when the same file is created again

I have a log file that is continuously deleted and re-created with the same structure but different data.
I'd like to use fluentd to export that file whenever a new version of it is created. I tried various sets of options, but it looks like fluentd misses the updates unless I manually add some lines to the file.
Is this a use case that is supported by default sources/parsers?
Here is the config file in use:
<source>
  @type tail
  tag file.keepalive
  open_on_every_update true
  read_from_head true
  encoding UTF-8
  multiline_flush_interval 1
  ...
</source>
Try the tail plugin, but instead of specifying a path to a file, specify a path to the parent directory, like dir/*: https://docs.fluentd.org/input/tail#path
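A sketch of that idea follows; the path is a placeholder, and follow_inodes is an in_tail option (fluentd 1.12+) that makes it track files by inode, so a file re-created under the same name is read from the start:
<source>
  @type tail
  # glob the parent directory so a freshly re-created file is picked up
  path /var/log/myapp/*
  pos_file /var/log/td-agent/myapp.pos
  tag file.keepalive
  read_from_head true
  follow_inodes true
  <parse>
    @type none
  </parse>
</source>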
Try adding a datetime to the filename every time you recreate it; this will reliably force fluentd to read everything.

How to configure fluentd to handle logs that sometimes don't exist?

I am adjusting our fluentd configuration to include a specific log file and send it to S3. The issue I am trying to wrap my head around is this: only some instance types in our datacenter will contain this specific log. Other instances will not (because they are not running the app that we are logging). How do you modify the configuration so that fluentd can handle the file existing or not existing?
So in the example input below, this log file will not be on every server instance; that is expected. Do we have to configure the security.conf file to look for this and skip it if missing? Or will fluentd just not include what it doesn't find?
## Inputs:
<source>
  @type tail
  path /var/log/myapp/myapp-scan.log.*
  pos_file /var/log/td-agent/myapp-scan.log.pos
  tag s3.system.security.myapp-scan
  format none
</source>

How to filter generated Swagger JSON (yaml)

I have a swagger.json file over 5k lines long, describing hundreds of paths and objects. I want to generate a TypeScript client (using swagger-codegen) from only a subset of the endpoints. I don't want the generated TypeScript application to contain classes or interfaces connected with the unused parts of the swagger.json.
How can I filter out only the part of the Swagger documentation describing a specified group of paths (e.g. all paths starting with /api/*)? In particular, I want the filtered JSON not to contain definitions for unused data structures.
I think you can do it, using task automation (grunt, gulp, shell, whatever).
Basically it could be a 3-step task:
get the swagger.json (or call swagger-codegen to get the JSON, with something like java -jar swagger-codegen-cli-x.x.x.jar generate -i <URL> -l swagger -o GeneratedCodeSwagger)
remove the definitions/paths that you want to exclude and create a modified swagger.json
call the code-gen passing the modified json with java -jar swagger-codegen-cli-x.x.x.jar generate -i GeneratedCodeSwagger\swagger.json -l typescript-angular
Finally, we created swagger-json-filter, a command-line tool for filtering Swagger documentation. It can easily be used alongside other commands in bash:
cat input.json | swagger-json-filter --include-paths="^\/api\/.*" > output.json
The tool performs the logic needed to filter out undesired definitions (including nested ones) from the output.
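Combining this with the codegen step from the earlier answer might look like the following (the jar version and the output directory name are placeholders):
# keep only the /api/* paths, then generate the TypeScript client from the result
cat swagger.json | swagger-json-filter --include-paths="^\/api\/.*" > filtered.json
java -jar swagger-codegen-cli-x.x.x.jar generate -i filtered.json -l typescript-angular -o client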

How to find source hostname with fluentd?

I'm looking for a way to send source_hostname to the fluentd destination server.
I was previously on logstash, where we had an agent/server setup and variables to get the source hostname in the logstash server config file.
I am searching for a similar way to do this with fluentd, but the only thing I have found is setting the hostname in the source tag with "#{Socket.gethostname}". That way, however, I can't use the hostname in the path of the destination log file.
Based on: http://docs.fluentd.org/articles/config-file#embedded-ruby-code
On the server side, this is what I would like to do:
<source>
  type forward
  port 24224
  bind 192.168.245.100
</source>
<match apache.access.*>
  type file
  path /var/log/td-agent/apache2/#{hostname}/access
</match>
<match apache.error.*>
  type file
  path /var/log/td-agent/apache2/#{hostname}/error
</match>
Could someone help me achieve something like this, please?
Thank you in advance for your time.
You can evaluate Ruby code with #{} inside a double-quoted string.
So you can change it to:
path /var/log/td-agent/apache2/"#{hostname}"/access
Refer to the docs: http://docs.fluentd.org/articles/config-file#embedded-ruby-code
You can also try the record-reformer plugin or the forest plugin.
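As a sketch of the forest approach, assuming each sending agent already embeds its hostname in the tag (e.g. apache.access.web01) so the tag can stand in for the hostname; __TAG__ is the forest plugin's placeholder for the full tag:
<match apache.access.*>
  @type forest
  subtype file
  <template>
    # __TAG__ expands to the full tag of each record, e.g. apache.access.web01
    path /var/log/td-agent/apache2/__TAG__/access
  </template>
</match>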
