Fluentd customisation for a huge volume of container logs - fluentd

I am new to the Fluentd logging mechanism. We are dealing with an issue in our AKS fleet, where Fluentd is used to ship logs to Kafka. Most of the applications/containers are able to send logs through Kafka; the exception is one API app that emits a huge volume of logs... Any recommendations on Fluentd configs to handle this heavy log input?
We have not tried any volume-related options yet.

Related

Log to ELK from Nomad without using container technology

We are using HashiCorp Nomad to run microservices on Windows. We have found that allocations come and go, but we would like to have a centralized logging solution (ideally ELK) for all logs from all jobs and tasks from multiple environments. It is quite simple to do with dockerized environments, but how can I do it if I run raw_exec tasks?
There's nothing specific to containers for log shipping other than the output driver. If containers write their logs to volumes, which Nomad can be configured to do, then the answer is the same.
Assuming your raw_exec jobs write logs to the local filesystem, you need a log shipper such as Filebeat or Fluentd to watch those paths and push that data to Elastic / Logstash.
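To make the "watch those paths, then push" idea concrete, here is a minimal sketch of what a log shipper does at its core: remember a byte offset, read whatever was appended since, and package complete lines as events. This is not how Filebeat or Fluentd are implemented; the function names, the `source` field, and the newline-delimited JSON shape are assumptions for illustration, and the actual network send is left out.

```python
import json


def read_new_lines(path, offset):
    """Return (lines, new_offset) for data appended to `path` since `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only complete lines; a partial trailing line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = [raw.decode("utf-8", errors="replace") for raw in data[:end].splitlines()]
    return lines, offset + end


def to_ndjson(lines, source):
    """Build a newline-delimited JSON payload, one event per log line.

    The {"source", "message"} event shape is hypothetical, not a format
    required by any particular backend.
    """
    return "\n".join(json.dumps({"source": source, "message": line}) for line in lines)
```

A real shipper would call `read_new_lines` in a loop for each watched path, persist the offset so it survives restarts, and send the payload to its configured output.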

Docker CE and syslog

The Docker logging drivers are documented online, along with these limitations:
Limitations of logging drivers
Users of Docker Enterprise can make use of “dual logging”, which enables you to use the docker logs command for any logging driver. Refer to reading logs when using remote logging drivers for information about using docker logs to read container logs locally for many third party logging solutions, including:
syslog
gelf
fluentd
awslogs
splunk
etwlogs
gcplogs
Logentries
When using Docker Community Engine, the docker logs command is only available on the following drivers:
local
json-file
journald
Reading log information requires decompressing rotated log files, which causes a temporary increase in disk usage (until the log entries from the rotated files are read) and an increased CPU usage while decompressing.
The capacity of the host storage where the Docker data directory resides determines the maximum size of the log file information.
I am using Docker CE, but I have a question about this documentation. Does this mean that, using CE, I can't do syslog at all? Or just that I can't do syslog and have docker logs?
There is nothing stopping you from using syslog within the container, but you can't read those logs using the 'docker logs' command. There is also nothing stopping you from writing your logs to stdout and piping your logs to as many log shippers as you want.
Here's an article that explains how to do syslog in a docker container: https://medium.com/better-programming/docker-centralized-logging-with-syslog-97b9c147bd30
I think that fluentd and fluent-bit are better choices than syslog these days given the structure they provide to the msg field, though syslog-ng looks interesting. Fluent-bit is incredibly good though, so you might want to take a look at it.
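As an illustration of the "write your logs to stdout" route with some structure in the message, here is a small sketch using Python's standard `logging` module. The `JsonFormatter` class, the `make_stdout_logger` helper, and the chosen field names are assumptions for this example, not part of Docker, Fluentd, or Fluent Bit.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (field names are illustrative)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_stdout_logger(name, stream=None):
    """Return a logger that writes JSON lines to stdout (or to `stream`, if given)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]   # replace any handlers set up earlier
    logger.propagate = False      # avoid duplicate output via the root logger
    return logger
```

With the default json-file driver, lines written this way stay readable through docker logs, and a collector can parse the JSON rather than a free-form message string.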

Docker + Fluentd in K8s for log rotation: Does Docker need to know the existence of Fluentd?

I am trying to understand the interaction between Docker and Fluentd in a K8s cluster. I have seen places where you need to configure Docker to output to a logging driver, and Fluentd can be used as logging driver, like here.
On the other hand, I have seen posts (like this or this) where Docker does not know the existence of Fluentd as a DaemonSet.
My whole intention is to do log rotation; however, I am not sure whether having Fluentd in place will actually rotate the logs Docker writes to, so that I do not end up with the node's whole storage space taken up by logs over time. Is it enough to use a Fluentd DaemonSet without Docker knowing about Fluentd, or do I need to somehow connect Docker to Fluentd with a driver as well?
Per the official k8s logging architecture, Docker (or any other runtime) does not need to know about FluentBit, Fluentd, Filebeat, or any other log collector you use. In fact, you can use multiple log collectors at a time!
The same document states that k8s is not responsible for log rotation, so you have to set up logrotate yourself. The Fluentd/FluentBit daemon, on the other hand, does not rotate log files either, but it is able to track log rotation and adjust the tail cursor accordingly (by default).
By far the easiest way to implement the architecture is:
Leave kubelet & docker settings at default
Ensure the app logs stdout/stderr
Ensure there's logrotate: many k8s worker AMIs, e.g. the EKS ones, already have it.
Set up a FluentBit log collector DaemonSet: https://github.com/helm/charts/tree/master/stable/fluent-bit
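To illustrate what "track log rotation and adjust the tail cursor" means, here is a simplified sketch of the idea. It is not Fluentd's or FluentBit's actual implementation; the `RotationAwareTail` class is hypothetical. It treats a changed inode (rename and recreate) or a shrunken file (truncate in place) as a rotation and restarts reading from the top of the new file.

```python
import os


class RotationAwareTail:
    """Keep a read cursor on a log file and reset it when the file is rotated."""

    def __init__(self, path):
        self.path = path
        self.inode = None
        self.offset = 0

    def poll(self):
        """Return complete lines appended since the previous poll."""
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return []  # file is mid-rotation; try again on the next poll
        # New inode: the file was renamed away and recreated.
        # Smaller size than our cursor: the file was truncated in place.
        if st.st_ino != self.inode or st.st_size < self.offset:
            self.inode = st.st_ino
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        end = data.rfind(b"\n") + 1  # leave a partial last line for later
        self.offset += end
        return [raw.decode("utf-8", errors="replace") for raw in data[:end].splitlines()]
```

This sketch can miss lines written to the old file between the last poll and the rotation; production collectors typically keep the old file handle open for a while to drain it. Either way, the collector only follows rotation. Something else, such as logrotate, still has to perform it.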

How to forward application logs to Splunk from docker container?

We're interested in forwarding the logs from a node.js server running in a Docker container to Splunk.
Some options we've considered include a side-car container running a Splunk forwarder. The application would write to a shared volume that the side-car would observe and forward.
Ideally, we would just use a syslog drain or another mechanism, but I can't seem to find any documentation on how to set that up.
There are a lot of options to send logs from containers to Splunk.
For logs sent to standard output and standard error:
Splunk Logging Driver https://docs.docker.com/v17.09/engine/admin/logging/splunk/
Splunk Docker logging plugin https://github.com/splunk/docker-logging-plugin - an improved version of Splunk Logging Driver
For application logs (logs written inside of the container):
Sidecars with a Splunk Universal Forwarder (UF)
Our company (https://www.outcoldsolutions.com) offers one solution that can simply forward container (https://www.outcoldsolutions.com/docs/monitoring-docker/v5/) and application logs (https://www.outcoldsolutions.com/docs/monitoring-docker/v5/annotations/#application-logs) from the Docker hosts, and collect metrics. We also provide you with an application in Splunk for tracking the health and performance of your clusters https://splunkbase.splunk.com/app/3723/. Our application is not free, but cheap compared to the time you can spend building something similar.
Another option is using fluentd as an intermediary.
Fluentd exists as a Docker logging driver as well, but you can use it to redirect the logs to several backends (Splunk, Elasticsearch), so you are not as tightly coupled to Splunk.
Additionally, that's the approach proposed by OpenShift.
It looks like Docker has a logging driver that handles this
https://docs.docker.com/v17.09/engine/admin/logging/splunk/

How to put fluentd containers behind a load balancer in ECS?

In ECS, we have ALBs, which only route HTTP/HTTPS traffic. As Fluentd containers listen on a different TCP port, how can we load balance them in ECS? I know we can use a classic load balancer in ECS, but I wanted to avoid using a classic ELB.
To give some background, in our SOA architecture there are many services in ECS, and I want to use the Docker fluentd logging driver to route the logs to the Fluentd container.
Is it a good practice to put multiple fluentd containers behind a loadbalancer? Any other suggestions welcome.
I run a number of fluentd services in ECS behind classic ELB load balancers. It works really well and the inability to use dynamic ports hasn't been an issue given the number of hosts in the cluster.
You also now have the option of using the network load balancer. An NLB allows you to use dynamic port mapping with TCP. Read the documentation and see if it works for you. I didn't use it because it doesn't work across VPC peering connections.

Resources