We're trying to eliminate Datadog agents from our infrastructure. I'm trying to find a way to forward the containers' standard-output logs so they can be visualised in Datadog, but without the agents and without changing the Dockerfiles, because there are hundreds of them.
I was thinking about trying to centralize the logs with rsyslog, but I don't know if it's a good idea. Any suggestions?
This doc shows a comprehensive list of all the integrations that involve log collection. Some of them are other common log shippers, which can also be used to forward logs to Datadog. Among these you'd find:
Fluentd
Logstash
Rsyslog (Linux)
Syslog-ng (Linux, Windows)
NXLog (Windows)
That said, you can still use the Datadog agent to collect logs only (they want you to collect everything with their agent, which is why they warn against using it for logs alone).
If you want to collect logs from Docker containers, the Datadog agent is an easy way to do that, and it has the benefit of adding lots of relevant Docker metadata as tags on your logs. (Docker log collection instructions here.)
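For reference, a minimal sketch of that agent-for-logs-only route, assuming the documented DD_LOGS_* switches and a placeholder API key (image tag and mounts are illustrative, not verified against your environment):

```yaml
# Hypothetical Compose sketch: run only the Datadog agent container and let it
# collect stdout/stderr from every other container via the Docker socket.
version: "3.8"
services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:7
    environment:
      - DD_API_KEY=<YOUR_DD_API_KEY>                    # placeholder
      - DD_LOGS_ENABLED=true                            # turn on log collection
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true       # collect logs from all containers
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro    # discover containers
      - /var/lib/docker/containers:/var/lib/docker/containers:ro   # read their log files
```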
If you don't want to do that, I'd look at Fluentd first on the list above -- it has a good reputation for containerized log collection, promotes JSON log formatting (for easier processing), and scales reasonably well.
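If you'd rather drop the agent entirely, a rough sketch of the shipper route: Docker's fluentd logging driver sends each container's stdout/stderr to a Fluentd container, which forwards to Datadog via the fluent-plugin-datadog output. No Dockerfile changes are needed; the image tags, config path, and plugin setup below are placeholders/assumptions.

```yaml
# Hypothetical Compose sketch of the agentless route.
version: "3.8"
services:
  fluentd:
    image: fluent/fluentd:v1.16-1                       # would need fluent-plugin-datadog installed (e.g. in a derived image)
    ports:
      - "24224:24224"                                   # default forward port used by the fluentd log driver
    volumes:
      - ./fluent.conf:/fluentd/etc/fluent.conf:ro       # config with a datadog <match> output (API key lives here)

  some-app:                                             # any existing service; its image stays untouched
    image: your-app-image:latest                        # placeholder
    logging:
      driver: fluentd                                   # Docker ships stdout/stderr to Fluentd instead of json-file
      options:
        fluentd-address: localhost:24224                # the Docker daemon connects from the host
        fluentd-async: "true"                           # don't block container start if the forwarder is briefly down
        tag: docker.{{.Name}}
```

The same logging driver and options can instead be set once in the Docker daemon's daemon.json, which keeps the hundreds of existing Compose files and Dockerfiles untouched.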
I'm having issues connecting Fluentd to Kafka for a centralized logging PoC I'm working on.
I'm currently using the following configuration:
Minikube
Fluentd
fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-kafka2-1.0 (Docker image)
Configuration: I have the FLUENT_KAFKA2_BROKERS=<INTERNAL KAFKA BOOTSTRAP IP>:9092 and FLUENT_KAFKA2_DEFAULT_TOPIC=logs env vars set in the YAML for the Fluentd DaemonSet.
Kafka
I was sort of expecting to see the logs appear in a Kafka consumer running against the same broker listening on the "logs" topic. No dice.
Could anyone recommend next steps for troubleshooting and/or a good reference? I've done a good bit of searching and have only found a few people posting about setups with the fluentd-kafka plugin. Also, would it make sense for me to explore a Fluent Bit Kafka setup as an alternative?
In general, to forward log events to a Kafka topic you need a Fluentd output plugin.
Fluentd provides the fluent-plugin-kafka plugin, as described in the Fluentd docs, for both input and output use cases. For the output case, the plugin offers Kafka producer functions that publish messages to topics. The kafka-connect-fluentd plugin can also be used as an alternative.
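As a rough illustration, here is a hypothetical ConfigMap with an explicit kafka2 match block that you could mount over the image's default configuration while debugging, so the broker and topic settings are visible rather than hidden behind env vars; paths, names, and the broker address are placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-kafka-debug
  namespace: kube-system
data:
  fluent.conf: |
    <source>
      @type tail                          # tail container log files on the node
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      <parse>
        @type json                        # assumes Docker json-file log format
      </parse>
    </source>

    <match **>
      @type kafka2                        # output from fluent-plugin-kafka (bundled in the *-kafka2 images)
      brokers my-kafka.default.svc.cluster.local:9092   # placeholder bootstrap address
      default_topic logs
      <format>
        @type json
      </format>
      <buffer topic>
        flush_interval 5s                 # flush quickly so a test consumer sees events fast
      </buffer>
    </match>
```

Independently of the config, it is worth checking the Fluentd pod logs (kubectl logs on the DaemonSet pods) for connection or buffer errors, and verifying from inside the cluster that the address in FLUENT_KAFKA2_BROKERS matches what the broker advertises in its listeners; a broker advertising an unreachable hostname is a very common cause of the "nothing arrives on the topic" symptom.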
Fluent Bit, a sub-project of Fluentd, is a good lightweight alternative to Fluentd, but which one to use depends on your particular use case.
Fluent Bit has a more limited set of filtering options and is not as pluggable and flexible as Fluentd. The latter has more configuration options and filters and can be integrated with a much larger number of input and output sources. It is essentially designed for heavy throughput: aggregating from multiple inputs, processing data, and routing it to different outputs. More on the comparison here and here.
I have an app that dynamically creates Docker containers, and I can't intercept how they are created.
I want to see logs from all the containers that are up, no matter whether they were started via docker-compose or the plain docker command line. I need to see all the logs.
Is it possible?
Right now I need to run docker ps, see all the created containers, and run docker logs <container> for each one.
I can't really monitor what is going on inside them.
Thanks
One approach is to use a dedicated logging container that gathers log events from the other containers, aggregates them, and then stores or forwards them to a third-party service; this approach removes any dependency on the host.
Further, a dedicated logging container can automatically collect, monitor, and analyze log events, scale log collection without extra configuration, and retrieve logs through multiple streams of log events, stats, and Docker API data.
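As a rough illustration of that pattern, here is a hypothetical Compose sketch using gliderlabs/logspout, a small dedicated logging container that attaches to the Docker socket and streams the stdout/stderr of every container on the host, however it was started; the destination URI is a placeholder:

```yaml
version: "3.8"
services:
  logspout:
    image: gliderlabs/logspout:latest
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro   # read-only access to the Docker API
    command: syslog+tls://logs.example.com:6514        # placeholder destination; other routes/adapters exist
```

Because it works through the Docker API rather than per-container configuration, it also picks up containers that your app creates dynamically.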
You can also check this link for some help: Docker Logging Best Practices.
The Kubernetes documentation states it's possible to use Elasticsearch and Kibana for cluster-level logging.
Is it possible to do this on the instance of Kubernetes that ships with Docker for Windows, as per the documentation? I'm not interested in third-party Kubernetes manifests or Helm charts that mimic this behavior.
Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications. It is a complex environment that produces a huge amount of information about the state of the cluster and the events processed during the pod lifecycle and the health checking of all nodes and of the cluster as a whole.
I have no hands-on experience with Docker for Windows, so my point of view is based on Kubernetes with Linux containers.
To collect and analyze all of this information there are tools like Fluentd and Logstash, accompanied by tools such as Elasticsearch and Kibana. This kind of cluster-level log aggregation can be realized within the Kubernetes orchestration framework itself: some running containers take care of gathering the data, while other containers take care of other layers of the stack, such as analysis and presentation.
Note that some solutions depend on features of the cloud platform where the Kubernetes environment is running; for example, GCP offers Stackdriver Logging.
We can distinguish several layers of log collection and analysis:
Monitoring a pod is the most rudimentary way of viewing Kubernetes logs: you use kubectl (for example kubectl logs <pod-name>) to fetch log data for each pod individually. These logs are stored in the pod, and when the pod dies, the logs die with it.
Monitoring a node. The logs collected for each node are stored in a JSON file, which can get really large. Node-level logs are more persistent than pod-level ones.
Monitoring a cluster. Kubernetes doesn't provide a default logging mechanism for the entire cluster, but leaves this up to the user and third-party tools. One approach is to build on node-level logging: assign a logging agent to every node and combine their output.
As you can see, there is a gap at the cluster level, so there is a reason to aggregate the per-node logs and offer a practical way to analyze and present the results.
For node-level logging, a popular log aggregator is Fluentd. It runs as a Docker container alongside the pods, and it does not store the logs itself; instead, it sends them to an Elasticsearch cluster that stores the log information on a replicated set of nodes.
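A minimal sketch of that node-level agent, assuming the fluent/fluentd-kubernetes-daemonset image family and an Elasticsearch service named elasticsearch-logging (the tag, namespace, and service name are placeholders):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.16-debian-elasticsearch7-1  # illustrative tag
        env:
        - name: FLUENT_ELASTICSEARCH_HOST        # env vars built into this image family
          value: elasticsearch-logging
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log                    # includes /var/log/containers symlinks
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers
```

A real deployment would also need a service account with permission to read pod metadata, but the sketch shows the shape of the per-node agent.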
Elasticsearch is thus used as the data store for the logs aggregated from the worker nodes. In the reference setup, this aggregator cluster consists of a pod with two instances of Elasticsearch.
The aggregated logs in the Elasticsearch cluster can be viewed using Kibana, which provides a web interface and a more convenient, interactive way of querying the ingested logs.
The Kibana pods are also monitored by Kubernetes to ensure they are running healthily and that the expected number of replicas is present; their lifecycle is controlled by a replication-controller specification, similar in nature to how the Elasticsearch cluster is configured.
Back to your question: I'm fairly sure the setup described above also works with Kubernetes on Docker for Windows. On the other hand, I think a cloud platform or an on-premise Linux environment is the more natural home for these tools.
This answer was inspired by the Cluster-level Logging of Containers with Containers and Kubernetes Logging articles. I also like the Configuring centralized logging from Kubernetes page, and I used An Introduction to Logging in Kubernetes when I was getting started with Kubernetes.
Currently, to my understanding, Kubernetes offers no logging solution of its own, and it also does not allow you to specify the logging driver when Docker is the container runtime, due to scope/encapsulation concerns.
This leaves folks with the ugly solution of tailing JSON logs from shared volumes using fluentd, filebeat, or some other file-tailing daemon, parsing them, and then sending them to the desired storage backend.
My question is: is there any repo or public collection of configs for this kind of scenario from people who have gone through it before? My use case involves tailing the logs of an nginx Docker image, and writing the fluentd/grok pattern myself seems really painful; besides, I wouldn't want to struggle with an issue someone else has already solved.
Thanks
We tried LogDNA and the integration with k8s is pretty solid. Most of the time I just tail the logs of a container using kubectl logs -f [CONTAINER_ID]. I'm guessing you're looking for a more persistent approach.
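If you do go the self-managed tailing route from your question, one thing that saves hand-writing grok is Fluentd's built-in nginx access-log parser. A rough sketch with placeholder paths, tags, and output (the ConfigMap would be mounted into whatever Fluentd DaemonSet you run):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-nginx-tail
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/nginx-*.log    # placeholder glob for the nginx pods
      pos_file /var/log/fluentd-nginx.pos
      tag nginx.access
      <parse>
        @type json                             # Docker json-file wrapper: {"log": "...", "stream": ..., "time": ...}
      </parse>
    </source>

    <filter nginx.access>
      @type parser
      key_name log                             # re-parse the wrapped "log" field
      <parse>
        @type nginx                            # built-in nginx access-log parser, no grok needed
      </parse>
    </filter>

    <match nginx.access>
      @type stdout                             # placeholder output; swap in elasticsearch, kafka, etc.
    </match>
```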
What is the common practice for getting metrics from services running inside Docker containers, using tools like collectd or InfluxData's Telegraf?
These tools are normally configured to run as agents in the system and get metrics from localhost.
I have read the collectd docs, and some plugins allow collecting metrics from remote systems, so I could have, for example, an NGINX container and a collectd container that scrapes its metrics, but isn't there a simpler way?
Also, I don't want to use Supervisor or similar tools to run more than one process per container.
I am thinking about this in conjunction with a system like DC/OS or Kubernetes.
What do you think?
Thank you for your help.
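For what it's worth, the remote-scraping setup described above (an NGINX container plus a separate metrics container) keeps each container at one process. A hypothetical Compose sketch, assuming Telegraf's nginx input pointed at an nginx stub_status endpoint (the telegraf.conf contents and the stub_status location are not shown and would have to be provided):

```yaml
version: "3.8"
services:
  nginx:
    image: nginx:latest
    ports:
      - "8080:80"                 # stub_status would be exposed in the nginx config

  telegraf:
    image: telegraf:latest
    volumes:
      - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro   # would enable [[inputs.nginx]] pointing at http://nginx/stub_status
    depends_on:
      - nginx
```

On DC/OS or Kubernetes the same idea becomes a sidecar container in the pod, or a cluster-wide agent that discovers scrape targets, so you still avoid Supervisor-style multi-process containers.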