We are using ELK for logging in our Spring application, with everything running under Docker. I have configured Logstash to read the log file from a given path (where the application writes its logs) and pass it to Elasticsearch. The initial setup works fine and all logs show up in Kibana almost instantly. However, as the log file grows (or whenever the application logs heavily), the application's response time degrades sharply and eventually the application, and everything else on the Docker network, goes down.
Logstash conf file:
input {
  file {
    type => "java"
    path => ["/logs/application.log"]
  }
}

filter {
  multiline {
    pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
    negate => "true"
    what => "previous"
    periodic_flush => false
  }

  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }

  grok {
    match => [ "message",
               "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{LOGLEVEL:level} %{NUMBER:pid} --- \[(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:\s+(?<logmessage>.*)",
               "message",
               "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{LOGLEVEL:level} %{NUMBER:pid} --- .+? :\s+(?<logmessage>.*)"
             ]
  }

  # Parse the timestamp captured by the grok section above
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}

output {
  # Send the parsed log events to Elasticsearch
  elasticsearch {
    hosts => ["elasticsearch:9200"]  # "elasticsearch" is the name of the service in the docker-compose file for ELK
  }
}
Logstash Dockerfile:
FROM logstash
ADD config/logstash.conf /tmp/config/logstash.conf
# VOLUME only declares a mount point inside the image; the host directory is bind-mounted via docker-compose below
VOLUME /logs
RUN touch /tmp/config/logstash.conf
EXPOSE 5000
ENTRYPOINT ["logstash", "agent","-v","-f","/tmp/config/logstash.conf"]
docker-compose.yml for ELK:
version: '2'

services:
  elasticsearch:
    image: elasticsearch:2.3.3
    command: elasticsearch -Des.network.host=0.0.0.0
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - elk

  logstash:
    build: image/logstash
    volumes:
      - $HOME/Documents/logs:/logs
    ports:
      - "5000:5000"
    networks:
      - elk

  kibana:
    image: kibana:4.5.1
    ports:
      - "5601:5601"
    networks:
      - elk

networks:
  elk:
Note: My Spring Boot application and the ELK stack are on different Docker networks. The performance issue remains the same even when they are on the same network.
Is this a performance issue caused by the continuous writing/polling of the log file, leading to read/write lock contention?
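For context, these are the file-input options that control how aggressively Logstash polls the file; the values shown below are illustrative assumptions, not settings taken from the configuration above:

input {
  file {
    type => "java"
    path => ["/logs/application.log"]
    stat_interval => 5                          # seconds between checks for new content
    start_position => "beginning"               # where to start when the file is first seen
    sincedb_path => "/tmp/sincedb-application"  # persist read offsets across restarts
  }
}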
Tracing makes it much easier to find the parts of the code that are worth a developer's time and attention. For that reason, I attached Jaeger as a tracer to a set of microservices running in Docker containers. I use Traefik as ingress controller/service mesh to route and proxy requests.
The problem I am facing is that something is wrong with the tracing configuration in Traefik. Jaeger cannot find the span context needed to connect the individual, per-service spans into a single trace.
The following line appears in the logs:
{
"level":"debug",
"middlewareName":"tracing",
"middlewareType":"TracingEntryPoint",
"msg":"Failed to extract the context: opentracing: SpanContext not found in Extract carrier",
"time":"2021-02-02T23:16:51+01:00"
}
What I have tried/searched/confirmed so far:
I already checked the ports (they are open inside the Docker host network) and everything is reachable, so inter-connectivity is not the problem here.
Forwarding of headers is set via a Docker Compose label: loadbalancer.passhostheader=true.
The following snippets describe the Docker Compose setup.
Traefik: Ingress Controller
This is a stripped-down version of the Traefik container.
# Network
ROOT_DOMAIN=example.test
DEFAULT_NETWORK=traefik
---
version: '3'

services:
  controller:
    image: "traefik:2.4.2"
    hostname: "controller"
    restart: on-failure
    security_opt:
      - no-new-privileges:true
    ports:
      - "443:443"
      - "80:80"
      # The Web UI (enabled by --api.insecure=true)
      - "8080:8080"
      - "8082:8082"
      - "8083:8083"
    networks:
      - default
    working_dir: /etc/traefik
    volumes:
      - /private/etc/localtime:/etc/localtime:ro
      - ${PWD}/controller/static.yml:/etc/traefik/traefik.yml:ro
      - ${PWD}/controller/dynamic.yml:/etc/traefik/dynamic.yml:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - cert-storage:/usr/local/share/ca-certificates:ro
      - ${PWD}/logs/traefik:/var/log/traefik

volumes:
  cert-storage:
    driver_opts:
      type: none
      o: bind
      device: ${PWD}/certs/certs

networks:
  default:
    external: true
    name: ${DEFAULT_NETWORK}
Traefik is set up using the file provider as base and Docker Compose labels on top of it:
# static.yaml (Traefik conf)
debug: true

log:
  level: DEBUG
  filePath: /var/log/traefik/error.log
  format: json

serversTransport:
  insecureSkipVerify: true

api:
  dashboard: true
  insecure: true
  debug: true

providers:
  docker:
    exposedByDefault: false
    swarmMode: false
    watch: true
    defaultRule: "Host(`{{ normalize .Name }}.example.test`)"
    endpoint: "unix:///var/run/docker.sock"
    network: traefik
  file:
    filename: /etc/traefik/dynamic.yml
    watch: true

tracing:
  serviceName: "controller"
  spanNameLimit: 250
  jaeger:
    samplingType: const
    samplingParam: 1.0
    samplingServerURL: http://tracer:5778/sampling
    localAgentHostPort: 127.0.0.1:6831
    gen128Bit: true
    propagation: jaeger
    traceContextHeaderName: "traefik-trace-id"
    collector:
      endpoint: http://tracer:14268/api/traces?format=jaeger.thrift
Jaeger: OpenTracing / OpenTelemetry
---
version: '3'

services:
  tracer:
    image: "jaegertracing/all-in-one:1.21.0"
    hostname: "tracer"
    command:
      - "--log-level=info"
      - "--admin.http.host-port=:14269"
      - "--query.ui-config=/usr/local/share/jaeger/ui/conf.json"
    environment:
      SPAN_STORAGE_TYPE: memory
    restart: on-failure
    security_opt:
      - no-new-privileges:true
    expose:
      - 5775/udp
      - 6831/udp
      - 6832/udp
      - 5778
      - 14250
      - 14268
      - 14269
      - 14271
      - 16686
      - 16687
    volumes:
      - /private/etc/localtime:/etc/localtime:ro
      - ${PWD}/tracer/conf:/usr/local/share/jaeger
      - ${PWD}/logs/jaeger:/var/log/#TODO
      - cert-storage:/usr/local/share/ca-certificates
    networks:
      - default
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=${DEFAULT_NETWORK}"
      # Admin UI router
      - "traefik.http.routers.tracer-router.rule=Host(`tracer.$ROOT_DOMAIN`)"
      - "traefik.http.routers.tracer-router.entrypoints=https"
      - "traefik.http.routers.tracer-router.tls=true"
      - "traefik.http.routers.tracer-router.tls.options=default"
      - "traefik.http.routers.tracer-router.service=tracer"
      # Service / load balancer
      - "traefik.http.services.tracer.loadbalancer.passhostheader=true"
      - "traefik.http.services.tracer.loadbalancer.server.port=16686"
      - "traefik.http.services.tracer.loadbalancer.server.scheme=http"
I'm not 100% sure what problem you're experiencing, but here are some things to consider.
According to this post on the Traefik forums, the message you're seeing is at debug level because it's not something you should be worried about. It just logs that no trace context was found, so a new one will be created. That second part is not in the message, but apparently that's what happens.
You should check whether data is appearing in Jaeger. If it is, that message is probably nothing to worry about.
If you are getting data in Jaeger but the spans are not connected, that will be because Traefik can only work with trace context that is already present in inbound requests; it cannot add trace context to outbound requests. Within your application, you'll need to implement trace propagation so that your outbound requests include the trace context that was received as part of the incoming request. Without that, every request is sent without trace context and starts a new trace when it is received at the next Traefik ingress point.
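As a rough illustration of what that propagation looks like inside a service, here is a minimal sketch using the Node jaeger-client and opentracing packages; the service name, agent host/port, and function name are assumptions for the example, not part of the setup above:

import { initTracer } from "jaeger-client";
import * as opentracing from "opentracing";

// Tracer reporting to the "tracer" agent from the compose file above; values are illustrative.
const tracer = initTracer(
  {
    serviceName: "my-service",
    sampler: { type: "const", param: 1 },
    reporter: { agentHost: "tracer", agentPort: 6832 },
  },
  {}
);

function handleRequest(incomingHeaders: Record<string, string>): void {
  // Continue the trace started by Traefik if its context header is present.
  const parent = tracer.extract(opentracing.FORMAT_HTTP_HEADERS, incomingHeaders);
  const span = tracer.startSpan("handle-request", parent ? { childOf: parent } : {});

  // Copy the current span context into the headers of any outbound call so the
  // next Traefik entry point can link its spans to this trace.
  const outgoingHeaders: Record<string, string> = {};
  tracer.inject(span.context(), opentracing.FORMAT_HTTP_HEADERS, outgoingHeaders);

  // ...perform the outbound HTTP request with outgoingHeaders attached...
  span.finish();
}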
The problem actually was with the traceContextHeaderName. Sadly, I cannot tell exactly what the problem was, since the git diff shows that nothing changed around Traefik and Jaeger at the point where I fixed it. I assume the config got "stuck" somehow. I tracked down the related lines in the source, but as I am no Go dev, I can only guess whether there is a bug.
What I did was switch back to uber-trace-id, which magically "fixed" it. After I had run some traces and connected another service (Node, npm jaeger-client with process.env.TRACER_STATE_HEADER_NAME set to the same value), I switched back to traefik-trace-id and things worked.
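In other words, the custom header name has to be identical on both ends. Roughly, assuming a compose service called my-service (a placeholder) that uses the Node jaeger-client and the environment variable mentioned above:

# Traefik static configuration
tracing:
  jaeger:
    traceContextHeaderName: "traefik-trace-id"

# docker-compose service using the Node jaeger-client
my-service:
  environment:
    TRACER_STATE_HEADER_NAME: "traefik-trace-id"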
I am trying to capture syslog messages sent over the network with rsyslog, and then have rsyslog transform these messages and send them to Elasticsearch.
I found a nice article on the configuration at https://www.reddit.com/r/devops/comments/9g1nts/rsyslog_elasticsearch_logging/
The problem is that rsyslog keeps reporting an error at startup: it cannot connect to Elasticsearch on port 9200 on the same machine. The error I get is:
Failed to connect to localhost port 9200: Connection refused
2020-03-20T12:57:51.610444+00:00 53fd9e2560d9 rsyslogd: [origin software="rsyslogd" swVersion="8.36.0" x-pid="1" x-info="http://www.rsyslog.com"] start
rsyslogd: omelasticsearch: we are suspending ourselfs due to server failure 7: Failed to connect to localhost port 9200: Connection refused [v8.36.0 try http://www.rsyslog.com/e/2007 ]
Can anyone help with this?
Everything is running in Docker on a single machine. I use the docker-compose file below to start the stack.
version: "3"
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
environment:
- discovery.type=single-node
- xpack.security.enabled=false
ports:
- 9200:9200
networks:
- logging-network
kibana:
image: docker.elastic.co/kibana/kibana:7.6.1
depends_on:
- logstash
ports:
- 5601:5601
networks:
- logging-network
rsyslog:
image: rsyslog/syslog_appliance_alpine:8.36.0-3.7
environment:
- TZ=UTC
- xpack.security.enabled=false
ports:
- 514:514/tcp
- 514:514/udp
volumes:
- ./rsyslog.conf:/etc/rsyslog.conf:ro
- rsyslog-work:/work
- rsyslog-logs:/logs
volumes:
rsyslog-work:
rsyslog-logs:
networks:
logging-network:
driver: bridge
rsyslog.conf file below:
global(processInternalMessages="on")
#module(load="imtcp" StreamDriver.AuthMode="anon" StreamDriver.Mode="1")
module(load="impstats") # config.enabled=`echo $ENABLE_STATISTICS`)
module(load="imrelp")
module(load="imptcp")
module(load="imudp" TimeRequery="500")
module(load="omstdout")
module(load="omelasticsearch")
module(load="mmjsonparse")
module(load="mmutf8fix")
input(type="imptcp" port="514")
input(type="imudp" port="514")
input(type="imrelp" port="1601")
# includes done explicitly
include(file="/etc/rsyslog.conf.d/log_to_logsene.conf" config.enabled=`echo $ENABLE_LOGSENE`)
include(file="/etc/rsyslog.conf.d/log_to_files.conf" config.enabled=`echo $ENABLE_LOGFILES`)
#try to parse a structured log
action(type="mmjsonparse")
# this is for index names to be like: rsyslog-YYYY.MM.DD
template(name="rsyslog-index" type="string" string="rsyslog-%$YEAR%.%$MONTH%.%$DAY%")
# this is for formatting our syslog in JSON with #timestamp
template(name="json-syslog" type="list") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"host\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"program\":\"") property(name="programname")
constant(value="\",\"tag\":\"") property(name="syslogtag" format="json")
constant(value="\",") property(name="$!all-json" position.from="2")
# closing brace is in all-json
}
# this is where we actually send the logs to Elasticsearch (localhost:9200 by default)
action(type="omelasticsearch" template="json-syslog" searchIndex="rsyslog-index" dynSearchIndex="on")
#################### default ruleset begins ####################
# we emit our own messages to docker console:
syslog.* :omstdout:
include(file="/config/droprules.conf" mode="optional") # this permits the user to easily drop unwanted messages
action(name="main_utf8fix" type="mmutf8fix" replacementChar="?")
include(text=`echo $CNF_CALL_LOG_TO_LOGFILES`)
include(text=`echo $CNF_CALL_LOG_TO_LOGSENE`)
First of all, you need to run all of the containers on the same Docker network, which in this case they are not. Second, after running the containers on the same network, log in to the rsyslog container and check whether port 9200 is reachable.
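A sketch of what that could look like, based on the compose file and rsyslog.conf above (the server/serverport parameters of omelasticsearch are how the output is pointed at the Elasticsearch service name instead of the default localhost):

  rsyslog:
    image: rsyslog/syslog_appliance_alpine:8.36.0-3.7
    networks:
      - logging-network   # join the same network as elasticsearch and kibana

and in rsyslog.conf:

action(type="omelasticsearch"
       server="elasticsearch"
       serverport="9200"
       template="json-syslog"
       searchIndex="rsyslog-index"
       dynSearchIndex="on")

A quick check from inside the rsyslog container would then be something like wget -qO- http://elasticsearch:9200 (the appliance image is Alpine-based, so wget should be available).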
I'm running multiple containers that run Apache. I'd like this specific set of containers to send their log output to a single location, either a file or possibly journald.
I just want some way to aggregate their logs so they can be viewed together.
I'm not looking for a heavy solution like fluentd or an ELK stack.
How can I achieve the above? Currently all the containers log to /dev/stdout and are therefore collected by docker logs, but those don't seem possible to aggregate.
According to "Save docker-compose logs to a file" it seems I might be able to set a log path, but how? A logging driver? And can this log file be shared between multiple containers?
Is the systemd logging driver a suitable option?
So I've had some luck with the journald logging driver. I've set some labels on a container like so:
version: "3"
services:
nginx-lb:
labels:
- "node_service=nginx"
logging:
driver: "journald"
options:
labels: "node_service=nginx"
restart: always
network_mode: host
build: .
ports:
- "80:80"
- "443:443"
But now, how do I filter by these labels when viewing them with journalctl?
Here is an example journald entry generated:
{ "__CURSOR" : "s=b300aa41db4946f1bcc528e2522627ce;i=1087c;b=e6decf90a91f40c2ad7507e342fda85a;m=8744b1cdfa;t=5934bdb103a24;x=9ba66ecb768eb67", "__REALTIME_TIMESTAMP" : "1569328890657316", "__MONOTONIC_TIMESTAMP" : "580973088250", "_BOOT_ID" : "e6decf90a91f40c2ad7507e342fda85a", "_MACHINE_ID" : "c1339882251041f48f4612e758675ff3", "_HOSTNAME" : "staging", "PRIORITY" : "6", "_UID" : "0", "_GID" : "0", "_CAP_EFFECTIVE" : "3fffffffff", "_SELINUX_CONTEXT" : "unconfined\n", "_SYSTEMD_SLICE" : "system.slice", "_TRANSPORT" : "journal", "_PID" : "3969", "_COMM" : "dockerd", "_EXE" : "/usr/bin/dockerd", "_CMDLINE" : "/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock", "_SYSTEMD_CGROUP" : "/system.slice/docker.service", "_SYSTEMD_UNIT" : "docker.service", "_SYSTEMD_INVOCATION_ID" : "9f1488b462ae478a84bec6e64d72886b", "CONTAINER_NAME" : "3b9b51b4cda1a1e3b21a01f6fe80c7748fb3d231_apache_1", "CONTAINER_TAG" : "497b2f965b76", "SYSLOG_IDENTIFIER" : "497b2f965b76", "CONTAINER_ID" : "497b2f965b76", "CONTAINER_ID_FULL" : "497b2f965b767f897786f3bb8c4789dd91db1a91fe34e5ede368172f44fb3aac", "MESSAGE" : "192.168.240.1 - - [24/Sep/2019:12:41:30 +0000] \"GET / HTTP/1.0\" 200 2697 \"-\" \"curl/7.58.0\"", "_SOURCE_REALTIME_TIMESTAMP" : "1569328890657297" }
I instead used the tag logging option.
version: "3"
services:
nginx-lb:
labels:
- "node_service=nginx"
logging:
driver: "journald"
options:
labels: "node_service=nginx"
tag: "nginx"
restart: always
network_mode: host
build: .
ports:
- "80:80"
- "443:443"
And then to view / filter:
journalctl CONTAINER_TAG=nginx --since "1 hour ago"
I've run into a similar problem, and this could probably work:
run the journalctl command with the CONTAINER_NAME match repeated, for example:
journalctl -f CONTAINER_NAME=containerA CONTAINER_NAME=containerB
will output recent logs from containerA and containerB.
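The same repetition works with the CONTAINER_TAG field used in the answer above; for example (the tag names are illustrative):

journalctl -f CONTAINER_TAG=nginx CONTAINER_TAG=apache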
I want to run filebeat as a sidecar container next to my main application container to collect application logs. I'm using docker-compose to start both services together, filebeat depending on the application container.
This is all working fine. I'm using a shared volume for the application logs.
However, I would also like to collect the Docker container logs (stdout, JSON logging driver) in Filebeat.
Filebeat provides a docker/container input for this purpose. Here is my configuration. The first part gets the application logs; the second part should get the Docker logs:
filebeat.inputs:
  - type: log
    paths:
      - /path/to/my/application/*.log.json
    exclude_lines: ['DEBUG']

  - type: docker
    containers.ids: '*'
    json.message_key: message
    json.keys_under_root: true
    json.add_error_key: true
    json.overwrite_keys: true
    tags: ["docker"]
What I don't like is the containers.ids: '*'. Here I would want to point Filebeat at the specific application container and ignore all the others.
Since I don't know the container ID before I run docker-compose up to start both containers, I was wondering if there is an easy way to get the container ID of my application container into my Filebeat container (via docker-compose?) so I can filter on it.
I think you may be able to work around the problem:
First, send all the logs from the container to syslog by configuring the syslog logging driver:
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:9000"
Then configure Filebeat to receive the logs from that syslog endpoint, matching the protocol (TCP here):
filebeat.inputs:
  - type: syslog
    protocol.tcp:
      host: "localhost:9000"
This is also not really answering the question, but should work as a solution as well.
The main idea is to use a label within the Filebeat autodiscover filter.
Taken from this post: https://discuss.elastic.co/t/filebeat-autodiscovery-filtering-by-container-labels/120201/5
filebeat.yml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.labels.somelabel: "somevalue"
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"

output.console:
  pretty: true
docker-compose.yml:
version: '3'

services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.2.1
    command: "--strict.perms=false -v -e -d autodiscover,docker"
    user: root
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - /var/lib/docker/containers:/var/lib/docker/containers
      - /var/run/docker.sock:/var/run/docker.sock

  test:
    image: alpine
    command: "sh -c 'while true; do echo test; sleep 1; done'"
    depends_on:
      - filebeat
    labels:
      somelabel: "somevalue"
I am setting up a docker-compose stack with several services, all of them writing to a common syslog container/service, which is actually a Logstash service (a complete ELK image, as a matter of fact) with the logstash-input-syslog plugin enabled.
It looks roughly as follows (using the custom port 5151, since the default 514 was giving me a hard time due to permission issues):
services:
  elk-custom:
    image: some_elk_image
    ports:
      - 5601:5601
      - 9200:9200
      - 5044:5044
      - 5151:5151

  service1:
    image: myservice1_image
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://127.0.0.1:5151"

  service2:
    image: myservice2_image
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://127.0.0.1:5151"
My question is: how can I add a field (or rather an option under logging) so that each log entry in Logstash ends up with a field whose value tells me whether the log came from service1 or service2?
I kind of managed to do this using the tag option, but the information ends up being part of the message instead of a separate field that I could use for queries in Elasticsearch.
For the time being, kibana displays log entries as follows:
#timestamp:September 26th 2017, 12:00:47.684 syslog_severity_code:5
port:53,422 syslog_facility:user-level #version:1 host:172.18.0.1
syslog_facility_code:1 message:<27>Sep 26 12:00:47 7705a2f9b22a[2128]:
[pid: 94|app: 0|req: 4/7] 172.18.0.1 () {40 vars in 461 bytes} [Tue
Sep 26 09:00:47 2017] GET /api/v1/apikeys => generated 74 bytes in 5
msecs (HTTP/1.1 401) 2 headers in 81 bytes (1 switches on core 0)
type:syslog syslog_severity:notice tags:_grokparsefailure
_id:AV69atD4zBS_tKzDPfyh _type:syslog _index:logstash-2017.09.26 _score: -
From what I know, we also cannot define custom syslog facilities, since they are predefined by the syslog RFC.
Thx.
I ended up using port multiplexing and adding a custom field based on which port the log arrives on:
docker-compose.yml
elk-custom:
  image: some_elk_image
  ports:
    - 5601:5601
    - 9200:9200
    - 5044:5044
    - 5151:5151
    - 5152:5152

service1:
  image: myservice1_image
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://127.0.0.1:5151"

service2:
  image: myservice2_image
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://127.0.0.1:5152"
logstash.conf
input {
  tcp {
    port => 5151
    type => syslog
    add_field => {'received_from' => 'service1'}
  }
  tcp {
    port => 5152
    type => syslog
    add_field => {'received_from' => 'service2'}
  }
}
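With this in place, every event carries a received_from field that can be queried directly in Kibana (e.g. received_from:"service1") or used in Logstash conditionals; a small illustrative sketch:

filter {
  if [received_from] == "service1" {
    # service1-specific parsing could go here
  }
}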