Can't see logs coming from fluent forward receiver

Can't see logs coming from fluent forward receiver - fluentd

I am trying to collect application level logs using fluent-bit, I want to listen these logs on otel-collector-contrib
I am generating dummy logs for testing using the command given below
docker run --log-driver=fluentd -t ubuntu echo "Testing a log message"
Here, I am expecting my collector to show the logs on terminal because I have set the loglevel to debug
Git Repo Link : https://github.com/Bhogayata-Keval/fluentbit-demo.git
Did some debugging and found that issue is
failed to parse forward mode event: unknown type of value: 1655265172 at Time at Time at Entries/0
Also found that current timestamp is going as uint64 where expected is int64.
How to typecast timestamp to integer. Time_as_Integer property of [OUTPUT] does not seem to work.

test_agent_log_generate:
image: httpd
ports:
- "80:80"
logging:
driver: "fluentd"
options:
fluentd-address: https://localhost:8006
tag: httpd.access
command: /bin/bash -c "while sleep 2; do echo \"Testing a log message\"; done"
Directly sending data to otel-endpoint like this worked !
Using forward plugin as both, input and output -- was somehow sending timestamp into uint64 format, which is not acceptable for fluentforwardreceiver

Related

How to monitor the size of a mounted directory via Telegraf docker container?

I am not particularly fluent with linux, so would appreciate some help.
My system is running with an external drive mounted at /mnt/SSD_240GB and contains two directories that I am trying to monitor the size of, with Telegraf and InfluxDB. These directories are:
/mnt/SSD_240GB/docker_data/InfluxDB/data
and
/mnt/SSD_240GB/docker_data/InfluxDB/wal
Telegraf and InfluxDB are both running in seperate docker containers.
Following this answer, I have made a shell script which contains the following code, which just uses du to get the sizes of any directories that you pass in as arguments:
#!/bin/bash
echo "["
du -s -B1 "$#" | awk '{if (NR!=1) {printf ",\n"};printf " { \"dir_size_bytes\": "$1", \"path\": \""$2"\" }";}'
echo
echo "]"
Running this script directly inside the Telegraf container works as expected, giving the correct file sizes :
Now I try to get Telegraf to automatically run this script at regular intervals by using the exec plugin, by adding the following code to my telegraf.conf file:
[[inputs.exec]]
commands = [ "/etc/telegraf/scripts/get_disk_usage.sh /mnt/SSD_240GB/docker_data/InfluxDB/wal /mnt/SSD_240GB/docker_data/InfluxDB/data" ]
timeout = "1m"
name_override = "du"
name_suffix = ""
data_format = "json"
tag_keys = [ "path" ]
The problem is that now the data which arrives in InfluxDB does not match this. In fact the file sizes returned are always the same (20480 bytes and 4096 bytes, respectively):
Does anyone know how to resolve this? Thanks!
Here is the telegraf section of the docker-compose.yaml file:
telegraf:
image: telegraf
container_name: telegraf_container
restart: always
ports:
- 8125:8125
networks:
- docker_monitoring_network
volumes:
- /mnt/SSD_240GB/docker_config_files/Telegraf/telegraf.conf:/etc/telegraf/telegraf.conf
- /mnt/SSD_240GB/docker_config_files/Telegraf/scripts:/etc/telegraf/scripts
- /mnt/SSD_240GB:/mnt/SSD_240GB:ro

Keycloak 17.0.1 Import Realm on Docker / Docker-Compose Startup

I am trying to find a way to import a realm in Keycloak version 17.0.1 that can be done at starting up a docker container (with docker-compose). I want to be able to do this in "start" mode and not "start-dev" mode as in my experience so far "start-dev" in 17 is forcing an H2/in-mem database and not allowing me to point to an external db which I would like to do to more closely resemble dev/prod environments when running locally.
Things I've tried:
1) It appears that according to recent conversations on Github (Issue 10216 and Issue 10754 to name a couple) that the environment variable that used to allow this (KEYCLOAK_IMPORT or KC_IMPORT_REALM in some versions) is no longer a trigger for this. In my attempts it also did not work for version 17.0.1.
2) I've also tried appending the following command in my docker-compose setup for keycloak and had no luck (also tried with just "start") - It appears to just ignore the command (no error or anything):
command: ["start-dev", "-Dkeycloak.import=/tmp/my-realm.json"]
3) I tried running the kc.sh command "import" in the Dockerfile (both before and after Entrypoint/start) but got error: Unmatched arguments from index 1: '/opt/keycloak/bin/kc.sh', 'im port', '--file', '/tmp/my-realm.json'
4) I've shifted gears and have tried to see if it is possible to just do it after the container starts (even with manual intervention) just to get some sanity restored. I attempted to use the admin-cli but after quite a few different attempts at different points/endpoints etc. I just get that localhost refuses to connect.
bin/kcadm.sh config credentials --server http://localhost:8080/auth --realm master --user admin --password adminpassword
Responds when hitting the following ports as shown:
8080: Failed to send request - Connect to localhost:8080 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
8443: Failed to send request - localhost:8443 failed to respond
I am sure there are other ways that I've tried and am forgetting - I've kind of spun my wheels at this point.
My code (largely the same as the latest docs on the Keycloak website):
Dockerfile:
FROM quay.io/keycloak/keycloak:17.0.1 as builder
ENV KC_METRICS_ENABLED=true
ENV KC_FEATURES=token-exchange
ENV KC_DB=postgres
RUN /opt/keycloak/bin/kc.sh build
FROM quay.io/keycloak/keycloak:17.0.1
COPY --from=builder /opt/keycloak/lib/quarkus/ /opt/keycloak/lib/quarkus/
WORKDIR /opt/keycloak
# for demonstration purposes only, please make sure to use proper certificates in production instead
ENV KC_HOSTNAME=localhost
RUN keytool -genkeypair -storepass password -storetype PKCS12 -keyalg RSA -keysize 2048 -dname "CN=server" -alias server -ext "SAN:c=DNS:localhost,IP:127.0.0.1" -keystore conf/server.keystore
ENTRYPOINT ["/opt/keycloak/bin/kc.sh", "start" ]
Docker-compose.yml:
version: "3"
services:
keycloak:
build:
context: .
volumes:
- ./my-realm.json:/tmp/my-realm.json:ro
env_file:
- .env
environment:
KC_DB_URL: ${POSTGRESQL_URL}
KC_DB_USERNAME: ${POSTGRESQL_USER}
KC_DB_PASSWORD: ${POSTGRESQL_PASS}
KEYCLOAK_ADMIN: admin
KEYCLOAK_ADMIN_PASSWORD: adminpassword
ports:
- 8080:8080
- 8443:8443 # <-- I've tried with only 8080 and with only 8443 as well. 8443 appears to be the only that I can get the admin console ui to even work on though.
networks:
- my_net
networks:
my_net:
name: my_net
Any suggestion on how to do this in a programmatic + "dev-opsy" way would be greatly appreciated. I'd really like to get this to work but am confused on how to get past this.

Importing realm upon docker initialization thru configuration is not supported yet. See https://github.com/keycloak/keycloak/issues/10216. They might release this feature in next release v18.
The workarounds people had shared in github thread is create own docker image and import the realm thru json file when building it.
FROM quay.io/keycloak/keycloak:17.0.1
# Make the realm configuration available for import
COPY realm-and-users.json /opt/keycloak_import/
# Import the realm and user
RUN /opt/keycloak/bin/kc.sh import --file /opt/keycloak_import/realm-and-users.json
# The Keycloak server is configured to listen on port 8080
EXPOSE 8080
EXPOSE 8443
# Import the realm on start-up
CMD ["start-dev"]

As #tboom said, it was not supported yet by keycloak 17.x. But it is now supported by keycloak 18.x using the --import-realm option :
bin/kc.[sh|bat] [start|start-dev] --import-realm
This feature does not work as it was before. The JSON file path must not be specified anymore: the JSON file only has to be copied in the <KEYCLOAK_DIR>/data/import directory (multiple JSON files supported). Note that the import operation is skipped if the realm already exists, so incremental updates are not possible anymore (at least for the time being).
This feature is documented on https://www.keycloak.org/server/importExport#_importing_a_realm_during_startup.

Filebeat not running using docker-compose: setting 'filebeat.prospectors' has been removed

I'm trying to launch filebeat using docker-compose (I intend to add other services later on) but every time I execute the docker-compose.yml file, the filebeat service always ends up with the following error:
filebeat_1 | 2019-08-01T14:01:02.750Z ERROR instance/beat.go:877 Exiting: 1 error: setting 'filebeat.prospectors' has been removed
filebeat_1 | Exiting: 1 error: setting 'filebeat.prospectors' has been removed
I discovered the error by accessing the docker-compose logs.
My docker-compose file is as simple as it can be at the moment. It simply calls a filebeat Dockerfile and launches the service immediately after.
Next to my Dockerfile for filebeat I have a simple config file (filebeat.yml), which is copied to the container, replacing the default filebeat.yml.
If I execute the Dockerfile using the docker command, the filebeat instance works just fine: it uses my config file and identifies the "output.json" file as well.
I'm currently using version 7.2 of filebeat and I know that the "filebeat.prospectors" isn't being used. I also know for sure that this specific configuration isn't coming from my filebeat.yml file (you'll find it below).
It seems that, when using docker-compose, the container is accessing another configuration file instead of the one that is being copied to the container, by the Dockerfile, but so far I haven't been able to figure it out how, why and how can I fix it...
Here's my docker-compose.yml file:
version: "3.7"
services:
filebeat:
build: "./filebeat"
command: filebeat -e -strict.perms=false
The filebeat.yml file:
filebeat.inputs:
- paths:
- '/usr/share/filebeat/*.json'
fields_under_root: true
fields:
tags: ['json']
output:
logstash:
hosts: ['localhost:5044']
The Dockerfile file:
FROM docker.elastic.co/beats/filebeat:7.2.0
COPY filebeat.yml /usr/share/filebeat/filebeat.yml
COPY output.json /usr/share/filebeat/output.json
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml
RUN mkdir /usr/share/filebeat/dockerlogs
USER filebeat
The output I'm expecting should be similar to the following, which comes from the successful executions I'm getting when I'm executing it as a single container.
The ERROR is expected because I don't have logstash configured at the moment.
INFO crawler/crawler.go:72 Loading Inputs: 1
INFO log/input.go:148 Configured paths: [/usr/share/filebeat/*.json]
INFO input/input.go:114 Starting input of type: log; ID: 2772412032856660548
INFO crawler/crawler.go:106 Loading and starting Inputs completed. Enabled inputs: 1
INFO log/harvester.go:253 Harvester started for file: /usr/share/filebeat/output.json
INFO pipeline/output.go:95 Connecting to backoff(async(tcp://localhost:5044))
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 1 reconnect attempt(s)
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 2 reconnect attempt(s)

I managed to figure out what the problem was.
I needed to map the location of the config file and logs directory in the docker-compose file, using the volumes tag:
version: "3.7"
services:
filebeat:
build: "./filebeat"
command: filebeat -e -strict.perms=false
volumes:
- ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml
- ./filebeat/logs:/usr/share/filebeat/dockerlogs
Finally I just had to execute the docker-compose command and everything start working properly:
docker-compose -f docker-compose.yml up -d

How to know if my program is completely started inside my docker with compose

In my CI chain I execute end-to-end tests after a "docker-compose up". Unfortunately my tests often fail because even if the containers are properly started, the programs contained in my containers are not.
Is there an elegant way to verify that my setup is completely started before running my tests ?

You could poll the required services to confirm they are responding before running the tests.
curl has inbuilt retry logic or it's fairly trivial to build retry logic around some other type of service test.
#!/bin/bash
await(){
local url=${1}
local seconds=${2:-30}
curl --max-time 5 --retry 60 --retry-delay 1 \
--retry-max-time ${seconds} "${url}" \
|| exit 1
}
docker-compose up -d
await http://container_ms1:3000
await http://container_ms2:3000
run-ze-tests
The alternate to polling is an event based system.
If all your services push notifications to an external service, scaeda gave the example of a log file or you could use something like Amazon SNS. Your services emit a "started" event. Then you can subscribe to those events and run whatever you need once everything has started.
Docker 1.12 did add the HEALTHCHECK build command. Maybe this is available via Docker Events?

If you have control over the docker engine in your CI setup you could execute docker logs [Container_Name] and read out the last line which could be emitted by your application.
RESULT=$(docker logs [Container_Name] 2>&1 | grep [Search_String])
logs output example:
Agent pid 13
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
parse specific line:
RESULT=$(docker logs ssh_jenkins_test 2>&1 | grep Enter)
result:
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)

Docker apps logging with Filebeat and Logstash

I have a set of dockerized applications scattered across multiple servers and trying to setup production-level centralized logging with ELK. I'm ok with the ELK part itself, but I'm a little confused about how to forward the logs to my logstashes.
I'm trying to use Filebeat, because of its loadbalance feature.
I'd also like to avoid packing Filebeat (or anything else) into all my dockers, and keep it separated, dockerized or not.
How can I proceed?
I've been trying the following. My Dockers log on stdout so with a non-dockerized Filebeat configured to read from stdin I do:
docker logs -f mycontainer | ./filebeat -e -c filebeat.yml
That appears to work at the beginning. The first logs are forwarded to my logstash. The cached one I guess. But at some point it gets stuck and keep sending the same event
Is that just a bug or am I headed in the wrong direction? What solution have you setup?

Here's one way to forward docker logs to the ELK stack (requires docker >= 1.8 for the gelf log driver):
Start a Logstash container with the gelf input plugin to reads from gelf and outputs to an Elasticsearch host (ES_HOST:port):
docker run --rm -p 12201:12201/udp logstash \
logstash -e 'input { gelf { } } output { elasticsearch { hosts => ["ES_HOST:PORT"] } }'
Now start a Docker container and use the gelf Docker logging driver. Here's a dumb example:
docker run --log-driver=gelf --log-opt gelf-address=udp://localhost:12201 busybox \
/bin/sh -c 'while true; do echo "Hello $(date)"; sleep 1; done'
Load up Kibana and things that would've landed in docker logs are now visible. The gelf source code shows that some handy fields are generated for you (hat-tip: Christophe Labouisse): _container_id, _container_name, _image_id, _image_name, _command, _tag, _created.
If you use docker-compose (make sure to use docker-compose >= 1.5) and add the appropriate settings in docker-compose.yml after starting the logstash container:
log_driver: "gelf"
log_opt:
gelf-address: "udp://localhost:12201"

Docker allows you to specify the logDriver in use. This answer does not care about Filebeat or load balancing.
In a presentation I used syslog to forward the logs to a Logstash (ELK) instance listening on port 5000.
The following command constantly sends messages through syslog to Logstash:
docker run -t -d --log-driver=syslog --log-opt syslog-address=tcp://127.0.0.1:5000 ubuntu /bin/bash -c 'while true; do echo "Hello $(date)"; sleep 1; done'

Using filebeat you can just pipe docker logs output as you've described. Behavior you are seeing definitely sounds like a bug, but can also be the partial line read configuration hitting you (resend partial lines until newline symbol is found).
A problem I see with piping is possible back pressure in case no logstash is available. If filebeat can not send any events, it will buffer up events internally and at some point stop reading from stdin. No idea how/if docker protects from stdout becoming unresponsive. Another problem with piping might be restart behavior of filebeat + docker if you are using docker-compose. docker-compose by default reuses images + image state. So when you restart, you will ship all old logs again (given the underlying log file has not been rotated yet).
Instead of piping you can try to read the log files written by docker to the host system. The default docker log driver is the json log driver . You can and should configure the json log driver to do log-rotation + keep some old files (for buffering up on disk). See max-size and max-file options. The json driver puts one line of 'json' data for every line to be logged. On the docker host system the log files are written to /var/lib/docker/containers/container_id/container_id-json.log . These files will be forwarded by filebeat to logstash. If logstash or network becomes unavailable or filebeat is restarted, it continues forwarding log lines where it left of (given files have been not deleted due to log rotation). No events will be lost. In logstash you can use the json_lines codec or filter to parse the json lines and a grok filter to gain some more information from your logs.
There has been some discussion about using libbeat (used by filebeat for shipping log files) to add a new log driver to docker. Maybe it is possible to collect logs via dockerbeat in the future by using the docker logs api (I'm not aware of any plans about utilising the logs api, though).
Using syslog is also an option. Maybe you can get some syslog relay on your docker host load balancing log events. Or have syslog write log files and use filebeat to forward them. I think rsyslog has at least some failover mode. You can use logstash syslog input plugin and rsyslog to forward logs to logstash with failover support in case the active logstash instance becomes unavailable.

I created my own docker image using the Docker API to collect the logs of the containers running on the machine and ship them to Logstash thanks to Filebeat. No need to install or configure anything on the host.
Check it out and tell me if it suits your needs: https://hub.docker.com/r/bargenson/filebeat/.
The code is available here: https://github.com/bargenson/docker-filebeat

Just for helping others that need to do this, you can simply use Filebeat to ship the logs. I would use the container by #brice-argenson, but I needed SSL support so I went with a locally installed Filebeat instance.
The prospector from filebeat is (repeat for more containers):
- input_type: log
paths:
- /var/lib/docker/containers/<guid>/*.log
document_type: docker_log
fields:
dockercontainer: container_name
It sucks a bit that you need to know the GUIDs as they could change on updates.
On the logstash server, setup the usual filebeat input source for logstash, and use a filter like this:
filter {
if [type] == "docker_log" {
json {
source => "message"
add_field => [ "received_at", "%{#timestamp}" ]
add_field => [ "received_from", "%{host}" ]
}
mutate {
rename => { "log" => "message" }
}
date {
match => [ "time", "ISO8601" ]
}
}
}
This will parse the JSON from the Docker logs, and set the timestamp to the one reported by Docker.
If you are reading logs from the nginx Docker image, you can add this filter as well:
filter {
if [fields][dockercontainer] == "nginx" {
grok {
match => { "message" => "(?m)%{IPORHOST:targethost} %{COMBINEDAPACHELOG}" }
}
mutate {
convert => { "[bytes]" => "integer" }
convert => { "[response]" => "integer" }
}
mutate {
rename => { "bytes" => "http_streamlen" }
rename => { "response" => "http_statuscode" }
}
}
}
The convert/renames are optional, but fixes an oversight in the COMBINEDAPACHELOG expression where it does not cast these values to integers, making them unavailable for aggregation in Kibana.

I verified what erewok wrote above in a comment:
According to the docs, you should be able to use a pattern like this
in your prospectors.paths: /var/lib/docker/containers/*/*.log – erewok
Apr 18 at 21:03
The docker container guids, represented as the first '*', are correctly resolved when filebeat starts up. I do not know what happens as containers are added.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart