I have set up an ELK stack infrastructure with Docker, but I can't see files being processed by Logstash.
Filebeat is configured to send .csv files to Logstash, and from Logstash they should go to Elasticsearch. I can see the Logstash Filebeat listener starting, and the Logstash-to-Elasticsearch pipeline works; however, no documents or indices are written.
Please advise.
filebeat.yml
filebeat.prospectors:
- input_type: log
  paths:
    - logs/sms/*.csv
  document_type: sms
  paths:
    - logs/voip/*.csv
  document_type: voip

output.logstash:
  enabled: true
  hosts: ["logstash:5044"]

logging.to_files: true
logging.files:
logstash.conf
input {
  beats {
    port => "5044"
  }
}
filter {
  if [document_type] == "sms" {
    csv {
      columns => ['Date', 'Time', 'PLAN', 'CALL_TYPE', 'MSIDN', 'IMSI', 'IMEI']
      separator => " "
      skip_empty_columns => true
      quote_char => "'"
    }
  }
  if [document_type] == "voip" {
    csv {
      columns => ['Date', 'Time', 'PostDialDelay', 'Disconnect-Cause', 'Sip-Status', 'Session-Disposition', 'Calling-RTP-Packets-Lost', 'Called-RTP-Packets-Lost', 'Calling-RTP-Avg-Jitter', 'Called-RTP-Avg-Jitter', 'Calling-R-Factor', 'Called-R-Factor', 'Calling-MOS', 'Called-MOS', 'Ingress-SBC', 'Egress-SBC', 'Originating-Trunk-Group', 'Terminating-Trunk-Group']
      separator => " "
      skip_empty_columns => true
      quote_char => "'"
    }
  }
}
output {
  if [document_type] == "sms" {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "smscdr_index"
    }
    stdout {
      codec => rubydebug
    }
  }
  if [document_type] == "voip" {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "voipcdr_index"
    }
    stdout {
      codec => rubydebug
    }
  }
}
Logstash partial logs
[2019-12-05T12:48:38,227][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2019-12-05T12:48:38,411][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x4ffc5251 run>"}
[2019-12-05T12:48:38,949][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-12-05T12:48:39,077][INFO ][org.logstash.beats.Server] Starting server on port: 5044
==========================================================================================
[2019-12-05T12:48:43,518][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://elasticsearch:9200"]}
[2019-12-05T12:48:43,745][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x46e8e60c run>"}
[2019-12-05T12:48:43,780][INFO ][logstash.agent ] Pipelines running {:count=>2, :running_pipelines=>[:".monitoring-logstash", :main], :non_running_pipelines=>[]}
[2019-12-05T12:48:45,770][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
filebeat log sample
2019-12-05T12:55:33.119Z INFO log/harvester.go:255 Harvester started for file: /usr/share/filebeat/logs/voip/voip_cdr_1595.csv
2019-12-05T12:55:33.126Z INFO log/harvester.go:255 Harvester started for file: /usr/share/filebeat/logs/voip/voip_cdr_2004.csv
2019-12-05T12:55:33.130Z INFO log/harvester.go:255 Harvester started for file: /usr/share/filebeat/logs/voip/voip_cdr_2810.csv
======================================================================================================
2019-12-05T13:00:44.002Z INFO log/harvester.go:280 File is inactive: /usr/share/filebeat/logs/voip/voip_cdr_563.csv. Closing because close_inactive of 5m0s reached.
2019-12-05T13:00:44.003Z INFO log/harvester.go:280 File is inactive: /usr/share/filebeat/logs/voip/voip_cdr_2729.csv. Closing because close_inactive of 5m0s reached.
2019-12-05T13:00:44.004Z INFO log/harvester.go:280 File is inactive: /usr/share/filebeat/logs/voip/voip_cdr_2308.csv. Closing because close_inactive of 5m0s reached.
2019-12-05T13:00:49.218Z INFO log/harvester.go:280 File is inactive: /usr/share/filebeat/logs/voip/voip_cdr_981.csv. Closing because close_inactive of 5m0s reached.
docker-compose ps
docker-compose -f docker-compose_stash.yml ps
The system cannot find the path specified.
Name Command State Ports
---------------------------------------------------------------------------------------------------------------------
elasticsearch_cdr /usr/local/bin/docker-entr ... Up 0.0.0.0:9200->9200/tcp, 9300/tcp
filebeat_cdr /usr/local/bin/docker-entr ... Up
kibana_cdr /usr/local/bin/kibana-docker Up 0.0.0.0:5601->5601/tcp
logstash_cdr /usr/local/bin/docker-entr ... Up 0.0.0.0:5000->5000/tcp, 0.0.0.0:5044->5044/tcp, 9600/tcp
In Logstash you have conditionals that check the field document_type, but this field is not generated by Filebeat; you need to correct your Filebeat config.
Try this config for your inputs.
filebeat.prospectors:
- input_type: log
  paths:
    - logs/sms/*.csv
  fields:
    document_type: sms
- input_type: log
  paths:
    - logs/voip/*.csv
  fields:
    document_type: voip
This will create a field named fields with a nested field named document_type, like the example below.
{ "fields" : { "document_type" : "voip" } }
And change your Logstash conditionals to check against the field fields.document_type, like the example below.
if [fields][document_type] == "sms" {
    your filters
}
If you want, you can use the option fields_under_root: true in Filebeat to create document_type in the root of your document, so you will not need to change your Logstash conditionals.
filebeat.prospectors:
- input_type: log
  paths:
    - logs/sms/*.csv
  fields:
    document_type: sms
  fields_under_root: true
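With fields_under_root: true, document_type lands at the root of the event, so conditionals like if [document_type] == "sms" keep working unchanged. Roughly, the published event would look like the sketch below (the message value is just a placeholder).
{ "document_type" : "sms", "message" : "..." }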
Related
Below are my configs for Filebeat and Logstash; I just followed this code, but I'm trying to push the Docker container logs to ClickHouse.
I'm running Filebeat with the Docker command below. I think the ClickHouse log table and the Logstash output do not match.
docker run --rm -it --name=filebeat --user=root \
--volume="/var/log/:/var/log/" \
--volume="/var/lib/docker/containers/:/var/lib/docker/containers/" \
--volume="$(pwd)/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml" \
docker.elastic.co/beats/filebeat:5.6.3 filebeat -e -strict.perms=false
filebeat.yml
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
    - /var/lib/docker/containers/*.log
  exclude_files: ['.gz$']

output.logstash:
  hosts: ['localhost:5044']
logstash.conf
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => [ "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
    remove_field => ["@version", "host", "message", "beat", "offset", "type", "tags", "input_type", "source"]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp", "@timestamp" ]
    target => [ "logdatetime" ]
  }
  ruby {
    code => "tstamp = event.get('logdatetime').to_i
             event.set('logdatetime', Time.at(tstamp).strftime('%Y-%m-%d %H:%M:%S'))
             event.set('logdate', Time.at(tstamp).strftime('%Y-%m-%d'))"
  }
  useragent {
    source => "agent"
  }
  prune {
    interpolate => true
    whitelist_names => ["^logdate$", "^logdatetime$", "^request$", "^agent$", "^os$", "^minor$", "^auth$", "^ident$", "^verb$", "^patch$", "^referrer$", "^major$", "^build$", "^response$", "^bytes$", "^clientip$", "^name$", "^os_name$", "^httpversion$", "^device$"]
  }
}
output {
  clickhouse {
    http_hosts => ["http://localhost:8123"]
    table => "nginx_access"
    request_tolerance => 1
    flush_size => 1000
    pool_max => 1000
  }
}
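For what it's worth, the prune whitelist above only works out if the ClickHouse table has a matching column for each kept field. A hypothetical sketch of such a table (column types are assumptions, not your actual schema):
-- one column per field whitelisted by the prune filter above
CREATE TABLE nginx_access (
    logdate Date,
    logdatetime DateTime,
    clientip String,
    ident String,
    auth String,
    verb String,
    request String,
    httpversion String,
    response UInt16,
    bytes UInt64,
    referrer String,
    agent String,
    name String,
    os String,
    os_name String,
    device String,
    major String,
    minor String,
    patch String,
    build String
) ENGINE = MergeTree()
ORDER BY (logdate, logdatetime);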
filebeat logs
2023/02/09 09:45:22.247315 spooler.go:63: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2023/02/09 09:45:22.247331 prospector.go:124: INFO Starting prospector of type: log; id: 7589403446011535719
2023/02/09 09:45:22.247356 crawler.go:58: INFO Loading and starting Prospectors completed. Enabled prospectors: 1
2023/02/09 09:45:22.247929 log.go:91: INFO Harvester started for file: /var/log/falcond.log
2023/02/09 09:45:22.247982 log.go:91: INFO Harvester started for file: /var/log/yum.log
2023/02/09 09:45:22.248088 log.go:91: INFO Harvester started for file: /var/log/falconctl.log
2023/02/09 09:45:22.248622 log.go:91: INFO Harvester started for file: /var/log/boot.log
2023/02/09 09:45:22.248665 log.go:91: INFO Harvester started for file: /var/log/falcon-sensor.log
2023/02/09 09:45:52.239459 metrics.go:39: INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=5 filebeat.harvester.running=5 filebeat.harvester.started=5 libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=24 libbeat.logstash.publish.write_bytes=3764 libbeat.logstash.published_and_acked_events=77 libbeat.publisher.published_events=77 publish.events=82 registrar.states.current=5 registrar.states.update=82 registrar.writes=2
2023/02/09 09:46:22.239403 metrics.go:34: INFO No non-zero metrics in the last 30s
2023/02/09 09:46:52.239518 metrics.go:34: INFO No non-zero metrics in the last 30s
2023/02/09 09:47:22.239436 metrics.go:34: INFO No non-zero metrics in the last 30s
I'm interested why you have chosen Logstash and not something like Fluent Bit / Vector for log tailing... both of those have native ClickHouse integrations.
https://clickhouse.com/blog/storing-log-data-in-clickhouse-fluent-bit-vector-open-telemetry
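For example, a minimal Vector sketch for this use case; the file name, endpoint, and database below are assumptions, and it presumes the same nginx_access table already exists:
# vector.toml - hypothetical sketch, not a drop-in replacement for the Logstash pipeline
[sources.docker]
type = "docker_logs"                 # tails container logs via the Docker API

[sinks.clickhouse]
type = "clickhouse"
inputs = ["docker"]
endpoint = "http://localhost:8123"   # assumed ClickHouse HTTP port
database = "default"                 # assumed database
table = "nginx_access"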
I have a few Docker containers running on my EC2 instance.
I want to save logs from these containers directly to Logstash (Elastic Cloud).
When I tried to install Filebeat manually, everything worked all right.
I have downloaded it using
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.2.0-linux-x86_64.tar.gz
I have unpacked it, changed filebeat.yml configuration to
filebeat.inputs:
- type: log
  enabled: true
  fields:
    application: "myapp"
  fields_under_root: true
  paths:
    - /var/lib/docker/containers/*/*.log

cloud.id: "iamnotshowingyoumycloudidthisisjustfake"
cloud.auth: "elastic:mypassword"
This worked just fine; I could find my logs by searching for application: "myapp" in Kibana.
However, when I tried to run Filebeat from Docker, no success.
This is filebeat part of my docker-compose.yml
filebeat:
  image: docker.elastic.co/beats/filebeat:7.2.0
  container_name: filebeat
  volumes:
    - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
    - /var/lib/docker/containers:/var/lib/docker/containers:ro
    - /var/run/docker.sock:/var/run/docker.sock # needed for autodiscover
My previous filebeat.yml from the manual setup doesn't work here, so I have tried many examples, but nothing worked. Below is one example which I think should work, but it doesn't: the Docker container starts with no problem, but somehow it can't read from the log files.
filebeat.autodiscover:
  providers:
    - type: docker

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/lib/docker/containers/*/*.log
  json.keys_under_root: true
  json.add_error_key: true
  fields_under_root: true
  fields:
    application: "myapp"

cloud.id: "iamnotshowingyoumycloudidthisisjustfake"
cloud.auth: "elastic:mypassword"
I have also tried something like this
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        config:
          - type: docker
            containers.ids:
              - "*"

filebeat.inputs:
- type: docker
  containers.ids:
    - "*"
  processors:
    - add_docker_metadata:
  fields:
    application: "myapp"
  fields_under_root: true

cloud.id: "iamnotshowingyoumycloudidthisisjustfake"
cloud.auth: "elastic:mypassword"
I have no clue what else to try; the Filebeat logs still show
"harvester":{"open_files":0,"running":0}}
I am 100% sure that logs from containers are under /var/lib/docker/containers/*/*.log ... as I said, Filebeat worked when installed manually, just not as a Docker image.
Any suggestions?
Output log from Filebeat
2019-07-23T05:35:58.128Z INFO instance/beat.go:292 Setup Beat: filebeat; Version: 7.2.0
2019-07-23T05:35:58.128Z INFO [index-management] idxmgmt/std.go:178 Set output.elasticsearch.index to 'filebeat-7.2.0' as ILM is enabled.
2019-07-23T05:35:58.129Z INFO elasticsearch/client.go:166 Elasticsearch url: https://123456789.us-east-1.aws.found.io:443
2019-07-23T05:35:58.129Z INFO [publisher] pipeline/module.go:97 Beat name: e3e5163f622d
2019-07-23T05:35:58.136Z INFO [monitoring] log/log.go:118 Starting metrics logging every 30s
2019-07-23T05:35:58.142Z INFO instance/beat.go:421 filebeat start running.
2019-07-23T05:35:58.142Z INFO registrar/migrate.go:104 No registry home found. Create: /usr/share/filebeat/data/registry/filebeat
2019-07-23T05:35:58.142Z INFO registrar/migrate.go:112 Initialize registry meta file
2019-07-23T05:35:58.144Z INFO registrar/registrar.go:108 No registry file found under: /usr/share/filebeat/data/registry/filebeat/data.json. Creating a new registry file.
2019-07-23T05:35:58.146Z INFO registrar/registrar.go:145 Loading registrar data from /usr/share/filebeat/data/registry/filebeat/data.json
2019-07-23T05:35:58.146Z INFO registrar/registrar.go:152 States Loaded from registrar: 0
2019-07-23T05:35:58.146Z INFO crawler/crawler.go:72 Loading Inputs: 1
2019-07-23T05:35:58.146Z WARN [cfgwarn] docker/input.go:49 DEPRECATED: 'docker' input deprecated. Use 'container' input instead. Will be removed in version: 8.0.0
2019-07-23T05:35:58.150Z INFO log/input.go:148 Configured paths: [/var/lib/docker/containers/*/*.log]
2019-07-23T05:35:58.150Z INFO input/input.go:114 Starting input of type: docker; ID: 11882227825887812171
2019-07-23T05:35:58.150Z INFO crawler/crawler.go:106 Loading and starting Inputs completed. Enabled inputs: 1
2019-07-23T05:35:58.150Z WARN [cfgwarn] docker/docker.go:57 BETA: The docker autodiscover is beta
2019-07-23T05:35:58.153Z INFO [autodiscover] autodiscover/autodiscover.go:105 Starting autodiscover manager
2019-07-23T05:36:28.144Z INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s
{"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":10,"time":{"ms":17}},"total":{"ticks":40,"time":{"ms":52},"value":40},"user":{"ticks":30,"time":{"ms":35}}},"handles":{"limit":{"hard":4096,"soft":1024},"open":9},"info":{"ephemeral_id":"4427db93-2943-4a8d-8c55-6a2e64f19555","uptime":{"ms":30111}},"memstats":{"gc_next":4194304,"memory_alloc":2118672,"memory_total":6463872,"rss":28352512},"runtime":{"goroutines":34}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0},"writes":{"success":1,"total":1}},"system":{"cpu":{"cores":1},"load":{"1":0.31,"15":0.03,"5":0.09,"norm":{"1":0.31,"15":0.03,"5":0.09}}}}}}
Hmm, I don't see anything obvious in the Filebeat config on why it's not working; I have a very similar config running for a 6.x Filebeat.
I would suggest doing a docker inspect on the container and confirming that the mounts are there, and maybe checking permissions, but errors would probably have shown in the logs.
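For example (filebeat here is the container_name from the compose file above):
docker inspect -f '{{ json .Mounts }}' filebeat
This should list the filebeat.yml, /var/lib/docker/containers, and docker.sock mounts; a missing one would explain the zero open files.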
Also, could you try looking into using the container input? I believe it is the recommended method for container logs in 7.2+: https://www.elastic.co/guide/en/beats/filebeat/7.2/filebeat-input-container.html
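A minimal sketch of that input, reusing the paths from your config (not tested against your setup):
filebeat.inputs:
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log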
I'm following this tutorial to get logs from my Docker containers stored in Elasticsearch via Filebeat and Logstash: Link to tutorial
However, nothing is being shown in Kibana, and when I run docker logs on my Filebeat container, I'm getting the following error
2019-03-30T22:22:40.353Z ERROR log/harvester.go:281 Read line error: parsing CRI timestamp: parsing time "-03-30T21:59:16,113][INFO" as "2006-01-02T15:04:05Z07:00": cannot parse "-03-30T21:59:16,113][INFO" as "2006"; File: /usr/share/dockerlogs/data/2f3164397450efdd5851c3fad67fe405ab3dd822bbea1d807a993844e9143d5e/2f3164397450efdd5851c3fad67fe405ab3dd822bbea1d807a993844e9143d5e-json.log
My containers are hosted on a Linux virtual machine, and the virtual machine is running on a Windows machine (not sure if this could be causing the error due to the locations specified).
Below I'll describe what's running and some of the files, in case the article is deleted in the future.
One container is running, which simply executes the following command, printing lines that Filebeat should be able to read:
CMD while true; do sleep 2 ; echo "{\"app\": \"dummy\", \"foo\": \"bar\"}"; done
My filebeat.yml file is as follows
filebeat.inputs:
- type: docker
  combine_partial: true
  containers:
    path: "/usr/share/dockerlogs/data"
    stream: "stdout"
    ids:
      - "*"
  exclude_files: ['\.gz$']
  ignore_older: 10m

processors:
  # decode the log field (sub JSON document) if JSON encoded, then map its fields to Elasticsearch fields
  - decode_json_fields:
      fields: ["log", "message"]
      target: ""
      # overwrite existing target elasticsearch fields while decoding json fields
      overwrite_keys: true
  - add_docker_metadata:
      host: "unix:///var/run/docker.sock"

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# setup filebeat to send output to logstash
output.logstash:
  hosts: ["logstash"]

# Write Filebeat's own logs only to file to avoid catching them with itself in docker log files
logging.level: error
logging.to_files: false
logging.to_syslog: false
logging.metrics.enabled: false
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

ssl.verification_mode: none
Any suggestions on why Filebeat is failing to forward my logs, and how to fix it, would be appreciated. Thanks.
I'm new to Docker and I'm having problems running a simple logstash.conf with Docker.
My Dockerfile:
FROM docker.elastic.co/logstash/logstash:5.0.0
RUN rm -f ~/desktop/docker_logstash/logstash.conf
Logstash.conf:
input {
  file {
    path => "~/desktop/filename.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
Docker commands:
docker build -t logstashexample .
docker run logstashexample
I can build the container, but when I run it, it gets stuck on:
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties.
[2017-11-22T11:08:23,040][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2017-11-22T11:08:24,501][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
[2017-11-22T11:08:24,520][INFO ][logstash.pipeline ] Pipeline main started
[2017-11-22T11:08:24,593][INFO ][org.logstash.beats.Server] Starting server on port: 5044
[2017-11-22T11:08:25,054][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
What am I doing wrong? Thanks.
I am using Docker to host my Logstash and Elasticsearch.
Logstash joins the cluster and then disconnects after 2 seconds.
Below is the exception I am getting.
[2015-08-31 23:30:40,880][INFO ][cluster.service ] [Ms. MODOK] removed {[logstash-da1b6e0a073b-1-11622][G_hYr0mcTZ6G-IOia1g5Cg][da1b6e0a073b][inet[/172.17.5.146:9300]]{data=false, client=true},}, reason: zen-disco-node_failed([logstash-da1b6e0a073b-1-11622][G_hYr0mcTZ6G-IOia1g5Cg][da1b6e0a073b][inet[/172.17.5.146:9300]]{data=false, client=true}), reason transport disconnected
My logstash configuration file.
input {
  stdin { }
}
output {
  elasticsearch {
    host => elasticsearch
  }
  stdout { codec => rubydebug }
}
I missed it.
I needed to add the log file location, create volumes in Docker, and point the Logstash input at them; then everything started working.
file {
  path => [ "/opt/logs/test_log.json" ]
  codec => json {
    charset => "UTF-8"
  }
}
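For reference, a sketch of the Docker volume wiring described above; the host path and container names are hypothetical:
# mount the host directory holding test_log.json at /opt/logs inside the container
docker run -d --name logstash \
  -v /home/user/logs:/opt/logs \
  --link elasticsearch:elasticsearch \
  logstash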
}