Parse docker logs with logstash

I have a Docker container that logs to stdout/stderr. Docker saves its output into /var/lib/docker/containers//-logs.json
The log has lines with the following structure
{"log":"This is a message","stream":"stderr","time":"2015-03-12T19:27:27.310818102Z"}
Which input/codec/filter should I use to get only the log field as the message?
Thanks!

Use the json codec to parse the JSON string (you could instead use the json filter), then rename the "log" field to "message" with the mutate filter and finally use the date filter to parse the "time" field.
filter {
  mutate {
    rename => ["log", "message"]
  }
  date {
    match => ["time", "ISO8601"]
    remove_field => ["time"]
  }
}
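For the input side, a minimal sketch assuming Logstash reads the Docker JSON files directly with the file input (the path glob is an assumption; adjust it to your container log layout), combined with the filter above:
input {
  file {
    # Hypothetical path glob; point this at Docker's per-container JSON log files.
    path => "/var/lib/docker/containers/*/*.log"
    # The json codec parses each line into fields; alternatively, drop the codec
    # and use the json filter with source => "message" in the filter block above.
    codec => "json"
  }
}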

Related

Why key of hash is not parsed

I'm working with a hash like this, where the first key is itself a hash:
hash = { { a: 'a', b: 'b' } => { c: { d: 'd', e: 'e' } } }
when I convert it to json, I get this:
data_json = hash.to_json
# => "{\"{:a=\\u003e\\\"a\\\", :b=\\u003e\\\"b\\\"}\":{\"c\":{\"d\":\"d\",\"e\":\"e\"}}}"
But when I parse the data, the first key is not parsed:
JSON.parse data_json
# => {"{:a=>\"a\", :b=>\"b\"}"=>{"c"=>{"d"=>"d", "e"=>"e"}}}
Why does JSON.parse act like that, and how can I fix it?
In your original data structure, you have a Hash containing a single key-value pair, but the key of that pair is itself a Hash. JSON only allows String keys, so Ruby's JSON library tries to do something sensible here: it inspects the key and uses the resulting String as the key in the created JSON object.
Unfortunately, this operation is not reversible when parsing the JSON again. To solve this, you should adapt your source data structure to what JSON allows, i.e. only use String keys in your hashes.
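For example, one hedged way to restructure it is to keep the composite key as ordinary data, so every JSON key is a string (the key/value names below are just illustrative):
require 'json'

# Hypothetical restructuring: the former hash-key becomes a nested value.
hash = { key: { a: 'a', b: 'b' }, value: { c: { d: 'd', e: 'e' } } }

data_json = hash.to_json
# => "{\"key\":{\"a\":\"a\",\"b\":\"b\"},\"value\":{\"c\":{\"d\":\"d\",\"e\":\"e\"}}}"

JSON.parse(data_json)
# => {"key"=>{"a"=>"a", "b"=>"b"}, "value"=>{"c"=>{"d"=>"d", "e"=>"e"}}}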

Extracting hash key that may or may not be an array

I'm making an API call that returns XML (JSON is also available), and the response body shows errors if there are any. There may be only one error or multiple errors. When the XML (or JSON) is parsed into a hash, the key that holds the errors is an array when multiple errors are present, but just a single hash when only one error is present. This makes parsing difficult, as I can't seem to come up with one line of code that fits both cases.
The call to the API returns this when one error
<?xml version="1.0" encoding="utf-8"?><response><version>1.0</version><code>6</code><message>Data validation failed</message><errors><error><parameter>rptFilterValue1</parameter><message>Parameter is too small</message></error></errors></response>
And this when multiple errors
<?xml version="1.0" encoding="utf-8"?><response><version>1.0</version><code>6</code><message>Data validation failed</message><errors><error><parameter>rptFilterValue1</parameter><message>Parameter is too small</message></error><error><parameter>rptFilterValue2</parameter><message>Missing required parameter</message></error></errors></response>
I use the following to convert the XML to a Hash
Hash.from_xml(response.body).deep_symbolize_keys
When there is only one error, the resulting hash looks like this:
{:response=>{:version=>"1.0", :code=>"6", :message=>"Data validation failed", :errors=>{:error=>{:parameter=>"rptFilterValue1", :message=>"Parameter is too small"}}}}
When there are 2 errors, the hash looks like this
{:response=>{:version=>"1.0", :code=>"6", :message=>"Data validation failed", :errors=>{:error=>[{:parameter=>"rptFilterValue1", :message=>"Parameter is too small"}, {:parameter=>"rptFilterValue2", :message=>"Missing required parameter"}]}}}
When I first tested the API response, I had multiple errors so the way I went about getting the error message was like this
data = Hash.from_xml(response.body).deep_symbolize_keys
if data[:response].has_key?(:errors)
  errors = data[:response][:errors][:error].map{|x| "#{x.values[0]} #{x.values[1]}"}
end
However, when there is only one error, the code errors out with undefined method 'values' for parameter.
The only actual workaround I found was to test the class of the error key: when it's an Array I use one method for extracting, and when it's a Hash I use another.
if data[:response][:errors][:error].class == Array
  errors = data[:response][:errors][:error].map{|x| "#{x.values[0]} #{x.values[1]}"}
else
  errors = data[:response][:errors][:error].map{|x| "#{x[1]}"}
end
But I just hate hate hate it. There has to be a way to extract XML/JSON data from a key that may or may not be an array. The solution may be in the conversion from XML to hash rather than in parsing the actual hash. I couldn't find anything online.
I'd appreciate any help or tips.
If you're using Rails, Array.wrap (from ActiveSupport) is available if you can do your .dig first:
single = {:response=>{:version=>"1.0", :code=>"6", :message=>"Data validation failed", :errors=>{:error=>{:parameter=>"rptFilterValue1", :message=>"Parameter is too small"}}}}
Array.wrap(single.dig(:response, :errors, :error))
This returns an Array of size 1:
[
  {
    :message => "Parameter is too small",
    :parameter => "rptFilterValue1"
  }
]
For multiples:
multiple = {:response=>{:version=>"1.0", :code=>"6", :message=>"Data validation failed", :errors=>{:error=>[{:parameter=>"rptFilterValue1", :message=>"Parameter is too small"}, {:parameter=>"rptFilterValue2", :message=>"Missing required parameter"}]}}}
Array.wrap(multiple.dig(:response, :errors, :error))
This returns an Array of size 2:
[
  {
    :message => "Parameter is too small",
    :parameter => "rptFilterValue1"
  },
  {
    :message => "Missing required parameter",
    :parameter => "rptFilterValue2"
  }
]
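Applied to the original extraction, a hedged one-liner (reusing the data variable from the question) covers the one-error, many-error, and no-error cases alike:
data = Hash.from_xml(response.body).deep_symbolize_keys
errors = Array.wrap(data.dig(:response, :errors, :error)).map { |x| "#{x[:parameter]} #{x[:message]}" }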
You can parse the XML with Nokogiri and XPath, which returns a node set (array-like) even when the selector matches a single element:
errors = Nokogiri::XML(xml_response).xpath('//error')
errors.map { |e| e.children.each_with_object({}) { |x, h| h[x.name] = x.content } }
Your API response with a single error gives
=> [{"parameter"=>"rptFilterValue1", "message"=>"Parameter is too small"}]
and the API response with multiple errors gives
=> [{"parameter"=>"rptFilterValue1", "message"=>"Parameter is too small"}, {"parameter"=>"rptFilterValue2", "message"=>"Missing required parameter"}]
If there are no error elements, you'll get an empty array.
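If you then want the same "parameter message" strings as in the question, a short sketch building on that result (error_hashes is just an illustrative name):
error_hashes = errors.map { |e| e.children.each_with_object({}) { |x, h| h[x.name] = x.content } }
error_hashes.map { |h| "#{h['parameter']} #{h['message']}" }
# => ["rptFilterValue1 Parameter is too small"] for the single-error response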

logstash elastic not using source timestamp

Current setup looks like this.
Spring Boot -> log-file.json (using logstash-logback-encoder) -> Filebeat -> Logstash -> Elasticsearch
I am able to see logs appearing in Elasticsearch OK. However, it's not using the dates provided in the log file; it's creating them on the fly.
json-example
{
  "#timestamp":"2017-09-08T17:23:38.677+01:00",
  "#version":1,
  "message":"A received request - withtimestanp",
  etc..
My logstash.conf input/output configuration looks like this.
input {
  beats {
    port => 5044
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => [ 'elasticsearch' ]
  }
}
If you take a look at the Kibana output for the log, it has the 9th, not the 8th (when I actually created the log).
So I have now resolved this. Details of the fix are below.
logback.xml
<appender name="stash" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
    <level>info</level>
  </filter>
  <file>/home/rob/projects/scratch/log-tracing-demo/build/logs/tracing-A.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>/home/rob/projects/scratch/log-tracing-demo/build/logs/tracing-A.log.%d{yyyy-MM-dd}</fileNamePattern>
    <maxHistory>30</maxHistory>
  </rollingPolicy>
  <encoder class="net.logstash.logback.encoder.LogstashEncoder" >
    <includeContext>false</includeContext>
    <fieldNames>
      <message>msg</message>
    </fieldNames>
  </encoder>
</appender>
I renamed the message field to msg, as Logstash expects a different default for message when the event comes in from Beats.
json-file.log
Below is what the sample JSON output looks like:
{"#timestamp":"2017-09-11T14:32:47.920+01:00","#version":1,"msg":"Unregistering JMX-exposed beans","logger_name":"org.springframework.jmx.export.annotation.AnnotationMBeanExporter","thread_name":"Thread-19","level":"INFO","level_value":20000}
filebeat.yml
The JSON settings below now handle the timestamp issue where it didn't use the time from the log file.
They also move the JSON into the root of the event sent to Logstash, i.e. it's not nested within a Beats JSON event but is part of the root.
filebeat.prospectors:
- input_type: log
  paths:
    - /mnt/log/*.log
  json.overwrite_keys: true
  json.keys_under_root: true
  fields_under_root: true

output.logstash:
  hosts: ['logstash:5044']
logstash.conf
Using msg rather than message resolves the JSON parse error; the original data is now in the message field. See here:
https://discuss.elastic.co/t/logstash-issue-with-json-input-from-beats-solved/100039
input {
  beats {
    port => 5044
    codec => "json"
  }
}
filter {
  mutate {
    rename => {"msg" => "message"}
  }
}
output {
  elasticsearch {
    hosts => [ 'elasticsearch' ]
    user => 'elastic'
    password => 'changeme'
  }
}

Two different syntaxes in grok

A normal event could be like this:
2015-11-20 18:50:33,739 [TRE01_0101] [76] [10.117.10.220]
but sometimes I have a log with "default" IP:
2015-11-04 23:14:27,469 [TRE01_0101] [40] [default]
If I have defined in grok a [SYNTAX:SEMANTIC] pattern as follows:
grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{DATA:instance}\] \[%{NUMBER:numeric}\] \[%{IP:client}\]" }
}
How can I parse a log that contains default as the IP?
Right now I'm getting a _grokparsefailure because "default" does not match the IP SYNTAX.
Thanks in advance
You can group things together and then make them conditional:
(%{IP:client}|default)
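For example, a hedged version of the full pattern from the question with that alternation dropped in (a non-capturing group behaves the same):
grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{DATA:instance}\] \[%{NUMBER:numeric}\] \[(?:%{IP:client}|default)\]" }
}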

Parse a log file and send info to sensu

Is there a way to make a Sensu check that takes a .log file as input, parses it, and returns selected info to InfluxDB?
I'm very new to this, so maybe I didn't describe my problem the best way.
I found the best way to do this is with Logstash (mostly because I use ELK for general log aggregation anyway).
Set up a Logstash server.
https://www.elastic.co/products/logstash
Install logstash-forwarder on the client(s). Configure logstash-forwarder to read the logs you want and to send them to your logstash server.
https://github.com/elastic/logstash-forwarder
In the Logstash server's config:
Define a lumberjack input for the log you want to send to Sensu (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-lumberjack.html).
Eg:
input {
  lumberjack {
    port => 5555
    type => "logs"
    tags => ["lumberjack", "influxdb"]
  }
}
Do your processing/filtering.
Eg:
filter {
  if ("influxdb" in [tags]) {
    ...
  }
}
Define an InfluxDB output (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-influxdb.html).
Eg:
output {
  influxdb {
    ...
  }
}
This method would skip Sensu altogether. If you do want to send the logs to Sensu and see the output in Uchiwa, it would involve setting up some Sensu-friendly fields in your Logstash filter:
filter {
  if ("influxdb" in [tags]) {
    mutate {
      add_field => {
        "name" => "SensuCheckName"
        "handler" => "SensuHandlerName"
        "output" => "the stuff you want to send to sensu"
        "status" => "1"
      }
    }
  }
}
And send the logs to Sensu's RabbitMQ transport (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-rabbitmq.html):
output {
  rabbitmq {
    exchange => "results"
    exchange_type => "direct"
    host => "192.168.0.5 or whatever it is"
    vhost => "/sensu"
    user => "sensuUser"
    password => "whateverItIs"
  }
}
Define a Sensu handler for this (name above in logstash filter) and do any extra processing there before passing it to InfluxDB.
If you haven't got Sensu sending data to InfluxDB set up already, go here: https://github.com/sensu-plugins/sensu-plugins-influxdb
