Logstash / GROK: Creating a custom Variable when parsing a log file - parsing

I am new to Grok, although I have managed to create custom regular expressions and write GROK filters in the logstash config file. My problem is as follows:
SOURCE FIELD -
I am parsing a log file, where, every event includes a 'source' field, which is the name of the log file, e.g.:
test.YYYYMMDD_HHMMSS.log
What I want to do is: For each event, where 'source' contains this filename, extract the date and time in the following format within a new field within the Grok Filter:
DD/MM/YYYY HH:MM:SS
I know how to write custom Regular Expressions (REs) in GROK, but I cannot write an RE which will match the data and format it before storing it into a variable. So that is my problem.
Can anyone please help?
Thanks a lot!

Extracting the date from the filename should work. You should be able to match the date parts in the source field with a grok filter and add a new field like so:
filter {
grok {
match => [
"source", "test.%{YEAR:year}%{MONTHNUM2:month}%{DATA:day}_%{HOUR:hour}%{MINUTE:minute}%{SECOND:second}.log"
]
}
mutate { add_field => { "your_new_date_field" => "%{day}/%{month}/%{year} %{hour}:%{minute}:%{second}" } }
}
I don't have the possibility to test this right now but I hope you get the idea.
This solution will create a lot of additional fields like year, month, day and so on. If you want to get rid of the additional fields you can use metadata fields.

Related

Scanning text to find domain types

I'm trying to piece together a part of a create action in a controller that scans the the text entered and intelligently understands what type of domain name it is.
I have a text box called "domain_names". A user puts domains into the box separated by commas, e.g. "yahoo.com, google.com"
In the controller it hits it like this:
#extracted_domains = (params[:domain_names]).split(",")
#extracted_domains.each do |domain|
domain.strip
domain_scan = domain.scan(/(\w+)[.]/).flatten
com_scan = domain.scan(/[.](\w+)/).flatten
new_domain_type = DomainType.find_or_create_by_domain_type(:domain_type => com_scan)
new_domain = Domain.create(:domain => domain_scan, :domain_type_id => new_domain_type.id)
end
In the console it works great. But when I put it into practise I get really odd things stored in the database. For example if :domain was meant to have the value "google", it will instead have the value "---\n- google\n" , when its stored in the database.
No idea why
Thanks in advance.
UPDATE**
Problem: It was putting an array into a string.
Solution: Make it a string.
domain_scan = domain.scan(/(\w+)[.]/).flatten.first
com_scan = domain.scan(/[.](\w+)/).flatten.first
It appears to be fed YAML input. Three dashes at the beginning of the string followed by a newline are a strong indicator of YAML: http://en.wikipedia.org/wiki/YAML#Sample_document
As for your issue, can we see the exact params that are sent?
I would take a look at https://github.com/pauldix/domainatrix for domain extraction.

XML Schema - Allow Invalid Dates

Hi I am using biztalk's FlatFile parser (using XML schema) to part a CSV file. The CSV File sometimes contains invalid date - 1/1/1900. Currently the schema validation for the flat file fails because of invalid date. Is there any setting that I can use to allow the date to be used?
I dont want to read the date as string. I might be forced to if there is no other way.
You could change it to a valid XML date time (e.g., 1900-01-00:00:00Z) using a custom pipeline component (see examples here). Or you can just treat it as a string in your schema and deal with converting it later in a map, in an orchestration, or in a downstream system.
Here is a a C# snippet that you could put into a scripting functoid inside a BizTalk map to convert the string to an xs:dateTime, though you'll need to do some more work if you want to handle the potential for bad input data:
public string ConvertStringDateToDateTime(string param1)
{
return DateTime.Parse(inputDate).ToString("s",System.Globalization.DateTimeFormatInfo.InvariantInfo);
}
Also see this blog post if you're looking to do that in multiple places in a single map.

extract information from particular string

I am new to ruby on rails. I have passed parameters ISDCode,AreaCode and Telephone number using POST from a form.
I have a string with information of the format countryName(ISDCode) passed in the variable ISDCode. For example "United States of America(+1)".
Now I want to save only the value of the ISDCode in the database.
What would be the ideal way to extract the ISD Code from the string?
Should I extract the ISD Code in Javascript before user POSTs the form or should I extract it in the model using a callback ?
Also is regex the only way to extract the information?
Since the string is from auto completion, the ISDcodes should be existing in your database. So the best solution may be including an extra parameter (with a hidden input), like isdcode_id, then you simply use isdcode_id in your model. This way you can avoid the trouble to parse the string.
If this is not feasible, regex could be the best way to extract the information. You can override the setter in the model to do it.
use regular expression to match ISDcode
"United States of America(+1)" =~ /(\+[\d]+)/
puts $1
If you are interested in getting just the ISD Code alone, this should work:
"United States of America(+1)".gsub!(/[^\+\d]/, "")
NB: You can have this in your view helper and just call the helper on the string before persistence
Already answered, but I'd like to offer an alternative to getting the ISD Code:
isd = "United States(+1)"
puts isd[/[+]*[\d]{1,4}/] # +1
This regexp matches:
0001
+1
+01
etc.
I prefer to use js to extract information in the client side and make a validation in the model. By this way, you can get what you want and make sure it's correct.

How to format a date to JSON in Rails?

I need to produce a date in Rails which looks like this:
/Date(1294268400000)/
I have tried various combinations of DateTime, to_i, to_json but never managed to get the /Date()/ thing.
Do I have to simply get my date in ms and then wrap the /Date(and )/ manually, or is there a built in method?
What about (ruby 1.9.x)?:
Time.now.strftime("/Date(%s%L)/")
=> "/Date(1335280866211)/"
You should try
new Date(posixMillisecondsHere)
first. MDN says that calling the Date function outside of the constructor context (i.e., without the new) will always return a string containing a formatted date rather than a Date object.
Strictly speaking, when you do that, you are writing JavaScript and not JSON. JSON cannot contain Date objects.
RFC 4627 says
2.1. Values
A JSON value MUST be an object, array, number, or string, or one of
the following three literal names:
false null true
If you want to put a Date into what is strictly considered JSON and then get it back out, you must choose some way of using the JSON primitives (to wit, objects, arrays, numbers, strings, etc.) to encode a Date.
If you want to get a Date back out of JSON, whatever parses your JSON must understand the convention that you used to encode the Date.
Hope these are credible and/or official enough to help.
What about something like this:
in your config/en.yml file:
en:
time:
formats:
json: "/Date(%s%L)/"
and than in the view:
<%= l(Time.now, :format => :json) %>
Please note that you would need access to the helpers in the method that renders json. So it won't work if you are using ActiveRecord#to_json method for generating jsons.
Check out this question:
c# serialized JSON date to ruby
... simple answer seems to be to create a parse_date method.
It's the UNIX Epoch (seconds since 1970-01-01) right? What about using DateTime#strftime method?
# Taken from the Ruby documentation
seconds_since_1970 = your_date.strftime("%s")
UPDATE: OK, it's milliseconds, according to the documentation you can use your_date.strftime("%Q") to get the ms (but I've not tried yet).

How to get regex to ignore URL strings?

I have the following Regexp to create a hash of values by separating a string at a semicolon:
Hash["photo:chase jarvis".scan(/(.*)\:(.*)/)]
// {'photo' => 'chase jarvis'}
But I also want to be able to have URL's in the string and recognize it so it maintains the URL part in the value side of the hash i.e:
Hash["photo:http://www.chasejarvis.com".scan(/(.*)\:(.*)/)]
// Results in {'photo:http' => '//www.chasejarvis.com'}
I want, of course:
Hash["photo:chase jarvis".scan(/ ... /)]
// {'photo' => 'http://www.chasejarvis.com'}
If you only want to match up to first colon you could change (.*)\:(.*) to ([^:]*)\:(.*).
Alternatively, you could make it a non-greedy match, but I prefer saying "not colon".
How do figure out a person's family name and first name?
Changing chasejarvis to chase and jarvis might not be possible unless you have a solution for that.
Do you already know everyone's name in your project? Nobody is having the initial of a middle name like charvisdjarvis (assuming the name is "Charvis D. Jarvis".)?

Resources