Parsing a custom log format in Logstash

I want to ask for your help parsing my logs, which are in a custom format. I tried to use http://grokdebug.herokuapp.com/ to discover my log format, but unfortunately I didn't succeed. My log has the following format:
#AdressHost#TypeLogs|OrganizationName|user#mail.com|CallMethod|ExecutionTimeOnDB|ExecutionTimeOnAppServer|Date Time
for example:
#mac.frozm.com#CallInfo|Jonsens|jack.lellow#jonsens.com|GetTotalsInfo|19|3|2014-05-11 07:49:10
I tried to use the following pattern:
#%{URIHOST}#CallInfo|Jonsens|%{USER:auth}#%{URIPROTO}|GetTotalsInfo|%{NUMBER:duration}|0|%{DATESTAMP} %{TIME}
but Logstash keeps throwing "_grokparsefailure"
Can you help me, or suggest another way to parse this log in Logstash?

This part:
GetTotalsInfo|19|3|
does not seem to match up with this part of your pattern:
GetTotalsInfo|%{NUMBER:duration}|0|
as you specify a 0 in the pattern where in your example you have a 3.
Here's a general piece of advice when building grok patterns: build them up a small piece at a time, and check that each piece works before adding the next. You can use the same approach to debug patterns that are giving you trouble, testing each individual piece on its own.
For example, just start out with a pattern like this and check to see if it works:
"^#%{URIHOST}"
(the ^ is a regex anchor that guarantees there are no preceding characters)
Build up from there.
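Once each piece matches on its own, a complete pattern for your sample line might end up looking something like this (an untested sketch; the field names are just placeholders, and the literal | separators are escaped because grok patterns are regular expressions):
^#%{URIHOST:host}#%{WORD:logtype}\|%{DATA:organization}\|%{USER:user}#%{HOSTNAME:maildomain}\|%{WORD:method}\|%{NUMBER:db_duration}\|%{NUMBER:app_duration}\|%{TIMESTAMP_ISO8601:timestamp}
TIMESTAMP_ISO8601 is used at the end rather than DATESTAMP because your timestamp is in year-first (ISO) order, and the two NUMBER captures keep the DB and app-server execution times separate.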
I hope this helped!

Related

Use the Jenkins build log variable ${BUILD_LOG} as a substring in the Post Build Actions

I want to expose the Jenkins build log as part of the Post Build Actions. Ideally, I just want to use some of the log, and I'm thinking that I can do this by using a substring of what is produced from the build log.
${BUILD_LOG, maxLines, escapeHtml}
Reference: How can I take last 20 lines from the $BUILD_LOG variable?
The reason I want to use a substring rather than maxLines is that I only want to output part of the log, in this case everything from ERROR: onwards, and the length is variable.
Any help on this would be much appreciated!
Doable. This post covers the hurdles and issues you will need to consider and deal with.
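One thing worth checking before writing custom code: the ${BUILD_LOG, ...} syntax comes from the Email-ext plugin's token macros, and the same plugin provides a BUILD_LOG_REGEX token that can start its output at a matching line. Something along these lines may get you most of the way (double-check the parameter names against your plugin version):
${BUILD_LOG_REGEX, regex="^ERROR:", linesAfter=50, maxMatches=1, showTruncatedLines=false}
The linesAfter value is still a fixed cap, so if the interesting part of the log is genuinely variable in length you may need the substring approach on top of it.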

Seeing Bad parsing rule for Jenkins Log parser plugin

I am trying to use the Log Parser Plugin with Jenkins. The following is my rule file, which I have taken from the sample given at the link.
# match line starting with 'error', case-insensitive
error /(?i)^error/
# list of warnings here...
warning /[Ww]arning/
warning /WARNING/
# create a quick access link to lines in the report containing 'INFO'
info /INFO/
# each line containing 'BUILD' represents the start of a section for grouping errors and warnings found after the line.
# also creates a quick access link.
start /BUILD/
I still see the following at the end of the Parsed Console Output page:
NOTE: Some bad parsing rules have been found:
Bad parsing rule: , Error:1
Bad parsing rule: , Error:1
Bad parsing rule: , Error:1
I did come across this, but it didn't help, as I am not using a space anywhere.
Can someone help me resolve this issue?
It appears you have extra whitespace somewhere in the file that the plugin is interpreting as an attempt to define a rule. Maybe try running it with the empty lines removed. That plugin has given me quite a bit of trouble as well; it's not very well documented (as is the case with many Jenkins plugins).
I had tried no spaces in the pattern, but that did not work. It turns out that the parsing rules file does not support empty lines. Once I removed the empty lines, I no longer got the "Bad parsing rule: , Error:1" messages.
I think that since the line is empty, it doesn't echo any rule after the first colon. It would have been nice if the line number where the problem occurs had been reported.
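If your real rules file is long, a quick way to strip the blank lines is something like this (GNU sed shown; the file name is just a placeholder for your rules file):
sed -i '/^[[:space:]]*$/d' parsing-rules.txt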
I posted the same to this thread too - Log parsing rules in Jenkins
Hopefully, it helps out other folks who may be wondering what is causing this.

Machine parseable error messages

(From https://groups.google.com/d/msg/bazel-discuss/cIBIP-Oyzzw/caesbhdEAAAJ)
What is the recommended way for rules to export information about failures such that downstream tools can include them in UIs?
Example use case:
I ran bazel test //my:target, and one of the actions for //my:target failed because there is an unknown variable "usrname" in my/target.foo at line 7, column 10. The action would also like to report that "username" is a valid variable and a possible intended spelling, and thus to suggest adding an "e".
One way I have thought to do this is to have my action produce a separate file, //my:target.errors, in a separate output group, and have the action write machine-parseable data there in addition to the human-readable data on stdout.
I can then find all of these files and parse the data in them in downstream tools.
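In rough Starlark terms, the idea is something like the sketch below. It is only a sketch: the checker tool and its flags are placeholders, and it assumes the checker exits successfully and records problems in the .errors file rather than failing the action outright, since a failed action would not produce its outputs.
def _checked_build_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".out")
    errors = ctx.actions.declare_file(ctx.label.name + ".errors")
    ctx.actions.run(
        executable = ctx.executable._checker,  # placeholder checker tool
        inputs = ctx.files.srcs,
        outputs = [out, errors],
        arguments = [
            "--out", out.path,
            # hypothetical flag telling the checker where to write machine-parseable errors
            "--errors-json", errors.path,
        ] + [f.path for f in ctx.files.srcs],
    )
    return [
        DefaultInfo(files = depset([out])),
        # Downstream tools can request the extra file with:
        #   bazel build //my:target --output_groups=errors
        OutputGroupInfo(errors = depset([errors])),
    ]

checked_build = rule(
    implementation = _checked_build_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        "_checker": attr.label(
            default = "//tools:checker",  # placeholder
            executable = True,
            cfg = "exec",
        ),
    },
)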
Is there any prior work on this, or does everything just try to parse the human readable output?
I recommend running the error checkers as extra actions.
I don't think Bazel currently has hooks for custom error handlers like you describe. Please consider opening a feature request: https://github.com/bazelbuild/bazel/issues/new

Processing multiline events from a text file in Dataflow

I am attempting to build a Dataflow pipeline to process a text file that contains events spanning multiple lines. The Dataflow SDK's TextIO class assumes each line is a new event.
My plan is to create a new TextReader and register it with the DataPipelineRunner. This new reader will know how to aggregate the multiple lines into a single line.
I am pretty sure this approach will work, but I am wondering whether it is the right way to do it or whether there is a simpler solution.
The text I am trying to parse is:
==============> len:45 pktype:4 mtype:2
SYMBOL: USOCSTIA151632.00
OPEN_INT: 212
PR_OPEN_INTEREST: 212
TIME_STAMP: 04/10/2015 06:30:17:420 val:1428661817
The result should be the last 4 lines concatenated together and the first line dropped.
Best regards,
Peter
Note that TextReader is an internal implementation detail class, so subclassing it would be highly discouraged and challenging to do properly.
The recommended way to define a new file-based format like yours is to subclass FileBasedSource using the user-defined source API.
In your case, I would recommend basing your class on the LineIO example from the documentation, and wrapping the LineReader defined there in your own class, which would use LineReader as a helper for reading individual lines, but:
In startReading() it would skip until the line starting with "====>"
In readNextRecord() it would read lines until the next "====>" and bundle them into a single record.
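To illustrate just the grouping logic, independent of the FileBasedSource/FileBasedReader plumbing (and assuming "====" as the delimiter prefix, inferred from your sample), the per-record reading could look roughly like this:
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class MultilineRecordGrouper {
  // Returns the next record: all lines following a delimiter line, joined together,
  // up to (but not including) the next delimiter or end of input.
  // Returns null when there are no more records. A real FileBasedReader would also
  // have to track offsets so that records split consistently across shards.
  public static String readNextRecord(BufferedReader in) throws IOException {
    List<String> lines = new ArrayList<>();
    String line;
    while ((line = in.readLine()) != null) {
      if (line.startsWith("====")) {
        if (!lines.isEmpty()) {
          break;            // this delimiter starts the *next* record; stop here
        }
        continue;           // this delimiter starts the current record; skip it
      }
      lines.add(line.trim());
    }
    return lines.isEmpty() ? null : String.join(" ", lines);
  }
}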
Please make sure to carefully read the documentation for FileBasedSource and FileBasedReader: the parallelization mechanism relies on the consistency properties described there, which your format has to satisfy to ensure that records are not duplicated or omitted on the boundaries between adjacent processing shards. The XmlSource tests are a good example of how to unit-test these properties.
Please tell us how it goes and report back with any problems or questions - we are very interested in feedback on this API.

Examples of getting it wrong first, on purpose

I just caught myself doing something I do a lot, and wanted to generalize it, express it, share it and see who else is following this general practice, to find some other example situations where it might be relevant.
The general practice is getting something wrong first, on purpose, to establish that everything else is right before undertaking the current task.
What I was trying to do, specifically, was to find examples in our code base where the dojo TextArea widget was used. I knew (because I had it in front of me - existence proof) that the TextBox widget was present in at least one file. So I looked first for what I knew was there:
grep -r digit.form.TextBox | grep -v svn
This wasn't right - I had made a common (for me) mistake of leaving off the star, so I fixed that:
grep -r digit.form.TextBox * | grep -v svn
which found no results! Quick comparison with the file I was looking at showed me I had misspelled "dijit":
grep -r dijit.form.TextBox * | grep -v svn
And now I got results. Cool; doing it wrong first on purpose meant my query was correct except for looking for the wrong thing, so now I could construct the right query:
grep -r dijit.form.TextArea * | grep -v svn
and be confident that when it gave me no results, it was because there are no such files, and not because I had malformed the query.
I'll add three other examples as answers; please add any others you're aware of.
TDD
The red-green-refactor cycle of test-driven development may be the archetype of this practice. With red, demonstrate that the functionality doesn't exist; then make it exist and demonstrate that you've done so by witnessing the green bar.
http://support.microsoft.com/kb/275085
This VBA routine turns off the "subdatasheets" property for every table in your MS Access database. The user is instructed to make sure error-handling is set to "Break only on unhandled errors." The routine identifies tables needing the fix by the error that is thrown. I'm not sure this precisely fits your question, but it's always interesting to me that the error is being used in a non-error way.
Here's an example from VBA:
I also use camel case when I Dim my variables, e.g. ThisIsAnExampleOfCamelCase. As soon as I exit the VBA code line, if Access doesn't change the lower-case variable to camel case, then I know I've got a typo. [Or Option Explicit isn't set, which is the post topic.]
I also use this trick, several times an hour at least.
arrange - assert - act - assert
I sometimes like, in my tests, to add a counter-assertion before the action to show that the action is actually responsible for producing the desired outcome demonstrated by the concluding assertion.
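A minimal sketch of what that looks like as a JUnit 4 test (the Counter class is just a stand-in for whatever is actually under test):
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotEquals;
import org.junit.Test;

public class ArrangeAssertActAssertExample {
  // Trivial stand-in for the object under test.
  static class Counter {
    private int value = 0;
    void increment() { value++; }
    int value() { return value; }
  }

  @Test
  public void incrementRaisesValueToOne() {
    Counter counter = new Counter();          // arrange
    assertNotEquals(1, counter.value());      // counter-assertion: the outcome is not already true
    counter.increment();                      // act
    assertEquals(1, counter.value());         // assert: the action produced the outcome
  }
}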
When in doubt of my spelling, and of my editor's spell-checking
We use many editors. Many of them highlight misspelled words as I type them - some do not. I rely on automatic spell checking, but I can't always remember whether the editor of the moment has that feature. So I'll enter, say, "circuitx" and hit space. If it highlights, I'll back up over the space and the "x" and type another space - and learn that I spelled circuit correctly - but if it doesn't, I'll copy the word and paste it into a known spell-checker to see whether I did.
I'm not sure it's the best way to go about it, as it does not prevent you from misspelling the final command, for example typing "TestArea" or something like that instead of "TextArea" (your fingers only have to slip a little to make such a mistake).
IMHO the best way is to run your "final" command, but on two sample files first: one containing the requested text, and one that doesn't.
In other words, instead of running a "similar" command, run the real one, but over "similar" data.
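Concretely, for the grep example above, that might look like this (the file names are just placeholders):
echo 'dijit.form.TextArea' > should-match.txt
echo 'dijit.form.TextBox' > should-not-match.txt
grep -l dijit.form.TextArea should-match.txt should-not-match.txt
The last command should list exactly one file, should-match.txt; if it lists neither or both, the command itself is suspect.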
(Not sure if this would be a good idea to try for real!)
For example, you might give the system to the users for testing and tell them the password to get started is "Apple".
You know the users are fully up and ready to test (everything is installed and connections to databases working) when they contact you and say the password doesn't work (it's actually "Orange").

Resources