Trying to parse a custom log using grok - parsing

I have the following log:
2016-10-20T23:56:42.000+00:00 clientIp:83.149.9.216 TransactionId=1233 TransactionType=Sell
How can I ignore the words clientIp:, TransactionId= and TransactionType= so that only the values are matched?
If I modify my log to look like this:
2016-10-20T23:56:42.000+00:00 83.149.9.216 1233 Sell
And I use this pattern:
%{TIMESTAMP_ISO8601:timestamp} %{IP:clientIp} %{NUMBER:TransactionId} %{WORD:TransactionType}
It works.
So I need a way to read only the values after "word:" or "word=".

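Your pattern can include literals, e.g.
TransactionId=%{NUMBER:TransactionId}
Applied to the whole line from the question, that gives:
%{TIMESTAMP_ISO8601:timestamp} clientIp:%{IP:clientIp} TransactionId=%{NUMBER:TransactionId} TransactionType=%{WORD:TransactionType}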
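The same idea can be sketched outside of grok with an ordinary regular expression. This is only an illustration in Python, not a Logstash configuration: the name LINE_RE, the helper parse_line, and the simplified timestamp and IP sub-patterns are assumptions made for the example, and they are looser than grok's TIMESTAMP_ISO8601 and IP definitions.

```python
import re

# The literal prefixes (clientIp:, TransactionId=, TransactionType=) sit
# outside the named groups: they must be present, but are not captured.
LINE_RE = re.compile(
    r"(?P<timestamp>\S+) "
    r"clientIp:(?P<clientIp>\d{1,3}(?:\.\d{1,3}){3}) "
    r"TransactionId=(?P<TransactionId>\d+) "
    r"TransactionType=(?P<TransactionType>\w+)"
)


def parse_line(line):
    """Return a dict of captured values, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


line = ("2016-10-20T23:56:42.000+00:00 clientIp:83.149.9.216 "
        "TransactionId=1233 TransactionType=Sell")
fields = parse_line(line)
print(fields)
```

Only the values end up in the result; the key names come from the group names, just as they come from the part after the colon in a grok expression.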


In custom Instrument, how can I include the duration in a value

I have a custom Instrument with an os-signpost-interval-schema that captures a "state" string. I would like the final plot value to be <state>: <duration>, but I don't know how to get the duration into the string.
My working schema is the following, which just stores the state itself in the column:
<os-signpost-interval-schema>
    <id>state-interval</id>
    <title>State Interval</title>
    <subsystem>"..."</subsystem>
    <category>"..."</category>
    <name>"state"</name>
    <start-pattern>
        <message>?state</message>
    </start-pattern>
    <column>
        <mnemonic>state</mnemonic>
        <title>State</title>
        <type>string</type>
        <expression>?state</expression>
    </column>
</os-signpost-interval-schema>
I would like to change the expression in the column to (str-cat ?state ": " ?duration), but that fails with:
Variable '?duration' must appear in a pattern element to be used in a later expression.
I don't see any way to compute this later in the graph, lane, or plot. I've also tried explicitly creating a <duration-column>, but that doesn't seem to change anything.
The rest of the pieces include the table:
<create-table>
    <id>state-table</id>
    <schema-ref>state-interval</schema-ref>
</create-table>
And the lane, which I would like to display as <state>: <duration> rather than just the duration:
<lane>
    <title>State</title>
    <table-ref>state-table</table-ref>
    <plot>
        <value-from>state</value-from>
    </plot>
</lane>
This turns out to be impossible within the schema itself: Apple does not expose the duration as a variable. It can be solved by writing a custom modeler, though that adds a lot of complexity.

format timezone using XSLT/xpath 2.0

I need to output a date in this format:
2020-06-03T06:14:00.000+0100
Following this documentation page [1], I tried this expression, but I always get an error:
format-dateTime(current-dateTime(), "[Y0001]-[M01]-[D01]-[H01]:[m01]:[s][Z0000]")
I also tried this mask:
format-dateTime(current-dateTime(), "[Y0001]-[M01]-[D01]-[H01]:[m01]:[s][Z0001]")
but the result is 2020-06-03-14:39:50+02:00
I need to remove the ":" from the offset. Which mask should I use?
[1]https://www.rfc-editor.org/rfc/rfc3339#section-5.6
A workaround is to split the output of format-dateTime into two parts and remove the colon from the second one, which holds the offset:
concat(format-dateTime(current-dateTime(), "[Y0001]-[M01]-[D01]-[H01]:[m01]:[s]"),translate(format-dateTime(current-dateTime(), "[Z0001]"),":",""))
Maybe this works for you.
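The split-and-strip idea is not specific to XPath. The following Python sketch shows the same technique (format the date-time normally, then delete the colon from the offset part only); the function name format_without_offset_colon is hypothetical, and the code assumes a timezone-aware value with a whole-minute offset.

```python
from datetime import datetime, timedelta, timezone


def format_without_offset_colon(dt):
    """Format dt like 2020-06-03T06:14:00.000+0100 (no colon in the offset)."""
    if dt.utcoffset() is None:
        raise ValueError("a timezone-aware datetime is required")
    iso = dt.isoformat(timespec="milliseconds")  # ends in +HH:MM or -HH:MM
    main, offset = iso[:-6], iso[-6:]            # split off the 6-character offset
    return main + offset.replace(":", "")        # strip the colon from the offset only


sample = datetime(2020, 6, 3, 6, 14, 0, tzinfo=timezone(timedelta(hours=1)))
print(format_without_offset_colon(sample))  # 2020-06-03T06:14:00.000+0100
```

Restricting the replacement to the offset keeps the colons in the time portion intact, which is what the translate() call on the separate [Z0001] output achieves in the XPath version.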

Dart Markdown package, how to handle new lines

I am trying to make an internal WYSIWYG tool, and we decided to implement it with contentEditable. However, we save data to our databases as Markdown, so I have to be able to convert from HTML to Markdown and back. For HTML to Markdown I use the html2md package, and for the other direction I use the markdown package.
The issue I've been having is that when you type text like this into my editor:
HEY
After many lines some text
it produces this Markdown:
HEY
After many lines some text
Notably, it uses 2 spaces and 2 LF characters (or at least I think so, but I might be slightly wrong). I solved this issue by parsing it like this:
markdownToHtml(data.replaceAll('&', '&amp;').replaceAll('<', '&lt;').replaceAll('>', '&gt;'), inlineSyntaxes: [TextSyntax(String.fromCharCodes([32,32,10,10]),sub: "<div><br></div>")],inlineOnly: true );
The inlineOnly parameter was necessary because without it the TextSyntax wasn't applied for some reason. However, inlineOnly then came back to bite me when I tried to implement parsing of unordered lists, which are parsed as blocks. So I need a way to correctly parse these empty lines without using inlineOnly.
class EmptyLineBlockSyntax extends BlockSyntax {
  @override
  RegExp get pattern => RegExp(r'^(?:[ \t][ \t]+)$');

  const EmptyLineBlockSyntax();

  @override
  Node parse(BlockParser parser) {
    parser.encounteredBlankLine = true;
    parser.advance();
    return Element('p', [Element.empty('br')]);
  }
}
return markdownToHtml(data.replaceAll('&', '&amp;').replaceAll('<', '&lt;').replaceAll('>', '&gt;'), blockSyntaxes: [EmptyLineBlockSyntax()]);
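To see which lines the block syntax above would claim, the regular expression can be tried on its own. The sketch below uses Python purely to exercise the pattern; WHITESPACE_ONLY and whitespace_only_lines are hypothetical names, and the sample string is an assumed reconstruction of the "two spaces, two line feeds" input described in the question.

```python
import re

# Same expression as the pattern getter of EmptyLineBlockSyntax:
# a line made up of two or more spaces/tabs and nothing else.
WHITESPACE_ONLY = re.compile(r"^(?:[ \t][ \t]+)$")


def whitespace_only_lines(markdown):
    """Return the indexes of the lines the pattern would match."""
    lines = markdown.split("\n")
    return [i for i, line in enumerate(lines) if WHITESPACE_ONLY.match(line)]


sample = "HEY  \n  \nAfter many lines some text"
print(whitespace_only_lines(sample))  # [1]
```

A completely empty line or a line holding a single space is not matched, so only lines that consist of at least two whitespace characters are turned into the `<p><br></p>` element.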

Which settings should be used for TokensregexNER

When I try regexner, it works as expected with the following settings and data:
props.setProperty("annotators", "tokenize, cleanxml, ssplit, pos, lemma, regexner");
Bachelor of Laws DEGREE
Bachelor of (Arts|Laws|Science|Engineering|Divinity) DEGREE
What I would like to do is the same thing using TokensRegex. For example:
Bachelor of Laws DEGREE
Bachelor of ([{tag:NNS}] [{tag:NNP}]) DEGREE
I read that to do this, I should use TokensRegexNERAnnotator.
I tried to use it as follows, but it did not work:
Pipeline.addAnnotator(new TokensRegexNERAnnotator("expressions.txt", true));
I also tried setting up the annotator another way:
props.setProperty("annotators", "tokenize, cleanxml, ssplit, pos, lemma, tokenregexner");
props.setProperty("customAnnotatorClass.tokenregexner", "edu.stanford.nlp.pipeline.TokensRegexNERAnnotator");
I tried different TokensRegex formats, but either the annotator could not find the expression or I got a SyntaxException.
What is the proper way to use TokensRegex (queries on tokens with tags) in an NER data file?
BTW, I just saw a comment in the TokensRegexNERAnnotator.java file. I am not sure whether it is related, but it may mean that POS tags do not work with this annotator:
if (entry.tokensRegex != null) {
// TODO: posTagPatterns...
pattern = TokenSequencePattern.compile(env, entry.tokensRegex);
}
First you need to make a TokensRegex rule file (sample_degree.rules). Here is an example:
ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }
{ pattern: (/Bachelor/ /of/ [{tag:NNP}]), action: Annotate($0, ner, "DEGREE") }
To explain the rule a bit: the pattern field specifies the token sequence to match. The action field says to annotate every token in the overall match (that is what $0 represents), setting the ner field (defined by the ner = ... line at the top of the rule file) to the String "DEGREE".
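What the rule does to a token sequence can be illustrated without CoreNLP. The following Python sketch is not the TokensRegex engine and uses none of its API: the Token class, the annotate_degrees function, and the hand-assigned POS tags are all assumptions made for the illustration. It only mimics the effect of matching /Bachelor/ /of/ [{tag:NNP}] and writing "DEGREE" into the ner field of every matched token.

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    tag: str          # POS tag, assigned by hand for this example
    ner: str = "O"    # "O" stands for "no entity"


def annotate_degrees(tokens):
    """Set ner to DEGREE on every token of a 'Bachelor of <NNP>' match."""
    for i in range(len(tokens) - 2):
        first, second, third = tokens[i:i + 3]
        if first.word == "Bachelor" and second.word == "of" and third.tag == "NNP":
            # Equivalent of Annotate($0, ner, "DEGREE"): $0 is the whole match.
            for token in (first, second, third):
                token.ner = "DEGREE"
    return tokens


sentence = [
    Token("She", "PRP"), Token("holds", "VBZ"), Token("a", "DT"),
    Token("Bachelor", "NNP"), Token("of", "IN"), Token("Laws", "NNP"),
]
labels = [t.ner for t in annotate_degrees(sentence)]
print(labels)
```

The real tagger may assign a different tag to the last word (for instance a plural proper-noun tag), in which case the tag constraint in the rule would need to be widened accordingly.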
Then make this .props file (degree_example.props) for the command:
customAnnotatorClass.tokensregex = edu.stanford.nlp.pipeline.TokensRegexAnnotator
tokensregex.rules = sample_degree.rules
annotators = tokenize,ssplit,pos,lemma,ner,tokensregex
Then run this command:
java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -props degree_example.props -file sample-degree-sentence.txt -outputFormat text
You should see that the three tokens you wanted tagged as "DEGREE" will be tagged.
I think I will push a change to the code to make tokensregex link to the TokensRegexAnnotator so you won't have to specify it as a custom annotator.
But for now you need to add that line in the .props file.
This example should help in implementing this. Here are some more resources if you want to learn more:
http://nlp.stanford.edu/software/tokensregex.shtml#TokensRegexRules
http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/tokensregex/SequenceMatchRules.html
http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/tokensregex/types/Expressions.html

Raw SQL Insert into remote ruby on rails database

I am having trouble inserting data into an SQLite development database.
My app has 2 servers, one that scrapes browsers (browserscraper) and another that serves client requests. Each of them has a production and a development environment.
I'm setting up development to insert the final scraped data into my development client request server, but I can't get the insert to work. I suspect it is related to escaping the content properly, but I have been on Google for several hours trying to figure this out.
Here is the insert going from my scraping app to my remote client app:
@sql_insert = "INSERT INTO #{@table} (`case_number`, `style_of_case`, `circuit`, `judge`, `location`, `disposition`, `date_filed`, `disposition_date`, `case_type`, 'lead_details', 'charge_details')"
@sql_values = " VALUES (#{self.case_number.to_blob}, #{self.style_of_case.to_blob}, #{self.circuit.to_blob}, #{self.judge.to_blob}, #{self.location.to_blob}, #{self.disposition.to_blob}, #{self.date_filed.to_blob}, #{self.disposition_date.to_blob}, #{self.case_type.to_blob}, #{self.lead_details.to_blob}, #{self.charge_details.to_blob});"
@db = SQLite3::Database::new('E:/Sites/aws/db/development.sqlite3')
@db.execute(@sql_insert + @sql_values + "COMMIT;")
The resulting query looks something like this (quite ugly, I know). The last two values that I am inserting are YAML:
INSERT INTO lead_to_processes (`case_number`, `style_of_case`, `circuit`, `judge`, `location`, `disposition`, `date_filed`, `disposition_date`, `case_type`, 'lead_details', 'charge_details') VALUES (130025129, 130025129 - CITY, 1st(Jim, Counties), LOVEKAMP, KELLY LAREE, Schuyler, Plea Written, 03/19/2012, 03/19/201, Municipal Ordinance - Traffic, ---
1-address_line_1: 6150 RICHLAND RD
1-address_line_2: ''
1-city: 'GEORGIA'
1-birth_year: '1955' 
1-is_alive: 1
, ---
1-Description: Not Available }
1-Code: '95220'
);
You're not hacking PHP in 1999 so you shouldn't be using string interpolation to talk to your database. SQLite3::Database#execute supports placeholders, please use them; your execute should look something like this:
@db.execute("insert into #{@table} (case_number, style_of_case, ...) values (?, ?, ...)", [
self.case_number.to_blob,
self.style_of_case.to_blob,
...
])
That way the database interface will take care of all the quoting and escaping and whatnot for you.
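The effect of placeholders is easy to check in isolation. The sketch below uses Python's standard sqlite3 module rather than the Ruby gem from the question, an in-memory database, and a cut-down table with only three of the columns; the table layout and the sample row are assumptions made for the example. The YAML-like value contains quotes, colons, and newlines, yet needs no manual escaping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demonstration
conn.execute(
    "CREATE TABLE lead_to_processes "
    "(case_number TEXT, style_of_case TEXT, lead_details TEXT)"
)

row = (
    "130025129",
    "130025129 - CITY",
    "---\naddress_line_1: 6150 RICHLAND RD\naddress_line_2: ''\ncity: 'GEORGIA'\n",
)

# Each ? is bound to one value by the driver; no quoting is done by hand.
with conn:  # commits on success, rolls back on error
    conn.execute(
        "INSERT INTO lead_to_processes (case_number, style_of_case, lead_details) "
        "VALUES (?, ?, ?)",
        row,
    )

stored = conn.execute(
    "SELECT case_number, style_of_case, lead_details FROM lead_to_processes"
).fetchone()
print(stored == row)  # True
```

Placeholders only stand in for values. A table or column name cannot be bound this way, which is why the table name is still interpolated in the Ruby call above; it should come from trusted code, never from user input.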
I'm not familiar with Ruby or SQLite, but purely looking at your query you have the last two column names quoted incorrectly with single quotes. 'lead_details' and 'charge_details' should not need to be in quotes unless you use back ticks like the other column names.
Further to that, the values you are inserting are not quoted correctly either. Most languages provide a function to escape and quote database strings appropriately.
I would also suggest checking what the actual error message from your insert is as it should help point you towards the problem in situations like this.
INSERT INTO lead_to_processes (case_number, style_of_case, circuit, judge, location, disposition, date_filed, disposition_date, case_type, 'lead_details', 'charge_details') VALUES (130025129, 130025129 - CITY, 1st(Jim, Counties), LOVEKAMP, KELLY LAREE, Schuyler, Plea Written, 03/19/2012, 03/19/201, Municipal Ordinance - Traffic, --- 1-address_line_1: 6150 RICHLAND RD 1-address_line_2: '' 1-city: 'GEORGIA' 1-birth_year: '1955' 1-is_alive: 1 , --- 1-Description: Not Available } 1-Code: '95220' );
It looks like, starting with 130025129 - CITY, your input values are not surrounded with quotes, so the query parser cannot parse it. I would surround each string value with single quotes.
