Groovy: Read file and get every matching result back - grep

I've spent the whole working day on this without getting a result. What I want to do is this:
I have a badly formatted text file that contains hundreds of lines like this:
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'stoerungbeheben_moduleaccess_triage_1',
I want to find all strings between ' and ', so that one result is stoerungbeheben_moduleaccess_triage_1, and write them to another .txt file.
The text varies from line to line, though some lines repeat.
I've tried filterLine and a regex pattern, but it doesn't work.
Could you give me a hint how I can do that?
Kind regards
Collin

The following Groovy script produces the desired result (it does not write to a file, but I believe you can easily add that):
def regex = "[0-9]+-[^']+'([^']+)'[^\r\n]*\r?\n?"
def source = """
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'stoerungbeh_¤eben_moduleaccess_triage_1',
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'otherbeheben_üü'
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'stoerungbeheben_moduleaccess_triage_1',
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'thirdhbeheben_äÄ_moduleaccess_triage_1'
2012-02-21 05:16:47,205 ERROR - No KPI mapping found for kpi 'stoerungbeheben_mo&%duleaccess_triage_1',
"""
java.util.regex.Pattern p = java.util.regex.Pattern.compile(regex)
java.util.regex.Matcher m = p.matcher(source)
while (m.find()) {
    println(m.group(1))
}
yields:
stoerungbeh_¤eben_moduleaccess_triage_1
otherbeheben_üü
stoerungbeheben_moduleaccess_triage_1
thirdhbeheben_äÄ_moduleaccess_triage_1
stoerungbeheben_mo&%duleaccess_triage_1
EDIT:
The explanation of the pattern would have been too long for a comment, so I added it to the answer:
The Wikipedia article has a fairly comprehensive table of regex metacharacters: http://en.wikipedia.org/wiki/Regular_expression#Examples
IMO the best way to learn and understand regexes is to write and execute lots of them against various arbitrary strings.
The pattern is far from optimal, but here's some explanation for [0-9]+-[^']+'([^']+)'[^\r\n]*\r?\n?:
[0-9]+- => the + sign means match 1 or more digits from 0 to 9, then stop at the hyphen (example: 2012-). Starting at the date, rather than at a line break, handles input that has no newline as well as the last line.
[^']+' => match 1 or more characters that are not an apostrophe and stop at the apostrophe (example: 02-21 05:16:47,205 ERROR - No KPI mapping found for kpi ').
([^']+)' => match and capture 1 or more characters that are not an apostrophe and stop at the apostrophe (example: stoerungbeheben_moduleaccess_triage_1', where the captured part in parentheses is stoerungbeheben_moduleaccess_triage_1).
[^\r\n]* => match 0 or more characters that are not carriage return (\r) or newline (\n) (example: ,).
\r? => match carriage return if it exists.
\n? => match newline if it exists.

Related

Lua pattern matching for empty string

After trying things out and reading Lua's documentation on patterns, I couldn't figure it out.
I am using an OBS plugin to activate a source when a text document changes, and the plugin uses Lua pattern matching. I would like to trigger the source whenever the document is EMPTY and only when it is empty. How can I go about doing this?
Example, using the non-empty pattern match:
Thank you for any help!
The pattern that matches an empty string is
^$
where
^ - matches the start of string position
$ - matches the end of string position.
That is, start and end of string positions must be the same position in the string.

How can I combine words with numbers when pattern matching in LUA?

I'm trying to match any incoming strings that follow the format Word 100.00% ~(45.56, 34.76) in Lua. As such, I'm looking to do a regex close (in theory) to this:
%D%s[%d%.%d]%%(%d.%d, %d.%d)
But I'm having no luck so far. Lua's patterns are weird.
What am I missing?
Your pattern is close, but you neglected to allow for multiple digits. You can do this by adding a + quantifier, as in %d+.
You also did not use [, ( and . correctly in the pattern.
[ in a pattern starts a set of characters to match; for example, [abc] matches an a, a b, or a c at that position.
( defines a capture, so that specific values are returned rather than the whole string when there is a match. To match a literal ( you need to escape it with a %.
. matches any character rather than specifically a period. Escape it with a % if you want to match a literal period.
local str = "Word 100.00% ~(45.56, 34.76)"
local pattern = "%w+%s%d+%.%d+%%%s~%(%d+%.%d+, %d+%.%d+%)"
print(string.match(str, pattern))
Here the input string is printed if it matches the pattern; otherwise nil is printed.
Suggested resource: Understanding Lua Patterns

Lua: How to start match after a character

I'm trying to make a search feature that allows you to split a search into two when you insert a | character and search after what you typed.
So far I have understood how to keep the main command by capturing before the space.
An example being that if I type :ban user, a box below would still say :ban, but right when I type in a |, it starts the search over again.
:ba
:ba
:ban user|:at
:at
:ban user|:attention members|:kic
:kic
This code:
text=":ban user|:at"
text=text:match("(%S+)%s+(.+)")
print(text)
would still return :ban.
I'm trying to match whatever comes after the final | character.
Then you can use
text=":ban user|:at"
new_text=text:match("^.*%|(.*)")
if new_text == nil then new_text = text end
print(new_text)
See the Lua demo
Explanation:
.* - matches any 0+ characters, as many as possible (in a "greedy" way, since the whole string is grabbed and then backtracking occurs to find...)
%| - the last literal |
(.*) - match and capture any 0+ characters (up to the end of the string).
To avoid special cases, make sure that the string always has |:
function test(s)
    s="|"..s
    print(s:match("^.*|(.*)$"))
end
test":ba"
test":ban user|:at"
test":ban user|:attention members|:kic"

How to find 2 Assignment operator in regex

I am trying to find 2 symbols together, such as "+*" or "-/", and I also want to identify whether it's "3-", "4-", "*4" and so on. I will be looking for it inside an array of strings such as ["2" , "+", "3","/" , "2"]
If I understand your question correctly, you are trying to match a symbol followed by a digit or a digit followed by a symbol.
The regex would look something like this:
/^[+\-*\/]\d$|^\d[+\-*\/]$/
Breakdown
^ - Start of line
[+\-*\/] - Any one of the symbols. The hyphen must be escaped, because an unescaped +-\/ is read as the range from + to / and would also match , and . (the forward slash is escaped because it delimits the regex literal; the asterisk needs no escape inside a character class)
\d - Matches any digit (0 through 9)
$ - End of line
| - Or
^\d[+\-*\/]$ - starts with a digit and ends with a symbol.
Please let me know if this is what you are looking for. Otherwise I can fix this.
In Ruby, let's pretend you have an array as follows
array = ["2", "+", "3", "/", "2"]
You can check whether any two consecutive elements match the above pattern as follows
array.each_cons(2).any? { |combo| combo.join.match(/^[+\-*\/]\d$|^\d[+\-*\/]$/) }
Breakdown
Use each_cons(2) to get every pair of consecutive elements in the array
Use the any? method to find out whether any of those pairs satisfies a condition
In the block, join the two elements of each pair and test the result against the regex pattern
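A minimal sketch of the pairwise check described above, wrapped in a helper. The names OPERATOR_DIGIT and operator_next_to_digit? are made up for this illustration. It uses \A and \z instead of ^ and $ so that the whole joined pair has to match, and it escapes the hyphen inside the character class, since an unescaped +-\/ denotes the range from + to / and would also accept , and . (the last two lines show that difference).

```ruby
# Hypothetical helper names; only each_cons, any?, join and match? are Ruby built-ins.
# Matches exactly "operator then digit" or "digit then operator".
OPERATOR_DIGIT = /\A(?:[+\-*\/]\d|\d[+\-*\/])\z/

# True when any two neighbouring tokens, joined, match OPERATOR_DIGIT.
def operator_next_to_digit?(tokens)
  tokens.each_cons(2).any? { |pair| pair.join.match?(OPERATOR_DIGIT) }
end

puts operator_next_to_digit?(["2", "+", "3", "/", "2"])
puts operator_next_to_digit?(["+", "*"])

# Range pitfall: the first class spans "+" through "/", the second lists the symbols.
puts(/[+-\/]/.match?(","))
puts(/[+\-*\/]/.match?(","))
```

Keeping the pattern in a constant means the character class only has to be written correctly once.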
I don't get the second part about "3-" etc. But the basic idea for the rest is:
your_array.each do |element|
  result = element.match(/[+\-*\/]{2}/)
end
Note that the following characters have to be escaped with a backslash when they should match literally in a Ruby regex:
. | ( ) [ ] { } + \ ^ $ * ?.
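Instead of escaping by hand, the built-in Regexp.escape and Regexp.union can do it. The sketch below is one possible way to find two operators in a row, assuming the tokens are joined into a single string first; the names OPERATORS, ANY_OPERATOR, TWO_OPERATORS and consecutive_operators are made up for this illustration.

```ruby
# Hypothetical constant and method names; Regexp.union, Regexp.escape and scan are Ruby built-ins.
OPERATORS = ["+", "-", "*", "/"].freeze

# Regexp.union escapes each string and joins the alternatives with |.
ANY_OPERATOR = Regexp.union(OPERATORS)

# Two operators directly after one another.
TWO_OPERATORS = /(?:#{ANY_OPERATOR}){2}/

# Joins the tokens and returns every run of two consecutive operators.
def consecutive_operators(tokens)
  tokens.join.scan(TWO_OPERATORS)
end

p Regexp.escape("+*")
p consecutive_operators(["2", "+", "*", "3", "-", "/", "2"])
```

Joining first matters here because each array element holds a single token, so a two-character pattern can never match one element on its own.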

Split lua string into characters

I only found this related to what I am looking for: Split string by count of characters but it is not useful for what I mean.
I have a string variable that holds 3 digits (anything from 000 to 999). I need to separate the digits (characters) and put them into a table.
I am programming for a game mod which uses lua, and it has some extra functions. If you could help me to make it using: http://wiki.multitheftauto.com/wiki/Split would be amazing, but any other way is ok too.
Thanks in advance
Corrected to what the OP wanted to ask:
To just split a 3-digit number in 3 numbers, that's even easier:
s='429'
c1,c2,c3=s:match('(%d)(%d)(%d)')
t={tonumber(c1),tonumber(c2),tonumber(c3)}
The answer to "How do I split a long string composed of 3 digit numbers":
This is trivial. You might take a look at the gmatch function in the reference manual:
s="123456789"
res={}
for num in s:gmatch('%d%d%d') do
    res[#res+1]=tonumber(num)
end
or if you don't like looping:
res={}
s:gsub('%d%d%d',function(n)res[#res+1]=tonumber(n)end)
I was looking for something like this, but avoiding looping - and hopefully having it as one-liner. Eventually, I found this example from lua-users wiki: Split Join:
fields = {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))}
... which is exactly the kind of syntax I'd like - one liner, returns a table - except, I don't really understand what is going on :/ Still, after some poking about, I managed to find the right syntax to split into characters with this idiom, which apparently is:
fields = { str:match( (str:gsub(".", "(.)")) ) }
I guess what happens is that gsub basically puts parentheses '(.)' around each character '.', so that match treats each one as a separate capture and "extracts" them as separate units as well... But I still don't get why there is an extra pair of parentheses around the str:gsub(".", "(.)") piece.
I tested this with Lua5.1:
str = "a - b - c"
fields = { str:match( (str:gsub(".", "(.)")) ) }
print(table_print(fields))
... where table_print is from lua-users wiki: Table Serialization; and this code prints:
"a"
" "
"-"
" "
"b"
" "
"-"
" "
"c"

Resources