I want to make a pattern that matches strings like (figure.
I tried
string.find("See this example (figure 1)", "%(%figure$")
But it doesn't work.
Your %(%figure$ pattern is invalid; it throws
missing '[' after '%f' in pattern
because %f introduces a frontier pattern, which must be followed by a character set in square brackets.
You may use
string.match("See this example (figure 1)", "%((figure%s*%d+)%)")
See Lua demo online
Details
%( - a ( char
(figure%s*%d+) - Capturing group (this value will be the output of string.match): figure, zero or more whitespaces (%s*) and then 1+ digits (%d+)
%) - a ) char
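As a minimal sketch of the two cases above (the variable names are only illustrative), pcall can be used to show that the original pattern is rejected while the corrected one returns the captured text:

```lua
local s = "See this example (figure 1)"

-- The original pattern raises an error once the "(" has matched,
-- because "%f" must be followed by a set in square brackets.
local ok, err = pcall(string.find, s, "%(%figure$")
print(ok, err)

-- Escaping only the parentheses and capturing the inner text works.
local label = string.match(s, "%((figure%s*%d+)%)")
print(label)  --> figure 1
```

The exact wording of the error message may vary between Lua versions, so the sketch only relies on pcall returning false.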
Related
I am writing a simple scanner in flex. I want my scanner to print out "integer type seen" when it sees the keyword "int". Is there any difference between the following two ways?
1st way:
%%
int printf("integer type seen");
%%
2nd way:
%%
"int" printf("integer type seen");
%%
So, is there a difference between writing if or "if"? Also, for example when we see a == operator, we print something. Is there a difference between writing == or "==" in the flex file?
There's no difference in these specific cases. The quotes (") just tell lex NOT to interpret any special characters (e.g., regular expression operators) in the quoted string, so if there are no special characters involved, they don't matter:
[a-z] printf("matched a single letter\n");
"[a-z]" printf("matched the 5-character string '[a-z]'\n");
0* printf("matched zero or more zero characters\n");
"0*" printf("matched a zero followed by an asterisk\n");
Characters that are special and mean something different outside of quotes include . * + ? | ^ $ < > [ ] ( ) { } /. Some of those only have special meaning if they appear at certain places, but it's generally clearer to quote them regardless of where they appear if you want to match the literal characters.
I'm trying to match any incoming strings that follow the format Word 100.00% ~(45.56, 34.76) in Lua. As such, I'm looking to do a regex close (in theory) to this:
%D%s[%d%.%d]%%(%d.%d, %d.%d)
But I'm having no luck so far. Lua's patterns are weird.
What am I missing?
Your pattern is close, but you neglected to allow for more than one digit. You can do this by adding a + quantifier, as in %d+.
You also did not use [, ( and . correctly in the pattern.
[ in a pattern starts a set of characters to match; for example, [abc] matches an a, a b or a c at that position.
( defines a capture, so that a match returns the specific values you want rather than the whole matched string. To match it as a literal character, you need to escape it with a %.
. matches any character rather than specifically a dot. Add a % to escape it if you want to match a literal . character.
local str = "Word 100.00% ~(45.56, 34.76)"
local pattern = "%w+%s%d+%.%d+%%%s~%(%d+%.%d+, %d+%.%d+%)"
print(string.match(str, pattern))
This prints the input string if it matches the pattern; otherwise it prints nil.
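If the individual values are needed rather than the whole string, captures can be added to the same pattern. The following is only a sketch; the variable names (word, percent, x, y) are assumptions made for illustration:

```lua
local str = "Word 100.00% ~(45.56, 34.76)"

-- Same pattern as above, with parentheses added around the parts to extract.
local pattern = "(%w+)%s(%d+%.%d+)%%%s~%((%d+%.%d+), (%d+%.%d+)%)"

local word, percent, x, y = string.match(str, pattern)
print(word, percent, x, y)

-- Captures are returned as strings; convert them if numbers are needed.
local px, py = tonumber(x), tonumber(y)
```

When the string does not match, string.match returns nil and all four variables are nil, so a caller would check word before using the rest.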
Suggested resource: Understanding Lua Patterns
In my Ruby on Rails app I need a regex that accepts the following values:
{DD}
{MM}
{YY}
{NN}
{NNN}
{NNNN}
{NNNNN}
{NNNNNN}
upper and lowercase letters
the special characters -, _, . and #
I am still new to regular expressions and I came up with this:
/\A[a-zA-Z._}{#-]*\z/
This works pretty well already, however it also matches strings that should not be allowed such as:
}FOO or {YYY}
Can anybody help?
You may use
/\A(?:\{(?:DD|MM|YY|N{2,6})\}|[A-Za-z_.#-])*\z/
See Rubular demo
\A - start of string anchor
(?:\{(?:DD|MM|YY|N{2,6})\}|[A-Za-z_.#-])* - a non-capturing group ((?:...) that only groups sequences of atoms and does not create submatches/subcaptures) zero or more occurrences of:
\{(?:DD|MM|YY|N{2,6})\} - a {, then DD, MM, YY, or 2 to 6 N characters, followed by }
| - or
[A-Za-z_.#-] - 1 char from the set (ASCII letter, _, ., # or -)
\z - end of string.
I want to run two different lua string find on the same string " (55)"
Pattern 1 "[^%w_](%d+)", should match any number
Pattern 2 "[%(|%)|%%|%+|%=|%-|%{%|%}|%,|%:|%*|%^]", should match any of these ( ) % + = - { } , : * ^ characters.
Both of these patterns return 2, why? Also, if I run a string match, they return 55 and ( respectively (as expected).
It seems you are using the patterns with string.find, which finds the first occurrence of the pattern in the string passed. If an instance of the pattern is found, a pair of values representing the start and end positions of the match is returned. If the pattern cannot be found, nil is returned.
Both patterns find a match at Position 2: [^%w_](%d+) finds ( because it is matched with [^%w_] (a char other than letter, digit or _), and [%(|%)|%%|%+|%=|%-|%{%|%}|%,|%:|%*|%^] matches the ( because it is part of the character set.
However, the first pattern can be re-written using a frontier pattern, %f[%w_]%d+, that will match 1+ digits if not preceded with letters, digits or underscore, and the second pattern does not require such heavy escaping, [()%%+={},:*^-] is enough (only % needs escaping here, as the - is placed at the end of the character set and is thus treated as a literal hyphen).
See this Lua demo:
a = " (55)"
for word in string.gmatch(a, "%f[%w_]%d+") do print(word) end
-- 55
for word in string.gmatch(a, "[()%%+={},:*^-]+") do print(word) end
-- (, )
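To make the difference between string.find and string.match concrete, here is a small sketch on the same input. It uses the simplified character set suggested above for the second pattern; the variable names are illustrative only:

```lua
local a = " (55)"
local p1 = "[^%w_](%d+)"
local p2 = "[()%%+={},:*^-]"

-- string.find returns the start and end indices, followed by any captures.
local s1, e1, cap1 = string.find(a, p1)
print(s1, e1, cap1)  --> 2  4  55

local s2, e2 = string.find(a, p2)
print(s2, e2)  --> 2  2

-- string.match returns the captures, or the whole match if there are none.
local m1 = string.match(a, p1)
local m2 = string.match(a, p2)
print(m1)  --> 55
print(m2)  --> (
```

Both matches start at index 2, which is why string.find reports 2 for either pattern even though the matched text differs.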
Currently I have code that looks like this:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=)[^&]+", "%1hello", 1)
stringToPrint will look like this:
param=hello&456
I have replaced all of the characters before the & with the string "hello". This is where my question becomes a little strange and specific.
I want my string to appear as: param=helloZ&456. In other words, I want to preserve the character right before the & when replacing the string valueZ with hello to make it helloZ instead. How can this be done?
I suggest:
somestring:gsub("param=[^&]*([^&])", "param=hello%1", 1)
See the Lua demo
Here, the pattern matches:
param= - literal substring param=
[^&]* - 0 or more chars other than & as many as possible
([^&]) - Group 1 capturing a symbol other than & (here, backtracking will occur, as the previous pattern grabs all such chars other than & and then the engine will take a step back and place the last char from that chunk into Group 1).
There are probably other ways to do this, but here is one:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=).-([^&]&)", "%1hello%2", 1)
print(stringToPrint)
The idea here is to match the shortest string that ends with a character that is not & followed by a &. Those two ending characters are captured and then added back after the replacement text.
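The two suggestions give the same result on the sample input, but they differ when no & follows the value. The sketch below wraps each one in a helper function; the names keep_last_greedy and keep_last_lazy are hypothetical and only used for this comparison:

```lua
-- Greedy version: [^&]* grabs the whole value, then backtracks one character.
local function keep_last_greedy(s)
  -- The outer parentheses drop gsub's second return value (the count).
  return (s:gsub("param=[^&]*([^&])", "param=hello%1", 1))
end

-- Lazy version: .- stops at the first non-& character followed by &.
local function keep_last_lazy(s)
  return (s:gsub("(param=).-([^&]&)", "%1hello%2", 1))
end

print(keep_last_greedy("param=valueZ&456"))  --> param=helloZ&456
print(keep_last_lazy("param=valueZ&456"))    --> param=helloZ&456

-- Without a trailing &, only the greedy version performs a replacement.
print(keep_last_greedy("param=valueZ"))      --> param=helloZ
print(keep_last_lazy("param=valueZ"))        --> param=valueZ
```

Which behaviour is preferable depends on whether param can be the last parameter in the string.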