For example, string.match does not work correctly here:
print ( string.match("Test/(", "Test/(") )
The second argument is treated as a pattern, not as a plain string.
print(string.match("Test/(", "Test/("))
Will cause the error message
unfinished capture
"Test/(" is not a valid Lua string pattern.
( is one of the magic characters ^$()%.[]*+-? which have to be escaped by prepending % because they otherwise have a special meaning in defining patterns. ( starts a capture. As it is not followed by ) to end it your pattern contains an unfinished capture.
Use "Test/%(" instead to include the parenthesis in your search and avoid the error message.
Please refer to the Lua Reference Manual - Patterns for further details.
Is this what you're looking for?
string.sub(s, i [, j])
s:sub(i [, j])
Return a substring of the string passed. The substring starts at i. If the third argument j is not given, the substring will end at the end of the string. If the third argument is given, the substring ends at and includes j.
You can find documentation of the functions here
I'm looking for a way search in a string for a very specific set of characters: "(),:;<>#[\]
specialChar = str:find("[\"][%(][%)][,][:][;][<][>][#][%[][%]][\\]")
I'm thinking that there will be no pattern to satisfy my need because of the Limitations of Lua patterns.
I have read the Lua Manual pattern matching section pretty thoroughly but still can't seem to figure it out.
Does anyone know a way I can identify if a given string contains any of these characters?
Note, I do not need to know anything about which character or where in the string it is, if that helps.
To check if a string contains ", (, ), ,, :, ;, <, >, #, [, \ or ] you may use
function ContainsSpecialChar(input)
    return string.find(input, "[\"(),:;<>#[\\%]]")
end
See the Lua demo
I need to check whether matching parentheses are present in a string that might have emoticons (like :) or :(). For example, "(:)())()", "(abcd)()ghijk)((mnop)qert)"
I have used the patterns "^[:\\(|:\\)]" to check for emoticons and "\\([^()]*\\)" to check for matching parentheses, but they are not detected. How can I do this?
The really simple solution to this problem is to count the parentheses. Trying to solve it with regular expressions is hard, though extended regular expressions can handle it. Here is a sketch of the simple algorithm:
Set openParenthesisCount to 0
Iterate over the string:
If current character is ( increment openParenthesisCount
If current character is ) decrement openParenthesisCount, if count goes negative then fail (too many closing)
If current character is : lookahead and skip next character if it is a parenthesis (skip smilies)
If openParenthesisCount is zero => succeed
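The steps above could be sketched in Ruby roughly as follows. This is only an illustration: the method name balanced_ignoring_smilies? is made up for this example, and the sketch assumes that any parenthesis directly after a colon belongs to an emoticon.

```ruby
# Hypothetical helper (name invented for this sketch): returns true when the
# parentheses in str are balanced, skipping the emoticons ":)" and ":(".
def balanced_ignoring_smilies?(str)
  open_count = 0
  i = 0
  while i < str.length
    ch = str[i]
    if ch == ":" && ["(", ")"].include?(str[i + 1])
      i += 1                          # skip the parenthesis that belongs to the smiley
    elsif ch == "("
      open_count += 1
    elsif ch == ")"
      open_count -= 1
      return false if open_count < 0  # too many closing parentheses
    end
    i += 1
  end
  open_count.zero?
end

puts balanced_ignoring_smilies?("(:)())()")
puts balanced_ignoring_smilies?("(abcd)()ghijk)((mnop)qert)")
```

With this counting rule the first example from the question is balanced, while the second one fails at the closing parenthesis after "ghijk".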
HTH
As far as I can tell, you want to match a string if and only if it contains matching parentheses, after ignoring every occurrence of ":)" and ":(" in the string, if any.
So, try this:
^((?!:).)*\(.*(?<!:)\).*
It will match the following strings:
()
(abd)
(())(
(:))
(:(:))
(:)())()
(abcd)()ghijk)((mnop)qert)
(abc):
(:abc)
But will NOT match the following:
)(
(:)
(:(
:(:)
:()
(:)
:()(:)
(
)
(abc
abc)
(abc:)
:(abc)
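One way to spot-check a few entries from the two lists is a short Ruby script. The answer does not say which regex engine it targets; Ruby is used here only as an assumption, because its engine supports the lookahead and lookbehind in the pattern. The constant and variable names are made up for the sketch.

```ruby
# The pattern from the answer, written as a Ruby regex literal.
PATTERN = /^((?!:).)*\(.*(?<!:)\).*/

# A few samples taken from the two lists above.
should_match     = ["()", "(abd)", "(())(", "(:))", "(:abc)"]
should_not_match = [")(", "(:)", ":()", "(abc", "(abc:)", ":(abc)"]

should_match.each     { |s| puts "#{s.inspect} matches: #{PATTERN.match?(s)}" }
should_not_match.each { |s| puts "#{s.inspect} matches: #{PATTERN.match?(s)}" }
```

Note that "(())(" is accepted, so the pattern looks for one opening and one later closing parenthesis rather than for strict balance.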
I found in the legacy code the following:
"myString".sub(/^(.)/) {$1.upcase} seems very weird. While executing in IRB, I got the same result as "myString".capitalize
Wasn't able to find the documentation... so ended up on SO
Not exactly,
"myString".capitalize
#=> "Mystring"
"myString".sub(/^(.)/) {$1.upcase}
#=> "MyString"
From the docs for capitalize
Returns a copy of str with the first character converted to uppercase and the remainder to lowercase. Note: case conversion is effective only in ASCII region.
sub accepts an optional block instead of a replacement parameter. If given, it places the sub-matches into global variables, invokes the block, and replaces the matched portion of the string with the block's return value.
The regular expression in question finds the first character at the beginning of a line. It places that character in $1 because it's contained in a sub-match (), invokes the block, which returns $1.upcase.
As an aside, this is a brain-dead way of capitalizing a string. Even if you didn't know about .capitalize or this code is from before .capitalize was available (?), you could still have simply done myString[0] = myString[0].upcase. The only possible benefit is the .sub method will work if the string is empty, where ""[0].upcase will raise an exception. Still, the better way of circumventing that problem is myString[0] = myString[0].upcase if myString.length > 0
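To make the comparison concrete, here is a small sketch of the two approaches side by side, including the empty-string case. The method names upcase_first_sub and upcase_first_index are invented for this example.

```ruby
# Upper-case only the first character via sub with a block.
def upcase_first_sub(str)
  str.sub(/^(.)/) { $1.upcase }
end

# Upper-case only the first character via index assignment, guarded so that
# an empty string does not raise (""[0] is nil).
def upcase_first_index(str)
  copy = str.dup                      # avoid mutating the caller's string
  copy[0] = copy[0].upcase if copy.length > 0
  copy
end

puts upcase_first_sub("myString")      # prints MyString
puts upcase_first_index("myString")    # prints MyString
puts "myString".capitalize             # prints Mystring
puts upcase_first_sub("").inspect      # prints ""
```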
The two are not exactly the same. sub is used to replace the first occurrence of the pattern specified, whereas gsub does it for all occurrences (that is, it replaces globally).
In your question, the regular expression captures the first character as $1, and the block replaces it with $1.upcase.
CODE :
"myString".sub(/^(.)/) {$1.upcase}
OUTPUT :
"MyString"
CODE :
"myString".capitalize
OUTPUT :
"Mystring"
Frustratingly, all my previous attempts at Lua ended in extensive Google searches through more or less the same Lua resources, and then resulted in multi-line code for basic things that I get from Python, for example, with a single command.
Same again: I want to replace a substring in a string, and use e.g.:
string.gsub("My string", "str", "th")
which results in:
My thing 1
I imagine the replacement count can be useful, but who would expect it by default, with no option to suppress it? Maybe I am missing something.
How do I print just the resulting string, without the counter?
Enclose in parentheses: (string.gsub("My string", "str", "th")).
The results are only a problem because you are using print, which takes multiple parameters. Lua allows multiple assignments, so normally the code would look like
newstr, n = string.gsub("My string", "str", "th")
but the count is only provided if there is a place to put it, so
newstr = string.gsub("My string", "str", "th")
is also fine, and causes the count to be discarded. If you are using print directly (the same applies to return) then you should enclose the call in parentheses to discard all but the first result.
Could anybody help me make a proper regular expression to extract a value from a bunch of text in Ruby? I tried a lot but I don't know how to handle variable-length titles.
The string will be of format <sometext>title:"<actual_title>"<sometext>. I want to extract actual_title from this string.
I tried /title:"."/ but it doesn't find any matches, because it expects the closing quotation mark exactly one character after the opening one. I couldn't figure out how to make it allow a variable-length string. Any help is appreciated. Thanks.
. matches any single character. Putting + after a character will match one or more of those characters. So .+ will match one or more characters of any sort. Also, you should put a question mark after it so that it matches the first closing-quotation mark it comes across. So:
/title:"(.+?)"/
The parentheses are necessary if you want to extract the title text that it matched out of there.
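As a rough illustration of why the question mark matters, the snippet below runs the lazy pattern and a greedy variant against a made-up input string (the sample text and variable names are assumptions for this sketch).

```ruby
# Hypothetical input with a second pair of quotes after the title.
text = 'some text title:"A variable length title" more "quoted" text'

# String#[] with a regex and a group index returns that capture, or nil.
lazy   = text[/title:"(.+?)"/, 1]   # stops at the first closing quote
greedy = text[/title:"(.+)"/, 1]    # runs on to the last quote on the line

puts lazy     # prints A variable length title
puts greedy   # prints A variable length title" more "quoted
```

When the input contains no title at all, the same lookup returns nil rather than raising.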
/title:"([^"]*)"/
The parentheses create a capturing group. Inside is first a character class. The ^ means it's negated, so it matches any character that's not a ". The * means 0 or more. You can change it to one or more by using + instead of *.
I like /title:"(.+?)"/ because of its use of lazy matching to stop the .+ consuming all text until the last " on the line is found.
It won't work if the string wraps lines or includes escaped quotes.
In programming languages where you want to be able to include the string delimiter inside a string you usually provide an 'escape' character or sequence.
If your escape character was \ then you could write something like this...
/title:"((?:\\"|[^"])+)"/
This is a railroad diagram. Railroad diagrams show the order in which things are parsed: imagine you are a train starting at the left. You consume title:" and then \" if you can; if you can't, you consume any character that is not a ". The > means this path is preferred, so you try to loop; if you can't, you have to consume a " to finish.
I made this with https://regexper.com/#%2Ftitle%3A%22((%3F%3A%5C%5C%22%7C%5B%5E%22%5D)%2B)%22%2F
but there is now a plugin for Atom text editor too that does this.
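A short Ruby check of the escape-aware pattern against the lazy one, assuming \ is the escape character as described above. The sample string and the constant name ESCAPED_TITLE are made up for this sketch.

```ruby
# Pattern from the answer: an escaped quote (\") is tried before "any non-quote".
ESCAPED_TITLE = /title:"((?:\\"|[^"])+)"/

# In a single-quoted Ruby string, \" stays a literal backslash plus quote.
text = 'x title:"say \"hi\" now" y'

with_escapes = text[ESCAPED_TITLE, 1]       # keeps going past the escaped quotes
lazy_only    = text[/title:"(.+?)"/, 1]     # stops at the first quote it sees

puts with_escapes   # prints say \"hi\" now
puts lazy_only      # prints say \
```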