In Objective-C, I want to check whether a text is a proper English sentence or word, judged by its characters rather than its grammar.
For example, texts like "I didn't go!", ""Hi" is a word", "hello world", "a 5 digit number", "the % is high!" and "x@x.com" should pass,
but texts like "#/-5%;l:" should NOT pass.
The text may contain the digits 0-9, the letters a-z and A-Z, and the characters -/:;()$&\"'!?,._
I tried:
NSString *regex1 = @"^[\w:;()'\"\s-]*";
NSPredicate *streamTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex1];
return [streamTest evaluateWithObject:candidate];
But it doesn't do what I want.
Any ideas?
I agree with @borrrden that this is a difficult task for a regex, but one thing you need to do is escape the regex backslashes with another backslash (\). Like this:
NSString *regex1 = @"^[\\w:;()'\"\\s-]*";
The reasoning is that you want the regex engine to "see" the backslash, but the compiler that handles the NSString literal also uses backslashes to escape certain characters. "w" and "s" are not among those characters, so \w and \s are simply translated into w and s, respectively.
A double backslash in a literal string serves to get a single backslash into the compiled string.
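The same effect can be reproduced in any language whose string literals process backslash escapes. Below is a minimal sketch in Ruby rather than Objective-C (an assumption made only so the snippet is easy to run); the variable names are hypothetical. Ruby's double-quoted strings likewise drop the backslash of an unrecognised escape such as \w, so only the doubled form reaches the regex engine intact.

```ruby
# Ruby double-quoted strings process backslash escapes before the regex engine sees the text.
single  = "[\w:;]"   # "\w" is not a recognised string escape, so the backslash is dropped
doubled = "[\\w:;]"  # "\\" yields one backslash, so the regex engine receives \w

puts single    # [w:;]
puts doubled   # [\w:;]

# Build anchored patterns from both strings (hypothetical names, for illustration only).
broken  = Regexp.new("\\A" + single + "*\\z")
working = Regexp.new("\\A" + doubled + "*\\z")

puts broken.match?("hello")   # false: the class only contains w, : and ;
puts working.match?("hello")  # true: \w matches word characters
```

Printing the string before handing it to the regex engine, as done here, is a quick way to see what the engine will actually receive.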
I'm trying to create a regular expression for string comparison.
The regular expression is: .*\bword.*
However, I want to ignore special characters and the comparison should work with and without them.
For example:
O'Reilly should match O'Reilly and oreilly
Is it possible to do this with a regular expression?
P.S.
This is to be used in iOS with NSPredicate.
Currently, the predicate looks like:
NSString *regexString = [NSString stringWithFormat:@".*\b%@.*", word];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"%K matches[cd] %@", keypath, regexString];
Since NSPredicate doesn't allow me to do any operation like replace the value of the keypath to a value without special characters, I need to do it via regular expression.
You might think about preprocessing your string before doing the match. If you have a list of acceptable characters, which judging by your example is just a-z and A-Z, you can use the transliteration operator tr/// to remove all the other characters and lc to lower-case the string. The flags on tr are c, which complements the match (that is, it matches everything that is not listed), and d, which deletes every matched character that has no replacement; as the replacement list is empty, that means everything that matched.
$string =~ tr/a-zA-Z//cd;
$string = lc $string;
If you are using characters outside the ASCII range then you need to be a little cleverer.
$string =~ s/\P{L}+//g;
$string = fc $string;
First off we use a regex to remove any Unicode character that is not in the general category letter. And then we use the fc function to fold case the string, this is the same function that Perl uses to do case insensitive regex matches. Note that you might want to normalise the string first.
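For comparison, the same preprocessing can be sketched in Ruby (chosen here only as a runnable illustration; the helper names normalize_ascii and normalize_unicode are hypothetical). String#delete with a negated set plays the role of tr/a-zA-Z//cd, and downcase(:fold), available from Ruby 2.4, applies Unicode case folding, roughly as fc does in Perl.

```ruby
# Keep only ASCII letters, then lower-case
# (mirrors tr/a-zA-Z//cd followed by lc).
def normalize_ascii(text)
  text.delete("^a-zA-Z").downcase
end

# Remove everything that is not a Unicode letter, then case-fold
# (mirrors s/\P{L}+//g followed by fc).
def normalize_unicode(text)
  text.gsub(/\P{L}+/, "").downcase(:fold)
end

puts normalize_ascii("O'Reilly")    # oreilly
puts normalize_unicode("O'Reilly")  # oreilly
```

Applying the same helper to both the stored value and the search term lets a plain equality or substring comparison stand in for the regex.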
I want to validate a password so that it includes at least 1 Arabic or English letter and at least 1 Arabic or English digit and is at least 8 characters long. My old code, which was written for English only, looked like this:
let passwordRegex = "^(?=.*[A-Za-z])(?=.*\\d)[A-Za-z\\d]{8,}$"
if (!NSPredicate(format:"SELF MATCHES %@",passwordRegex).evaluate(with: password)){
return false
}
Then I found this answer for Arabic characters and digits and tried to merge both like this:
let passwordRegex = "^(?=.*[A-Za-zء-ي])(?=.*٠-٩\\d)[A-Za-zء-ي٠-٩\\d]{8,}$"
if (!NSPredicate(format:"SELF MATCHES %@",passwordRegex).evaluate(with: password)){
return false
}
Please advise what's wrong. Thanks in advance.
Since an English or Arabic letter regex (as described in the answer you linked to; see also this answer) is [a-zA-Z\u0621-\u064A] and an English or Arabic digit regex is [0-9\u0660-\u0669], you may use
let passwordRegex = "^(?=.*[a-zA-Z\\u0621-\\u064A])(?=.*[0-9\\u0660-\\u0669])[a-zA-Z\\u0621-\\u064A0-9\\u0660-\\u0669]{8,}$"
NOTE: you do not need the outer ^ and $ anchors because MATCHES requires the pattern to match the whole string input.
Another way to match an Arabic letter with ICU regex used in Swift is to use [\p{L}&&[\p{script=Arabic}]] (it is an intersection inside a character class, it matches any letter but from the Arabic character set). Same with a digit: [\p{N}&&[\p{script=Arabic}]]. Then, the regex will look like
let passwordRegex = "^(?=.*[\\p{L}&&[\\p{script=Arabic}A-Za-z]])(?=.*[\\p{N}&&[\\p{script=Arabic}0-9]])[\\p{L}\\p{N}&&[\\p{script=Arabic}a-zA-Z0-9]]{8,}$"
So, here
[\\p{L}&&[\\p{script=Arabic}A-Za-z]] - any letter but it should belong to either ASCII letters or Arabic script
[\\p{N}&&[\\p{script=Arabic}0-9]] - any digit but either from 0-9 range or Arabic script
[\\p{L}\\p{N}&&[\\p{script=Arabic}a-zA-Z0-9]] - any letter or digit but only from the ASCII 0-9, A-Z, a-z and Arabic script.
Note also that, in order to match any letter, you may use \p{L}, and to match any digit you may use \d (both are Unicode-aware in the ICU library). So, in case it does not matter whether the letters or digits are Arabic, English, Greek or anything else, you may use
let passwordRegex = "^(?=.*\\p{L})(?=.*\\d)[\\p{L}\\d]{8,}$"
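To make the structure of the first pattern (the one with explicit \u ranges) easier to check, here is a minimal sketch of the same two lookaheads in Ruby. Ruby is an assumption made for runnability; its regex engine is not ICU, so this only illustrates the pattern's logic, not the behaviour of NSPredicate. The constant and method names are hypothetical, and \A and \z stand in for the whole-string match that MATCHES performs.

```ruby
# Letters: ASCII a-z/A-Z plus Arabic U+0621..U+064A.
# Digits:  ASCII 0-9 plus Arabic-Indic U+0660..U+0669.
PASSWORD_PATTERN = /\A(?=.*[a-zA-Z\u0621-\u064A])(?=.*[0-9\u0660-\u0669])[a-zA-Z\u0621-\u064A0-9\u0660-\u0669]{8,}\z/

def valid_password?(password)
  PASSWORD_PATTERN.match?(password)
end

# Five Arabic letters followed by three Arabic-Indic digits (8 characters in total).
arabic = "\u0645\u0631\u062D\u0628\u0627\u0661\u0662\u0663"

puts valid_password?("abcdefg1")  # true
puts valid_password?("abcdefgh")  # false: no digit
puts valid_password?("abc1")      # false: shorter than 8 characters
puts valid_password?(arabic)      # true
```

Each lookahead checks one requirement independently, and the final bracketed class with {8,} enforces both the allowed alphabet and the minimum length.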
Can someone please tell me how I can print something in the following way, "with" double quotes?
"Double Quotes"
With a backslash before the double quote you want to insert in the String:
let sentence = "They said \"It's okay\", didn't they?"
Now sentence is:
They said "It's okay", didn't they?
It's called "escaping" a character: you're using its literal value, it will not be interpreted.
With Swift 4 you can alternatively choose to use the """ delimiter for literal text where there's no need to escape:
let sentence = """
They said "It's okay", didn't they?
Yes, "okay" is what they said.
"""
This gives:
They said "It's okay", didn't they?
Yes, "okay" is what they said.
With Swift 5 you can use enhanced delimiters:
String literals can now be expressed using enhanced delimiters. A string literal with one or more number signs (#) before the opening quote treats backslashes and double-quote characters as literal unless they’re followed by the same number of number signs. Use enhanced delimiters to avoid cluttering string literals that contain many double-quote or backslash characters with extra escapes.
Your string now can be represented as:
let sentence = #"They said "It's okay", didn't they?"#
And if you want to interpolate a variable into your string, you should also add # after the backslash:
let sentence = #"My "homepage" is \#(url)"#
For completeness, from Apple docs:
String literals can include the following special characters:
The escaped special characters \0 (null character), \\ (backslash), \t
(horizontal tab), \n (line feed), \r (carriage return), \" (double
quote) and \' (single quote)
An arbitrary Unicode scalar, written as
\u{n}, where n is a 1–8 digit hexadecimal number with a value equal to
a valid Unicode code point
which means that apart from being able to escape the character with backslash, you can use the unicode value. Following two statements are equivalent:
let myString = "I love \"unnecessary\" quotation marks"
let myString = "I love \u{22}unnecessary\u{22} quotation marks"
myString would now contain:
I love "unnecessary" quotation marks
According to your needs, you may use one of the 4 following patterns in order to print a Swift String that contains double quotes in it.
1. Using escaped double quotation marks
String literals can include special characters such as \":
let string = "A string with \"double quotes\" in it."
print(string) //prints: A string with "double quotes" in it.
2. Using Unicode scalars
String literals can include Unicode scalar value written as \u{n}:
let string = "A string with \u{22}double quotes\u{22} in it."
print(string) //prints: A string with "double quotes" in it.
3. Using multiline string literals (requires Swift 4)
The Swift Programming Language / Strings and Characters states:
Because multiline string literals use three double quotation marks instead of just one, you can include a double quotation mark (") inside of a multiline string literal without escaping it.
let string = """
A string with "double quotes" in it.
"""
print(string) //prints: A string with "double quotes" in it.
4. Using raw string literals (requires Swift 5)
The Swift Programming Language / Strings and Characters states:
You can place a string literal within extended delimiters to include special characters in a string without invoking their effect. You place your string within quotation marks (") and surround that with number signs (#). For example, printing the string literal #"Line 1\nLine 2"# prints the line feed escape sequence (\n) rather than printing the string across two lines.
let string = #"A string with "double quotes" in it."#
print(string) //prints: A string with "double quotes" in it.
I need a regular expression for email validation that accepts +, - and & before the @.
([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4}) does not accept them.
Can anyone provide a proper regular expression?
Example:
demo+wifi-mail&name@gmail.com
Use this one; it will help you.
([\\w-\\.\\+\\-\\&}]+)@((?:[\\w]+\\.)+)([a-zA-Z]{2,4})
NSString *phone = @"demo+wifi-mail&name@gmail.com";
NSString *pNRegex = @"([\\w-\\.\\+\\-\\&}]+)@((?:[\\w]+\\.)+)([a-zA-Z]{2,4})";
NSPredicate *PNTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pNRegex];
BOOL check = [PNTest evaluateWithObject:phone];
NSLog(@"%i", check); // prints 1
This [\w\.] is a character class (I removed the - for now). Every character that is within the square brackets is matched by this class. So, your class is matching all letters, digits and underscores (that is done by the \w part) and dots.
If you want additional characters, just add them to the character class, e.g. [\w.+&-].
Be careful with the - character, it has a special meaning in a character class, either escape it or put it at the start or the end.
But be aware, your regex is still not matching all valid email addresses, see the links in the comments.
Single characters (without special meaning) or predefined classes written alone in a class are redundant: [\w] is exactly the same as \w.
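As a quick check of the advice above, here is a minimal sketch in Ruby (an assumption made for runnability; the original question targets NSPredicate, and the constant name is hypothetical). It uses the suggested class [\w.+&-] with the hyphen placed last so that it is literal, and \A and \z to require a whole-string match. Like the original, it is not a complete email validator.

```ruby
# Local part: word characters plus . + & and -, with the hyphen last so it is literal.
EMAIL_PATTERN = /\A[\w.+&-]+@(?:\w+\.)+[a-zA-Z]{2,4}\z/

puts EMAIL_PATTERN.match?("demo+wifi-mail&name@gmail.com")  # true
puts EMAIL_PATTERN.match?("demo wifi@gmail.com")            # false: space is not in the class
```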
([\w-.+-\&}]+)@((?:[\w]+.)+)([a-zA-Z]{2,4}) Now it will work!
Here's my string:
mystring = %Q{object1="this is, a testyay', asdkf'asfkd", object2="yo ho', ho"}
I am going to split mystring on commas, therefore I want to (temporarily) sub out the commas that lie in between the escaped quotes.
So, I need to match escaped quote + some characters + one or more commas + escaped quote and then gsub the commas in the matched string.
The regex for gsub I came up with is /(".*?),(.*?")/, and I used it like so:
newstring = mystring.gsub(/(".*?),(.*?")/ , "\\1|TEMPSUBSTITUTESTRING|\\2"), but this only replaces the first comma it finds between the escaped quotes.
How can I make it replace all the commas?
Thanks.
I believe this is one way to achieve the results you are wanting.
newstring = mystring.gsub(/".*?,.*?"/) {|s| s.gsub( ",", "|TEMPSUBSTITUTESTRING|" ) }
It passes the matched string (the quoted part) to the code block which then replaces all of the occurrences of the comma. The initial regex could probably be /".*?"/, but it would likely be less efficient since the code block would be invoked for each quoted string even if it did not have a comma.
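A minimal sketch of the full round trip, using the string from the question: mask the quoted commas, split, then restore them. The variable names placeholder, masked and parts are hypothetical, and the final strip is an added convenience, not part of the original answer.

```ruby
mystring = %Q{object1="this is, a testyay', asdkf'asfkd", object2="yo ho', ho"}
placeholder = "|TEMPSUBSTITUTESTRING|"

# 1. Hide every comma that sits inside a double-quoted value.
masked = mystring.gsub(/".*?,.*?"/) { |quoted| quoted.gsub(",", placeholder) }

# 2. Split on the remaining (unquoted) commas, then restore the hidden ones.
parts = masked.split(",").map { |part| part.gsub(placeholder, ",").strip }

puts parts
# object1="this is, a testyay', asdkf'asfkd"
# object2="yo ho', ho"
```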
Don't bother with all that, just split mystring on this regex:
,(?=(?:[^"]*"[^"]*")*[^"]*$)
The lookahead asserts that the comma is followed by an even number of quotes, meaning it's not inside a quoted value.
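Applied to the string from the question, the lookahead split can be sketched as follows (the names unquoted_comma and fields are hypothetical, and the strip call is an added convenience to drop the space after the separating comma). The approach assumes the quotes are balanced and that quoted values contain no escaped double quotes.

```ruby
mystring = %Q{object1="this is, a testyay', asdkf'asfkd", object2="yo ho', ho"}

# Split only on commas that are followed by an even number of double quotes
# up to the end of the string, i.e. commas outside any quoted value.
unquoted_comma = /,(?=(?:[^"]*"[^"]*")*[^"]*$)/
fields = mystring.split(unquoted_comma).map(&:strip)

puts fields.length  # 2
puts fields
# object1="this is, a testyay', asdkf'asfkd"
# object2="yo ho', ho"
```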