Validate a string using regex - ios

I want to validate a string to check if it is alphanumeric and contains "-" and "." with the alphanumeric characters. So I have done something like this to form the regex pattern
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[a-zA-Z0-9\\.\\-]"
options:NSRegularExpressionCaseInsensitive
error:&error];
NSPredicate *regexTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];
BOOL valid = [regexTest evaluateWithObject:URL_Query];
The app crashes, stating that the regex pattern cannot be formed. Can anyone give me a quick fix for what I am doing wrong? Thanks in advance.

You must pass a variable of type NSString to the NSPredicate SELF MATCHES:
NSString * URL_Query = @"PAS.S.1-23-";
NSString * regex = @"[a-zA-Z0-9.-]+";
NSPredicate *regexTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];
BOOL valid = [regexTest evaluateWithObject:URL_Query];
See the Objective C demo
Note that you need no anchors with SELF MATCHES (the regex is anchored by default), and you need to add + to match one or more allowed symbols, or * to match zero or more (to also allow an empty string).
You do not need to escape the hyphen at the start/end of the character class, and the dot inside a character class is treated as a literal dot char.
Also, since both the lower- and uppercase ASCII letter ranges are present in the pattern, you need not pass any case insensitive flags to the regex.
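If you would rather keep an NSRegularExpression object, as in the question, a minimal sketch could look like the one below. The helper name IsValidQuery is made up for illustration. Because NSRegularExpression is not anchored the way SELF MATCHES is, the pattern adds \A and \z itself.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns YES when the whole
// string consists of one or more ASCII letters, digits, dots or hyphens.
static BOOL IsValidQuery(NSString *candidate) {
    if (candidate == nil) {
        return NO;
    }
    NSError *error = nil;
    // \A and \z anchor the pattern to the start and end of the string.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\A[a-zA-Z0-9.-]+\\z"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    NSRange whole = NSMakeRange(0, candidate.length);
    // With both anchors in place there can be at most one match.
    return [regex numberOfMatchesInString:candidate options:0 range:whole] == 1;
}
```

Calling IsValidQuery(@"PAS.S.1-23-") should then give the same answer as the predicate version above.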

Related

NSPredicate with regex capture always gets 0 results

Hi to all overflowers,
I'm scratching my head around putting a regular expression inside an NSPredicate.
I would like to move all our thumbnails from the Documents directory into the Caches directory, and to catch 'em all I've created this regex: _thumb(@[2-3]x)?\.jpg.
Here on regex101.com you can see the above regex working with this test data:
grwior_thumb.jpg <- match
grwior.jpg
vuoetrjrt_thumb@2x.jpg <- match
vuoetrjrt.jpg
hafiruwhf_thumb.jpg <- match
hafiruwhf_thumb@2x.jpg <- match
hafiruwhf_thumb@3x.jpg <- match
hafiruwhf.jpg
But when I put it in the code it's not matching anything:
NSError *error = nil;
NSFileManager *fileManager = [NSFileManager defaultManager];
// Find and move thumbs to the caches folder
NSArray<NSString *> *mediaFilesArray = [fileManager contentsOfDirectoryAtPath:documentsPath error:&error];
NSString *regex = @"_thumb(@[2-3]x)?\\.jpg";
NSPredicate *thumbPredicate = [NSPredicate predicateWithFormat: @"SELF ENDSWITH %@", regex];
NSArray<NSString *> *thumbFileArray = [mediaFilesArray filteredArrayUsingPredicate:thumbPredicate];
thumbFileArray always has 0 elements...
What am I doing wrong?
Use MATCHES rather than ENDSWITH, because ENDSWITH does not treat its right-hand side as a regular expression. MATCHES requires a full-string match, however, so the pattern must also match the characters that come before the _.
Use
NSString *regex = @".*_thumb(@[23]x)?\\.jpg";
And then
[NSPredicate predicateWithFormat: @"SELF MATCHES %@", regex]
The .* will match any 0+ chars other than line break chars, as many as possible.
Note that if you just want to match either 2 or 3, you might as well write [23], no need for a - range operator here.
You may also replace (@[23]x)? with (?:@[23]x)?, i.e. change the capturing group to a non-capturing one, since you do not seem to need the submatch to be accessible later. If you do, keep the optional capturing group.
The problem is with ENDSWITH.
ENDSWITH
The left-hand expression ends with the right-hand expression.
MATCHES
The left hand expression equals the right hand expression using a regex-style comparison according to ICU v3
What you need is
NSString *regex = @".+_thumb(@[2-3]x)?\\.jpg";
NSPredicate *thumbPredicate = [NSPredicate predicateWithFormat: @"SELF MATCHES %@", regex];
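As a rough sketch of the MATCHES approach from the answers above, the filter can be wrapped in a small function and tried on a few of the sample names from the question. ThumbnailNames is a hypothetical name, and the non-capturing group is just one of the variants suggested above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps only names ending in _thumb.jpg,
// _thumb@2x.jpg or _thumb@3x.jpg.
static NSArray<NSString *> *ThumbnailNames(NSArray<NSString *> *names) {
    // The leading .* is needed because MATCHES must cover the whole string.
    NSString *regex = @".*_thumb(?:@[23]x)?\\.jpg";
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];
    return [names filteredArrayUsingPredicate:predicate];
}
```

Passing in the array returned by contentsOfDirectoryAtPath:error: should leave only the thumbnail file names.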

Is there a way to check if a string contains a Unicode letter?

In Cocoa, regular expressions are presumably following the ICU Unicode rules for character matching and the ICU standard includes character properties such as \p{L} for matching all kinds of Unicode letters. However
NSString* str = @"A";
NSPredicate* pred = [NSPredicate predicateWithFormat:@"SELF MATCHES '\\p{L}'"];
NSLog(@"%d", [pred evaluateWithObject:str]);
doesn't seem to compile:
Can't do regex matching, reason: Can't open pattern U_REGEX_BAD_INTERVAL (string A, pattern p{L}, case 0, canon 0)
If character properties are not supported (are they?), how else could I check if a string contains a Unicode letter in my iOS app?
The main point here is that MATCHES requires a full string match, and also, \ backslash passed to the regex engine should be a literal backslash.
The regex can thus be
(?s).*\p{L}.*
Which means:
(?s) - enable DOTALL mode
.* - match 0 or more of any character
\p{L} - match a Unicode letter
.* - match 0 or more of any character.
In iOS, just double the backslashes:
NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\p{L}.*'"];
See IDEONE demo
If the backslashes inside the NSPredicate format string are treated specially, use:
NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\\\p{L}.*'"];
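If you want to sidestep the predicate format string and its escaping altogether, one alternative (not part of the answer above) is a plain regex search on the string. The helper name ContainsUnicodeLetter is an assumption for illustration; rangeOfString:options: with NSRegularExpressionSearch looks for the pattern anywhere in the string, so no surrounding .* is needed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the string contains at least one character
// with the Unicode general category "Letter" (\p{L}).
static BOOL ContainsUnicodeLetter(NSString *text) {
    if (text == nil) {
        return NO;
    }
    // A regex search is unanchored, unlike SELF MATCHES.
    NSRange found = [text rangeOfString:@"\\p{L}"
                                options:NSRegularExpressionSearch];
    return found.location != NSNotFound;
}
```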

One uppercase letter validation regex

I am working on a regex validation for an alphanumeric character with a length of 4 but contains only one Uppercase letter.
This is the code I have:
NSRegularExpression *expression = [NSRegularExpression regularExpressionWithPattern:@"(?=.*[0-9])(?=.*[A-Z])[a-zA-Z0-9]{4}" options:NSRegularExpressionCaseInsensitive error:&error];
However, it does not perform the check correctly. How can I do it?
Change your pattern like this,
@"^(?=.*[0-9])(?=[^A-Z]*[A-Z][^A-Z]*$)[a-zA-Z0-9]{4}$"
I think this would be enough for you
@"^(?=.*\\d)(?=.*[A-Z]).{4}$"
Or if you want to give minimum and maximum length then use below snippet
@"^(?=.*\\d)(?=.*[A-Z]).{4,15}$"
Here 4 would be the minimum length and 15 would be maximum length for your string
If you need to differentiate upper- and lowercase letters, you need to remove NSRegularExpressionCaseInsensitive option. It removes the differentiation between the lower and upper case.
Once you remove it, use the following regex (if you need to support Unicode letters):
@"\\A(?=\\D*\\d)(?=\\P{Lu}*\\p{Lu}\\P{Lu}*\\z)[\\p{L}\\d]{4}\\z"
Or just ASCII:
@"\\A(?=\\D*\\d)(?=[^A-Z]*[A-Z][^A-Z]*\\z)[A-Za-z\\d]{4}\\z"
See another regex demo
NSRegularExpression *expression = [
NSRegularExpression regularExpressionWithPattern:@"\\A(?=\\D*\\d)(?=[^A-Z]*[A-Z][^A-Z]*\\z)[A-Za-z\\d]{4}\\z"
options:0
error:&error];
Regex breakdown:
\A - unambiguous start of string
(?=\D*\d) - check if there is at least 1 digit after 0 or more non-digits (\D*)
(?=\P{Lu}*\p{Lu}\P{Lu}*\z) - check if there is ONLY 1 uppercase letter (\p{Lu}) in between 0 or more characters other than uppercase letters (\P{Lu})
[\p{L}\d]{4} - exactly 4 characters that are either a letter (lower- or uppercase) or a digit.
\z - match unambiguous end of string.
IDEONE demo resulting in "yes":
NSString * s = @"e3Df";
NSString * rx = @"\\A(?=\\D*\\d)(?=[^A-Z]*[A-Z][^A-Z]*\\z)[A-Za-z\\d]{4}\\z";
NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", rx];
if ([predicat evaluateWithObject:s]) {
NSLog (@"yes");
}
else {
NSLog (@"no");
}

iOS - NSString regex match

I have a string for example:
NSString *str = @"Strängnäs";
Then I use a method to replace Scandinavian letters with *, so it would be:
NSString *strReplaced = @"Str*ngn*s";
I need a function to match str with strReplaced. In other words, the * should be treated as any character ( * should match with any character).
How can I achieve this?
Strängnäs should be equal to Str*ngn*s
EDIT:
Maybe I wasn't clear enough. I want * to be treated as any character. So when doing [@"Strängnäs" isEqualToString:@"Str*ngn*s"] it should return YES
I think the following regex pattern will match all non-ASCII text considering that Scandinavian letters are not ASCII:
[^ -~]
Treat each line separately to avoid matching the newline character and replace the matches with *.
Demo: https://regex101.com/r/dI6zN5/1
Edit:
Here's an optimized pattern based on the above one:
[^\000-~]
Demo: https://regex101.com/r/lO0bE9/1
Edit 1: As per your comment, you need a UDF (User defined function) that:
takes in the Scandinavian string
converts all of its Scandinavian letters to *
takes in the string with the asterisks
compares the two strings
return True if the two strings match, else false.
You can then use the UDF like CompareString(ScanStr,AsteriskStr).
I have created a code example using the regex posted by JLILI Amen
Code
NSString *string = @"Strängnäs";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[^ -~]" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@"*"];
NSLog(@"%@", modifiedString);
Output
Str*ngn*s
Not sure exactly what you are after, but maybe this will help.
The regular expression pattern which matches any character is . (dot), so you can create a pattern from your strReplaced by replacing the *'s with .'s:
NSString *pattern = [strReplaced stringByReplacingOccurrencesOfString:@"*" withString:@"."];
Now using NSRegularExpression you can construct a regular expression from pattern and then see if str matches it - see the documentation for the required methods.
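A minimal sketch of that idea follows, under a few assumptions: the helper name MatchesMasked is made up, every * stands for exactly one character, and the masked string is escaped first so that any other regex metacharacters in it are matched literally.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: treats every * in `masked` as "exactly one character"
// and checks whether `original` fits that shape.
static BOOL MatchesMasked(NSString *original, NSString *masked) {
    if (original == nil || masked == nil) {
        return NO;
    }
    // Escape regex metacharacters first, so only * acts as a wildcard.
    NSString *escaped = [NSRegularExpression escapedPatternForString:masked];
    // The escaping step turns * into \*; swap that for the dot.
    NSString *pattern =
        [escaped stringByReplacingOccurrencesOfString:@"\\*" withString:@"."];
    // SELF MATCHES requires the whole string to match, so no anchors needed.
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern];
    return [predicate evaluateWithObject:original];
}
```

Note that the dot matches a single code point, so a letter stored in decomposed form (base letter plus combining mark) would need two wildcards; normalizing both strings with precomposedStringWithCanonicalMapping beforehand may help in that case.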

NSPredicate to match unescaped apostrophes

I'd like to check an NSString (json) if there are any unescaped apostrophes, but the NSPredicate won't find it, even if the regex is correct.
Here's my code:
NSString* regx = @"[^\\\\]'";
NSPredicate* p = [NSPredicate predicateWithFormat:@"SELF MATCHES %@",regx];
if([p evaluateWithObject:json]){
//gotit
...
I know that there are some apostrophes that are not escaped, but NSPredicate just doesn't find it.
Any idea how to solve this problem?
Also if I look at the json I see the apostrophes as \u0027.
"SELF MATCHES …" tries to match the entire string, therefore you have to use the regex
NSString* regx = @".*[^\\\\]'.*";
Alternatively:
NSString* regx = @"[^\\\\]'";
NSRange r = [json rangeOfString:regx options:NSRegularExpressionSearch];
if (r.location != NSNotFound) {
…
}
But the question remains why this is necessary. NSJSONSerialization should handle all escaping and quoting correctly.
This is the regex which works for me:
.*[^\\\\]\\\\u0027.*

Resources