multiple ????'s in regex cause error - ios

I have a simple regex search-and-replace method. Everything works as expected, but while stress testing yesterday I entered a string containing "????", and this caused the regex to fail with the following error:
error NSError * domain: @"NSCocoaErrorDomain" - code: 2048 0x0fd3e970
After further research, I believe the question marks might be treated as a "trigraph". Chuck has a good explanation in this post: What does the \? (backslash question mark) escape sequence mean?
I tried to escape the sequence before creating the regex with this:
string = [string stringByReplacingOccurrencesOfString:@"\?\?" withString:@"\?\\?"];
This seems to stop the error, but the search and replace no longer works. Here is the method I am using:
- (NSString *)searchAndReplaceText:(NSString *)searchString withText:(NSString *)replacementString inString:(NSString *)text {
    NSRegularExpression *regex = [self regularExpressionWithString:searchString];
    NSRange range = [regex rangeOfFirstMatchInString:text options:0 range:NSMakeRange(0, text.length)];
    NSString *newText = [regex stringByReplacingMatchesInString:text options:0 range:range withTemplate:replacementString];
    return newText;
}
- (NSRegularExpression *)regularExpressionWithString:(NSString *)string {
    NSError *error = NULL;
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b", string];
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
    if (error)
        NSLog(@"Couldn't create regex with given string and options");
    return regex;
}
My questions are: is there a better way of escaping this sequence? Is this a case of trigraphs, or something else? Or is there a way in code to ignore trigraphs or turn them off?
Thanks

My questions are: is there a better way of escaping this sequence?
Yes, you can properly escape any sequence of characters for a regular expression like this:
NSString *escapedExpression = [NSRegularExpression escapedPatternForString:aStringToEscapeCharactersIn];
EDIT
You don't have to run this on the whole expression. You can use NSString's stringWithFormat: to insert escaped strings into regular expressions that also contain pattern syntax, e.g.
pattern = [NSString stringWithFormat:@"^%@(.*)", [NSRegularExpression escapedPatternForString:@"????"]];
will give you the pattern ^\?\?\?\?(.*)
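Applied to the question's helper, the escaping step could look like the sketch below. The function name WholeWordRegexForLiteral is hypothetical (it is not from the original post); only escapedPatternForString: and regularExpressionWithPattern:options:error: are Foundation APIs. It assumes the search string is always meant literally.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): builds a whole-word,
// case-insensitive regex that treats the search string as literal text.
static NSRegularExpression *WholeWordRegexForLiteral(NSString *searchString) {
    // Escape metacharacters such as ? * + ( ) so they are matched literally.
    NSString *escaped = [NSRegularExpression escapedPatternForString:searchString];
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b", escaped];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    // Check the return value rather than the error pointer.
    if (regex == nil) {
        NSLog(@"Couldn't create regex for pattern %@: %@", pattern, error);
    }
    return regex;
}
```

One caveat with keeping \b here: a word boundary only exists where a word character meets a non-word character, so a search term that begins or ends with punctuation such as "?" will only match when it sits directly next to a letter, digit, or underscore.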

Related

Delete occurrences at the end of a string - iOS

Say you have an NSString *testString = @"Abcd!!!!"; (note the four exclamation marks). How can I delete all of the exclamation marks as efficiently as possible?
There can be any number of exclamation marks, and they should only be deleted when they form a consecutive run at the end of the string.
One example might be:
NSString *testString = @"ABC!D!!!!!";
The result would then be:
NSString *result = @"ABC!D";
Since you don't know how many ! you'll be removing from the string, you could do it with a regular expression.
NSString *string = @"ABC!D!!!!!";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"!+$" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, [string length]) withTemplate:@""];
NSLog(@"%@", modifiedString);
Regular expressions aren't always the most efficient way to solve this sort of problem, but in this case I don't think doing it another way would give a measurable gain.
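For comparison, here is a sketch of one non-regex way to do the same thing: search backwards for the last character that is not "!" and cut the string there. The function name StripTrailingBangs is made up for illustration; whether it is actually faster than the regex would need measuring.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes a trailing run of '!' characters without a regex.
static NSString *StripTrailingBangs(NSString *input) {
    NSCharacterSet *notBang =
        [[NSCharacterSet characterSetWithCharactersInString:@"!"] invertedSet];
    // Find the last character that is NOT an exclamation mark.
    NSRange last = [input rangeOfCharacterFromSet:notBang options:NSBackwardsSearch];
    if (last.location == NSNotFound) {
        return @""; // the string is empty or consists only of '!'
    }
    // Keep everything up to and including that character.
    return [input substringToIndex:NSMaxRange(last)];
}
```

Exclamation marks in the middle of the string are left alone, because only the characters after the last non-"!" character are dropped.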

How can I replace string using NSRegularExpression in iOS?

For example, the regular expression is:
A(B)C
A, B, and C each represent some string. I want every substring that matches A(B)C to be replaced by B.
If the NSString is AABCAABCBBABC,
the answer will be ABABBBB. How can I do that? Thank you.
Here is a more specific example:
<script\stype="text/javascript"[\s\S]*?(http://[\s\S]*?)'[\s\S]*?</script>
The pattern matches some script elements, and the capture group matches the http:// URL inside each one.
I want to replace each script match with the http:// URL captured inside it. Did I explain that clearly?
One solution is to use stringByReplacingMatchesInString:
NSString *strText = @"AABCAABCBBABC";
NSError *error = nil;
NSRegularExpression *regexExpression = [NSRegularExpression regularExpressionWithPattern:@"ABC" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *strModifiedText = [regexExpression stringByReplacingMatchesInString:strText options:0 range:NSMakeRange(0, [strText length]) withTemplate:@"B"];
NSLog(@"%@", strModifiedText);
Another solution is to use stringByReplacingOccurrencesOfString:
strText = [strText stringByReplacingOccurrencesOfString:@"ABC" withString:@"B"];
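When B is not a fixed string (as in the script/URL example in the question), the replacement template can refer to the capture group with $1 instead of hard-coding the replacement. The sketch below assumes the pattern has at least one capture group; the function name ReplaceMatchesWithFirstGroup is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every match of `pattern` in `text`
// with whatever the pattern's first capture group matched.
static NSString *ReplaceMatchesWithFirstGroup(NSString *text, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern %@: %@", pattern, error);
        return text; // leave the input unchanged on failure
    }
    // "$1" in the template stands for the text captured by group 1.
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"$1"];
}
```

Calling ReplaceMatchesWithFirstGroup(@"AABCAABCBBABC", @"A(B)C") should produce the ABABBBB result the question asks for, and the same call with the script pattern would substitute each captured URL.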

What's the quickest way to do lots of NSRange calls in a very long NSString on iOS?

I have a VERY long NSString. It contains about 100 strings I need to pull out of it, all randomly scattered throughout. Each one sits between imgurl= and &.
I could use NSRange and loop through, pulling out each string, but I'm wondering if there is a quicker way to pick out everything in a single API call. Maybe there is something I am missing here?
I'm looking for the quickest way to do this. Thanks!
Use the NSString methods componentsSeparatedByString: and componentsSeparatedByCharactersInSet:
NSString *longString = @"..."; // some really long string
NSArray *longStringComponents = [longString componentsSeparatedByString:@"imgurl="];
// Component 0 is the text before the first "imgurl=", so start at 1.
for (NSUInteger i = 1; i < longStringComponents.count; i++) {
    NSString *imgURLString = [[longStringComponents[i] componentsSeparatedByCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"&"]] firstObject];
    // do something with imgURLString...
}
If you feel adventurous, you can use a regular expression. Since you said the strings you are looking for sit between imgurl= and &, I assumed the input is a URL and wrote the sample code accordingly.
NSString *str = @"http://www.example.com/image?imgurl=my_image_url1&imgurl=myimageurl2&somerandom=blah&imgurl=myurl3&someother=lol";
NSError *error = NULL;
// The prefix must match the key used in the string ("imgurl=").
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(?:imgurl=)(.*?)(?:&|\\r)"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
// should do error checking here...
NSArray *matches = [regex matchesInString:str
                                  options:0
                                    range:NSMakeRange(0, [str length])];
for (NSTextCheckingResult *match in matches)
{
    // [match rangeAtIndex:0] gives you the whole string matched.
    // [match rangeAtIndex:1] gives you the first group, which is the part you care about.
    NSLog(@"%@", [str substringWithRange:[match rangeAtIndex:1]]);
}
If I were you, I would still go with @bobnoble's method, because it is simpler than the regex. The regex approach requires more error checking.

How to detect email addresses within arbitrary strings

I'm using the following code to detect an email address in a string. It works fine except for addresses with a purely numeric prefix, such as "536264846@gmail.com". Is it possible to work around this bug in Apple's detector? Any help will be appreciated!
NSString *string = @"536264846@gmail.com";
NSError *error = NULL;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
NSArray *matches = [detector matchesInString:string
                                     options:0
                                       range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    if ([match.URL.scheme isEqualToString:@"mailto"]) {
        NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
        NSLog(@"email :%@", email);
    } else {
        NSLog(@"[match URL] :%@", [match URL]);
    }
}
Edit:
log result is: [match URL] :http://gmail.com
What I did in the past:
tokenize the input, e.g., separate tokens using spaces (since most other common separators may be valid within an email). However, this may not be necessary if the regular expression is not anchored - but not sure how it would work without the "^" and "$" anchors (which I added to what was shown on the web site).
keep in mind that addresses may take the form '"string" <address>' as well as just the bare address
in each token, look for '@', as it's probably the best indicator you have that it's an email address
run the token through the regular expression shown on this Email Detector comparison site (I found in testing that the one marked #1 as of 3/21/2013 worked best)
What I did was put the regular expression in a text file, so I didn't need to escape it:
^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))\x22))(?:.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))\x22)))@(?:(?:(?!.[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+).){1,126}){1,}(?:(?:[a-z][a-z0-9])|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+))|(?:[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.[a-f0-9][:]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))]))$
Defined an ivar:
NSRegularExpression *reg;
Created the regular expression:
NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
NSError *error = nil;
reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
assert(reg && !error);
Then wrote a method to do the comparison:
- (BOOL)isValidEmail:(NSString *)string
{
    NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
    return match ? YES : NO;
}
EDIT: I've turned the above into a project on GitHub
EDIT2: for an alternative that is less rigorous but faster, see the comment section of this question

NSMutableString replaceOccurrencesOfString replacing whole words

Is there a way to use replaceOccurrencesOfString (from NSMutableString) to replace whole words?
For example, if I want to replace all occurrences of a fraction in a string, like "1/2", I'd like that to match only that specific fraction. So if I had "11/2", I would not want that to match my "1/2" rule.
I've been trying to look for answers to this already, but I am having no luck.
You could use word boundaries (\b) with a regular expression. This example matches the "1/2" at the start and at the end of the example string, but neither of the two occurrences in the middle.
// Create your expression
NSString *string = @"1/2 of the 11/2 objects were 1/2ed in (1/2)";
NSError *error = nil;
NSRegularExpression *regex =
    [NSRegularExpression
        regularExpressionWithPattern:@"\\b1/2\\b"
                             options:NSRegularExpressionCaseInsensitive
                               error:&error];
// Replace the matches
NSString *modifiedString =
    [regex stringByReplacingMatchesInString:string
                                    options:0
                                      range:NSMakeRange(0, [string length])
                               withTemplate:@"HALF USED TO BE HERE"];

Resources