I am using NSDataDetector to parse a text and retrieve the numbers. Here is my code:
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypePhoneNumber
error:&error];
NSArray *matches = [detector matchesInString:locationAndTitle options:0 range:NSMakeRange(0,[locationAndTitle length])];
for (NSTextCheckingResult *match in matches) {
if ([match resultType] == NSTextCheckingTypePhoneNumber) {
self.theNumber = [match phoneNumber];
}
}
The problem with this is that it sometimes returns something like this:
Telephone: 9729957777
OR
9729957777x3547634
I don't want that extra text to appear, and removing it afterwards would be harder than just using a regex to retrieve the numbers. Do you have any idea how to retrieve only the number?
Personally I would just use -substringWithRange: on the string to remove everything past and including the 'x' character:
NSString *myPhoneNum = @"9729957777x3547634";
NSRange r = [myPhoneNum rangeOfString:@"x"];
if (r.location != NSNotFound) {
myPhoneNum = [myPhoneNum substringWithRange:NSMakeRange(0, r.location)];
}
NSLog(@"Fixed number: %@", myPhoneNum);
Any idea where the x3547634 comes from, anyway?
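If you also need to drop a leading label such as "Telephone: ", one option is to take the first run of digits instead of cutting at the 'x'. Here is a minimal sketch of that idea. FirstDigitRun is a hypothetical helper name, not part of Foundation, and it assumes the number is written as one unbroken run of digits (a formatted number like (972) 995-7777 would come back as just 972).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first run of ASCII digits in `text`,
// or nil when the string contains no digits at all.
static NSString *FirstDigitRun(NSString *text) {
    // NSRegularExpressionSearch makes rangeOfString: treat its argument as a regex.
    NSRange r = [text rangeOfString:@"[0-9]+" options:NSRegularExpressionSearch];
    if (r.location == NSNotFound) {
        return nil;
    }
    return [text substringWithRange:r];
}

// Usage:
//   FirstDigitRun(@"Telephone: 9729957777")   -> @"9729957777"
//   FirstDigitRun(@"9729957777x3547634")      -> @"9729957777"
```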
I can detect hashtags like this.
+ (NSArray *)getHashArrayWithInputString:(NSString *)inputStr
{
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"#(\\w+)" options:0 error:&error];
NSArray *matches = [regex matchesInString:inputStr options:0 range:NSMakeRange(0, inputStr.length)];
NSMutableArray *muArr = [NSMutableArray array];
for (NSTextCheckingResult *match in matches) {
NSRange wordRange = [match rangeAtIndex:1];
NSString* word = [inputStr substringWithRange:wordRange];
NSCharacterSet* notDigits = [[NSCharacterSet decimalDigitCharacterSet] invertedSet];
if ([word rangeOfCharacterFromSet:notDigits].location == NSNotFound)
{
// word consists only of the digits 0 through 9, so skip it
}
else
[muArr addObject:[NSString stringWithFormat:@"#%@",word]];
}
return muArr;
}
The problem is that if inputStr is "#D&D", it detects only #D. How can I fix this?
To handle that, add the special characters you want to allow to your regular expression:
#(\\w+([&]*\\w*)*) //To allow #D&D&d...
#(\\w+([&-]*\\w*)*) //To allow both #D&D-D&...
You can add any other special characters you want in the same way.
So simply change your regex like this:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"#(\\w+([&]*\\w*)*)" options:0 error:&error];
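A slightly stricter variant, as a sketch: require each & or - to be followed by at least one word character, so a tag cannot end in a separator and the pattern has no nested optional repeats. HashtagsInString is a hypothetical helper name, and unlike the method in the question this sketch does not filter out digits-only tags.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every hashtag in `input`, including the leading '#'.
// A tag is word characters, optionally joined by single '&' or '-' separators.
static NSArray<NSString *> *HashtagsInString(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"#(\\w+(?:[&-]\\w+)*)"
                                                  options:0
                                                    error:&error];
    if (!regex) {
        return @[];  // the pattern failed to compile; details are in `error`
    }
    NSMutableArray<NSString *> *tags = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        // match.range covers the whole tag; [match rangeAtIndex:1] would skip the '#'.
        [tags addObject:[input substringWithRange:match.range]];
    }
    return tags;
}
```

With this pattern "#D&D" is matched as a whole, while "#D&" yields only "#D" because the trailing & has nothing after it.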
I was using this lib:
https://cocoapods.org/pods/twitter-text
It has a TwitterText class with this method:
+ (NSArray *)hashtagsInText:(NSString *)text checkingURLOverlap:(BOOL)checkingURLOverlap
It could help.
I last used this pod about a year ago, and it worked great then. You would need to check whether it still works today. Let me know :) Good luck
I'm trying to get from this string: 5556007503140005
Two strings. "555600750314" and "0005"
I'm using the regexp ^([a-z0-9]*)([0-9]{4})$ that works fine in the regexp tools, but when I use it in my code I only get 1 match.
This is the code:
-(NSDictionary *)parse_barcode:(NSString *)barcode {
NSString *regexp = @"^([a-z0-9]*)([0-9]{4})$";
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@",regexp];
if ([predicate evaluateWithObject:barcode]) {
NSError *error;
NSRegularExpression *regular_exp = [NSRegularExpression regularExpressionWithPattern:regexp options:0 error:&error];
NSArray *matches = [regular_exp matchesInString:barcode options:0 range:NSMakeRange(0, [barcode length])];
for (NSTextCheckingResult *match in matches) {
NSLog(@"match %@ :%@",[barcode substringWithRange:[match range]], match);
}
}
return nil;
}
But the match is always the entire string (Barcode)
You get the right match, you are just not printing them correctly. You need to use numberOfRanges to get the individual groups (i.e. sections enclosed in parentheses), and then call rangeAtIndex: for each group, like this:
for (NSTextCheckingResult *match in matches) {
for (NSUInteger i = 0; i < match.numberOfRanges; i++) {
NSLog(@"match %lu - %@ :%@", (unsigned long)i, [barcode substringWithRange:[match rangeAtIndex:i]], match);
}
}
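If you want the two parts back as strings rather than logged, a sketch along these lines should work. SplitBarcode is a hypothetical helper name; it relies on range 0 being the whole match and ranges 1 and 2 being the two parenthesized groups of the pattern from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits a barcode into (prefix, last four digits),
// or returns nil when the barcode does not match the pattern.
static NSArray<NSString *> *SplitBarcode(NSString *barcode) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^([a-z0-9]*)([0-9]{4})$"
                                                  options:0
                                                    error:&error];
    NSTextCheckingResult *match =
        [regex firstMatchInString:barcode options:0 range:NSMakeRange(0, barcode.length)];
    if (!match) {
        return nil;
    }
    // Index 0 is the entire match; the capture groups start at index 1.
    NSString *prefix = [barcode substringWithRange:[match rangeAtIndex:1]];
    NSString *suffix = [barcode substringWithRange:[match rangeAtIndex:2]];
    return @[prefix, suffix];
}

// Usage:
//   SplitBarcode(@"5556007503140005")  -> @[@"555600750314", @"0005"]
```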
I'm using a simple pattern with NSRegularExpression to delimit content within a string:
(\s)+(and|or)(\s)+
So, when I use matchesInString it's not the matches that I'm interested in, but the other stuff.
Below is the code that I'm using. Iterating over the matches and then using indexes and lengths to pull out the content.
Question: I'm just wondering if I'm missing something in the api to get the other bits? Or, is the approach below generally ok?
- (NSArray*)separateText:(NSString*)text
{
NSString* regExPattern = @"(\\s)+(and|or)(\\s)+";
NSError* error = NULL;
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:regExPattern
options:NSRegularExpressionCaseInsensitive
error:&error];
NSArray* matches = [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
if (matches.count == 0) {
return @[text];
}
NSInteger itemStartIndex = 0;
NSMutableArray* result = [NSMutableArray new];
for (NSTextCheckingResult* match in matches) {
NSRange matchRange = [match range];
if (matchRange.location != 0) {
NSInteger matchStartIndex = matchRange.location;
NSInteger length = matchStartIndex - itemStartIndex;
NSString* item = [text substringWithRange:NSMakeRange(itemStartIndex, length)];
if (item.length != 0) {
[result addObject:item];
}
}
itemStartIndex = NSMaxRange(matchRange);
}
if (itemStartIndex != text.length) {
NSInteger length = text.length - itemStartIndex;
NSString* item = [text substringWithRange:NSMakeRange(itemStartIndex, length)];
[result addObject:item];
}
return result;
}
You can capture the string before the and|or with parentheses, and add it to your array with rangeAtIndex.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(.+?)(\\s+(and|or)\\W+|\\s*$)" options:NSRegularExpressionCaseInsensitive error:&error];
NSMutableArray *phrases = [NSMutableArray array];
[regex enumerateMatchesInString:string options:0 range:NSMakeRange(0, [string length]) usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
NSRange range = [result rangeAtIndex:1];
[phrases addObject:[string substringWithRange:range]];
}];
A couple of minor points about my regex:
I added the |\\s*$ construct to capture the last string after the final and|or. If you don't want that, you can eliminate that.
I replaced the second \\s+ (whitespace) with a \\W+ (non-word characters), in case you encountered something like and|or followed by a comma or something else. You could alternatively look explicitly for ,?\\s+ if the comma was the only non-word character you cared about. It just depends upon the specific business problem you're solving.
You might want to replace the first \\s+ with \\W+, too.
If your string contains newline characters, you might want to use the NSRegularExpressionDotMatchesLineSeparators option when you instantiate the NSRegularExpression.
You could replace all matches of the regex with a template string (e.g. ", " or "," etc) and then separate the string components based on that new delimiter.
NSString *stringToBeMatched = @"Your string to be matched";
NSString *regExPattern = @"(\\s)+(and|or)(\\s)+";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regExPattern
options:NSRegularExpressionCaseInsensitive
error:&error];
if (!regex) {
// handle error (the pattern failed to compile; details are in error)
}
NSString *replacementString = [regex stringByReplacingMatchesInString:stringToBeMatched
options:0
range:NSMakeRange(0, stringToBeMatched.length)
withTemplate:@","];
NSArray *otherItemsInString = [replacementString componentsSeparatedByString:@","];
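One caveat with a "," delimiter is that any comma already in the text will also split it. A sketch of the same replace-then-split idea with a delimiter that is unlikely to occur in normal text follows. SplitOnAndOr is a hypothetical helper name, and the choice of the ASCII unit separator (0x1F) is my assumption, not something the API requires.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `text` on " and " / " or " (case-insensitive),
// leaving commas and other punctuation inside the pieces untouched.
static NSArray<NSString *> *SplitOnAndOr(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\s+(?:and|or)\\s+"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (!regex) {
        return @[text];  // pattern failed to compile; details are in `error`
    }
    // ASCII unit separator: assumed not to appear in the input.
    NSString *separator = @"\x1F";
    NSString *joined = [regex stringByReplacingMatchesInString:text
                                                       options:0
                                                         range:NSMakeRange(0, text.length)
                                                  withTemplate:separator];
    return [joined componentsSeparatedByString:separator];
}

// Usage:
//   SplitOnAndOr(@"red, green and blue or yellow")
//   -> @[@"red, green", @"blue", @"yellow"]
```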
I have been using NSDataDetector to parse addresses out of strings, and for the most part it does a good job. However, it fails to detect addresses like this one:
6200 North Evan Blvd Suit 487 Highland UT 84043
Currently I am using this code:
NSError *error = nil;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeAddress error:&error];
NSArray *matches = [detector matchesInString:output options:0 range:NSMakeRange(0, [output length])];
for (NSTextCheckingResult *match in matches) {
if ([match resultType] == NSTextCheckingTypeAddress) {
_address = [_tesseractData substringWithRange:[match range]];
NSDictionary *data = [match addressComponents];
_zip = [data objectForKey:@"ZIP"];
if (_zip) {
NSRange zipRange = [_tesseractData rangeOfString:_zip];
if (zipRange.location != NSNotFound) {
[_tesseractData deleteCharactersInRange:zipRange];
}
}
_city = [data objectForKey:@"City"];
if (_city) {
NSRange cityRange = [_tesseractData rangeOfString:[_city uppercaseString]];
if (cityRange.location != NSNotFound) {
[_tesseractData deleteCharactersInRange:cityRange];
}
}
_city = [_city capitalizedString];
_state = [data objectForKey:@"State"];
_street = [data objectForKey:@"Street"];
if (_street) {
NSRange streetRange = [_tesseractData rangeOfString:[_street uppercaseString]];
if (streetRange.location != NSNotFound) {
[_tesseractData deleteCharactersInRange:streetRange];
}
}
_street = [_street capitalizedString];
}
}
Can anyone suggest a more robust method for parsing a physical address out of a string? I need to be able to get the Zip, Street, State and City.
An NSDataDetector is an NSRegularExpression subclass, so maybe you could create a customized instance and start by checking what Apple uses for the pattern and options parameters.
Something along these lines:
NSDataDetector * dataDetectorRegEx = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeAddress error:&error];
NSString * dataDetectorPattern = dataDetectorRegEx.pattern;
NSLog(@"Check out this pattern!: %@", dataDetectorPattern);
// Customize the pattern for your special cases
NSString * customPattern = [NSString stringWithFormat:@"<MY_OTHER_PATTERNS + %@>", dataDetectorPattern];
NSRegularExpression * customDataDetectorLikeRegEx = [NSRegularExpression regularExpressionWithPattern:customPattern options:someOptions error:&error];
You can try parsing the address information with regular expressions (RegEx); I think that is a more robust way. For references on working with RegEx, see Making RegEx Easy in Objective-C; Objective-C RegEx Categories is available on GitHub.
What I'm trying to accomplish is as follows. I have an NSString containing a sentence that has a URL within it. I need to be able to grab the URL from any sentence held in an NSString, so for example:
Let's say I had this NSString
NSString *someString = @"This is a sample of a http://example.com/efg.php?EFAei687e3EsA sentence with a URL within it.";
I need to be able to extract http://example.com/efg.php?EFAei687e3EsA from within that NSString. This NSString isn't static and will be changing structure and the url will not necessarily be in the same spot of the sentence. I've tried to look into the three20 code but it makes no sense to me. How else can this be done?
Use an NSDataDetector:
NSString *string = @"This is a sample of a http://example.com/efg.php?EFAei687e3EsA sentence with a URL within it.";
NSDataDetector *linkDetector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:nil];
NSArray *matches = [linkDetector matchesInString:string options:0 range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
if ([match resultType] == NSTextCheckingTypeLink) {
NSURL *url = [match URL];
NSLog(@"found URL: %@", url);
}
}
This way you don't have to rely on an unreliable regular expression, and as Apple upgrades their link detection code, you get those improvements for free.
Edit: I'm going to go out on a limb here and say you should probably use NSDataDetector as Dave mentions. Far less prone to error than regular expressions.
Take a look at regular expressions. You can construct a simple one to extract the URL using the NSRegularExpression class, or find one online that you can use. For a tutorial on using the class, see here.
The code you want essentially looks like this (using John Gruber's super URL regex):
NSRegularExpression *expression = [NSRegularExpression regularExpressionWithPattern:@"(?i)\\b((?:[a-z][\\w-]+:(?:/{1,3}|[a-z0-9%])|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’]))" options:NSRegularExpressionCaseInsensitive error:NULL];
NSString *someString = @"This is a sample of a http://example.com/efg.php?EFAei687e3EsA sentence with a URL within it.";
NSString *match = [someString substringWithRange:[expression rangeOfFirstMatchInString:someString options:0 range:NSMakeRange(0, [someString length])]];
NSLog(@"%@", match); // Correctly prints 'http://example.com/efg.php?EFAei687e3EsA'
That will extract the first URL in any string. Of course, this does no error checking, so if the string doesn't contain any URLs it won't work, but take a look at the NSRegularExpression class to see how to get around that.
Use it like this:
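For the missing error check, a minimal sketch: rangeOfFirstMatchInString:options:range: returns a range whose location is NSNotFound when nothing matches, so test for that before calling substringWithRange:. FirstMatch is a hypothetical helper name, and the short pattern in the usage comment is only a simple stand-in for illustration, not a replacement for the full URL regex.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of the first match of `expression`
// in `text`, or nil when there is no match.
static NSString *FirstMatch(NSRegularExpression *expression, NSString *text) {
    NSRange range = [expression rangeOfFirstMatchInString:text
                                                  options:0
                                                    range:NSMakeRange(0, text.length)];
    if (range.location == NSNotFound) {
        return nil;  // substringWithRange: would raise on this range
    }
    return [text substringWithRange:range];
}

// Usage (stand-in pattern, for illustration only):
//   NSRegularExpression *re =
//       [NSRegularExpression regularExpressionWithPattern:@"https?://\\S+" options:0 error:NULL];
//   FirstMatch(re, @"see http://example.com/a now")  -> @"http://example.com/a"
//   FirstMatch(re, @"no links here")                 -> nil
```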
NSError *error = nil;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink
error:&error];
[detector enumerateMatchesInString:someString
options:0
range:NSMakeRange(0, someString.length)
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop)
{
if (result.resultType == NSTextCheckingTypeLink)
{
NSString *str = [NSString stringWithFormat:@"%@",result.URL];
NSLog(@"%@", str);
}
}];
This will output all the links in someString, one by one.
Swift 2 :
let input = "This is a test with the URL https://www.hackingwithswift.com to be detected."
let detector = try! NSDataDetector(types: NSTextCheckingType.Link.rawValue)
let matches = detector.matchesInString(input, options: [], range: NSMakeRange(0, input.utf16.count))
for match in matches {
let url = (input as NSString).substringWithRange(match.range)
print(url)
}
use this:
NSURL *url;
NSArray *listItems = [someString componentsSeparatedByString:@" "];
for(int i=0;i<[listItems count];i++)
{
NSString *str=[listItems objectAtIndex:i];
if ([str rangeOfString:@"http://"].location == NSNotFound)
NSLog(@"Not url");
else
url=[NSURL URLWithString:str];
}
you need two things:
A category that adds regex to NSString (i.e. RegexKit)
Matching Regex for URLS.
regards,
Funny you mention three20, that was the first place I was going to go look for the answer. Here's the method from three20:
- (void)parseURLs:(NSString*)string {
NSInteger index = 0;
while (index < string.length) {
NSRange searchRange = NSMakeRange(index, string.length - index);
NSRange startRange = [string rangeOfString:@"http://" options:NSCaseInsensitiveSearch
range:searchRange];
if (startRange.location == NSNotFound) {
NSString* text = [string substringWithRange:searchRange];
TTStyledTextNode* node = [[[TTStyledTextNode alloc] initWithText:text] autorelease];
[self addNode:node];
break;
} else {
NSRange beforeRange = NSMakeRange(searchRange.location, startRange.location - searchRange.location);
if (beforeRange.length) {
NSString* text = [string substringWithRange:beforeRange];
TTStyledTextNode* node = [[[TTStyledTextNode alloc] initWithText:text] autorelease];
[self addNode:node];
}
NSRange searchRange = NSMakeRange(startRange.location, string.length - startRange.location);
NSRange endRange = [string rangeOfString:@" " options:NSCaseInsensitiveSearch
range:searchRange];
if (endRange.location == NSNotFound) {
NSString* URL = [string substringWithRange:searchRange];
TTStyledLinkNode* node = [[[TTStyledLinkNode alloc] initWithText:URL] autorelease];
node.URL = URL;
[self addNode:node];
break;
} else {
NSRange URLRange = NSMakeRange(startRange.location,
endRange.location - startRange.location);
NSString* URL = [string substringWithRange:URLRange];
TTStyledLinkNode* node = [[[TTStyledLinkNode alloc] initWithText:URL] autorelease];
node.URL = URL;
[self addNode:node];
index = endRange.location;
}
}
}
}
Every time it does [self addNode:node]; after the first if part, it's adding a found URL. This should get you started! Hope this helps. :)
Using Swift 2.2 - NSDataDetector
let string = "here is the link www.google.com"
let types: NSTextCheckingType = [ .Link]
let detector = try? NSDataDetector(types: types.rawValue)
detector?.enumerateMatchesInString(string, options: [], range: NSMakeRange(0, (string as NSString).length)) { (result, flags, _) in
if(result?.URL != nil){
print(result?.URL)
}
}
Swift 4.x
Xcode 12.x
let string = "This is a test with the URL https://www.hackingwithswift.com to be detected. www.example.com"
let types: NSTextCheckingResult.CheckingType = [ .link]
let detector = try? NSDataDetector(types: types.rawValue)
detector?.enumerateMatches(in: string, options: [], range: NSMakeRange(0, (string as NSString).length)) { (result, flags, _) in
if(result?.url != nil){
print(result?.url)
}
}