Dynamically find a word in an NSString - iOS

NSString *stringExample1 = @"www.mysite.com/word-4-word-1-1-word-word-2-word-817061.html";
NSString *stringExample2 = @"www.mysite.com/word-4-5-1-1-word-1-5-word-11706555.html";
I am trying to find the - and . characters inside an NSString.
NSRange range = [string rangeOfString:@"-"];
NSUInteger start = range.location;
NSUInteger end = start + range.length;
NSRange rangeDot = [string rangeOfString:@"."];
NSUInteger startt = rangeDot.location;
NSUInteger endt = startt + rangeDot.length;
But this doesn't work: it only finds the first occurrence of each character. How can I get 817061 and 11706555 out of these strings?
Thank you.

This will work for you:
NSArray *strArry = [stringExample1 componentsSeparatedByString:@"-"];
NSString *result = [strArry lastObject];
NSString *resultstring = [result stringByReplacingOccurrencesOfString:@".html" withString:@""];
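The question's rangeOfString: calls return the first occurrence of "-" and ".", which is why they land at the start of the URL. A minimal sketch of another option is to search from the end of the string with NSBackwardsSearch. TrailingIdentifier is a hypothetical helper name, and the sketch assumes the number always sits between the last "-" and the last ".":

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the last "-" and the last "."
// in `url`, or nil when either character is missing or they are out of order.
static NSString *TrailingIdentifier(NSString *url) {
    // NSBackwardsSearch makes rangeOfString: report the LAST occurrence.
    NSRange dash = [url rangeOfString:@"-" options:NSBackwardsSearch];
    NSRange dot = [url rangeOfString:@"." options:NSBackwardsSearch];
    if (dash.location == NSNotFound || dot.location == NSNotFound) {
        return nil;
    }
    NSUInteger start = NSMaxRange(dash); // first character after the "-"
    if (dot.location < start) {
        return nil; // the last "." comes before the last "-"
    }
    return [url substringWithRange:NSMakeRange(start, dot.location - start)];
}
```

Called with the two example strings from the question, this should give 817061 and 11706555 respectively.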

Are you trying to find out whether the string contains at least one - or . character?
You can use -rangeOfCharacterFromSet:
NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:@"-."];
NSRange range = [yourString rangeOfCharacterFromSet:characterSet];
if (range.location == NSNotFound)
{
    // no - or . in the string
}
else
{
    // at least one - or . is present
}

Try this simple regular expression.
NSString *stringExample1 = @"www.mysite.com/word-4-word-1-1-word-word-2-word-84354354353.html";
NSError *error = nil;
// Match a "-", then a run of digits, then a "."
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"(\\-\\d*\\.)"
                         options:0
                           error:&error];
NSRange range = [regex rangeOfFirstMatchInString:stringExample1
                                         options:0
                                           range:NSMakeRange(0, [stringExample1 length])];
if (range.location != NSNotFound) {
    // Drop the leading "-" and the trailing "." from the matched range
    range = NSMakeRange(range.location + 1, range.length - 2);
    NSString *result = [stringExample1 substringWithRange:range];
    NSLog(@"%@", result);
}

I think the best way to find the match is to use a regular expression with NSRegularExpression.
NSString *stringEx = @"www.mysite.com/word-4-word-1-1-word-word-2-word-817061.html";
NSError *error = nil;
// Capture the digits between the last "-" and the trailing ".html"
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"-(\\d*)\\.html$"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
NSArray *matches = [regex matchesInString:stringEx options:NSMatchingReportCompletion range:NSMakeRange(0, [stringEx length])];
if ([matches count] > 0)
{
    NSString *resultString = [stringEx substringWithRange:[matches[0] rangeAtIndex:1]];
    NSLog(@"Matched: %@", resultString);
}
Make sure you use an extra \ escape character in the regex NSString wherever one is needed.
UPDATE
I did a test using the two different approaches (regex vs string splitting) with the code below:
NSDate *timeBefore = [NSDate date];
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"-(\\d*).html$"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
for (int i = 0; i < 100000; i++)
{
    NSArray *matches = [regex matchesInString:stringEx options:NSMatchingReportCompletion range:NSMakeRange(0, [stringEx length])];
    if ([matches count] > 0)
    {
        NSString *resultString = [stringEx substringWithRange:[matches[0] rangeAtIndex:1]];
    }
}
// timeIntervalSinceNow is negative for a date in the past, hence the sign flip
NSTimeInterval timeSpent = [timeBefore timeIntervalSinceNow];
NSLog(@"Time: %.5f", timeSpent * -1);
On the simulator the differences are not significant, but running on an iPhone 4 I got the following results:
2013-11-25 10:24:19.795 NotifApp[406:60b] Time: 11.45771 // string splitting
2013-11-25 10:25:10.451 NotifApp[412:60b] Time: 7.55713 // regex
So I guess the best approach depends on the case.

Related

Detect hashtags including & in hashtag

I can detect hashtags like this.
+ (NSArray *)getHashArrayWithInputString:(NSString *)inputStr
{
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"#(\\w+)" options:0 error:&error];
    NSArray *matches = [regex matchesInString:inputStr options:0 range:NSMakeRange(0, inputStr.length)];
    NSMutableArray *muArr = [NSMutableArray array];
    for (NSTextCheckingResult *match in matches) {
        NSRange wordRange = [match rangeAtIndex:1];
        NSString *word = [inputStr substringWithRange:wordRange];
        NSCharacterSet *notDigits = [[NSCharacterSet decimalDigitCharacterSet] invertedSet];
        if ([word rangeOfCharacterFromSet:notDigits].location == NSNotFound)
        {
            // word consists only of the digits 0 through 9, so skip it
        }
        else
        {
            [muArr addObject:[NSString stringWithFormat:@"#%@", word]];
        }
    }
    return muArr;
}
The problem is that if inputStr is "#D&D", it only detects #D. What should I do?
To handle that, add the special characters you want to allow to your regular expression.
#(\\w+([&]*\\w*)*) // allows #D&D&d...
#(\\w+([&-]*\\w*)*) // allows both & and -, e.g. #D&D-D&...
You can add any other special characters you want in the same way.
So simply change your regex like this:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"#(\\w+([&]*\\w*)*)" options:0 error:&error];
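To see the effect on an input such as "#D&D", here is a small self-contained sketch. HashtagsInString is a hypothetical helper, and the pattern is a slightly tighter variant of the one above (an assumption on my part, not the answer's exact regex): every "&" must be followed by at least one word character, so a trailing "&" is left out of the tag. Unlike the question's method, it does not filter out digit-only tags.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every hashtag found in `input`, "#" included.
static NSArray *HashtagsInString(NSString *input) {
    NSError *error = nil;
    // "#", one or more word characters, then any number of "&word" groups.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"#\\w+(?:&\\w+)*"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return @[]; // the pattern failed to compile; `error` has the details
    }
    NSMutableArray *tags = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        [tags addObject:[input substringWithRange:match.range]];
    }
    return tags;
}
```

For an input like "Playing #D&D with #iOS friends", this should return #D&D and #iOS.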
I used this library:
https://cocoapods.org/pods/twitter-text
It has a TwitterText class with the method
(NSArray *)hashtagsInText:(NSString *)text checkingURLOverlap:(BOOL)checkingURLOverlap
It could help.
I last used this pod about a year ago, and it worked great then. You should check whether it still works today. Let me know :) Good luck

How to parse an NSString

The string is myAgent(9953593875).Amt:Rs.594 and I want to extract 9953593875 from it. Here is what I tried:
NSRange range = [feDetails rangeOfString:@"."];
NSString *truncatedFeDetails = [feDetails substringWithRange:NSMakeRange(0, range.location)];
NSLog(@"truncatedString-->%@", truncatedFeDetails);
This outputs: truncatedString-->myAgent(9953593875)
Or you can do it like this:
NSString *string = @"myAgent(9953593875).Amt:Rs.594.";
NSRange rangeOne = [string rangeOfString:@"("];
NSRange rangeTwo = [string rangeOfString:@")"];
if (rangeOne.location != NSNotFound && rangeTwo.location != NSNotFound) {
    // Take everything strictly between "(" and ")"
    NSString *truncatedFeDetails = [string substringWithRange:NSMakeRange(rangeOne.location + 1, rangeTwo.location - rangeOne.location - 1)];
    NSLog(@"%@", truncatedFeDetails);
}
Do it like this.
Step 1
First split the string on the . character, for example:
NSString *value = @"myAgent(9953593875).Amt:Rs.594.How I get 9953593875 only";
NSArray *arr = [value componentsSeparatedByString:@"."];
NSString *AmzAgent = [arr firstObject]; // or arr[0]
NSLog(@"with name ==%@", AmzAgent);
Here you get the output myAgent(9953593875).
Step 2
Now strip the surrounding text with string replacement:
AmzAgent = [AmzAgent stringByReplacingOccurrencesOfString:@"myAgent("
                                               withString:@""];
AmzAgent = [AmzAgent stringByReplacingOccurrencesOfString:@")"
                                               withString:@""];
NSLog(@"final ==%@", AmzAgent);
Finally you get the output 9953593875.
Try this:
NSString *str = @"myAgent(9953593875).Amt:Rs.594.";
// Lookbehind for "(" and lookahead for ")" so that only the digits are matched
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"(?<=\\()\\d+(?=\\))"
                         options:NSRegularExpressionCaseInsensitive
                           error:nil];
NSString *result = [str substringWithRange:[regex firstMatchInString:str options:0 range:NSMakeRange(0, [str length])].range];
// result = 9953593875

How to work with the results from NSRegularExpression when using the regex pattern as a string delimiter

I'm using a simple pattern with NSRegularExpression to delimit content within a string:
(\s)+(and|or)(\s)+
So, when I use matchesInString it's not the matches that I'm interested in, but the other stuff.
Below is the code that I'm using. Iterating over the matches and then using indexes and lengths to pull out the content.
Question: I'm just wondering if I'm missing something in the api to get the other bits? Or, is the approach below generally ok?
- (NSArray*)separateText:(NSString*)text
{
    NSString* regExPattern = @"(\\s)+(and|or)(\\s)+";
    NSError* error = nil;
    NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:regExPattern
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];
    NSArray* matches = [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    if (matches.count == 0) {
        return @[text];
    }
    NSInteger itemStartIndex = 0;
    NSMutableArray* result = [NSMutableArray new];
    for (NSTextCheckingResult* match in matches) {
        NSRange matchRange = [match range];
        if (matchRange.location != 0) {
            // The text between the previous delimiter and this one is an item
            NSInteger matchStartIndex = matchRange.location;
            NSInteger length = matchStartIndex - itemStartIndex;
            NSString* item = [text substringWithRange:NSMakeRange(itemStartIndex, length)];
            if (item.length != 0) {
                [result addObject:item];
            }
        }
        itemStartIndex = NSMaxRange(matchRange);
    }
    // Whatever follows the last delimiter is the final item
    if (itemStartIndex != text.length) {
        NSInteger length = text.length - itemStartIndex;
        NSString* item = [text substringWithRange:NSMakeRange(itemStartIndex, length)];
        [result addObject:item];
    }
    return result;
}
You can capture the string before the and|or with parentheses, and add it to your array with rangeAtIndex.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(.+?)(\\s+(and|or)\\W+|\\s*$)" options:NSRegularExpressionCaseInsensitive error:&error];
NSMutableArray *phrases = [NSMutableArray array];
[regex enumerateMatchesInString:string options:0 range:NSMakeRange(0, [string length]) usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
    NSRange range = [result rangeAtIndex:1];
    [phrases addObject:[string substringWithRange:range]];
}];
A couple of minor points about my regex:
I added the |\\s*$ construct to capture the last string after the final and|or. If you don't want that, you can eliminate that.
I replaced the second \\s+ (whitespace) with a \\W+ (non-word characters), in case you encountered something like and|or followed by a comma or something else. You could alternatively look explicitly for ,?\\s+ if the comma was the only non-word character you cared about. It just depends upon the specific business problem you're solving.
You might want to replace the first \\s+ with \\W+, too.
If your string contains newline characters, you might want to use the NSRegularExpressionDotMatchesLineSeparators option when you instantiate the NSRegularExpression.
You could replace all matches of the regex with a template string (e.g. ", " or "," etc) and then separate the string components based on that new delimiter.
NSString *stringToBeMatched = @"Your string to be matched";
NSString *regExPattern = @"(\\s)+(and|or)(\\s)+";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regExPattern
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
if (error) {
    // handle error
}
NSString *replacementString = [regex stringByReplacingMatchesInString:stringToBeMatched
                                                              options:0
                                                                range:NSMakeRange(0, stringToBeMatched.length)
                                                         withTemplate:@","];
NSArray *otherItemsInString = [replacementString componentsSeparatedByString:@","];
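One caveat with replacing the delimiters by "," is that any comma already present in the text becomes a split point too. A minimal sketch of the same replace-then-split idea with a separator that is unlikely to occur in ordinary text follows. SplitOnAndOr is a hypothetical helper, and the choice of the ASCII unit separator (0x1F) is an assumption; it only helps if your input never contains that character.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `text` wherever "and" or "or" appears as a
// whitespace-delimited word (case-insensitively).
static NSArray *SplitOnAndOr(NSString *text) {
    // Assumed-safe separator: the ASCII unit separator control character.
    NSString *separator = @"\x1F";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\s+(?:and|or)\\s+"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return @[text]; // the pattern failed to compile; return the input unsplit
    }
    // Swap every delimiter for the separator, then split on the separator.
    NSString *joined = [regex stringByReplacingMatchesInString:text
                                                       options:0
                                                         range:NSMakeRange(0, text.length)
                                                  withTemplate:separator];
    return [joined componentsSeparatedByString:separator];
}
```

With "apples and oranges OR pears, plums" as input, the comma stays inside the last item instead of producing an extra one.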

iOS: extract a substring of an NSString in Objective-C

I have an NSString as:
"<a href='javascript:void(null)' onclick='handleCommandForAnchor(this, 10);return false;'>12321<\/a>"
I need to extract the 12321 near the end of the NSString and store it.
First I tried
NSString *shipNumHtml=[mValues objectAtIndex:1];
NSInteger htmlLen=[shipNumHtml length];
NSString *shipNum=[[shipNumHtml substringFromIndex:htmlLen-12]substringToIndex:8];
But then I found out that number 12321 can be of variable length.
I can't find a method like Java's indexOf() to find the '>' and '<' and then take the substring between those indices. All the answers I've found on SO either know what substring to search for or know the location of the substring. Any help?
I don't usually advocate using regular expressions for parsing HTML content, but a regex matching >(\d+)< would do the job for this simple string.
Here is a simple example:
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@">(\\d+)<"
                                                                       options:0
                                                                         error:&error];
// Handle error != nil
NSTextCheckingResult *match = [regex firstMatchInString:string
                                                options:0
                                                  range:NSMakeRange(0, [string length])];
if (match) {
    // Capture group 1 holds the digits between ">" and "<"
    NSRange matchRange = [match rangeAtIndex:1];
    NSString *number = [string substringWithRange:matchRange];
    NSLog(@"Number: %@", number);
}
As @HaneTV says, you can use the NSString method rangeOfString: to search for substrings. Given that the characters ">" and "<" appear in multiple places in your string, you might want to take a look at NSRegularExpression and/or NSScanner.
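As a rough illustration of the NSScanner route, the sketch below skips to the first ">" and collects everything up to the next "<". TextAfterFirstTag is a hypothetical helper, and it assumes the wanted text directly follows the first ">" in the string, as it does in the question's example; it is not a general HTML parser.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the first ">" and the next "<",
// or nil when that pattern is not found or the text in between is empty.
static NSString *TextAfterFirstTag(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil; // keep whitespace significant
    NSString *text = nil;
    // Advance to the first ">" (the result is ignored so that a string
    // starting with ">" still works), then step over it.
    [scanner scanUpToString:@">" intoString:NULL];
    if (![scanner scanString:@">" intoString:NULL]) {
        return nil; // no ">" in the string
    }
    // Collect characters up to the next "<".
    if (![scanner scanUpToString:@"<" intoString:&text]) {
        return nil; // nothing between ">" and "<"
    }
    if ([scanner isAtEnd]) {
        return nil; // ran off the end without meeting a "<"
    }
    return text;
}
```

Applied to the anchor string from the question, this should yield 12321 regardless of how many digits the number has.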
This may help you a bit; I've just tested it:
NSString *_string = @"<a href='javascript:void(null)' onclick='handleCommandForAnchor(this, 10);return false;'>12321</a>";
NSError *_error = nil;
NSRegularExpression *_regExp = [NSRegularExpression regularExpressionWithPattern:@">(.*)<" options:NSRegularExpressionCaseInsensitive error:&_error];
NSArray *_matchesInString = [_regExp matchesInString:_string options:NSMatchingReportCompletion range:NSMakeRange(0, _string.length)];
[_matchesInString enumerateObjectsUsingBlock:^(NSTextCheckingResult *result, NSUInteger idx, BOOL *stop) {
    // Range 0 is the whole match; range 1 is the captured group
    for (int i = 0; i < result.numberOfRanges; i++) {
        NSString *_match = [_string substringWithRange:[result rangeAtIndex:i]];
        NSLog(@"%@", _match);
    }
}];

How to use regular expressions to find words that begin with a three character prefix

My goal is to count the number of words (in a string) that begin with a specified prefix of more than one letter. A case is words that begin with "non". So in this example...
NSString *theFullTestString = @"nonsense non-issue anonymous controlWord";
...I want to get hits on "nonsense" and "non-issue", but not on "anonymous" or "controlWord". The total count of my hits should be 2.
So here's my test code which seems close, but none of the regular expression forms I've tried works correctly. This code catches "nonsense" (correct) and "anonymous" (wrong) but not "non-issue" (wrong). Its count is 2, but for the wrong reason.
NSUInteger countOfNons = 0;
NSString *theFullTestString = @"nonsense non-issue anonymous controlWord";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"non(\\w+)" options:0 error:&error];
NSArray *matches = [regex matchesInString:theFullTestString options:0 range:NSMakeRange(0, theFullTestString.length)];
for (NSTextCheckingResult *match in matches) {
    NSRange wordRange = [match rangeAtIndex:1];
    NSString *word = [theFullTestString substringWithRange:wordRange];
    ++countOfNons;
    NSLog(@"Found word:%@ countOfNons:%lu", word, (unsigned long)countOfNons);
}
I'm stumped.
The regex \bnon[\w-]* should do the trick:
\bnon[\w-]*
\b      start of a word
non     begins with non
[\w-]   an alphanumeric character, or a hyphen
*       the characters after 'non', zero or more times
So, in your case:
NSUInteger countOfNons = 0;
NSString *theFullTestString = @"nonsense non-issue anonymous controlWord";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\bnon[\\w-]*)" options:0 error:&error];
NSArray *matches = [regex matchesInString:theFullTestString options:0 range:NSMakeRange(0, theFullTestString.length)];
for (NSTextCheckingResult *match in matches) {
    NSRange wordRange = [match rangeAtIndex:1];
    NSString *word = [theFullTestString substringWithRange:wordRange];
    ++countOfNons;
    NSLog(@"Found word:%@ countOfNons:%lu", word, (unsigned long)countOfNons);
}
I think regular expressions are a bit of overkill here.
NSString *words = @"nonsense non-issue anonymous controlWord";
NSArray *wordsArr = [words componentsSeparatedByString:@" "];
int count = 0;
for (NSString *word in wordsArr) {
    if ([word hasPrefix:@"non"]) {
        count++;
        NSLog(@"%dth match: %@", count, word);
    }
}
NSLog(@"Count: %d", count);
There is an easier way to do this. You can use NSPredicate with the format BEGINSWITH[c] %@.
Sample code:
NSPredicate *resultPredicate = [NSPredicate predicateWithFormat:@"Firstname BEGINSWITH[c] %@", text];
NSArray *results = [People filteredArrayUsingPredicate:resultPredicate];
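The sample above filters objects by a Firstname property. For the question's plain array of words, a sketch along the same lines can test each string itself with SELF. CountWordsWithPrefix is a hypothetical helper, and splitting on whitespace is an assumption about what counts as a word:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: counts the whitespace-separated words in `text` that
// begin with `prefix`, ignoring case.
static NSUInteger CountWordsWithPrefix(NSString *text, NSString *prefix) {
    NSArray *words = [text componentsSeparatedByCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    // SELF refers to each string in the array; [c] makes the test case-insensitive.
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF BEGINSWITH[c] %@", prefix];
    return [words filteredArrayUsingPredicate:predicate].count;
}
```

For the question's test string and the prefix "non", this should count nonsense and non-issue but not anonymous.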
