Is this an NSScanner bug? - ios

The code snippet is as follows:
unichar chars[] = {0x0030, 0x0031, 0x0032, 0x003B, 0x0E31};//the testString is "012;" plus a thai character
NSString *testString = [[NSString alloc] initWithCharacters:chars length:5];
NSLog(@"testString %@", testString);
NSScanner *theScanner = [NSScanner scannerWithString:testString];
NSString *result = nil;
[theScanner scanUpToString:@";" intoString:&result];
//[theScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@";"] intoString:&result];
NSLog(@"the result is %@", result);
Using scanUpToString:intoString: fails; however, using scanUpToCharactersFromSet:intoString: works. If the character after 0x003B is not 0x0E31 (for example, 0x0030), both APIs work.
So I guess scanUpToString: has a bug when dealing with some characters.
Does anyone have a better idea?
Thank you.
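One possible explanation, stated here as an assumption rather than a confirmed diagnosis: U+0E31 (THAI CHARACTER MAI HAN-AKAT) is a combining mark, so the default non-literal string comparison may treat ";" plus the mark as a single composed character sequence and never match a bare ";". scanUpToCharactersFromSet: checks characters individually, which would be consistent with it still working. Below is a minimal sketch of a workaround that uses a literal search instead of the scanner; PrefixBeforeLiteral is a hypothetical helper name, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): returns the part of `string`
// before the first literal occurrence of `delimiter`, or the whole string
// if the delimiter does not occur.
static NSString *PrefixBeforeLiteral(NSString *string, NSString *delimiter) {
    // NSLiteralSearch asks for exact character-by-character equivalence,
    // so a combining mark that follows the delimiter should not hide it.
    NSRange range = [string rangeOfString:delimiter options:NSLiteralSearch];
    if (range.location == NSNotFound) {
        return string;
    }
    return [string substringToIndex:range.location];
}
```

With the testString from the question, PrefixBeforeLiteral(testString, @";") is meant to give the same text that the scanUpToCharactersFromSet: variant produces.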

Related

Extract a String out of a specific set of strings

I have a text as:
sometext[string1 string2]someText
I want to retrieve string1 and string2 as separate strings from this text
How can I parse it in Objective-C?
I have found the solution:
NSArray *arrayOne = [prettyFunctionString componentsSeparatedByString:@"["];
NSString *parsedOne = [arrayOne objectAtIndex:1];
NSArray *arrayTwo = [parsedOne componentsSeparatedByString:@"]"];
NSString *parsedTwo = [arrayTwo objectAtIndex:0];
NSArray *arrayThree = [parsedTwo componentsSeparatedByString:@" "];
NSString *className = [arrayThree objectAtIndex:0];
NSString *functionName = [arrayThree objectAtIndex:1];
Thanks anyway.
Maybe something like this could work for you:
NSString * string = @"sometext[string1 string2]sometext";
// Backslashes are doubled so the regex engine sees \[ and \].
NSString * pattern = @"(.*)\\[(.+) (.+)\\](.*)";
NSRegularExpression * expression = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:NULL];
NSTextCheckingResult * match = [expression firstMatchInString:string options:NSMatchingReportCompletion range:NSMakeRange(0, string.length)];
if (match) {
NSString * substring1 = [string substringWithRange:[match rangeAtIndex:2]];
NSString * substring2 = [string substringWithRange:[match rangeAtIndex:3]];
// do something with substring1 and substring2
}
You can use this simple approach:
NSString *str = @"sometext[string1 string2]someText";
NSInteger loc1 = [str localizedStandardRangeOfString:@"["].location;
NSInteger loc2 = [str localizedStandardRangeOfString:@"]"].location;
// The length excludes both brackets: loc2 - loc1 - 1.
NSString *resultString = [str substringWithRange:(NSRange){loc1+1, loc2-loc1-1}];
NSArray *resultArry = [resultString componentsSeparatedByString:@" "];
resultArry will contain the required result.
For completeness - if you are trying to extract strings out of a string with a known pattern, then an NSScanner is the way to go.
This goes through the string in one pass.
NSString *string = @"sometext[string1 string2]someText";
NSScanner *scanner = [NSScanner scannerWithString:string];
NSString *str1;
NSString *str2;
[scanner scanUpToString:@"[" intoString:nil]; // Scan up to the '[' character.
[scanner scanString:@"[" intoString:nil]; // Scan the '[' character and discard it.
[scanner scanUpToCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString: &str1]; // Scan all the characters up to the whitespace and accumulate the characters into 'str1'
[scanner scanUpToCharactersFromSet:[NSCharacterSet alphanumericCharacterSet] intoString:nil]; // Scan up to the next alphanumeric character and discard the result.
[scanner scanUpToString:@"]" intoString:&str2]; // Scan up to the ']' character, accumulate the characters into 'str2'
// Log the output.
NSLog(@"First String: %@", str1);
NSLog(@"Second String: %@", str2);
Which puts the output into the console of:
2015-09-23 11:31:02.522 StringExtractor[46678:4289499] First String: string1
2015-09-23 11:31:02.522 StringExtractor[46678:4289499] Second String: string2

iOS: How to find the full RSS feed link with the NSScanner class

I am working on a project that fetches data from RSS feeds. From searching on Google, I found that the RSS link generally appears in this format in the HTML source:
<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="http://feeds.abcnews.com/abcnews/topstories" />
So I have to use the NSScanner class to find the RSS feed link in the HTML source, but I don't know the proper pattern or what to pass to scanUpToString:, characterSetWithCharactersInString:, and so on.
Please help me find the full link of the RSS feed.
Here is my attempt:
- (void)viewDidLoad {
NSString *googleString = @"http://abcnews.go.com/";
NSURL *googleURL = [NSURL URLWithString:googleString];
NSError *error;
NSString *googlePage = [NSString stringWithContentsOfURL:googleURL encoding:NSASCIIStringEncoding
error:&error];
NSLog(@"%@",[self yourStringArrayWithHTMLSourceString:googlePage]);//will return NSMutableArray
}
-(NSMutableArray *)yourStringArrayWithHTMLSourceString:(NSString *)html
{
NSString *from = @"<a href=\"";
NSString *to = @"</a>";
NSMutableArray *array = [[NSMutableArray alloc]init];
NSScanner* scanner = [NSScanner scannerWithString:html];
[scanner scanUpToString:@"<link" intoString:nil];
if (![scanner isAtEnd]) {
NSString *url = nil;
[scanner scanUpToString:@"RSS Feed" intoString:nil];
NSCharacterSet *charset = [NSCharacterSet characterSetWithCharactersInString:@"/>"];
[scanner scanUpToCharactersFromSet:charset intoString:nil];
[scanner scanCharactersFromSet:charset intoString:nil];
[scanner scanUpToCharactersFromSet:charset intoString:&url];
NSLog(@"%@",url);
// "url" now contains the URL of the img
}
return array;
}
Currently I am able to find only part of the link with this code.
Output:
But the full link is:
http://feeds.abcnews.com/abcnews/topstories
That is because
[NSCharacterSet characterSetWithCharactersInString:@"/>"];
contains the character "/", which is the last character of http://
and also the character right after feeds.abcnews.com.
Edit: Here's a playground which shows the approach you could take. (Not fully tested.)
It's in Swift but the API is the same in Obj-C.
var str = "<link rel=\"alternate\" type=\"application/rss+xml\" title=\"RSS Feed\" href=\"http://feeds.abcnews.com/abcnews/topstories\" />";
var scanner = NSScanner.init(string: str);
var result: NSString? = nil
scanner.scanUpToString("href=\"", intoString: nil);
scanner.scanString("href=\"", intoString: nil);
scanner.scanUpToString("\" />", intoString: &result);
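For reference, the same idea can be sketched in Objective-C. This is only an illustration under stated assumptions: FirstRSSFeedURL is a hypothetical helper name, the tag is assumed to list type before href (as in the sample tag from the question), and attribute values are assumed to be double-quoted.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the href that follows the first
// type="application/rss+xml" attribute, or nil if none is found.
// Assumes type comes before href and that values are double-quoted.
static NSString *FirstRSSFeedURL(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    NSString *typeMarker = @"type=\"application/rss+xml\"";
    NSString *hrefMarker = @"href=\"";
    NSString *url = nil;

    // Move to the RSS type attribute and step over it.
    [scanner scanUpToString:typeMarker intoString:NULL];
    if (![scanner scanString:typeMarker intoString:NULL]) {
        return nil; // no RSS <link> tag in this document
    }
    // Move to the href attribute and step over its opening quote.
    [scanner scanUpToString:hrefMarker intoString:NULL];
    if (![scanner scanString:hrefMarker intoString:NULL]) {
        return nil;
    }
    // Everything up to the closing quote is the URL, slashes included.
    if (![scanner scanUpToString:@"\"" intoString:&url]) {
        return nil;
    }
    return url;
}
```

Stopping at the closing quote rather than at a character set containing "/" avoids cutting the URL at its first slash.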
Use "link" instead of "a" tags from this reference.
Reference : Regular expression in ios to extract href url and discard rest of anchor tag

Getting just the phone number digits from kABPersonPhoneProperty in iOS [duplicate]

I have an NSString (phone number) with some parentheses and hyphens, as some phone numbers are formatted. How would I remove all characters except numbers from the string?
Old question, but how about:
NSString *newString = [[origString componentsSeparatedByCharactersInSet:
[[NSCharacterSet decimalDigitCharacterSet] invertedSet]]
componentsJoinedByString:@""];
It explodes the source string on the set of non-digits, then reassembles them using an empty string separator. Not as efficient as picking through characters, but much more compact in code.
There's no need to use a regular expressions library as the other answers suggest -- the class you're after is called NSScanner. It's used as follows:
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
}
}
NSLog(@"%@", strippedString); // "123123123"
EDIT: I've updated the code because the original was written off the top of my head and I figured it would be enough to point the people in the right direction. It seems that people are after code they can just copy-paste straight into their application.
I also agree that Michael Pelz-Sherman's solution is more appropriate than using NSScanner, so you might want to take a look at that.
The accepted answer is overkill for what is being asked. This is much simpler:
NSString *pureNumbers = [[phoneNumberString componentsSeparatedByCharactersInSet:[[NSCharacterSet decimalDigitCharacterSet] invertedSet]] componentsJoinedByString:@""];
This is great, but the code does not work for me on the iPhone 3.0 SDK.
If I define strippedString as you show here, I get a BAD ACCESS error when trying to print it after the scanCharactersFromSet:intoString call.
If I do it like so:
NSMutableString *strippedString = [NSMutableString stringWithCapacity:10];
I end up with an empty string, but the code doesn't crash.
I had to resort to good old C instead:
for (int i=0; i<[phoneNumber length]; i++) {
if (isdigit([phoneNumber characterAtIndex:i])) {
[strippedString appendFormat:@"%c",[phoneNumber characterAtIndex:i]];
}
}
Though this is an old question with working answers, I missed international format support. Based on the solution of simonobo, the altered character set includes a plus sign "+". International phone numbers are supported by this amendment as well.
NSString *condensedPhoneNumber = [[phoneNumber componentsSeparatedByCharactersInSet:
[[NSCharacterSet characterSetWithCharactersInString:@"+0123456789"]
invertedSet]]
componentsJoinedByString:@""];
The Swift expressions are
var phoneNumber = " +1 (234) 567-1000 "
var allowedCharactersSet = NSMutableCharacterSet.decimalDigitCharacterSet()
allowedCharactersSet.addCharactersInString("+")
var condensedPhoneNumber = phoneNumber.componentsSeparatedByCharactersInSet(allowedCharactersSet.invertedSet).joinWithSeparator("")
Which yields +12345671000 as a common international phone number format.
Here is the Swift version of this.
import UIKit
import Foundation
var phoneNumber = " 1 (888) 555-5551 "
var strippedPhoneNumber = "".join(phoneNumber.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Swift version of the most popular answer:
var newString = join("", oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Edit: Syntax for Swift 2
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
Edit: Syntax for Swift 3
let newString = oldString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Thanks for the example. Only one thing is missing: incrementing the scanLocation when a character in originalString is not in the numbers character set. I have added an else {} branch to fix this.
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
[strippedString appendString:buffer];
}
// --------- Add the following to get out of endless loop
else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
}
// --------- End of addition
}
NSLog(@"%@", strippedString); // "123123123"
This keeps only the digits of a mobile number:
NSString * strippedNumber = [mobileNumber stringByReplacingOccurrencesOfString:@"[^0-9]" withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, [mobileNumber length])];
It might be worth noting that the accepted componentsSeparatedByCharactersInSet: and componentsJoinedByString:-based answer is not a memory-efficient solution. It allocates memory for the character set, for an array and for a new string. Even if these are only temporary allocations, processing lots of strings this way can quickly fill the memory.
A memory friendlier approach would be to operate on a mutable copy of the string in place. In a category over NSString:
-(NSString *)stringWithNonDigitsRemoved {
static NSCharacterSet *decimalDigits;
if (!decimalDigits) {
decimalDigits = [NSCharacterSet decimalDigitCharacterSet];
}
NSMutableString *stringWithNonDigitsRemoved = [self mutableCopy];
for (CFIndex index = 0; index < stringWithNonDigitsRemoved.length; ++index) {
unichar c = [stringWithNonDigitsRemoved characterAtIndex: index];
if (![decimalDigits characterIsMember: c]) {
[stringWithNonDigitsRemoved deleteCharactersInRange: NSMakeRange(index, 1)];
index -= 1;
}
}
return [stringWithNonDigitsRemoved copy];
}
Profiling the two approaches has shown this one using about 2/3 less memory.
You can use a regular expression on a mutable string:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:
@"[^\\d]"
options:0
error:nil];
[regex replaceMatchesInString:str
options:0
range:NSMakeRange(0, str.length)
withTemplate:@""];
Built the top solution as a category to help with broader problems:
Interface:
@interface NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
with:(NSString *)string;
@end
Implementation:
@implementation NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
with:(NSString *)string
{
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:self.length];
NSScanner *scanner = [NSScanner scannerWithString:self];
while ([scanner isAtEnd] == NO) {
NSString *buffer;
if ([scanner scanCharactersFromSet:set intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
[scanner setScanLocation:([scanner scanLocation] + 1)];
[strippedString appendString:string];
}
}
return [NSString stringWithString:strippedString];
}
@end
Usage:
NSString *strippedString =
[originalString stringByReplacingCharactersNotInSet:
[NSCharacterSet characterSetWithCharactersInString:@"0123456789"]
with:@""];
Swift 3
let notNumberCharacters = CharacterSet.decimalDigits.inverted
// Note: trimmingCharacters(in:) only strips non-digits from both ends of the string.
let intString = yourString.trimmingCharacters(in: notNumberCharacters)
Swift 4.1
var str = "75003 Paris, France"
var stringWithoutDigit = (str.components(separatedBy:CharacterSet.decimalDigits)).joined(separator: "")
print(stringWithoutDigit)
Um. The first answer seems totally wrong to me. NSScanner is really meant for parsing. Unlike regex, it has you parsing the string one tiny chunk at a time. You initialize it with a string, and it maintains an index of how far along the string it's gotten. That index is always its reference point, and any commands you give it are relative to that point. You tell it, "ok, give me the next chunk of characters in this set" or "give me the integer you find in the string", and those start at the current index, and move forward until they find something that doesn't match. If the very first character already doesn't match, then the method returns NO, and the index doesn't increment.
The code in the first example is scanning "(123)456-7890" for decimal characters, which already fails from the very first character, so the call to scanCharactersFromSet:intoString: leaves the passed-in strippedString alone, and returns NO. The code never checks the return value, leaving the strippedString unassigned. Even if the first character were a digit, that code would fail, since it would only return the digits it finds up until the first dash or paren or whatever.
If you really wanted to use NSScanner, you could put something like that in a loop, and keep checking for a NO return value, and if you get that you can increment the scanLocation and scan again; and you also have to check isAtEnd, and yada yada yada. In short, wrong tool for the job. Michael's solution is better.
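To make the "returns NO and the index doesn't increment" behaviour concrete, here is a small sketch. LeadingDigits is a hypothetical helper name introduced only for illustration; it disables whitespace skipping so the reported index reflects the scan attempt alone.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tries to scan one run of decimal digits from the
// start of `string`. Returns the digits, or nil if the first character is
// not a digit. `outLocation` (optional) receives the scanner's index after
// the attempt.
static NSString *LeadingDigits(NSString *string, NSUInteger *outLocation) {
    NSScanner *scanner = [NSScanner scannerWithString:string];
    scanner.charactersToBeSkipped = nil; // do not silently skip whitespace
    NSString *digits = nil;
    BOOL found = [scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
                                     intoString:&digits];
    if (outLocation != NULL) {
        *outLocation = scanner.scanLocation;
    }
    return found ? digits : nil;
}
```

For "(123)456-7890" the scan fails on the opening parenthesis and the index stays at 0; for "123-456" it returns "123" and stops at index 3, just before the dash.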
For those searching for phone extraction, you can extract the phone numbers from a text using NSDataDetector, for example:
NSString *userBody = @"This is a text with 30612312232 my phone";
if (userBody != nil) {
NSError *error = NULL;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypePhoneNumber error:&error];
NSArray *matches = [detector matchesInString:userBody options:0 range:NSMakeRange(0, [userBody length])];
if (matches != nil) {
for (NSTextCheckingResult *match in matches) {
if ([match resultType] == NSTextCheckingTypePhoneNumber) {
NSLog(@"Found phone number %@", [match phoneNumber]);
}
}
}
}
I created a category on NSString to simplify this common operation.
NSString+AllowCharactersInSet.h
@interface NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet;
@end
NSString+AllowCharactersInSet.m
@implementation NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet {
NSMutableString *strippedString = [NSMutableString
stringWithCapacity:self.length];
NSScanner *scanner = [NSScanner scannerWithString:self];
while (!scanner.isAtEnd) {
NSString *buffer = nil;
if ([scanner scanCharactersFromSet:characterSet intoString:&buffer]) {
[strippedString appendString:buffer];
} else {
scanner.scanLocation = scanner.scanLocation + 1;
}
}
return strippedString;
}
@end
I think the best way currently is:
phoneNumber.replacingOccurrences(of: "\\D",
with: "",
options: String.CompareOptions.regularExpression)
If you're just looking to grab the numbers from the string, you could certainly use regular expressions to parse them out. For doing regex in Objective-C, check out RegexKit. Edit: As @Nathan points out, using NSScanner is a much simpler way to parse all numbers from a string. I totally wasn't aware of that option, so props to him for suggesting it. (I don't even like using regex myself, so I prefer approaches that don't require them.)
If you want to format phone numbers for display, it's worth taking a look at NSNumberFormatter. I suggest you read through this related SO question for tips on doing so. Remember that phone numbers are formatted differently depending on location and/or locale.
Swift 5
let newString = origString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Based on Jon Vogel's answer, here it is as a Swift String extension along with some basic tests.
import Foundation
extension String {
func stringByRemovingNonNumericCharacters() -> String {
return self.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
}
}
And some tests proving at least basic functionality:
import XCTest
class StringExtensionTests: XCTestCase {
func testStringByRemovingNonNumericCharacters() {
let baseString = "123"
var testString = baseString
var newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == testString)
testString = "a123b"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == baseString)
testString = "a=1-2_3#b"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == baseString)
testString = "(999) 999-9999"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString.characters.count == 10)
XCTAssertTrue(newString == "9999999999")
testString = "abc"
newString = testString.stringByRemovingNonNumericCharacters()
XCTAssertTrue(newString == "")
}
}
This answers the OP's question but it could be easily modified to leave in phone number related characters like ",;*#+"
NSString *originalPhoneNumber = @"(123) 123-456 abc";
NSCharacterSet *numbers = [[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet];
// Note: trimming only removes non-digits from both ends of the string.
NSString *trimmedPhoneNumber = [originalPhoneNumber stringByTrimmingCharactersInSet:numbers];
Keep it simple!

Getting multiple tags from html source code in Objective-C

I have extracted the source code from a website, but I would like to display the strings of three URLs. I have managed to strip the code so the only URLs are the ones I need. How can I get the three strings in an array? The URLs look like this: Example
where I need to extract the string: 'example'
I have tried NSScanner but without any luck. Please advise.
Not the most clever way, but you can find the first occurrence of > and then the first < from there, all with standard NSString methods like rangeOfString: and such.
This code with NSScanner should give you luck :)
-(NSMutableArray *)yourStringArrayWithHTMLSourceString:(NSString *)html {
NSString *from = @"<a href=\"";
NSString *to = @"</a>";
NSMutableArray *array = [[NSMutableArray alloc]init];
NSScanner* scanner = [NSScanner scannerWithString:html];
for(int x=0;x<3;x++) {//You said only 3 strings
NSString *tempString;
[scanner scanUpToString:from intoString:nil];
[scanner scanString:from intoString:nil];
[scanner scanUpToString:to intoString:&tempString];
NSString *str = [tempString substringFromIndex:[tempString rangeOfString:@"\">"].location+2];
[array addObject:str];
}
return array;
}
usage:
for example:
NSString *html = [NSString stringWithContentsOfURL:[NSURL URLWithString:@"http://facebook.com"] encoding:NSUTF8StringEncoding error:nil];
NSLog(@"%@",[self yourStringArrayWithHTMLSourceString:html]);//will return NSMutableArray
Here is how to convert an NSMutableArray to an NSArray if you would like to do that:
NSArray *array = [NSArray arrayWithArray:mutableArray];

Extract substring from a string in iOS?

Is there any way to extract a substring from a string like the ones below?
My real string is "NS09A" or "AB455A", but I want only "NS09" or "AB455" (up to the end of the numeric part of the original string).
How can I extract this?
The search results I found extract a substring using its start and end positions, but here the string can be any combination of "letters + number + letters", and I need only the "letters + number" part.
Perhaps not everybody will agree, but I like regular expressions. They allow you to specify
precisely what you are looking for:
NSString *string = @"AB455A";
// One or more "word characters", followed by one or more "digits":
NSString *pattern = @"\\w+\\d+";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
options:0
error:NULL];
NSTextCheckingResult *match = [regex firstMatchInString:string
options:NSMatchingAnchored
range:NSMakeRange(0, [string length])];
if (match != nil) {
NSString *extracted = [string substringWithRange:[match range]];
NSLog(@"%@", extracted);
// Output: AB455
} else {
// Input string is not of the expected form.
}
Try this:
NSString *str=@"ASRF12353FYTEW";
NSString *resultStr;
for(int i=0;i<[str length];i++){
NSString *character = [str substringFromIndex: [str length] - i];
if([character intValue]){
resultStr=[str substringToIndex:[str length]-i+1];
break;
}
}
NSLog(@"RESULT STRING %@",resultStr);
I tested this code:
NSString *originalString = @"NS09A";
// Intermediate
NSString *numberString;
NSString *numberString1;
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet characterSetWithCharactersInString:@"0123456789"];
[scanner scanUpToCharactersFromSet:numbers intoString:&numberString];
[scanner scanCharactersFromSet:numbers intoString:&numberString1];
NSString *result=[NSString stringWithFormat:@"%@%@",numberString,numberString1];
NSLog(@"Finally ==%@",result);
Hope it helps.
OUTPUT
Finally ==NS09
UPDATE:
NSString *originalString = @"kirtimali@gmail.com";
NSString *result;
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *cs1 = [NSCharacterSet characterSetWithCharactersInString:@"@"];
[scanner scanUpToCharactersFromSet:cs1 intoString:&result];
NSLog(@"Finally ==%@",result);
output:
Finally ==kirtimali
Use NSScanner and the scanUpToCharactersFromSet:intoString: method to specify which characters should be used to stop the parsing. This could be in a loop with some logic or it could be applied in conjunction with setScanLocation: if you already have a method of finding the start of each section you want to extract.
When using scanUpToCharactersFromSet:intoString: you are looking for the next invalid character. It doesn't need to be a 'special' character (in a unicode sense), just a known set of characters that aren't valid for the content you want. So, you might use:
[[NSCharacterSet characterSetWithCharactersInString:@"1234567890"] invertedSet]
You can use - (NSString *)substringWithRange:(NSRange)aRange method on NSString class to get a substring extracted. Use NSMakeRange to create the NSRange object.
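As a rough illustration of the substringWithRange: suggestion applied to the question's strings, the range can be built from the position of the last digit. PrefixThroughLastDigit is a hypothetical helper name, and the sketch assumes the wanted part always starts at index 0 and ends at the last ASCII digit.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the substring from the start of `string`
// through its last ASCII digit, or nil if the string contains no digit.
static NSString *PrefixThroughLastDigit(NSString *string) {
    NSCharacterSet *digits =
        [NSCharacterSet characterSetWithCharactersInString:@"0123456789"];
    // Search from the end so that trailing letters are left out.
    NSRange lastDigit = [string rangeOfCharacterFromSet:digits
                                                options:NSBackwardsSearch];
    if (lastDigit.location == NSNotFound) {
        return nil;
    }
    // NSMaxRange gives the index just past the last digit.
    return [string substringWithRange:NSMakeRange(0, NSMaxRange(lastDigit))];
}
```

For "NS09A" this yields "NS09", and for "AB455A" it yields "AB455".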
