How can I get the unique characters in an NSString?
What I'm trying to do is get all the illegal characters in an NSString so that I can tell the user which ones were entered and therefore need to be removed. I start by defining an NSCharacterSet of legal characters, split the string at every occurrence of a legal character, and join what's left (only the illegal ones) into a new NSString. I now want to get the unique characters of that new NSString (as an array, ideally), but I couldn't find a reference anywhere.
NSCharacterSet *legalCharacterSet = [NSCharacterSet
    characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLKMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];
NSString *illegalCharactersInTitle = [[self.titleTextField.text.noWhitespace
    componentsSeparatedByCharactersInSet:legalCharacterSet]
    componentsJoinedByString:@""];
That should help you. I couldn't find any ready to use function for that.
NSMutableSet *uniqueCharacters = [NSMutableSet set];
NSMutableString *uniqueString = [NSMutableString string];
[illegalCharactersInTitle enumerateSubstringsInRange:NSMakeRange(0, illegalCharactersInTitle.length) options:NSStringEnumerationByComposedCharacterSequences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![uniqueCharacters containsObject:substring]) {
        [uniqueCharacters addObject:substring];
        [uniqueString appendString:substring];
    }
}];
Try with the following adaptation of your code:
// legal set
NSCharacterSet *legalCharacterSet = [NSCharacterSet
    characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLKMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];
// test strings
NSString *myString = @"LegalStrin()";
//NSString *myString = @"francesco@gmail.com"; // illegal string
// the set must be created as mutable so that it can be intersected in place below
NSMutableCharacterSet *stringSet = [NSMutableCharacterSet characterSetWithCharactersInString:myString];
// inverts the set
NSCharacterSet *illegalCharacterSet = [legalCharacterSet invertedSet];
// intersection of the string set and the illegal set that modifies the mutable stringSet itself
[stringSet formIntersectionWithCharacterSet:illegalCharacterSet];
// prints out the illegal characters with the convenience method
NSLog(@"IllegalStringSet: %@", [self stringForCharacterSet:stringSet]);
I adapted the method to print from another stackoverflow question:
- (NSString*)stringForCharacterSet:(NSCharacterSet*)characterSet
{
    NSMutableString *toReturn = [@"" mutableCopy];
    unichar unicharBuffer[20];
    int index = 0;
    for (unichar uc = 0; uc < (0xFFFF); uc ++)
    {
        if ([characterSet characterIsMember:uc])
        {
            unicharBuffer[index] = uc;
            index ++;
            if (index == 20)
            {
                NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
                [toReturn appendString:characters];
                index = 0;
            }
        }
    }
    if (index != 0)
    {
        NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
        [toReturn appendString:characters];
    }
    return toReturn;
}
First of all, you have to be careful about what you consider characters. The API of NSString uses the word characters when talking about what Unicode refers to as UTF-16 code units, but dealing with code units in isolation will not give you what users think of as characters. For example, there are combining characters that compose with the previous character to produce a different glyph. Also, there are surrogate pairs, which only make sense when, um, paired.
As a result, you will actually need to collect substrings which contain what the user thinks of as characters.
I was about to write code very similar to Grzegorz Krukowski's answer. He beat me to it, so I won't, but I will add that your code to filter out the legal characters is broken for the reasons I cite above. For example, if the text contains "é" and it's decomposed as "e" plus a combining acute accent, your code will strip the "e", leaving a dangling combining acute accent. I believe your intent is to treat the "é" as illegal.
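To make the point about composed character sequences concrete, here is a minimal sketch (not taken from any answer above) that collects the unique illegal "characters" as an array. The function name IllegalCharacters is hypothetical, and the sketch assumes that a composed sequence counts as illegal as soon as any of its code units falls outside the legal set:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns each illegal user-perceived character once,
// in order of first appearance.
static NSArray<NSString *> *IllegalCharacters(NSString *text, NSCharacterSet *legal) {
    NSCharacterSet *illegal = [legal invertedSet];
    NSMutableOrderedSet<NSString *> *found = [NSMutableOrderedSet orderedSet];
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        // Assumption: one illegal code unit makes the whole sequence illegal,
        // so "e" + combining accent is reported as a single "é".
        if ([substring rangeOfCharacterFromSet:illegal].location != NSNotFound) {
            [found addObject:substring];
        }
    }];
    return found.array;
}
```

With a legal set of plain ASCII letters, a decomposed "é" comes back as one entry instead of leaving a stray combining accent behind.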
I can't remove the white space from a phone number in my iOS app.
Here is my code.
ABMultiValueRef multiPhones = ABRecordCopyValue(person, kABPersonPhoneProperty);
for (CFIndex iPhone = 0; iPhone < ABMultiValueGetCount(multiPhones); iPhone++)
{
    CFStringRef phoneNumberRef = ABMultiValueCopyValueAtIndex(multiPhones, iPhone);
    NSString *phoneNumber = (__bridge NSString *) phoneNumberRef;
    if (phoneNumber == nil) {
        phoneNumber = @"";
    }
    if (phoneNumber.length == 0) continue;
    // phone number = (217) 934-3234
    phoneNumber = [phoneNumber stringByReplacingOccurrencesOfString:@"(" withString:@""];
    phoneNumber = [phoneNumber stringByReplacingOccurrencesOfString:@")" withString:@""];
    phoneNumber = [phoneNumber stringByReplacingOccurrencesOfString:@"-" withString:@""];
    phoneNumber = [phoneNumber stringByReplacingOccurrencesOfString:@" " withString:@""];
    // phone number = 217 9343234
    [phoneNumbers addObject:phoneNumber];
}
I expect to get the number without white space, but the space is not removed from the phone number.
How can I fix this? Please help. Thanks.
You can do something a lot simpler than what you're currently doing with NSCharacterSet. Here's how:
NSCharacterSet defines a collection of characters. There are a few standard ones, such as decimalDigitCharacterSet and alphanumericCharacterSet.
There's also a neat method called invertedSet which returns a character set with all of the characters not included in the current one. Now, we need just one more bit of information.
NSString has a method called componentsSeparatedByCharactersInSet:, which gives you back an NSArray of the parts of the string, broken up around the characters in the characterSet you supply.
NSArray has a complementary method, componentsJoinedByString:, which you can use to turn the elements of an array (back) into a string. See where this is going?
First, define a character set that we want to include in our final output:
NSCharacterSet *digits = [NSCharacterSet decimalDigitCharacterSet];
Now, get everything else.
NSCharacterSet *illegalCharacters = [digits invertedSet];
Once we have the character set that we want, we can break up the string and reconstruct it:
NSArray *components = [phoneNumber componentsSeparatedByCharactersInSet:illegalCharacters];
NSString *output = [components componentsJoinedByString:@""];
That should give you the correct output. Four lines, and you're done:
NSCharacterSet *digits = [NSCharacterSet decimalDigitCharacterSet];
NSCharacterSet *illegalCharacters = [digits invertedSet];
NSArray *components = [phoneNumber componentsSeparatedByCharactersInSet:illegalCharacters];
NSString *output = [components componentsJoinedByString:@""];
You can use the whitespaceCharacterSet to do something similar to trim whitespace off of strings.
NSHipster has a great article about this, too.
EDIT:
If you want to include other symbols, such as the + prefix or parentheses, you can create custom character sets with characterSetWithCharactersInString:. If you have two character sets, such as the decimal digits and a custom one you created, you can use NSMutableCharacterSet to extend one of them with the other's characters.
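As a rough sketch of that last point, the following keeps the decimal digits plus a leading "+". The helper name KeepDigitsAndPlus is made up for illustration. Because it removes everything outside the allowed set, it also drops characters such as a non-breaking space (U+00A0), which a replacement of the literal @" " would not match:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps only decimal digits and "+".
static NSString *KeepDigitsAndPlus(NSString *input) {
    // Start from the standard digit set (it also contains non-ASCII digits)
    // and extend it with the custom characters to keep.
    NSMutableCharacterSet *allowed = [NSMutableCharacterSet decimalDigitCharacterSet];
    [allowed addCharactersInString:@"+"];
    // Split on everything that is not allowed, then glue the pieces together.
    NSArray<NSString *> *parts =
        [input componentsSeparatedByCharactersInSet:[allowed invertedSet]];
    return [parts componentsJoinedByString:@""];
}
```

If only ASCII digits should survive, characterSetWithCharactersInString:@"+0123456789" is the stricter starting point.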
I have a string like this @"abcdefghijklmnopqrstuvwxyzA". As you can see, A is at the end. How can I find the first capital letter and split the strings:
NSString *lower = @"abcdefghijklmnopqrstuvwxyz";
NSString *upper = @"A";
The string in the beginning is static so the capital letter could be ANYTHING. Will this scanner help?
NSString *String = titleLabelLatestNews.text;
NSScanner *stringScanner = [NSScanner scannerWithString:String];
NSString *content = [[NSString alloc] init];
while ([stringScanner isAtEnd] == NO) {
    [stringScanner scanUpToString:@"url=\"" intoString:nil];
    [stringScanner scanUpToString:@"/>" intoString:&content];
}
For another example, @"this is all lower case letters I am awesome"; should become two strings, @"this is all lower case letters"; and @"I am awesome";
Get the idea? Anything before the Capital Letter goes to a string and anything after goes to another string.
An NSScanner will do the trick for you, yes. You just need to create an NSCharacterSet consisting of the capital letters, then use scanUpToCharactersFromSet:intoString:
NSString * s = @"this is all lower case letters I am awesome";
NSScanner * scanner = [NSScanner scannerWithString:s];
NSString * firstPart;
[scanner scanUpToCharactersFromSet:[NSCharacterSet uppercaseLetterCharacterSet]
                        intoString:&firstPart];
NSString * secondPart = [s substringFromIndex:[scanner scanLocation]];
If you insist on using NSScanner, use scanCharactersFromSet:intoString: where the NSCharacterSet is lowercase characters only.
What I would personally do, if anyone cares, is call rangeOfCharacterFromSet(NSCharacterSet.uppercaseLetterCharacterSet()...) and derive the resulting substrings from there.
A better solution is to use NSString's rangeOfCharacterFromSet:
NSString *lowerCaseString=@"";
NSString *upperCaseString=@"";
NSString *stringToSplit = titleLabelLatestNews.text;
NSRange capitalRange=[stringToSplit rangeOfCharacterFromSet:[NSCharacterSet uppercaseLetterCharacterSet]];
if (capitalRange.location == NSNotFound) {
    lowerCaseString=stringToSplit;
}
else if (capitalRange.location ==0 ) {
    upperCaseString=stringToSplit;
}
else {
    // substringToIndex: excludes the index itself, so no "-1" is needed here
    lowerCaseString=[stringToSplit substringToIndex:capitalRange.location];
    upperCaseString=[stringToSplit substringFromIndex:capitalRange.location];
}
NSLog(@"lower case string=%@ uppercase=%@",lowerCaseString,upperCaseString);
For completeness, the regular expression solution:
Use NSRegularExpression
The pattern @"([^A-Z]*)([A-Z].*)" will match what you want if you are only interested in A-Z as uppercase characters (see below for the Unicode change). Broken down, this is two groups, (...), one for before and one for after: the first group is anything which is not uppercase, [^A-Z], zero or more times, *; the second group is an uppercase letter, [A-Z], followed by anything, .*.
Use firstMatchInString:options:range:; the NSTextCheckingResult will contain the ranges of the two matched groups.
If you wish to allow for Unicode's myriad of uppercase and titlecase letters just change A-Z above to \\p{Lu}\\p{Lt} (make sure you type the double-backslashes, you are passing a backslash to NSRegularExpression). Those two are all the Unicode uppercase letters, \\p{Lu}, and all the title case letters, \\p{Lt}.
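A minimal sketch of the steps above, using the Unicode variant of the pattern. The function name SplitAtFirstUppercase is hypothetical, and returning the whole input plus an empty string when there is no uppercase letter is an assumption about the desired behavior:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns @[before, fromFirstUppercase].
static NSArray<NSString *> *SplitAtFirstUppercase(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"([^\\p{Lu}\\p{Lt}]*)([\\p{Lu}\\p{Lt}].*)"
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) {
        return nil; // the pattern failed to compile; details are in `error`
    }
    NSTextCheckingResult *match = [regex firstMatchInString:input
                                                    options:0
                                                      range:NSMakeRange(0, input.length)];
    if (match == nil) {
        return @[input, @""]; // assumption: no uppercase letter means "all lower"
    }
    // Group 1 is the part before the capital, group 2 starts at the capital.
    return @[[input substringWithRange:[match rangeAtIndex:1]],
             [input substringWithRange:[match rangeAtIndex:2]]];
}
```

Note that the first part keeps any trailing space that precedes the capital letter; trim it separately if that is unwanted.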
HTH
Throwing one more solution into the mix, utilizing componentsSeparatedByCharactersInSet: to split the string into multiple segments (i.e. more than 2 if needed):
// Separate the "sentence" into components separated
// by the characters in the uppercase character set
NSMutableArray *sentenceArray = [[sentence componentsSeparatedByCharactersInSet:[NSCharacterSet uppercaseLetterCharacterSet]] mutableCopy];
// Get the first sentence "segment", i.e. the sentenceArray's
// first object
NSString *segment = [sentenceArray objectAtIndex:0];
// Keep track of the character count with a variable
int characterCount = (int)segment.length;
// Then starting from sentenceArray's index 1, go through
// the rest of sentenceArray's indices
for (int i = 1 ; i < sentenceArray.count ; i ++) {
    // Prepend the "separator" character to the segment at the
    // current index by accessing the character before the current segment
    // (%C, not %c, is the format specifier for a unichar)
    segment = [[NSString stringWithFormat:@"%C", [sentence characterAtIndex:characterCount]] stringByAppendingString:[sentenceArray objectAtIndex:i]];
    // Replace the object at the current index with this new segment
    // string
    [sentenceArray replaceObjectAtIndex:i withObject:segment];
    // Increment the character count
    characterCount += segment.length;
}
NSLog(@"%@", sentenceArray);
// Find index of first capital letter
NSInteger index = ^NSInteger{
    for (NSInteger i = 0; i < string.length; ++i) {
        unichar c = [string characterAtIndex:i];
        if ('A' <= c && c <= 'Z') { return i; }
    }
    return string.length; // No capital letter, take the entire string
}();
NSLog(@"lower = %@", [string substringToIndex:index]);
NSLog(@"upper = %@", [string substringFromIndex:index]);
I have textual reports that are coded with special shortcuts (i.e. @"BLU" for @"BLUE", @"ABV" for @"Above", etc).
I created an NSDictionary where the keys are the coded word and values are the translations.
Currently I translate the string using this code:
NSMutableString *decodedDesc = [@"" mutableCopy];
for (NSString *word in [self.rawDescriprion componentsSeparatedByString:@" "]) {
    NSString * decodedWord;
    if (word && word.length>0 && [word characterAtIndex:word.length-1] == '.') {
        decodedWord = [abbreviations[[word substringToIndex:word.length-1]] stringByAppendingString:@"."];
    } else
        decodedWord = abbreviations[word];
    if (!decodedWord)
        decodedWord = word;
    [decodedDesc appendString:[NSString stringWithFormat:@"%@ ",decodedWord]];
}
_decodedDescription = [decodedDesc copy];
The problem is that the words in the report are not always separated by a space. Sometimes they are connected by other special characters, such as @"-" or @"/", and the code then ignores the word because something like @"BLU-ABV" is not among the dictionary keys.
How can I improve this code to ignore special chars while translating the words but preserve them in the translated NSString? For example, @"BLU-ABV" would translate into @"Blue-Above".
This can be done with
NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:@" -/"];
[self.rawDescription componentsSeparatedByCharactersInSet:characterSet];
where characterSet contains the characters you want to separate by. Alternatively, you can define the character set with the letters that make up your words (only alphabetical or alphanumeric) and then use its invertedSet.
Use a character set to separate it.
NSMutableCharacterSet* cSet = [NSMutableCharacterSet punctuationCharacterSet];
// add your own custom character
[cSet addCharactersInString:@" "];
// a character set needs componentsSeparatedByCharactersInSet:, not componentsSeparatedByString:
NSArray *comps = [self.rawDescriprion componentsSeparatedByCharactersInSet:cSet];
You can check which character set is most suitable for your case in the Apple documentation,
but I would say it should be punctuation.
I solved it. The solution is to enumerate the original NSString letter by letter. Each letter is added to a word until reaching a special char.
Upon reaching a special char, the translated word, followed by the special char, are dumped into the translated report (which is an NSMutableString) and the word variable is reset to #"".
Here is the code: (not including the dictionary or original NSString initialization)
__block NSString *word;
NSCharacterSet *specialChars = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
[_rawDescriprion enumerateSubstringsInRange:NSMakeRange(0, [_rawDescriprion length])
                                    options:(NSStringEnumerationByComposedCharacterSequences)
                                 usingBlock:^(NSString *letter, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (!word)
        word = @"";
    //if letter is a special char
    if ([letter rangeOfCharacterFromSet:specialChars].location != NSNotFound) {
        //add old word to decoded string
        if (word && abbreviations[word])
            [decodedDesc appendString:abbreviations[word]];
        else if (word)
            [decodedDesc appendString:word];
        //Add the punctuation character to the decoded description
        [decodedDesc appendString:letter];
        //Clear word variable
        word = @"";
    }
    else { //Alpha-numeric letter
        //add letter to word
        word = [word stringByAppendingString:letter];
    }
}];
//add the last word to the decoded string
if (word && abbreviations[word])
    [decodedDesc appendString:abbreviations[word]];
else if (word)
    [decodedDesc appendString:word];
_decodedDescription = [decodedDesc copy];
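For comparison, the same idea can be sketched with NSScanner by alternately reading runs of word characters and runs of everything else. This is an alternative illustration rather than the code above; the function name ExpandAbbreviations is hypothetical, and it assumes the dictionary keys consist of alphanumeric characters only:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: translates alphanumeric runs and copies separators as-is.
static NSString *ExpandAbbreviations(NSString *raw,
                                     NSDictionary<NSString *, NSString *> *abbreviations) {
    NSCharacterSet *wordChars = [NSCharacterSet alphanumericCharacterSet];
    NSCharacterSet *separators = [wordChars invertedSet];
    NSMutableString *result = [NSMutableString stringWithCapacity:raw.length];
    NSScanner *scanner = [NSScanner scannerWithString:raw];
    // By default a scanner skips whitespace; disable that so spaces are preserved.
    scanner.charactersToBeSkipped = nil;
    while (!scanner.isAtEnd) {
        NSString *chunk = nil;
        if ([scanner scanCharactersFromSet:wordChars intoString:&chunk]) {
            // Use the translation when there is one, otherwise keep the word.
            [result appendString:abbreviations[chunk] ?: chunk];
        } else if ([scanner scanCharactersFromSet:separators intoString:&chunk]) {
            [result appendString:chunk];
        } else {
            // Defensive fallback so the loop always makes progress.
            [result appendString:[raw substringWithRange:NSMakeRange(scanner.scanLocation, 1)]];
            scanner.scanLocation += 1;
        }
    }
    return result;
}
```

Working on whole runs means one dictionary lookup per word instead of one string append per letter.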
I have an NSString (phone number) with some parentheses and hyphens, as some phone numbers are formatted. How would I remove all characters except numbers from the string?
Old question, but how about:
NSString *newString = [[origString componentsSeparatedByCharactersInSet:
    [[NSCharacterSet decimalDigitCharacterSet] invertedSet]]
    componentsJoinedByString:@""];
It explodes the source string on the set of non-digits, then reassembles them using an empty string separator. Not as efficient as picking through characters, but much more compact in code.
There's no need to use a regular expressions library as the other answers suggest -- the class you're after is called NSScanner. It's used as follows:
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
    stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
    characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
    NSString *buffer;
    if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
        [strippedString appendString:buffer];
    } else {
        [scanner setScanLocation:([scanner scanLocation] + 1)];
    }
}
NSLog(@"%@", strippedString); // "123123123"
EDIT: I've updated the code because the original was written off the top of my head and I figured it would be enough to point the people in the right direction. It seems that people are after code they can just copy-paste straight into their application.
I also agree that Michael Pelz-Sherman's solution is more appropriate than using NSScanner, so you might want to take a look at that.
The accepted answer is overkill for what is being asked. This is much simpler:
NSString *pureNumbers = [[phoneNumberString componentsSeparatedByCharactersInSet:[[NSCharacterSet decimalDigitCharacterSet] invertedSet]] componentsJoinedByString:@""];
This is great, but the code does not work for me on the iPhone 3.0 SDK.
If I define strippedString as you show here, I get a BAD ACCESS error when trying to print it after the scanCharactersFromSet:intoString call.
If I do it like so:
NSMutableString *strippedString = [NSMutableString stringWithCapacity:10];
I end up with an empty string, but the code doesn't crash.
I had to resort to good old C instead:
for (int i=0; i<[phoneNumber length]; i++) {
    if (isdigit([phoneNumber characterAtIndex:i])) {
        [strippedString appendFormat:@"%c",[phoneNumber characterAtIndex:i]];
    }
}
Though this is an old question with working answers, I missed international format support. Based on the solution of simonobo, the altered character set includes a plus sign "+". International phone numbers are supported by this amendment as well.
NSString *condensedPhoneNumber = [[phoneNumber componentsSeparatedByCharactersInSet:
    [[NSCharacterSet characterSetWithCharactersInString:@"+0123456789"]
    invertedSet]]
    componentsJoinedByString:@""];
The Swift expressions are
var phoneNumber = " +1 (234) 567-1000 "
var allowedCharactersSet = NSMutableCharacterSet.decimalDigitCharacterSet()
allowedCharactersSet.addCharactersInString("+")
var condensedPhoneNumber = phoneNumber.componentsSeparatedByCharactersInSet(allowedCharactersSet.invertedSet).joinWithSeparator("")
Which yields +12345671000 as a common international phone number format.
Here is the Swift version of this.
import UIKit
import Foundation
var phoneNumber = " 1 (888) 555-5551 "
var strippedPhoneNumber = "".join(phoneNumber.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Swift version of the most popular answer:
var newString = join("", oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet))
Edit: Syntax for Swift 2
let newString = oldString.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
Edit: Syntax for Swift 3
let newString = oldString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Thanks for the example. Only one thing is missing: the increment of the scanLocation in case one of the characters in originalString is not found inside the numbers character set. I have added an else {} statement to fix this.
NSString *originalString = @"(123) 123123 abc";
NSMutableString *strippedString = [NSMutableString
    stringWithCapacity:originalString.length];
NSScanner *scanner = [NSScanner scannerWithString:originalString];
NSCharacterSet *numbers = [NSCharacterSet
    characterSetWithCharactersInString:@"0123456789"];
while ([scanner isAtEnd] == NO) {
    NSString *buffer;
    if ([scanner scanCharactersFromSet:numbers intoString:&buffer]) {
        [strippedString appendString:buffer];
    }
    // --------- Add the following to get out of endless loop
    else {
        [scanner setScanLocation:([scanner scanLocation] + 1)];
    }
    // --------- End of addition
}
NSLog(@"%@", strippedString); // "123123123"
This keeps only the digits of the mobile number:
NSString * strippedNumber = [mobileNumber stringByReplacingOccurrencesOfString:@"[^0-9]" withString:@"" options:NSRegularExpressionSearch range:NSMakeRange(0, [mobileNumber length])];
It might be worth noting that the accepted componentsSeparatedByCharactersInSet: and componentsJoinedByString:-based answer is not a memory-efficient solution. It allocates memory for the character set, for an array and for a new string. Even if these are only temporary allocations, processing lots of strings this way can quickly fill the memory.
A memory friendlier approach would be to operate on a mutable copy of the string in place. In a category over NSString:
-(NSString *)stringWithNonDigitsRemoved {
    static NSCharacterSet *decimalDigits;
    if (!decimalDigits) {
        decimalDigits = [NSCharacterSet decimalDigitCharacterSet];
    }
    NSMutableString *stringWithNonDigitsRemoved = [self mutableCopy];
    for (CFIndex index = 0; index < stringWithNonDigitsRemoved.length; ++index) {
        unichar c = [stringWithNonDigitsRemoved characterAtIndex: index];
        if (![decimalDigits characterIsMember: c]) {
            [stringWithNonDigitsRemoved deleteCharactersInRange: NSMakeRange(index, 1)];
            index -= 1;
        }
    }
    return [stringWithNonDigitsRemoved copy];
}
Profiling the two approaches has shown this one using about two-thirds less memory.
You can use regular expression on mutable string:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:
                              @"[^\\d]"
                              options:0
                              error:nil];
[regex replaceMatchesInString:str
                      options:0
                        range:NSMakeRange(0, str.length)
                 withTemplate:@""];
Built the top solution as a category to help with broader problems:
Interface:
@interface NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
                                             with:(NSString *)string;
@end
Implementation:
@implementation NSString (easyReplace)
- (NSString *)stringByReplacingCharactersNotInSet:(NSCharacterSet *)set
                                             with:(NSString *)string
{
    NSMutableString *strippedString = [NSMutableString
        stringWithCapacity:self.length];
    NSScanner *scanner = [NSScanner scannerWithString:self];
    while ([scanner isAtEnd] == NO) {
        NSString *buffer;
        if ([scanner scanCharactersFromSet:set intoString:&buffer]) {
            [strippedString appendString:buffer];
        } else {
            [scanner setScanLocation:([scanner scanLocation] + 1)];
            [strippedString appendString:string];
        }
    }
    return [NSString stringWithString:strippedString];
}
@end
Usage:
NSString *strippedString =
    [originalString stringByReplacingCharactersNotInSet:
        [NSCharacterSet characterSetWithCharactersInString:@"0123456789"]
                                                   with:@""];
Swift 3
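let notNumberCharacters = NSCharacterSet.decimalDigits.inverted
// Note: trimmingCharacters(in:) strips only leading and trailing characters;
// non-digits in the middle of the string are left in place.
let intString = yourString.trimmingCharacters(in: notNumberCharacters)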
swift 4.1
var str = "75003 Paris, France"
var stringWithoutDigit = (str.components(separatedBy:CharacterSet.decimalDigits)).joined(separator: "")
print(stringWithoutDigit)
Um. The first answer seems totally wrong to me. NSScanner is really meant for parsing. Unlike regex, it has you parsing the string one tiny chunk at a time. You initialize it with a string, and it maintains an index of how far along the string it's gotten; That index is always its reference point, and any commands you give it are relative to that point. You tell it, "ok, give me the next chunk of characters in this set" or "give me the integer you find in the string", and those start at the current index, and move forward until they find something that doesn't match. If the very first character already doesn't match, then the method returns NO, and the index doesn't increment.
The code in the first example is scanning "(123)456-7890" for decimal characters, which already fails from the very first character, so the call to scanCharactersFromSet:intoString: leaves the passed-in strippedString alone, and returns NO; The code totally ignores checking the return value, leaving the strippedString unassigned. Even if the first character were a digit, that code would fail, since it would only return the digits it finds up until the first dash or paren or whatever.
If you really wanted to use NSScanner, you could put something like that in a loop, and keep checking for a NO return value, and if you get that you can increment the scanLocation and scan again; and you also have to check isAtEnd, and yada yada yada. In short, wrong tool for the job. Michael's solution is better.
For those searching for phone extraction, you can extract the phone numbers from a text using NSDataDetector, for example:
NSString *userBody = @"This is a text with 30612312232 my phone";
if (userBody != nil) {
    NSError *error = nil;
    NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypePhoneNumber error:&error];
    NSArray *matches = [detector matchesInString:userBody options:0 range:NSMakeRange(0, [userBody length])];
    if (matches != nil) {
        for (NSTextCheckingResult *match in matches) {
            if ([match resultType] == NSTextCheckingTypePhoneNumber) {
                NSLog(@"Found phone number %@", [match phoneNumber]);
            }
        }
    }
}
I created a category on NSString to simplify this common operation.
NSString+AllowCharactersInSet.h
@interface NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet;
@end
NSString+AllowCharactersInSet.m
@implementation NSString (AllowCharactersInSet)
- (NSString *)stringByAllowingOnlyCharactersInSet:(NSCharacterSet *)characterSet {
    NSMutableString *strippedString = [NSMutableString
        stringWithCapacity:self.length];
    NSScanner *scanner = [NSScanner scannerWithString:self];
    while (!scanner.isAtEnd) {
        NSString *buffer = nil;
        if ([scanner scanCharactersFromSet:characterSet intoString:&buffer]) {
            [strippedString appendString:buffer];
        } else {
            scanner.scanLocation = scanner.scanLocation + 1;
        }
    }
    return strippedString;
}
@end
I think currently best way is:
phoneNumber.replacingOccurrences(of: "\\D",
with: "",
options: String.CompareOptions.regularExpression)
If you're just looking to grab the numbers from the string, you could certainly use regular expressions to parse them out. For doing regex in Objective-C, check out RegexKit. Edit: As @Nathan points out, using NSScanner is a much simpler way to parse all numbers from a string. I totally wasn't aware of that option, so props to him for suggesting it. (I don't even like using regex myself, so I prefer approaches that don't require them.)
If you want to format phone numbers for display, it's worth taking a look at NSNumberFormatter. I suggest you read through this related SO question for tips on doing so. Remember that phone numbers are formatted differently depending on location and/or locale.
Swift 5
let newString = origString.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
Based on Jon Vogel's answer here it is as a Swift String extension along with some basic tests.
import Foundation
extension String {
    func stringByRemovingNonNumericCharacters() -> String {
        return self.componentsSeparatedByCharactersInSet(NSCharacterSet.decimalDigitCharacterSet().invertedSet).joinWithSeparator("")
    }
}
And some tests proving at least basic functionality:
import XCTest
class StringExtensionTests: XCTestCase {
    func testStringByRemovingNonNumericCharacters() {
        let baseString = "123"
        var testString = baseString
        var newString = testString.stringByRemovingNonNumericCharacters()
        XCTAssertTrue(newString == testString)
        testString = "a123b"
        newString = testString.stringByRemovingNonNumericCharacters()
        XCTAssertTrue(newString == baseString)
        testString = "a=1-2_3#b"
        newString = testString.stringByRemovingNonNumericCharacters()
        XCTAssertTrue(newString == baseString)
        testString = "(999) 999-9999"
        newString = testString.stringByRemovingNonNumericCharacters()
        XCTAssertTrue(newString.characters.count == 10)
        XCTAssertTrue(newString == "9999999999")
        testString = "abc"
        newString = testString.stringByRemovingNonNumericCharacters()
        XCTAssertTrue(newString == "")
    }
}
This answers the OP's question but it could be easily modified to leave in phone number related characters like ",;*#+"
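NSString *originalPhoneNumber = @"(123) 123-456 abc";
// despite its name, this set holds every character that is NOT a digit
NSCharacterSet *numbers = [[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet];
// note: trimming removes matching characters only from the two ends of the string
NSString *trimmedPhoneNumber = [originalPhoneNumber stringByTrimmingCharactersInSet:numbers];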
Keep it simple!
I am developing an iOS app using Xcode 4.6.2.
My app receives, let's say, 1000 characters from the server, which are then stored in an NSString.
What I want to do is split the 1000 characters into multiple strings. Each string must be at most 100 characters long.
The next question is how to find where the last word ends before the 100-character limit, so that I don't split in the middle of a word.
A regex-based solution:
NSString *string = // ... your 1000-character input
NSString *pattern = @"(?ws).{1,100}\\b";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern: pattern options: 0 error: &error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
NSMutableArray *result = [NSMutableArray array];
for (NSTextCheckingResult *match in matches) {
    [result addObject: [string substringWithRange: match.range]];
}
The code for the regex and the matches part is taken directly from the docs, so the only difference is the pattern.
The pattern basically matches anything from 1 to 100 characters up to a word boundary. Being a greedy pattern, it will give the longest string possible while still ending with a whole word. This ensures that it won't split any words in the middle.
The (?ws) makes the word recognition work with Unicode's definition of word breaks (the w flag) and treat a line end as any other character (the s flag).
Notice that the algorithm doesn't handle "words" with more than 100 characters well - it will give you the last 100 characters and drop the first part, but that should be a corner case.
(assuming your words are separated by a single space, otherwise use rangeOfCharacterFromSet:options:range:)
Use NSString's - (NSRange)rangeOfString:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)aRange with:
aString as #" "
mask as NSBackwardsSearch
Then you need a loop, where you check that you haven't already got to the end of the string, then create a range (for use as aRange) so that you start 100 characters along the string and search backwards looking for the space. Once you find the space, the returned range will allow you to get the string with substringWithRange:.
(written freehand)
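NSUInteger start = 0;
NSMutableArray *lines = [NSMutableArray array];
while (sourceString.length - start > 100) {
    // look at 101 characters so that a space right after a 100-character line is found too
    NSRange testRange = NSMakeRange(start, 101);
    NSRange hitRange = [sourceString rangeOfString:@" " options:NSBackwardsSearch range:testRange];
    // no usable space: fall back to a hard split after 100 characters
    NSUInteger end = (hitRange.location == NSNotFound || hitRange.location == start) ? start + 100 : hitRange.location;
    [lines addObject:[sourceString substringWithRange:NSMakeRange(start, end - start)]];
    // skip the space we split on so the next line does not start with it
    start = ([sourceString characterAtIndex:end] == ' ') ? end + 1 : end;
}
// whatever is left fits into one last line
if (start < sourceString.length) {
    [lines addObject:[sourceString substringFromIndex:start]];
}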
This can help
- (NSArray *)chunksForString:(NSString *)str {
    // fixed-size chunks; this does not look for word boundaries
    NSMutableArray *chunks = [[NSMutableArray alloc] init];
    NSUInteger sizeChunk = 100; // or whatever you want
    NSUInteger offset = 0;
    while (offset < [str length]) {
        // the last chunk may be shorter than sizeChunk
        NSUInteger chunkLength = MIN(sizeChunk, [str length] - offset);
        [chunks addObject:[str substringWithRange:NSMakeRange(offset, chunkLength)]];
        // advance by what was taken so that chunks neither overlap nor skip characters
        offset += chunkLength;
    }
    return chunks;
}
Use NSArray *words = [stringFromServer componentsSeparatedByString:@" "];
This will give you the words.
If you really need each chunk to be as close to 100 characters as possible, append the words one at a time while keeping track of the total length of the appended string, and check that it does not exceed 100.
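A minimal sketch of that word-by-word approach follows. The function name WrapWords and its maxLength parameter are hypothetical, and the sketch assumes words are separated by single spaces (runs of spaces are collapsed) and that a word longer than maxLength is kept whole on its own line rather than split:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: packs whole words into lines of at most maxLength characters.
static NSArray<NSString *> *WrapWords(NSString *text, NSUInteger maxLength) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];
    for (NSString *word in [text componentsSeparatedByString:@" "]) {
        if (word.length == 0) {
            continue; // produced by consecutive spaces
        }
        // Length of the current line if this word (plus a joining space) were added.
        NSUInteger needed = current.length == 0 ? word.length
                                                : current.length + 1 + word.length;
        if (needed > maxLength && current.length > 0) {
            // The word does not fit: close the current line and start a new one.
            [lines addObject:[current copy]];
            [current setString:@""];
        }
        if (current.length > 0) {
            [current appendString:@" "];
        }
        [current appendString:word];
    }
    if (current.length > 0) {
        [lines addObject:[current copy]];
    }
    return lines;
}
```

Calling WrapWords(stringFromServer, 100) would then give the list of lines for the case in the question.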