How to split string into substrings on iOS? - ios

I received an NSString from the server and want to split it into the substrings I need.
How do I split the string?
For example:
substring1: read from the 2nd character to the 5th character
substring2: read 10 characters starting from the 6th character

You can also split a string by a separator substring, using NSString's componentsSeparatedByString: method.
Example from the documentation:
NSString *list = @"Norman, Stanley, Fletcher";
NSArray *listItems = [list componentsSeparatedByString:@", "];

NSString has a few methods for this:
[myString substringToIndex:index];
[myString substringFromIndex:index];
[myString substringWithRange:range];
Check the documentation for NSString for more information.
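Applied to the example in the question, a minimal sketch might look like the following. It assumes that "second character" means index 1 in NSString's zero-based indexing; the helper name SafeSubstring and the sample string are made up for illustration. The range is clamped because substringWithRange: raises NSRangeException when the range runs past the end of the string.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the substring at `range`, clamped to the
// string's bounds so that substringWithRange: cannot raise NSRangeException.
static NSString *SafeSubstring(NSString *string, NSRange range) {
    if (range.location >= string.length) {
        return @"";
    }
    NSUInteger available = string.length - range.location;
    range.length = MIN(range.length, available);
    return [string substringWithRange:range];
}

// Usage with a made-up server response:
// NSString *response = @"0123456789ABCDEFGHIJ";
// SafeSubstring(response, NSMakeRange(1, 4));   // 2nd to 5th character: @"1234"
// SafeSubstring(response, NSMakeRange(5, 10));  // 10 characters from the 6th: @"56789ABCDE"
```

Note that these indexes count UTF-16 code units, so a string containing emoji needs the composed-character handling discussed in the related questions.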

I wrote a little method to split a string into a specified number of parts.
Note that it only supports single-character separators. But I think it is an efficient way to split an NSString.
//split string into given number of parts
-(NSArray*)splitString:(NSString*)string withDelimiter:(NSString*)delimiter inParts:(int)parts{
    NSMutableArray* array = [NSMutableArray array];
    NSUInteger len = [string length];
    unichar buffer[len+1];
    //put separator in buffer
    unichar separator[1];
    [delimiter getCharacters:separator range:NSMakeRange(0, 1)];
    [string getCharacters:buffer range:NSMakeRange(0, len)];
    int startPosition = 0;
    int length = 0;
    for(int i = 0; i < len; i++) {
        //if array is parts-1 and the character was found add it to array
        if (buffer[i]==separator[0] && array.count < parts-1) {
            if (length>0) {
                [array addObject:[string substringWithRange:NSMakeRange(startPosition, length)]];
            }
            startPosition += length+1;
            length = 0;
            if (array.count >= parts-1) {
                break;
            }
        }else{
            length++;
        }
    }
    //add the last part of the string to the array
    [array addObject:[string substringFromIndex:startPosition]];
    return array;
}

Related

Find substring range of NSString with unicode characters

If I have a string like this:
NSString *string = @"😀1😀3😀5😀7😀";
To get a substring like @"3😀5", you have to account for the fact that each smiley face character takes two UTF-16 code units.
NSString *substring = [string substringWithRange:NSMakeRange(5, 4)];
Is there a way to get the same substring by using the actual character index, so NSMakeRange(3, 3) in this case?
Thanks to @Joe's link I was able to create a solution that works.
This still seems like a lot of work just to create a substring at Unicode character ranges of an NSString. Please post if you have a simpler solution.
@implementation NSString (UTF)
// `range` is counted in composed character sequences, not UTF-16 code units.
// A range that runs past the end of the string raises NSRangeException.
- (NSString *)substringWithRangeOfComposedCharacterSequences:(NSRange)range
{
    NSUInteger codeUnit = 0;
    // Skip the first range.location composed character sequences.
    for (NSUInteger i = 0; i < range.location; i++)
    {
        codeUnit = NSMaxRange([self rangeOfComposedCharacterSequenceAtIndex:codeUnit]);
    }
    NSUInteger startUnit = codeUnit;
    // Advance over the next range.length composed character sequences.
    for (NSUInteger i = 0; i < range.length; i++)
    {
        codeUnit = NSMaxRange([self rangeOfComposedCharacterSequenceAtIndex:codeUnit]);
    }
    return [self substringWithRange:NSMakeRange(startUnit, codeUnit - startUnit)];
}
@end
Example:
NSString *string = @"😀1😀3😀5😀7😀";
NSString *result = [string substringWithRangeOfComposedCharacterSequences:NSMakeRange(3, 3)];
NSLog(@"%@", result); // 3😀5
Make a Swift extension of NSString and use the Swift String struct. It has a convenient String.Index that counts user-visible characters (grapheme clusters) for indexing and range selection, which is very useful in cases like yours with emoji involved.
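Staying in Objective-C, character-based indexing can also be sketched with enumerateSubstringsInRange:options:usingBlock: and the NSStringEnumerationByComposedCharacterSequences option. The helper name SubstringByCharacters is hypothetical, and this is only one possible approach, not the method from the answers above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns up to `count` user-visible characters starting
// at character index `start`, counting composed character sequences instead
// of UTF-16 code units. Ranges that run past the end are truncated.
static NSString *SubstringByCharacters(NSString *string, NSUInteger start, NSUInteger count) {
    NSMutableString *result = [NSMutableString string];
    __block NSUInteger index = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        if (index >= start + count) {
            // Everything requested has been collected.
            *stop = YES;
            return;
        }
        if (index >= start) {
            [result appendString:substring];
        }
        index++;
    }];
    return result;
}
```

With the string from the question, SubstringByCharacters(string, 3, 3) yields the same @"3😀5" as substringWithRange:NSMakeRange(5, 4).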

Split string by a constant number, only on a space

I am using the answer from this question (https://stackoverflow.com/a/13854813) to split a large string into an array based on a specific length.
- (NSArray *) componentSaparetedByLength:(NSUInteger) length{
    NSMutableArray *array = [NSMutableArray new];
    NSRange range = NSMakeRange(0, length);
    NSString *subString = nil;
    while (range.location + range.length <= self.length) {
        subString = [self substringWithRange:range];
        [array addObject:subString];
        range.location = range.length + range.location;
        range.length = length;
    }
    if(range.location<self.length){
        subString = [self substringFromIndex:range.location];
        [array addObject:subString];
    }
    return array;
}
I would like to make this split the string only on a space. So, if the last character of the substring is not a space, I would like it to shorten that substring until the last character is a space (hopefully that makes sense). Basically I want this to split the string, but not split words in the process.
Any suggestions?
Maybe you can separate the string with componentsSeparatedByCharactersInSet: and reconstruct the lines.
But in your case, I think you'd be better off iterating over the unichars.
// `string` is the input and `length` is the maximum line length.
NSMutableArray *result = [NSMutableArray array];
NSUInteger charCount = string.length;
if(charCount == 0) {
    return result;
}
unichar *chars = malloc(charCount*sizeof(unichar));
if(chars == NULL) {
    return nil;
}
[string getCharacters:chars range:NSMakeRange(0, charCount)];
unichar *cursor = chars;
unichar *lineStart = chars;
unichar *wordStart = chars;
NSCharacterSet *whitespaces = [NSCharacterSet whitespaceCharacterSet];
while(cursor < chars+charCount) {
    if([whitespaces characterIsMember:*cursor]) {
        // Close the line at the previous word start once the limit is reached.
        // Skip this when the line is a single over-long word, to avoid an empty line.
        if(cursor - lineStart >= length && wordStart > lineStart) {
            NSString *line = [NSString stringWithCharacters:lineStart length:wordStart - lineStart];
            [result addObject:line];
            lineStart = wordStart;
        }
        wordStart = cursor + 1;
    }
    cursor ++;
}
// Add the remaining characters as the last line.
if(lineStart < cursor) {
    [result addObject:[NSString stringWithCharacters:lineStart length: cursor - lineStart]];
}
free(chars);
return result;
Input:
@"I would like to make this only split the string on a space. So, if the last character of the substring is not a space, I would like it it shorten that substring until the last character is a space (hopefully that makes sense). Basically I want this to split the string, but not split words in the process."
Output(length == 30):
(
"I would like to make this ",
"only split the string on a ",
"space. So, if the last ",
"character of the substring is ",
"not a space, I would like it ",
"it shorten that substring ",
"until the last character is a ",
"space (hopefully that makes ",
"sense). Basically I want this ",
"to split the string, but not ",
"split words in the process."
)

How do I count the occurrences of ALL characters of a string?

I am trying to find the number of times each character in a string is used. For example, in the string "wow" I would like to count the number of times the character "w" is used and the number of times the character "o" is used. I would then like to add these characters to an NSMutableArray. Is there a programmatic way to get the number of occurrences of all characters in an NSString, or would I have to count the occurrences of each individual character separately?
See "iOS - Most efficient way to find word occurrence count in a string".
NSString *string = @"wow";
NSCountedSet *countedSet = [NSCountedSet new];
[string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                           options:NSStringEnumerationByComposedCharacterSequences | NSStringEnumerationLocalized
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    // This block is called once for each composed character in the string.
    [countedSet addObject:substring];
    // If you want to ignore case, so that "w" and "W"
    // are counted the same, use this line instead to convert
    // each character to lowercase first:
    // [countedSet addObject:[substring lowercaseString]];
}];
NSLog(@"%@", countedSet);
NSLog(@"%@", [countedSet allObjects]);
NSLog(@"%lu", (unsigned long)[countedSet countForObject:@"w"]);
The exact answer depends on some questions:
Do you only want to count the characters a-z, or do you want punctuation as well?
Do you need to count Unicode characters or just 8-bit characters?
Is case important, i.e. is A different from a?
Assuming you only want to count 8-bit a-z, independent of case, you could use something like:
- (NSArray *)countCharactersInString:(NSString *)inputString
{
    // One counter per letter, index 0 for 'a' through index 25 for 'z'.
    NSMutableArray *result=[[NSMutableArray alloc]initWithCapacity:26];
    for (int i=0;i<26;i++) {
        [result addObject:[NSNumber numberWithInt:0]];
    }
    for (NSUInteger i=0;i<[inputString length];i++)
    {
        unichar c=[inputString characterAtIndex:i];
        // Fold A-Z to a-z by hand; tolower() and isalpha() are not safe for unichar values above 255.
        if (c>='A' && c<='Z') {
            c+='a'-'A';
        }
        if (c>='a' && c<='z')
        {
            int index=c-'a';
            NSNumber *count=[result objectAtIndex:index];
            [result setObject:[NSNumber numberWithInt:[count intValue]+1] atIndexedSubscript:index];
        }
    }
    return (result);
}
An alternative approach is to use an NSCountedSet. It handles all characters, punctuation, etc., but will be 'sparse': there is no entry for a character that is not present in the string. Also, the implementation below is case-sensitive, so W is different from w.
- (NSCountedSet *)countCharactersInString:(NSString *)inputString
{
    NSCountedSet *result=[[NSCountedSet alloc]init];
    for (int i=0;i<[inputString length];i++)
    {
        NSString *c=[inputString substringWithRange:NSMakeRange(i,1)];
        [result addObject:c];
    }
    return result;
}
NSString *str = @"Program to Find the Frequency of Characters in a String";
// The capacity is only a hint: 52 covers a-z and A-Z, and the dictionary grows as needed for other characters.
NSMutableDictionary *frequencies = [[NSMutableDictionary alloc] initWithCapacity:52];
for (NSUInteger i = 0; i < [str length]; i++) {
    // Use the numeric value of the UTF-16 code unit as the dictionary key.
    unichar codeUnit = [str characterAtIndex:i];
    NSString *key = [NSString stringWithFormat:@"%d", codeUnit];
    // A missing key yields nil, whose unsignedIntegerValue is 0.
    NSUInteger frequencyCount = [frequencies[key] unsignedIntegerValue] + 1;
    frequencies[key] = @(frequencyCount);
}
To display the occurrences of each character in the string:
[frequencies enumerateKeysAndObjectsUsingBlock:^(id _Nonnull key, id _Nonnull obj, BOOL * _Nonnull stop) {
    NSString *ky = (NSString *)key;
    NSNumber *value = (NSNumber *)obj;
    NSLog(@"%C\t%lu", (unichar)[ky intValue], (unsigned long)[value unsignedIntegerValue]);
}];

How to Split NSString to multiple Strings after certain number of characters

I am developing an iOS app using Xcode 4.6.2.
My app receives, say, 1000 characters from the server, which are then stored in an NSString.
What I want to do is split the 1000 characters into multiple strings, each at most 100 characters long.
The next question is how to find where the last word ends before the 100th character, so that I don't split in the middle of a word.
A regex-based solution:
NSString *string = // ... your 1000-character input
NSString *pattern = @"(?ws).{1,100}\\b";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern: pattern options: 0 error: &error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
NSMutableArray *result = [NSMutableArray array];
for (NSTextCheckingResult *match in matches) {
    [result addObject: [string substringWithRange: match.range]];
}
The code for the regex and the matches part is taken directly from the docs, so the only difference is the pattern.
The pattern basically matches anything from 1 to 100 characters up to a word boundary. Being a greedy pattern, it will give the longest string possible while still ending with a whole word. This ensures that it won't split any words in the middle.
The (?ws) makes the word recognition work with Unicode's definition of word breaks (the w flag) and treat a line end as any other character (the s flag).
Notice that the algorithm doesn't handle "words" with more than 100 characters well - it will give you the last 100 characters and drop the first part, but that should be a corner case.
(assuming your words are separated by a single space, otherwise use rangeOfCharacterFromSet:options:range:)
Use NSString's - (NSRange)rangeOfString:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)aRange with:
aString as @" "
mask as NSBackwardsSearch
Then you need a loop, where you check that you haven't already got to the end of the string, then create a range (for use as aRange) so that you start 100 characters along the string and search backwards looking for the space. Once you find the space, the returned range will allow you to get the string with substringWithRange:.
(written freehand)
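NSMutableArray *lines = [NSMutableArray array];
NSUInteger start = 0;
BOOL complete = NO;
// Keep splitting while more than 100 characters remain.
while (!complete && sourceString.length - start > 100) {
    // Look at 101 characters so that a space directly after a 100-character line is found.
    NSRange testRange = NSMakeRange(start, 101);
    NSRange hitRange = [sourceString rangeOfString:@" " options:NSBackwardsSearch range:testRange];
    if (hitRange.location != NSNotFound) {
        // The line runs from the start of the test range up to, but not including, the space.
        [lines addObject:[sourceString substringWithRange:NSMakeRange(start, hitRange.location - start)]];
        start = hitRange.location + hitRange.length;
    } else {
        // No space within reach: the next word is longer than 100 characters.
        complete = YES;
    }
}
// Add whatever is left, including an unsplittable long word.
if (start < sourceString.length) {
    [lines addObject:[sourceString substringFromIndex:start]];
}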
This can help:
- (NSArray *)chunksForString:(NSString *)str {
    NSMutableArray *chunks = [[NSMutableArray alloc] init];
    NSUInteger sizeChunk = 100; // or whatever you want
    NSUInteger length = 0; // number of characters consumed so far
    while (length < [str length]) {
        // The last chunk may be shorter than sizeChunk.
        NSUInteger newRangeEndLimit = MIN(sizeChunk, [str length] - length);
        [chunks addObject:[str substringWithRange:NSMakeRange(length, newRangeEndLimit)]];
        length += newRangeEndLimit;
    }
    return chunks;
}
Use NSArray *words = [stringFromServer componentsSeparatedByString:@" "];
This will give you the words.
If you really need each piece to be as close to 100 characters as possible, start appending words while keeping track of the total length of the appended string, and check that it stays below 100.
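The word-appending idea above could be sketched as follows. This is a minimal illustration under a few assumptions: words are separated by single spaces, lengths are counted in UTF-16 code units, a line may be exactly maxLength long, and a word longer than maxLength is kept whole on its own line. The helper name WrapWords is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: packs space-separated words into lines of at most
// `maxLength` UTF-16 code units. A single word longer than maxLength is
// kept whole on its own line rather than split.
static NSArray<NSString *> *WrapWords(NSString *text, NSUInteger maxLength) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];
    for (NSString *word in [text componentsSeparatedByString:@" "]) {
        if (word.length == 0) {
            continue; // skip empty components produced by repeated spaces
        }
        // +1 accounts for the space that joins the word to the current line.
        if (current.length > 0 && current.length + 1 + word.length > maxLength) {
            [lines addObject:[current copy]];
            [current setString:@""];
        }
        if (current.length > 0) {
            [current appendString:@" "];
        }
        [current appendString:word];
    }
    if (current.length > 0) {
        [lines addObject:[current copy]];
    }
    return lines;
}
```

For the 1000-character case in the question, WrapWords(stringFromServer, 100) would return the lines in order. Repeated spaces are collapsed, so joining the lines with single spaces does not necessarily reproduce the input exactly.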

Truncate string containing emoji or unicode characters at word or character boundaries

How can I truncate a string at a given length without destroying a Unicode character that might be right in the middle of my length? How can one determine the index of the beginning of a Unicode character in a string, so that I can avoid creating ugly strings? The square with half of an A visible is the location of another emoji character which has been truncated.
-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{
    NSString *original = [_postDictionay objectForKey:@"message"];
    NSMutableString *truncated = [NSMutableString string];
    NSArray *components = [original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    for(int x=0; x<[components count]; x++)
    {
        //If the truncated string is still shorter than the range desired. (leave space for ...)
        if([truncated length]+[[components objectAtIndex:x] length]<range.length-3)
        {
            //Just checking if its the first word
            if([truncated length]==0 && x==0)
            {
                //start off the string
                [truncated appendString:[components objectAtIndex:0]];
            }
            else
            {
                //append a new word to the string
                [truncated appendFormat:@" %@",[components objectAtIndex:x]];
            }
        }
        else
        {
            x=[components count];
        }
    }
    if([truncated length]==0 || [truncated length]< range.length-20)
    {
        truncated = [NSMutableString stringWithString:[original substringWithRange:NSMakeRange(range.location, range.length-3)]];
    }
    [truncated appendString:@"..."];
    NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
    [statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
    [statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];
    return statusString;
}
UPDATE: Thanks to the answer, I was able to use one simple function for my needs!
-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{
    NSString *original = [_postDictionay objectForKey:@"message"];
    NSMutableString *truncated = [NSMutableString stringWithString:[original substringWithRange:[original rangeOfComposedCharacterSequencesForRange:NSMakeRange(range.location, range.length-3)]]];
    [truncated appendString:@"..."];
    NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
    [statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
    [statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];
    return statusString;
}
NSString has a method rangeOfComposedCharacterSequencesForRange: that you can use to find the enclosing range in the string that contains only complete composed characters. For example,
NSString *s = @"😄";
NSRange r = [s rangeOfComposedCharacterSequencesForRange:NSMakeRange(0, 1)];
gives the range { 0, 2 } because the emoji character is stored as two UTF-16 code units (a surrogate pair) in the string.
Remark: You could also check whether you can simplify your first loop by using
enumerateSubstringsInRange:options:usingBlock:
with the NSStringEnumerationByWords option.
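The remark about NSStringEnumerationByWords could be sketched roughly like this. It is an illustration only: the helper name PrefixEndingAtWord is hypothetical, lengths are counted in UTF-16 code units, and appending "..." is left to the caller.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the longest prefix of `text` that ends at the
// end of a word and is at most `maxLength` UTF-16 code units long.
// Returns an empty string if even the first word does not fit.
static NSString *PrefixEndingAtWord(NSString *text, NSUInteger maxLength) {
    if (text.length <= maxLength) {
        return text; // nothing to truncate
    }
    __block NSUInteger end = 0;
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *word, NSRange wordRange,
                                       NSRange enclosingRange, BOOL *stop) {
        if (NSMaxRange(wordRange) > maxLength) {
            // This word would cross the limit, so keep the previous end.
            *stop = YES;
            return;
        }
        end = NSMaxRange(wordRange);
    }];
    return [text substringToIndex:end];
}
```

Because the cut always falls at the end of a word range reported by the enumerator, it does not land inside a surrogate pair.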
"truncate a string at a given length" <-- Do you mean length as in byte length or length as in number of characters? If the latter, then a simple substringToIndex: will suffice (check the bounds first, and note that the index counts UTF-16 code units, so it can still land in the middle of a surrogate pair). If the former, then I'm afraid you'll have to do something like:
NSString *TruncateString(NSString *original, NSUInteger maxBytesToRead, NSStringEncoding targetEncoding) {
    NSMutableString *truncatedString = [NSMutableString string];
    NSUInteger bytesRead = 0;
    NSUInteger charIdx = 0;
    while (bytesRead < maxBytesToRead && charIdx < [original length]) {
        // Step by composed character sequence so that surrogate pairs are never split.
        NSRange charRange = [original rangeOfComposedCharacterSequenceAtIndex:charIdx];
        NSString *character = [original substringWithRange:charRange];
        charIdx = NSMaxRange(charRange);
        bytesRead += [character lengthOfBytesUsingEncoding:targetEncoding];
        if (bytesRead <= maxBytesToRead)
            [truncatedString appendString:character];
    }
    return truncatedString;
}
EDIT: Your code can be rewritten as follows:
NSString *original = [_postDictionay objectForKey:@"message"];
// Split into words and drop the empty strings produced by repeated whitespace.
NSArray *words = [[original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"SELF != ''"]];
// Note: here range counts words, not characters, and must lie within the array's bounds.
NSArray *truncatedWords = [words subarrayWithRange:NSMakeRange(range.location, range.length)];
NSString *truncated = [NSString stringWithFormat:@"%@...", [truncatedWords componentsJoinedByString:@" "]];
