I am trying to find the number of times each character in a string is used. For example, in the string "wow" I would like to count the number of times the character "w" is used and the number of times the character "o" is used. I would then like to add these characters to an NSMutableArray. Is there a programmatic way to get the number of occurrences of all characters in an NSString at once? Or would I have to count the occurrences of each individual character separately?
See iOS - Most efficient way to find word occurrence count in a string
NSString *string = @"wow";
NSCountedSet *countedSet = [NSCountedSet new];
[string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                           options:NSStringEnumerationByComposedCharacterSequences | NSStringEnumerationLocalized
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    // This block is called once for each composed character in the string.
    [countedSet addObject:substring];
    // If you want to ignore case, so that "w" and "W"
    // are counted the same, use this line instead to convert
    // each character to lowercase first:
    // [countedSet addObject:[substring lowercaseString]];
}];
NSLog(@"%@", countedSet);
NSLog(@"%@", [countedSet allObjects]);
// countForObject: returns an NSUInteger, so cast it for the format string
NSLog(@"%lu", (unsigned long)[countedSet countForObject:@"w"]);
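Since the question asks for the result in an NSMutableArray, here is a minimal sketch of one way to turn the counted set into an array of character/count pairs. The function name CharacterCounts and the pair layout are my own choices for illustration, not part of the answer above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns @[character, count] pairs, most frequent
// character first. Ties are ordered by comparing the characters.
static NSMutableArray<NSArray *> *CharacterCounts(NSString *string) {
    NSCountedSet *countedSet = [NSCountedSet new];
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        [countedSet addObject:substring];
    }];

    NSMutableArray<NSArray *> *pairs = [NSMutableArray array];
    // Fast enumeration visits each distinct object once.
    for (NSString *character in countedSet) {
        [pairs addObject:@[character, @([countedSet countForObject:character])]];
    }
    [pairs sortUsingComparator:^NSComparisonResult(NSArray *a, NSArray *b) {
        NSNumber *countA = a[1];
        NSNumber *countB = b[1];
        NSComparisonResult byCount = [countB compare:countA]; // descending count
        if (byCount != NSOrderedSame) {
            return byCount;
        }
        NSString *characterA = a[0];
        NSString *characterB = b[0];
        return [characterA compare:characterB];
    }];
    return pairs;
}
```

For @"wow" this should give one pair for "w" with count 2, followed by one pair for "o" with count 1.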
The exact answer depends on a few questions:
Do you only want to count the characters a-z, or punctuation as well?
Do you need to count Unicode characters or just 8-bit characters?
Is case important, i.e. is A different from a?
Assuming you only want to count 8-bit characters a-z, independent of case, you could use something like this:
- (NSArray *)countCharactersInString:(NSString *)inputString
{
    NSMutableArray *result = [[NSMutableArray alloc] initWithCapacity:26];
    for (int i = 0; i < 26; i++) {
        [result addObject:@0];
    }
    for (NSUInteger i = 0; i < [inputString length]; i++)
    {
        unichar c = [inputString characterAtIndex:i];
        // Fold ASCII upper case to lower case by hand: tolower() and isalpha()
        // are only defined for values that fit in an unsigned char.
        if (c >= 'A' && c <= 'Z') {
            c = c - 'A' + 'a';
        }
        if (c >= 'a' && c <= 'z')
        {
            NSUInteger index = c - 'a';
            NSNumber *count = result[index];
            result[index] = @([count intValue] + 1);
        }
    }
    return result;
}
An alternative approach is to use an NSCountedSet. It handles all characters, punctuation etc., but will be 'sparse': there is no entry for a character that is not present in the string. Also, the implementation below is case sensitive: W is different from w.
- (NSCountedSet *)countCharactersInString:(NSString *)inputString
{
    NSCountedSet *result = [[NSCountedSet alloc] init];
    // Note: this walks UTF-16 code units, so a surrogate pair is counted as two entries.
    for (NSUInteger i = 0; i < [inputString length]; i++)
    {
        NSString *c = [inputString substringWithRange:NSMakeRange(i, 1)];
        [result addObject:c];
    }
    return result;
}
NSString *str = @"Program to Find the Frequency of Characters in a String";
// Capacity 52 covers a-z and A-Z; the dictionary grows if the string
// uses a larger character set.
NSMutableDictionary *frequencies = [[NSMutableDictionary alloc] initWithCapacity:52];
for (NSUInteger i = 0; i < [str length]; i++) {
    unichar character = [str characterAtIndex:i];
    // The key is the character's numeric code as a decimal string.
    NSString *key = [NSString stringWithFormat:@"%d", character];
    // A missing key yields nil, and [nil shortValue] is 0.
    short frequencyCount = [frequencies[key] shortValue];
    frequencies[key] = @(frequencyCount + 1);
}
To display the occurrence count of each character in the string:
[frequencies enumerateKeysAndObjectsUsingBlock:^(id _Nonnull key, id _Nonnull obj, BOOL * _Nonnull stop) {
    NSString *ky = (NSString *)key;
    NSNumber *value = (NSNumber *)obj;
    NSLog(@"%c\t%d", [ky intValue], [value shortValue]);
}];
I have an NSString *strName = @"JonnySmith";
What I want to do is get an NSArray of NSStrings with all possible combinations of a name, omitting certain characters. For example:
@"J";
@"Jo";
@"Jon";
but also combinations like:
@"JSmith";
@"JonSmith";
@"JonnSm";
@"JonSmt";
@"Smith";
@"th";
But they need to be in the order of the original name (the characters can't be out of order, just omitted). Basically traversing left to right in a loop, over and over again, until all possible combos are made.
What is the most efficient way to do this in Objective-C without making a mess?
Let's see if we can give you some pointers, everything here is abstract/pseudocode.
There are 2^n paths to follow, where n is the number of characters, as at each character you either add it or do not.
Taking your example, after the first character you might produce @"" and @"J". Then to each of these you either add the second character or not, giving @"", @"J" (add nothing) and @"o", @"Jo". Observe that if you have repeated characters anywhere in your input (your sample has two n's), this process may produce duplicates. You can deal with duplicates by using a set to collect your results.
How long is a character? Characters may consist of sequences of Unicode code points (e.g. 🇧🇪 - Belgium flag, if it prints in SO! Letters can be similarly composed), and you must not split these composed sequences while producing your strings. NSString helps you here as you can enumerate the composed sequences, invoking a block for each one in order.
The above gives you the pseudocode:
    results <- set containing only the empty string
    for each composed character in input do block:
        add to results a copy of each of its members with the composed character appended
You cannot modify a collection at the same time you enumerate it. So "add to results" can be done by enumerating the set, creating a new collection of strings to add, then adding them all at once after the enumeration:
    new items <- empty collection
    for every item in results
        add to new items (item appending composed character)
    results union new items
Optimising it slightly, maybe: results starts out holding the empty string, and the loop then appends to that empty string. Maybe you could leave the empty string out at the start and initialise new items to the composed character instead?
Hint: why did I write the non-specific "collection" for new items?
Have fun. If you code something up and get stuck ask a new question, describe your algorithm, show what you've written, explain the issue etc. That will (a) avoid down/close votes and (b) help people to help you.
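For anyone who wants to compare notes afterwards, here is a minimal Objective-C sketch of the pseudocode above. It uses the suggested optimisation of seeding new items with the character itself, so the empty string never enters the results. The function name OrderedCombinations is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: every non-empty, order-preserving selection of the
// composed characters of `input`. Duplicates collapse because results is a set.
static NSSet<NSString *> *OrderedCombinations(NSString *input) {
    NSMutableSet<NSString *> *results = [NSMutableSet set];
    [input enumerateSubstringsInRange:NSMakeRange(0, input.length)
                              options:NSStringEnumerationByComposedCharacterSequences
                           usingBlock:^(NSString *character, NSRange range,
                                        NSRange enclosingRange, BOOL *stop) {
        // Start the new items with the character on its own, then extend
        // every string collected so far. Nothing is added to results
        // while it is being enumerated.
        NSMutableArray<NSString *> *newItems = [NSMutableArray arrayWithObject:character];
        for (NSString *item in results) {
            [newItems addObject:[item stringByAppendingString:character]];
        }
        [results addObjectsFromArray:newItems];
    }];
    return results;
}
```

The set doubles in size with every character, so this is only practical for short inputs such as names.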
One possibility is to consider every combination to be a mask of bits, where 1 means the character is there and 0 means the character is missing, for example:
1000010000 for JonnySmith will mean JS
0000000001 for JonnySmith will mean h
It's simple to generate such masks because we can just iterate from 1 (or 0000000001) to 1111111111.
Then we only have to map that mask into characters.
Of course, some duplicates are generated because 1110... and 1101... will both be mapped to Jon....
Sample implementation:
NSString *string = @"JonnySmith";
// split the string into characters (every character represented by a string)
NSMutableArray<NSString *> *characters = [NSMutableArray array];
[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString * _Nullable substring, NSRange substringRange, NSRange enclosingRange, BOOL * _Nonnull stop) {
    [characters addObject:substring];
}];
// let's iterate over all masks
// start with zero if you want empty string to be included
NSUInteger min = 1;
// shift an NSUInteger rather than an int, so longer strings do not overflow
NSUInteger max = ((NSUInteger)1 << characters.count) - 1;
NSMutableString *buffer = [[NSMutableString alloc] initWithCapacity:characters.count];
NSMutableSet *set = [NSMutableSet set];
for (NSUInteger mask = min; mask <= max; mask++) {
    [buffer setString:@""];
    // iterate over all bits in the generated mask, map it to string
    for (NSUInteger charIndex = 0; charIndex < characters.count; charIndex++) {
        if ((mask & ((NSUInteger)1 << (characters.count - charIndex - 1))) != 0) {
            [buffer appendString:[characters objectAtIndex:charIndex]];
        }
    }
    // add the resulting string to Set, will handle duplicates
    [set addObject:[buffer copy]];
}
NSLog(@"Count: %@", @(set.count)); // 767
The bit width of NSUInteger limits the number of characters this method can handle: fewer than 64 on a 64-bit platform.
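If you only need to know how many distinct strings come out (for example to check the 767 above), you do not have to generate them. Here is a rough sketch using the usual distinct-subsequence recurrence. It is a swapped-in technique rather than part of the mask approach, the name DistinctCombinationCount is hypothetical, and it assumes the input has no surrogate pairs or combining sequences because it works on UTF-16 code units.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: number of distinct non-empty strings obtainable by
// omitting characters while keeping the order. Overflows for long inputs.
static NSUInteger DistinctCombinationCount(NSString *string) {
    // total = distinct results so far, including the empty string.
    NSUInteger total = 1;
    // For each character: the value of total just before its previous occurrence.
    NSMutableDictionary<NSNumber *, NSNumber *> *totalBeforeLast = [NSMutableDictionary dictionary];
    for (NSUInteger i = 0; i < string.length; i++) {
        NSNumber *key = @([string characterAtIndex:i]);
        NSUInteger previous = total;
        // Appending the character doubles the count, minus the strings that
        // its previous occurrence already produced (0 if it is new).
        total = 2 * total - [totalBeforeLast[key] unsignedIntegerValue];
        totalBeforeLast[key] = @(previous);
    }
    return total - 1; // drop the empty string
}
```

For @"JonnySmith" the repeated n is the only character that subtracts anything, which is why the result is 767 rather than 1023.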
Noticed the question is old but no answer is accepted. I think you can generate all permutations and then omit results which don't match your criteria (or tweak this code per your needs).
@interface NSString (Permute)
- (NSSet *)permutations;
@end
@implementation NSString (Permute)
- (NSSet *)permutations {
    if ([self length] <= 1) {
        return [NSSet setWithObject:self];
    }
    NSMutableSet *s = [NSMutableSet new];
    [s addObject:[self substringToIndex:1]];
    for (NSUInteger i = 1; i < self.length; i++) {
        // characterAtIndex: returns a unichar, which is what %C expects
        unichar c = [self characterAtIndex:i];
        s = [self words:s insertingLetterAtAllPositions:[NSString stringWithFormat:@"%C", c]];
    }
    return [s copy];
}
- (NSMutableSet *)words:(NSSet *)words insertingLetterAtAllPositions:(NSString *)letter {
    NSMutableSet *collector = [NSMutableSet new];
    for (NSString *word in words) {
        [collector unionSet:[word allInsertionsOfLetterAtAllPositions:letter]];
    }
    return collector;
}
- (NSMutableSet *)allInsertionsOfLetterAtAllPositions:(NSString *)letter {
    NSMutableSet *collector = [NSMutableSet new];
    for (NSUInteger i = 0; i < [self length] + 1; i++) {
        NSMutableString *mut = [self mutableCopy];
        [mut insertString:letter atIndex:i];
        [collector addObject:[mut copy]];
    }
    return collector;
}
@end
// usage
[@"abc" permutations];
You can do it quite easily with a little recursion. It works like this:
Check if the length is 1, then return an array of 2 elements, the empty string and the string.
Call recursively with input string minus the first character and assign to sub-result.
Duplicate the sub-result, adding the first character to each string.
Return the result.
Remember to not call for empty string. If you want to omit the empty result string just remove the first element. Also, if you use the same letter several times, you will get some result strings several times. Those can be removed afterwards.
- (void)combinations:(NSString *)string result:(NSMutableArray *)result {
    if (string.length == 1) {
        [result addObjectsFromArray:@[ @"", string ]];
    } else {
        [self combinations:[string substringFromIndex:1] result:result];
        // Walk only the entries that existed before this loop started;
        // the strings appended below are never revisited.
        for (NSInteger i = result.count - 1; i >= 0; --i)
            [result addObject:[[string substringToIndex:1] stringByAppendingString:result[i]]];
    }
}
// Call like this, for speed only one mutable array is allocated
NSString *test = @"0123456789";
NSMutableArray *result = [NSMutableArray arrayWithCapacity:1 << test.length];
[self combinations:test result:result];
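Removing the empty string and the repeated results afterwards can be sketched like this. The helper name UniqueNonEmptyStrings is hypothetical, and it assumes you want to keep the first occurrence of each string in its original position.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: drops the empty string and repeated entries while
// keeping the first occurrence of each string in place.
static NSArray<NSString *> *UniqueNonEmptyStrings(NSArray<NSString *> *strings) {
    NSMutableOrderedSet<NSString *> *unique = [NSMutableOrderedSet orderedSetWithArray:strings];
    [unique removeObject:@""];
    // Copy so the caller gets a plain array that no longer tracks the set.
    return [[unique array] copy];
}
```

With the input @"nn" the recursive method above collects @"", @"n", @"nn", @"n", and this helper reduces that to @"n", @"nn".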
I want to extract only the names from the following string
bob!33@localhost @clement!17@localhost jack!03@localhost
and create an array [@"bob", @"clement", @"jack"].
I have tried NSString's componentsSeparatedByString: but it didn't work as expected, so I am planning to use a regex.
How can I extract strings between ranges and add them to an array using a regex in Objective-C?
The initial string might contain more than 500 names. Would it be a performance issue if I manipulate the string using a regex?
You can do it without a regex as below (assuming every entry has the ! sign right after the name):
NSString *names = @"bob!33@localhost @clement!17@localhost jack!03@localhost";
NSArray *namesarray = [names componentsSeparatedByString:@" "];
NSMutableArray *desiredArray = [[NSMutableArray alloc] initWithCapacity:0];
[namesarray enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
    NSRange rangeofsign = [(NSString*)obj rangeOfString:@"!"];
    // skip entries without a "!"; substringToIndex: would raise for NSNotFound
    if (rangeofsign.location != NSNotFound) {
        NSString *extractedName = [(NSString*)obj substringToIndex:rangeofsign.location];
        [desiredArray addObject:extractedName];
    }
}];
NSLog(@"%@",desiredArray);
The output of the above NSLog would be
(
    bob,
    "@clement",
    jack
)
If you also want to get rid of the @ symbol in the result, you can replace or trim special characters in any string, for example with stringByReplacingOccurrencesOfString:withString:.
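Here is a small sketch that does the split and the symbol clean-up in one pass. The function name ExtractNames and its second parameter (the characters to trim from each name) are hypothetical, and it assumes entries are separated by single spaces.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: takes the text before the first "!" of each
// space-separated entry and trims the given characters from its ends.
static NSArray<NSString *> *ExtractNames(NSString *input, NSString *charactersToTrim) {
    NSCharacterSet *trimSet = [NSCharacterSet characterSetWithCharactersInString:charactersToTrim];
    NSMutableArray<NSString *> *names = [NSMutableArray array];
    for (NSString *entry in [input componentsSeparatedByString:@" "]) {
        NSRange bang = [entry rangeOfString:@"!"];
        if (bang.location == NSNotFound) {
            continue; // skip entries that do not contain a "!"
        }
        NSString *name = [entry substringToIndex:bang.location];
        [names addObject:[name stringByTrimmingCharactersInSet:trimSet]];
    }
    return names;
}
```

stringByTrimmingCharactersInSet: only touches the ends of the string, so a symbol in the middle of a name would be kept.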
If you need further help, you can always leave a comment.
NSMutableArray* nameArray = [[NSMutableArray alloc] init];
NSArray* yourArray = [yourString componentsSeparatedByString:@" "];
for(NSString * nString in yourArray) {
    NSArray* splitObj = [nString componentsSeparatedByString:@"!"];
    [nameArray addObject:splitObj[0]];
}
NSLog(@"%@", nameArray);
I saw the other solutions and it seemed no one tried to use real regular expressions here, so I created a solution which uses it, maybe you or someone else can use it as a possible idea in the future:
NSString *_names = @"bob!33@localhost @clement!17@localhost jack!03@localhost";
NSError *_error = nil;
NSRegularExpression *_regExp = [NSRegularExpression regularExpressionWithPattern:@" ?@?(.*?)!\\d{2}@localhost" options:NSRegularExpressionCaseInsensitive error:&_error];
NSMutableArray *_namesOnly = [NSMutableArray array];
// check the returned object rather than the error pointer to detect failure
if (_regExp) {
    NSLock *_lock = [[NSLock alloc] init];
    [_regExp enumerateMatchesInString:_names options:NSMatchingReportProgress range:NSMakeRange(0, _names.length) usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        if (result.numberOfRanges > 1) {
            if ([_lock tryLock]) [_namesOnly addObject:[_names substringWithRange:[result rangeAtIndex:1]]], [_lock unlock];
        }
    }];
} else {
    NSLog(@"error : %@", _error);
}
the result can be logged...
NSLog(@"_namesOnly : %@", _namesOnly);
...and that will be:
_namesOnly : (
bob,
clement,
jack
)
Or even something as simple as this will do the trick:
NSString *strNames = @"bob!33@localhost @clement!17@localhost jack!03@localhost";
strNames = [[strNames componentsSeparatedByCharactersInSet:[[NSCharacterSet letterCharacterSet] invertedSet]]
            componentsJoinedByString:@""];
NSArray *arrNames = [strNames componentsSeparatedByString:@"localhost"];
NSLog(@"%@", arrNames);
Output:
(
bob,
clement,
jack,
""
)
NOTE: The last element is an empty string, so skip it when iterating.
Assumption:
"localhost" always comes between names
I know it ain't so optimized but it's one way to do this
How can I get the unique characters in an NSString?
What I'm trying to do is get all the illegal characters in an NSString so that I can prompt the user which ones were inputted and therefore need to be removed. I start off by defining an NSCharacterSet of legal characters, separate them with every occurrence of a legal character, and join what's left (only illegal ones) into a new NSString. I'm now planning to get the unique characters of the new NSString (as an array, hopefully), but I couldn't find a reference anywhere.
NSCharacterSet *legalCharacterSet = [NSCharacterSet
    characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLKMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];
NSString *illegalCharactersInTitle = [[self.titleTextField.text.noWhitespace
    componentsSeparatedByCharactersInSet:legalCharacterSet]
    componentsJoinedByString:@""];
That should help you. I couldn't find any ready to use function for that.
NSMutableSet *uniqueCharacters = [NSMutableSet set];
NSMutableString *uniqueString = [NSMutableString string];
[illegalCharactersInTitle enumerateSubstringsInRange:NSMakeRange(0, illegalCharactersInTitle.length) options:NSStringEnumerationByComposedCharacterSequences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![uniqueCharacters containsObject:substring]) {
        [uniqueCharacters addObject:substring];
        [uniqueString appendString:substring];
    }
}];
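Since the question hoped for an array, here is a variation on the same idea as a rough sketch. It swaps the set plus string for an NSMutableOrderedSet, which keeps first-seen order and rejects repeats on its own. The function name UniqueCharacters is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the distinct composed characters of `string`,
// in order of first appearance.
static NSArray<NSString *> *UniqueCharacters(NSString *string) {
    NSMutableOrderedSet<NSString *> *unique = [NSMutableOrderedSet orderedSet];
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        // Adding an object that is already present leaves the set unchanged.
        [unique addObject:substring];
    }];
    return [[unique array] copy];
}
```

Calling it on the illegalCharactersInTitle string from the question would give one array element per offending character, ready to join into a prompt.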
Try with the following adaptation of your code:
// legal set
NSCharacterSet *legalCharacterSet = [NSCharacterSet
    characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLKMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];
// test strings
NSString *myString = @"LegalStrin()";
//NSString *myString = @"francesco@gmail.com"; illegal string
// build a mutable set, because formIntersectionWithCharacterSet: modifies it in place
NSMutableCharacterSet *stringSet = [NSMutableCharacterSet characterSetWithCharactersInString:myString];
// inverts the set
NSCharacterSet *illegalCharacterSet = [legalCharacterSet invertedSet];
// intersection of the string set and the illegal set that modifies the mutable stringset itself
[stringSet formIntersectionWithCharacterSet:illegalCharacterSet];
// prints out the illegal characters with the convenience method
NSLog(@"IllegalStringSet: %@", [self stringForCharacterSet:stringSet]);
I adapted the printing method from another Stack Overflow question:
- (NSString*)stringForCharacterSet:(NSCharacterSet*)characterSet
{
    NSMutableString *toReturn = [@"" mutableCopy];
    unichar unicharBuffer[20];
    int index = 0;
    for (unichar uc = 0; uc < (0xFFFF); uc ++)
    {
        if ([characterSet characterIsMember:uc])
        {
            unicharBuffer[index] = uc;
            index ++;
            if (index == 20)
            {
                NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
                [toReturn appendString:characters];
                index = 0;
            }
        }
    }
    if (index != 0)
    {
        NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
        [toReturn appendString:characters];
    }
    return toReturn;
}
First of all, you have to be careful about what you consider characters. The API of NSString uses the word characters when talking about what Unicode refers to as UTF-16 code units, but dealing with code units in isolation will not give you what users think of as characters. For example, there are combining characters that compose with the previous character to produce a different glyph. Also, there are surrogate pairs, which only make sense when, um, paired.
As a result, you will actually need to collect substrings which contain what the user thinks of as characters.
I was about to write code very similar to Grzegorz Krukowski's answer. He beat me to it, so I won't but I will add that your code to filter out the legal characters is broken because of the reasons I cite above. For example, if the text contains "é" and it's decomposed as "e" plus a combining acute accent, your code will strip the "e", leaving a dangling combining acute accent. I believe your intent is to treat the "é" as illegal.
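To make the point concrete, here is a minimal sketch of filtering by user-visible characters instead of by code units. The function name IllegalCharacters is hypothetical, and it assumes a composed character should count as illegal as soon as any part of it is outside the legal set.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: unique user-visible characters of `text` that contain
// at least one code point outside `legalSet`, in order of first appearance.
static NSArray<NSString *> *IllegalCharacters(NSString *text, NSCharacterSet *legalSet) {
    NSCharacterSet *illegalSet = [legalSet invertedSet];
    NSMutableOrderedSet<NSString *> *illegal = [NSMutableOrderedSet orderedSet];
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        // A composed character is legal only if every part of it is legal,
        // so "e" followed by a combining accent is rejected as a whole.
        if ([substring rangeOfCharacterFromSet:illegalSet].location != NSNotFound) {
            [illegal addObject:substring];
        }
    }];
    return [[illegal array] copy];
}
```

With a legal set of plain letters, a decomposed "é" comes back as a single two-code-unit string rather than as a dangling accent.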
I am developing an iOS app using Xcode 4.6.2.
My app receives from the server, let's say, 1000 characters, which are then stored in an NSString.
What I want to do is: split the 1000 characters to multiple strings. Each string must be MAX 100 characters only.
The next question is how to check when the last word finished before the 100 characters so I don't perform the split in the middle of the word?
A regex-based solution:
NSString *string = // ... your 1000-character input
NSString *pattern = @"(?ws).{1,100}\\b";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
NSMutableArray *result = [NSMutableArray array];
for (NSTextCheckingResult *match in matches) {
    [result addObject:[string substringWithRange:match.range]];
}
The code for the regex and the matches part is taken directly from the docs, so the only difference is the pattern.
The pattern basically matches anything from 1 to 100 characters up to a word boundary. Being a greedy pattern, it will give the longest string possible while still ending with a whole word. This ensures that it won't split any words in the middle.
The (?ws) makes the word recognition work with Unicode's definition of word breaks (the w flag) and treat a line end as any other character (the s flag).
Notice that the algorithm doesn't handle "words" with more than 100 characters well - it will give you the last 100 characters and drop the first part, but that should be a corner case.
(Assuming your words are separated by a single space; otherwise use rangeOfCharacterFromSet:options:range:.)
Use NSString's - (NSRange)rangeOfString:(NSString *)aString options:(NSStringCompareOptions)mask range:(NSRange)aRange with:
aString as @" "
mask as NSBackwardsSearch
Then you need a loop, where you check that you haven't already got to the end of the string, then create a range (for use as aRange) so that you start 100 characters along the string and search backwards looking for the space. Once you find the space, the returned range will allow you to get the string with substringWithRange:.
(written freehand)
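NSMutableArray *lines = [NSMutableArray array];
NSUInteger start = 0;
while (start < sourceString.length) {
    NSUInteger remaining = sourceString.length - start;
    if (remaining <= 100) {
        // whatever is left fits in one line
        [lines addObject:[sourceString substringFromIndex:start]];
        break;
    }
    // look at 101 characters, so that a space right after a full
    // 100-character line still counts as a break point
    NSRange testRange = NSMakeRange(start, 101);
    NSRange hitRange = [sourceString rangeOfString:@" " options:NSBackwardsSearch range:testRange];
    if (hitRange.location == NSNotFound) {
        // no space at all: a word longer than 100 characters, split it hard
        [lines addObject:[sourceString substringWithRange:NSMakeRange(start, 100)]];
        start += 100;
    } else {
        if (hitRange.location > start) {
            [lines addObject:[sourceString substringWithRange:NSMakeRange(start, hitRange.location - start)]];
        }
        start = hitRange.location + 1; // skip the space
    }
}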
This can help
- (NSArray *)chunksForString:(NSString *)str {
    NSMutableArray *chunks = [[NSMutableArray alloc] init];
    NSUInteger sizeChunk = 100; // or whatever you want
    NSUInteger offset = 0;
    while (offset < [str length]) {
        NSUInteger chunkLength = MIN(sizeChunk, [str length] - offset);
        [chunks addObject:[str substringWithRange:NSMakeRange(offset, chunkLength)]];
        offset += chunkLength; // the next chunk starts right after this one
    }
    return chunks;
}
Use NSArray *words = [stringFromServer componentsSeparatedByString:@" "];
This will give you the words.
If you really need each piece to be as close to 100 characters as possible, start appending words while tracking the total length of the appended string, and check that it does not exceed 100.
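The append-and-check idea can be sketched as follows. The function name WrapWords and its maxLength parameter are hypothetical, lengths are counted in UTF-16 code units, and a single word longer than maxLength is kept whole on its own line instead of being split.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: packs space-separated words into lines of at most
// maxLength code units without breaking any word.
static NSArray<NSString *> *WrapWords(NSString *text, NSUInteger maxLength) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];
    for (NSString *word in [text componentsSeparatedByString:@" "]) {
        if (word.length == 0) {
            continue; // produced by consecutive spaces
        }
        // The +1 accounts for the space that joins the word to the line.
        if (current.length > 0 && current.length + 1 + word.length > maxLength) {
            [lines addObject:[current copy]];
            [current setString:@""];
        }
        if (current.length > 0) {
            [current appendString:@" "];
        }
        [current appendString:word];
    }
    if (current.length > 0) {
        [lines addObject:[current copy]];
    }
    return lines;
}
```

For the question you would call WrapWords(stringFromServer, 100). Note that runs of several spaces are collapsed to one, which may or may not be what you want.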
I received an NSString from the server. Now I want to split it into the substring which I need.
How to split the string?
For example:
substring1:read from the second character to 5th character
substring2:read 10 characters from the 6th character.
You can also split a string by a substring, using NSString's componentsSeparatedByString: method.
Example from documentation:
NSString *list = @"Norman, Stanley, Fletcher";
NSArray *listItems = [list componentsSeparatedByString:@", "];
NSString has a few methods for this:
[myString substringToIndex:index];
[myString substringFromIndex:index];
[myString substringWithRange:range];
Check the documentation for NSString for more information.
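Applied to the two substrings in the question, a minimal sketch could look like this. substringWithRange: raises NSRangeException when the range runs past the end of the string, so the hypothetical helper SafeSubstring below clamps the range first. That matters for data coming from a server. Indexes are zero-based, so the second character is at location 1 and the 6th at location 5.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: like substringWithRange:, but clamps the range to the
// string instead of raising when the string is shorter than expected.
static NSString *SafeSubstring(NSString *string, NSUInteger location, NSUInteger length) {
    if (location >= string.length) {
        return @"";
    }
    NSUInteger available = string.length - location;
    return [string substringWithRange:NSMakeRange(location, MIN(length, available))];
}
```

substring1 (second to 5th character) would then be SafeSubstring(myString, 1, 4), and substring2 (10 characters from the 6th) would be SafeSubstring(myString, 5, 10). Both count UTF-16 code units, so they can cut through a composed character.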
I wrote a little method to split strings in a specified amount of parts.
Note that it only supports single separator characters. But I think it is an efficient way to split an NSString.
//split string into given number of parts
-(NSArray*)splitString:(NSString*)string withDelimiter:(NSString*)delimiter inParts:(int)parts{
    NSMutableArray* array = [NSMutableArray array];
    NSUInteger len = [string length];
    unichar buffer[len+1];
    //put separator in buffer
    unichar separator[1];
    [delimiter getCharacters:separator range:NSMakeRange(0, 1)];
    [string getCharacters:buffer range:NSMakeRange(0, len)];
    int startPosition = 0;
    int length = 0;
    for(int i = 0; i < len; i++) {
        //if the separator was found and fewer than parts-1 pieces are collected, add the current piece
        if (buffer[i]==separator[0] && array.count < parts-1) {
            if (length>0) {
                [array addObject:[string substringWithRange:NSMakeRange(startPosition, length)]];
            }
            startPosition += length+1;
            length = 0;
            if (array.count >= parts-1) {
                break;
            }
        }else{
            length++;
        }
    }
    //add the last part of the string to the array
    [array addObject:[string substringFromIndex:startPosition]];
    return array;
}