If I have a string in Objective-C called text that is really long, such as:
NSString *text = [NSString stringWithFormat:@"%@", _sometext];
How can I scan _sometext and add a line break to the string every hundred characters (letters)?
So if I had the _sometext as
It was November. Although it was not yet late, the sky was dark when I
turned into Laundress Passage. Father had finished for the day,
switched off the shop lights and closed the shutters; but so I would
not come home to darkness he had left on the light over the stairs to
the flat.
How can I make it so it puts a line break after
It was November. Although it was not yet late, the sky was dark when I
turned into Laundress Passage
and after
. Father had finished for the day, switched off the shop lights and
closed the shutters; but so I wo
(since those were 100 characters)?
But instead of stopping in the middle of a word, I would like it to skip that word and end at the previous word. Example: if the line ended at "but so I wo" and cut off the word "would", it would stop at "but so I" instead.
The easiest solution I can think of off the top of my head is a two step process.
Step one involves breaking the original string into an array of 100 character strings.
Step two involves joining that array of strings with the newline character.
NSMutableArray *lines = [NSMutableArray array];
while ([originalString length] > 100) {
[lines addObject:[originalString substringToIndex:100]];
originalString = [originalString substringFromIndex:100];
}
[lines addObject: originalString];
NSString *reformattedString = [lines componentsJoinedByString:@"\n"];
You could write code that would use the method rangeOfString:options:range:
to do that.
You'd create an NSRange for the first 100 characters of your string. Then you'd search backwards in that range for a space. When you found a space, you'd add from the beginning of the string to the space to your output string, plus a line break. Then you'd set your search range to the next 100 characters of your string after the space, and again search backwards for a space. Repeat until you've processed the entire string.
See the NSString class reference for the details on the rangeOfString:options:range: method.
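The backwards-search approach described above could be sketched roughly as follows. WrapAtWordBoundaries is a hypothetical helper name, not something from the original answer, and the sketch assumes single spaces are the only break points and counts length in UTF-16 units.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): wraps `text` so that no line is
// longer than `width` UTF-16 units, breaking at the last space that fits.
static NSString *WrapAtWordBoundaries(NSString *text, NSUInteger width) {
    if (width == 0) {
        return text; // nothing sensible to do with a zero-width line
    }
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    NSUInteger start = 0;
    NSUInteger length = text.length;

    while (length - start > width) {
        // Look at width + 1 characters so that a space sitting right after a
        // full line still counts as a valid break point.
        NSRange window = NSMakeRange(start, width + 1);
        NSRange space = [text rangeOfString:@" "
                                    options:NSBackwardsSearch
                                      range:window];
        if (space.location == NSNotFound || space.location == start) {
            // No usable space in the window: fall back to a hard break.
            [lines addObject:[text substringWithRange:NSMakeRange(start, width)]];
            start += width;
        } else {
            [lines addObject:[text substringWithRange:
                                 NSMakeRange(start, space.location - start)]];
            start = space.location + 1; // skip the space itself
        }
    }
    [lines addObject:[text substringFromIndex:start]];
    return [lines componentsJoinedByString:@"\n"];
}
```

Calling WrapAtWordBoundaries(_sometext, 100) would then give the wrapped text. Note that the hard-break fallback can split a word (or a surrogate pair) when a single word is longer than the line width.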
I have been having a lot of trouble with NSString's stringWithFormat: method as of late. I have written an object that allows you to align N lines of text (separated by new lines) either centered, right, or left. At the core of my logic is NSString's stringWithFormat. I use this function to pad my strings with spaces on the left or right of individual lines to produce the alignment I want. Here is an example:
NSString *str = @"$3.00" --> 3 dollars
[NSString stringWithFormat:@"%8s", [str cStringUsingEncoding:NSUnicodeStringEncoding]] --> returns --> "   $3.00"
As you can see the above example works great, I padded 3 spaces on the left and the resulting text is right aligned/justified. Problems begin to arise when I start to pass in foreign currency symbols, the formatting just straight up does not work. It either adds extra symbols or just returns garbage.
NSString *str = @"Kč1.00" --> 1 Czech koruna (the Czech Republic's currency)
[NSString stringWithFormat:@"%8s", [str cStringUsingEncoding:NSUnicodeStringEncoding]] --> returns --> " Kč1.00"
The above is just flat out wrong... Now I am not a string encoding expert but I do know NSString uses the international standardized unicode encoding for special characters well outside basic ASCII domain.
How can I fix my problem? What encoding should I use? I have tried so many different encoding enums that I have lost count, everything from NSMACOSRomanEncoding to NSUTF32UnicodeBigEndian. My last resort will be to ditch stringWithFormat: altogether; maybe it was only meant for simple UTF8Strings and basic symbols.
If you want to represent currency, it is a lot better to use an NSNumberFormatter with the currency style (NSNumberFormatterCurrencyStyle). It reads the current locale and shows the currency based on it. You just need to ask for its string representation and append that to a string.
It will be a lot easier than managing unicode formats, check a tutorial here
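A minimal sketch of the formatter approach, assuming you want to pin the locale explicitly rather than rely on currentLocale. CurrencyString is a hypothetical helper name introduced only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: formats `amount` as currency for an explicit locale
// identifier such as "en_US" or "cs_CZ".
static NSString *CurrencyString(NSNumber *amount, NSString *localeIdentifier) {
    NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
    formatter.numberStyle = NSNumberFormatterCurrencyStyle;
    formatter.locale = [NSLocale localeWithLocaleIdentifier:localeIdentifier];
    return [formatter stringFromNumber:amount];
}
```

For example, CurrencyString(@3, @"en_US") yields "$3.00", while a Czech locale places the Kč symbol according to that locale's own conventions, so the symbol position and separators should not be hard-coded.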
This will give you the required result
NSString *str = @"Kč1.00";
str = [NSString stringWithFormat:@"%@%8@", [@" " stringByPaddingToLength:3 withString:@" " startingAtIndex:0], str];
Output: @"   Kč1.00";
Just one more trick to achieve this -
If you like use it :)
[NSString stringWithFormat:@"%8s%@", [@"" cStringUsingEncoding:NSUTF8StringEncoding], str];
This will work too.
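Another option is to avoid C strings and %s entirely and pad at the NSString level, so no encoding conversion is involved. The sketch below is one way to do that; RightAligned is a hypothetical helper name, and it measures width in UTF-16 units, which matches visible characters only for simple text such as these currency strings.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: right-aligns `string` in a field of `width` UTF-16
// units by prepending spaces. Strings already that long are returned as-is.
static NSString *RightAligned(NSString *string, NSUInteger width) {
    if (string.length >= width) {
        return string;
    }
    NSString *padding = [@"" stringByPaddingToLength:width - string.length
                                          withString:@" "
                                     startingAtIndex:0];
    return [padding stringByAppendingString:string];
}
```

With a width of 8, "$3.00" gets three leading spaces and "Kč1.00" gets two, regardless of which currency symbol is used.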
I have some NSString like :
test = @"this is %25test%25 string";
I am trying to replace test with some Arabic text, but it is not replaced exactly as expected:
[test stringByReplacingOccurrencesOfString:@"test" withString:@"اختبار"];
and the result is :
this is %25 اختبار %25 string
I read somewhere that there could be a problem with encoding or text alignment. Is some extra adjustment needed for Arabic string operations?
EDIT: I have also tried NSMutableString's insert method, but I still get the same result.
EDIT 2:
One other thing that occurs to me that is causing most of your trouble in this specific example. You have a partially percent-encoded string above. You have spaces, but you also have %25. You should avoid doing that. Either percent-encode a string or don't. Convert it all at once when required (using stringByAddingPercentEscapesUsingEncoding:). Don't try to "hard-code" percent-encoding. If you just used "this is a %اختبار% string" (and then percent-encoded the entire thing at the end), all your directional problems would go away (see how that renders just fine?). The rest of these answers address the more general question when you really need to deal with directionality.
EDIT:
The original answer after the line relates to human-readable strings, and is correct for human-readable strings, but your actual question (based on your followups) is about URLs. URLs are not human-readable strings, even if they occasionally look like them. They are a sequence of bytes that are independent of how they are rendered to humans. "اختبار" cannot be in the path or fragment parts of an URL. These characters are not part of the legal set of characters for those sections (اختبار is allowed to be part of the host, but you have to follow the IDN rules for that).
The correct URL encoding for "this is a %25<arabic>%25 string" is:
this%20is%20a%20%2525%D8%A7%D8%AE%D8%AA%D8%A8%D8%A7%D8%B1%2525%20string
If you decode and render this string to the screen, it will appear like this:
this is a %25اختبار%25 string
But it is in fact exactly the string you mean (and it is the string you should pass to the browser). Follow the bytes (like the computer will):
this - this (ALPHA)
%20 - <space> (encoded)
is - is (ALPHA)
%20 - <space> (encoded)
a - a (ALPHA)
%20 - <space> (encoded)
%25 - % (encoded)
25 - 25 (DIGIT)
%D8%A7 - ا (encoded)
%D8%AE - خ (encoded)
%D8%AA - ت (encoded)
%D8%A8 - ب (encoded)
%D8%A7 - ا (encoded)
%D8%B1 - ر (encoded)
%25 - % (encoded)
25 - 25 (DIGIT)
%20 - <space> (encoded)
string - string (ALPHA)
The Unicode BIDI display algorithm is doing what it means to do; it just isn't what you expect. But those are the bytes and they're in the correct order. If you add any additional bytes (such as LRO) to this string, then you are modifying the URL and it means something different.
So the question you need to answer is, are you making an URL, or are you making a human-readable string? If you're making an URL, it should be URL-encoded, in which case you will not have this display problem (unless this is part of the host, which is a different set of rules, but I don't believe that's your problem). If this is a human-readable string, see below about how to provide hints and overrides to the BIDI algorithm.
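As a rough sketch of encoding the whole string in one step: the example below uses stringByAddingPercentEncodingWithAllowedCharacters: (the newer replacement for stringByAddingPercentEscapesUsingEncoding:) with the URL path character set. PercentEncodedPath is a hypothetical helper name, and the choice of the path character set is an assumption about which URL component the text belongs to.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: percent-encodes every character that is not allowed
// in a URL path. Non-ASCII characters are encoded as their UTF-8 bytes, and
// a literal "%" becomes "%25".
static NSString *PercentEncodedPath(NSString *raw) {
    return [raw stringByAddingPercentEncodingWithAllowedCharacters:
                    [NSCharacterSet URLPathAllowedCharacterSet]];
}
```

Decoding the result again with stringByRemovingPercentEncoding should return the original string, which is a convenient check that nothing was encoded twice by accident.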
It's possible that you really need both (a human-friendly string, and a correct URL that can be pasted). That's fine, you just need to handle the clipboard yourself. Show the string, but when the user goes to copy it, replace it with the fully encoded URL using UIPasteboard or by overriding copy:. See Copy, Cut, and Paste Operations. This is fairly common (note how Safari displays just "stackoverflow.com" in the address bar, but if you copy and paste it, it pastes "https://stackoverflow.com/"). Same thing.
Original answer related to human-readable strings.
Believe it or not, stringByReplacingOccurrencesOfString:withString: is doing the right thing. It's just not displaying the way you expect. If you walk through characterAtIndex:, you'll find that it's:
% 2 5 ا ...
The problem is that the layout engine gets very confused around all the "neutral direction" characters. The engine doesn't understand whether you meant "%25" to be attached to the left to right part or right to left part. You have to help it out here by giving it some explicit directional characters to work with.
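To check the logical (storage) order yourself, independent of how the text is rendered, you could dump the UTF-16 code units one by one. This is only an illustrative sketch; CodeUnitDump is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: lists the UTF-16 code units of `string` in logical
// (storage) order, formatted like "U+0025 U+0032 U+0035 U+0627".
static NSString *CodeUnitDump(NSString *string) {
    NSMutableArray<NSString *> *units = [NSMutableArray array];
    for (NSUInteger i = 0; i < string.length; i++) {
        [units addObject:[NSString stringWithFormat:@"U+%04X",
                          (unsigned int)[string characterAtIndex:i]]];
    }
    return [units componentsJoinedByString:@" "];
}
```

Logging CodeUnitDump(result) for the replaced string shows U+0025, U+0032, U+0035 immediately followed by the first Arabic letter (U+0627), which confirms that the replacement itself is stored in the order you wrote it.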
There are a few ways to go about this. First, you can do it the Unicode 6.3 tr9-29 way with Explicit Directional Isolates. This is exactly the kind of problem that Isolates are meant to solve. You have some piece of text whose direction you want to be considered completely independently of all other text. Unicode 6.3 isn't actually supported by iOS or OS X as best I can tell, but for many (though not all) uses, it "works."
You want to surround your Arabic with FSI (FIRST STRONG ISOLATE U+2068) and PDI (POP DIRECTIONAL ISOLATE U+2069). You could also use RLI (RIGHT-TO-LEFT ISOLATE) to be explicit. FSI means "treat this text as being in the direction of the first strong character you find."
So you could ideally do this:
NSString *test = @"this is a %25\u2068test\u2069%25 string";
NSString *arabic = @"اختبار";
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:arabic];
That works if you know what you're going to substitute before hand (so you know where to put the FSI and PDI). If you don't, you can do it the other way and make it part of the substitution:
NSString * const FSI = @"\u2068";
NSString * const PDI = @"\u2069";
NSString *test = @"this is %25test%25 string";
NSString *arabic = @"اختبار";
NSString *replaceString = [@[FSI, arabic, PDI] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
I said this "mostly" works. It's fine for UILabel, and it probably is fine for anything using Core Text. But in NSLog output, you'll get these extra "placeholder" characters:
You might get this other places, too. I haven't checked UIWebView for instance.
So there are some other options. You can use directional marks. It's a little awkward, though. LRM and RLM are zero-width strongly directional characters. So you can bracket the arabic with LRM (left to right mark) so that the arabic doesn't disturb the surrounding text. This is a little ugly since it means the substitution has to be aware of what it's substituting into (which is why isolates were invented).
NSString * const LRM = @"\u200e";
NSString *test = @"this is a %25test%25 string";
NSString *replaceString = [@[LRM, arabic, LRM] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
BTW, Directional Marks are usually the right answer. They should always be the first thing you try. This particular problem is just a little too tricky.
One more way is to use Explicit Directional Overrides. These are the giant "do what I tell you to do" hammer of the Unicode world. You should avoid them whenever possible. There are some security concerns with them that make them forbidden in certain places (<RLO>elgoog<PDF>.com would display as google.com for instance). But they will work here.
You bracket the whole string with LRO/PDF to force it to be left-to-right. You then bracket the substitution with RLO/PDF to force it to the right-to-left. Again, this is a last resort, but it lets you take complete control over the layout:
NSString * const LRO = @"\u202d";
NSString * const RLO = @"\u202e";
NSString * const PDF = @"\u202c";
NSString *test = [@[LRO, @"this is a %25test%25 string", PDF] componentsJoinedByString:@""];
NSString *arabic = @"اختبار";
NSString *replaceString = [@[RLO, arabic, PDF] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
I would think you could solve this problem with the Explicit Directional Embedding characters, but I haven't really found a way to do it without at least one override (for instance, you could use RLE instead of RLO above, but you still need the LRO).
Those should give you the tools you need to figure all of this out. See the Unicode TR9 for the gory details. And if you want a deeper introduction to the problem and solutions, see Cal Henderson's excellent Understanding Bidirectional (BIDI) Text in Unicode.
You should try like this:
NSString *test = @"this is %25test%25 string";
NSString *test2 = [[[test stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding] componentsSeparatedByString:@"test"] componentsJoinedByString:@"اختبار"];
I am creating a label like this:
CCLabelBMFont *label1_ =
[CCLabelBMFont labelWithString:@"description: -" fntFile:@"comicsans.fnt" width:270 alignment:kCCTextAlignmentLeft];
and:
[label1_ setString:
@"someText\n and some newline \nand another new line too but this is last"];
As you can see, this string has 2 newline escape characters. When I set it, I lose the last 2 characters.
It is shown something like this:
someText
and some newline
and another new line too but this is la
So the last two letters are lost somehow.
What could be the reason for this problem?
Is it a cocos2d v2.1 (stable) bug, or am I in a horror film? If so, what should I do?
\r has the same effect as \n.
I don't know why. Maybe you know.
If I don't use the \r or \n escape characters, CCLabelBMFont shows the correct string, without losing any trailing characters.
So my temporary solution is to remove the escape characters from the string, which fixes the problem.
But this does not fix the bug in cocos2d v2.1 (stable).
I think the CCLabel family of classes does not work reliably when the string contains \n escape characters.
I had the same problem since I was using CCLabelBMFont to animate text typing.
I realized that whenever the text to type has newlines \n, CCLabelBMFont will not type the trailing characters.
I resolved this issue through a simple hack.
First I count the number of newlines in the text to be displayed by CCLabelBMFont.
NSRegularExpression *regx = [NSRegularExpression regularExpressionWithPattern:@"\n"
options:0
error:nil];
NSUInteger newlinesCount = [regx numberOfMatchesInString:typeString
options:0
range:NSMakeRange(0, typeString.length)];
Then I just append some white spaces at the end of the string that I'm about to type. The number of white spaces to add equals to the number of newlines.
for (int i = 0; i < newlinesCount; i++) {
typeString = [typeString stringByAppendingString:@" "];
}
// This sets the string for the BMFont, it should now display all the characters
// that you wanted to type originally.
[self.labelBMFont setString:typeString];
Tested on cocos2d 2.1
So as I work my way through understanding string methods, I came across this useful class
NSCharacterSet
which is described quite well in this post as being similar to a string, except that it holds its characters in an unordered set:
What is differnce between NSString and NSCharacterset?
So then I came across the useful method invertedSet, and it became a little less clear what exactly was happening. I read page after page on it, but they all glossed over the basics and jumped into advanced explanations. So if you want to know, simply put, what this is and why we use it, it is not so easy; instead you get statements like this from the Apple documentation: "A character set containing only characters that don’t exist in the receiver." And how do I use this exactly?
So here is what I understand to be the use. PLEASE explain in simple terms if I have described this incorrectly.
Example Use:
Create a list of characters in an NSCharacterSet that you want to limit a string to.
NSString *validNumberChars = @"0123456789"; // Only these are valid.
// Now assign them to an NSCharacterSet object to use for searching and comparing later.
NSCharacterSet *validCharSet = [NSCharacterSet characterSetWithCharactersInString:validNumberChars];
// Now create an inverted set of validCharSet.
NSCharacterSet *invertedValidCharSet = [validCharSet invertedSet];
// Now scrub your input string of bad characters, i.e. those not in validCharSet.
NSString *scrubbedString = [inputString stringByTrimmingCharactersInSet:invertedValidCharSet];
// By passing in invertedValidCharSet as the characters to trim out, you are left with only characters that are in the original set, captured here in scrubbedString.
So is this how to use this feature properly, or did I miss anything?
Thanks
Steve
A character set is just that: a set of characters. When you invert a character set you get a new set that has every character except those from the original set.
In your example you start with a character set containing the 10 standard digits. When you invert the set you get a set that has every character except the 10 digits.
validCharSet = [NSCharacterSet characterSetWithCharactersInString:validNumberChars];
This creates a character set containing the 10 characters 0, 1, ..., 9.
invertedValidCharSet = [validCharSet invertedSet];
This creates the inverted character set, i.e. the set of all Unicode characters without
the 10 characters from above.
scrubbedString = [inputString stringByTrimmingCharactersInSet:invertedValidCharSet];
This removes from the start and end of inputString all characters that are in
the invertedValidCharSet. For example, if
inputString = @"abc123d€f567ghj😄"
then
scrubbedString = @"123d€f567"
It does not, as you perhaps expect, remove all characters from the given set.
One way to achieve that is (copied from NSString - replacing characters from NSCharacterSet):
scrubbedString = [[inputString componentsSeparatedByCharactersInSet:invertedValidCharSet] componentsJoinedByString:@""];
This is probably not the most efficient method, but as your question was about understanding
NSCharacterSet I hope that it helps.
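Another common use of invertedSet is validation rather than scrubbing: if no character from the inverted set occurs in the input, then every character is in the original set. A small sketch, with ContainsOnlyCharacters as a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `input` is non-empty and made up only of
// characters that appear in `allowed`.
static BOOL ContainsOnlyCharacters(NSString *input, NSString *allowed) {
    NSCharacterSet *disallowed =
        [[NSCharacterSet characterSetWithCharactersInString:allowed] invertedSet];
    // If no character from the inverted set occurs, every character is allowed.
    return input.length > 0 &&
           [input rangeOfCharacterFromSet:disallowed].location == NSNotFound;
}
```

For instance, ContainsOnlyCharacters(@"12345", @"0123456789") is YES, while the same call with @"12a45" is NO because "a" belongs to the inverted set.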
I am reading 2-5 MB text files into NSStrings that I want to present in a UITextView.
I have found that it takes too long to process the whole string, especially since I remove line breaks every time.
So I have decided to separate the NSString into individual pages that I can navigate with two buttons ("previous" and "next"). The first thing I did was to split the NSString into substrings 500 characters long (I then remove line breaks before presenting each substring).
This works great and is fast enough, but there is one little problem that annoys me: the last word of the presented substring often gets cut off in the middle.
So instead of splitting every 500 characters, I split after every 20 periods (". "). This also worked very well, until I realized that not every text file that might be loaded will contain periods, because some languages do not use them.
So I am looking for a solution where I can separate long text files into smaller substrings, about a page long, that I can navigate and that do not cut the last word of a substring in half. Any help would be appreciated.
I should add that I have tried to split after x words (i.e. x whitespace characters), which I think might be the best solution, but I cannot think of any way to do it other than componentsSeparatedByString:@". ", which takes too long because it goes through the whole string.
Is there a good way of enumerating a string that still allows me to navigate through pages, perhaps by saving the substring's range location or something?
Store all your data in a NSMutableArray, then enumerate by sentences in your string and add to a mutable array.
NSString *string = @"YOUR VERY LONG FILE";
NSMutableArray *array = [NSMutableArray arrayWithCapacity:100];
NSRange range = NSMakeRange(0, string.length);
[string enumerateSubstringsInRange:range options:NSStringEnumerationBySentences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
NSLog(@"substring: %@", substring);
NSLog(@"substringRange len:%lu, loc:%lu", (unsigned long)substringRange.length, (unsigned long)substringRange.location);
NSLog(@"enclosingRange len:%lu loc:%lu", (unsigned long)enclosingRange.length, (unsigned long)enclosingRange.location);
[array addObject:substring];
}];
Also take a look at other NSStringEnumeration options like ByLines, ByParagraphs or ByWords.
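If the goal is previous/next navigation, one possible variation is to store only the range of each page instead of the substrings themselves, and cut the substring when a page is shown. The sketch below enumerates by words and closes a page every wordsPerPage words; PageRanges and wordsPerPage are hypothetical names, and the page size is an assumption you would tune.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns NSValue-wrapped NSRanges, one per page. Each
// page holds at most `wordsPerPage` words and never ends inside a word.
static NSArray<NSValue *> *PageRanges(NSString *text, NSUInteger wordsPerPage) {
    NSMutableArray<NSValue *> *pages = [NSMutableArray array];
    __block NSUInteger pageStart = 0;
    __block NSUInteger wordsOnPage = 0;

    // SubstringNotRequired avoids creating an NSString for every word.
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByWords |
                                     NSStringEnumerationSubstringNotRequired
                          usingBlock:^(NSString *substring, NSRange wordRange,
                                       NSRange enclosingRange, BOOL *stop) {
        wordsOnPage++;
        if (wordsOnPage == wordsPerPage) {
            // Close the page after this word and any separators that follow it.
            NSUInteger pageEnd = NSMaxRange(enclosingRange);
            [pages addObject:[NSValue valueWithRange:
                                 NSMakeRange(pageStart, pageEnd - pageStart)]];
            pageStart = pageEnd;
            wordsOnPage = 0;
        }
    }];
    // Whatever is left over forms the last, shorter page.
    if (pageStart < text.length) {
        [pages addObject:[NSValue valueWithRange:
                             NSMakeRange(pageStart, text.length - pageStart)]];
    }
    return pages;
}
```

Showing page n is then [text substringWithRange:pages[n].rangeValue], and the "previous" and "next" buttons only need to change n. The enumeration still walks the whole string once, so for very large files it may be worth running it off the main thread.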