Detect actual text direction in NSString or NSAttributedString - ios

Question
Is there a definitive way to detect whether the text direction in an NSString or NSAttributedString is right-to-left? The text is received from an external resource, not entered in the app (so I can't use UITextInput protocol methods).
Bad solution
I have been detecting it using CFStringTokenizerCopyBestStringLanguage() and then +[NSLocale characterDirectionForLanguage:], but it is pretty unreliable for strings that contain only a few Arabic characters and many Latin ones. They are processed as RTL when displayed in the label, but the incorrect direction detection causes inconsistent behaviour.
Investigation
During development, I have found out that when creating NSAttributedString with no attributes except for the font and displaying it using TTTAttributedLabel, RTL text aligns to the right edge. When using simple NSString, or when using standard UILabel with attributed string, alignment stays left.
So, there has to be something in the text that says "it is RTL" and there is some method to detect it.
I analysed what I am receiving from the server, and there were no standard Unicode bidi control characters. This is how it looks. The JSON is sent in ASCII with all Unicode characters escaped:
"\u0643\u062a\u0628\u062a \u0648\u0648\u062c\u0650 .."
When unescaped, it looks like this, with \u0643 on the far right before the dots:
"كتبت ووجِ .."
As far as I can tell, every escape sequence represents a single Arabic character, and there are no U+202B or similar control characters that could be used to easily detect direction.
Yet, when rendering this text through TTTAttributedLabel, it aligns right. I started looking into the source code, and it doesn't do anything to detect the direction or set the alignment. It only creates the framesetter using CTFramesetterCreateWithAttributedString(), and when it then gets lines from it using CTFrameGetLines() and line positions with CTFrameGetLineOrigins(), they are already aligned to the right.
So, does anyone know if direction can be detected solely from the text with publicly available (and preferably fast) API methods?

Check this link, which should help: Automatically align text in UILabel according to text language
Change this part of the code there to get right and left alignment:
if (isCodePointStrongRTL(utf32chars[i]))
    return 2;
if (isCodePointStrongLTR(utf32chars[i]))
    return 0;
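For reference, the first-strong-character idea behind that snippet can be sketched in Swift using only the standard library. This is a rough approximation, not the linked answer's code: the names BaseDirection, isLikelyRTL and firstStrongDirection are made up for illustration, and the code-point ranges cover only the Hebrew and Arabic-family blocks plus their presentation forms, not the full set of Unicode bidi classes.

```swift
// Hypothetical helper types and names, for illustration only.
enum BaseDirection {
    case leftToRight, rightToLeft, neutral
}

// Approximation (an assumption): treats the Hebrew/Arabic-family blocks and
// their presentation forms as right-to-left. Real bidi classes are finer-grained.
func isLikelyRTL(_ scalar: Unicode.Scalar) -> Bool {
    switch scalar.value {
    case 0x0590...0x08FF, 0xFB1D...0xFDFF, 0xFE70...0xFEFC:
        return true
    default:
        return false
    }
}

// Returns the direction of the first alphabetic scalar, skipping digits,
// spaces and punctuation, which do not decide the direction here.
func firstStrongDirection(of text: String) -> BaseDirection {
    for scalar in text.unicodeScalars where scalar.properties.isAlphabetic {
        return isLikelyRTL(scalar) ? .rightToLeft : .leftToRight
    }
    return .neutral
}
```

A caller could map .rightToLeft to NSTextAlignment.right and .leftToRight to NSTextAlignment.left. Because only the first strong character decides, a string that starts with Arabic but is mostly Latin is still reported as right-to-left, which is the case the language-guessing approach in the question gets wrong.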

I saw this comment on another question and thought it was important for others to see. It was exactly what I was looking for on RTL/LTR detection:
"but I've been using a solution to explicitly set the direction based on known RTL languages, which used this as a starting point: https://stackoverflow.com/a/16309559"

Related

How can I standardize the varying truncating dot characters of UILabel?

I have a plist file which I decode to load data onto my application.
This plist file contains String type values that get mapped to UILabel's text property.
I noticed that the truncating behavior of the text in the label is not always the same.
To be more specific, the three dots that are added when the text is truncated come, contrary to my expectation, in two kinds: one being ... and the other being ⋯, which appears to be a different Unicode character.
I checked UILabel's attribute settings but I was unable to find any settings related to this behavior.
Has anyone else experienced this problem and standardized the truncating character to be ...?
Here is the image describing the problem mentioned above. Both labels have 2 lines and have new line escape character inserted between the first line and the second line of text. I am posting a link to this image because apparently I don't have enough reputation to post an image.
varying truncating characters of UILabel
IMO this is a bug in UILabel, and it may be worth opening a Feedback about it.
TL;DR: I recommend using TTTAttributedLabel.
Long-winded answer, because this was such an interesting question:
UILabel uses a different ellipsis based on the language script being truncated. As you've noticed, for most scripts, they use HORIZONTAL ELLIPSIS (…), or something very similar. But for Chinese, Japanese, and Korean (CJK), they use MIDLINE HORIZONTAL ELLIPSIS (⋯), or again, something very similar. The only other exception I've found is Burmese, which uses three circles that I don't recognize.
In my tests, all the following used …: Latin, Cyrillic, Bengali, Arabic, Hebrew, Hindi, Thai, Kannada, Nepali, and Mongolian (I kid. iOS can't layout Mongolian. Nobody can layout Mongolian, but it still uses …). UILabel even uses … for Lao, even though I thought ຯ was specifically for that, but I guess eventually everything becomes Latin.
The problem with UILabel being so clever for CJK and Burmese is that it decides what character to use exclusively by looking at the first character being removed. And it thinks SPACE is Latin (or at least not "special").
So what to do? My recommendation is probably to use TTTAttributedLabel, since it lets you configure the truncation character, and more importantly, is open source so you can fix it if it's not working the way you want.
The second option would be to truncate the text by hand using techniques like the one described in How to change truncate characters in UILabel?. There are probably better ways to do it using CTFrameGetVisibleStringRange instead of constantly shrinking the string until it fits, but I don't know if it's worth the effort. (If that path sounds useful, I could probably write up something that does it. It's just probably not worth the trouble.)
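As a rough illustration of truncating by hand without shrinking the string one character at a time, here is a binary search over the prefix length. It is only a sketch under assumptions: the function name truncated and the fits closure are hypothetical, the closure stands in for whatever measurement the app uses (for example a bounding-rect or Core Text measurement), and it assumes that if a string fits, every shorter prefix fits too.

```swift
// Sketch: returns `text` unchanged if it fits, otherwise the longest prefix
// that fits once `ellipsis` is appended. `fits` is a hypothetical predicate
// supplied by the caller; it is assumed to be monotone in the prefix length.
func truncated(_ text: String,
               ellipsis: String = "…",
               fits: (String) -> Bool) -> String {
    if fits(text) { return text }
    let characters = Array(text)
    var low = 0                        // prefix length assumed to fit
    var high = characters.count - 1    // longest prefix worth trying
    while low < high {
        let mid = (low + high + 1) / 2
        if fits(String(characters[..<mid]) + ellipsis) {
            low = mid                  // this prefix fits, try a longer one
        } else {
            high = mid - 1             // too long, try a shorter one
        }
    }
    return String(characters[..<low]) + ellipsis
}
```

Because the ellipsis is a parameter, the same character is used regardless of the script at the cut point, which is the standardization the question asks for.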
And the final option I know is to replace the SPACE character with an "equivalent" CJK character. The closest I've found that works is HANGUL FILLER (U+3164), but I don't like it. It's too wide, and I expect that it will make Korean uncomfortable to read (but I rarely try to read Korean, so I may be wrong here):
With SPACE: 안녕 하세요
With FILLER: 안녕ㅤ하세요
There's also HALFWIDTH HANGUL FILLER (U+FFA0), which is better, but UILabel seems to make it zero width (this may be a font issue, so maybe worth trying):
With SPACE: 안녕 하세요
With HALF: 안녕ᅠ하세요
// Replace each SPACE with HANGUL FILLER (U+3164) before assigning the text.
let string = "안녕 하세요"
let filler = "\u{3164}"
label.text = string.replacingOccurrences(of: " ", with: filler)
OTOH, you may run into the same problem if you use any other non-CJK characters, like Latin punctuation or Arabic numerals. So this solution may not scale. And you should make sure that VoiceOver properly ignores it.

NSAttributedString: Strikethrough text with replacement text above it

I am trying to draw an NSAttributedString (actually, a constructed NSMutableAttributedString) where the "original" text has been struck and replacement text inserted above it (I'm trying to replicate the look/feel of an Ancient Greek manuscript).
My technique is a combination of NSBaselineOffsetAttributeName with NSKernAttributeName, but it appears that using a negative value for NSKernAttributeName "wipes away" the strikethrough of the text, even if the characters don't overlap.
If I put an extra space after the "A" character (in the original text), the "A" gets the strikethrough, but the "EI" is also offset to the right. So, it appears that the offset/kerning of the "EI" text affects how much of the strikethrough actually occurs.
Here's what I'd like to reproduce (I don't care about the angle; it's not about a picture-perfect reproduction; just the gist):
Here's what is currently happening:
This is when I add an extra space after the strikethrough:
So, the only other thing I can think of would be to render a separate NSAttributedString in the correct place, separate from the current one, but I have no idea how to calculate the location of a specific character in an NSAttributedString when it's drawn. I'm drawing to a PDF, not to any on-screen control like a UILabel. Alternatively, I could draw the "strikethrough" myself as a line, but that seems to still require knowing the coordinates for the text in question, which is calculated on-the-fly, and I hope to use this method to reproduce a large sample of ancient texts, which means doing it by hand just isn't a good answer here.
Anything I'm missing, or any out-of-the-box ideas to try?

Center, or otherwise format, unicode symbol in NSString w/Objective-C

DISCLAIMER: The information in the picture is completely fake and is for testing purposes.
I have a unicode character that I am using in titleForHeaderInSection prepended on to an NSString. Some mock text would look like this:
[NSString stringWithFormat:@"\U0001F512 %@", fullTitle]
Problem is, it's not centered with the rest of the text:
Is there a way to nudge that padlock up?
Unfortunately, having tried this in the past, I can say with 99.9% certainty that it won't work with standard strings. The way I wound up solving it was using constraints and getting the top and bottom alignment matching between the two labels. I would not recommend messing with frames because it'll only break on rotation (assuming you support it).
Attributed strings may fix it, but, like you found in the docs, probably not, because attributed strings seem to apply centering only to text.
The upside to using the constraint approach is it makes it far easier to internationalize your strings later - can't tell you how much trouble I had with translators mangling the Unicode characters.

How to stop UILabel from replacing "..." with an ellipsis character

I have an iOS app which uses fixed-width font labels extensively.
After changing to the iOS 7 SDK with a build target of 6.1, all the labels automatically replace occurrences of three consecutive periods with an ellipsis character. This breaks a lot of stuff and looks weird, since the ellipsis character is not present in the font I use, and iOS sees fit to use one from a different font.
How do I stop this behaviour?
This is a ligature, and iOS seems to replace them automatically (like fl becomes ﬂ). It seems there are some options to disable them; see this question: Adjoining "f" and "l" characters
Alternative number three: insert a zero-width space (U+200B) between the dots.
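A minimal sketch of that idea, using only the Swift standard library. The function name breakingUpDotRuns is made up for illustration, and whether the zero-width space actually prevents the substitution in a given label and font is something to verify on a device.

```swift
// Sketch: inserts `separator` (ZERO WIDTH SPACE by default) between every
// pair of adjacent periods, so no run of consecutive dots remains.
func breakingUpDotRuns(_ text: String,
                       separator: Character = "\u{200B}") -> String {
    var result = ""
    var previous: Character? = nil
    for character in text {
        if character == ".", previous == "." {
            result.append(separator)   // split the run at this point
        }
        result.append(character)
        previous = character
    }
    return result
}
```

Inserting between every adjacent pair, rather than replacing each "..." occurrence, avoids leaving a fresh run of three dots behind when the input contains five or more.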
(Posted as an answer per request of the OP)
One way around this is to replace the ASCII periods with the Unicode character U+2024 ("ONE DOT LEADER"). It looks exactly like a period but should not get converted automatically.
What you could do if this is widespread is to change all your UILabels to a subclass, MyLabel, and intercept messages to set the text, look for three dots, and if found change them to the unicode character above.
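The text transformation such a subclass would apply might look like the following sketch. The function name replacingDotRuns and the minimumRun parameter are hypothetical, and the UILabel subclass that would call it from its text setter is left out here.

```swift
// Sketch: replaces the periods in every run of `minimumRun` or more
// consecutive periods with `leader` (ONE DOT LEADER, U+2024, by default).
// Shorter runs, such as the single dots in "a.b", are left untouched.
func replacingDotRuns(in text: String,
                      minimumRun: Int = 3,
                      with leader: Character = "\u{2024}") -> String {
    var result = ""
    var pendingDots = 0

    // Writes out the dots collected so far, swapped or unchanged.
    func flush() {
        let dot: Character = pendingDots >= minimumRun ? leader : "."
        result.append(String(repeating: dot, count: pendingDots))
        pendingDots = 0
    }

    for character in text {
        if character == "." {
            pendingDots += 1
        } else {
            flush()
            result.append(character)
        }
    }
    flush()
    return result
}
```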
Yeah, this is a big PITA but I know of no other workaround.
EDIT
Another idea - find an open source UILabel (there must be at least one) and use it.
Another alternative: the ellipsis is a true character of its own. Why don't you try adding it to your font yourself (with Fontlab, FontForge or Glyphs) at the same width as the other characters?

How to fully justify texts programmatically (Delphi)?

How can I fully justify a block of text (like MS Word does, not only on the right and not only on the left but on both sides)?
I want to justify some text (mainly Arabic text) to fit a certain screen size (a handheld device's screen, actually, whose text viewer doesn't have this function) and save the text as justified, so I can reload and reuse it elsewhere.
(The problem with MS word is, that if you copy the justified text from MS Word and paste it to another editor it'll copy it un-justified).
Update : for now I'm thinking of doing it like this:
get-a-word
get-word-width
add-word-to-total-Word and add-Word-width-to-total-word-width
check if total-Word-width = myscreen-width then continue
else if total-Word-width is between myscreen-width and (myscreen-width - 3) then
add-spaces-To-total-word until it = myscreen-width
This is what I'm thinking now, but I put this question up and hope to see if there is a better solution, or somebody else already implemented it.
PS: I hope I have made my question clear and I'm sorry for bad expression if there is.
edit1 : changed the title to make it more clear.
If you want to justify plain text, you can only add extra spaces to the lines to get them to align on the left and right. Unfortunately, character widths differ between fonts, so doing it this way will only work for one particular font, unless you limit yourself to monospaced fonts, where all characters have the same width.
If you want a result like in Word, adding spaces won't cut it. Word will not add spaces, but stretch and shrink the existing spaces. This information is lost when you copy and paste it into another app.
Either way, justifying is an optimization problem. If you are interested in a good solution and its implementation, have a look at TeX. For an implementation that works on plain text with monospaced fonts, have a look at par.
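To make the plain-text case concrete, here is a small sketch of justifying one line by adding spaces. It is written in Swift rather than Delphi, assumes a monospaced font where every character is one cell wide, and uses a made-up function name, justify. It handles a single line only and does none of the line-breaking optimization that TeX or par perform.

```swift
// Sketch: pads the gaps between `words` with spaces so the line is exactly
// `width` cells wide. Assumes each Character occupies one cell (monospaced).
func justify(_ words: [String], toWidth width: Int) -> String {
    guard words.count > 1 else { return words.first ?? "" }
    let letters = words.reduce(0) { $0 + $1.count }
    let gaps = words.count - 1
    // Never use fewer than one space per gap, even if the line is too long.
    let totalSpaces = max(width - letters, gaps)
    let base = totalSpaces / gaps     // spaces every gap receives
    let extra = totalSpaces % gaps    // leftmost gaps receive one more
    var line = ""
    for (index, word) in words.enumerated() {
        line += word
        if index < gaps {
            let count = base + (index < extra ? 1 : 0)
            line += String(repeating: " ", count: count)
        }
    }
    return line
}
```

The saved result only stays aligned when it is displayed in the same monospaced font, which is the limitation described above.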
There are some API calls that may help:
ExtTextOut and GetCharacterPlacement
Look at the GCP_JUSTIFY flag for GetCharacterPlacement
ExtTextOut is used by Canvas.TextRect
The problem you are going to face is always going to be differences in the rendering of the font. Word handles full justification by adjusting kerning as well as adjusting the number of pixels between words by a few (either way). The end result is lined up both margins. This pixel adjustment is done BOTH ways, and as evenly as possible.
To properly handle this in your portable device you will have to also perform the same algorithm for the display of the text there.
If this is not possible, then the ONLY way you can even get somewhat close would be to add whitespace between words.
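If the device does let you position each word yourself, spreading the leftover pixels evenly over the gaps could look roughly like this sketch. The function name justifiedOrigins is hypothetical, the word widths are assumed to have been measured already in the target font, and only the gaps between words are adjusted (no kerning changes).

```swift
// Sketch: given measured word widths in pixels, returns the x origin of each
// word so that the last word ends at `lineWidth`. The slack is spread over
// the gaps as evenly as integer pixels allow.
func justifiedOrigins(wordWidths: [Int], lineWidth: Int) -> [Int] {
    guard !wordWidths.isEmpty else { return [] }
    let gaps = wordWidths.count - 1
    let slack = max(lineWidth - wordWidths.reduce(0, +), 0)
    var origins: [Int] = []
    var x = 0
    for (index, width) in wordWidths.enumerated() {
        origins.append(x)
        x += width
        if index < gaps {
            // Difference of cumulative shares, so rounding never accumulates.
            x += slack * (index + 1) / gaps - slack * index / gaps
        }
    }
    return origins
}
```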
As has been pointed out in other answers, Word does full justification by stretching the existing spaces, often by very small amounts. This is only possible if you have full control over how your text is drawn on the screen (which Word, or any other Windows program, has).
Your only real option in this regard would be to implement your own text viewer on the platform you are targeting. E.g., you would need to draw the text on the screen yourself (any platform that allows games should allow you to draw on the screen). However, this seems like an awful lot of trouble to go to for justified text.
Sorry couldn't be of more help.
