Arabic text rendering with Skia (SkiaSharp)

Arabic text rendering with Skia (SkiaSharp) - arabic

I'm rendering text with SkiaSharp for various platforms. I have problem with arabic script. I've created simple shaping routine and I have nicely connected glyphs. But diacritics is off completely.
Is there an easy way how to render/position diacritics? Like another table, which tells that after this char, put this glyph which represent diacritics in middle position, or something like that.
I'have tried HarfBuzzSharp, it's working ok, except Android (where it is returning zeros to all glyph positions).

Related

How can I standardize the the varying truncating dot characters of UILabel?

I have a plist file which I decode to load data onto my application.
This plist file contains String type values that gets mapped to UILabel's text property.
I noticed that the truncating behavior of the text in the label is not always the same.
To be more specific, the three dots that are added when the text is truncated are, as opposed to my expectation, two kinds: one being ... and the other being ⋯ which appears to be this unicode character in this link.
I checked UILabel's attribute settings but I was unable to find any settings related to this behavior.
Has anyone else experienced this problem and standardized the truncating character to be ...?
Here is the image describing the problem mentioned above. Both labels have 2 lines and have new line escape character inserted between the first line and the second line of text. I am posting a link to this image because apparently I don't have enough reputation to post an image.
varying truncating characters of UILabel

IMO this is a bug in UILabel, and it may be worth opening a Feedback about it.
TL;DR: I recommend using TTTAttributedLabel.
Long-winded answer, because this was such an interesting question:
UILabel uses a different ellipsis based on the language script being truncated. As you've noticed, for most scripts, they use HORIZONTAL ELLIPSIS (…), or something very similar. But for Chinese, Japanese, and Korean (CJK), they use MIDLINE HORIZONTAL ELLIPSIS (⋯), or again, something very similar. The only other exception I've found is Burmese, which uses three circles that I don't recognize.
In my tests, all the following used …: Latin, Cyrillic, Bengali, Arabic, Hebrew, Hindi, Thai, Kannada, Nepali, and Mongolian (I kid. iOS can't layout Mongolian. Nobody can layout Mongolian, but it still uses …). UILabel even uses … for Lao, even though I thought ຯ was specifically for that, but I guess eventually everything becomes Latin.
The problem with UILabel being so clever for CJK and Burmese is that it decides what character to use exclusively by looking at the first character being removed. And it thinks SPACE is Latin (or at least not "special").
So what to do? My recommendation is probably to use TTTAttributedLabel, since it lets you configure the truncation character, and more importantly, is open source so you can fix it if it's not working the way you want.
The second option would be to truncate the text by hand using techniques like the one described in How to change truncate characters in UILabel?. There are probably better ways to do it using CTFrameGetVisibleStringRange instead of constantly shrinking the string until it fits, but I don't know if it's worth the effort. (If that path sounds useful, I could probably write up something that does it. It's just probably not worth the trouble.)
And the final option I know is to replace the SPACE character with an "equivalent" CJK character. The closest I've found that works is HANGUL FILLER (U+3164), but I don't like it. It's too wide, and I expect that it will make Korean uncomfortable to read (but I rarely try to read Korean, so I may be wrong here):
With SPACE: 안녕 하세요
With FILLER: 안녕ㅤ하세요
There's also HALFWIDTH HANGUL FILLER (U+FFA0), which is better, but UILabel seems to make it zero width (this may be a font issue, so maybe worth trying):
With SPACE: 안녕 하세요
With HALF: 안녕ﾠ하세요
let string = "안녕 하세요"
let filler = "\u{3164}"
label.text = string.replacingOccurrences(of: " ", with: filler)
OTOH, you may run into the same problem if you use any other non-CJK characters, like Latin punctuation or Arabic numerals. So this solution may not scale. And you should make sure that Voice Over properly ignores it.

MathJax overarrow too short

I am trying to place an overarrow over a piece of text in MathJax.
I am using a custom font that I declare in the code-
\(\overrightarrow{\style{font-family: mysans, TeX, Arial, sans-serif;}{\text{" + tString + "}}}\)"
It works ok for most letters- for capital W or M , using a couple in a row like "WWW" the overbar is too short.
For lowercase i , using a couple in a row, ie "iii" it is too long. My hunch is that MathJax is using a standard character width size to figure out the length of the overarrow and when the character is much longer or shorter than that size, it calculates the overarrow incorrectly. Is there any way around this?
Thanks!

First off, you generally cannot use custom fonts with MathJax. As the documentation says
Since browsers do not provide APIs to access font metrics, MathJax has to ship with the necessary font data; this font data is generated during development and cannot be generated on the fly. In addition, most fonts do not cover the relevant characters for mathematical layout. Finally, some fonts (e.g. Cambria Math) store important glyphs outside the Unicode range, making them inaccessible to JavaScript.
However, if you are only looking to use custom fonts in text elements, then there is a way to work around this: style the surrounding context and set mtextFontInherit:true for the output jax, cf. e.g. here for HTML-CSS.
Unfortunately, this won't actually help you right now. There's a minor regression in MathJax 2.5 (see this discussion leading to the result you describe). This will be fixed in 2.5.1 and in the mean time you could set noReflows:false for the HTML-CSS output.

Colored text in FireMonkey application

Since FMX doesn't have an equivalent of TRichEdit, how can I output a differently colored text? I'm writing a console (as in a Quake-style console, to clarify) output visual control for my application, and I don't see any way to solve it, except to draw the text myself, complicated by many factors (like scrolling).

Now after thinking about it, since TTextLayout doesn't work as intended, I think that it can be done by creating an array of colored as needed TLabels in a TFlowLayout, but there are some things to consider: performance and memory usage, copy-pasting and word wrap. When I add a string to log, split it into strings so each string is of one color and for each string create a TLabel with text and color set accordingly.

What is a vertical tab?

What was the original historical use of the vertical tab character (\v in the C language, ASCII 11)?
Did it ever have a key on a keyboard? How did someone generate it?
Is there any language or system still in use today where the vertical tab character does something interesting and useful?

Vertical tab was used to speed up printer vertical movement. Some printers used special tab belts with various tab spots. This helped align content on forms. VT to header space, fill in header, VT to body area, fill in lines, VT to form footer. Generally it was coded in the program as a character constant. From the keyboard, it would be CTRL-K.
I don't believe anyone would have a reason to use it any more. Most forms are generated in a printer control language like postscript.
#Talvi Wilson noted it used in python '\v'.
print("hello\vworld")
Output:
hello
world
The above output appears to result in the default vertical size being one line. I have tested with perl "\013" and the same output occurs. This could be used to do line feed without a carriage return on devices with convert linefeed to carriage-return + linefeed.

Microsoft Word uses VT as a line separator in order to distinguish it from the normal new line function, which is used as a paragraph separator.

In the medical industry, VT is used as the start of frame character in the MLLP/LLP/HLLP protocols that are used to frame HL-7 data, which has been a standard for medical exchange since the late 80s and is still in wide use.

It was used during the typewriter era to move down a page to the next vertical stop, typically spaced 6 lines apart (much the same way horizontal tabs move along a line by 8 characters).
In modern day settings, the vt is of very little, if any, significance.

The ASCII vertical tab (\x0B)is still used in some databases and file formats as a new line WITHIN a field. For example:
In the .mer file format to allow new lines within a data field,
FileMaker databases can use vertical tabs as a linefeed (see https://support.microsoft.com/en-gb/kb/59096).

I have found that the VT char is used in pptx text boxes at the end of each line shown in the box in oder to adjust the text to the size of the box.
It seems to be automatically generated by powerpoint (not introduced by the user) in order to move the text to the next line and fix the complete text block to the text box. In the example below, in the position of §:
"This is a text §
inside a text box"

A vertical tab was the opposite of a line feed i.e. it went upwards by one line. It had nothing to do with tab positions. If you want to prove this, try it on an RS232 terminal.

similar to R0byn's experience, i was experimenting with a Powerpoint slide presentation and dumped out the main body of text on the slide, finding that all the places where one would typically find carriage return (ASCII 13/0x0d/^M) or line feed/new line (ASCII 10/0x0a/^J) characters, it uses vertical tab (ASCII 11/0x0b/^K) instead, presumably for the exact reason that dan04 described above for Word: to serve as a "newline" while staying within the same paragraph. good question though as i totally thought this character would be as useless as a teletype terminal today.

I believe it's still being used, not sure exactly. There might be even a key combination of it.
As English is written Left to Right, Arabic Right to Left, there are languages in world that are also written top to bottom. In that case a vertical tab might be useful same as the horizontal tab is used for English text.
I tried searching, but couldn't find anything useful yet.

How to fully justify texts programmatically (Delphi)?

How can I fully justify a block of text (like MS Word does, not only on the right and not only on the left but on both sides)?
I want to justify some texts (mainly arabic text) adjusted to certain screen size (some handheld device screen actually, and its text viewer doesn't have this function) and save this text as justified. So I can reload and reuse it again elsewhere.
(The problem with MS word is, that if you copy the justified text from MS Word and paste it to another editor it'll copy it un-justified).
Update : for now I'm thinking of doing it like this:
get-a-word
get-word-width
add-word-to-total-Word and add-Word-width-to-total-word-width
check if total-Word-width = myscreen-width then continue
else if total-Word-width is between myscree-wdith and (myscreen-width -3) then
add-spaces-To-total-word until it = myscreen-width
This is what I'm thinking now, but I put this question up and hope to see if there is a better solution, or somebody else already implemented it.
PS: I hope I have made my question clear and I'm sorry for bad expression if there is.
edit1 : changed the title to make it more clear.

If you want to justify plain text, you can only add extra spaces to the lines to get them align on the left and right. Unfortunately the character widths differ in fonts; so doing it this way will only work for a certain font, unless you limit yourself to monospaced fonts where all characters have the same size.
If you want a result like in Word, adding spaces won't cut it. Word will not add spaces, but stretch and shrink the existing spaces. This information is lost when you copy and paste it into another app.
Either way, justifying is an optimization problem. If you are interested in a good solution and its implementation: have a look a TeX. For an implementation that works on plain text with monospaced fonts have a look at par

There are some API calls that may help:
ExtTextOut and GetCharacterPlacement
Look at the GCP_JUSTIFY flag for GetCharacterPlacement
ExtTextOut is used by Canvas.TextRect

The problem you are going to face is always going to be differences in the rendering of the font. Word handles full justification by adjusting kerning as well as adjusting the number of pixels between words by a few (either way). The end result is lined up both margins. This pixel adjustment is done BOTH ways, and as evenly as possible.
To properly handle this in your portable device you will have to also perform the same algorithm for the display of the text there.
If this is not possible, then the ONLY way you can even get somewhat close would be to add whitespace between words.

As has been pointed out in other answers Word does full justification by stretching the existing spaces often by very small amounts. This is only possible if you have full control over how your text is drawn on the screen (which word - or any other windows program has).
You only real option in this regard would be to implement your own text viewer on the platform you are targeting. Eg you would need to draw the text on the screen yourself (any platform that allows games should allow you to draw on the screen). However this seems like an awful lot of trouble to get justified text.
Sorry couldn't be of more help.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart